Weekend Scripter: Convert Word Documents to PDF Files with PowerShell

Summary: Windows PowerShell MVP, Sean Kearney, talks about using Windows PowerShell to convert Word documents to PDF files en masse.

Microsoft Scripting Guy, Ed Wilson, is here. Today’s blog is brought to you by Windows PowerShell MVP and honorary Scripting Guy, Sean Kearney.

Previous blog posts by Sean Kearney

Take it away, Sean…

My boss looks up at me today, and sighs, “I love the built-in SaveAs PDF in Word 2013. But I want to do multiple documents at the same time. Oh, if ONLY there was some way to do this in bulk.”

The wires and lights start blinking in my head. I’m about to leap out of my chair because I hear “Bulk.” Many to do, repeatable, and…

POWERSHELL!

So accessing a file in Microsoft Word programmatically is quite easy. We’ve been doing it for years.

$Word=New-Object -ComObject Word.Application

$Doc=$Word.Documents.Open("C:\Foofile.docx")

And along the same lines, we could save this same file in the following manner.

$Doc.saveas([ref] "C:\Foofile.docx")

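If we would like to save it in an alternate format, like in .pdf format, things get a wee bit fancier. We have to speak a bit of .NET and pass a second argument: 17, which is the value Word uses for PDF (wdFormatPDF in its WdSaveFormat enumeration).

$Doc.saveas([ref] "C:\Foofile.pdf", [ref] 17)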
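A bare 17 is easy to mistype or forget. Here is a minimal sketch of one way to give the number a name; the `$WdSaveFormat` table and `$Format` variable are made up for illustration, and only a handful of the format values are listed.

```powershell
# A few WdSaveFormat values, named so the call site explains itself.
# ($WdSaveFormat and $Format are illustrative names, not from the post.)
$WdSaveFormat = @{
    wdFormatDocument        = 0    # Word 97-2003 document (.doc)
    wdFormatDocumentDefault = 16   # Word default format (.docx)
    wdFormatPDF             = 17   # PDF
    wdFormatXPS             = 18   # XPS
}

$Format = $WdSaveFormat.wdFormatPDF

# With a document already open in $Doc, the save would then read:
# $Doc.saveas([ref] "C:\Foofile.pdf", [ref] $Format)
```

The save call is left as a comment because it needs Word installed and a document open in $Doc.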

If your brain didn’t pop out just now, you’re OK. Let’s get a little fancier. What if I want Microsoft Word to save that PDF file with the same name as the parent without knowing the name?

Now we’re stepping into the land of fun. We can access the file name of that single document in the following way. Three of the available properties in the Word object are the Name of the document, the Path to the document, and the FullName path. I poked about using the following cmdlet to…well, to be honest…to guess.

$Doc | Get-Member -MemberType Property *Name*

Image of command output

Note the two little nuggets. In poking about and using a similar search for Path, I found the property holding its path.

$Doc | Get-Member -MemberType Property *Path*
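The same wildcard search works on any object, so you can try it without Word installed. This is a rough sketch that uses a temporary file as a stand-in (a FileInfo object, not a Word document); `$Temp` and `$NameProps` are names invented for the example.

```powershell
# Stand-in object: a FileInfo for a temporary file (NOT a Word document)
$Temp = Get-Item -Path ([System.IO.Path]::GetTempFileName())

# List only the properties whose names contain "Name"
$NameProps = $Temp | Get-Member -MemberType Property -Name *Name* |
    Select-Object -ExpandProperty Name
$NameProps

# Clean up the temporary file
Remove-Item -Path $Temp.FullName
```

Swap $Temp for $Doc and you are back to the Word search shown above.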

So I could do something like this: Open some file, get the file name information, swap out .docx with .pdf, and then save it.

Image of command output

With these three bits of information, I don’t have to actually know the file name. Word will tell me based on the document. But let’s simplify this a bit. Let’s just take a known file name and resave it as a PDF file with the same file name. All we need to do is run a Replace() method on the provided file name, and swap .docx with .pdf.

Yes… we’re presuming it’s a .docx. :)

$File="C:\Foofolder\foofile.docx"

$Word=New-Object -ComObject Word.Application

$Doc=$Word.Documents.Open($File)

$Doc.saveas([ref] (($File).replace("docx","pdf")), [ref] 17)

$Doc.close()
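One thing to watch: a string Replace() is case-sensitive and swaps every occurrence of the text, not just the extension. Here is a hedged sketch of an alternative that changes only the extension, using the .NET `[System.IO.Path]::ChangeExtension` method. The helper name `Get-PdfPath` and the sample paths are made up for illustration.

```powershell
# Hypothetical helper (not from the original post): build the PDF path
# by changing only the file extension, whatever its letter case.
function Get-PdfPath {
    param([Parameter(Mandatory)][string]$Path)
    [System.IO.Path]::ChangeExtension($Path, 'pdf')
}

# Naive string replacement also rewrites "docx" in the folder name
$Naive = 'C:\docx-archive\Report.docx'.Replace('docx', 'pdf')
# $Naive is 'C:\pdf-archive\Report.pdf'

# Changing the extension leaves the folder alone and ignores case
$Safe = Get-PdfPath -Path 'C:\docx-archive\Report.DOCX'
# $Safe is 'C:\docx-archive\Report.pdf'
```

For everyday file names the Replace() approach does the job. The helper only matters when "docx" shows up elsewhere in the path or the extension is in upper case.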

So that’s all fine and dandy. But what if we have a folder of DOCX files that we want to convert at once?

Let’s pretend we ran the following lines in Windows PowerShell:

$File="C:\Foofolder\foofile.docx"

$Files=Get-ChildItem 'C:\FooFolder\*.DOCX'

Now we could cheat and access the full file name and path by accessing the FullName from the drive system. But let’s have some fun with Word and ask it these questions.

Foreach ($File in $Files) {

    # open a Word document, filename from the directory

    $Doc=$Word.Documents.Open($File.fullname)

Now let’s ask Word what is the name of the file. And while we’re at it, let’s swap the .docx file extension with .pdf.

    $Name=($Doc.Fullname).replace("docx","pdf")

Then we’ll access that file within Microsoft Word and use the built-in SaveAs PDF option in Word 2010 or Word 2013 to produce a PDF file in the same folder as the original Word document. When done, we close the file.

    $Doc.saveas([ref] $Name, [ref] 17)

    $Doc.close()

}

So when done our script will look like this:

# Start Word so the script has a $Word object to work with

$Word=New-Object -ComObject Word.Application

# Acquire a list of DOCX files in a folder

$Files=Get-ChildItem 'C:\FooFolder\*.DOCX'

Foreach ($File in $Files) {

    # open a Word document, filename from the directory

    $Doc=$Word.Documents.Open($File.fullname)

    # Swap out DOCX with PDF in the Filename

    $Name=($Doc.Fullname).replace("docx","pdf")

    # Save this File as a PDF in Word 2010/2013

    $Doc.saveas([ref] $Name, [ref] 17)

    $Doc.close()

}

# Close Word when the loop is done

$Word.Quit()
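If you plan to run this more than once, it can help to wrap the loop in a function that always closes Word, even when a document fails to convert. This is only a sketch under a few assumptions: the function name `Convert-DocxFolderToPdf` and its `-Folder` parameter are invented here, and it swaps the Replace() call for `[System.IO.Path]::ChangeExtension` so that only the extension changes.

```powershell
# Hypothetical wrapper (names are illustrative, not from the original post)
function Convert-DocxFolderToPdf {
    param(
        [Parameter(Mandatory)]
        [string]$Folder
    )

    $wdFormatPDF = 17   # the PDF format value used throughout the post

    $Word = New-Object -ComObject Word.Application
    try {
        foreach ($File in Get-ChildItem -Path (Join-Path $Folder '*.docx')) {
            $Doc = $Word.Documents.Open($File.FullName)
            try {
                # Ask Word for the document's full name, then change only the extension
                $Name = [System.IO.Path]::ChangeExtension($Doc.FullName, 'pdf')
                $Doc.SaveAs([ref] $Name, [ref] $wdFormatPDF)
            }
            finally {
                # Close the document even if the save failed
                $Doc.Close()
            }
        }
    }
    finally {
        # Always shut Word down so no hidden WINWORD process is left behind
        $Word.Quit()
    }
}
```

On a machine with Word installed, you would call it as Convert-DocxFolderToPdf -Folder 'C:\FooFolder'. The block itself only defines the function; nothing runs until you call it.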

So why the big secrecy about using the document names in Word itself? In the future, I’ll show you how to programmatically trigger a mail merge, and you’ll see where it’s needed there.

Cheers and remember to keep on scriptin’.

~Sean,
The Energized Tech

Thanks, Sean, for taking the time to share your scripting expertise with us today. Join me tomorrow when I have a guest blog post from Ingo Karstein about using Windows PowerShell with SharePoint. It is cool and you do not want to miss it.

I invite you to follow me on Twitter and Facebook. If you have any questions, send email to me at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum. See you tomorrow. Until then, peace.

Ed Wilson, Microsoft Scripting Guy 

  • Just wanted to mention that this is also possible to do with Office 2007 (if you have the "save to pdf" add-on installed)

  • @MB I forgot about that add-on. It has been a long time since I used Office 2007. Thank you for sharing this.

  • MB - You're absolutely correct!  The Save as PDF is a feature you can download and install and this would most definitely work for the same reason!

    But of course I personally love Office 2013 and how it integrates into Skydrive out of the box! :)

    Sean

  • I agree with Sean.  2013 has some excellent new bits but takes some getting used to due to the UI differences.  Many features are self-announcing and many need to be discovered.

    SkyDrive is now more useful due to 2013 integration and the new subscription purchase at $99 for 5 copies of the full product including MSAccess is awesome for home users.

  • Hey thanks for getting me started.  I used this script but needed to make a couple of changes for it to work

    My final script can be found here http://pastebin.com/U7xgs8mM

    I had to add I to CHLDITEM

    I also had to add a line for the $File and $Word variables.

  • Nice script.

    In your final code though, please change

    "$Files=GET-CHLDITEM .."

    with

    "$Files=GET-CHILDITEM .."

    You missed "I" in CHILD.

  • For those that want to convert doc->docx, add these lines:
    $saveFormat = [Enum]::Parse([Microsoft.Office.Interop.Word.WdSaveFormat],"wdFormatDocumentDefault");
    $Doc.Saveas([ref] $Name, [ref] $saveFormat)
    and
    it should be:
    $Word = new-object -ComObject "word.application"

  • thanks
