Hey, Scripting Guy! Part 1: How Can I Update Many Office Word Documents at Once?

(Editor's note: This is part 1 of a two-part article originally intended for TechNet Magazine. Part 2 will be published tomorrow.) 
 

Little clouds of dust rise from the ground like miniature tornados carrying powdery sand that drifts along lazily on a nearly still breeze. The tires that support the ancient bus grind well-worn rocks and produce more powdery sand that readily fills any void created by previous traffic. Hushed voices leave words that hang in the air like the soft dust that clings to anything moist. The approach to Saqqara follows an ancient route along winding canals, date palms, and camels. The short ride from bustling Cairo scrolls me back thousands of years as I enter the temple complex of Djoser. Across the complex, rising with the seeming force of eternity, stands the step pyramid of Djoser. I hold my breath as I stand in wonderment at the massive structure that has remained for more than 3,000 years. As I close my eyes, a breeze seems to pick up and for a few moments I am transported back to the time when this complex was new and served as the impressive gateway for which it had been designed. I stand as one more point on a vast time continuum.

Few things remain for multiple millennia, and whatever does survive from our current era is unlikely to come from the world of information technology. Documents are often out of date before the virtual ink dries on the screen of the laptop. Web sites come and go, and migrate from URL to URL. Product names change, and so do the names of features that persist from version to version. In a rapidly changing world, how can you keep your documentation up to date? In Microsoft Office Word, you can use the find-and-replace tool to replace all instances of a word or phrase with another word or phrase of your choosing. But what do you do when you have hundreds of Word documents that need to be inspected and updated? This is where scripting comes in.

The complete ReplaceWordsLogResults.ps1 script is seen here.

ReplaceWordsLogResults.ps1

# ------------------------------------------------------------------------
# NAME: ReplaceWordsLogResults.ps1
# AUTHOR: ed wilson, Microsoft
# DATE: 8/11/2009
#
# KEYWORDS: Word.Application
#
# COMMENTS: This script replaces a word in a document file
# with another word or phrase.
#
# TNM
# ------------------------------------------------------------------------
Param(
 $path = "C:\data\fso"
)#end param

# *** Functions here ***
Function Get-WordSelection([string]$file,[boolean]$visible)
{
 $script:Word = New-Object -ComObject word.application
 $Word.Visible = $visible
 $script:doc = $Word.Documents.Open($file)
 $script:selection = $Word.Selection
} #end function Get-WordSelection

Function New-LogObject ($document, $replaced)
{
 $logObject = New-Object psobject
 $logObject | Add-Member -MemberType noteProperty -Name "document" -Value $document
 $logObject | Add-Member -MemberType scriptProperty -Name "replaced" -Value { $rtn }
 $logObject | Add-Member -MemberType scriptProperty -Name "date" -value { get-date -uformat "%Y/%m/%d" }
 $logObject | Add-Member -MemberType scriptProperty -Name "user" -value { $env:username }
 $Script:doc.Save()
 $Script:doc.close()
 $logObject
} # end function new-logobject

# *** Entry to Script ***

$files = Get-ChildItem -Path $path -Include *.doc,*.docx -Recurse
$wordHash = @{"misspelled" = "spelled incorrectly" ; "done"="finished"}
$logfile = "ReplaceResults.csv"
$i=0
$ReplaceAll = 2
$FindContinue = 1
$MatchCase = $False
$MatchWholeWord = $True
$MatchWildcards = $False
$MatchSoundsLike = $False
$MatchAllWordForms = $False
$Forward = $True
$Wrap = $FindContinue
$Format = $False

$files | ForEach-Object {
  $file = $_.fullname
  $i ++
  write-progress -activity "Searching For Word documents" `
 -status "Progress:" -percentcomplete ($i/$files.count*100)
 
 Get-WordSelection -file $_.fullname -visible $false
 
 foreach($FindText in $wordHash.keys)
 {
  $rtn = $Script:Selection.Find.Execute($FindText,$MatchCase,
   $MatchWholeWord,$MatchWildcards,$MatchSoundsLike,
   $MatchAllWordForms,$Forward,$Wrap,$Format,
   $wordHash.$FindText,$ReplaceAll)
 } #end foreach findtext
 
 New-LogObject -document $file -replaced $rtn
} | Export-Csv -Path (join-path -Path $path -ChildPath $logfile) -NoTypeInformation

$Script:word.quit()

The first thing the ReplaceWordsLogResults.ps1 script does is create a command-line parameter named path to allow you to specify the directory to search for Word documents. The Param keyword is used to create the parameter. To use the parameter from the Windows PowerShell console, you specify the name of the script and the path you wish to search. This is seen here:

.\ReplaceWordsLogResults.ps1 -path c:\mydirectory

The param statement is shown here:

Param(
 $path = "C:\data\fso"
)#end param
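
As a minimal sketch of how the default value behaves, the following uses the same kind of param block inside a function. The Show-SearchPath function and the [string] type constraint are assumptions made for illustration; they are not part of the ReplaceWordsLogResults.ps1 script.

```powershell
# Hypothetical helper for illustration only; not part of ReplaceWordsLogResults.ps1.
Function Show-SearchPath
{
 Param(
  [string]$path = "C:\data\fso"
 )#end param
 # Emit a string that shows which path would be searched.
 "Searching $path"
} #end function Show-SearchPath

Show-SearchPath                          # Searching C:\data\fso
Show-SearchPath -path "c:\mydirectory"   # Searching c:\mydirectory
```

When the path parameter is omitted, the default value is used. When it is supplied, the supplied value replaces the default for that call only.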

The first function that is created is the Get-WordSelection function. It receives two inputs when it is called from within the script: the file parameter, which contains the path to the Word document that will be processed, and the Boolean visible parameter, which indicates whether Word will be visible while the document is being scanned. The variables that are created inside the Get-WordSelection function are marked with the script scope. This means the variables will be available anywhere within the ReplaceWordsLogResults.ps1 script. The Word.Application object is the main object that is used when automating Microsoft Word. It is used to obtain the Documents collection object, whose Open method opens the Word document. It will also be used to exit the Word application after all the Word documents have been processed. This portion of the Get-WordSelection function is seen here:

Function Get-WordSelection([string]$file,[boolean]$visible)
{
 $script:Word = New-Object -ComObject word.application
 $Word.Visible = $visible
 $script:doc = $Word.Documents.Open($file)

To work with the text inside the Word document, you will need to obtain a Selection object. The easiest way to obtain a Selection object is to query the selection property from the Word application object. The returned Selection object will need to be used outside the Get-WordSelection function. The variable that is created to hold the Selection object is therefore created with a script-level scope. This is seen here:

 $script:selection = $Word.Selection
 } #end function Get-WordSelection
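
The effect of the script: scope modifier can be sketched without Word at all. The two functions and the variable names below are hypothetical and exist only to contrast a function-scoped variable with a script-scoped one. The sketch assumes it is saved to a .ps1 file and run as a script.

```powershell
# Hypothetical functions for illustration; no Word objects are involved.
Function Set-LocalValue
{
 # No scope modifier: the variable lives only while the function runs.
 $localValue = "set inside function"
} #end function Set-LocalValue

Function Set-ScriptValue
{
 # script: modifier: the variable is created in the script scope.
 $script:scriptValue = "set inside function"
} #end function Set-ScriptValue

Set-LocalValue
Set-ScriptValue

Test-Path variable:localValue   # False
$scriptValue                    # set inside function
```

This is the same mechanism that lets the rest of the script use $Word, $doc, and $selection after Get-WordSelection has returned.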

Because the ReplaceWordsLogResults.ps1 script will need to create a log that is used to track the progress of the changes made to the Word documents in the target folder, a function named New-LogObject is created. It receives two parameters: the path to the document being scanned and a Boolean value that indicates if changes were made to it. This is seen here:

 Function New-LogObject ($document, $replaced)
 {

The log object that is created is an instance of a PSObject. The New-Object cmdlet is used to create the object. The returned object is stored in the variable $logObject:

 $logObject = New-Object psobject

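Four properties are added to the $logObject by using the Add-Member cmdlet. The document property is a note property that stores the document path passed to the New-LogObject function when it is called. The other three are script properties, whose script blocks run when the property is read. The replaced property returns the value of the $rtn variable set by the calling script (the $replaced parameter itself is not referenced), the date property is created by using the Get-Date cmdlet, and the user name is pulled from the Env: drive. This is seen here: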

 $logObject | Add-Member -MemberType noteProperty -Name "document" -Value $document
 $logObject | Add-Member -MemberType scriptProperty -Name "replaced" -Value { $rtn }
 $logObject | Add-Member -MemberType scriptProperty -Name "date" -value { get-date -uformat "%Y/%m/%d" }
 $logObject | Add-Member -MemberType scriptProperty -Name "user" -value { $env:username }

The Word document is saved and closed by using the appropriate methods from the Word Document object. The newly created object is then emitted back to the calling script. This is seen here:

 $Script:doc.Save()
 $Script:doc.close()
 $logObject
 } # end function new-logobject
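
The difference between the two member types used by New-LogObject can be sketched on its own. The $status variable and the property names stored and computed are assumptions made for illustration, and no Word document is opened.

```powershell
# Hypothetical values for illustration; not part of ReplaceWordsLogResults.ps1.
$status = $true
$logObject = New-Object psobject

# A note property stores the value as it is at the moment the member is added.
$logObject | Add-Member -MemberType noteProperty -Name "stored" -Value $status

# A script property runs its script block every time the property is read.
$logObject | Add-Member -MemberType scriptProperty -Name "computed" -Value { $status }

# Change the variable after the object has been built.
$status = $false

$logObject.stored     # True
$logObject.computed   # False

# Emits a header row and one data row, much like the rows Export-Csv writes.
$logObject | ConvertTo-Csv -NoTypeInformation
```

A note property keeps a snapshot, whereas a script property reflects whatever its script block finds at read time. In a log, that distinction matters for any value that can change between the time the object is created and the time it is written out.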


Join us tomorrow for part 2, the conclusion to this post about updating many Word documents at once. In the meantime, follow us on Twitter or Facebook. If you have any questions, send e-mail to us at scripter@microsoft.com or post your questions on the Official Scripting Guys Forum. See you on Monday. Peace!

Ed Wilson and Craig Liebendorfer, Scripting Guys

 
