A question that is often asked of Support in regard to legacy Public Folders is whether they're replicating and how much progress they're making. The most common scenario arises when the administrator is adding a new Public Folder database to the organization and replicating a large amount of data to it. What commonly happens is that the administrator calls Support and says:
The database on the old server is 300GB, but the new database is only 150GB! How can I tell what still needs to be replicated? Is it still progressing??
You can raise diagnostic logging for public folders, but reading the events to see which folders are replicating is tedious. Most administrators want a more detailed way of estimating the progress of replication than comparing file sizes. They also want to avoid checking all the individual replication events.
There are a number of ways to monitor replication progress so that one can make an educated guess as to how long a particular environment will take to complete an operation. In this post, I'm going to provide a detailed example of one approach to estimating the progress of replication by comparing item counts between different public folder stores.
To get the item counts in an Exchange 2003 Public folder database you can use PFDAVAdmin. The process is outlined in this previous EHLO blog post. For what we're doing below, you'll need the DisplayName, Folderpath and the total number of items in the folder. The rest of the fields aren't necessary.
To get the item counts on an Exchange 2007 server, use (remember there is only one Pub per server):
Get-PublicFolderStatistics -Server <servername> | Export-Csv c:\file1.txt
To get the item counts on an Exchange 2010 server, you use:
Get-PublicFolderStatistics -Server <servername> -ResultSize unlimited | Export-Csv c:\file1.txt
There are some very important caveats to this whole procedure. The things you need to watch out for are:
For the actual comparison you can use any number of products. For this blog I have chosen Microsoft Access for demonstrating the process of comparing the CSV files from the different servers. To keep things simple I am going to use the Access database. There are some limitations to my approach:
An outline of the process is:
Assumptions for the steps below:
If your file is different than expected you will have to modify the steps as you go along
Here are the steps for conducting the comparison:
1. Create a new blank Microsoft Access database in a location that has more than double the size of your CSV files available as free space.
2. By default, the Export-Csv cmdlet includes the .NET type information in the first line of the CSV output. Because this line will interfere with the import, we'll need to remove it. Open each CSV file in notepad (this can take a while for larger files) and remove the line highlighted below. In this example the line starting with “AdminDisplayName” would become the topmost line of the file. Once the top line is deleted close and save the file.
TIP You can avoid this step by including the -NoTypeInformation switch when using the Export-CSV cmdlet, which filters out the .NET object type information from the CSV output. For details, see Using the Export-Csv cmdlet on TechNet. (Thanks to #MSExchange MVP @SteveGoodman for the tip!)
3. Import the CSV file to a new table:
4. In the wizard that starts specify the file is delimited as shown and then click Next.
5. Tell the wizard that the text qualifier is the double quote (character 34 in ASCII), the delimiter is the comma and that the “First Row Contains Field Names” as shown in Figure 4.
Note: It is possible that you will receive a warning when you click “First Row Contains Field Names”. If any of the field names violate the rules for a field name Access will display a warning. Don’t panic. Access will replace the non-conforming names with ones it considers appropriate (typically Field1, Field2, etc.). You can change the names if you wish on the Advanced screen.
6. Switch to Advanced view (click the Advanced button highlighted in Figure 4) so that we can change the data type of the FolderPath field. In Access 2010 and older the data type needs to be changed from Text to Memo. In Access 2013 it needs to be changed from Short Text to Long Text. While we are in this window you have the option to exclude columns that are not needed by placing a checkmark in the box from the skip column. In this blog we are only going to use the FolderPath, name and the item count. You can also exclude fields earlier in the process by specifying what fields will be exported when you do the export-csv. The following screenshots show the Advanced properties window.
Figure 5a: Access 2010 and older
Figure 5b: Access 2013
Note: If you think you will be doing this frequently you can use the Save As button to save your settings. The settings will be saved inside the Access database and can then be selected during future imports by clicking on the Specs button.
7. Click OK on the Advanced dialog and then click Finish in the wizard.
8. When prompted to save the Import steps click Close. If you think you will be repeating this process in the future feel free to explore saving the import steps.
9. Access will import the data into a table. By default the table will have the same name as the source CSV file. The files used in creating this blog were called 2007PF_120301 and 2010 PF_120301. If there are any import errors they will be saved in a separate table. Take a moment to examine what they are. The most common is that a field got truncated. If that field is the folderpath it will affect the comparisons later. If there are other problems you will have to troubleshoot what is wrong with the highlighted lines (typically there should be no import errors as long as the FolderPath is set as a Memo field).
10. Go back to Step 2 to import the second file that will be used in the comparison.
11. Now a query must be run to determine if any folderpath exceeds 255 characters. Fields longer than 255 characters cannot be used for a join in an Access query. If we have values that exceed 255 characters in this field we will need to exclude them from the comparison. Additional work to split a long path across multiple fields can be done, but that is being left as an exercise for any Access savvy readers.
12. To get started select the options highlighted in Yellow in Figure 6:
13. Highlight the table where we want to check the length of the folderpath field as shown in Figure 7. Once you have selected the table click Add and then Close:
14. Switch to SQL view as shown in Figure 8:
15. Replace the default select statement with one that looks like this (please make sure you substitute your own table name for the one that I have Bolded in the example):
SELECT Len([FolderPath]) AS Expr1, [2007PF_120301].FolderPath FROM 2007PF_120301 WHERE (((Len([FolderPath]))>254));
Note: Be sure the semi-colon is the last character in the statement.
16. Run the query using the red “!” as shown in Figure 9:
17. If the result is a single empty row (as shown in Figure 10) then skip down to step 19. If the result is at least one row then go back to SQL view (as shown in Figure 8) and change the statement to look like this one (as before please make sure 2007PF_120301 is replaced with the table name actually being used in your database):
SELECT [2007PF_120301].FolderPath, [2007PF_120301].ItemCount, [2007PF_120301].Name, [2007PF_120301].Identity INTO 2007PF_120301_trimmed FROM 2007PF_120301 WHERE (((Len([FolderPath]))<255));
18. You will get a prompt like the one in Figure 11 when you run the query. Select Yes:
19. After it is done repeat steps 11-18 for the other CSV file that was imported to be part of the comparison. If you have done steps 11-18 for both files you will be comparing then advance to step 20.
20. Originally the FolderPath was imported as a memo field (Long Text if using Access 2013). However we cannot join memo fields in a query. We need to convert them to a text field with a length of 255.
If you got a result greater than zero rows in step 16 this step and the subsequent steps will all be carried out on the table specified in the INTO clause of the SQL statement (in this blog that table is named 2007PF_120301_trimmed).
If you were able to skip steps 17 and 18 this step and the subsequent steps will be carried out on the table you imported (2007PF_120301 in this example).
Open the table in Design view by right-clicking on it and selecting Design View as shown in Figure 12. If you select the wrong tables for the subsequent steps you will get a lot of unwanted duplicates in your final comparison output.
21. Change the folderpath from Memo to Text as shown in Figure 13. If you are using Access 2013 change it from Long Text to Short Text.
22. With the FolderPath field highlighted look to the lower part of the Design window where the properties of the currently selected field are displayed. Change the field size of folderpath to 255 characters as shown in Figure 14.
23. Save the table and close its design view. You will be prompted as shown in Figure 15. Don’t panic. All the folderpaths should be shorter than the 255 characters specified in the properties of the table. The dialog is just a standard warning from Access. No data should be truncated (the earlier queries should have seen to that). Say Yes and repeat steps 20-23 for the other table being used in this comparison. If you make a mistake here remember that you will still have your original CSV files and can always fix the mistake by removing the tables and redoing the import.
24. We have been on a bit of a journey to make sure we prepared the tables. Now for the comparison. Create a new query (as shown in Figure 6) and highlight both tables that have had the FolderPath shortened to 255 characters as shown in Figure 16. Once they are highlight click Add and then close.
25. Drag Folderpath from the table that is the source of your replication to Folderpath on the other database. The result will look like Figure 17.
26. In the top half of the Query Design window we have the tables with their fields listed. In the bottom half we have the query grid. You can make fields appear in the grid in 3 ways:
For this step we need to add:
27. Go to an empty column in the grid. We need to enter the text that will tells us the difference between the two item counts. Type the following text into the column (be sure to use the table names from your own database and not my example):
Note: After steps 25-27 the final result should look like Figure 18. The equivalent SQL looks like this:
SELECT [2007PF_120301_trimmed].FolderPath, [2007PF_120301_trimmed].ItemCount, [2010PF_120301_trimmed].ItemCount, Abs([2007PF_120301_TRIMMED].[ItemCount]-[2010PF_120301_TRIMMED].[ItemCount]) AS Expr1 FROM 2007PF_120301_trimmed INNER JOIN 2010PF_120301_trimmed ON [2007PF_120301_trimmed].FolderPath = [2010PF_120301_trimmed].FolderPath;
28. Run the query using the red “!” shown in Figure 9. The results will show you all the folders that exist in BOTH public folder databases, the itemscount in each database and the difference between them. I like the difference reported as a positive number, but you might prefer to remove the absolute value function.
There is more that can be done with this. You can use Access to run a Find Unmatched query to find all items from one table that are not in the other table (thus locating folders that have an instance in one database, but not the other). You can experiment with different Join types in the query and you can deal with Folderpaths longer than a single text field can accommodate. These and any other additional functionality you desire are left as an exercise for the reader to tackle. I hope this provides you with a process that can be used to compare the item counts between two Public Folder stores (just remember the caveats at the top of the article).
Thanks To Bill Long for reviewing my caveats and Oscar Goco for reviewing my steps with Access.