Thoughts from the EPS Windows Server Performance Team
At least once a week, someone on the Performance team will get a customer call concerning hangs or resource depletion on their file server. The file server in question is used for user home folder storage and users are accessing Outlook Personal Storage (.pst) files stored on the server from their client. The issue will manifest as either a server hang, or PagedPool depletion (Event ID 2020). Oftentimes the issue will occur first thing in the morning - when users are logging on and launching Outlook. In especially severe cases, the issue occurs several times daily. Sometimes the server will hang for a few minutes and then continue operating for a few minutes - and then hang again. Rinse & repeat. The users are frustrated because of slow access to their data, the server administrators are frustrated because they are tasked with fixing the problem, and upper management is frustrated because everyone else is frustrated.
If you're in this situation - there's good news ... and very bad news. The good news is that this problem is very common and is a known issue. The very bad news (from the customer's standpoint) is that accessing PST files over a LAN/WAN is an unsupported configuration. Some customers are very surprised to hear this, but network-stored PST files have been unsupported since the days of Exchange 4.0. Microsoft KB Article 297019 goes into some detail about the effects of network PST files:
"A .pst file is a file-access-driven method of message storage. File-access-driven means that the computer uses special file access commands that the operating system provides to read and write data to the file.
This is not efficient on WAN or LAN links because WAN/LAN links use network-access-driven methods, commands the operating system provides to send data to or receive from another networked computer. If there is a remote .pst (over a network link), Microsoft Outlook tries to use the file commands to read from the file or write to the file, but the operating system then has to send those commands over the network because the file is not on the local computer. This creates a great deal of overhead and increases the time it takes to read and write to the file. Additionally, the use of a .pst file over a network connection may result in a corrupted .pst file if the connection degrades or fails."
Let's use an example to illustrate the problem and also follow the problem through to its end result.
Let's say that a user sends an e-mail message to 500 users within the company. All of these users have their e-mail delivered directly to their PST file which is stored on the File Server. Some of these 500 users may need to extend their PST files to receive it. To extend a PST, an extra allocation on disk has to be made via NTFS. This locks out the whole volume while free space is allocated and the Master File Table (MFT) is updated. While this is happening for each user, all I/O for the other 499 users is on hold.
Allocating free space can take an extended time, especially if the disk is fragmented. Now factor in multiple users extending their PSTs in the same timeframe, and significant periods of MFT lockout might be observed. To clients, this looks like an inability to access any other file on the volume. On the server, it results in queueing in the server service work queues, and sometimes in SRV 2019, 2020, 2021, or 2022 events being logged. This scenario might also overload the disk(s).
Setting aside the example of one email being sent to a group of users, imagine if you had a couple of hundred users who each have two or three PST files. These users have been with the company for a while, and they rarely (if ever!) delete their email from their PST files. The files continue to grow in size - let's use an average of 1 GB as the size of the PST file. Now consider that when each user launches Outlook, they make a request for two (or three) files, each of them being about 1 GB in size. Then consider what happens when 200 users all launch Outlook around the same time when they get to work. 200 x 3 x 1 = 600 GB of data being requested at the same time. That's an awful lot of Disk & Network I/O to process simultaneously. This is a very common scenario - the file server "freezing" for a few minutes at a time while it tries to service these requests.
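The back-of-the-envelope figure above can be reproduced in a few lines. This is only a sketch: the function name is made up for illustration, and the result is the combined size of the files being opened, as the paragraph counts it, not a measurement of the bytes Outlook actually moves at launch.

```python
def total_pst_gb(users, psts_per_user, avg_pst_gb):
    """Combined size, in GB, of all PST files opened when every user launches Outlook."""
    return users * psts_per_user * avg_pst_gb

# Figures from the example: 200 users, two or three PSTs each, about 1 GB per PST.
print(total_pst_gb(200, 3, 1))  # 600
print(total_pst_gb(200, 2, 1))  # 400
```

Even the lower bound of two PSTs per user puts hundreds of gigabytes of file data in play during the morning logon window.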
The queuing in the server service work queues is what causes this temporary hang. The server service uses work items to handle I/O requests that come in over the network - for example: a request to extend a PST file. These work items are queued in the server service work queues, and from there they are handled by the server service worker threads. The work items are allocated from a kernel resource called Non-Paged Pool (NPP).
The server service sends these I/O requests down to the disk subsystem. If, for reasons mentioned above, the disk subsystem does not respond in time, the incoming I/O requests are queued via work items in the server work queues. Since these work items are allocated from NPP, eventually this resource runs empty. Running out of NPP causes systems to hang eventually (logging an Event ID 2019 in the process).
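To make the queueing behavior concrete, here is a toy model of a fixed pool of work items. It is a sketch under loud assumptions: the pool size, arrival rate and completion rate are invented numbers, `simulate_work_items` is a hypothetical helper, and the real server service is considerably more complex. The only point is that when requests arrive faster than the disk subsystem completes them, the free work items drain to zero and stay there.

```python
def simulate_work_items(total_items, arrivals_per_tick, completions_per_tick, ticks):
    """Return the number of free work items after each tick.

    Each incoming request takes one work item from a fixed pool. The disk
    subsystem completes up to `completions_per_tick` requests per tick, which
    returns their work items. Requests that arrive when the pool is empty are
    simply not admitted in this toy model.
    """
    free = total_items
    history = []
    for _ in range(ticks):
        in_use = total_items - free
        # The disk finishes some outstanding requests and frees their work items.
        free += min(in_use, completions_per_tick)
        # New requests claim work items until the pool is empty.
        free -= min(free, arrivals_per_tick)
        history.append(free)
    return history

# Slow disk: 30 requests arrive per tick, only 10 complete.
print(simulate_work_items(100, 30, 10, 8))  # [70, 50, 30, 10, 0, 0, 0, 0]

# Healthy disk: 10 requests arrive per tick, up to 30 can complete.
print(simulate_work_items(100, 10, 30, 8))  # [90, 90, 90, 90, 90, 90, 90, 90]
```

In the slow-disk run the pool is empty by the fifth tick, which is the toy equivalent of the hang described above.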
Digging down into this from more of a troubleshooting perspective, we can usually see issues caused by the PST files manifested in Poolmon and Perfmon captures. For example, we may see the LSwn pool tag allocation climbing in a Poolmon trace. These allocations are made by SRV.SYS. The size of the allocation is configurable via the SizReqBuf registry value. One allocation is made for each work item used by the server service. When looking at this through Perfmon, you will notice a steady decrease in the "Available Work Items" counter. If Available Work Items reaches zero, then clients may experience difficulties accessing files (any files, not just the PST files!). You may also experience 2019 errors if the problem lies with LSwn allocations (Non-Paged Pool depletion).
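If you have a Perfmon log exported to CSV, a short script can flag the steady decrease in "Available Work Items" described above. This is a minimal sketch, assuming a CSV with one header row and a column whose header contains the counter name. The function names and the sample log are made up for illustration, and real exports will carry longer, server-specific column headers.

```python
import csv
import io

def available_work_items(csv_text, counter_name="Available Work Items"):
    """Return the samples from the first column whose header contains counter_name."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    col = next(i for i, name in enumerate(header) if counter_name in name)
    samples = []
    for row in data:
        try:
            samples.append(float(row[col]))
        except (ValueError, IndexError):
            continue  # skip blank or malformed samples
    return samples

def is_draining(samples):
    """True if the counter never rises between samples and ends lower than it started."""
    if len(samples) < 2:
        return False
    never_rises = all(b <= a for a, b in zip(samples, samples[1:]))
    return never_rises and samples[-1] < samples[0]

# Made-up sample log; the values are invented for illustration.
sample_log = "\n".join([
    '"Time","Available Work Items"',
    '"09:00:00","48"',
    '"09:00:15","31"',
    '"09:00:30","17"',
    '"09:00:45","4"',
    '"09:01:00","0"',
])

samples = available_work_items(sample_log)
print(samples)               # [48.0, 31.0, 17.0, 4.0, 0.0]
print(is_draining(samples))  # True
print(samples[-1] == 0)      # True: the counter hit zero
```

A strictly non-increasing check is deliberately crude. On a real server the counter will bounce around, so treat this as a starting point rather than a detector.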
Another tag that highlights the issues with the PST files is the MmSt tag. This tag represents Mm section object prototype PTEs - a memory management-related structure used for mapped files. Put a different way, this is the pool tag that is used to map the OS memory used to track shared files. MmSt issues often manifest as Paged Pool depletion (Event ID 2020).
Is there any server-side tweaking that can be done to mitigate some of these effects? Yes. Is there any guarantee that this will resolve the problem completely and indefinitely? No. As an environment continues to scale up, the problem will continue to manifest itself despite all the tweaking that we can do. At some point, the tweaking itself may contribute to the problem because we've reached a point where the server simply cannot handle the workload.
(Many thanks to Rob and Kevin from our CPR team for their technical input!)
- CC Hameed
This is a problem I am having with Microsoft Office Outlook
Cannot start Microsoft Office Outlook. Cannot open the Outlook window. The set of folders cannot be opened. The file C:\Documents and Settings\Dale\Local Settings\Application Data\Microsoft\Outlook\Outlook.pst is in use and cannot be accessed. Close any application that is using this file, and then try again. You might need to restart your computer.
If you want to create a lot of calls to your help desk, end up on a lot of conference calls, stay up late at night troubleshooting weird issues, and attend a lot of meetings, have someone blindly implement a corporate solution to save .pst files to the fleet of W2k3 Standard builds used for your user homedirs.
Across the WAN we have many locations around the country with W2k3 Standard servers, each with 600-700 GB of storage, most of which is used for user homedirs. The builds are on well-known hardware from a premier vendor.
A couple of years back someone implemented a corporate solution to save mail archives to the local servers in each office. It wasn't long after that the problems began to roll in, and they went on for several months.
Some of the major issues were:
- Users would get kicked out of the server or couldn't connect to it at all. Their network drive mappings to the server would disappear.
- Sporadically, administrators could not RDP to the servers, and when they could, the desktop and window icons would be partially blacked out.
- Ntbackup would not run correctly. The backup logs showed zero bytes and zero files backed up, but stated that the operation completed successfully.
Server and network management was livid about the entire issue, as they had no idea that desktop support had implemented the .pst solution. Being stubborn, they insisted that the solution be abandoned. Since it was so widespread, there was no way to back out of the .pst solution. I worked with Microsoft for several weeks and we came up with a tweak to the paged pool that has worked now for going on two years. I've implemented the same solution on every single server that has had the issue, and each issue listed above went away and never came back.
Although things have worked out for us, I would suggest that you do not implement saving .pst files via UNC or drive mappings unless you hire additional staff to handle the increased call volume you'll receive from your user population. Find a different solution.
Can you share the tweak information that fixed this problem?
Are these recommendations for storing or accessing? What if a PST is only stored on the network drive and never accessed?
@Neal - if all you're doing is storing the .PST files on the server and no clients attach to them, then you are OK. Once the clients start attaching to them across the wire, then you may start seeing issues.
@Kates - take a look at KB312362 (referenced in the post) - there is a registry modification that changes the threshold at which Paged Pool memory is trimmed.
Although you can make registry changes to trim the paged pool memory, you still risk corruption of the PSTs and other files residing on a LAN/WAN volume. To be more specific, if the servers are clustered, then hangs, MPIO errors, and NTFS errors may occur. The server will note/suggest running chkdsk, but unfortunately as long as PST files are stored and accessed, the symptoms will not go away.
There's a reason Microsoft DOES NOT SUPPORT THIS CONFIGURATION. Any argument against their stance on support is insane, as THEY are the creators of the technology.
Your best bet is to have users be responsible for their own data. You have another alternative. For SMBs, users should store their own PSTs. For "Enterprise Companies", most have some sort of vault/backup/archiving of email. As mentioned earlier in this thread, EAS and Symantec Enterprise Vault are solutions.
Invest in proper design, as the costs of support or re-design will be greater in the long run.
We have many users who are happily using PSTs over the network. The difference is that they are on NetApp NAS.
Question to the performance team: will the x64 version of Windows (2003 or 2008) also suffer from paged pool depletion if the users have networked PSTs?
From the PsTools utilities: http://technet.microsoft.com/en-us/sysinternals/bb896649.aspx
psfile.exe | find /i ".pst"
Will give you an idea of the .pst file usage on your file server.
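If you would rather count the open PSTs than eyeball them, the same filter is easy to reproduce once you have captured the psfile output as text. A minimal sketch: `pst_lines` is a hypothetical helper, and the sample text below is made up and only approximates psfile's layout.

```python
def pst_lines(psfile_output):
    """Return the lines containing '.pst', case-insensitively, like find /i ".pst"."""
    return [line.strip() for line in psfile_output.splitlines() if ".pst" in line.lower()]

# Made-up sample output; paths and users are invented for illustration.
sample_output = "\n".join([
    r"[101] D:\Users\alice\archive.pst",
    r"    User:   alice",
    r"[102] D:\Users\bob\Budget.xls",
    r"    User:   bob",
    r"[103] E:\Users\carol\Mail\OLD.PST",
    r"    User:   carol",
])

matches = pst_lines(sample_output)
print(len(matches))  # 2
for line in matches:
    print(line)
```

Like the find command, this is a plain substring match, so a file named something like notes.pst.bak would be counted too.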
I have about 180 open pst files on my file share as I speak, and it's not all bad (though distributed across 3 drives)
As far as "Don't do it!", I would like to introduce Microsoft to Layer 8 of the OSI model - The political layer:
True, relying on users to manage their own .pst files in an enterprise environment is bad regardless of the file type's performance. Budget issues aside, our limited hardware and backup windows prevent our users from having mail stores over 650 MB. 3rd party vaulting solutions are not anything we will be looking at for the foreseeable future either, though I'm trying to hype it up so executives would at least want it for themselves (start out at 25 licenses for executives and their assistants... add some managers... ???... start the road to modernization!). Otherwise, retrieving mail more than a few months old is a multi-hour process, or takes days if it involves our offsite storage service.
So, no matter what, about 30% of the user base relies on PST files on network shares - mostly roaming and thin client users.
These are the settings implemented on our file server. They seem to allow up to 200 .pst files to be open over the network without too many problems, though your mileage may vary depending on actual .pst usage and hardware:
:: PoolUsageMaximum below is not present in registry by default,
:: but default system behavior would have it at 80
reg add "HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /V PoolUsageMaximum /T REG_DWORD /F /D "40"
:: PagedPoolSize is set to 0 by default
:: do not implement this on 32bit 2003 servers with more than 64gb of RAM
reg add "HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /V PagedPoolSize /T REG_DWORD /F /D "4294967295"
:: Chimney is supposed to offload TCP/IP processing to network adapters,
:: but some NIC drivers cause high NonPaged pool usage. Optional if NIC drivers are suspect.
Netsh int ip set chimney DISABLED
:: default setting is usually 1. Setting 2 increases the size of the NTFS lookaside lists and memory thresholds
FSUtil.exe behavior set memoryusage 2
:: as mentioned in previous posts, your RAID controller needs mondo cache
:: and if you're on a SAN, be mindful of other projects sharing the
:: SAN space.
:: also check out
:: for other file server/client performance tips
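The two registry values above are easier to read once decoded. A small sketch, assuming only what the script and KB312362 describe: PoolUsageMaximum is a percentage threshold for paged pool trimming, and 4294967295 is simply the largest value a DWORD can hold. The helper name is made up for illustration.

```python
def describe_settings(pool_usage_maximum, paged_pool_size):
    """Decode the two Memory Management values used in the script above."""
    return {
        # Trimming of paged pool starts at this percentage of pool usage.
        "trim_threshold_percent": pool_usage_maximum,
        # The same DWORD shown the way regedit displays it in hex.
        "paged_pool_size_hex": f"0x{paged_pool_size:08X}",
        "is_max_dword": paged_pool_size == 2**32 - 1,
    }

print(describe_settings(40, 4294967295))
```

So the script asks for trimming to start at 40% instead of the default 80%, and sets PagedPoolSize to 0xFFFFFFFF.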
In conclusion, the article is much appreciated! Still, the archaic high I/O methods used to access .pst files in these modern times boggle many of us. If my users of Word, Excel, Photoshop and AutoCAD had the same problems accessing their files, network shares would be impossible. I find it unbelievable that one user transacting with his or her .pst file over the network can cause more havoc than dozens of MS Access databases being written to at once. Please MS, bring pst files into the 21st century.
What do I have to do with Terminal Server users in a 2003 environment?
They have roaming profiles and network storage only.
Do I have to buy a 3rd party solution?
@Giorigo - If the .PST files are stored in the user's roaming profile and downloaded to the Terminal Server at user logon that is different than having the .PST file stored on a file server and accessed across the LAN / WAN.
Struggled for months with our file server, with a lot of .pst files.
PSTs are used by users for archiving; our email is hosted with a third party.
Users often complained that they couldn't open their PST files.
I was able to reduce the problem by disabling Shadow Copies on that drive!
A few people have asked for instructions on how to determine the issue exists in their environments. It's odd that as technical people we'll provide the instructions for a workaround, but not the instructions to show how we determined the issue actually exists.
Neither the Microsoft article nor the attached RPC troubleshooting document has this information. I thought if a company really wanted the problem gone, they would provide information for companies to prove the issue exists.
Don't get me wrong, from reading there does appear to be an issue, but does anyone have instructions to show how to determine the issue actually exists?
Thank you, in advance.
I do not think there is a magic solution to the performance issue. Once you place PST files on the network and access them from Outlook -- you are asking for trouble.
So there are two fundamental options:
A. Kill the PST (Yeah, let's see you come up with an alternative that is reasonable to manage and reasonably priced)
B. Move the PST files to the workstations, where they belong, and back them up using some PST backup utility like EdgeSafe or MailSync
Worst thing you could ever do. I hate network drives for this kind of application.
Does anyone know if EdgeSafe PST2PST Backup can also back up PST files created by Google's Sync tool?
I cannot find any reference to this at http://www.backupst.com, nor on their site at http://www.datamills.com