Chad's Eclectic Tips and Tricks

Tips some days, not everyday, mostly Windows & SharePoint from Chad Schultz - Premier Field Engineer

How to use Sysinternals Process Monitor and Process Explorer to Troubleshoot SharePoint


Sysinternals Process Tools Descriptions and Information

The Sysinternals web site was created in 1996 by Mark Russinovich and Bryce Cogswell to host their advanced system utilities and technical information. Microsoft acquired Sysinternals in July 2006. These tools are not installed with Windows operating systems by default. They can be downloaded from http://technet.microsoft.com/en-us/sysinternals/default.aspx or http://live.sysinternals.com/. The http://live.sysinternals.com site has the latest public builds of the tools and is more up to date than the TechNet site. All examples are based on at least Process Explorer version 11.31.0.0 and Process Monitor 2.3.0.0. There is no installer or uninstaller for these tools. The first time each program is run, the EULA is displayed; after you accept it, this screen should not reappear. The tools can also be run straight from the web using the following format: http://live.sysinternals.com/Procexp.exe and http://live.sysinternals.com/ProcMon.exe. You can also map a drive letter directly to the public location by running SUBST drive: \\live.sysinternals.com\tools, although this may not work when a proxy server is set.

Process Monitor (Procmon.exe)

Monitors file, registry, network and process activity by process. It collects data while running, and the captured data can be filtered to track down process issues. Process Monitor replaces FileMon and RegMon, except on back-level operating systems.

Process Monitor runs on Windows 2000 SP4, XP SP2, Vista, 2003, 2008 and Windows 7 32 bit and 64 bit.

Process Monitor does not run on Windows 2000 prior to SP4 and may not always be usable for troubleshooting SharePoint Portal Server 2001. If Process Monitor does not run on your server, you must use Filemon and Regmon to monitor Windows 2000 and SharePoint Portal Server 2001. These can be downloaded from the Sysinternals TechNet site, http://technet.microsoft.com/en-us/sysinternals/default.aspx.

Some of the command-line switches are below. Refer to the Procmon.chm file for a complete list.

/Openlog <saved PML log file>

Directs Process Monitor to open and load the specified log file.

/Noconnect

When this flag is present Process Monitor does not automatically start logging activity.

/AcceptEula

Automatically accepts the license and bypasses the EULA dialog.

/Quiet

Don't confirm filter settings on startup.

/Run32

Use this switch to run the 32-bit version of Process Monitor on 64-bit Windows to open logs generated on 32-bit systems

*Note: If you are going to open a .PML log file from a 32 bit computer on a 64 bit Windows computer you will need to enter the /Run32 switch to view the log, or you will get the following error when trying to open the log file.

Procmon_32bitError
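
As a rough illustration of how the switches above combine, the following Python sketch assembles a Process Monitor command line for opening a saved log. The function name build_procmon_command, its parameters and the example paths are hypothetical and exist only for this illustration; only the switches listed above are emitted.

```python
def build_procmon_command(exe_path, log_file=None, run32=False,
                          accept_eula=False, quiet=False, no_connect=False):
    """Return an argument list for launching Process Monitor.

    Only the switches described in this article are used. The function
    and its parameters are illustrative and are not part of Procmon.
    """
    args = [exe_path]
    if accept_eula:
        args.append("/AcceptEula")   # bypass the EULA dialog
    if quiet:
        args.append("/Quiet")        # do not confirm filter settings
    if no_connect:
        args.append("/Noconnect")    # do not start logging automatically
    if run32:
        args.append("/Run32")        # 32-bit Procmon on 64-bit Windows
    if log_file:
        args.extend(["/Openlog", log_file])
    return args


# Hypothetical example: open a log that was captured on a 32-bit server.
cmd = build_procmon_command(r"C:\Tools\Procmon.exe",
                            log_file=r"C:\Logs\wfe01.pml",
                            run32=True, accept_eula=True)
print(" ".join(cmd))
```

On a Windows computer the resulting list could be passed to subprocess.run; building it as a list rather than a single string avoids quoting problems when the log path contains spaces.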

Process Monitor User Interface

Procmon_Menu3

By default, Process Monitor uses virtual memory to store captured data. Use the Backing Files dialog, which you access from the File menu (File->Backing Files...), to configure Process Monitor to store captured data in files on disk. With this option enabled, Process Monitor logs data to disk in its native PML format as it captures it, which is useful when running a long capture.

Profiling

This event class can be enabled from the Options menu. When active, Process Monitor scans all the active threads in the system and generates a profiling event for each one that records the kernel and user CPU time consumed, as well as the number of context switches executed, by the thread since its previous profiling event. Note: the System process is not included in profiling.

Profiling Events

Use this menu entry to open the thread profiling configuration dialog, where you enable thread profiling and set the rate at which thread profiling events are generated. When thread profiling is enabled, Process Monitor captures thread stack traces and CPU utilization that you can use to identify the source of CPU-related performance issues.

Process Explorer (Procexp.exe)

Monitors data on running processes in real time. It does not capture historical data, except for the CPU, Disk, Page File and Network activity graphs, and the data in the graphs cannot be saved.

Process Explorer can be used to view detailed information on the processes currently running on a system, including which images or executable code are loaded in memory for a process and its handles to the registry, the file system and other types of objects. You can also change the priority of, suspend and kill processes, and even certain threads of a process.

Some of the command-line switches are below. Refer to the Procexp.chm file for a complete list.

/e

Request UAC

/t

Start minimized

/p:[r|h|l]

Run at a different priority than normal

/s:[PID]

Select the specified process

Troubleshooting SharePoint with Sysinternals Process Tools

These are some of the SharePoint processes that can be monitored.

·     Windows SharePoint Services Timer (OWSTIMER.EXE)

·     Internet Information Server application pools (w3wp.exe)

·     Office SharePoint Server Search parent process (mssearch.exe)

·     SharePoint Search crawl process (mssdmn.exe)

Although there are other processes that may affect SharePoint, these are the major processes that should be monitored when using the Process Monitor and Process Explorer Sysinternals tools. The first step in troubleshooting with Process Monitor or Process Explorer is to identify whether the issue can be captured by monitoring process activity. Here is a list of some of the SharePoint issues that Process Monitor and Process Explorer can be useful for monitoring.

·     Poor performance and errors in application pools, especially when other web servers in the same farm are running normally.

·     Security event login failures to local resources

·     Timer job errors

·     Crawling local file system content sources

·     Installation errors

Here is a list of SharePoint issues that may not be best to monitor with Process Monitor or Process Explorer.

·     Crawling SharePoint Sites or external sources

·     SQL login failures

·     SQL connection issues

·     Performance issues where detailed historical performance information is needed; use Perfmon.exe instead.

Examples:

Example 1:  Troubleshooting Application Pool Processes

Scenario

In this lab you will learn to view the properties of the SharePoint web application processes, w3wp.exe. By viewing the properties of the running w3wp.exe application pool process you can determine possible issues with permissions, ASP.Net setup, performance and application pool hangs/leaks.

*Important - Do not follow the steps in any of these examples on production or live test systems as you will be running commands that can and will break functionality of the server. Only attempt these examples on computers that are expendable!

Tasks

For all of the Process Explorer labs you will want to enable the ‘Command Line’ and ‘User Name’ columns in the main window view. To do this right-click the columns where it says Name, PID, CPU, Description and Company and select ‘Select Columns’. Check the columns you want to add to the main window.

On your SharePoint web front end server run IISRESET. After this command completes, start Process Explorer. Look for a process with the path of svchost.exe -k iissvcs. There should not be any child process under this svchost.exe process yet, Figure PE1-1.

Figure PE1-1:  Application pool process after restarting IIS

Procexp_Lab1-5

Browse to your SharePoint site to produce some application pool events. After your site is fully rendered on your browser, look back to Process Explorer. You should see at least 1 w3wp.exe child process under the svchost.exe process. If you have the Command Line column in the view you will see which application pool process was started when you browsed to the site. In this case it was the Central Administration site, Figure PE1-2.

Figure PE1-2:  Application pool process after browsing to site first time

Procexp_Lab1-0

You will notice that svchost.exe is the parent process of our application pool process, as shown by the blue line in Figure PE1-2. The application pool process identity can be determined if the ‘User Name’ column is enabled in the main window. Also notice that the svchost.exe process is pink and that the w3wp.exe process is yellow. This process color coding can help in troubleshooting errors with SharePoint sites being unavailable, as we will see in the next exercise.

On your SharePoint web front end server run Internet Information Services Manager, (%SystemRoot%\system32\inetsrv\iis.msc). Select the Web Sites node in the left-hand side TreeView, as seen in Figure PE1-3. Make a note of the web application identifier of your SharePoint web application.

Figure PE1-3:  Finding the IIS site identifier

Procexp_Lab1-3

Next, open a command prompt and type the following command:

CD %SYSTEMROOT%\Microsoft.Net\Framework\v2.0.50727

On 64-bit systems use the following path:

CD %SYSTEMROOT%\Microsoft.Net\Framework64\v2.0.50727

Type the following command. Remember to change the web application ID number to your specific value from Figure PE1-3:

aspnet_regiis.exe -kn w3svc/221506137/ROOT

You should see similar output from the aspnet_regiis.exe command if ASP.Net was uninstalled successfully from the web application:

Start removing any version of ASP.NET DLL at w3svc/221506137/ROOT.

Finished removing any version of ASP.NET DLL at w3svc/221506137/ROOT.

Try to browse to your SharePoint site. You will receive an HTTP 404 error: the page cannot be found. Why is this? The web application and application pool are running in IIS Manager. The SharePoint database is running. To find out why you received the 404 response, open the Process Explorer window and look for the w3wp.exe process with the application pool name in the Command Line column. You will see that the background color of the process is white, not yellow as it was previously. This is because the only code running in the w3wp.exe process of the application pool is unmanaged code; basically just the skeleton of the w3wp.exe process. Remember that one of the requirements of WSS 3.0 and Microsoft Office SharePoint Server 2007 is the .Net 2.0 Framework. Since we removed all versions of ASP.Net from the web application in Figure PE1-3, SharePoint cannot run on that web application. See Figures PE1-4 and PE1-5 for examples of managed and unmanaged w3wp.exe application pool processes.

Figure PE1-4:  Application pool managed code (.Net) process type

Procexp_AppPool2_BadPort80Site1

Figure PE1-5:  Application pool unmanaged code (Win32) process type

Procexp_AppPool2_BadPort80Site2

To re-enable .Net code on the web application in Figure PE1-3, run the following command from the command prompt. Remember to change the web application ID number to your specific value from Figure PE1-3:

aspnet_regiis.exe -sn w3svc/221506137/ROOT

Browse to your SharePoint site. You should not see the 404 error and the site should be visible now. Go back to the Process Explorer window and you will see the w3wp.exe process background is yellow.
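
The remove and re-enable commands used in this example differ only in the switch and the site identifier, so they are easy to generate for a given site. The sketch below is one way to do that in Python; the function name aspnet_regiis_command, its parameters and the default system root are assumptions made for illustration, while the -kn and -sn switches, the framework folders and the w3svc/<ID>/ROOT path format are the ones shown above.

```python
import ntpath  # Windows path rules, available on any platform


def aspnet_regiis_command(site_id, enable, use_64bit=False,
                          system_root=r"C:\WINDOWS"):
    """Build the aspnet_regiis.exe command used in this example.

    enable=False -> "-kn" (remove ASP.NET from the web application)
    enable=True  -> "-sn" (re-enable ASP.NET on the web application)
    The helper itself is hypothetical; it only assembles the argument list.
    """
    framework = "Framework64" if use_64bit else "Framework"
    exe = ntpath.join(system_root, "Microsoft.Net", framework,
                      "v2.0.50727", "aspnet_regiis.exe")
    switch = "-sn" if enable else "-kn"
    return [exe, switch, "w3svc/%s/ROOT" % site_id]


# Site identifier taken from the example in this article.
print(" ".join(aspnet_regiis_command(221506137, enable=False)))
print(" ".join(aspnet_regiis_command(221506137, enable=True)))
```

Substitute the site identifier noted from Figure PE1-3 for your own web application before using commands generated this way.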

Review

Remember that a lot of information can be gleaned from just the main Process Explorer window if the pertinent columns are enabled. You can verify application pool identities, verify that .Net is running in your application pools, and check CPU usage, the number of threads for each process, how many bytes were written and read by each process, and even the time the process started.

Example 2: Basic usage and SharePoint Installation Error

Scenario

In this lab you will learn how to configure Process Monitor for effective SharePoint troubleshooting. You will also track down a generic Windows SharePoint Services 3 install error.

*Important - Do not follow the steps in any of these labs on production systems as you will be running commands that can and will break functionality of the lab server. Only attempt these labs on computers that are expendable!

Tasks

*Note - If you are running this lab exercise in a virtual environment, it is recommended that you take a snapshot before you begin, as this exercise will leave your install unable to run SharePoint or reinstall without a significant amount of work.

Log in to your server using an administrative account that will be used to install WSS 3.0. Then run regedit.exe from Start->Run.

Navigate to the registry key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools. Right-click the Shared Tools key and select ‘Permissions’. Add the currently logged in account and select Deny/Full Control. The WSS 3.0 installer needs access to this registry key to create the registry entries for Web Server Extensions, so the install will fail as seen in Figure PM1-2. The error you will receive is not very descriptive, so if you didn’t know that your account had deny access to the ‘Shared Tools’ registry key it would be difficult to troubleshoot this error without Process Monitor. With Process Monitor, tracking down this error will be straightforward.

Start Process Monitor capturing all activity by enabling the buttons in the red box in Figure PM1-1. Make sure the filter is reset to the defaults before capturing.

Figure PM1-1:  Enabling all activity tracing

Procmon_Lab2_0

Start the WSS 3.0 setup program. At the end of the install process, the installer will fail with the error in Figure PM1-2.

Figure PM1-2:  WSS 3.0 Installation Error

Procmon_Lab1_0-3

Click ‘Don’t Send’ on the error; stop capturing activity in Process Monitor.

While capturing the WSS 3.0 install, Process Monitor captured over 1.2 million events, Figure PM1-3 red box. While this may seem like a lot of events to search through to find the cause of the install error, you will see that by using filtering we can quickly find the cause.

The capture in Figure PM1-3 used the page file (virtual memory) to store events while running. This can be changed to write to a backing file so that virtual memory is not used up during a long capture. To change where the events are stored, go to the menu File->Backing Files...; from there you can change the location to a file. This is useful if you know that virtual memory on the server is low, or you want to take a long capture and do not want the page file to fill up and impact performance.

To start narrowing down the events to the one that caused the installation failure, go to the menu Filter->Filter. In the filter window add 2 filters. The first is Process Name is setup.exe then include. The second is Process Name is msiexec.exe then include. Click OK.

Figure PM1-3:  Filtering

Procmon_Lab1_00

After these filters are loaded you will see only events for the WSS 3.0 install. There is also another way to filter events, directly from the main window, based on the information shown for an event. Right-click on one of the events where the Result column equals ‘SUCCESS’ and click Exclude ‘SUCCESS’. Since most programs do not fail on successful actions, this will cut down the events even further. The right-click menu is column sensitive, so if the PID column is right-clicked it will offer include and exclude for the PID number, and so on.

Figure PM1-4:  Right-click filtering

Procexp_Lab1-6

Enabling these filters should have cut down the pertinent events, but you may still have thousands of events to look through. A faster way to track down a possible cause of the error is to add the following filter: Result is ACCESS DENIED then include, as seen in Figure PM1-5. Another good filter for finding possible causes of errors is Result contains NOT then include. This will show events like Path Not Found and Name Not Found, indicating there may be a problem finding a file or registry value that is needed.

Figure PM1-5:  Filtering only ACCESS DENIED

Procmon_Lab1_0-5

After the ‘ACCESS DENIED’ filter is enabled, there should not be very many events listed. The events should be similar to the ones in Figure PM1-6. From these events you can clearly see that there was an access problem opening the registry key HKLM\Software\Microsoft\Shared Tools\Web Server Extensions.

Figure PM1-6:  Cause of failed WSS 3.0 install

Procmon_Lab1_0-6
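
The same filter logic can be applied offline to a capture saved in CSV format. The following Python sketch is a minimal illustration under stated assumptions: the column names match a default CSV export ("Process Name", "Path", "Result" and so on), the sample rows are made up for this illustration, and filter_events is a hypothetical helper that only imitates the include filters used in this lab rather than reproducing Process Monitor's filter engine.

```python
import csv
import io


def filter_events(rows, process_names, result_is=None, result_contains=None):
    """Yield rows for the given processes that match the Result filters."""
    wanted = {name.lower() for name in process_names}
    for row in rows:
        if row["Process Name"].lower() not in wanted:
            continue
        result = row["Result"].upper()
        if result_is is not None and result != result_is.upper():
            continue
        if result_contains is not None and result_contains.upper() not in result:
            continue
        yield row


# Made-up sample standing in for a CSV export of the capture.
SAMPLE_CSV = r'''"Time of Day","Process Name","PID","Operation","Path","Result","Detail"
"10:01:02 AM","setup.exe","1204","RegCreateKey","HKLM\Software\Microsoft\Shared Tools\Web Server Extensions","ACCESS DENIED",""
"10:01:03 AM","msiexec.exe","1310","CreateFile","C:\Temp\example.dll","NAME NOT FOUND",""
"10:01:04 AM","explorer.exe","988","RegOpenKey","HKCU\Software\Example","ACCESS DENIED",""
"10:01:05 AM","setup.exe","1204","ReadFile","C:\Temp\setup.log","SUCCESS",""
'''

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
installers = ["setup.exe", "msiexec.exe"]

denied = list(filter_events(rows, installers, result_is="ACCESS DENIED"))
not_found = list(filter_events(rows, installers, result_contains="NOT"))

for row in denied:
    print(row["Process Name"], row["Operation"], row["Path"])
```

With real data, replace the in-memory sample with a file opened from the exported CSV; the explorer.exe row shows that events from other processes are dropped even when their result matches.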

Review

One of the most powerful aspects of Process Monitor is its filtering capability. By mastering filtering, you can quickly track down causes of errors related to failed file access, registry access and network activity. For an advanced troubleshooting video on Process Monitor please see Advanced Windows Troubleshooting with Sysinternals Process Monitor.

Example 3: Application Pool Performance

Scenario

In this lab you will learn how to track down long running threads in an application pool to determine performance related issues.

*Important - Do not follow the steps in any of these labs on production systems as you will be running commands that can and will break functionality of the lab server. Only attempt these labs on computers that are expendable!

Tasks

On your SharePoint web front end server run IISRESET. After this command completes, start Process Monitor capturing all activity by enabling the buttons in the red box in Figure PM2-1. Make sure the filter is reset to the defaults before capturing.

Figure PM2-1:  Enabling all activity tracing

Procmon_Lab2_0

Browse to your SharePoint site to produce some application pool events. After your site is fully rendered in your browser, stop Process Monitor on the server. The page should have taken more time than usual to render because the application pool was just restarted when you ran the IISRESET command.

In Process Monitor on the server you will see many events in the main window since the default filter was used to capture the events on the server. In this lab we are only going to concern ourselves with application pool performance, so we are going to filter out all other processes besides the application pool events.

Go to the Process Monitor menu Filter->Filter to bring up the filter window in Figure PM2-2. Select Process Name, is, w3wp.exe then include and click Add. Click OK.

Figure PM2-2:  Filtering only application pools

Procmon_Lab2_2

After filtering the events you will see only application pool events. Use the columns discussed in Process Monitor Lab1 so that you can view the command line column information and process ID. If you have multiple application pools running you may see different process IDs (PIDs) for the w3wp.exe processes. In the command line column you will see a reference to the application pool name, as seen in Figure PM2-3. Note that this Process Monitor capture was done on Windows 2008 Server. If you are using Windows 2003 you will see a different command line format, but the application pool name will still be listed in the command. Make a note of the PID for the application pool you want to view more information on; in this case the PID is 5828 for the application pool ‘SharePoint - 80’.

Figure PM2-3:  Events filter by application pools

Procmon_Lab2_3-2
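
When many w3wp.exe processes are running, pulling the application pool name out of the command line can be scripted. The Python sketch below assumes the pool name follows a -ap switch in double quotes; that format and the abbreviated example command line are assumptions for illustration, so compare them against the Command Line column on your own server before relying on them.

```python
import re

# Assumption: the pool name appears as -ap "<name>" in the command line.
_AP_PATTERN = re.compile(r'-ap\s+"([^"]+)"', re.IGNORECASE)


def app_pool_name(command_line):
    """Return the application pool name, or None if no -ap switch is found."""
    match = _AP_PATTERN.search(command_line)
    return match.group(1) if match else None


# Hypothetical, abbreviated command line for illustration only.
example = r'c:\windows\system32\inetsrv\w3wp.exe -ap "SharePoint - 80" -v "v2.0"'
print(app_pool_name(example))
```

Returning None instead of raising an error makes it easy to skip processes whose command line does not carry a pool name.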

To view more information on the application pool and what it was doing while the capture was taken, go to the Process Monitor menu Tools->Stack Summary… as seen in Figure PM2-4.

Note that viewing the stack works best when the Debugging Tools for Windows is loaded on the system and the symbol path is set via Options->Configure Symbols. The path to the DbgHelp.dll will need to be set to the file from the Debugging Tools. The public symbol server can be used for the symbol path; srv*http://msdl.microsoft.com/download/symbols. See http://www.microsoft.com/whdc/devtools/debugging for more information on downloading and installing the Debugging Tools.

Figure PM2-4:  Viewing the thread stack of an application pool

Procmon_Lab2_4

In the Stack Summary screen you will see on the left-hand side a treeview with 1 node, All. This is the node for all processes that were captured when the trace was running along with other information on the total time and CPU time for all processes.

Expand the All node and you will see a list of w3wp.exe processes since this is what our current filter is set to view. Find the w3wp.exe node that has the PID number from Figure PM2-3 in the Module or Path column as seen in Figure PM2-5.

Figure PM2-5:  Application pool thread information

Procmon_Lab2_5

After finding the w3wp.exe process for the application pool, expand the w3wp.exe process. You will see a list of the threads that were executed in the w3wp.exe process while the trace was running. To find the thread that was using the most time, focus on the % Time column. You can also see how many Process Monitor events were captured for each process and thread, Figure PM2-6 red 1. Expand the thread listed under the w3wp.exe process with the highest % Time value. Keep expanding as seen in Figure PM2-6 red 2; if you expand a function and more than 1 function is listed below the current one, expand the one with the highest % Time value as seen in Figure PM2-7, where the function with address 0x5a5 has a higher % Time than the function with 0x374. While expanding the thread functions, look for function names that may give a clue to what the function does. In this case I stopped expanding when I found a function called VdbThemeManager::loadTheme. Just from the name you can hypothesize that this function looks up the selected theme for the site and loads the theme into the application pool memory to be used to construct the HTML page. In this case the % Time is not that great, Figure PM2-6 red 4&5, but if it was high you could focus troubleshooting on the theme and theme files. You can also see which executable file (DLL or EXE) the function was running from, Figure PM2-6 red 3. In this case it is stswel.dll, one of the main SharePoint DLLs.

Figure PM2-6: Unrolling the stack

Procmon_Lab2_6-2

Figure PM2-7:  Selecting a high % Time fork

Procmon_Lab2_7
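
The "always expand the child with the highest % Time" rule used above is a simple greedy walk down a tree. The Python sketch below shows that walk on a hand-built tree; the node layout (name, pct_time, children), the thread IDs and all of the percentages are made up for illustration and do not come from a Process Monitor export.

```python
def hottest_path(node):
    """Follow the child with the highest % Time from node down to a leaf."""
    path = [node["name"]]
    while node.get("children"):
        node = max(node["children"], key=lambda child: child["pct_time"])
        path.append(node["name"])
    return path


# Made-up stack summary tree, loosely shaped like the one in this example.
tree = {
    "name": "w3wp.exe (5828)", "pct_time": 40.0,
    "children": [
        {"name": "thread 3120", "pct_time": 25.0, "children": [
            {"name": "frame + 0x374", "pct_time": 5.0, "children": []},
            {"name": "frame + 0x5a5", "pct_time": 20.0, "children": [
                {"name": "VdbThemeManager::loadTheme", "pct_time": 12.0,
                 "children": []},
            ]},
        ]},
        {"name": "thread 4012", "pct_time": 15.0, "children": []},
    ],
}

print(" -> ".join(hottest_path(tree)))
```

A greedy walk like this only follows one branch, so when two sibling functions have similar % Time values it is worth expanding both by hand.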

Review

The main advantage of using Process Monitor to track down poorly performing processes is that you can take a Process Monitor trace while the issue is happening live and save the file to view at any time.

These same troubleshooting steps can be taken for any poorly performing processes including the SharePoint timer service, OWSTIMER.EXE, SharePoint search service, mssearch.exe and any other process running on the computer that may be contributing to performance issues.

Example 4: SharePoint Server Error

Scenario

In this lab you will use both of the tools you have seen previously, plus some other Sysinternals tools to effectively troubleshoot an unexpected error in SharePoint.

*Important - Do not follow the steps in any of these labs on production systems as you will be running commands that can and will break functionality of the lab server. Only attempt these labs on computers that are expendable!

Tasks

Make sure you have at least 1 SharePoint web application created on your SharePoint farm/server. Make a note of the application pool identity on this web application. If you followed the Basic install, then the identity should be NETWORK SERVICE. On your SharePoint web front end server run IISRESET /STOP. After this command completes, navigate to %SYSTEMROOT%\Microsoft.Net\Framework\v2.0.50727. Right-click on the “Temporary ASP.NET Files” folder and select Properties. Select the Security tab. Click the Edit button to edit the permissions of this folder. Add the application pool account if it is not listed. Select the “Full control” “Deny” checkbox for this account.  This will make the application pool account unable to write files to this location breaking any application pool running .Net 2.0 code under the user account, including SharePoint.

Run IISRESET /START and browse to the web application. You will see the following page.

Figure 5-1:  SharePoint site error

Procmon_Lab3_1

Run Process Explorer to see if there is any information it provides that can help in identifying this error. After Process Explorer opens navigate to the w3wp.exe process for the application pool. Remember to enable the User Name and Command Line columns by right-clicking the column bar and selecting, Select Columns…

You should see something similar to Figure 5-2.

Figure 5-2:  W3wp.exe process information

Procmon_Lab1_2

From this we can see that the w3wp.exe process is yellow, which means it is running .Net code. To check the version of .Net code running, make sure the lower pane is visible by selecting View->Show Lower Pane in the Process Explorer menu. Also make sure the lower pane is showing DLLs by selecting View->Lower Pane View->DLLs in the Process Explorer window. Sort the lower pane by Name, find aspnet_filter.dll and make a note of the Version shown. If it starts with 2.00.50727, then the application pool is running .Net 2.0 code.

See Figure 5-3 for more information on selecting the DLL lower pane view and finding the DLL version information.

Figure 5-3:  View DLL information for a process

Sys_Lab4

In this case Process Explorer is telling us that things look to be working as far as the basic process is concerned.

Next, let’s run Process Monitor using a filter on the w3wp.exe process and capture events while refreshing the SharePoint page. Right-click on one of the SUCCESS results and select Exclude ‘SUCCESS’. You will see many BUFFER OVERFLOW, NO SUCH FILE, NAME NOT FOUND and other results displayed. Remember that the result that causes most issues is ‘ACCESS DENIED’, so create a filter to include ‘ACCESS DENIED’. You should see at least 1 access denied event as seen in Figure 5-4.

Figure 5-4:  Access Denied Event

Sys_Lab5
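
Before narrowing down to a single result, it can help to see how many events of each result type a process produced. The Python sketch below counts results for w3wp.exe after excluding SUCCESS; the rows are made-up dictionaries standing in for exported events, and summarize_results is a hypothetical helper written for this illustration.

```python
from collections import Counter


def summarize_results(rows, process_name="w3wp.exe", exclude=("SUCCESS",)):
    """Count events per Result for one process, skipping excluded results."""
    counts = Counter()
    for row in rows:
        if row["Process Name"].lower() != process_name.lower():
            continue
        if row["Result"] in exclude:
            continue
        counts[row["Result"]] += 1
    return counts


# Made-up events for illustration only.
events = [
    {"Process Name": "w3wp.exe", "Result": "SUCCESS"},
    {"Process Name": "w3wp.exe", "Result": "BUFFER OVERFLOW"},
    {"Process Name": "w3wp.exe", "Result": "NAME NOT FOUND"},
    {"Process Name": "w3wp.exe", "Result": "NAME NOT FOUND"},
    {"Process Name": "w3wp.exe", "Result": "ACCESS DENIED"},
    {"Process Name": "OWSTIMER.EXE", "Result": "ACCESS DENIED"},
]

for result, count in summarize_results(events).most_common():
    print(count, result)
```

A low count for ACCESS DENIED next to large counts for the other results is a hint that the few denied events are the ones worth reading first.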

Since an access denied event occurred, the next step is to check the permissions on the folder structure specified in the Path column. Another Sysinternals tool, AccessEnum.exe, can be used to view the permissions of a file or folder and to list the child files and folders that have different permissions.

The AccessEnum.exe tool can be downloaded from the same sites as Process Explorer and Process Monitor listed at the beginning of this article.

Run AccessEnum.exe on the SharePoint server and choose the folder from the Process Monitor Path column: C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\Temporary ASP.NET Files. By default, the scan shows only files and folders that have more restrictive permissions than the selected folder. This can be changed from the menu Options->File display options to include any permissions that differ from the selected folder, Figure 5-5.

Figure 5-5:  AccessEnum.exe, Changing Display Options

Sys_Lab2

Click the scan button. You will see, depending on the display option and permissions, a few or many paths and their permissions. In the Deny column you should see the application pool user account listed on the selected folder as in Figure 5-6 in the red box.

Figure 5-6:  AccessEnum.exe, Access Deny

Sys_Lab1

After the scan is complete the permissions can be saved in text format to look for differences between the current permissions and known good permissions, as seen in Figure 5-6 green box.
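
Comparing a saved scan against a known good one can be done with any text diff tool. As a minimal sketch, the Python code below uses the standard difflib module to list only the lines that changed; the one-line "path | Deny: account" layout is made up for illustration and is not the actual AccessEnum save format, and permission_changes is a hypothetical helper.

```python
import difflib


def permission_changes(known_good_lines, current_lines):
    """Return only the removed (-) and added (+) lines between two scans."""
    diff = list(difflib.unified_diff(known_good_lines, current_lines,
                                     fromfile="known_good", tofile="current",
                                     lineterm="", n=0))
    body = diff[2:]  # skip the two file header lines, if any
    return [line for line in body if line.startswith(("+", "-"))]


# Made-up scan lines for illustration only.
folder = r"C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\Temporary ASP.NET Files"
known_good = [folder + " | Deny: (none)"]
current = [folder + r" | Deny: NT AUTHORITY\NETWORK SERVICE"]

for line in permission_changes(known_good, current):
    print(line)
```

Keeping a saved scan from a healthy server in the farm gives you the known good side of this comparison when a single server misbehaves.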

The root cause of the SharePoint failure has been determined. Remove the Deny permission for the application pool and run IISRESET. When you browse to the web site it will compile and there should be no errors.

Review

While Process Explorer and Process Monitor may not be the best tools for every troubleshooting situation, they can be an invaluable resource, especially when there are no clear Event Log or ULS log entries pointing to the cause, or when an issue only affects a certain server and not others in a farm. In SharePoint, if you are relatively sure the issue is not database related and that there may be a problem with how the server is working locally on the file system or registry, then it can’t hurt to load up Process Explorer and/or Process Monitor to see if the problem can be tracked down with these great tools.

Attachment: Troubleshoot SharePoint and Office with Sysinternals.pptx