This post is a contribution from Paul Chan, an engineer with the SharePoint Developer Support team
This blog article lists the popular tools that can be used when troubleshooting performance issues. Note that the tools mentioned here are listed in no particular order.
NOTE: If you feel the need or are pressed for time, feel free to contact Microsoft and create a support case to work with us, instead of letting the resolution of the issue drag on until the last minute.
Task Manager: This is the first tool that can be used to identify a high memory or high CPU scenario; e.g. to check whether it is the w3wp.exe process that is taking a lot of memory or CPU cycles. This tool is always available in the OS. However, it only gives a high-level view with no specific details.
Performance Monitor: This is another tool to identify which process is using high memory or CPU. It also provides a logging option so that you can track the usage over a period of time. However, it still won’t pinpoint where the issue comes from.
Custom Logging: This method is mentioned for both the high CPU and slow issues (in Part 2 and Part 3). To be honest, it is more useful for slow issues than for high CPU issues. This method could be the one that gives you the clearest indication of where the slowness comes from.
Code Review: Although code review is the method mentioned for all 3 types of performance issues, it is not really troubleshooting but trouble-scanning; i.e. it is not shooting at any specific target but more like shooting in the dark, especially when it comes to high memory usage. It is, however, the main tool that fits all 3 types of performance issues.
NP .NET Profiler: I mentioned it in Part 1 and Part 3. This is a nice tool, but it does have its limitations. I think it can also be used in small-scale code tests. One of the good things about this tool (when it applies) is that you don’t need to modify any code, because it attaches itself to the process to do the profiling. The main drawback is that it needs to terminate the process when attaching and detaching itself.
Microsoft SharePoint Development Support: What do we do in our support team? Do we have any secret scrolls or magic spells? Maybe someone on the team does, but not me. Basically, we troubleshoot performance issues using tools similar to the ones I mentioned, as the conditions fit. Of course, we might also use some tools that are only available internally at Microsoft. Personally, I also rely on memory hang dumps to troubleshoot performance issues. However, this topic is too large to cover in a small blog article like this. Memory dump analysis is not internal to Microsoft; you can look it up on the internet and you will find people discussing it.
DebugDiag: This is a tool associated with the memory hang dumps mentioned above. It provides an Analysis feature that performs the memory dump analysis via a script. So it sits somewhere between a memory hang dump analysis done by a human being and the tools that only give a high-level overview of the issue. However, people tend to feel uneasy about data being analyzed automatically by a script. On the other hand, understanding its analysis results could be challenging as well. Still, it is better than nothing, and don’t get me wrong: it is in fact a good tool. I personally use it as a reference in addition to my own memory hang dump analysis. Explaining how to use this tool would take a long time, not to mention how to read its results, so I will not include such information in this blog article. Look up the term DebugDiag on the internet and you will find a lot of references about it.
ULS Log: This tool is often overlooked, because people usually open the logs only when an error pops up along with a correlation Id. Performance issues usually don’t fit this pattern because they are not really errors or exceptions from the code, but a symptom caused by the environment (exposed by the code). However, sometimes the ULS logs provide useful information, such as the time taken by a query, that could help with slow issues. If you are using SPMonitoredScope to do the logging in the code, the log also provides the timestamp and what activities are happening around that time. Looking into the ULS log without a correlation Id is painful and needs patience, but sometimes this is what needs to be done.
NOTE: For any performance issues that you don’t feel comfortable handling, please feel free to create a support case and work with Microsoft support. We do not want the resolution of our customers’ issues to be delayed.
Another common performance issue I have seen when working on customers’ issues is slow response. This blog article addresses the slow scenario.
The symptom of a slow issue can be confused with high CPU usage. When a process is running at high CPU utilization, its response is obviously slow. This condition is not considered a slow issue but a high CPU issue. Similarly, a slow response could also be caused by high memory (or I should say high resource) usage. Failing to release resources held by the class objects instantiated in the code could lead to a slow response, especially if the resources are related to network connections. In theory, high memory usage does not cause a slow response directly, but indirectly. A “real” slow issue means the CPU and memory usage are usually low, which technically means that the w3wp.exe process is waiting on something; most likely some requests to a remote process, web service, SQL, etc.
If you have access to the custom code and can afford to change and redeploy it, then the simplest way is to add time logging to the code; i.e. log the time at selected locations in the code (or the duration of a method or section of the code) to a text file or anything else available. Then comparing the logged times against the code will give you an idea of where the execution is taking a long time. DateTime.Now, System.Diagnostics.Stopwatch, or Microsoft.SharePoint.Utilities.SPMonitoredScope will be useful in this approach.
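As a rough illustration of the time logging idea, here is a minimal sketch that uses only System.Diagnostics.Stopwatch and a text file. The TimingLog class, its Measure method, and the log file format are made up for this example; they are not part of SharePoint or of any existing library.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class TimingLog
{
    // Hypothetical helper: runs the action, measures it with a Stopwatch,
    // appends one line (timestamp, label, duration) to a text file,
    // and returns the elapsed milliseconds.
    public static long Measure(string label, string logPath, Action action)
    {
        Stopwatch watch = Stopwatch.StartNew();
        try
        {
            action();
        }
        finally
        {
            // Log even when the action throws, so slow failures are recorded too.
            watch.Stop();
            File.AppendAllText(
                logPath,
                string.Format("{0:o}\t{1}\t{2} ms{3}",
                    DateTime.Now, label, watch.ElapsedMilliseconds, Environment.NewLine));
        }
        return watch.ElapsedMilliseconds;
    }
}
```

A call could look like TimingLog.Measure("QueryList", logPath, () => { /* section to time */ }); where the label and logPath are whatever suits your code. Wrapping a few suspicious sections this way and comparing the logged durations should point at the slow one.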
If you can’t change the code, check whether the code attempts to connect to any remote services, and then check the remote side of the request to see how long it takes. For example, if the code accesses a web service hosted by a website on an IIS server, checking the Time-Taken column value in the IIS log of the hosting website will give you an idea of how long the request to the web method took.
For SharePoint specific issues, access to lists with a lot of items (e.g. 2000+ items) could be a problem. Complicated CAML queries and expensive operations (e.g. breaking inheritance, creating a new site, etc.) should be avoided. In addition, the ULS log might give you some clues (e.g. a query duration) that show where the slowness could come from.
As for a tool that could help troubleshoot slow response, a code profiler would be nice, but not everyone has access to one. There is one I mentioned in “How to troubleshoot performance issues in SharePoint custom code on your own? Part 1 – High Memory Usage” that is not very friendly but could somewhat do the job (with certain limitations); i.e. NP .NET Profiler. You can read about it here (the article contains some simple walkthroughs):
http://blogs.msdn.com/b/webapps/archive/2012/09/28/troubleshooting-performance-issues-in-web-application.aspx
It can be downloaded from here:
http://www.microsoft.com/en-us/download/details.aspx?id=35370
One of the common performance issues I have seen when working on customers’ issues is high CPU utilization (aka 100% CPU, or high CPU).
Troubleshooting a high CPU issue on your own (i.e. not working with a Microsoft support engineer) could be very complicated. Most of the time there isn’t much that can be done besides code review.
The first thing to confirm is which process is running at high CPU. This can be verified by simply looking at the Task Manager on the server. If it is custom code on a SharePoint server, the w3wp.exe process is what we’re looking at.
From my experience, about 40% of the time a high CPU issue is actually caused by high memory usage. What can happen is that the memory usage is so high that it triggers .NET garbage collection (GC) more often than it should. Since GC is an expensive operation, the CPU usage will go up whenever GC is occurring. In such a situation, resolving the high memory issue will lower the CPU usage. Checking the memory usage of the w3wp.exe process will give you an idea of whether it’s running at high memory or not (or take a look at the article “How to troubleshoot performance issues in SharePoint custom code on your own? Part 1 – High Memory Usage”).
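If you can change the code, one rough way to see whether garbage collection is unusually frequent around a section is to compare GC.CollectionCount before and after it. This is only a sketch: the GcProbe class and its method name are made up for illustration, it counts generation 0 collections only, and the count is process-wide, so collections triggered by other threads are included.

```csharp
using System;

public static class GcProbe
{
    // Hypothetical helper: returns how many generation 0 collections
    // happened in the process while the given action was running.
    public static int CountGen0Collections(Action action)
    {
        int before = GC.CollectionCount(0);
        action();
        return GC.CollectionCount(0) - before;
    }
}
```

A section that reports a large number of collections for a small amount of work is a candidate for the memory review described in Part 1.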
What if the memory usage appears normal? Then we move on to code review. What to focus on is any type of loop (foreach, while, etc.) and any repeated operations. It is usually not a single API call that leads to 100% CPU, but a lot of API calls being executed repeatedly.
Another way to “try” to find out where in the custom code the 100% CPU comes from is logging. Simply add some logging/reporting code (e.g. write to a text file, SharePoint list, etc.) in the code from the beginning to the point where you suspect the issue could be. When the log shows that an expected entry (and everything after it) is missing, it means the issue occurs after the last recorded entry and before the missing entry, because the code executed before that logging call is so busy that it doesn’t get the chance to execute the logging piece until a long time later. This could help narrow down the focus in the code.
There seems to be no clear way to identify a line of code that uses a large amount of CPU power. What is actually needed is information about the CPU usage of each thread in the process. Since the w3wp.exe process runs multiple threads, getting the CPU % of the w3wp.exe process cannot identify which thread is using high CPU. Even if we can tell which thread is using high CPU (e.g. from Performance Monitor’s Thread object, % User Time counter), we still need to tell what code that specific thread is executing. Is this impossible? No, but it is a bit tricky to get such information in practice. The type and amount of information needed to describe the process will not fit in this little blog article, however.
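The checkpoint logging described above could be sketched as follows, assuming a plain text file as the log target. The CheckpointLog class and its Mark and ReadNames methods are hypothetical names for this example only.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public sealed class CheckpointLog
{
    private readonly string _path;

    public CheckpointLog(string path)
    {
        _path = path;
    }

    // Append one entry. File.AppendAllText opens, writes, and closes the file,
    // so the entry is written out before the next (possibly busy) section starts.
    public void Mark(string name)
    {
        File.AppendAllText(_path, DateTime.UtcNow.ToString("o") + "\t" + name + Environment.NewLine);
    }

    // Return the checkpoint names recorded so far, in the order they were written.
    public IList<string> ReadNames()
    {
        List<string> names = new List<string>();
        if (!File.Exists(_path))
        {
            return names;
        }
        foreach (string line in File.ReadAllLines(_path))
        {
            string[] parts = line.Split('\t');
            if (parts.Length == 2)
            {
                names.Add(parts[1]);
            }
        }
        return names;
    }
}
```

Call Mark with a distinct name before each suspicious section. While the CPU is pegged, the last name in the file tells you which checkpoint was passed, and the busy code sits between it and the next Mark call that has not shown up yet.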
One of the common performance issues I have seen when working on customers’ issues is high memory usage. Note that performance issues in Microsoft development support are divided into different categories; i.e. a) High memory usage; b) High CPU utilization; c) Slow. One of the main things we need to determine is what type of performance issue we are working on. This blog article addresses the high memory usage scenario.
There may be special tools or techniques that a Microsoft support engineer uses, depending on the nature of the issue. It is not the intention of this blog to discuss those special things, but rather something that any developer can do. Most of the approaches discussed here are also used by Microsoft support engineers.
Troubleshooting a high memory usage issue is either very simple or very hard. In a single blog article like this one, I only talk about the simple scenario. The majority of high memory usage in custom SharePoint code is related to instantiating SharePoint class objects (e.g. SPSite & SPWeb) but not disposing them after use.
The quite popular tool SPDisposeCheck (http://download.microsoft.com/download/B/4/D/B4D279A0-E159-40BF-A5E8-F49ABDBE95C7/SPDisposeCheck.msi) was created to check for such objects in the dll. The tool was created by Roger Lamb, and its results reference the sections in his own blog articles where he talks about the scenarios. Here are a couple of his articles that are worth a look:
http://blogs.msdn.com/b/rogerla/archive/2008/02/12/sharepoint-2007-and-wss-3-0-dispose-patterns-by-example.aspx
http://blogs.msdn.com/b/rogerla/archive/2008/10/04/updated-spsite-rootweb-dispose-guidance.aspx
Since Mr. Lamb has already written some quality articles, I’m not going to repeat what he has covered but will talk about something else.
There are two main causes of high memory usage; i.e. a) memory leak; b) high memory usage by design.
How do we know there is a high memory usage issue on a SharePoint site? What is considered high? Basically, we measure the memory usage of the w3wp.exe worker process that runs the SharePoint site; e.g. a quick look at the Task Manager will give a rough idea. If it is using close to or over 1GB of memory on a 32-bit OS, then it’s considered high. On a 64-bit OS, having a w3wp.exe process running with 2GB of memory is quite normal, and even up to 4GB. Note that these are just rough figures for reference; the real-life situation could differ on a site-by-site basis.
Another thing to do is to run a Performance Monitor log to check the w3wp.exe process’s Private Bytes and Virtual Bytes counters in the Process object over a period of time. If you see a trend where the counters keep rising, then there is likely a memory leak. If you see a trend where the memory rises at the beginning and then eventually goes flat (at a relatively high memory usage value), then there is high memory usage.
Personally, I don’t consider high memory usage necessarily a problem unless the server cannot handle such memory usage and it causes issues for the site. But it is good to minimize the usage if possible. The problem with the bigger negative impact is a memory leak, which means the process will potentially keep consuming memory until it uses up all the memory or a critical issue occurs on the site.
The way to handle both types of memory usage issues is much the same; i.e. code review (if you’re not going to work with Microsoft support personnel). For memory leaks specific to SharePoint class objects, we are lucky to have the SPDisposeCheck tool. However, developers can write any type of code that uses any class objects. There are other non-SharePoint class objects that also need to be disposed. A few common ones I have seen are WebResponse (also the Stream object from the WebResponse.GetResponseStream method), web service proxies, DirectorySearcher, IO Stream, etc. The general rule is: whatever object you instantiate with the “new” keyword, if its class exposes a Dispose (or Close) method, dispose it. Sometimes it is not only about memory usage, but also about any resources the specific object uses (e.g. network connections).
Wait, I haven’t said much here that helps beyond SPDisposeCheck (which a lot of people already know about) and code review. As for another tool that could help troubleshoot performance issues, a code profiler would be nice, but not everyone has access to one. There is one that is not very friendly but could somewhat do the job (with certain limitations); i.e. NP .NET Profiler. You can read about it here (the articles contain some simple walkthroughs):
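To make the general rule concrete, here is a small sketch using plain file IO rather than SharePoint objects. The DisposeSketch class and its ReadFirstLine method are made up for this example; the point is only that the using statement calls Dispose on both objects even if an exception is thrown in between.

```csharp
using System;
using System.IO;

public static class DisposeSketch
{
    // Both objects are created with "new" and expose Dispose, so each one is
    // wrapped in a using statement. The file handle is released as soon as the
    // block exits, whether it exits normally or through an exception.
    public static string ReadFirstLine(string path)
    {
        using (FileStream stream = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (StreamReader reader = new StreamReader(stream))
        {
            return reader.ReadLine();
        }
    }
}
```

The same shape applies to the objects listed above, e.g. a WebResponse and the Stream returned by its GetResponseStream method would each get their own using statement.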
http://blogs.msdn.com/b/webapps/archive/2012/09/29/faq-on-np-net-profiler.aspx
http://blogs.msdn.com/b/webapps/archive/2012/11/08/list-of-blog-posts-on-np-net-profiler.aspx
Note that this blog is not meant to give training or a walkthrough on how to use the .NET Profiler. Such information can already be found on its own site. I simply mention it and hope that it helps.
This post is a contribution from Aaron Miao, an engineer with the SharePoint Developer Support team
SharePoint 2013 SPSecurityEventReceiver provides methods to trap events that are raised for security. Tim Ferro’s blog provides great details about the class that are missing from the MSDN documentation.
This blog provides one detail about an issue with canceling the RoleAssignmentAdding event. With the code below,
public override void RoleAssignmentAdding(SPSecurityEventProperties properties)
{
    base.RoleAssignmentAdding(properties);
    // more code here: if user is “everyone” cancel the adding
    string errMsg = "This user is not allowed to be added to this site";
    properties.ErrorMessage = errMsg;
    properties.Status = SPEventReceiverStatus.CancelWithError;
}
when adding a user from the _layouts/15/user.aspx page (for example, adding “everyone” to a team site while explicitly specifying Read permission), you would expect an (out-of-box) error page to show up with the error message you set.
This works just fine with the GroupUserAdding event (adding a user without explicitly specifying permission). However, the error page won’t show up when canceling the RoleAssignmentAdding event. This is due to a defect in the SharePoint product. The problem will likely be addressed in the next release of SharePoint.
Fortunately, you can work around the issue by creating a custom error page. This blog has all the details about SharePoint 2013 event receiver redirect. Code like the following (as described in the blog) should launch your custom error page to notify users.
private readonly HttpContext _currentContext;

public UserAddingEventReceiver(ISecurityEventConfig config)
{
    // Capture the HTTP context in the constructor so it can be used later for the redirect.
    _currentContext = HttpContext.Current;
}

// Inside RoleAssignmentAdding, after errMsg has been set:
string url = "CustomErrorPage.aspx";
string urlRedirect = null;
bool flag = SPUtility.DetermineRedirectUrl(url, SPRedirectFlags.RelativeToLayoutsPage, _currentContext, null, out urlRedirect);
_currentContext.Response.Redirect(urlRedirect + "&Error=" + errMsg, true);
With SharePoint 2010 you can set the time zone for a user using code like this:
// Code1
using (SPSite cSite = new SPSite(siteURL))
{
    // RootWeb is owned by the SPSite and is cleaned up with it, so it is not wrapped in its own using block.
    SPWeb curWeb = cSite.RootWeb;
    curWeb.AllowUnsafeUpdates = true;
    SPUser editUser = curWeb.EnsureUser(loginName);
    SPRegionalSettings userRegSettings = new SPRegionalSettings(curWeb, true);
    userRegSettings.TimeZone.ID = 10; // see List of TimeZone ID
    editUser.RegionalSettings = userRegSettings;
    editUser.Update();
    curWeb.AllowUnsafeUpdates = false;
}
After the code is executed, you can browse to a Document library to check a datetime column, for instance, Modified. It shows the datetime in the time zone you set.
However, this is reported to no longer work with SharePoint 2013. The reason is that SharePoint 2013 sets the default value of the User Profile property “SPS-RegionalSettings-Initialized” to false, whereas it is true in SharePoint 2010. You first need to set the property to true like this:
// Code2
using (SPSite ccSite = new SPSite(siteURL))
{
    SPServiceContext serviceContext = SPServiceContext.GetContext(ccSite);
    UserProfileManager userProfileMgr = new UserProfileManager(serviceContext);
    UserProfile updUser = userProfileMgr.GetUserProfile(loginName);
    updUser["SPS-RegionalSettings-Initialized"].Value = true;
    updUser.Commit();
}
If you execute Code1 immediately after Code2, you will find that the user still sees the datetime column value (e.g. Modified) in the old time zone.
This is because the timer job “User Profile Service Application - User Profile to SharePoint Language and Region Synchronization” (scheduled at a 1 minute interval) has not completed yet. Setting ["SPS-RegionalSettings-Initialized"].Value to true causes the language and region information to be synchronized from the User Profile service application to SharePoint users.
To set a time zone for a user on a site, first run Code2, allow some time delay for the timer job “User Profile Service Application - User Profile to SharePoint Language and Region Synchronization” to complete synchronization, and then run Code1.
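The ordering could be sketched like this, with Code2 and Code1 passed in as delegates. The TimeZoneUpdateSketch class and the RunWithDelay name are hypothetical, and the length of the delay is an assumption you need to pick for your own farm; a fixed wait does not guarantee that the timer job has finished.

```csharp
using System;
using System.Threading;

public static class TimeZoneUpdateSketch
{
    // Hypothetical helper: runs the user profile update (Code2), waits for the
    // given delay so the synchronization timer job has a chance to run, and
    // then runs the regional settings update (Code1).
    public static void RunWithDelay(Action code2, TimeSpan delay, Action code1)
    {
        code2();
        Thread.Sleep(delay);
        code1();
    }
}
```

For example, with the two snippets wrapped in methods of your own, a call might be TimeZoneUpdateSketch.RunWithDelay(RunCode2, TimeSpan.FromMinutes(2), RunCode1); where the 2 minutes is only a guess based on the 1 minute schedule of the timer job.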