|
|
-
After my last post back in March, my role at work changed and rather than focusing on reactive, break/fix MOSS issues, I was focused heavily on deploying MOSS from scratch in several environments. However, as of Monday I will be taking on a new role that incorporates both break/fix and deployments of MOSS, and I will begin posting more frequently.
If you are still on the fence about whether to blog, I definitely recommend giving it a shot. At first it was very painful for me to get into a “blogging” mindset and I can’t say that I am anywhere close to obtaining one in the near future. However, as I look back at this blog and the posts that I made, it was definitely worth it for the following reasons:
1. It helped me to organize my thoughts, document them and share them with others easily.
2. It provided me a forum where I could combine my two passions: problem solving and helping people.
3. It forced me to become more knowledgeable about all of the different technologies incorporated in MOSS (e.g. SQL, OS, Active Directory, IIS, Networking, etc.).
As I prepare for my new role on Monday, I found it interesting to see that the first thing that I will be focusing on is optimizing performance for several of our company’s business critical MOSS portals. Because I took the time to blog earlier this year, I feel like I have now come full circle, as this blog is the first place I am going to look to get ramped up on best practices.
wayne
|
-
MOSS Design Considerations taken from Optimizing Office SharePoint Server for WAN environments & Optimizing custom Web parts for the WAN and my experiences
Recommendation: Minimize the number of secured items on a page
Reason: When a user authenticates, two things happen. First, the system validates credentials to determine who the user is. Second, the role provider enumerates the list of SharePoint groups to which the user belongs. Each time a page is requested, the role provider is called again to enumerate all of the groups to which the user belongs.
However, this call for group membership can happen multiple times on a single page. For example, the default Collaboration Portal site template page requires two calls to the role provider when you go to the home page—one for the page itself and one for the image on the page. Each image that is stored in a SharePoint library and is on the page will force an additional call to the role provider to verify permissions, even if all of the images are stored in the same image library. That verification occurs whether the images are added as fields on the page—that is, part of the page's content—or whether they are added to the master page for the site.
Benefit: Reduces the number of security checks and therefore reduces the risk of added latency.
Recommendation: Minimize the number and size of images/webparts
Reason: Since the browser only downloads two items at a time from a given server, the more images/webparts there are on a page, the more time it will take to download the page.
Benefit: Reduces the amount of time it takes to render a page and reduces the amount of data sent over the network
The recommended way is to embed multiple images in a single file and then reference individual images in your page. Not only will file download size decrease, but fewer files result in less network traffic. It is more complicated to author pages by using this technique, but in situations where every round trip and file size counts, it can prove to be a valuable way to help improve performance.
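To put a rough number on why fewer files help, here is a minimal sketch of the arithmetic. It assumes the browser fetches items in "waves" of two parallel connections and that each wave costs one network round trip; it ignores transfer time entirely, so treat it as a lower bound. The function name, the 12-image page and the 200 ms round trip are all made up for illustration.

```python
import math


def estimated_fetch_time_ms(item_count, round_trip_ms, parallel_connections=2):
    """Rough lower bound on fetch time: items are requested in waves of
    `parallel_connections`, and each wave costs one round trip.
    Transfer time is deliberately ignored (an assumption of this sketch)."""
    waves = math.ceil(item_count / parallel_connections)
    return waves * round_trip_ms


# Hypothetical page: 12 separate images over a 200 ms WAN link.
separate_images = estimated_fetch_time_ms(12, 200)
# The same images combined into a single file.
combined_image = estimated_fetch_time_ms(1, 200)

print("12 separate images:", separate_images, "ms of round trips")
print("1 combined image:  ", combined_image, "ms of round trips")
```

On a low-latency LAN the difference is barely noticeable, which is why this kind of tuning only tends to pay off for remote users.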
Recommendation: Store all images locally on the farm using the relative path versus hardcoded links or the absolute path
Reason: By storing images on a remote location, there is now a dependency on the availability, permissions, and the network to the remote location.
Benefit: The page renders more quickly because you can take advantage of BLOB caching. There is a reduction in network traffic to remote locations, and availability and permissions can all be controlled via the site. Also, when the site is restored in PPE or in another test environment, testing and deployments cannot affect production because the site is self-contained.
Recommendation: Reuse existing css files, versus having every group create their own
Reason: By reusing those styles, you can minimize the impact on the page because the page won’t need to download an additional style sheet just to support the different webs. In addition, after the initial site visit the user will already have the core.css file downloaded to their cache. If we do require custom styles, then consider using a custom style sheet that can be used with the blob cache. If you store it in a document library, the style sheet can have a cacheability directive associated with it so that it doesn’t need to be downloaded after the initial page hit. This will have a lower impact on the site than using inline styles, for example, which would be transmitted over the network every time the page is rendered.
Benefit: By using styles that are part of core.css, you ensure that no additional downloads are required for style support. Therefore you are reducing the number of round trips and the amount of data that needs to be downloaded.
Recommendation: Avoid using ViewState in custom web parts, if possible.
Reason: Web Parts may need to track such information as the user, the request, and the data source. In general, there are two common state mechanisms you can use with Web Parts: ViewState and server Cache. Using the server Cache class allows you to store state information at the server level. The drawback to using server Cache is that it is not really intended to be used as a per-user state mechanism (although depending on the circumstances it can be made to work that way). Also, the cache information is not replicated throughout all the front-end Web servers on the farm. If your part depends on having that state information present regardless of which front-end Web server a user request ends up hitting, then server Cache is not a good choice.
In that scenario, another option is to use Session State. Session State is turned off by default, but is enabled when you activate Microsoft Office InfoPath Forms Services (IPFS) in a farm. When it is enabled, it uses Microsoft SQL Server to track state, which means that session state values can be used no matter which front-end Web server receives the HTTP request. The drawback to session state is that data stays in memory until it is removed or expires. Large datasets stored in session state can therefore degrade server performance if not carefully managed. Because of these constraints, we do not recommend using session state unless absolutely necessary.
You can also experiment with setting ViewState to off for all pages in the site by editing the web.config file for a Web application. It contains a pages element that has an enableViewState attribute. If you have a significant concern about the size of ViewState in the page, you can experiment with setting this attribute to false (it is true by default). If you do this, you need to thoroughly test your site and all of the functionality to ensure that it works properly, because some controls and Web Parts may expect that ViewState will be on.
Benefit: In a low-bandwidth or highly latent environment, avoid ViewState if possible because it adds content to the page both on the download and on any postback. This would apply to other forms of state that also involve transmitting data over the network, such as query strings, hidden fields, and cookies.
Recommendation: For lists, web parts or any other database calls (e.g. excel services, infopath, product studio, data view web part, etc.), strike a balance between how many trips to the server are required and how much data should be retrieved for a request. Minimize the number of data connections or web parts applied to a page. Also, for controls that emit rows of data, include a property that allows an administrator to control how many rows are displayed.
Reason: Depending on the latency and bandwidth in which the control is going to be used, end users have flexibility to either turn up or down the amount of data that is rendered in each page request. This will impact the number of requests that are required to view all of the data. If the number of rows returned is a property that can be set by end users, consider adding constraints so the choices of one user don’t overwhelm the network.
Benefit: Reduces the page rendering time, reduces network traffic and protects the user from unintentionally configuring their call wrong.
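The row-count trade-off above can be sketched in a few lines. This is only an illustration of the idea, not how any particular web part implements it: the function names, the default of 25 rows and the administrator cap of 100 rows are all assumptions I made up for the example.

```python
import math


def effective_row_limit(requested, admin_max=100, default=25):
    """Rows to render per request: the user's choice, bounded between 1
    and a cap set by the administrator (cap and default are hypothetical)."""
    if requested is None:
        return min(default, admin_max)
    return max(1, min(int(requested), admin_max))


def requests_needed(total_rows, rows_per_request):
    """How many trips to the server it takes to view all of the data."""
    return math.ceil(total_rows / rows_per_request)


# A user asks for 5000 rows per page; the cap keeps the response small,
# at the cost of more round trips to see everything.
rows = effective_row_limit(5000)
print(rows, "rows per request,", requests_needed(1000, rows), "requests for 1000 rows")
```

Turning the cap up means fewer, heavier requests; turning it down means more, lighter ones. Which end is better depends on whether the users are limited by latency or by bandwidth.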
|
-
MOSS Server Performance Considerations taken from Optimizing Office SharePoint Server for WAN environments and my experiences
Recommendation: use x64 hardware on all servers to optimize page downloads by minimizing number of round trips between client computers and server computers.
Reason: With x64, the web server's worker processes are not limited to the 2 GB of user-mode memory that a 32-bit process gets by default.
Benefit: Reduces network traffic and latency due to FE servers hitting the memory limit and needing to recycle.
Recommendation: apply Windows Server 2003 SP2
Reason: A list of all of the reasons can be found here: http://support.microsoft.com/kb/914962/en-us
Benefit: Optimizes the Front End server’s performance
Recommendation: Defragment the Front end Server’s drives
Reason: best practice for optimal disk performance
Benefit: Optimizes the Front End server’s performance
Recommendation: apply MOSS SP1 and post SP1 rollup fixes
Reason: The service pack and rollup fixes address many of the known bugs that currently exist.
Benefit: Reduces network traffic and latency due to FE servers hitting the memory limit and needing to recycle.
Recommendation: Use a dedicated FE crawl target that is not in NLB (i.e. NLB is not installed)
Reason: By default, Office SharePoint Server 2007 uses all Web servers to crawl content in the server farm. When your server farm is configured to use all Web servers for crawling, the index server sends requests to each Web server in the farm.
Benefit: Crawling content in your farm places a heavy load on the Web servers. This tends to cause spikes and surges in network traffic, and is CPU-intensive and memory-intensive. By pointing to a dedicated FE server, users aren’t impacted negatively.
Recommendation: All server computers in the farm are on the same network segment. There is no switching in routers at the data layer. The network link between a Web server and the database server has less than 1 millisecond (ms) of latency, and the Web server is located 10 or fewer miles from the database server.
Reason: Routers and switches will increase latency even if the network connection between these is very fast. If the type of load the Web server is serving is some subset of user browse requests, we expect Office SharePoint Server 2007 to tolerate some latency between the Web server and the database server. On the other hand, pages with many or custom Web Parts, Stsadm commands, and search crawls are likely to fare less well.
Benefit: Reduces latency and provides a better experience for users in remote geographic locations
Recommendation: Do not use web gardens and BLOB cache at the same time
Reason: Because only one process can acquire the lock necessary to manage the cache, successful use of the cache depends on which worker process services a request. If a Web garden process that does not hold the BLOB cache lock services a request, the content it sends in response will not have caching directives associated with it.
Benefit: Reduces the number of requests and the amount of data sent over the network by allowing BLOB Caching to work.
Recommendation: Optimize Blob Cache by including the most used file extensions and increasing the cacheability value
Reason: The BLOB cache enables you to configure caching directives that are associated with items served from publishing site lists, for example, the Pages library and Site Collection Images. When the browser on the client computer encounters a caching directive, it detects that the item it is retrieving can be saved locally and does not need to be requested again until the caching directive expires. When the BLOB cache is turned on, a couple of different things happen. First, each time a cacheable item is requested, MOSS searches the hard disk drive of the Web server that received the request to see if a copy exists locally. If it does, the file is streamed directly from the local disk to the user. If it isn't on the local disk yet, a copy of the item is made from the SQL database where it is stored, and then the item is sent to the user making the request. From that point forward, all requests for the item can be served directly from the Web server until the item's cacheability value indicates that it has expired. The other thing it does is append a cacheability header to the item when the item is sent to the client. This header instructs the browser how long the item should be cached. For example, if a picture had a cacheability value of three days, the browser uses the copy of the image it has in its local cache if the picture is requested again within the next three days; it does not request it from the server again.
Benefit: That results in better performance in the server farm by reducing contention on the database server. In a geographically distributed environment, this is critically important because it reduces the number of items requested and sent over the network.
Please note that you need to ensure that the “Office Sharepoint Server Publishing” feature is activated on all Publishing sites. Also, you should update the location to the best performing drive, add in extra paths, and add in the max-age to expire based on your situation. For example:
Current setting: <BlobCache location="C:\blobcache" path="\.(gif|jpg|png|css|js)$" maxSize="10" enabled="false"/>
Example Setting: <BlobCache location="E:\blobcache" path="\.(gif|jpg|png|css|js|htc)$" maxSize="10" max-age="86400" enabled="true"/>
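Before rolling out a new path expression, it can help to check which URLs it would actually pick up. The sketch below tests the example extension list (gif, jpg, png, css, js, htc) with Python's regular expression engine. That is an assumption: SharePoint evaluates the expression with .NET, and I have also assumed case-insensitive matching, so use this only as a sanity check. The sample URLs are made up. It also converts the max-age value from seconds to hours.

```python
import re

# Extension list from the example setting, written as a regular expression.
BLOB_PATH = r"\.(gif|jpg|png|css|js|htc)$"

# max-age is expressed in seconds.
MAX_AGE_SECONDS = 86400


def is_blob_cached(url_path, pattern=BLOB_PATH):
    """True if the URL path ends in one of the listed extensions.
    Case-insensitive matching is an assumption of this sketch."""
    return re.search(pattern, url_path, re.IGNORECASE) is not None


# Hypothetical URLs from a publishing site.
for path in ("/SiteCollectionImages/logo.PNG",
             "/Style Library/custom.css",
             "/Pages/default.aspx"):
    print(path, "->", "cached" if is_blob_cached(path) else "not cached")

print("max-age of", MAX_AGE_SECONDS, "seconds =", MAX_AGE_SECONDS // 3600, "hours")
```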
Recommendation: Optimize IIS Compression by raising the IIS compression level from 0 to 9 and adding certain dynamic file types
Reason: When MOSS is installed, setup configures IIS to compress the static file types .htm, .html and .txt; it compresses the dynamic file types .asp and .exe. After a site has been hit by a few users, you can verify that compression is working by viewing the %WINDIR%\IIS Temporary Compressed Files directory on a Web server. It should contain multiple files, which indicates that static files have been requested and IIS has compressed a copy of them and stored them on the local drive. When that file is requested again, whether it's the same user or not, the compressed version of the file is served directly from this folder. Dynamic files can be compressed as well, but they are always compressed on the fly; copies are not kept on the local Web server. It may be advantageous to compress additional file types. For example, it probably makes sense to also compress the static file types .css and .js; it may also make sense to compress the dynamic file types .axd and .aspx.
Benefit: The IIS Compression can result in significant bandwidth savings. For example, the core.js file is included on every SharePoint page. When it's uncompressed, it is 257 KB; after compression, the file is only 54 KB without performing additional tuning to IIS compression.
Please note: Before applying IIS Compression, you will need to ensure that you only include files that are well-suited to being compressed. For example, .jpg files are not a good candidate for compression because the file format is inherently compressed already. Also, 2007 Microsoft Office system file types such as docx, xlsx and pptx are not a good choice for compression because the files are not served directly from the server; instead, they are routed through the different ISAPI filters that are used to manage the rich integrated end user experience for Microsoft Office content. In addition, in the 2007 Microsoft Office system, these file types are inherently compressed.
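A quick way to see why some file types are worth compressing and others are not is to compress a sample of each and compare sizes. The sketch below uses Python's gzip module as a stand-in for IIS compression, which is an assumption: the levels and exact numbers will not match IIS. Repetitive script text plays the role of a .js file and random bytes play the role of an already-compressed format such as .jpg.

```python
import gzip
import os


def compression_ratio(data, level=9):
    """Compressed size divided by original size (lower is better)."""
    return len(gzip.compress(data, compresslevel=level)) / len(data)


# Stand-in for a script file: highly repetitive text.
script_like = b"function getItem(id) { return document.getElementById(id); }\n" * 500
# Stand-in for an already-compressed file: random bytes do not compress.
already_compressed_like = os.urandom(len(script_like))

print("script-like ratio:        %.3f" % compression_ratio(script_like))
print("already-compressed ratio: %.3f" % compression_ratio(already_compressed_like))
```

The second ratio comes out at roughly 1.0 or slightly above, meaning the server spends CPU time and saves nothing. Running the same check against real files from your own farm is a better guide than any general rule.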
|
-
Recommendation: Help Users ensure that they have configured their computers optimally.
Reason: Having the right combination of BIOS, drivers, and manufacturer tools is critical for a computer system performance and stability.
Benefit: Optimizes the Users’ computer performance
Since performance is subjective from a client's point of view, make sure to provide your users with guidance on how they can optimize their laptops and desktops to enhance their browsing experience. For example:
-
If I am running SQL, Visual Studio, SPD and Outlook on my laptop, then it is normal for the performance on my computer to be slow.
-
Users sometimes forget the performance impact that RSS feeds, Vista Sidebar gadgets, or other products have on their computers. To ensure that users are aware of what is running on their computers, have them download and run Autoruns to identify and manage programs. This tool is fantastic at identifying which programs, services, and scheduled tasks are started when the computer starts up.
|
-
Problem:
When anyone tried to edit a content editor web part on a MOSS site, they would get the following error:
“This Page has been modified since you opened it. You must open the page again.
Refresh page“
The users would hit F5 to refresh the page, but they would still get the same error message.
Tools used to help troubleshoot: Sharepoint Designer (SPD), the MOSS site, and the Site Content and Structure page
Resolution:
1. We checked that no pages were checked out or locked by another user, using the tools above. We didn’t find anything locking this file in SPD or via the site.
2. Rather than hitting F5, we actually clicked the “Refresh Page” link, then went to Edit Page and edited the content editor web part, and the error message was gone.
|
-
Over the past 2 years, I've spent many days and nights trying to optimize our MOSS Enterprise Portals' performance (i.e. rendering time). In the end, the 3 biggest perf gains came from the following:
1. moving from a 32-bit (x86) to an x64 OS
2. moving the farm content dbs from a shared/busy SQL Backend Server to a dedicated/quieter one
3. moving from a 100 Mbps to a 1 Gbps NIC connection on all of the Front end Servers
Although this did not solve all of our problems, it helped buy us time to perform a deeper investigation to identify what exactly was happening. For example, these were some of the problems found:
------
Problem: a high number of SQL roundtrips from the Front end servers
Tool used to identify issue: SQL Profiler on the backend SQL Server
Resolution: rewrite custom code to use more of the out of the box MOSS functionality versus extra db calls
------
Problem: caching wasn’t happening as expected on some of the front end servers
Tool used to identify issue: Fiddler
Resolution: We turned on blob caching on the web.config file on server 1, but the changes didn’t replicate to server 2, server 3 and server 4 of the farm. We ended up manually updating the other web.config files
------
Problem: too many data connections, gifs, web parts, RSS feeds all on one page
Tool used to identify issue: Fiddler, Visual Studio 2005 - VSTT
Resolution: With Fiddler, we were able to see that one page had multiple connections to other dbs, had over 10 gifs (some were actually being stored on a server on a different continent) and some RSS feeds. To fix this, we contacted the user and requested that they reduce the amount of content put on the page.
-----
Problem: Front end Servers were under heavy load and there was a lot of blocking happening in SQL
Tool used to identify issue: Logparser
Resolution: Using Logparser, we were able to see that someone had an automated script running against the MOSS portal, causing a type of denial of service attack. To fix this, we just contacted the user and let them know what they were doing to the farm.
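If Logparser is not handy, the same kind of question (which client is hammering the farm?) can be answered with a short script. This is a minimal sketch, assuming the IIS logs are in W3C extended format with a "#Fields:" header that includes c-ip. The sample lines and IP addresses below are invented for illustration, and this is not the query we actually ran.

```python
from collections import Counter


def requests_per_client(log_lines):
    """Count requests per client IP in W3C extended log lines.
    Column positions are read from the '#Fields:' header."""
    fields = []
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            fields = line.split()[1:]
            continue
        if not line or line.startswith("#") or not fields:
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # skip malformed rows
        client_ip = dict(zip(fields, values)).get("c-ip")
        if client_ip is not None:
            counts[client_ip] += 1
    return counts


# Invented sample data, trimmed to a handful of fields.
SAMPLE_LOG = [
    "#Software: Microsoft Internet Information Services 6.0",
    "#Fields: date time c-ip cs-method cs-uri-stem sc-status",
    "2008-05-01 10:00:01 10.0.0.7 GET /Pages/default.aspx 200",
    "2008-05-01 10:00:01 10.0.0.7 GET /Pages/default.aspx 200",
    "2008-05-01 10:00:02 10.0.0.9 GET /Pages/default.aspx 200",
    "2008-05-01 10:00:02 10.0.0.7 GET /_layouts/viewlsts.aspx 200",
]

for ip, hits in requests_per_client(SAMPLE_LOG).most_common(5):
    print(ip, hits)
```

A single address far ahead of everyone else in this list is usually a script or a monitoring tool rather than a person.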
-----
|
-
I recently watched a webcast done by Laura Chappell on how to troubleshoot a slow network. She did an exceptional job in providing the essential information I needed in analyzing network captures for latency. Before I get ahead of myself, here are some key concepts and tools that you should familiarize yourself with:
The TCP Three-Way Handshake
Besides the fact that I have been asked this question in every interview I have been to in the last 8 years, understanding how this works helps you when troubleshooting network-related issues such as Performance, Authorization and Page Not Found errors. In a nutshell, the three-way handshake consists of:
1. SYN – this is the initial packet that is sent from the client to the server
2. SYN ACK – this is the packet that the server sends back to the client confirming that it received the client’s request
3. ACK – this is the packet the client sends back to the server confirming that it received the server’s packet, which completes the handshake
High Latency is a MOSS killer
Whether it is on the network connection between the clients and the Front End Servers or on the network connection between the Front End Servers and the SQL Server, latency is the problem you need to find and remediate or mitigate for your MOSS Deployments. Kimmo Forss and Dino Dat-on have published a whitepaper on this topic and also gave some great presentations at TechReady 6 and the MS Sharepoint 2008 Conference on geo-dispersed environments and WAN optimization. It is definitely a must read!!!
Network Tools that I have used
Netmon
Wireshark
=====
Once you are ready to troubleshoot the network connection, use the following steps to dig deeper into the problem:
Wire latency:
1. On the client computer, open up netmon or wireshark and start a network capture
2. On the client computer, browse a MOSS page on the farm
3. Stop the network capture and open it up
4. If you see a big delay between the initial SYN (client) and SYN ACK (server), then it is wire latency
5. The next steps would be to investigate what is happening on the network as this may not be a MOSS, Client or Server issue.
Client latency:
1. On the MOSS Front End Server, open up netmon or wireshark and start a network capture
2. On the client computer, browse a MOSS page on the farm
3. Stop the network capture and open it up
4. If you see the initial SYN (client), SYN ACK (server), and ACK (client) and then a long delay where the server is waiting for the client to send the next set of packets, then it is client latency
5. The next steps would be to investigate what is happening on the Client.
Server latency:
1. On the client computer, open up netmon or wireshark and start a network capture
2. On the client computer, browse a MOSS page on the farm
3. Stop the network capture and open it up
4. If you see the initial SYN (client), SYN ACK (server), and ACK (client) and then a long delay where the client is waiting for the server to send the next set of packets, then it is server latency
5. The next steps would be to investigate what is happening on the Server
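The three checks above boil down to "where is the big gap in the capture, and who was everyone waiting on?". Here is a minimal sketch of that logic. It assumes you have already pulled the packets of one connection out of the capture as (time in seconds, sender, kind) tuples; the tuple layout, the function name and the 0.5 second threshold are my own choices for the example, and neither netmon nor wireshark produces this format directly.

```python
def classify_delays(packets, threshold_s=0.5):
    """Label each long gap in one TCP conversation.

    packets: list of (time_s, sender, kind) in capture order, where sender
    is 'client' or 'server' and kind is 'SYN', 'SYN-ACK', 'ACK' or 'DATA'.
    Returns labels in the order the gaps occur:
      'wire'   - long gap between SYN and SYN-ACK
      'client' - after the handshake, everyone waited on the client
      'server' - after the handshake, everyone waited on the server
    """
    findings = []
    handshake_done = False
    for prev, cur in zip(packets, packets[1:]):
        gap = cur[0] - prev[0]
        if prev[2] == "SYN" and cur[2] == "SYN-ACK":
            if gap >= threshold_s:
                findings.append("wire")
        elif handshake_done and gap >= threshold_s:
            # The sender of the late packet is the side that was slow.
            findings.append("client" if cur[1] == "client" else "server")
        if prev[2] == "SYN-ACK" and cur[2] == "ACK":
            handshake_done = True
    return findings


# Invented capture: fast handshake, then the server takes ~2.5 s to answer.
slow_server = [
    (0.00, "client", "SYN"),
    (0.01, "server", "SYN-ACK"),
    (0.02, "client", "ACK"),
    (0.03, "client", "DATA"),
    (2.50, "server", "DATA"),
]
print(classify_delays(slow_server))
```

The threshold is arbitrary. What counts as a "long delay" depends on the normal round trip time between the two machines, so compare against a capture taken when things are working well.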
|
-
As an Ops person, one of the things that excites me is automation and being able to use technology to make my life easier. At the MS Sharepoint Conference, Ben Curry from Mindsharp gave a great presentation on how to perform an automated install of a MOSS farm using psconfig and stsadm commands. I haven't had a chance to try and tweak it for my work environment, however from what I saw, it was fantastic! Here is a template of how my current team has been approaching this silent or unattended MOSS installation.
Below is a list of the different Areas to focus on:
1. Operating System:
-
Use an image build to install the OS
-
Create some scripts (we use .vbs, .bat, or .cmd files) to configure the OS to follow your group's best practices (e.g. recyclebin, pagefile, autoupdate or antivirus settings). We were lucky to have some great developers to put this together for us; however, if you don't have teammates who can script or code, check out the library of existing scripts in Scriptcenter on Technet.
2. MOSS Prerequisites:
-
Create some scripts to install .net 2.0, .net 3.0.
-
Also create some scripts to configure and install IIS. For example, adding in MIME types or enabling web extensions
3. Install MOSS and create Farm:
-
Here is where we are going to look to leverage Ben Curry's automation scripts. They can be found on http://www.mindsharp.com under Premium Content.
-
His scripts include how to do an automated install of the MOSS bits, create SQL dbs (if desired), create a MOSS farm and some stsadm steps on how to configure it after the build.
-
We would also include into this section Language Packs and Hotfixes, so that every server is always at the right version.
4. Customization (or as my boss likes to call it - Flavoring):
-
Here is where we ensure that all customizations are built into a Sharepoint Solution so that they can be easily deployed, retracted or upgraded.
-
Again, we were lucky to have some devs create a solution install wrapper that allows us to easily manage Solutions so that anyone can manage them, regardless of their experience with Sharepoint.
-
Each farm would have its own group of solutions that would meet that farm's needs (Flavoring). However, the same solution install tool would be used and common solutions would be available to any farm.
hope this helps...
wayne
|
-
I had the privilege of attending a session at the MS Sharepoint 2008 Conference covering how MS Support troubleshoots MOSS issues. The session really hit home on how depending on a person’s experience and confidence, they will either make lots of changes hoping to get things fixed or they will do a lot of analysis and end up taking too long to find a solution.
Below is the initial process that I follow when someone asks me for help in solving a problem. Understanding that each issue/problem is unique, this first post of the Troubleshooting Series will cover only the first steps, which are the same regardless of the issue. Later posts in the series will focus on specific problems or areas that are unique to an issue based on the results of this first analysis stage.
hope this helps
wayne
Preliminary Analysis
1. Before starting anything, understand the problem at hand. This includes:
a. What exactly is the issue – the more detailed the info, the more effective you will be at quickly solving the issue
b. What is the urgency of the problem? For example, is it a:
i. Break/fix (i.e. is it an urgent issue where you need to drop everything to fix)
ii. Roadblock (i.e. is it causing a work stoppage that doesn't need to be solved right away)
iii. Nice to have (someone is trying to do something new or is looking for some consulting type of advice)
c. Who exactly is affected or impacted by this issue
i. How many people are having this problem or are affected by the issue
ii. Where are they located? Locally or somewhere remote?
d. What is the timeline of the issue
i. When did the issue start happening
ii. What was done so far to troubleshoot this
iii. How can I repro the issue
2. Once you have the facts, put it in perspective. In other words, make sure everyone involved is on the same page:
a. Confirm everyone is clear what the problem is
b. Confirm everyone is clear what the desired end result is and when it is needed by
c. Don't feel like you have to know all the answers. Get help if needed (i.e. from other end users, coworkers, management, MS Support, etc…)
3. Make a plan of action. At this stage, you know what the issue is, you’ve put it in perspective and can start working on resolving it. It is important to factor in your personal life here so that you are setting realistic expectations that are achievable.
Preliminary Investigation
1. Try to repro the issue or watch the user repro the issue. With XP, Vista, Live Meeting, Communicator, etc…, you can share a session with anyone in the world, anytime.
2. Even if you can’t repro the issue, you can start ruling things out. I’ve noticed that everything usually ends up in one of these categories:
a. Knowledge/Perception
b. Client computer
c. Front End Server
d. SQL
e. Network
f. Active Directory
g. Sharepoint
h. IIS
i. Farm Topology
3. Keep track of what you see and do and at what times. The better notes you take on the work you’ve done, the easier it is to get help from others, write a post mortem and document it for future reference. One thing I learned from the MS Support session at the conference was to document everything, even the steps that didn’t work.
Resolving the issue:
Most people like to jump right to this step, however they end up wasting time looking at the wrong things or in the wrong places. In the next parts of the series I'll break down how I tell which category is the problem and how I drill down deeper to get to the root cause.
|
-
Unfortunately, I'm still ramping up on how to blog and how to make my site look pretty and be organized well. Rather than waiting any longer, I'm gonna jump in and start posting some of the great stuff I have learned over the past few months, especially from TechReady 6 and the MS Sharepoint 2008 Conference that just finished.
So think of this as a brain dump first and then an organization of the site (i.e. ensuring that the tags are in sync) at a later date.
thanks for your patience.
wayne
|
-
After writing my first two posts, I realized that this wasn’t as easy as I originally thought and spent the last three months researching other blogs and analyzing what I liked or looked for when reading them. I noticed that the ones I bookmarked contained details versus being very generic like my first two posts. :(
Starting in Jan 2008, I am going to start posting more detailed posts on topics that I experienced in supporting SharePoint that would help those who administer and support it on a daily basis.
wayne
|
-
Change the following values in your web.config file so that you can see what the actual error is:
1. Change CallStack="false" to CallStack="true"
2. Change <customErrors mode="On"/> to <customErrors mode="Off"/>
3. Change <compilation batch="false" debug="false"> to <compilation batch="false" debug="true">
When you are done, make sure to change these three back to their original values, as there is a performance hit when these are turned on/enabled.
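Since these three values have to be flipped on and then flipped back, a small script makes it harder to forget one. This is only a sketch, assuming a heavily simplified web.config: the SAMPLE document below is invented, and I am assuming CallStack lives on the SafeMode element. Always work on a backup copy of the real file, because ElementTree drops comments and may change the formatting.

```python
import xml.etree.ElementTree as ET

# Invented, heavily trimmed stand-in for a web.config file.
SAMPLE = """<configuration>
  <SharePoint>
    <SafeMode MaxControls="200" CallStack="false" />
  </SharePoint>
  <system.web>
    <customErrors mode="On" />
    <compilation batch="false" debug="false" />
  </system.web>
</configuration>"""


def set_debug_mode(xml_text, enabled):
    """Return the XML with the three troubleshooting settings switched
    on (enabled=True) or back to their original values (enabled=False)."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if "CallStack" in element.attrib:
            element.set("CallStack", "true" if enabled else "false")
        if element.tag == "customErrors":
            element.set("mode", "Off" if enabled else "On")
        if element.tag == "compilation":
            element.set("debug", "true" if enabled else "false")
    return ET.tostring(root, encoding="unicode")


debug_config = set_debug_mode(SAMPLE, enabled=True)
restored_config = set_debug_mode(debug_config, enabled=False)
print(debug_config)
```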
|
-
-
This blog is focused on sharing some of the best practices, lessons learned and scenarios that I have encountered over the past several years supporting SharePoint in an Enterprise IT environment. My background and expertise is in IT Operations, so this blog is focused on Operational issues such as Performance, Scalability and Monitoring of SharePoint farms.
Thanks for taking the time to check out my blog.
wayne
All opinions are my own and do not represent any company or affiliation.
|
|
|
|