Welcome to TechNet Blogs Sign in | Join | Help

SharePoint IIS Traffic Analysis

SharePoint IIS Traffic Analysis

Internally we have experienced situations where internal users have built tools or solutions which have negatively affected performance for our SharePoint users.  This can be done in different ways such as Setting up multiple MOSS Farms to crawl our customer facing front ends at high performance setting or creating a custom application to execute against various MOSS web services without throttling the application in any way.  From that need spawned a little tool we created with the help of log parser J  The following is an example of how you could setup a scheduled task to notify you on number of Hits against your moss environments and how much bandwidth each user is consuming.

 

Please remember to never perform any of the following on production without first trying it out in a test environment, all items here are merely examples of how you ‘could’ perform such an action. J

 

Skills and Items required / assumed to perform this task.

·         A SQL 2005 server with XP_CMDSHELL enabled

·         Read Permissions for SQL Server Account Granted to the IIS Log directory of your SharePoint Servers

·         Access to create and modify databases on your SQL Server

·         Logparser 2.2 Installed on your SQL Server

·         Good Understanding of SQL Server and TSQL

·         Good Understanding of IIS

·         Access to all customer Facing SharePoint Front Ends

·         Good Understanding of SharePoint J

 

 

Step 1: Create a reporting database on your SQL Server, for the benefit of this blog entry we will call this database IISReports.  Set that reporting database to autogrow for the data file and log file.

 

Step 2:  Grant read permissions to all IIS log directories where your Web application Logs are stored on all servers you wish to report on.

 Example: %WINDIR%\System32\Logfiles\ex080218.log  For better performance internally we place our IIS Log files on a serperate drive

 

Step 3:  Install Logparser 2.2 to a directory on your sql server(make note of this directory)  For the purpose of this entry we will call this c:\temp\logparser.exe

Logparser 2.2 can be downloaded @ http://www.microsoft.com/technet/scriptcenter/tools/logparser/default.mspx 

 

Step 4:  Create Tables that we will use to pull in and manipulate the data

Click Here to open example Script

 Step 5:  execute the Create script for the stored procedure that will be used to pull the data.

Click Here to Open example Script – Make sure to read the /**notes**/ as they direct you to make changes to the script to cater to your environment.

Step 6: If everything has been followed and updated accordingly you can now run the following in SQL..  shortly there after you will receive an email similar to the following J

 

use [IISReports]

go

exec IISLogtracking

go

 

Example email received

Top 20 for BW used

Bandwidth Used(MB)

Total Hits

Logfilename

Url Host

Username

IP

20563

114502

\\servername\e$\w3svc\ex080226.log

sharepoint

Domain\username

127.0.0.1

5317

438216

\\servername\e$\w3svc\ex080226.log

sharepoint

Domain\username

127.0.0.1

4074

9055

\\servername\e$\w3svc\ex080226.log

sharepoint

Domain\username

127.0.0.1

2854

7720

\\servername\e$\w3svc\ex080226.log

sharepoint

Domain\username

127.0.0.1

2231

42288

\\servername\e$\w3svc\ex080226.log

my

Domain\username

127.0.0.1

Top 20 for Total Hits

Total Hits

Logfilename

Url Host

UserName

IP

Bandwidth Used (MB)

438216

\\servername\e$\w3svc\ex080226.log

sharepoint

Domain\username

127.0.0.1

5317

114502

\\servername\e$\w3svc\ex080226.log

sharepoint

Domain\username

127.0.0.1

20563

92346

\\servername\e$\w3svc\ex080226.log

sharepoint

Domain\username

127.0.0.1

532

56829

\\servername\e$\w3svc\ex080226.log

my

Domain\username

127.0.0.1

122

42288

\\servername\e$\w3svc\ex080226.log

my

Domain\username

127.0.0.1

2231

 

 Cory

MSIT SharePoint Reporting Framework 2.0 Released!

I am pleased to announce I have finished version 2.0 of the MSIT SharePoint Reporting Framework.  Versions 1.0 was used heavily internally for our service metrics and operational tracking.  With version 2.0 many great improvements have come out of it, additionally with the release to the public :)  I encourage everyone to head over and check it out.  Thanks!!

http://www.codeplex.com/MSITSRF

This is one of many very great tools that will be available on the CD included in our upcoming book 'SharePoint for Architects and Engineers'

 

Online Services from Microsoft Launched! (SharePoint , Exchange among others)

"Online services from Microsoft are hosted solutions for businesses with advanced IT needs, including the ability to control access to data, manage users, apply business and compliance policy, and meet high availability standards. These services deliver the core enterprise-class capabilities of Microsoft software as hosted business services, providing the performance, scalability, security, management features, and service level capabilities to support mission-critical applications"

 See the full detailed site at www.microsoft.com/online

 

 Go SharePoint! :)

 

 Cory

High Performance Upgrades for MOSS 2007 & WSS 3.0 (Database Migration Methods)

If you’re looking for the best performance and quickest upgrade I highly recommend the Database Migration approach.  At MSIT our preferred method is such, it allows a creative flexible upgrade path that has the fastest throughput.  Using a method we call parallel upgrades,  in MSIT we were able to upgrade at 4 times faster than a standard database migration.  What is a parallel farm upgrade you ask?  Well read on :)

 

Consider the following scenario

 

4 Terabytes of data to upgrade

12,000 site collections to upgrade

48 hours to complete the upgrade

 

Product throughput on optimal hardware from our experiences internally was 50-70GB per hour.

 

Let’s assume we only get 50GB/hr that comes out to around 80 hours to complete, this was not fast enough in our situation.  With that being said we had to get creative and came up with the parallel method.  This entails creating multiple single box farms with single web applications on a shared high powered SQL 2005 cluster and balancing out the database attach.   Let’s assume that http://portal has 80 content databases at 50GB per database. 

 

Some standards to note that we hold true internally.

 

·         Your root site should be kept in its own database with the site warning maximum set to 0/1, this allows for easy manipulation of the root site in upgrades away from the rest of the farm

·         Site collection count / Site size should be balanced accordingly internally we try to maintain no more than 500 site collections per database, once reaching that limit we cap off the database and create a new content database (this will not be true if you are going toward a database mirror scenario)  reason being is due to the max quotas we have set on site collections this is the best site cap by design storage wise.  In your environment this may be different.

 

 

The following steps are assuming you have already run prescan on your environment and cleaned up any discrepancies.  Additionally the assumption would be you have a new Production MOSS Farm created and awaiting your content databases.  For more information on prescan Bill Baer has several great posts explaining this in detail @ http://blogs.technet.com/wbaer/search.aspx?q=prescan&p=1

 

 

 

 

Step 1 would be to create the 4 farms; this can be virtualized however in our experience we used physical hardware. 

 

  

Step 2 would be to backup your databases,  in a database migration approach you would backup your databases, restore them with new names and upgrade your backups, thus in the event you need to back-out your old farm with your old databases are running untouched.

 

Step 3 would be to create your batch files for easy automation of your database migration.  Your batch files may look similar to the following.  Please only upgrade copies of your databases and never the actual databases if at all possible.  Also once you execute a database attach command the schema changes instantly, canceling the command will not save the database J

 

Batch1.cmd

Stsadm –o addcontentdb –url http://webapp1 –databasename contentdb_1 –databaseserver SQLsrvA

Stsadm –o addcontentdb –url http://webapp1 –databasename contentdb_2 –databaseserver SQLsrvA

Stsadm –o addcontentdb –url http://webapp1 –databasename contentdb_3 –databaseserver SQLsrvA

Stsadm –o addcontentdb –url http://webapp1 –databasename contentdb_4 –databaseserver SQLsrvA

 

Batch2.cmd

Stsadm –o addcontentdb –url http://webapp2  –databasename contentdb_5 –databaseserver SQLsrvA

Stsadm –o addcontentdb –url http://webapp2  –databasename contentdb_6 –databaseserver SQLsrvA

Stsadm –o addcontentdb –url http://webapp2 –databasename contentdb_7 –databaseserver SQLsrvA

Stsadm –o addcontentdb –url http://webapp2  –databasename contentdb_8 –databaseserver SQLsrvA

 

 

Batch3.cmd

Stsadm –o addcontentdb –url http://webapp3  –databasename contentdb_9 –databaseserver SQLsrvA

Stsadm –o addcontentdb –url http://webapp3  –databasename contentdb_10 –databaseserver SQLsrvA

Stsadm –o addcontentdb –url http://webapp3  –databasename contentdb_11 –databaseserver SQLsrvA

Stsadm –o addcontentdb –url http://webapp3  –databasename contentdb_12 –databaseserver SQLsrvA

 

Batch4.cmd

Stsadm –o addcontentdb –url http://webapp4  –databasename contentdb_13 –databaseserver SQLsrvA

Stsadm –o addcontentdb –url http://webapp4  –databasename contentdb_14 –databaseserver SQLsrvA

Stsadm –o addcontentdb –url http://webapp4  –databasename contentdb_15 –databaseserver SQLsrvA

Stsadm –o addcontentdb –url http://webapp4  –databasename contentdb_16 –databaseserver SQLsrvA

 

 

 

Notice how we are evenly distributing the content databases between the 4 parallel farms.  These batch files should be ran from your bin folder on their respected parallel farm installs (%systemdrive%\program files\common files\web server extensions\12\bin)

 

Once all 4 batch scripts have completed check your upgrade.log on all 4 parallel farms for discrepancies.

 

Step 4 Verify upgrade has completed successfully for all databases

 

Step 5 Detach all of your databases from the parallel farms, this can be done by modifying your batch scripts to use stsadm –o deletecontentdb rather then addcontentdb.

 

Step 6 Create a new batch file that would dbattach all of your upgraded databases to your production MOSS farm that you have created and waiting on standby.  Ensure that you DBattach your root site collection database first.

 

And that’s it!  Essentially we have quadrupled the throughput of your database migration.

 

Happy Upgrading!

 

 

Cory

Posted by corybu | 4 Comments

SharePoint Site Moves, Database Moves and Balancing Growth

SharePoint adoption is at an all time high right now with the release of MOSS 2007 and WSS 3.0, this is wonderful for your investment but can be a nightmare for a poorly planned deployment that was not ready for massive growth.  Certain operational tasks may be on the horizon to better balance your content databases in regard to site size, database size and potentially even your SQL storage plan in general.

 

Site Moves are an important part of maintaining your SharePoint environment.  Situations may arise where a site has outgrown other sites by a substantial amount.  To help support this larger site in an intelligent manner we need to migrate it to its own database away from other site collections (Split).   Additionally if you know that a site has the potential to grow substantially this would be a good idea as a proactive approach. We will refer to this action as splitting a site.  Other situations can also warrant splitting sites out of a database such as  your site max cap was set to high initially and now you have too many site collections in each database or for legal or business reasons you need to separate certain sites from other sites even at a database level.

Balancing Databases would be another action you may need to entertain to help gain control over your growth.  This may be as simple as detaching a database from SQL and moving it to another LUN on your SQL instance or migration of the database to an entirely different SQL server, or as complex as load balancing your site creation between LUNS / SQL Servers.

  

Site Moves

To migrate a site collection from 1 database to another with control first we need to understand how SharePoint decides where a new site collection is created.  SharePoint will place a new sitecollection into the content database with the largest delta between sitecount and sitemax.  What does that mean exactly, lets look deeper.

 

As you can see above the delta for demo_content_1 is 500 – 3 (497) and the delta for demo_content_2 is 500.  The larger the delta is determines where your site goes.  The next site created will go into demo_content_2.

With that being said you now have the ability to migrate sites to the database of your choice.  To perform this move you would simply do the following 3 commands.

stsadm -o backup -url http://demo/sites/test2 -filename e:\test2.bak

stsadm -o deletesite -url http://demo/sites/test2

stsadm -o restore -url http://demo/sites/test2 -filename e:\test2.bak

If we refresh the content database page we can see that our site has moved to demo_content_2.

 

Database Stripping

In the event that your site has grown beyond the limitations of stsadm (in our testing no more then 15GB) you may need to entertain the path of database stripping.  For database stripping we would take a backup of the database that contains the larger site and attach it to a testing / restore environment.  You would then use stsadm to delete all site collections except the larger site.  This in effect has singled out your site collection into its own database.  Then when you reattach that database to your production farm which contains this 1 site you want to set your site warning limit to 0 and site max to 1, thus to keep that 1 site separate into its own database.

 

  

Balancing Databases

To help increase performance and manageability you may want to at some point balance your content databases across multiple LUNS or even SQL servers.  In the event you want to migrate your database to another LUN on the same SQL server instance I recommend the following steps. 

 

- Inter SQL Server Database Move

 First leave your database attached to your farm, there is no reason to detach your database from your SharePoint farm if your moving it inter SQL Server.  Simple Detach the database from SQL and copy your .MDF and LDF to the share of your choice.  Then reattach your database in SQL and perform an IIS reset on your SharePoint Front end.  Also leaving your SharePoint Content database attached helps avoid orphan site collection creation.

 

 

- External SQL Server Database Move

  •  Log Shipping Method(low impact)

To migrate your content database from 1 SQL Server to another there are a few choices, your choice will depend on how much downtime you are willing to take.  First option being you can setup SQL Log shipping, Once your 2 databases are in sync you would do the following steps.

 Stsadm –o preparetomove –contentdb SQLServerA:demo_content_1

 Stsadm –o deletecontentdb –url http://demo –databasename demo_content_1 –databaseserver SQLServerA

 At this point you would perform the final cutover on your log shipped database to SQLServerB

 Then finish with

 Stsadm –o addcontentdb –url http://demo –databasename demo_content_1 –databaseserver SQLServerB

 

  • Copy Method(high impact)

Your other option would be to simply detach the database from SQL and copy the .mdf and .ldf to the new SQL server.  While in transit your users will get a generic “Cannot connect to content database” error when visiting their site.  Following the copy you would perform the following on your web front end.

 Stsadm –o preparetomove –contentdb SQLServerA:demo_content_1

 Stsadm –o deletecontentdb –url http://demo –databasename demo_content_1 –databaseserver SQLServerA

 Stsadm –o addcontentdb –url http://demo –databasename demo_content_1 –databaseserver SQLServerB

 

Load Balancing Site Creation

Using the methods explained above we can in a sense load balance our site creation between content databases and SQL Servers.  We can do this by creating multiple content databases across SQL LUNS and SQL Server instances. 

Since site creation is based on delta it will create the next new site collection in demo_content_3 or demo_content_4.  Once both of those databases reach a site count of 1 it will repeat the process against all 4 and so on.  You will then always have +1 -1 Number of site collections between your content databases.

 

SharePoint PG pre-release of Community Kit for Sharepoint!

The CKS:EBE was designed to enhance and extend the functionality of the built-in blog feature in Windows SharePoint Services 3.0. By leveraging its innovative Modular Theme Framework (MTF), your blog can look nothing like a typical SharePoint blog while retaining the ease of use of the SharePoint blog’s built-in functionality. You can modify an existing theme or install a new one even if your blog is provided by an external SharePoint hoster or an internal IT department, and you only have Contributor level permissions. The EBE team’s motto is that your blog should look as smart or creative as you think you are.  To read the whole article click Click Here!

Posted by corybu | 0 Comments

Sharepoint Reporting Methods

Reporting in SharePoint can be very helpful for all types of scenarios, storage growth projection, weekly service reviews, operational tracking for backups / restores and more. Below are a few scripts that I have wrote for MOSS 2007 / WSS3.0, These can be used to pull all kinds of good information that might not be available from the Front End(OM). 

Please keep in mind that any direct database queries or modifications are not supported by any means.  A suggestion might be to perform these against a test / backed up copy of your production data.

 

Content Script 1 

The following script grabs all kinds of great data from your moss 2007 / WSS3.0 Content Database such as:

·          SiteURL – Path for a site collection

·          SiteAdmin – Site Owner / Admin for the site Collection

·          RecycleBin – Amount of disk space RecycleBin is currently using

·          BandwidthUsed – Amount of traffic the site has generated

·          SiteSize – Amount of space the site takes up in the database

·          SiteMaxQuota – Specific Max Quota the site has specified

·          SiteID – Site GUID assigned for specific collection

·          Content_DB – Database that site resides in

·          ServerName – Server that Database Resides on

·          LastContentChange – Last time a user modified any part of the site

·          DaysSinceLastChange – Days Since last modification to site

 

Use <ContentDatabase>

select distinct a.fullurl as [SiteUrl],

b.tp_login as [SiteAdmin],

sum(cast(c.size as decimal))/1024/1024 as [recyclebin],

cast(d.bwused as decimal)/1024/1024 as [BandwidthUsed],

cast(d.diskused as decimal)/1024/1024 as [SiteSize],

cast(d.diskquota as decimal)/1024/1024 as [SiteMaxQuota],

d.id as [SiteID],(select db_name(dbid) from master..sysprocesses where spid=@@SPID) as [Content_DB],

(select @@servername) as [ServerName],

d.lastcontentchange as [LastContentChange],

(select datediff(day,d.lastcontentchange,current_timestamp)) as [DaysSinceLastChange]

 

from webs as a inner join  

            sites as d on a.siteid=d.id inner join

       userinfo as b on a.siteid=b.tp_siteid left join

         recyclebin as c on a.siteid=c.siteid where b.tp_siteadmin = '1' and a.parentwebid is null   

group by a.fullurl, b.tp_login, d.diskused, d.id, d.bwused, d.diskquota, d.lastcontentchange

Order by a.fullurl

 

Configuration Script 1

The following script will create a high level snapshot of how your configuration database looks from a SQL standpoint.  It would be run against your configuration database.  It includes the following set of columns

 

·          ConfigServer – Server your configuration database lives on

·          Config_DB – Name of your configuration database

·          Content_DB – Name of the content database that specific site lives in

·          SiteID – Site GUID assigned for specific collection

·          ServerName – Server where content database lives

·          SiteURL – Full SiteUrl for each site collection in the Farm

 

Use <Configuration Database>

            SELECT (Select @@servername) as [ConfigServer],

(select db_name(dbid) from master..sysprocesses where spid=@@SPID) AS [config_db], a.name as [content_db],c.id as [SiteID],  b.name as [servername], 'http://'+d.name+c.path as [Siteurl]

      from objects as a inner join

      sitemap  as c on a.id=c.databaseid inner join

      objects  as d on c.applicationid=d.id inner join

      objects as F on a.parentid=f.id inner join

      objects as b on f.parentid=b.id

where a.id in (select databaseid from sitecounts)

order by a.name

 

Content Script 2 

The following script grabs lower level web information for your site collections including the following columns:

 

·          ServerName -  ServerName content database is located on

·          Content_DB – Content Database web lives in

·          SiteID – Site GUID assigned for specific collection

·          WebID – Web GUID assigned for specific Web

·          WebURL – URL for web

·          ParentwebID – Relative GUID in which web is a child of ( Null = Site Collection )

·          WebTemplate – Template Web was provisioned as

·          Language – Language web was provisioned as

·          ProductVersion – Version of site

·          DocumentCount – Number of documents that live within web

 

select distinct(select @@servername) as [servername], (select db_name(dbid) from master..sysprocesses where spid=@@SPID) as [Content_db], a.siteid, a.id as [WebID], a.fullurl as [WebURL], a.parentwebid, a.webtemplate, a.language, a.productversion, count(b.id) as [DocumentCount]

from webs as a inner join

       alldocs as b on a.siteid=b.siteid and a.id=b.webid

group by a.siteid, a.id, a.fullurl, a.parentwebid, a.webtemplate, a.language, a.productversion

order by a.fullurl

 

As you can see all 3 queries contain the site GUID, a content_db and a Servername… With that being said if you were to pull this data into a central repository you could then write SQL joins on those 3 columns and create all kinds of helpful views on your infrastructure.   Additionally you can write SQL cursors around these scripts to pick up on table SCHEMA and loop through all of your content databases rather than manually executing on just one.  The sky really is the limit.  I will have an update to this with some example cursors and joins in the coming weeks.

 Cory

 

 

 

Detaching Databases in MOSS 2007 Environments.

While performing database maintenance there are few things to consider and execute to keep your environment healthy.  First being preparetomove, if you are detaching databases from a production environment this is a must.

 

The following line is extremely important so I thought I would make it bold :)

 

Any database that is attached to a MOSS 2007 farm and consuming from an SSP has to have stsadm -o preparetomove executed prior to the actual detach from the farm. 

 

 

This is due to several reasons, the main being IT WILL break the relationship between your shared service provider and your database.  This relationship can be repaired but it essentially means losing the part of your search index that relates to that database.

 

The full command is below.

 

stsadm -o preparetomove
           {-ContentDB <DatabaseServer:DatabaseName> |
       -Site <URL>}
           [-OldContentDB <uniqueidentifier>]
           [-undo]

 

 

For detaching databases cleanly from the farm we really only care about the first 2 switches.  So your command might look like this.

 

stsadm -o preparetomove -contentdb SQLSERVER:DBNAME -site http://www.sharepointskills.com

 

Followed by the following command to actually remove your database from your farm.

 

stsadm -o deletecontentdb -url http://www.sharepointskills.com -databaseserver SQLSERVER -databasename DBNAME

 

When you want to reattach your database to the farm it’s only 1 step rather than the 2 step process.

 

stsadm -o addcontentdb -url http://www.sharepointskills.com -databasename DBNAME -databaseserver SQLSERVER

 

Essentially what this does is it tells the SSP that you are relocating this database and to prepare to get a new database GUID.  It tells the system that the current GUID for databaseA is stale and that when this database name is picked up the next time in the crawl to replace its GUID.  With doing that is shifts all of the Site information and index's for that database to the new database GUID.

 

If you fail to use this command prior to the move it will create a stale entry in your SSP and all sites going forward in the database will not be crawled.  Additionally it will create a pretty ugly error in your web front end application log that looks similar to the following.

 

Failure trying to synch web application 09a21da5-4485-4b00-8268-772aea7fea12, ContentDB 65301403-c277-4b4c-ad5a-e822572d10ea: A duplicate site ID 3b3a4372-aa91-4e0c-ba57-2567958d81bb(http://portal/sites/test1) was found. This might be caused by restoring a content database from one server farm into a different server farm without first removing the original database and then running stsadm -o preparetomove. If this is the cause, the stsadm -o preparetomove command can be used with the -OldContentDB command line option to resolve this issue.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.

 

 

This is wonderful it gives you a resolution!  However there is much more to this resolution than what meets the eye.

 

First we need to find out which databases are currently not in sync, this can be done from a simple stsadm command on your SSP web front end.

 

stsadm -o sync -listolddatabases 5

 

This command will list all databases that have not synced up correctly with the SSP in greater than 5 days.  Based on your schedule to index your age value may change.  You may want to change this value to 10 or 15.  The result set returned is a list of database GUID's and the date/time they were last synchronized. Now that we have determined the list we need to clean them up and get those guys synced.  To do this we execute an stsadm command that sets all of those GUID's as old, then during your next index it will pick up the new GUID's for your production databases not currently being synced.

 

stsadm -o preparetomove -contentdb sqlserver:dbname -oldcontentdb <GUID>

 

where sqlserver is where your content is housed, dbname is any database in your child farm, and GUID is a guild from the list of non-syncing databases we generated with the command above.

 

Based on your crawl schedule wait until the next full crawl.  Once it completes go back into your SSP and run the stsadm -o sync -listolddatabases 5 command again.  Anything left can likely be removed at this point.  You can remove what is left by running the following command.

 

stsadm -o sync -deleteolddatabases 5

 

this will delete all GUID entries in the SSP for anything that is out of sync for more than 5 days.  You have now successfully cleaned all inconsistencies with syncing in your SSP.  All out of sync alerts should now subside.

 

Remember use stsadm -o preparetomove before detaching a database and this will not be a problem!

 

Cory

Sharepoint Orphans Explained

What are Orphans in SharePoint?

Orphans have been coined the name for objects in a SharePoint schema that live without a parent or child relationship in the database. In effect these database inconsistencies can be created due to many different reasons.  They cause confusion, helpdesk tickets and can cause your upgrade from wss2.0 / sps2003 to MOSS 2007 / wss3.0 to fail.  We will explore some of the ways these can be created, how to locate them, methods for resolving orphans and a few key things you can do to help avoid them in the first place.

 

In order to really understand orphans we need to understand a high level overview of how SharePoint configuration databases and SharePoint content databases interact.

Figure 1.1 Configuration Database Content Database relationships

 

Orphans come from a host of different reasons. Unfinished create and delete transactions being the main source. In SPS2003 if you were to create a new SharePoint site and click back, or close the window in mid creation it would essentially stop where you clicked back. More often than not it would not roll back the transaction or clean it in any shape or form, a very sloppy end to a standard task in SharePoint. This in turn creates a Surface orphan. For the sake of this entry any site level orphan (content>config/config>content) we will refer to as Surface orphans.

Figure 1.2 Surface Orphans

 

Another source of orphan creation is actually caused from farm maintenance. Any farm running a mysitehost should proceed with caution while performing database maintenance. MySitehosts are root portals that automatically provision a new site collection to a user when he/she visits the portal for the first time. This can be very handy for on-boarding new employees etc. This can be a huge nightmare for the operations team which supports the mysitehost. In SharePoint if you move a database from 1 SQL server to another it requires you to detach the database from the Farm Admin GUI and reattach using the new SQL instance name. The huge problem with doing this while the farm is accessible by its users is the fact that once you remove that mysitehost database from the admin GUI that mysite host no longer has entries for any site in the content database you are migrating. Thus any user that visits your mysite host that has a site in the content database being moved will automatically be provisioned a new site. Once you reattach that database in the Admin GUI it will repopulate all of its configuration database entries. However if a site had been created with a same URL while that database was not attached, it will skip that site collection as one already exists. Thus you have another surface orphan. This causes a great deal of confusion from a customer standpoint.

Let’s step back and view this from the customer/user.

Monday – I visit http://mysitehost and my site is created, I add new documents, NEO information etc…. Throughout the week I update my Documents all is well.

Friday 7pm-11pm - the IT Team performs maintenance on mysitehost_database_1 they are migrating it to another SQL server. It just so happens that my site lives in mysitehost_database_1, this migration consists of the IT Team detaching the database from the FARM while it is in transit.

Friday 8:30pm – I the unknowing user to this maintenance visit http://mysitehost, since the content database which contains my site has been detached from the farm it does not know of its existence. It in turns creates a brand new site for me at http://mysitehost

I notice all of my documents I have been working on all week are missing; I file a restore request to helpdesk to retrieve my documents.

Are the documents really gone? Nope, they are in that database being migrated. However SharePoint now has a new entry in that darn configuration database for this same site. When the IT Team at 11:00pm reattaches mysitehost_database_1 to the farm it will not write an entry in the configuration database to point to this week old site. It already exists right!? We just created a new one 2.5 hours ago… Thus a new surface orphan.

Below is a 3 step story board showing this visually.

Phase 1 Before Database Migration

Phase 2 During Migration

Phase 3 End result.

 

Now imagine that your company of 5000 employee’s all have their mysite host site living on 2 databases. I think you start to get the picture, if even 100 users visit their site while the migration is underway of 1 of those databases that’s 100 helpdesk tickets and 100 new orphans in your farm.

Alas, there is a bit of light throughout this dark tunnel with these different types or orphans. You can reverse the process. It is neither pretty, nor fun for the IT Team J but it is possible. And most importantly it is preventable for the most part.

 

 

How to find Surface Orphans

So you have orphans, before we can fix them we need to find them!

You can only detect orphans through SQL, there is no real out of the box way to detect them from the web UI. That’s ok SQL is fun anyway J Into SQL we go. To use the following script all you require is an open Query window with read access to all databases within your farm. The following script assumes you run all of your databases on the same SQL instance.  This script is an example for finding orphans in MOSS 2007/WSS3.0 

PS. I am not a SQL Guru

ps#2 Use these SQL scripts at your own risk, even querying the database is not supported!

Use MSDB
Drop table orphanlist

CREATE TABLE [dbo].[orphanlist](
 [farm] [varchar](250) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
 [databasename] [varchar](250) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
 [SiteID] [uniqueidentifier] NULL,
 [sitepath] [varchar](250) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
 [type] [varchar](250) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
)

drop table orphan_hopper  
declare  
@dbname as varchar(250), 
@cmdstr as varchar(2000), 
@dbid as varchar(250),
@configdb as varchar(250)  
 
/** only change the following line and nothing else, change spskills_config_db to your config db name **/
select @configdb = 'spskills_config_db'

/** Change nothing below this line **/
select @cmdstr = 
'select distinct b.name as ''databasename'', b.id as ''dbid'' into orphan_hopper 
from 
 [' + @configdb + '].dbo.sitemap as a inner join   
 [' + @configdb + '].dbo.objects as b on a.databaseid=b.id inner join   
 [' + @configdb + '].dbo.objects as c on c.id=a.applicationid inner join   
 [' + @configdb + '].dbo.objects as d on b.parentid=d.id inner join   
 [' + @configdb + '].dbo.objects as e on d.parentid=e.id  '  
exec (@cmdstr)    
 
DECLARE DBCursor CURSOR For 
  Select databasename, dbid 
  From orphan_hopper 
 
OPEN DBCursor 
FETCH NEXT FROM DBCursor into @DBName, @dbid 
 
WHILE @@FETCH_STATUS =0 
BEGIN 
 INSERT INTO orphanlist([Type], farm, databasename,[sitepath], SiteID) 
 EXEC 
  (' 
select ''Potential ConfigDB orphan:'' +
'''+@dbname+'''   as [Type], '''+@configdb+''' as [farm], '''+@dbname+''' as [databasename],path as [sitepath], id as [SiteID] from ['+@configdb+'].dbo.sitemap where id not in (select id from ['+@dbname+'].dbo.sites) and databaseid = '''+@dbid+''' 
union 
select ''Potential ConfigDB orphan:'' +
'''+@dbname+'''   as [Type], '''+@configdb+''' as [farm], '''+@dbname+''' as [databasename],path as [sitepath], id as [SiteID] from ['+@configdb+'].dbo.sitemap where id not in (select siteid from ['+@dbname+'].dbo.webs where parentwebid is null) and databaseid = '''+@dbid+''' 
union 
select ''Potential ContentDB orphans:'' +
'''+@dbname+''' as [Type], '''+@configdb+''' as [farm], '''+@dbname+''' as [databasename],fullurl as [sitepath], siteid as [SiteID] from ['+@dbname+'].dbo.webs where parentwebid is null and siteid not in (select id from ['+@configdb+'].dbo.sitemap where databaseid = '''+@dbid+'''
union 
select ''Potential ContentDB orphan:'' +
'''+@dbname+'''  as [Type], '''+@configdb+''' as [farm], '''+@dbname+''' as [databasename],fullurl as [sitepath], siteid as [SiteID] from ['+@dbname+'].dbo.webs where parentwebid is null and siteid not in (select id from ['+@dbname+'].dbo.sites) 
') 
 FETCH NEXT FROM DBCursor into @DBName, @dbid 
END 
CLOSE DBCursor 
DEALLOCATE DBCursor 
 
select * from orphanlist 

 

 Running this script will give you ConfigurationDatabase, ContentDatabase, SitePath, SiteID and Potential Type of orphan(contentDB or ConfigDB).

 

Wonderful now we have a list of items to clean!

 

 

How to Clean Orphans from your environment.

(Be smart have backups prior whenever you are performing maintenance on your farm, use suggestions at your own risk.)

 

Configuration Orphans: These are the orphans that reside in your configuration database but have no child counterpart (contentDB entry).  Cleaning these are the easiest of all the orphans.  Simply detach the content database from your farm that was included in the result set and reattach it.  This will refresh the sitelist that is tied to that content database and will remove the stale entry.

 

Content Database Orphans:  This is where it gets tricky.

 

There are 2 essential types of content database orphans.

 

Type 1 Scenario(reactive maintenance):  Your site that has the content you need is not mapped to the configuration database but resides in a content database that is connected to the farm, additionally  a blank new site is mapped to the configuration database.  This would hold true to the second scenario explained above.  To resolve this simply backup the site that is accessible, then delete it.  Once done detach and reattach the database that contains the real site.  This will remap the site to the configuration database.  You now have access to a once orphaned site and all of your content is restored.

 

Type 2 Scenario(Planned Maintenance): The correct site is mapped to the configuration database, however you have stale orphans in other databases.  Simply backup your production site and delete it.  Once that completes detach and attach the database that contains the stale orphan, this will in effect map the orphan to the configuration database and render it accessible.  You can then delete it using STSADM.  Perform these steps until you have cleaned all orphans.  Once all orphans are clean you can then restore your production site back into the farm.   Viola, orphan free :)

 

 

How to help prevent orphans.

 

    - User Education

   Remind users that creating and deleting a site can be a sometimes several minute process.  Regardless of how long it takes let the application finish what it is doing.  If it times out then try again but never click back, stop, or close the window when performing these types of administration tasks.  Network Latency, Web front end performance and SQL backend performance can all attribute to a slow create or delete statement.  Be patient :).

 

- Operations Maintenance

When performing maintenance to a database where you are taking it offline for any reason consider it a good practice that if possible takes the entire portal offline.  Even though you may be only working with a small fraction of the farm you could potentially be creating a lot more problems than you realize.  If you are not able to bring down the entire portal a suggestion to alleviate the potential for orphans it to leave the database attached to the farm UI while it is in transit / detached from SQL.  The user will then get a "Cannot connect to database error."  Rather than being able to create a potential orphan.  Then as your last step for the migration perform the detach/attach in the admin UI very quickly limiting the possibility for orphan creation.

 

Hope this helps some of the confusion out in the world on orphans!

 

Cory

Posted by corybu | 8 Comments
Filed under: ,
 
Page view tracker