-
I’m working on another BizTalk performance gig and created a few custom HAT queries for measuring BizTalk artifact durations. I like to call them BizTalk Artifact Duration Aggregations or BADAggs for short – pun intended. ;-)
To aggregate artifact durations for the last x amount of time, change the dateadd parameters on line 2 of the following query. As written, the query averages the durations of all of the artifacts that executed within the last 30 minutes.
declare @Timestamp as datetime
set @Timestamp = dateadd(minute, -30, GETUTCDATE())
SELECT
[Service/Name],
AVG([ServiceInstance/Duration]) as AverageDuration
FROM dbo.dtav_ServiceFacts sf WITH (READPAST)
WHERE [ServiceInstance/StartTime] > @Timestamp
GROUP BY [Service/Name]
ORDER BY AverageDuration desc
This next query allows you to choose a begin and end time range to aggregate the durations of the artifacts.
declare @BeginTime as datetime
declare @EndTime as datetime
set @BeginTime = CAST('2008-05-04 00:00:00.000' as datetime)
set @EndTime = CAST('2008-05-06 00:00:00.000' as datetime)
SELECT
[Service/Name],
AVG([ServiceInstance/Duration]) as AverageDuration
FROM dbo.dtav_ServiceFacts sf WITH (READPAST)
WHERE [ServiceInstance/StartTime] > @BeginTime
AND [ServiceInstance/StartTime] < @EndTime
GROUP BY [Service/Name]
ORDER BY AverageDuration desc
Both of these return results similar to this:
[Service/Name], AverageDuration
Microsoft.BizTalk.DefaultPipelines.XMLReceive,1057
Microsoft.Samples.BizTalk.ConsumeWebService.ReceivePOandSubmitToWS,7375
Microsoft.BizTalk.DefaultPipelines.PassThruTransmit,1115
Enjoy!
-
We [Microsoft] tend to have quite a few diamonds in the rough. One of those is our DLL Help database. This is a public tool where you can search to see nearly all of the public releases of a specific file. For example, if you are having an issue with a Microsoft product and have it narrowed down to a DLL, EXE, or SYS file, then you can look it up to see if there are newer releases of it and/or what public release the file was installed by. Anyway, here is the location of the database.
Microsoft DLL Help Database
http://support.microsoft.com/dllhelp/
The disadvantage of this database is that it doesn't always show *all* of the public releases, such as hotfixes. For those, search for the file name in the Microsoft Knowledge Base at:
http://support.microsoft.com
-
Introduction
The purpose of this article is to provide prescriptive guidance on how to troubleshoot free system page table entries (PTEs) with regard to Windows performance analysis.
Start with the following performance counter to analyze free system PTEs:
· \Memory\Free System Page Table Entries
A page table is the data structure used by the Windows Virtual Memory Manager (VMM) to store the mapping between virtual addresses and physical addresses in memory. The performance counter Free System Page Table Entries is the number of page table entries not currently used by the system.
From the process perspective, each virtual address conceptually refers to a byte of physical memory. It is the responsibility of the Virtual Memory Manager (VMM), in conjunction with the processor's memory management unit (MMU), to translate or map each virtual address into a corresponding physical address.
The VMM performs the mapping by dividing the RAM into fixed-size page frames, creating system PTEs to store information about these page frames, and mapping them. System PTEs are small kernel-mode buffers of memory that are used to communicate with the disk I/O subsystem and the network. Each PTE represents a page frame and contains information necessary for the VMM to locate a page.
Note: Troubleshooting system PTEs is explained in more detail in the “Detection, Analysis, and Corrective Actions for Low Page Table Entry Issues” article mentioned in the References section below.
This article is grouped by symptoms, then by possible causes.
Symptoms: Lack of Free System Page Table Entries (PTEs) and system-wide delays (I/O request failures)
Applies to:
- 32-bit Windows Server 2003 (all editions) unless otherwise specified
- 32-bit Windows XP (all editions) unless otherwise specified
- 32-bit Windows 2000 Server (all editions) unless otherwise specified
Symptom Details:
· Lack of Free System Page Table Entries (PTEs): Watch the “\Memory\Free System Page Table Entries” performance counter for values under 5000. Alternatively, the !pte command in the kernel debugger can be used to examine PTEs.
· Periodic system-wide delays: System-wide delays or hangs that occur regularly or occur during elevated load on the system. This is measured by the user experience and I/O response times of the system.
Possible Cause: Use of Physical Address Extension (PAE) Kernel
How to Diagnose
· To determine whether the Windows Server 2003 machine booted with the PAE kernel, check for a value of 1 in the registry value “HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\PhysicalAddressExtension”. Alternatively, search the boot.ini file for the /PAE boot switch.
Note: When execution protection is enabled on 32-bit Windows, the system automatically boots in PAE mode (automatically selecting the PAE kernel).
· The PAE switch allows the Windows operating system to address more than 4 GB of physical memory, and in certain environments the PAE switch is automatically enabled. On x86 systems, the page table index is 10 bits wide (9 on PAE), allowing each page table to reference up to 1024 4-byte PTEs (512 8-byte PTEs on PAE). Because each PTE is twice as large under PAE, the /PAE switch causes PTEs to use twice the normal allotted virtual address space. For more information, see “Microsoft Windows Internals” by Mark E. Russinovich and David Solomon.
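To make the arithmetic concrete, here is a minimal Python sketch. It assumes the standard 4 KB x86 page size and 8-byte PTEs under PAE (4-byte PTEs otherwise); the function names are made up for illustration and are not part of any Windows API.

```python
PAGE_SIZE = 4096      # bytes in one x86 page (a page table occupies one page)
NON_PAE_PTE_SIZE = 4  # bytes per PTE without PAE
PAE_PTE_SIZE = 8      # bytes per PTE with PAE (assumed, see lead-in)


def ptes_per_page_table(pte_size_bytes: int) -> int:
    """Number of PTEs that fit in a single page table page."""
    return PAGE_SIZE // pte_size_bytes


def page_table_bytes(pages_to_map: int, pte_size_bytes: int) -> int:
    """Bytes of PTE storage needed to map the given number of pages."""
    return pages_to_map * pte_size_bytes


if __name__ == "__main__":
    pages_in_1gb = (1024 ** 3) // PAGE_SIZE
    for label, size in (("non-PAE", NON_PAE_PTE_SIZE), ("PAE", PAE_PTE_SIZE)):
        print(label, ptes_per_page_table(size), "PTEs per page table;",
              page_table_bytes(pages_in_1gb, size) // 1024,
              "KB of PTEs to map 1 GB")
```

The index width follows from the entry count: 1024 entries need a 10-bit index and 512 entries need a 9-bit index, which matches the figures quoted above.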
Possible Solutions and/or Recommendations
· Migrate to 64-bit Windows: 64-bit Windows has a much larger amount of memory available for system PTEs. Specifically, 64-bit Windows has a maximum of 128 GB for PTEs while 32-bit Windows has a maximum of 660 MB. For more information, see http://support.microsoft.com/kb/294418.
· Migrate to Windows Server 2008 or Windows Vista: Windows Vista and Windows Server 2008 use dynamic memory pools to better manage system PTE memory. For more information, see http://support.microsoft.com/kb/294418.
· Remove the /PAE switch: If not needed, then remove the /PAE switch from the boot.ini file.
Possible Cause: Use of the /3GB switch
How to Diagnose
· The /3GB switch significantly reduces the number of free system PTEs. To check whether the /3GB switch is enabled, use one of the following methods:
o Open the boot.ini file and look for the /3GB switch on the most recently used boot option.
o Check the value of the registry key: “HKLM\SYSTEM\CurrentControlSet\Control\SystemStartOptions”
o Check the value of the WMI property root\cimv2\Win32_OperatingSystem.MaxProcessMemorySize (reported in kilobytes) for a value close to 3,145,536 (3 GB) and greater than 2,097,024 (2 GB).
· Refer to the chart below for the starting amount of free PTEs in different operating system configurations. Note the difference in the amount of free PTEs when the /3GB switch is enabled.
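The WMI check above boils down to a simple comparison. The following Python sketch shows only that comparison; it assumes the value has already been read from Win32_OperatingSystem.MaxProcessMemorySize (in kilobytes) by some other means, and the function name is hypothetical.

```python
# Values quoted in the article, in kilobytes.
DEFAULT_2GB_KB = 2_097_024   # typical value with the default 2 GB user space
FULL_3GB_KB = 3_145_536      # typical value with /3GB and no /USERVA tuning


def looks_like_3gb_switch(max_process_memory_kb: int) -> bool:
    """True when the user-mode address space is larger than the 2 GB default.

    A value between the two constants suggests /3GB combined with /USERVA.
    """
    return max_process_memory_kb > DEFAULT_2GB_KB


if __name__ == "__main__":
    for sample_kb in (DEFAULT_2GB_KB, FULL_3GB_KB):
        print(sample_kb, looks_like_3gb_switch(sample_kb))
```

This only flags the larger user-mode address space; it does not prove which boot option was used, so confirm against boot.ini or SystemStartOptions.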
Possible Solutions and/or Recommendations
· Use the /USERVA switch: Use the /USERVA switch on Windows Server 2003 to give some of the memory back to the kernel specifically for system PTEs. For more information on the /USERVA switch, go to: “How to use the /userva switch with the /3GB switch to tune the User-mode space to a value between 2 GB and 3 GB” http://support.microsoft.com/kb/316739.
Note: For a Windows 2000 server in a /3GB scenario the SystemPages registry setting is used to accomplish the same effect as the /userva switch. For more information, go to http://technet2.microsoft.com/windowsserver/en/library/c5ccbaec-f552-4f61-a488-8ee3330d1eeb1033.mspx.
· Remove the /3GB switch: If not needed, then remove the /3GB switch from the boot.ini file.
· Reduce the amount of physical memory: See “Possible Cause: Too much physical memory” below.
· Migrate to 64-bit Windows: 64-bit Windows has a much larger amount of memory available for system PTEs. Specifically, 64-bit Windows has a maximum of 128 GB for PTEs while 32-bit Windows has a maximum of 660 MB. For more information, see http://support.microsoft.com/kb/294418.
· Migrate to Windows Server 2008 or Windows Vista: Windows Vista and Windows Server 2008 use dynamic memory pools to better manage system PTE memory. For more information, see http://support.microsoft.com/kb/294418.
Possible Cause: Too much physical memory
How to Diagnose
· Refer to the “Estimated Kernel Resources Chart” in the More Information section for the starting amount of free PTEs in different operating system configurations. Note how increasing the amount of physical memory on the server reduces the number of free PTEs on the server.
Possible Solutions and/or Recommendations
· Reduce the amount of physical memory: Reduce the amount of physical memory to free up more virtual memory for system PTEs.
· Migrate to 64-bit Windows: 64-bit Windows has a much larger amount of memory available for system PTEs. Specifically, 64-bit Windows has a maximum of 128 GB for PTEs while 32-bit Windows has a maximum of 660 MB. For more information, see http://support.microsoft.com/kb/294418.
· Migrate to Windows Server 2008 or Windows Vista: Windows Vista and Windows Server 2008 use dynamic memory pools to better manage system PTE memory. For more information, see http://support.microsoft.com/kb/294418.
Possible Cause: High resource consumption and/or poorly written device drivers
How to Diagnose
Possible Solutions and/or Recommendations
· Remove any unnecessary third-party drivers and drivers not listed on the Hardware Compatibility List (HCL): Drivers that are not on the Hardware Compatibility List (HCL) have a higher likelihood of causing system problems and/or misuse of system resources. For more information about the Microsoft Hardware Compatibility List (HCL), go to http://winqual.microsoft.com/hcl/.
· Use /BASEVIDEO: Use the /BASEVIDEO boot switch in the boot.ini file or a generic video driver to free up system page table entries. Video boards use the system page table entries to map their buffers in kernel space. This usage competes with the need for system page table entries by Microsoft Exchange.
· Use the /USERVA switch: Use the /USERVA switch on Windows Server 2003 to give some of the memory back to the kernel specifically for system PTEs. For more information on the /USERVA switch, go to: “How to use the /userva switch with the /3GB switch to tune the User-mode space to a value between 2 GB and 3 GB” http://support.microsoft.com/kb/316739. For a Windows 2000 server in a /3GB scenario the SystemPages registry setting is used to accomplish the same effect as the /userva switch.
· Remove the /3GB switch: If not needed, then remove the /3GB switch from the boot.ini file.
· Reduce the amount of physical memory: See “Possible Cause: Too much physical memory” above.
· Migrate to 64-bit Windows: 64-bit Windows has a much larger amount of memory available for system PTEs. Specifically, 64-bit Windows has a maximum of 128 GB for PTEs while 32-bit Windows has a maximum of 660 MB. For more information, see http://support.microsoft.com/kb/294418.
· Migrate to Windows Server 2008 or Windows Vista: Windows Vista and Windows Server 2008 use dynamic memory pools to better manage system PTE memory. For more information, go to http://msdn2.microsoft.com/en-us/library/bb870880(VS.85).aspx.
Possible Cause: High number of mailboxes on Microsoft Exchange Server
How to Diagnose
Possible Solutions and/or Recommendations
· Reduce the number of mailboxes on the server and/or remove the public folder role from the mailbox server.
Possible Cause: High number of Microsoft Exchange mailbox stores that have the public folder server set as their default public store
· Reduce the number of mailbox stores that have the public folder server set as their default public store. This action will reduce the number of clients (and, as a result, the number of user sessions) on the public folder server.
References
Contributors:
Clint Huffman, Shane Creamer, Rick Anderson, Maximilian Silva, Matthew Walker, Pavel Lebedynskiy, John Rodriguez, Mike Lagase, Yong Rhee
-
I'm proud to announce that I will be speaking at TechEd 2008 this year in Orlando, Florida. It will be a 1 hour chalk talk during the ITPro week (June 10th - 13th, 2008). I hope to see some of the PAL tool users there! :-)
My session will be on Microsoft BizTalk Server Performance Analysis. This session covers how to leverage performance counters and various tools to identify performance bottlenecks in Microsoft BizTalk Server 2006. This session includes a brief introduction to the PAL tool (Performance Analysis of Logs tool). BizTalk has over 300 performance counters, so having an automated tool helps with analysis.
My PAL tool is a free download at http://www.codeplex.com/PAL. It analyzes performance monitor counter logs (*.blg) and creates an HTML report showing where the counters broke known thresholds. It's not your typical tool. It asks the user for more information about the server where the counter log was captured and uses those answers as variables in the thresholds. The thresholds are actual code that is executed at runtime. I wrote it because I teach performance analysis to customers and analyzing perf counter logs can be a very time consuming process. The tool does not replace normal performance analysis, but is simply a time saver, letting you know what to look at and what not to look at. Even though I wrote it, I can't take credit for all of it. My team, Microsoft Premier Field Engineering (PFE), and Microsoft Customer Support Services (CSS) helped out a lot with creating the thresholds.
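To illustrate the idea of thresholds as code that take the user's answers as variables, here is a small Python sketch. This is not how PAL is implemented; the threshold rule (available memory below 10% of installed RAM), the answer key, and the function names are all hypothetical.

```python
from typing import Callable, Dict, List

Answers = Dict[str, float]                    # what the user told us about the server
Threshold = Callable[[float, Answers], bool]  # returns True when a sample breaks it


def low_available_memory(sample_mb: float, answers: Answers) -> bool:
    """Hypothetical rule: available memory below 10% of installed RAM."""
    return sample_mb < 0.10 * answers["total_ram_mb"]


def find_breaches(samples: List[float], threshold: Threshold,
                  answers: Answers) -> List[int]:
    """Return the indexes of the samples that broke the threshold."""
    return [i for i, value in enumerate(samples) if threshold(value, answers)]


if __name__ == "__main__":
    answers = {"total_ram_mb": 2048}   # hypothetical answer from the user
    available_mbytes = [900, 300, 150]  # hypothetical counter samples
    print(find_breaches(available_mbytes, low_available_memory, answers))
```

Because the threshold is a function rather than a fixed number, the same counter log can be judged differently for a 2 GB server and a 32 GB server.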
-
While troubleshooting disk latency ghosts (see my previous blog entry on AMD Opteron processors), I came across a set of tools. Apparently these tools have been around a while, but I never knew.
Here is the link to them:
http://research.microsoft.com/barc/Sequential_IO/
Here is a quick summary of the tools:
The DiskSpd program to measure disk bandwidth (DiskSpd.zip).
The NetSpd program to measure network bandwidth (NetSpd.zip).
The MemSpd program to measure memory bandwidth (MemSpd.zip).
The GenFile program to quickly create large disk files with randomized content (GenFile.zip).
The DumpFile program displays sections of arbitrarily large files in hex and/or ASCII (DumpFile.zip).
They were helpful this week and I'll add these little guys to my performance analysis toolkit.
-
I've been gathering performance counter data all week on a server and found a severe disk latency issue where disk writes appeared to take longer than 1 second (they should take less than 15ms). The odd part is that the server was very responsive and the typical symptoms just weren't there. I even wrote my own tool to measure disk write times, which showed less than 1ms responses.
Long story short, the cause was really a time-stamp counter drift issue on certain AMD processor computers. The telltale sign of this issue is negative ping response times, like this:
Reply from 216.109.112.135: bytes=32 time=-2695ms TTL=53
Reply from 216.109.112.135: bytes=32 time=-2697ms TTL=52
Reply from 216.109.112.135: bytes=32 time=-2759ms TTL=52
Reply from 216.109.112.135: bytes=32 time=-2694ms TTL=53
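If you have ping output saved from a server, a few lines of Python can scan it for this symptom. This is a minimal sketch that assumes the English "time=...ms" format shown above; the function name is made up.

```python
import re
from typing import List

# Matches "time=12ms" and "time=-2695ms"; "time<1ms" replies are skipped.
_TIME_RE = re.compile(r"time=(-?\d+)ms")


def negative_ping_times(ping_output: str) -> List[int]:
    """Return every negative round-trip time (in ms) found in ping output."""
    times = [int(match) for match in _TIME_RE.findall(ping_output)]
    return [t for t in times if t < 0]


if __name__ == "__main__":
    sample = (
        "Reply from 216.109.112.135: bytes=32 time=-2695ms TTL=53\n"
        "Reply from 216.109.112.135: bytes=32 time=-2697ms TTL=52\n"
    )
    print(negative_ping_times(sample))
```

Any non-empty result is worth a look, since a round trip cannot take negative time.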
According to the knowledge base article 938448, the fix is simple:
Add the /usepmtimer parameter to the Boot.ini file, and then restart the server. The /usepmtimer parameter is automatically added to the Boot.ini file when you install the latest AMD PowerNow! Technology driver from AMD. The updated driver itself does not resolve this problem. However, the installation process makes the necessary changes to the Boot.ini file to resolve this problem.
Here is the KB article link:
A Windows Server 2003-based server may experience time-stamp counter drift if the server uses dual-core AMD Opteron processors or multiprocessor AMD Opteron processors
http://support.microsoft.com/kb/938448/en-us
Anyway, I just wanted to let everyone know that this is a very *real* issue and if you apply the boot.ini setting, then it will save you a whole lot of time and effort. Believe me, I wasted a week chasing ghosts on this until one of the guys I was working with casually mentioned this issue.
-
I recently had a Microsoft SQL Server expert ask me about some tips regarding Microsoft BizTalk Server 2006 databases, so I thought I'd share this information on my blog.
First of all, BizTalk databases are not your normal SQL databases. Yes, I said databases… as in plural. BizTalk comes with about 6 core databases (MsgBox, Tracking, MgmtDB, SSODB, BAM, and RulesEngine). When you add on other bolt-on products like SharePoint, and custom databases, you can see that there is a *lot* to manage. It’s common to have BizTalk databases on a SAN due to their thirst for disk I/O. It’s like this because BizTalk must not lose any messages or transactional states.
BizTalk Database Backups: They *must* be backed up using a specific SQL job created by BizTalk during configuration. Normal backups don’t work because BizTalk has the potential to do DTC transactions across databases, therefore all of the databases must be log marked, then backed up to ensure a good restore point.
BizTalk Clustering versus NLB: Generally speaking, when BizTalk is *pulling* in messages, then cluster the BizTalk service using Microsoft Cluster Services. When messages are *pushed* to BizTalk, then use Network Load Balancing (NLB).
BizTalk Clustering and SQL Clustering: Without going into much detail, a BizTalk cluster and a SQL cluster should not exist on the same servers/cluster at the same time. This primarily depends on whether the SSO service is clustered on the same cluster. Almost always you have a backend SQL cluster. The BizTalk servers don’t always need to be clustered – see the BizTalk Clustering versus NLB section above.
Tuning of BizTalk Databases (SQL Tuning): BizTalk databases are *not* tunable from a SQL Server perspective. BizTalk places hints on all of its queries, so tuning of indexes and such would be very difficult. Effectively, you perf tune BizTalk by changing BizTalk’s configuration.
Max Degree of Parallelism: The BizTalk MessageBox Database requires a Max Degree of Parallelism of 1. Anything else is not supported. With that said, any of its other databases can have a non-1 setting.
I hope this helps clear up some things.
-
Here are a few quick network checks I do when I go on-site with customers.
First, I do a ping to measure latency - a LAN should respond in less than a few milliseconds. Next, I do a PathPing (sends a burst of 100 ping packets at each hop, then measures how many come back) to determine if there is any packet loss. Finally, I do a 100MB file copy. On a healthy 100Mbit network a 100MByte file should copy in about 10 to 20 seconds. On a healthy 1Gbit network a 100MByte file should take 3 to 5 seconds to copy. If the copy is taking minutes, then you know there is a problem. Every once in a while I come across a network where the auto detect feature of the network adapter negotiated 10Mbit half-duplex, which explains the minutes to copy.
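The copy times above can be sanity checked with simple arithmetic. This Python sketch computes the best case only, assuming decimal megabytes and a link that is fully available to the copy; real copies take longer because of protocol overhead and disk speed, which is why the rule-of-thumb numbers above are higher. The function name is made up for illustration.

```python
def ideal_copy_seconds(file_mbytes: float, link_mbits_per_sec: float) -> float:
    """Best-case seconds to move a file over a link (8 bits per byte)."""
    return file_mbytes * 8 / link_mbits_per_sec


if __name__ == "__main__":
    for link in (10, 100, 1000):
        print(f"{link} Mbit: {ideal_copy_seconds(100, link)} s for a 100 MByte file")
```

At 10Mbit the best case is already 80 seconds, and half-duplex collisions only make it worse, so a copy that takes minutes fits a link that negotiated down to 10Mbit half-duplex.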
I do these tests at nearly every customer site I go to and they are low overhead tests.
-
Microsoft Office SharePoint Server 2007 (MOSS 2007) has some great output cache settings, but what about SharePoint 2003? Internet Explorer does conditional caching by default, but wouldn't it be nice to eliminate the majority of the HTTP round trips that end in HTTP 304 - Not Modified responses?
In the Internet Information Services (IIS) administration console, navigate to the web site properties of the SharePoint 2003 enabled web site and add the HTTP header "Cache-Control" as a custom HTTP header with a value of "max-age=86400", which effectively sets an expiration of 24 hours. This can also be accomplished by enabling the content expiration feature in the same form. IIS will now add the Cache-Control HTTP header to all outbound content on the web site. This has the effect of making the web site load *very* quickly due to far fewer HTTP round trips.
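For anyone wondering where 24 hours comes from, max-age is expressed in seconds. The Python sketch below parses the header value and applies a simplified freshness rule (age less than max-age); real browsers apply more rules than this, and the function names are made up.

```python
from typing import Optional


def parse_max_age(cache_control: str) -> Optional[int]:
    """Extract the max-age value (seconds) from a Cache-Control header value."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def is_fresh(age_seconds: int, max_age_seconds: int) -> bool:
    """Simplified rule: cached content is reused while younger than max-age."""
    return age_seconds < max_age_seconds


if __name__ == "__main__":
    max_age = parse_max_age("max-age=86400")
    print(max_age / 3600, "hours")
    print(is_fresh(3600, max_age), is_fresh(90000, max_age))
```

While the cached copy is fresh, the browser does not contact the server at all, which is what removes the conditional request and its 304 response.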
Important facts to know about this setting:
1) While this setting applies to all of the outbound SharePoint content - not all of the content is cacheable by the browser. I'll describe some of my observations later in this posting.
2) Normally, this setting only applies to content in the IIS file system, but since we set it at a web site level, it is applied to all of the outbound traffic.
Since this brings great performance, why is it not on by default? Well, there are drawbacks to this configuration. The biggest drawback is that *nearly all* of the content will be cached. This means that images loaded on the web site as content will show the cached data until the expiration date you set has been reached or the user explicitly refreshes the browser. Other SharePoint content, such as Microsoft Office 2007 documents, will also show the *cached* version of the document rather than any updated version. To some people this is a very important issue, but again it can be worked around by refreshing the browser.
Here are the cached and non-cached items that I observed:
Non-Cached items: ASPX pages and Microsoft Office 2003 documents do not seem to be affected by the cache-control setting. Effectively, everything that is *not* opened directly in the browser, with the exception of ASPX pages, is not cached.
Cached items: Effectively everything that is opened by the browser is cached. This includes images (*.jpg, *.gif, *.tif, etc) and Microsoft Office 2007 documents (*.pptx, *.docx, *.xlsx). This means that once an item is downloaded, the browser will just get it from local cache and won't ask for it again unless the content has expired or an explicit Refresh is done in the browser (Internet Explorer).
So how would you get the best of both worlds - meaning you want to explicitly not cache certain content items under certain paths? We considered using HTTP handlers in IIS6 to do this, but HTTP handlers only work with .NET content - not static items such as Office 2007 documents and images. Therefore, the only viable solution is to upgrade to MOSS 2007 (assuming it supports this kind of caching) or implement an IIS ISAPI filter.
IIS ISAPI filters are DLLs that have exclusive access to inbound and outbound HTTP traffic before and after IIS is finished processing the request. In this case, you could write an outbound ISAPI filter that looks for a specific path and a specific file extension, then have it modify the cache-control header on the way out. While this is a great benefit, developing a custom ISAPI filter is no easy task.
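The decision such a filter would make is simple even if the filter itself is not. Here is a Python sketch of just that decision logic; it is not an ISAPI filter (those are native DLLs), and the path prefix, extension list, and function name are hypothetical.

```python
import posixpath

# Hypothetical rules: do not let browsers cache Office 2007 documents under this path.
NO_CACHE_PREFIXES = ("/sites/finance/",)
NO_CACHE_EXTENSIONS = (".docx", ".xlsx", ".pptx")


def cache_control_for(url_path: str, default: str = "max-age=86400") -> str:
    """Pick the Cache-Control value for a response based on path and extension."""
    path = url_path.lower()
    extension = posixpath.splitext(path)[1]
    if path.startswith(NO_CACHE_PREFIXES) and extension in NO_CACHE_EXTENSIONS:
        return "no-cache"
    return default


if __name__ == "__main__":
    for p in ("/sites/finance/Budget.xlsx", "/sites/finance/logo.gif"):
        print(p, "->", cache_control_for(p))
```

Everything that does not match both the path and the extension keeps the site-wide 24 hour setting, so images still load from the browser cache.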
In conclusion, it is certainly possible to greatly increase the perceived performance of SharePoint 2003 web sites by adding the cache-control (expiration date) on your content, but be aware of the disadvantages of this implementation. Furthermore, to do cache control correctly, you would need a custom ISAPI filter and that can be a difficult task.
-
A few months ago, I tech reviewed a very well done article on BizTalk performance. It is called "BizTalk Server Database Optimization" and it is located here:
http://msdn2.microsoft.com/en-us/library/bb743398.aspx
John B. Brockmeyer did a great job with the article and I commend his attention to detail. I now reference this document often to my customers.
I had the honor of tech reviewing it prior to it being published and I was amazed that they kept me in the credits as a contributor.
Regarding the article, John did an outstanding job of describing each of the settings, the difficulty level of changing them, and the likely benefits of changing them. Whenever I come across a performance bottleneck in one of these areas, this is the first article I bring up to see what changes we can make.
On the negative side, I disagree with how RAID5 is condemned. I agree that RAID5 is bad for disk write operations, but it's actually very good at read operations and disk capacity. In a perfect world, you would have a separate RAID0+1 for each data file and log file of the 10+ BizTalk related databases, but that is impractical. My approach is to first see which disk is slow using the Avg. Disk sec/Read and Avg. Disk sec/Write counters (look for disk latency greater than 15ms), then use the Microsoft Server Performance Advisor (SPA) or Process Monitor (SysInternals tool - now owned by Microsoft) to identify the files and processes causing the highest amount of I/O. Finally, I move the files causing the most write I/O to isolated RAID0+1 arrays.
Again, this article is a great achievement, so don't let my small details slow it down. Great job, John!
-
I'm a contributor for the book "Performance Testing Guidance for Web Applications" and it is now published on Amazon.com. This was a lot of work on everyone's part to get this completed and I got to work with a lot of old friends as well such as Mark Tomlinson and Ed Glas. Mark and I used to work in the Microsoft Services Labs together doing test consulting with customers, so we have a lot of real world experience in this guide. In any case, Scott Barber put in a *ton* of work into this, so I'm glad to see him as the author.
Our patterns & practices Performance Testing Guidance for Web Applications book is now available on Amazon.
After I left the labs, I focused nearly all of my attention on performance analysis, so the two How To articles that I wrote didn't make it into the book because they deal more with analysis than testing. I spoke to J.D. Meier to see if we could do a book on Performance Analysis next, but they have to finish other projects first. In any case, he liked the idea, so who knows... maybe in another year or two we'll have a book on it. In the meantime, I'll just blog about it. ;-)
Here are the two How to articles that I wrote:
How To - Identify a Disk Performance Bottleneck Using SPA
http://www.codeplex.com/PerfTesting/Wiki/View.aspx?title=How%20To%3a%20Identify%20a%20Disk%20Performance%20Bottleneck%20Using%20SPA1&referringTitle=How%20Tos
How To - Identify Functions Causing High CPU
http://www.codeplex.com/PerfTesting/Wiki/View.aspx?title=How%20To%3a%20Identify%20a%20Disk%20Performance%20Bottleneck%20Using%20SPA&referringTitle=How%20Tos
As you can see, these articles are geared for perf analysis versus testing, so they didn't make it into the book. In any case, I'm happy to have them up on Codeplex.com, because I reference them a *lot* when working with my customers on perf issues.
-
Process Monitor by SysInternals (owned by Microsoft) (not to be confused with Process Explorer) is a rewrite from the ground up of RegMon and FileMon. It combines the features of RegMon and FileMon, and adds process and thread monitoring as well. It will aggregate the data in the trace, so you can see stuff like which process is accessing the disk/registry the most. Furthermore, you can add advanced filters such as monitoring a particular regkey, file, process, etc. Finally, the best part is that once you see a *problem*, you can get the thread *stack* (both kernel mode and user mode) of the process that is accessing that resource... how cool is that?! This requires the Debugging Tools for Windows and symbols to be installed, but that is easily done.
Process Explorer rocks as well because it can show you the current function calls that each of the threads of your process is on. For example, when Outlook is hung, you can see its current thread stacks (requires the Debugging Tools for Windows to be installed). Unfortunately, I can’t seem to get it to use my symbols path properly to make this feature more effective. In any case, it has information on just about anything you want to know about a process.
-
<rhetorical question>Having performance problems on your Vista laptop and a disk queue length of about 20?</rhetorical question>
This week has been a particularly bad week where TrustedInstaller.exe was wreaking havoc on my disk so much that my laptop was unusable for minutes at a time… nothing new, but extremely irritating. I typically disable and kill it, but it was getting very aggressive this week despite my best efforts. In desperation, I updated the BIOS and all of my device drivers, and the problem still persisted. Finally, I installed the Vista SP1 (beta) RC1 last night (2 hours to download and 1 hour to install). I am *very* happy to say that my laptop is running great – best I have ever seen it! Furthermore, I’ve been watching out for TrustedInstaller.exe using Process Monitor (http://www.microsoft.com/technet/sysinternals/default.mspx) and it appears to have been pacified. I hope this will continue to be the case.
In any case, I wanted to pass on the news that better performance is just a few clicks (and lots of time) away. ;-)
-
When I go on-site with customers who are using Microsoft Network Load Balancing (NLB), most of the time they have it in a *working* condition, but may be having intermittent networking issues with it. In this blog posting, I'm going to talk about the pros and cons of using NLB versus hardware solutions, how it is different from Microsoft Cluster Services (MSCS), and how to configure your network topology for optimal performance and reliability.
First, let's compare NLB to a hardware load balancer such as F5 BIG-IP or Cisco Content Services Switch (CSS). In many ways, hardware load balancers are superior to NLB, but they are very expensive devices that you might not need. Plain vanilla NLB works best when the client IP addresses are exposed to the NLB enabled servers - this means it generally works best for intranet applications or internet applications that don't require server affinity/stickiness. A while ago, Microsoft Application Center 2000 solved this problem by adding cookie based server affinity/stickiness and full monitoring, which brought the NLB servers online/offline when certain conditions were met, such as IIS services running and HTTP requests being answered, but now we just have NLB again. The hardware load balancers typically come with cookie based server affinity/stickiness and monitoring in one package. Finally, the two great advantages of NLB are that it's cheap (comes with the operating system) and there is no single point of failure - meaning each server is a peer and there is no central load distributor, giving it maximum redundancy. In summary, NLB is best used for intranet applications or internet applications that do not require server affinity/stickiness; otherwise consider a hardware load balancing solution.
Regarding NLB compared to Microsoft Cluster Services (MSCS), they are different technologies, yet similar enough in terminology and topology that we inevitably mix the two up. MSCS is where two or more servers share a hardware resource so that if one of them goes down, then the other will pick it up. MSCS uses a backend "heartbeat" network to determine whether the partner server is alive or down. NLB also has a "heartbeat", but its heartbeat is really just broadcast TCP/IP packets on the NLB enabled network... let me repeat... on the NLB enabled network. The reason I repeated that statement is because it is common and recommended to add a second physical network adapter for servers using NLB. Unfortunately, this makes it *look* like an MSCS solution and causes confusion. The purpose of the second network adapter is in fact so the two NLB enabled servers can communicate with each other just like MSCS, but this is over a normal, everyday network (the MSCS heartbeat network is almost always an isolated network or on a cross-over cable), which leads me to my last topic regarding network topology.
NLB should be configured with 2 physical network adapters in each server: the NLB enabled, or frontend/public, network adapter and the normal, everyday, backend network adapter. The backend/normal/non-NLB network adapter is needed because when NLB is enabled on a network adapter, it masks the MAC address of that adapter with a virtual MAC address that all of the NLB servers use, typically starting with "02-BF". Imagine for a second that you only have one network adapter on the servers and you enabled NLB. NLB would work just fine and be very happy (again, its heartbeat is a broadcast packet on the NLB enabled adapter network), but the NLB servers would not be able to communicate with each other because their MAC addresses are the same. If server A wanted to talk to server B, then server A would send a TCP/IP packet addressed to server B, but since the MAC addresses are the same, it never leaves server A. Some might now suggest using the multicast mode of NLB, which allows the real MAC address of the network adapter to be revealed again, but many network switches don't like the idea of having 2 MAC addresses on the same port, let alone having multiple ports with the same MAC address, an issue I will address further down.
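To see why every server ends up with the same MAC address, it helps to know how the virtual MAC is built. As far as I know, NLB derives it from the cluster's virtual IP address: "02-BF" followed by the four octets of the IP in unicast mode, and "03-BF" followed by the same octets in multicast mode. That layout is background knowledge rather than something stated above, and the function name and sample IP in this Python sketch are made up.

```python
import ipaddress


def nlb_cluster_mac(cluster_ip: str, multicast: bool = False) -> str:
    """Build the virtual MAC NLB is understood to use for a cluster IP.

    Assumed layout: 02-BF (unicast) or 03-BF (multicast) plus the IP octets in hex.
    """
    octets = ipaddress.IPv4Address(cluster_ip).packed
    prefix = "03-BF" if multicast else "02-BF"
    return prefix + "".join(f"-{octet:02X}" for octet in octets)


if __name__ == "__main__":
    print(nlb_cluster_mac("192.168.1.10"))
    print(nlb_cluster_mac("192.168.1.10", multicast=True))
```

Since the MAC depends only on the cluster IP, every server in the cluster computes the same value, which is exactly why the servers cannot reach each other over the NLB enabled adapter in unicast mode.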
The reason I call the backend network adapter the normal, everyday network adapter is because *that* is what it should be. This means that the server's NetBIOS name in WINS, DNS, and whatever other network naming service you have should point to its backend network adapter. The reason I stress this is that if all of your incoming and outgoing traffic flows only through the frontend (NLB enabled) network adapter, then what happens when your server needs to authenticate to Active Directory or get data from your database? The responses from these servers will in some cases come back to the virtual IP address, which subsequently gets load balanced to one of the servers and lost - not a good thing. This is why you should do all of your internal network communication over the backend (non-NLB enabled) network adapter. Again, some may suggest using a dedicated IP address (DIP). Regardless of whether you use a DIP, the MAC address is still virtualized, so traffic could still be inadvertently load balanced to the wrong server. In addition, do not put a default gateway on the NLB enabled network adapter and do not register the server's NetBIOS name to the NLB adapter. Only add a static entry in DNS to point your end users to the virtual IP address(es) on the NLB enabled network adapter. Now, you are probably thinking: if I don't put a default gateway on my NLB enabled network adapter, then how will my end users receive the server responses? Read on.
If you have followed my advice so far, you should have just the virtual IP (VIP) address(es) and the Dedicated IP (DIP) address on the NLB enabled network adapter. By the way, don't confuse the DIP with the backend network adapter - the DIP refers to a special IP address that NLB uses *only* on the NLB enabled network adapter, not the backend adapter. All other network settings (WINS, DNS, NetBIOS, default gateway, etc.) are set on your backend (non-NLB enabled) network adapter. If configured properly, your user requests should come in to the virtual IP address on the NLB enabled network adapter and the responses should flow out through the backend (non-NLB enabled) network adapter. Yes, this is possible and is recommended, especially because of switch incompatibility issues, which we will talk about next. Finally, all of your internal network communication is now conducted over a normal, everyday, backend network adapter, so there is no possibility of it being misrouted.
Earlier I mentioned how some network switches don't do well with NLB. This is because of the way NLB uses MAC addresses. NLB was originally designed to be used with a network hub, which doesn't care about MAC addresses. I'm not saying that NLB doesn't work with switches. I'm just saying that you need to plug NLB enabled network adapters into network devices that 1) allow multiple ports to have the same MAC address, 2) allow all of the ports to receive all of the traffic, and 3) allow the NLB heartbeat (broadcast packets) to actually broadcast to all of the ports. If your device meets these criteria, then great. Otherwise, plug all of the NLB network adapters into a hub and uplink the hub to your network switch. Doing the hub test is especially important if you are having problems with NLB not converging.
So let's say you did the hub test and NLB is finally working for you. You are left with a problem... there aren't very many 100Mb or 1Gb hubs these days, so now we have a throughput problem, right? Well, if you have followed my advice so far, then it shouldn't be a problem even if you are using a 10Mb hub. At this point, your server requests are flowing in through the NLB enabled network adapters and the responses are flowing out through the backend (non-NLB enabled) network adapter. Server requests are typically very small, and since your responses are flowing out of the backend network adapter, all your NLB adapters have to handle is incoming requests. Since this incoming network traffic is small, a 10Mb hub should be able to handle it, with the heavier outbound traffic going out the faster 100Mb or 1Gb network adapter.
I hope this is helpful. NLB is still a great and cost-effective solution when well understood.
One last note... after posting this blog entry, I was made aware that we (Microsoft) may be releasing a patch that will allow IGMP support for NLB in multicast, so this may help.
-
I'm a Premier Field Engineer (PFE) and I go onsite with customers on a regular basis to conduct Health Checks. This is my first blog posting on TechNet, but I figured it would be on something important versus "Hello World". ;-)
More and more I am seeing customers who are not aware of kernel memory issues on the 32-bit Windows architecture. If you are running 32-bit Windows 2000 or 32-bit Windows 2003, then check the kernel memory. Lack of kernel memory can lead to system-wide hangs which seem unexplainable, so this is a serious issue. As a general rule, use and/or recommend 64-bit Windows or Windows Server 2008 to avoid these issues (Windows Server 2008 and Windows Vista have automatically adjusting kernel memory pool sizes). This issue and other performance issues are addressed in the Vital Signs workshop (written by Shane Creamer), which can be delivered by my team, Premier Field Engineering (PFE) - just contact your Technical Account Manager (TAM) if you are interested in this course.
Here is a kernel memory chart for Windows 2003 Server:
| Memory | Default ( /PAE for 6-16GB ) | /3GB |
|---|---|---|
| 1GB | Free System PTE: 51k; Paged Pool: 282MB; Non Paged Pool: 212MB | Free System PTE: 32k; Paged Pool: 163MB; Non Paged Pool: 131MB |
| 2GB | Free System PTE: 196k; Paged Pool: 360MB; Non Paged Pool: 262MB | Free System PTE: 16k; Paged Pool: 262MB; Non Paged Pool: 131MB |
| 3GB | Free System PTE: 195k; Paged Pool: 360MB; Non Paged Pool: 262MB | Free System PTE: 14k; Paged Pool: 262MB; Non Paged Pool: 131MB |
| 4GB | Free System PTE: 106k; Paged Pool: 336MB; Non Paged Pool: 285MB | Free System PTE: 15k; Paged Pool: 258MB; Non Paged Pool: 154MB |
| 6GB | Free System PTE: 186k; Paged Pool: 366MB; Non Paged Pool: 262MB | Free System PTE: 12k; Paged Pool: 239MB; Non Paged Pool: 131MB |
| 8GB | Free System PTE: 182k; Paged Pool: 366MB; Non Paged Pool: 262MB | Free System PTE: 12k; Paged Pool: 225MB; Non Paged Pool: 131MB |
| 12GB | Free System PTE: 175k; Paged Pool: 366MB; Non Paged Pool: 262MB | Free System PTE: 12k; Paged Pool: 196MB; Non Paged Pool: 131MB |
| 16GB | Free System PTE: 167k; Paged Pool: 366MB; Non Paged Pool: 262MB | Free System PTE: 12k; Paged Pool: 169MB; Non Paged Pool: 131MB |
Check your 32-bit servers to see whether Pool Paged memory and Pool Nonpaged memory have reached 80% of the pool sizes in the chart above. The Free System Page Table Entries (PTEs) listed in the chart are the starting amounts, which depend on the hardware specification and the boot.ini switches. Here are the performance counters to monitor:
Memory\Free System Page Table Entries
Memory\Pool Nonpaged Bytes
Memory\Pool Paged Bytes
For example, if I have a Windows 2003 SP1 server (SP1 is needed to accurately see the Free PTE data in performance monitor) with 4GB of memory and the /3GB switch turned on, then Free System PTEs start off at 15,000 (according to the chart). If the value gets below 5,000, then the system could hang temporarily. This system has a Paged Pool maximum of 258MB and a Non-Paged Pool maximum of 154MB. If the respective “Memory\Pool Paged Bytes” and “Memory\Pool Nonpaged Bytes” counter values reach 80% of these maximum pool sizes, then the system could hang temporarily. In this case, if the values go over 206MB and 123MB respectively, then it’s a critical issue.
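The arithmetic in that example can be sketched in a few lines of Python. The pool maximums and the 5,000 PTE floor come from the chart and the example above. The function names, the sample counter readings, and the choice of 1MB = 1024 * 1024 bytes are assumptions of mine for illustration.

```python
MB = 1024 * 1024          # assumption: the chart's MB means 2**20 bytes
WARN_RATIO = 0.80         # the 80% rule of thumb from the text
FREE_PTE_FLOOR = 5000     # below this the system could hang temporarily

# Pool maximums (in MB) for 4GB of RAM with /3GB, copied from the chart.
PAGED_POOL_MAX_MB = 258
NONPAGED_POOL_MAX_MB = 154

def warn_threshold_mb(pool_max_mb, ratio=WARN_RATIO):
    """Return the pool usage in MB at which the counter becomes a concern."""
    return pool_max_mb * ratio

def pool_status(counter_bytes, pool_max_mb):
    """Classify a 'Memory\\Pool ... Bytes' reading against its pool maximum."""
    limit_bytes = warn_threshold_mb(pool_max_mb) * MB
    return "critical" if counter_bytes >= limit_bytes else "ok"

def pte_status(free_ptes):
    """Classify a 'Memory\\Free System Page Table Entries' reading."""
    return "critical" if free_ptes < FREE_PTE_FLOOR else "ok"

# Hypothetical counter readings, not taken from a real server.
print(int(warn_threshold_mb(PAGED_POOL_MAX_MB)))      # 206
print(int(warn_threshold_mb(NONPAGED_POOL_MAX_MB)))   # 123
print(pool_status(210 * MB, PAGED_POOL_MAX_MB))       # critical
print(pte_status(12000))                              # ok
```

To check a different server, swap in the two pool maximums from the row of the chart that matches its memory size and boot.ini switches.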
Fix for Win2003 SP1 systems with /3GB and low on PTEs: If the system is low on PTEs, running Windows 2003, and using the /3GB switch, then consider using the /USERVA switch to give some of the memory back to the kernel. Note that this only helps with Free System PTE issues. For more information on the /USERVA switch, go to:
How to use the /userva switch with the /3GB switch to tune the User-mode space to a value between 2 GB and 3 GB
http://support.microsoft.com/kb/316739
Lack of Paged Pool or non-Paged Pool Memory: If the system is low on Paged Pool or non-Paged Pool memory, then first consider opening a support case with Microsoft to address this. Alternatively, you can use a free and public tool called Poolmon.exe to see which pool tags, and therefore which drivers, are using kernel memory (see the article below). Most kernel memory leaks can be traced back to a user-mode process. To identify which user-mode process is responsible, reboot the system (so you start off with a clean system), start a performance monitor log that runs for a week or more and captures the Memory and Process objects, then analyze the perfmon log for memory leaks and/or handle leaks in one or more of the processes.
How to Use Memory Pool Monitor (Poolmon.exe) to Troubleshoot Kernel Mode Memory Leaks
http://support.microsoft.com/kb/177415
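The "analyze the perfmon log" step above can be roughed out in Python. This is only a sketch of one possible approach, not what any Microsoft tool does: it fits a straight line to each process's samples for a counter such as Process\Handle Count and ranks the processes by how fast the value grows. The process names and sample values are invented, and in practice you would load the samples from an exported perfmon log.

```python
def growth_per_hour(samples):
    """Least-squares slope of (hours_since_start, counter_value) samples."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    return num / den

def rank_suspects(samples_by_process):
    """Return (process, slope) pairs, fastest-growing first."""
    slopes = {name: growth_per_hour(samples)
              for name, samples in samples_by_process.items()}
    return sorted(slopes.items(), key=lambda item: item[1], reverse=True)

# Hypothetical handle counts sampled once an hour.
handle_counts = {
    "leaky.exe":  [(0, 1000), (1, 1100), (2, 1200), (3, 1300)],
    "steady.exe": [(0, 500), (1, 510), (2, 495), (3, 505)],
}

for name, slope in rank_suspects(handle_counts):
    print(f"{name}: {slope:+.1f} handles/hour")
```

A process whose counter climbs steadily over the whole week and never comes back down is the one to look at first. A process that only bounces around a flat average, like the second one here, is usually fine.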
In summary, always consider 64-bit and always keep an eye on kernel memory.
In addition, a few of my colleagues and I wrote a tool that can help identify these and many other performance issues by analyzing performance counter logs. The tool is located at http://www.codeplex.com/PAL. Codeplex.com is Microsoft's open source web site.