-
It took a while, but I was able to get a new blog created using my email name “ClintH” instead of “clint_huffman”, so I will no longer be posting blog entries to *this* blog. All of my blog entries will now be at: http://blogs.technet.com/clinth
I have two large blog entries there already which talk about my recent System Page Table Entries (PTE) adventures.
Please update your RSS feeds to http://blogs.technet.com/clinth
Thank you!
-
I’m doing another blog post about the processor utilization in a Hyper-V environment because it seems like my last blog post on this subject was misinterpreted.
When Ewan Fairweather and I ran a few BizTalk servers as guest computers at 100% CPU, we brought up Task Manager on the host and it showed very little, if any, CPU usage. We thought to ourselves that either Hyper-V has the most efficient use of CPU or there is something wrong. It turns out that the host computer (root partition) is considered just another virtual computer running on the hypervisor. This means that the CPU usage we see in Task Manager is the CPU usage of the host computer only. Therefore, if you want to know how busy your physical processors really are, use the “\Hyper-V Hypervisor Logical Processor(*)\% Total Run Time” counter. The processor is the only resource affected in this way, meaning you can measure memory and network using the normal counters.
Read Tony Voellm’s blog (he is the Hyper-V Performance Lead) at:
http://blogs.msdn.com/tvoellm/
I was involved in testing performance of Hyper-V in a Microsoft BizTalk Server environment with Ewan Fairweather and we wrote this article:
http://msdn.microsoft.com/en-us/library/cc768535.aspx
Always use the Hyper-V Hypervisor performance counters first!
-
A while ago, someone asked me how to make their laptop prefer its wired internet connection over its wireless one when both are connected to the internet. The short answer is that Windows (Vista, 7, 2008, and I’m pretty sure XP and 2003 do as well) does this by default. The key here is the network interface metric. When you have more than one default gateway defined (indicated by a network destination of 0.0.0.0), the internet-bound packets go out the interface with the lowest metric. In the case of a tie (same metric), the internet-bound packets go out the interface listed first.
Below is the routing table of my laptop when it is docked at my home network where I am connected to the internet through my wired docking station and by my wireless access point. My wired IP address is 192.168.1.12 and my wireless IP address is 192.168.1.13. Windows is aware of this condition and automatically prefers my wired connection by giving it a lower metric of 20 versus my wireless connection metric of 25. In this case, this is my Windows 7 (Beta 1) laptop.
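The selection rule can be sketched in a few lines of Python, using the two interface addresses and metrics described above. This is only an illustration of the rule, not how Windows implements it; the `DefaultRoute` type and the gateway address 192.168.1.1 are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class DefaultRoute:
    interface_ip: str  # address of the local interface
    gateway: str       # hypothetical gateway address, assumed for the example
    metric: int        # lower metric is preferred


def pick_default_route(routes):
    """Return the route with the lowest metric; on a tie, the first one listed."""
    # min() returns the first minimal element, which matches the tie rule.
    return min(routes, key=lambda route: route.metric)


routes = [
    DefaultRoute("192.168.1.13", "192.168.1.1", 25),  # wireless
    DefaultRoute("192.168.1.12", "192.168.1.1", 20),  # wired
]
chosen = pick_default_route(routes)
print(chosen.interface_ip)
```

With the metrics above, the wired interface wins because 20 is lower than 25.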
You can, of course, permanently alter your metrics by editing your TCP/IP settings on your network adapter’s advanced settings. By default, it is set to Automatic metric allowing Windows to decide which metrics are best. Here is a screenshot. 
In conclusion, Windows will automatically prefer your wired connections versus your wireless connections when both are connected to the internet. This is great for us road warriors.
Scott Landry adds:
You should know that Vista made a change to how we handle existing sockets: after plugging in, existing connections will not be switched over, so you must re-establish the connection in order to make use of the wired connection. For example, if you’re downloading something from a website and realize that it would go faster by plugging in, you’d have to cancel and start over after plugging in. This is a change from XP and 2003. Here is a good reference:
The Cable Guy Strong and Weak Host Models http://technet.microsoft.com/en-us/magazine/2007.09.cableguy.aspx
-
While onsite with customers, I have found several more cases where Windows Server 2003 (32-bit) is running dangerously low on kernel memory. Lately it has been a lack of pool paged memory. In any case, you can use my earlier blog posts to estimate the kernel memory size. With that said, when you run into this issue, you need to know what the real maximum sizes are. In this blog post, I will show you two relatively easy ways to get this information.
The usage of pool paged memory and pool non-paged memory should not exceed 80% of their respective maximums. The pool usage values are exposed by the performance counters “\Memory\Pool Paged Bytes” and “\Memory\Pool NonPaged Bytes”, but their maximum sizes can only be found by debugging the kernel. I know that debugging the kernel sounds very intimidating, and while I assure you it is easy, it is also *dangerous* if you mess it up, so try this in a test environment first.
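Once you have a counter value and the matching maximum, the 80% check is simple arithmetic. Here is a minimal Python sketch of it; the function names and the sample sizes (300 MB used out of a 360 MB maximum) are made up for illustration and do not come from a real server.

```python
MB = 1024 * 1024


def pool_usage_percent(used_bytes, max_bytes):
    """Return pool usage as a percentage of the pool's maximum size."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    return 100.0 * used_bytes / max_bytes


def exceeds_limit(used_bytes, max_bytes, limit_percent=80.0):
    """True when usage is above the given percentage of the maximum."""
    return pool_usage_percent(used_bytes, max_bytes) > limit_percent


# Hypothetical values: "\Memory\Pool Paged Bytes" and the paged pool maximum.
paged_used = 300 * MB
paged_max = 360 * MB
print(round(pool_usage_percent(paged_used, paged_max), 1))
print(exceeds_limit(paged_used, paged_max))
```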
Getting Maximum Sizes for Pool Paged Bytes and Pool Non-Paged Bytes:
- Install the Microsoft Debugging Tools for Windows. This is a free download from http://download.microsoft.com. Installation of these tools does not require a reboot.
- Use Microsoft WinDBG or Process Explorer (both Microsoft tools):
-
WinDBG method:
- Open WinDBG.
- Ensure the symbol path is set: Click File, “Symbol File Path…”. Ensure the following path is in place without the double quotes: “SRV*C:\symbols*http://msdl.microsoft.com/download/symbols”. Click OK.
- Click File, “Kernel Debug…”. Select the Local tab, then click OK.
- Select “No” to any Workspace dialog boxes.
- At the “lkd>” prompt, type “!vm”
“PagedPool Maximum” is the maximum size for the Pool Paged memory.
“NonPagedPool Max” is the maximum size for the Pool Non-Paged memory.
-
Process Explorer method: This is preferred if you need to get the maximum pool sizes often.
1. Download Process Explorer from http://technet.microsoft.com/en-us/sysinternals/bb896653.aspx or run it directly from http://live.sysinternals.com. No installation is required.
2. Run procexp.exe.
3. Ensure the symbol path is set: Click Options, “Configure Symbols…”
4. Set the Symbols path to (without the double quotes): “SRV*C:\symbols*http://msdl.microsoft.com/download/symbols”
5. Set the Dbghelp.dll path to the installation directory of the Debugging Tools for Windows. The default location is: “C:\Program Files\Debugging Tools for Windows\dbghelp.dll”. Click OK.
6. Click View, “System Information…”
7. “Paged Limit” is the maximum size for the Pool Paged memory. “Nonpaged Limit” is the maximum size for the Pool Non-Paged memory.
8. If you need to get the maximum pool sizes again, then just follow steps 2, 6 and 7.
-
While doing a BizTalk Health Check this week, the customer and I thought it would be cool to try to do as much as possible in an interactive PowerShell session to collect the data we needed. We created a lot of PowerShell one-liners and I thought I’d share them. These one-liners use WMI, so nearly all of them can be run from a remote workstation. This means that PowerShell is not needed on the remote BizTalk servers unless otherwise noted.
Be careful of word wrap in this email. All of these commands are *one-liners* – meaning they all go on one line ;-).
# List of BizTalk Servers. This is needed, so that the rest of the commands will execute on each of them per command.
$BizTalkServers = "BizTalkServer1","BizTalkServer2"
# Ping servers (latency check)
$BizTalkServers | foreach-object {"================";$_;"================";ping $_}
# Pathping (packet loss check)
$BizTalkServers | foreach-object {"================";$_;"================";pathping $_}
# Get Total Physical Memory
$BizTalkServers | ForEach-Object {"================";$_;"================";Get-wmiobject -Query "SELECT TotalPhysicalMemory FROM Win32_ComputerSystem" -ComputerName $_} | Format-List TotalPhysicalMemory
# Installed hotfixes (quick)
$BizTalkServers | ForEach-Object {"================";$_;"================";Get-wmiobject -Query "SELECT HotFixID,ServicePackInEffect FROM Win32_QuickFixEngineering" -ComputerName $_} | Format-list HotfixID, ServicePackInEffect
# NumberOfProcessors
$BizTalkServers | ForEach-Object {"================";$_;"================";Get-wmiobject -Query "SELECT NumberOfProcessors FROM Win32_ComputerSystem" -ComputerName $_} | Format-List NumberOfProcessors
# System Information
$BizTalkServers | ForEach-Object {"================";$_;"================";Get-wmiobject -Query "SELECT Caption, CSDVersion, MaxProcessMemorySize, PAEEnabled, ServicePackMajorVersion FROM Win32_OperatingSystem" -ComputerName $_} | Format-List Caption, ServicePackMajorVersion, CSDVersion, MaxProcessMemorySize, PAEEnabled
# Clock Synchronization Check
$BizTalkServers | ForEach-Object {"================";$_;"================";Get-wmiobject -Query "SELECT * FROM Win32_LocalTime" -ComputerName $_} | format-list Year, Month, Day, Hour, Minute, Second
# BizTalk Host Instance information (WMIC) (Must be run locally) (WMIC is native to the operating system)
wmic /NAMESPACE:"\\root\MicrosoftBizTalkServer" PATH MSBTS_HostInstance GET HostName, HostType, Logon, NTGroupName, RunningServer
# BizTalk Host Instance information (PowerShell) (Must be run locally) (PowerShell must be installed)
$BizTalkServers | ForEach-Object {"================";$_;"================";Get-wmiobject -Namespace "root\MicrosoftBizTalkServer" -Query "SELECT * FROM MSBTS_HostInstance" -ComputerName $_} | Format-List HostName, HostType, Logon, NTGroupName, RunningServer
-
At TechEd 2008, I manned the Microsoft BizTalk Server booth. Nearly every question I got was, “I’ve heard of BizTalk, but what does it do?” If you are like I was 3 years ago, which it sounds like many people are, you wish someone would just come out and give you a simple description of BizTalk. Now that I know BizTalk, here is the *simple* description/FAQ of what it is and why it is a great product.
What is BizTalk?
BizTalk is a message conversion system. You give it MessageA and it converts it to MessageB. For example, if MessageA is an EDI (Electronic Data Interchange) message, then BizTalk is able to easily read the message and convert it into just about any other message type such as an XML document (MessageB). Will it convert a Word document to a PDF document? It can, but it was really designed to allow businesses to easily communicate with each other using practically any message format and nearly any network protocol.
I can just code this in .NET. Why do I need BizTalk?
Sure, you can fire up Visual Studio and write this on your own, but did you think about guaranteed delivery, disaster recovery, tracking, troubleshooting, security, or authentication? What if the network protocol to receive and send these messages needs to change? How long would it take to change your custom application compared to a simple configuration change (no recompile necessary) in BizTalk? When you keep creating monolithic applications to handle your business to business (B2B) transactions, it becomes increasingly difficult to manage them and you find yourself trying to write an infrastructure to handle it all. BizTalk provides all of this infrastructure for you. It has guaranteed delivery, standardized tracking, a business rules engine, redundancy, and much more.
I am not doing B2B. Is BizTalk still helpful?
Yes. Many people use BizTalk as an integration platform, meaning all of those old systems that you have that don’t communicate with each other can be bridged together with BizTalk. The marketing term for this is Service-Oriented Architecture (SOA). Also, many companies love BizTalk so much that all of their internal business logic flows through it, with confidence in BizTalk’s ability to make nearly any business process a structured transaction.
BizTalk is *huge*, do I really need all of that infrastructure?
BizTalk is not for everyone. Again, you can certainly write your own .NET application to do very basic message conversion, but once your company matures and needs enterprise level messaging, reliability, and flexibility, then that is when you get BizTalk.
What is BizTalk really good at?
BizTalk is great when dealing with frequent changes in how you do business with other businesses. For example, if you deal with a lot of businesses that have different message requirements or frequent network protocol changes, then BizTalk is for you. Also, many customers really like how easy it is to map the schemas of messages. For example, FName in MessageA can be easily mapped to FirstName in MessageB regardless of how the messages are structured.
You keep talking about “messaging”. Does this mean that BizTalk does email?
The messaging we are talking about would be something like reading in a purchase order from CompanyA and sending the order to your fulfillment and shipping departments. BizTalk can also send email if you want it to, but it’s really designed to get businesses to talk to each other, which is why it is called “Biz-Talk”.
Additional Information
One of my colleagues at Microsoft requested that I include information on some of the other features that BizTalk Server provides:
- Multiple adapters provide support for most industry standard transports (FTP, HTTP, File, SOAP, WCF, SQL, etc.)
- Multiple accelerators are available that support industry standard document formats (HIPAA, EDI, SWIFT)
- RFID Platform
- High availability
- Fault tolerance
- Scalability
- End to end message tracking (including message body tracking if needed).
- Messaging subsystem accommodates transactional messaging, i.e. BizTalk can send to/receive from trading partners within the context of a distributed transaction as long as the adapter supports transactions.
- XLANGs workflow engine
- Business Activity Monitoring
- Business Rule Engine (BRE)
- Management functionality via Management console
- Integration with MOM
- Integration with Enterprise Single Sign-On
These features are available in BizTalk Server, which allows you to quickly integrate your business processes with other internal business processes and with external trading partners. Several of these features are described in further detail at http://www.microsoft.com/biztalk/en/us/capabilities.aspx.
Thanks to Trace Young (BizTalk Technical Writer) for this bit of information.
-
Work on PAL v2.0 (PowerShell version) is coming along nicely. While working on it, I created a functional proof of concept PowerShell script called Bling.ps1 which will read a performance monitor log (BLG or CSV) and will create graphical charts for all of the counters in it. PAL v1.x only creates charts and stats for performance counters named in the threshold files, so this is a new feature. Oh, and BLING stands for BLg INto imaGes. :-)
Bling.ps1 v1.0 (Proof of Concept for PAL 2.0)
http://www.codeplex.com/PAL/Release/ProjectReleases.aspx?ReleaseId=21260
No installation is required, but the free products listed below must be installed first.
Required Products (free):
- PowerShell v1.0 or greater.
- Microsoft .NET Framework 3.5 Service Pack 1
- Microsoft Chart Controls for Microsoft .NET Framework 3.5
Syntax:
Bling.ps1 /LOG:[AbsolutePathToPerfmonLog[;AbsolutePathToPerfmonLog]] /OUTPUTDIR:[AbsolutePathToOutputDirectory]
Note: Use full, absolute paths in each of the arguments.
Basic Example:
.\Bling.ps1 /LOG:C:\Users\clinth\Documents\SamplePerfmonLog.blg /OUTPUTDIR:C:\Users\clinth\Documents\Output
See the Readme.txt file for more details.
Note: Be sure to set PowerShell to unrestricted access by typing the following command in an elevated PowerShell command prompt:
Set-ExecutionPolicy Unrestricted
Special Thanks to Greg Varveris for his help with understanding the Microsoft Chart Controls for Microsoft .NET Framework 3.5.
All my posts are provided "AS IS" with no warranties, and confer no rights. For PFE Job Opportunities at Microsoft, please visit our website at: http://members.microsoft.com/careers/search/default.aspx - search for keyword “PFE”
“PFE: The best place to be at Microsoft”
-
A few weeks ago, I did another show on the internet radio talk show called Run As Radio at http://www.runasradio.com. The session I did with Richard and Greg this time was on Windows Management Instrumentation (WMI). It should be up within the next few weeks. :-)
Also, I’m proposing to do another show on Windows Performance Tools, the various tools I use to diagnose root causes of performance issues in Windows. In the meantime, I’ll try to blog about them as I go.
-
About a month ago, I was on Run As Radio (www.runasradio.com) doing a session on the PAL tool. This was a lot of fun. Here is the link.
Clint Huffman Does Performance Analysis of Logs!
http://www.runasradio.com/default.aspx?showNum=82
Also, a friend of mine, Shane Creamer, did an earlier session. It worked well because his session talks about core performance counters and the Vital Signs workshop, and mine talks about how to use the PAL tool to apply the rules and thresholds of the Vital Signs workshop.
Shane Creamer Goes Deep on Performance Monitor!
http://www.runasradio.com/default.aspx?showNum=81
Also, Mark Minasi (one of my IT heroes) did a very humorous and very informative session on IPv6 and Vista networking. Very enjoyable session.
Mark Minasi on Networking in Windows Server 2008!
http://www.runasradio.com/default.aspx?showNum=65
Finally, I will be doing another session on Run As Radio this Friday (published a few weeks after) on Windows Management Instrumentation (WMI). :-)
-
Relevant Environment Information:
Microsoft BizTalk Server 2006 R2 (32-bit)
Microsoft Windows Server 2003 R2 (32-bit)
Symptoms: While doing a BizTalk Health Check with a customer, we encountered errors with the Microsoft Distributed Transaction Coordinator (DTC) while using the BizTalk 2006 R2 Admin Console. Unfortunately, I no longer have the exact error, but it mentioned being unable to contact the DTC service, which is a very serious issue because BizTalk is heavily reliant on the DTC service. The odd thing is that the BizTalk services continued to operate normally, so only the admin console was affected by this.
Looking at the event logs, I found a few errors similar to this one:
Category: ENTSSO
Type: Warning
Source: Enterprise Single Sign-On
Event ID: 10532
Description: "Failed to retrieve master secrets. Verify that the master secret server name is correct and that it is available.
Secret Server Name: <Computer>
Error Code: 0x800706BF, The remote procedure call failed and did not execute."
This error is also a very scary error to get because just like the DTC service, BizTalk is heavily reliant on the Enterprise Single Sign-On service (ENTSSO).
We called up Microsoft Customer Support Services (CSS) to help us with this. Yes, I am a Microsoft field support professional, but I don’t know everything. ;-)
CSS had us restart the DTC service on the BizTalk Servers, then restart the Microsoft ENTSSO Master Secret Server. After that, it was working again. No errors in the BizTalk Admin Console.
Cause: The DTC service was restarted and the ENTSSO Master Secret service lost contact with it.
Solution: Restart the DTC service, then restart the ENTSSO Master Secret Service.
Lesson Learned: Always restart the ENTSSO Master Secret Service after restarting the DTC service.
More Information: The Enterprise SSO Master Secret key is extremely important. If your ENTSSO service becomes corrupted, then you lose everything. There are certainly *many* more reasons why the ENTSSO Master Secret Service may fail, so always make sure you back up the Master Secret key using SSOConfig.exe, store the backup somewhere you can get to that is not on the local machine, and finally, ensure you remember the restore password.
-
As a reminder to everyone, use the perf counter “\Hyper-V Hypervisor Logical Processor(_Total)\% Total Run Time” to measure CPU in a Hyper-V environment.
If you are trying to measure CPU utilization in a Hyper-V environment and are a bit frustrated as to why the numbers don’t match up, then read the excerpt below. I spent a month with the BizTalk Product Team doing analysis of BizTalk in a Hyper-V environment. We had the virtual computers (guest) running at 100% CPU, yet Task Manager on the host computer (root partition) was about 1% on all of the CPUs.
Excerpt from http://msdn.microsoft.com/en-us/library/cc768535.aspx
Measure overall processor utilization of the Hyper-V environment using Hyper-V performance monitor counters – For purposes of measuring processor utilization, the host operating system is logically viewed as just another guest operating system. Therefore, the “\Processor(*)\% Processor Time” monitor counter measures the processor utilization of the host operating system only. To measure total physical processor utilization of the host operating system and all guest operating systems, use the “\Hyper-V Hypervisor Logical Processor(_Total)\% Total Run Time” performance monitor counter. This counter measures the total percentage of time spent by the processor running both the host operating system and all guest operating systems.
Also, the PAL tool has a threshold file for Hyper-V with this information in it. http://www.codeplex.com/PAL
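To make the difference between the two counters in the excerpt concrete, here is a small Python sketch that averages each of them from a perfmon log exported to CSV. The sample data is entirely made up: the host name HOST1, the timestamps, and the values are assumptions chosen to resemble what we saw (host near 1%, hypervisor near 100%). The column lookup by counter path suffix is a convenience for this sketch, not a feature of any Microsoft tool.

```python
import csv
import io

# Hypothetical perfmon CSV export; host name, timestamps, and values are made up.
SAMPLE_CSV = r'''"(PDH-CSV 4.0) (Pacific Standard Time)(480)","\\HOST1\Processor(_Total)\% Processor Time","\\HOST1\Hyper-V Hypervisor Logical Processor(_Total)\% Total Run Time"
"02/01/2009 10:00:00.000","1.0","99.0"
"02/01/2009 10:00:15.000","2.0","100.0"
"02/01/2009 10:00:30.000","0.0","98.0"
'''

HOST_COUNTER = r"\Processor(_Total)\% Processor Time"
HYPERVISOR_COUNTER = r"\Hyper-V Hypervisor Logical Processor(_Total)\% Total Run Time"


def average_counter(csv_text, counter_suffix):
    """Average the column whose header ends with the given counter path."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    index = next(i for i, name in enumerate(header) if name.endswith(counter_suffix))
    # Skip blank lines and empty cells (perfmon leaves gaps for missing samples).
    values = [float(row[index]) for row in reader if row and row[index].strip()]
    return sum(values) / len(values)


host_avg = average_counter(SAMPLE_CSV, HOST_COUNTER)
hv_avg = average_counter(SAMPLE_CSV, HYPERVISOR_COUNTER)
print(f"Host (root partition) only: {host_avg:.1f}%")
print(f"All physical processors:    {hv_avg:.1f}%")
```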
BizTalk Server 2006 R2 Hyper-V Guide (Full Guide)
http://msdn.microsoft.com/en-us/library/cc768518.aspx
-
Introduction
Performance analysis of log files (*.blg files) in Microsoft Windows has primarily been a manual process for as long as I can remember. We [Microsoft] have made some strides in this area, but there is still little out there that analyzes log files *after* a problem has occurred. Furthermore, nearly all of the Microsoft tools require the user to capture performance data as the problem is occurring. While this is a great way to analyze a performance problem, it is sometimes impractical. This would be similar to asking a criminal to reenact a crime while we film it. Therefore, we need a tool which analyzes existing logs and pieces together the evidence after the performance issue (crime scene) has occurred. Until this happens, we must continue analyzing logs manually with our limited, individual knowledge. The knowledge is all around us; we just need to harness it. The Performance Analysis of Logs (PAL) tool (http://www.codeplex.com/PAL) is our first step towards the realization of this goal, but we cannot do it alone.
The Challenge
I go on-site with customers every week to assist with performance issues. Most of the time the customer has perfmon logs (*.blg), Event Logs (*.csv), IIS logs (*.log), and if we are lucky, an Event Tracing for Windows (ETW) log (*.etl). The assumption is that network administrators should analyze these logs by hand because they know their environment best. The problem is that in order to *properly* analyze these log files, you need to have a working knowledge of Windows architecture. The reality is that most people have a heavy workload and do not have time to understand Windows architecture thoroughly enough to keep up with it. I know enough about Windows architecture to put the above log files to use and to typically formulate a hypothesis, but even I struggle with understanding some of the issues that I see and need to rely on others who know more. Even when you have a social network of subject matter experts, the process of analyzing Windows performance can still be slow. Therefore, the knowledge these experts have needs to be consolidated into an analysis tool or a central location. Many people may say that performance analysis is an art rather than a science. Well, I say let's take the art out of it and make it a science as much as we can.
One of the fruits of my team's labor towards this goal is the PAL tool (http://www.codeplex.com/PAL). The PAL tool (Performance Analysis of Logs tool) takes in all of the variables needed to analyze a performance issue and generates a report on its findings. It does this by interviewing the user to find out more about the computer, then using those answers in its analysis. Now, you are probably asking me, “Why did we write our own tool when there are so many great performance analysis tools that Microsoft has written?” Well, let’s talk about a few of the more recent and relevant tools that I am aware of and how PAL is different:
Microsoft Server Performance Advisor (SPA): SPA was written by the Windows Fundamentals team. It analyzes perfmon logs and ETL (Event Tracing for Windows) logs. It does a great job of aggregating data, but its analyses are based on XPath statements, which unfortunately are not flexible enough to consider all of the factors in analyzing performance. Furthermore, the problem must be reproduced while it’s gathering data. The tool should only be run for short time periods due to the large amount of data it gathers, and finally, it takes in *all* of the data points in the logs for its average values, meaning if the problem occurred for 1 minute and the collection period is 20 minutes, then the averages are skewed. In any case, I am very fond of this tool and highly encourage its use. As a matter of fact, I'm reusing many of its concepts in the PAL tool today. SPA has the right idea; it just needs to be taken to the next level. You can download it at: http://www.microsoft.com/downloads/details.aspx?FamilyID=09115420-8c9d-46b9-a9a5-9bffcd237da2&DisplayLang=en
Microsoft Visual Studio Profiler: This tool is very good, but its focus is on application functions rather than operating system performance. I firmly believe that this is one of the best approaches to performance analysis for applications because you can identify which functions the application is waiting on. Unfortunately, you can only profile one process at a time, the profiling has a little bit of overhead, and you have to reproduce the problem while profiling. I wrote a white paper on how to do this in a production environment located here (http://go.microsoft.com/fwlink/?LinkId=105797). I recommend using both this tool and a performance counter analysis tool such as PAL.
Microsoft xPerf/xTrace: Written by the Windows Fundamentals team. This is the latest/greatest tool out there for perf analysis. Unfortunately, it is currently lacking an intuitive UI and analyses around the data collected. With that said, they are rapidly improving it. Like many of the other tools, you must be able to reproduce the problem as it is occurring. Furthermore, it only runs on Windows Vista and Windows Server 2008. Unfortunately, customers are not continuously logging ETW data, so this doesn’t help me much for post analysis. Finally, it only analyzes ETL and no other log format. This would be like asking a crime investigator to analyze a crime scene using only one type of evidence. If that type of evidence isn’t available, then no analysis can be done. xPerf is part of the Windows Performance Toolkit located here: http://www.microsoft.com/whdc/system/sysperf/perftools.mspx
Microsoft System Center Operations Manager (SCOM): I’m always impressed with the SCOM team and their product. They analyze perf data as they go and do a great job of providing guidance and trends on the data shown. While this is great for customers who have SCOM, not all customers have it installed. Therefore, I am again left with manual analysis. Furthermore, SCOM might not have all of the data I need to analyze a problem.
Microsoft Log Parser: This is an incredible tool that parses many log types using an easy to use Structured Query Language (SQL) syntax. It just doesn't do any analysis. Therefore, the PAL tool uses Microsoft Log Parser as its data access layer. Microsoft Log Parser can be downloaded at: http://www.microsoft.com/downloads/details.aspx?FamilyID=890cd06b-abf8-4c25-91b2-f8d975cf8c07&DisplayLang=en
There are certainly more tools out there, but my point is that none of the tools above meet all of the needs of post log analysis. This is why the PAL tool initiative was started.
The Solution
I strongly feel that if you complain about something, then you need to offer a viable solution, so here are the requirements of a tool that would be of practical use in the field.
Consolidated Guidance: First, there needs to be a central repository of guidance on performance analysis. We have many great whitepapers out there, but the knowledge is spread out and it takes a great deal of time to read and understand them, especially when you are trying to solve a problem. It’s like a guy bringing his car to an automotive repair shop and the mechanic hands the guy a huge book and says, “You can fix your car by reading this.” The guy will ask, “This is nice, but how do I fix my problem?” Shane Creamer’s Vital Signs workshop has done a great job of consolidating the basics into a short, 2-day workshop offered by my team (Microsoft Premier Field Engineering). The PAL tool has nearly all of the consolidated guidance from the Vital Signs workshop built into its report, so when a threshold is broken the guidance is context sensitive to that threshold. If you are interested in the Vital Signs workshop, then contact your Microsoft Technical Account Manager (TAM).
Log File Data Access Layer (DAL): Next, a simple to use data access layer is needed to analyze log files in a common way. The Microsoft Log Parser tool is a great tool for this, but it is based on legacy COM. No future versions of it are planned at this time, but I have asked the IIS product team to write a new version of it. They are considering it. Currently, the PAL tool uses Microsoft Log Parser as its DAL, and so it inherits some of Log Parser’s limitations.
Analyze More Data Points: Some of our log analysis tools, such as SPA, read in the entire log and generate average, minimum, and maximum values from it. This doesn’t cut it when the problem occurs only in a small portion of the log because the problem is averaged out by the sheer size of the counter log. The PAL tool breaks down perfmon logs into smaller time slices and analyzes each time slice individually for better accuracy.
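The effect of time slicing is easy to see with a little arithmetic. The Python sketch below is my own illustration of the idea, not PAL's actual code: the 20 one-minute samples, the 5-sample slice size, and the 25% threshold are all assumptions made for the example.

```python
def average(values):
    return sum(values) / len(values)


def slice_averages(samples, slice_size):
    """Average each consecutive slice of slice_size samples."""
    return [
        average(samples[start:start + slice_size])
        for start in range(0, len(samples), slice_size)
    ]


# Hypothetical 20-minute log, one sample per minute: idle at 10% except
# for a single one-minute spike to 100%.
samples = [10.0] * 20
samples[12] = 100.0

THRESHOLD = 25.0  # hypothetical threshold, in percent

overall = average(samples)
slices = slice_averages(samples, 5)

print(overall, overall > THRESHOLD)
print(slices, [value > THRESHOLD for value in slices])
```

The whole-log average stays under the threshold and hides the spike, while the slice that contains the spike stands out on its own.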
Dynamically Changing Thresholds and Interviewing the User: Most performance analysis tools assume that the user knows how to change the thresholds and what to change them to. Likewise, unless you are a Windows architecture guru, you as the user assume the tool is using the appropriate thresholds. When both parties rely on the other to make the best decision, this can cause confusion and misdiagnosis. In order for next generation tools to be effective, they need to have dynamically changing thresholds based on the environment. The point is that you need to have a tool that learns the customer’s environment and adjusts its thresholds appropriately, even if this means simply asking the user for some additional input. For example, to determine if a computer is running out of paged pool or non-paged pool memory, the PAL tool asks the user a series of questions to estimate the maximum sizes of these memory pools, then computes 60% and 80% thresholds for each. The PAL tool does this by running executable code at run-time, using the user’s input as variables for the code to determine if the thresholds are broken. Using executable code as the thresholds and being able to ask the user questions about the environment makes the tool flexible enough to handle nearly any performance analysis challenge.
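Here is a rough Python sketch of what "thresholds as executable code driven by the user's answers" can look like. It is not PAL's implementation (PAL's thresholds are VBScript); the `answers` dictionary, its key name, the 400 MB figure, and the "ok/warning/critical" labels are all hypothetical. Only the 60% and 80% levels come from the description above.

```python
def make_pool_threshold(max_pool_mb):
    """Build a threshold check from a pool maximum supplied by the user."""
    warning_mb = max_pool_mb * 60 / 100   # 60% of the estimated maximum
    critical_mb = max_pool_mb * 80 / 100  # 80% of the estimated maximum

    def classify(used_mb):
        if used_mb >= critical_mb:
            return "critical"
        if used_mb >= warning_mb:
            return "warning"
        return "ok"

    return classify


# Hypothetical interview result: the user's answers led to an estimated
# paged pool maximum of 400 MB.
answers = {"estimated_paged_pool_max_mb": 400}
check_paged_pool = make_pool_threshold(answers["estimated_paged_pool_max_mb"])

for used_mb in (200, 250, 350):
    print(used_mb, check_paged_pool(used_mb))
```

Because the check is built from the answers, the same threshold code adapts to a different server simply by asking the questions again.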
Reusable: Our next generation tools need to be reusable, meaning portions of them can be reused by other applications and tools. Luckily, tools like xPerf are modularized in this way, but I wanted to emphasize that this needs to continue. Currently, the PAL tool is a hybrid of VB.NET and VBScript and is open source, so users can simply copy the code they want to reuse.
Free and Public: Our next generation tools need to be free for all of our users. When tools become intellectual property (IP), they often inherit licensing restrictions such as a cost to use. The PAL tool is a free, open source tool available at http://www.codeplex.com/PAL.
Extensibility of the Thresholds: As mentioned above, the thresholds need to be executable code to be flexible enough to handle complex analyses. In addition, the code needs to be open so that users can add new thresholds or update existing ones. This is important because no one person can claim to know all of the technical aspects of all performance problems in Windows. You have to empower the people who are experts in their field to add, edit, and delete the thresholds in the tool. The PAL tool accomplishes this by using VBScript for the thresholds, with the VBScript embedded in XML based threshold files. Included with the PAL tool is an editor that makes it easy for subject matter experts to edit PAL threshold files. Furthermore, with the help of other subject matter experts, we have several product specific threshold files, namely Active Directory, IIS, MOSS, SQL Server, BizTalk Server, Exchange Server, and general Windows.
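As a rough sketch of the "thresholds as code embedded in XML" concept, the following Python example loads small expressions from an XML document and evaluates them against the statistics for a counter. The XML layout, the element and attribute names, and the 50/80 CPU values are all hypothetical; this is not the PAL threshold file schema, and PAL itself uses VBScript rather than Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical threshold file. Each <threshold> body is a small expression
# that can refer to the min, avg, and max of the counter being analyzed.
THRESHOLD_XML = r"""
<thresholds>
  <threshold counter="\Processor(_Total)\% Processor Time" condition="Warning">avg &gt; 50</threshold>
  <threshold counter="\Processor(_Total)\% Processor Time" condition="Critical">avg &gt; 80</threshold>
</thresholds>
"""


def load_thresholds(xml_text):
    """Return a list of (counter, condition, expression) tuples."""
    root = ET.fromstring(xml_text.strip())
    return [
        (node.get("counter"), node.get("condition"), node.text.strip())
        for node in root.iter("threshold")
    ]


def evaluate(thresholds, counter, stats):
    """Return the alert conditions whose expression is true for this counter."""
    alerts = []
    for name, condition, expression in thresholds:
        if name != counter:
            continue
        # eval runs the embedded code. Built-ins are removed here, but this is
        # not a security sandbox, so only load threshold files you trust.
        if eval(expression, {"__builtins__": {}}, dict(stats)):
            alerts.append(condition)
    return alerts


thresholds = load_thresholds(THRESHOLD_XML)
cpu = r"\Processor(_Total)\% Processor Time"
print(evaluate(thresholds, cpu, {"min": 40, "avg": 65, "max": 90}))
```

The design point is that a subject matter expert can change or add a threshold by editing the XML, without touching the engine that evaluates it.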
Low Requirements: Some of the tools written by other teams at Microsoft, such as the Visual Studio Load Test tool, require a back end SQL database in order to process the collected data. Many people in the field don’t have SQL Server running on their laptops, so our tools need to be able to run on workstation class computers. The PAL tool simply requires Microsoft Log Parser and Office Web Components 11, both of which are free downloads.
Conclusion
We need a tool that can analyze a wide variety of logs, similar to how a crime scene investigator gathers the evidence from a crime scene (in this case the log files) and analyzes it with scientific precision. The PAL tool is a tentative solution to the problem and has enjoyed great success, with over 2000 downloads per month. With that said, we cannot do this alone, especially since this is not part of my regular job. While a few of the Microsoft product groups are starting to follow some of the concepts of the PAL tool, no product group at Microsoft that I know of specializes in performance analysis in this fashion. Creating a tool with all of the aspects I mentioned above is difficult, but the benefits of such a tool are clear: the better Microsoft Windows performs, the happier customers are with Microsoft products. In the end, my real intention is to simply make my parents’ computer run faster. ;-)
Moving Forward
If you want to assist with this effort, then please try out the PAL tool and help with the development of it. For more information on the PAL Tool, please go to http://www.codeplex.com/PAL.
All my posts are provided "AS IS" with no warranties, and confer no rights. For PFE Job Opportunities at Microsoft, please visit our website at: http://members.microsoft.com/careers/search/default.aspx - search for keyword “PFE”
“PFE: The best place to be at Microsoft”
-
Sorry, I haven't blogged much in the past few months. I've been heads down on several exciting BizTalk projects which I'll blog about soon after this one. For now, I want to get more of the word out about the Performance Analysis of Logs (PAL) tool that I wrote in collaboration with other people on my team and how to use the tool to analyze BizTalk servers. I use the PAL tool when I conduct BizTalk Health Checks for customers. I'm a field guy (Microsoft Premier Field Engineer) who lives for performance issues and the tool takes the mundane work out of performance analysis. With that said, PAL is not a replacement for performance analysis - it is simply a really nice time saver!
The PAL (Performance Analysis of Logs) tool is a new and powerful tool that reads in a performance monitor counter log (any known format) and analyzes it using complex, but known, thresholds (provided). The tool generates an HTML based report that graphically charts important performance counters and throws alerts when thresholds are exceeded. The thresholds are originally based on thresholds defined by the Microsoft product teams, including BizTalk Server, and members of Microsoft support. This tool is not a replacement of traditional performance analysis, but automates the analysis of performance counter logs enough to help save you time. This is a VBScript and requires Microsoft Log Parser (free download). The tool is available at http://www.codeplex.com/PAL.
BizTalk Performance Counters
Microsoft BizTalk Server 2006 shipped with about 294 performance counters. A BizTalk Server implementation with at least 2 servers for redundancy therefore has at least 588 BizTalk performance counters that may need to be analyzed, so a performance monitor log analysis tool is helpful when analyzing BizTalk performance counters.
Consider the Operating System Performance
Many BizTalk performance issues can be narrowed down by analyzing the resources of the operating system (CPU, disk, memory, and network). For example, if the tracking (DTADB) database file is heavily using a disk, then reducing the amount of tracking would be a logical step towards alleviating the bottleneck.
Usage
The PAL tool can be used to create a Microsoft Performance Monitor (perfmon) template file from a PAL threshold file and to analyze the perfmon log after its collection period. In this example, we will create, gather, and analyze a BizTalk performance monitor log file.
Collect a Microsoft Performance Monitor (perfmon) log
1. Export a Microsoft Performance Monitor (perfmon) template file from the BizTalk Server threshold file.
a. Select the Microsoft BizTalk Server 2006 threshold file, click the Export… button, and then save the file. This is a perfmon template file.
2. Copy the perfmon template file to the BizTalk server(s) you have chosen for analysis. This includes the Microsoft SQL Servers hosting the BizTalk databases.
3. Create a new performance monitor log using the template exported in step 1. For more information on how to create performance monitor logs from a template file, please refer to the Windows help documentation.
a. Adjust the perfmon log settings if needed.
4. Start the new performance monitor log. Stop the perfmon log when the collection period is over.
5. Copy the perfmon log to the installation directory of the PAL tool.
Filling out the PAL Wizard Form
- Counter Log Path: Specify the path to the Microsoft Performance Monitor (perfmon) log. The log can be in any of the known perfmon log formats such as BLG (binary) or CSV (text). Use the ellipsis button to browse for the perfmon log. If the perfmon log file is in the installation directory of PAL, then use the drop down arrow to select one of the perfmon log files.
Note: Multiple log files can be merged by separating them with semicolons, but this may produce unpredictable results. PAL is best used with logs containing data from a single computer.
- Date/Time Range: This is the date/time range that you can restrict the analysis to. For example, if you did load testing on BizTalk during a specific time, then you can limit the analysis to that period.
- Threshold File Title: This is the threshold file that you want to use to determine thresholds that will be analyzed by PAL. If you do not see a product specific threshold file, then consider using the default System Overview threshold file. The System Overview threshold file analyzes the basic operating system performance counters only.
- Question Variable Names: The question variables will be different depending on which threshold file is selected. Click on each question variable name and answer the question in respect to the perfmon log chosen earlier.
- Analysis Interval: Specify the interval (in seconds) at which you want the PAL tool to analyze the perfmon log. Choose “AUTO” to have PAL automatically detect and choose an appropriate interval size. Choose “ALL” to have PAL analyze all of the data in the perfmon log. Be careful about choosing “ALL” because it is extremely resource intensive. The Analysis Interval determines how the perfmon log will be broken up when analyzed. For example, if you gathered a 24 hour log and choose a 1 hour analysis interval, then PAL will analyze each hour in the log for minimum, maximum, average, and trend values for that time interval. The data points in the charts generated by PAL are based on the analysis interval. “AUTO” is the default and is recommended.
- Output Options: You can optionally specify an output directory for the output, the file name format for the HTML report, and an optional XML document output. The XML output is useful if you want to analyze the results using another tool.
- The Queue: The queue is really just a batch file with line feeds for readability.
- Execute:
- Execute: Execute what is currently in the queue.
- Add to Queue: Do not execute yet, but add the current configuration into the queue and restart the wizard allowing you to add more. This is useful if you intend to "batch" the processing while you go to lunch or overnight.
- Execute and Restart: This executes what is in the queue now and restarts the wizard. This is useful if you want to start processing one of the logs right away while answering the questions for the other logs.
- Execute as a low priority process: This runs the PAL.vbs script at a low process priority so that it has little to no impact on your computer while it is running. PAL processing can be very resource intensive, and this option will make the processing take longer.
Interpreting the Report
The report generated by the PAL tool is simply an interpretation of the perfmon log data using generalized thresholds. The intention is to assist with performance analysis, but not to replace traditional performance analysis. Therefore, you should have a working knowledge of the BizTalk architecture and BizTalk performance analysis.
The thresholds used in the BizTalk threshold file include operating system thresholds (CPU, disk, memory, and network), Microsoft SQL Server counter thresholds, and BizTalk counter thresholds. The BizTalk thresholds generally focus on host throttling, adapter latency, service instance statistics (suspended, dehydrated, etc.), database sizes, and memory usage.
The PAL report is separated into categories, and each category contains a collection of analyses. Each analysis focuses on a specific performance counter. If any of the thresholds are exceeded, then an alert is raised. The number of alerts in each analysis and the alert condition (typically Warning or Critical) indicate the severity of the results. For example, a critical alert in Pool Paged Bytes is a very serious condition and should be resolved immediately. When critical alerts occur, go to the analysis section associated with the alert. In the analysis section, there is typically a description of the analysis, a description of the thresholds used, and a link for more information on the topic.
To learn more about interpreting the PAL report for BizTalk analysis, read the following article:
Using the Performance Analysis of Logs (PAL) Tool
http://msdn.microsoft.com/en-us/library/cc296652.aspx
-
I’m working on another BizTalk performance gig and created a few custom HAT queries for measuring BizTalk artifact durations. I like to call them BizTalk Artifact Duration Aggregations or BADAggs for short – pun intended. ;-)
To aggregate artifact durations for the last x amount of time, you can modify the time range by modifying the dateadd parameters on line 2 of the following query. In this case, this query aggregates the durations of all of the artifacts that executed within the last 30 minutes.
declare @Timestamp as datetime
set @Timestamp = dateadd(minute, -30, GETUTCDATE())
SELECT
[Service/Name],
AVG([ServiceInstance/Duration]) as AverageDuration
FROM dbo.dtav_ServiceFacts sf WITH (READPAST)
WHERE [ServiceInstance/StartTime] > @Timestamp
GROUP BY [Service/Name]
ORDER BY AverageDuration desc
This next query allows you to choose a begin and end time range to aggregate the durations of the artifacts.
declare @BeginTime as datetime
declare @EndTime as datetime
set @BeginTime = CAST('2008-05-04 00:00:00.000' as datetime)
set @EndTime = CAST('2008-05-06 00:00:00.000' as datetime)
SELECT
[Service/Name],
AVG([ServiceInstance/Duration]) as AverageDuration
FROM dbo.dtav_ServiceFacts sf WITH (READPAST)
WHERE [ServiceInstance/StartTime] > @BeginTime
AND [ServiceInstance/StartTime] < @EndTime
GROUP BY [Service/Name]
ORDER BY AverageDuration desc
Both of these return results similar to this:
[Service/Name], AverageDuration
Microsoft.BizTalk.DefaultPipelines.XMLReceive,1057
Microsoft.Samples.BizTalk.ConsumeWebService.ReceivePOandSubmitToWS,7375
Microsoft.BizTalk.DefaultPipelines.PassThruTransmit,1115
Enjoy!
-
We [Microsoft] tend to have quite a few diamonds in the rough. One of those is our DLL Help database. This is a public tool where you can search to see nearly all of the public releases of a specific file. For example, if you are having an issue with a Microsoft product and have it narrowed down to a DLL, EXE, or SYS file, then you can look it up to see if there are newer releases of it and/or what public release the file was installed by. Anyway, here is the location of the database.
Microsoft DLL Help Database
http://support.microsoft.com/dllhelp/
The disadvantage of this database is that it doesn't always show *all* of the public releases, such as hotfixes. For those, you need to search for the file name in the Microsoft Knowledge Base at:
http://support.microsoft.com