Microsoft Enterprise Platforms Support: Windows Server Core Team
The question most people ask themselves before calling Microsoft Customer Service and Support (CSS) to request a Root Cause Analysis is:
Why did the resources fail over to the other node?
Sometimes a Root Cause Analysis for a Failover Cluster can be very time consuming, especially if it's an eight-node Windows Server 2003 Failover Cluster. Even though the references listed below may state that they are for the Windows 2000 Advanced Server Cluster Service (MSCS), the same references can be used in the analysis of a Windows Server 2003 Failover Cluster.
Here's how we begin with the Root Cause Analysis:
Since there's not one single point of reference for determining why the cluster resources failed over, the following are some of the ones used in getting started.
Techniques for Tracking the Source of a Problem: http://www.microsoft.com/technet/prodtechnol/windows2000serv/reskit/distrib/dsdg_icl_lrwh.mspx?mfr=true
Anatomy of a Cluster Log Entry: http://www.microsoft.com/technet/prodtechnol/windows2000serv/reskit/distrib/dsdg_icl_fved.mspx?mfr=true
Interpreting the Cluster Log: http://www.microsoft.com/technet/prodtechnol/windows2000serv/reskit/distrib/dsdg_icl_nnti.mspx?mfr=true
There are other great Cluster references in the Microsoft Knowledge Base besides the ones listed below:
286052: The meaning of state codes in the Cluster log
168801: How to turn on cluster logging in Microsoft Cluster Server
892422: Overview of event ID 1123 and event ID 1122 logging in Windows 2000-based and Windows Server 2003-based server clusters
914458: Behavior of the LooksAlive and IsAlive functions for the resources that are included in the Windows Server Clustering component of Windows Server 2003
242450: How to query the Microsoft Knowledge Base by using keywords and query words
926079: Frequently asked questions about the Microsoft Support Diagnostic Tool (MSDT)
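As a starting point for working with the Cluster log, here is a minimal Python sketch that pulls out the entries logged close to the time of a suspected failover. It assumes each entry starts with a hexadecimal process ID and thread ID, followed by `::` and a `yyyy/mm/dd-hh:mm:ss.mmm` timestamp, which is the layout the "Anatomy of a Cluster Log Entry" reference describes. The function names, the sample lines, and their messages are made up for illustration and are not real log output. Cluster log timestamps are generally recorded in GMT, so a time taken from the event log may need converting first.

```python
import re
from datetime import datetime, timedelta

# Assumed entry layout: <pid hex>.<tid hex>::<yyyy/mm/dd-hh:mm:ss.mmm> <rest of entry>
ENTRY_RE = re.compile(
    r"^(?P<pid>[0-9a-fA-F]+)\.(?P<tid>[0-9a-fA-F]+)::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+(?P<text>.*)$"
)


def parse_entry(line):
    """Return a dict for a line in the assumed layout, or None if it does not match."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        return None
    return {
        "pid": int(match.group("pid"), 16),
        "tid": int(match.group("tid"), 16),
        "time": datetime.strptime(match.group("ts"), "%Y/%m/%d-%H:%M:%S.%f"),
        "text": match.group("text"),
    }


def entries_near(lines, when, window_seconds=60):
    """Keep only the parsed entries logged within window_seconds of `when`."""
    window = timedelta(seconds=window_seconds)
    nearby = []
    for line in lines:
        entry = parse_entry(line)
        if entry is not None and abs(entry["time"] - when) <= window:
            nearby.append(entry)
    return nearby


# Hypothetical sample lines in the assumed layout (not real log output).
SAMPLE = [
    "00000444.00000520::2000/10/27-23:45:11.383 [FM] hypothetical message A",
    "00000444.00000524::2000/10/27-23:45:40.000 [NM] hypothetical message B",
    "00000444.00000520::2000/10/27-23:59:00.000 [FM] hypothetical message C",
    "this line does not match the assumed layout",
]

# Time of the suspected failover, for example taken from the System event log.
event_time = datetime(2000, 10, 27, 23, 45, 30)
nearby = entries_near(SAMPLE, event_time, window_seconds=30)
for entry in nearby:
    print(entry["time"], entry["text"])
```

In practice the lines would be read from a copy of cluster.log rather than from a list, and the narrowed-down entries still have to be interpreted with the references above.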
Thanks and remember, doing RCA is very tedious. This is just a guide to get you pointed down the right path.
Author: Mike Rosado, Support Engineer, Microsoft - Windows Server - Enterprise Platforms Support - Core team (Setup, Cluster and Performance)
Explanation with an example would have been better.
How can I detect a failover from the event logs or cluster.log? Thank you.
How do I find out when the last cluster failure occurred?