Over time we have dealt with a large number of cases involving loss of connectivity in a cluster.
The symptom shows up in the System Event Log as the following events:

Event ID: 1123
Source: ClusSvc
Description:
The node lost communication with cluster node ComputerName on network 'Public Network'.

Event ID: 1122
Source: ClusSvc
Description:
The node (re)established communication with cluster node ComputerName on network 'Public Network'.
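
To see how long each interruption lasted, the 1123/1122 events can be paired up. Below is a minimal sketch in Python, assuming the events have already been exported from the System Event Log into (timestamp, event ID, node, network) tuples. The tuple layout, the sample data and the function name `outage_durations` are illustrative assumptions, not something provided by the cluster service.

```python
from datetime import datetime

LOST, RESTORED = 1123, 1122  # event IDs logged by ClusSvc


def outage_durations(events):
    """Pair each 1123 (lost) event with the next 1122 (restored) event
    for the same node and network, and return the outage durations."""
    open_outages = {}  # (node, network) -> timestamp of the pending 1123
    outages = []
    for timestamp, event_id, node, network in sorted(events):
        key = (node, network)
        if event_id == LOST:
            # Keep the earliest unmatched loss for this node/network.
            open_outages.setdefault(key, timestamp)
        elif event_id == RESTORED and key in open_outages:
            start = open_outages.pop(key)
            outages.append((node, network, start, timestamp - start))
    return outages


# Hypothetical sample data, for illustration only.
sample = [
    (datetime(2008, 5, 12, 10, 0, 5), 1123, "NODE2", "Public Network"),
    (datetime(2008, 5, 12, 10, 0, 47), 1122, "NODE2", "Public Network"),
]

for node, network, start, duration in outage_durations(sample):
    print(f"{node} on '{network}': lost at {start}, "
          f"restored after {duration.total_seconds():.0f}s")
```

Short, regular interruptions usually point towards the switch or network adapter settings discussed further down, so the durations are worth a look before changing anything.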

The following article gives an overview of these events on Windows 2000 and Windows Server 2003:

Overview of event ID 1123 and event ID 1122 logging in Windows 2000-based and Windows Server 2003-based server clusters
http://support.microsoft.com/kb/892422/en-us

In addition to the steps and suggestions in the article above, I would like to add the following:

* Check the “spanning tree and trunking” settings on the switches and disable them.

* Enable the “Fast detect” feature on the switches.

* Set the speed and duplex of the heartbeat network manually, on the nodes as well as on the switches (http://support.microsoft.com/kb/258750/en-us)

* Temporarily disable the teaming software on the “public” network (http://support.microsoft.com/kb/254101/)
   (note: teaming on the Private/Heartbeat network is not a configuration supported by Microsoft)

* If you have more than 3 nodes in the cluster, disable multicast (http://support.microsoft.com/kb/307962)
   (cluster CLUSTERNAME /priv MulticastClusterDisabled=1:DWORD)
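
If the multicast setting has to be applied to several clusters, the command above can be generated from a script. Below is a minimal sketch in Python that only builds the command line quoted in the last item and does not run anything. The helper name `disable_multicast_command` and its `node_count` argument are illustrative assumptions.

```python
def disable_multicast_command(cluster_name, node_count):
    """Return the cluster.exe argument list that disables multicast,
    or None when the cluster has 3 nodes or fewer."""
    if node_count <= 3:
        return None
    return ["cluster", cluster_name, "/priv",
            "MulticastClusterDisabled=1:DWORD"]


# CLUSTERNAME is a placeholder, exactly as in the command above.
cmd = disable_multicast_command("CLUSTERNAME", 4)
print(" ".join(cmd))
```

The list form can be passed to `subprocess.run` on a machine where cluster.exe is available. Review the generated command before running it against a production cluster.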

If you have Cisco switches in your environment, see the article below for more specific details:

Using PortFast and Other Commands to Fix Workstation Startup Connectivity Delays
http://www.cisco.com/en/US/products/hw/switches/ps700/products_tech_note09186a00800b1500.shtml

Bogdan Palos
- Technical Lead / Enterprise Platforms Support (Core)