[Last updated on 27th February, 2014]
Hi there,
Below you can find a list of CU versions compiled from the KB articles for the cumulative update packages released for the Lync clients and Lync servers. Please note that the list only includes version information for Lync 2010 (desktop edition), Lync Server 2010, Lync 2013 (desktop edition) and Lync Server 2013; it excludes interim updates, security updates, and updates for non-desktop clients (WP8, iOS devices, Android, Lync Phone Edition).
Lync 2010
http://support.microsoft.com/?kbid=2467763
Description of the cumulative update package for Lync 2010: January 2011
(Version: 4.0.7577.108)
http://support.microsoft.com/?kbid=2496325
Description of the cumulative update package for Lync 2010: April 2011
(Version: 4.0.7577.253)
http://support.microsoft.com/kb/2551268
Description of the cumulative update package for Lync 2010: May 2011
(Version: 4.0.7577.280)
http://support.microsoft.com/?kbid=2571543
Description of the cumulative update package for Lync 2010: July 2011
(Version: 4.0.7577.314)
http://support.microsoft.com/?kbid=2514982
Description of the cumulative update package for Lync 2010: November 2011
(Version: 4.0.7577.4051)
http://support.microsoft.com/kb/2670326
Description of the cumulative update package for Lync 2010: February 2012
(Version: 4.0.7577.4072)
http://support.microsoft.com/kb/2701664
Description of the cumulative update package for Lync 2010: June 2012
(Version: 4.0.7577.4103)
http://support.microsoft.com/kb/2737155
Description of the cumulative update package for Lync 2010: October 2012
(Version: 4.0.7577.4356)
http://support.microsoft.com/kb/2791382
Description of the cumulative update package for Lync 2010: March 2013
(Version: 4.0.7577.4378)
http://support.microsoft.com/kb/2815347
Description of the cumulative update package for Lync 2010: April 2013
(Version: 4.0.7577.4384)
http://support.microsoft.com/kb/2842627
Description of the cumulative update package for Lync 2010: July 2013
(Version: 4.0.7577.4398)
http://support.microsoft.com/kb/2884632
Description of the cumulative update package for Lync 2010: October 2013
(Version: 4.0.7577.4409)
http://support.microsoft.com/kb/2912208
Description of the cumulative update package for Lync 2010: January 2014
(Version: 4.0.7577.4419)
Lync Server 2010
http://support.microsoft.com/kb/2467775
Description of the cumulative update for Lync Server 2010, Core Components: January 2011
http://support.microsoft.com/kb/2500442
Description of the cumulative update for Lync Server 2010: April 2011
(Version: 4.0.7577.137)
http://support.microsoft.com/kb/2571546
Description of the cumulative update for Lync Server 2010: July 2011
(Version: 4.0.7577.166)
http://support.microsoft.com/kb/2514980
Description of the cumulative update for Lync Server 2010: November 2011
(Version: 4.0.7577.183)
http://support.microsoft.com/kb/2670352
Description of the cumulative update for Lync Server 2010: February 2012
(Version: 4.0.7577.190)
http://support.microsoft.com/kb/2701585
Description of the cumulative update for Lync Server 2010: June 2012
(Version: 4.0.7577.199)
http://support.microsoft.com/kb/2737915
Description of the cumulative update for Lync Server 2010: October 2012
(Version: 4.0.7577.203)
http://support.microsoft.com/kb/2791381
Description of the cumulative update for Lync Server 2010: March 2013
(Version: 4.0.7577.216)
http://support.microsoft.com/kb/2860700
Description of the cumulative update for Lync Server 2010: July 2013
(Version: 4.0.7577.217)
http://support.microsoft.com/kb/2889610
Description of the cumulative update for Lync Server 2010: October 2013
(Version: 4.0.7577.223)
http://support.microsoft.com/kb/2909888
Description of the cumulative update for Lync Server 2010: January 2014
(Version: 4.0.7577.225)
Lync 2013
http://support.microsoft.com/kb/2812461
Description of the Lync 2013 updates 15.0.4454.1509: February 2013
http://support.microsoft.com/kb/2817465
MS13-054: Description of the security update for Lync 2013: July 9, 2013
(Version: 15.0.4517.1004)
http://support.microsoft.com/kb/2825630
Description of the Lync 2013 update 15.0.4551.1005: November 7, 2013
http://support.microsoft.com/kb/2850057
MS13-096: Description of the security update for Lync 2013: December 10, 2013
(Version: 15.0.4551.1007)
http://support.microsoft.com/kb/2817430
Description of Microsoft Office 2013 Service Pack 1 (SP1)
(Version: 15.0.4569.1503)
Lync Server 2013
http://support.microsoft.com/kb/2781547
Description of the cumulative update 5.0.8308.291 for Lync Server 2013: February 2013
http://support.microsoft.com/kb/2819565
Description of the cumulative update 5.0.8308.420 for Lync Server 2013: July 2013
http://support.microsoft.com/kb/2881684
Description of the cumulative update 5.0.8308.556 for Lync Server 2013
(Front End Server and Edge Server) : October 2013
http://support.microsoft.com/kb/2905048
Description of the cumulative update 5.0.8308.577 for Lync Server 2013 (Front End Server and Edge Server): January 2014
Hope this helps
Thanks,
Murat
[Last updated: 13th January 2014]
Hi,
In this blog entry, I want to talk about some changes made to SYN attack protection on Windows Vista and later systems.
SYN attack protection has been in place since Windows 2000 and has been enabled by default since Windows 2003 SP1. In the earlier implementation (Windows 2000/Windows 2003), the SYN attack protection mechanism was configurable via various registry values (such as SynAttackProtect, TcpMaxHalfOpen, TcpMaxHalfOpenRetried and TcpMaxPortsExhausted). With that implementation, the TCP/IP stack starts dropping new connection requests when the threshold values are met, regardless of how much system memory or CPU power is available to the system. As of Windows Vista (Vista/2008/Windows 7/2008 R2/Windows 8/Windows 2012/Windows 2012 R2), the SYN attack protection algorithm has changed in the following ways:
1) SYN attack protection is enabled by default and cannot be disabled!
2) SYN attack protection dynamically calculates its thresholds (the point at which it considers an attack to have started) based on the number of CPU cores and the amount of memory available, so it no longer exposes any configurable parameters via the registry, netsh, etc.
3) Because the TCP/IP driver enters the attack state based on the number of CPU cores and the amount of memory available, systems with more resources start dropping new connection attempts later than systems with fewer resources. On pre-Vista systems the thresholds were fixed (as per the configured registry settings), and the system entered the attack state regardless of how many resources were available. The new algorithm eliminates the need for any fine-tuning: the TCP/IP stack self-tunes to the best possible values for the available resources.
One of the questions asked most about TCP SYN attack protection is how an administrator can identify whether a server has moved into the attack state. Currently, no event is logged when the system enters the attack state and starts dropping TCP SYN packets on Vista and later systems. The only way to tell that SYN attack protection has kicked in is to collect an ETL trace (and you need to start it before the attack begins so that the relevant TCP/IP ETL entry is captured).
Run the following command from an elevated command prompt (note: the "netsh trace" command only works on Windows 7/Windows 2008 R2 and later systems):
netsh trace start capture=yes provider=Microsoft-Windows-TCPIP level=0x05 tracefile=TCPIP.etl
Once the SYN attack starts, the ETL trace can be stopped with the command below:
netsh trace stop
Then you can open the trace with Network Monitor 3.4. The ETL entry that you should be looking for is the one below:
Thanks,
Murat
I have had to deal with a number of support cases where the IPReassemblyTimeout registry key was set but didn't take effect on Windows 2003 or a later system, so I thought I should share more information about this here. Here are some details:
IP fragmentation is needed when an upper-layer packet whose payload is bigger than the IP MTU needs to be sent to a destination. This can happen when the packet initially leaves the host, or when a router needs to forward a packet received through one interface (with a bigger MTU) out another interface (with a smaller MTU). It also happens when packets traverse VPN links, where the additional VPN-related overhead causes the original packet to be fragmented.
The final receiver of the fragmented IP packets reassembles those fragments into the original packet before passing it to upper-layer protocols. The receiver waits for a period called the "reassembly timeout" for all the fragments belonging to the original packet to arrive. If any one of the fragments is dropped on the way, the receiver drops the other fragments belonging to the original packet.
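As a toy model of the mechanics above (not Windows' actual implementation; the function name is mine), the following Python sketch shows how a payload larger than the MTU is split into fragments whose offsets are carried in 8-byte units. Since every fragment is needed for reassembly, a single lost fragment costs the whole packet:

```python
def fragment(payload_len, mtu, ip_header=20):
    """Toy IPv4 fragmentation: split a payload into fragments whose data
    portion fits in (mtu - ip_header) and is a multiple of 8 bytes,
    except for the last fragment. Returns (offset, length) pairs, with
    offsets in 8-byte units, as carried in the IP header."""
    max_data = (mtu - ip_header) // 8 * 8  # data per fragment, 8-byte aligned
    frags, offset = [], 0
    while offset < payload_len:
        size = min(max_data, payload_len - offset)
        frags.append((offset // 8, size))
        offset += size
    return frags

# A 4000-byte payload crossing a link with a 1500-byte MTU
print(fragment(4000, 1500))  # [(0, 1480), (185, 1480), (370, 1040)]
```

The receiver can only rebuild the original 4000-byte payload once all three fragments have arrived within the reassembly timeout.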
In NT 3.1, there was a registry key called "IPReassemblyTimeout" (referred to by KB 102973), but that registry key doesn't apply to Windows 2000, XP, 2003, Vista, 2008, Windows 7 or 2008 R2!
Some more facts:
1) The IP reassembly timeout is hardcoded on Windows 2000, Windows 2003, Windows XP, Windows Vista, Windows 2008, Windows 7 and Windows 2008 R2 and cannot be changed by any means (registry, netsh, etc.).
2) For Windows Vista, Windows 2008, Windows 7 and Windows 2008 R2, it's hardcoded to 60 seconds as per section 4.5 of RFC 2460:
"If insufficient fragments are received to complete reassembly of a packet within 60 seconds of the reception of the first-arriving fragment of that packet, reassembly of that packet must be abandoned and all the fragments that have been received for that packet must be discarded."
For more information, please see RFC2460 (http://www.ietf.org/rfc/rfc2460.txt)
3) For Windows 2000, Windows XP and Windows 2003, it's at least 60 seconds, but it may be higher depending on the value of the TTL in the IP header and may go up to 120 seconds.
Hope this helps..
In this blog post, I would like to talk about a misconfiguration that is still in place in many customer installations. I have dealt with many network performance issues where the problem stemmed from a small MTU size (576 bytes) being used when communicating with off-subnet hosts.
The PMTU discovery option helps communicating endpoints find the optimal MTU for a TCP session. If this feature is turned off, the MTU is set to 576 bytes for all communication with off-subnet hosts, which can badly impact performance when communicating with remote hosts.
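The discovery process itself can be sketched as a toy model (this is a simplification of RFC 1191, not Windows code; `discover_path_mtu` is a made-up name): the sender starts at its own MTU with the DF bit set, and lowers its estimate each time a router on the path reports that fragmentation would be needed. The post-MS05-019 floor of 576 bytes is included:

```python
def discover_path_mtu(link_mtus, initial_mtu, floor=576):
    """Toy RFC 1191 PMTU discovery: walk the links of a path and lower
    the sender's MTU estimate whenever a link is smaller, never going
    below the 576-byte floor Windows enforces after MS05-019."""
    mtu = initial_mtu
    for link in link_mtus:
        if link < mtu:
            # a router would send ICMP "fragmentation needed" here
            mtu = max(link, floor)
    return mtu

print(discover_path_mtu([1500, 1400, 1500], initial_mtu=1500))  # 1400
print(discover_path_mtu([1500, 300], initial_mtu=1500))         # 576 (clamped)
```

With PMTU discovery disabled, the sender never runs this process and simply falls back to 576 bytes for all off-subnet traffic, even on paths that could carry 1500-byte packets.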
By default, PMTU discovery is enabled (EnablePMTUDiscovery is set to 1), but due to some older security recommendations it is often set to 0 as part of server hardening. The reasoning behind setting that registry key to 0 was to prevent an attacker from forcing Windows to use very small MTU values to degrade performance.
But that security recommendation has not been valid since MS05-019. After that security update, this is no longer a security concern, because an attacker cannot force an MTU size lower than 576 bytes even when PMTU discovery is enabled. So the key shouldn't be set to 0 for security reasons as part of server hardening: doing so causes a performance loss while there is no longer any security concern around small MTU usage.
You can find more information about the changed behavior at the below article:
http://www.microsoft.com/technet/security/bulletin/MS05-019.mspx
Vulnerabilities in TCP/IP Could Allow Remote Code Execution and Denial of Service (893066)
(From General Information > Vulnerability Details > ICMP Path MTU Vulnerability > FAQ for ICMP Path MTU Vulnerability at the above link)
What is Path MTU Discovery?
Path maximum transmission unit (PMTU) discovery is the process of discovering the maximum size of packet that can be sent across the network between two hosts without fragmentation (that is, without the packet being broken into multiple frames during transmission). It is described in RFC 1191. For more information, see RFC 1191. For additional information, see the following MSDN Web site.
What is wrong with the Path MTU Discovery process?
Path maximum transmission unit (PMTU) discovery allows an attacker to specify a value that can degrade network performance for other connections. On unsecured networks, allowing PMTU discovery carries the risk that an attacker might force the MTU to a very small value and overwork the local system's TCP/IP stack. Normally this behavior would be restricted to the single connection that an attacker could establish. However, this vulnerability allows an attacker to modify the MTU value on other connections beyond their own connection to the affected system.
What does the update do?
The update removes the vulnerability by restricting the minimum value of the MTU to 576 bytes. This update also modifies the way that the affected operating systems validate ICMP requests.
In today's blog, I'll talk about an MTU issue that occurs on Windows Vista and later (Vista/7/2008/2008 R2).
One of our customers reported that their SMTP server (running on Windows 2008) was failing to send e-mails to certain remote SMTP servers because e-mail delivery was disrupted at transport layer.
After analyzing the network trace collected on the source Windows 2008 server, we found out that the remote system was offering a TCP MSS of 512 bytes while the Windows 2008 server kept sending data packets with an MSS of 536 bytes. As a result, those packets weren't successfully delivered to the remote system. You can find more details about the problem and root cause below:
Note: IP addresses and mail server names are deliberately changed.
Source SMTP server: 10.1.1.1 - mailgateway.contoso.com
Target SMTP server: 10.1.1.5 - mailgateway2.contoso.com
a) Source SMTP server establishes TCP 3-way handshake with the target SMTP server. Source server suggests an MSS size of 1460 bytes and the target suggests an MSS size of 512 bytes:
No. Time Source Destination Protocol Info
1 0.000000 10.1.1.1 10.1.1.5 TCP 28474 > 25 [SYN] Seq=0 Win=8192 Len=0 MSS=1460 WS=8
2 0.022001 10.1.1.5 10.1.1.1 TCP 25 > 28474 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=512
3 0.000000 10.1.1.1 10.1.1.5 TCP 28474 > 25 [ACK] Seq=1 Ack=1 Win=65392 Len=0
b) Then data starts flowing. Under normal circumstances, the minimum of the two MSS values is selected by both parties as the MSS of the given TCP session and is used throughout the session.
No. Time Source Destination Protocol Info
4 0.075005 10.1.1.5 10.1.1.1 SMTP S: 220 mailgateway2.contoso.com ESMTP Tue, 20 Apr 2010 15:18:42 +0200
5 0.001000 10.1.1.1 10.1.1.5 SMTP C: EHLO Mailgateway.contoso.com
6 0.021001 10.1.1.5 10.1.1.1 SMTP S: 250-mailgateway2.contoso.com Hello Mailgateway.contoso.com [10.1.1.1] | 250-SIZE 26214400 | 250-PIPELINING | 250 HELP
7 0.001000 10.1.1.1 10.1.1.5 SMTP C: MAIL FROM:<postmaster@contoso.com> SIZE=2616 | RCPT TO:<test@test.abc.com>
8 0.183011 10.1.1.5 10.1.1.1 SMTP S: 250 OK | 250 Accepted
9 0.000000 10.1.1.1 10.1.1.5 SMTP C: DATA
10 0.022001 10.1.1.5 10.1.1.1 SMTP S: 354 Enter message, ending with "." on a line by itself
c) Even though an MSS of 512 bytes should have been agreed on by both parties, the Windows 2008 server doesn't seem to honor that value and keeps sending data with an MSS of 536 bytes:
No. Time Source Destination Protocol Info
11 0.294017 10.1.1.1 10.1.1.5 SMTP C: Message Body, 536 bytes
d) The TCP segment with 536 bytes of data most likely doesn't arrive at the target server; as a result we don't get a TCP ACK back, so TCP packet retransmissions start:
No. Time Source Destination Protocol Info
12 0.600034 10.1.1.1 10.1.1.5 SMTP [TCP Retransmission] C: Message Body, 536 bytes
13 0.190011 10.1.1.5 10.1.1.1 SMTP [TCP Retransmission] S: 354 Enter message, ending with "." on a line by itself
14 0.000000 10.1.1.1 10.1.1.5 TCP [TCP Dup ACK 12#1] 28474 > 25 [ACK] Seq=649 Ack=269 Win=65124 Len=0
15 1.010058 10.1.1.1 10.1.1.5 SMTP [TCP Retransmission] C: Message Body, 536 bytes
16 2.400137 10.1.1.1 10.1.1.5 SMTP [TCP Retransmission] C: Message Body, 536 bytes
17 4.800274 10.1.1.1 10.1.1.5 SMTP [TCP Retransmission] C: Message Body, 536 bytes
e) Finally, the source server closes the TCP session, as it fails to successfully deliver the 536-byte TCP segment to the target system:
No. Time Source Destination Protocol Info
18 9.600550 10.1.1.1 10.1.1.5 TCP 28474 > 25 [RST, ACK] Seq=649 Ack=269 Win=0 Len=0
The same problem doesn't happen if the source server is a Windows 2003 server.
After explaining the problem, now let's try to understand the root cause:
This issue stems from the fact that Windows Vista and later systems don't accept an MTU size lower than 576 bytes:
TCP/IP Registry Values for Microsoft Windows Vista and Windows Server 2008
http://www.microsoft.com/downloads/details.aspx?familyid=12AC9780-17B5-480C-AEF7-5C0BDE9060B0&displaylang=en
MTU
Key: Tcpip\Parameters\Interfaces\interfaceGUID
Value Type: REG_DWORD—number
Valid Range: From 576 to the MTU of the underlying network
Default: 0xFFFFFFFF
Description: This value overrides the default Maximum Transmission Unit (MTU) for a network interface. The MTU is the maximum IP packet size, in bytes, that can be transmitted over the underlying network. For values larger than the default for the underlying network, the network default MTU is used. For values smaller than 576, the MTU of 576 is used. This setting only applies to IPv4.
Note: Windows Vista TCP/IP uses path MTU (PMTU) detection by default and queries the network adapter driver to find out what local MTU is supported. Altering the MTU value is typically not necessary and might result in reduced performance.
Since the minimum MTU that can be used by a Windows Vista or later system is 576 bytes, the TCP MSS (maximum segment size) is 536 bytes at minimum; that's why the Windows 2008 source server tries to send TCP segments with 536 bytes of data. The TCP MSS value is calculated as follows:
TCP MSS = IP MTU - IP header size (20 bytes by default) - TCP header size (20 bytes by default)
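Plugging in the numbers (a quick sketch of the arithmetic, assuming no IP or TCP options):

```python
IP_HEADER = 20   # bytes, IPv4 header without options
TCP_HEADER = 20  # bytes, TCP header without options

def tcp_mss(ip_mtu):
    """TCP MSS = IP MTU - IP header size - TCP header size."""
    return ip_mtu - IP_HEADER - TCP_HEADER

# The peer's MSS of 512 implies an MTU of 512 + 40 = 552 bytes,
# but Vista+ clamps any MTU below 576 up to 576...
effective_mtu = max(512 + IP_HEADER + TCP_HEADER, 576)

print(tcp_mss(effective_mtu))  # 536 -> why Windows 2008 sends 536-byte segments
print(tcp_mss(1500))           # 1460, the usual Ethernet MSS
```

So the 536-byte segments are a direct consequence of the 576-byte MTU floor, not of the peer's advertised 512-byte MSS.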
Hi there,
In today’s blog post, I’m going to talk about an issue that I have come across several times while analyzing network traces with Wireshark. Let’s take the following example:
I apply the following filter on a network trace:
ip.addr==192.168.100.23 and ip.addr==192.168.121.51 and tcp.port==3268 and tcp.port==8081
And I get the following traffic:
No. Time Source Destination Protocol Info
8773 17.458870 192.168.100.23 192.168.121.51 TCP 3268 > 8081 [SYN] Seq=0 Win=65535 Len=0 MSS=1460
8774 17.458988 192.168.121.51 192.168.100.23 TCP 8081 > 3268 [SYN, ACK] Seq=0 Ack=1 Win=8192 [TCP CHECKSUM INCORRECT] Len=0 MSS=1460
8775 17.459239 192.168.100.23 192.168.121.51 TCP 3268 > 8081 [ACK] Seq=1 Ack=1 Win=65535 Len=0
8776 17.459239 192.168.100.23 192.168.121.51 TCP 3268 > 8081 [PSH, ACK] Seq=1 Ack=1 Win=65535 Len=264
8850 17.658922 192.168.121.51 192.168.100.23 TCP 8081 > 3268 [ACK] Seq=1 Ack=265 Win=64240 [TCP CHECKSUM INCORRECT] Len=0
8851 17.659108 192.168.100.23 192.168.121.51 TCP 3268 > 8081 [PSH, ACK] Seq=265 Ack=1 Win=65535 Len=21
8853 17.661356 192.168.100.23 192.168.121.51 TCP [TCP ACKed lost segment] 3268 > 8081 [ACK] Seq=286 Ack=2581 Win=65535 Len=0
8854 17.661404 192.168.121.51 192.168.100.23 TCP 8081 > 3268 [FIN, ACK] Seq=2581 Ack=286 Win=64219 [TCP CHECKSUM INCORRECT] Len=0
8855 17.661605 192.168.100.23 192.168.121.51 TCP 3268 > 8081 [ACK] Seq=286 Ack=2582 Win=65535 Len=0
8859 17.665981 192.168.100.23 192.168.121.51 TCP 3268 > 8081 [FIN, ACK] Seq=286 Ack=2582 Win=65535 Len=0
8860 17.666013 192.168.121.51 192.168.100.23 TCP 8081 > 3268 [ACK] Seq=2582 Ack=287 Win=64219 [TCP CHECKSUM INCORRECT] Len=0
When I take a closer look, I see that a TCP segment is missing from the list of packets and hence the next frame is displayed with a [TCP ACKed lost segment] comment by Wireshark. Interestingly if I apply the following filter, I can see the frame that’s missing from the TCP conversation:
ip.addr==192.168.100.23 and ip.addr==192.168.121.51
No. Time Source Destination Protocol Info
8852 17.661030 HewlettP_12:34:56 Cisco_12:34:56 IP Bogus IP length (0, less than header length 20)
Frame 8852 (2634 bytes on wire, 2634 bytes captured)
Ethernet II, Src: HewlettP_12:34:56 (00:17:a4:12:34:56), Dst: Cisco_12:34:56 (00:15:2c:12:34:56)
Destination: Cisco_12:34:56 (00:15:2c:12:34:56)
Source: HewlettP_12:34:56 (00:17:a4:12:34:56)
Type: IP (0x0800)
Internet Protocol
Version: 4
Header length: 20 bytes
Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00)
0000 00.. = Differentiated Services Codepoint: Default (0x00)
.... ..0. = ECN-Capable Transport (ECT): 0
.... ...0 = ECN-CE: 0
Total length: 0 bytes (bogus, less than header length 20)
0000 00 15 2c 31 48 00 00 17 a4 77 00 24 08 00 45 00 ..,1H....w.$..E.
0010 00 00 57 d0 40 00 80 06 00 00 c0 a8 79 33 c0 a8 ..W.@.......y3..
0020 64 17 1f 91 0c c4 52 83 a3 f2 a2 a2 06 be 50 18 d.....R.......P.
0030 fa db 5e a2 00 00 48 54 54 50 2f 31 2e 31 20 32 ..^...HTTP/1.1 2
0040 30 30 20 4f 4b 0d 0a 50 72 61 67 6d 61 3a 20 6e 00 OK..Pragma: n
0050 6f 2d 63 61 63 68 65 0d 0a 43 6f 6e 74 65 6e 74 o-cache..Content
0060 2d 54 79 70 65 3a 20 74 65 78 74 2f 68 74 6d 6c -Type: text/html
0070 3b 63 68 61 72 73 65 74 3d 75 74 66 2d 38 0d 0a ;charset=utf-8..
0080 53 65 72 76 65 72 3a 20 4d 69 63 72 6f 73 6f 66 Server: Microsof
0090 74 2d 49 49 53 2f 37 2e 35 0d 0a 58 2d 50 6f 77 t-IIS/7.5..X-Pow
...
Even though the total length field is set to 0, I see that the IP packet has some payload (probably including a TCP header).
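To illustrate, the frame headers from the hex dump above can be decoded by hand (a small sketch; only the first 34 bytes are reproduced here) to show that the on-wire total length field really is zero even though the frame carries payload:

```python
import struct

# First 34 bytes of frame 8852, copied from the hex dump above
frame = bytes.fromhex(
    "00152c314800"   # destination MAC (Cisco_12:34:56)
    "0017a4770024"   # source MAC (HewlettP_12:34:56)
    "0800"           # EtherType: IPv4
    "4500"           # version/IHL (0x45) and DSCP/ECN
    "0000"           # IP total length -- bogus 0
    "57d0"           # identification
    "4000"           # flags/fragment offset (DF set)
    "80"             # TTL
    "06"             # protocol: 6 = TCP
    "0000"           # header checksum (also 0, left for offload)
    "c0a87933"       # source IP
    "c0a86417"       # destination IP
)

ETH_LEN = 14  # Ethernet II header length
total_length = struct.unpack("!H", frame[ETH_LEN + 2 : ETH_LEN + 4])[0]
src_ip = ".".join(str(b) for b in frame[ETH_LEN + 12 : ETH_LEN + 16])

print(total_length)  # 0 -> Wireshark reports "Bogus IP length"
print(src_ip)        # 192.168.121.51
```

Note that the header checksum is zero too, which is consistent with the NIC being expected to fill in these fields during segmentation.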
The problem occurs because Wireshark doesn't fully parse the IP and TCP headers when the total length field in the IP header is 0. This also explains why we don't see this packet when the TCP filter is applied.
After some testing, I realized that this issue can be fixed by enabling the following setting in the Wireshark preferences:
After I enabled “Support packet-capture from IP TSO-enabled hardware”, Wireshark started to correctly display the frames even when the TCP session filter is applied:
ip.addr==192.168.100.23 and ip.addr==192.168.121.51 and tcp.port==3268 and tcp.port==8081
No. Time Source Destination Protocol Info
8771 17.458870 192.168.100.23 192.168.121.51 TCP 3268 > 8081 [SYN] Seq=0 Win=65535 Len=0 MSS=1460 SACK_PERM=1
8772 17.458988 192.168.121.51 192.168.100.23 TCP 8081 > 3268 [SYN, ACK] Seq=0 Ack=1 Win=8192 [TCP CHECKSUM INCORRECT] Len=0 MSS=1460 SACK_PERM=1
8773 17.459239 192.168.100.23 192.168.121.51 TCP 3268 > 8081 [ACK] Seq=1 Ack=1 Win=65535 Len=0
8774 17.459239 192.168.100.23 192.168.121.51 TCP 3268 > 8081 [PSH, ACK] Seq=1 Ack=1 Win=65535 Len=264
8848 17.658922 192.168.121.51 192.168.100.23 TCP 8081 > 3268 [ACK] Seq=1 Ack=265 Win=64240 [TCP CHECKSUM INCORRECT] Len=0
8849 17.659108 192.168.100.23 192.168.121.51 TCP 3268 > 8081 [PSH, ACK] Seq=265 Ack=1 Win=65535 Len=21
8850 17.661030 192.168.121.51 192.168.100.23 TCP 8081 > 3268 [PSH, ACK] Seq=1 Ack=286 Win=64219 [TCP CHECKSUM INCORRECT] Len=2580
8851 17.661356 192.168.100.23 192.168.121.51 TCP 3268 > 8081 [ACK] Seq=286 Ack=2581 Win=65535 Len=0
8852 17.661404 192.168.121.51 192.168.100.23 TCP 8081 > 3268 [FIN, ACK] Seq=2581 Ack=286 Win=64219 [TCP CHECKSUM INCORRECT] Len=0
8853 17.661605 192.168.100.23 192.168.121.51 TCP 3268 > 8081 [ACK] Seq=286 Ack=2582 Win=65535 Len=0
8857 17.665981 192.168.100.23 192.168.121.51 TCP 3268 > 8081 [FIN, ACK] Seq=286 Ack=2582 Win=65535 Len=0
8858 17.666013 192.168.121.51 192.168.100.23 TCP 8081 > 3268 [ACK] Seq=2582 Ack=287 Win=64219 [TCP CHECKSUM INCORRECT] Len=0
When TSO (TCP segmentation offload) is in place, the TCP/IP stack doesn't deal with segmentation at the TCP layer and leaves it to the NIC driver for efficiency purposes. Since Wireshark sees the packet before the NIC, we see the total length as 0 in the capture; when that packet is later segmented by the NIC, the correct length field is set in each packet on the wire. This can also be proven by collecting a network trace at the other end of the session.
Note: Network Monitor already takes that into account and hence you don’t need to take any corrective action if you’re checking the trace with it.
Hope this helps.
Thanks,
Murat
In this blog post, I'll be talking about another TMG problem, where FTP over HTTP was failing through the TMG server.
Let me first summarize the scenario:
- Internet Explorer clients need to connect to an external FTP site through TMG server
- Due to some other requirements, this FTP site needs to be accessed passively
The FTP filter in the TMG server already uses passive FTP when connecting to external FTP sites:
(Note: this is the default behavior; please see http://blogs.technet.com/b/yuridiogenes/archive/2010/03/16/error-502-active-ftp-not-allowed-when-trying-to-list-files-in-a-ftp-session-behind-forefront-tmg-2010.aspx for more information.)
That was also the case in my customer's scenario, but the passive FTP connection to the target FTP server was still failing. After some troubleshooting, we found out that the TMG server was trying to connect to the target FTP site actively even though the FTP filter was configured as above.
Normally, when you type ftp://target-FTP-Server-FQDN in the IE address bar and IE is configured to use a proxy server, the connection request is sent as an HTTP request to the proxy server (with the FTP GET request inside that HTTP request); this is also called FTP over HTTP. So the request flow is similar to the below:
a) The client sends the request via FTP over HTTP to the proxy server
b) The proxy server connects to the target FTP server via the FTP protocol
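The two legs above can be sketched roughly as follows (an illustrative sketch only: the request-line shape is standard proxy behavior for ftp:// URLs, while the variable names and the placeholder FTP command list are mine):

```python
# Leg a) browser -> proxy: a plain HTTP GET whose request URI is the ftp:// URL
ftp_url = "ftp://target-FTP-Server-FQDN/"
proxy_request = (
    "GET " + ftp_url + " HTTP/1.1\r\n"
    "Host: target-FTP-Server-FQDN\r\n"
    "\r\n"
)

# Leg b) proxy -> FTP server: native FTP commands; in passive mode the
# proxy sends PASV so the server opens a data port for the proxy to dial
ftp_commands_passive = ["USER anonymous", "PASV", "LIST"]

print(proxy_request.splitlines()[0])  # GET ftp://target-FTP-Server-FQDN/ HTTP/1.1
```

The key point is that leg a) is pure HTTP from TMG's point of view, which is why the FTP filter (which handles native FTP traffic) never sees these requests.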
After some further troubleshooting with the TMG data packager and network trace analysis, I found out that the FTP filter isn't involved when the proxy server receives FTP over HTTP traffic from clients, and hence the FTP filter setting doesn't apply to FTP over HTTP requests.
The resolution was to set the NonPassiveFTPTransfer registry key on the TMG server and restart the firewall service:
Note: You can find more information about that registry key at http://support.microsoft.com/kb/300641 (How to enable passive CERN FTP connections through ISA Server 2000, 2004, or 2006).
As mentioned above, after the registry key is created, you’ll need to stop and then start the firewall service from an elevated command prompt:
net stop fwsrv
net start fwsrv
To summarize: even though the “NonPassiveFTPTransfer” registry key shouldn’t normally be needed on a TMG server, the exact requirements are as follows:
a) If the internal client sends the FTP request directly via the FTP protocol, there’s no need to change anything on the TMG server side, as the FTP filter will kick in and the FTP connection to the external FTP server will be initiated passively (Examples: command prompt FTP client, 3rd party FTP client applications, IE not configured to use a Proxy server, etc.)
b) If the internal client sends the FTP request via the FTP over HTTP protocol, then the changes mentioned above need to be implemented on the TMG server side in order for the TMG server to initiate the outbound FTP connection passively (Example: IE configured to use a Proxy server)
In one of my past cases, a customer wanted to know whether we support policy-based routing on Windows 2003 or later OSes. First of all, it might be useful to clarify what "policy based routing" means in this context. Let's take the following as an example:
"A server is running as a router and has 3 network interfaces. When the server receives a packet from a specific host (let’s say running at a certain IP address) on one of its interfaces (say interface1), we would like the server to always route that host’s packets through interface2 without consulting the routing table. (The criteria might be different for different scenarios, such as "all packets with a destination of TCP port 80 to be sent out from interface3", etc.)"
These kinds of advanced routing decisions are generally supported by network hardware vendors like Cisco. For example, by using a route-map configuration in Cisco IOS, you can override the conventional routing decisions made by looking up the routing table. You can find more information at the following link:
http://www.cisco.com/en/US/docs/ios/12_0/qos/configuration/guide/qcpolicy.html Configuring Policy-Based Routing
And the answer to the original question is: no, we don't support policy-based routing on Windows server OSes, since this is generally a feature needed on hardware routers whose main purpose is packet routing.
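Although Windows doesn't implement it, the decision logic from the example above can be sketched in a few lines (the interface names, the 10.0.0.5 source address, and the dictionary-based routing table are all hypothetical): policy rules are evaluated first, and the routing table is consulted only if no rule matches.

```python
def choose_interface(packet: dict, routing_table: dict) -> str:
    # Policy rules are checked first, in order; only when none matches
    # does the conventional routing-table lookup happen.
    policies = [
        (lambda p: p.get("src") == "10.0.0.5", "interface2"),
        (lambda p: p.get("dst_port") == 80, "interface3"),
    ]
    for matches, iface in policies:
        if matches(packet):
            return iface
    return routing_table.get(packet["dst"], "interface1")

print(choose_interface({"src": "10.0.0.5", "dst": "8.8.8.8"}, {}))
# interface2 (source-based policy wins)
print(choose_interface({"src": "10.9.9.9", "dst": "8.8.8.8", "dst_port": 80}, {}))
# interface3 (port-based policy wins)
```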
In this blog post, I would like to talk about running Lync 2013 Web App in Windows Terminal Server environments. The Lync 2013 Web App feature has a client-side plug-in which provides audio/video/application sharing functionality, and this plug-in is installed per user; in other words, the installation program installs files and creates registry settings in user-specific areas of the system. Most terminal server environments are locked down in production networks, and users are generally not allowed to install software.
I recently dealt with a couple of cases where a solution to this problem was required. One possible solution is to create exceptions in your software restriction solution (it could be 3rd party software or a Microsoft solution such as Software Restriction Policies or AppLocker). You will find steps below to create such exceptions in Software Restriction Policies and AppLocker:
Software Restriction Policies (applicable if the Terminal Server is Windows 2003 or later)
Note: We don't support Lync Web App on Windows 2003; it's supported on Windows 2008 or later. Please see the link below for more details:
http://technet.microsoft.com/en-us/library/gg425820.aspx Lync Web App Supported Platforms
a) First of all, the Lync Web App plugin file (LWAPlugin64BitInstaller32.msi) includes a number of executables, each of which needs to be defined within the software restriction policy rules. You can extract the MSI file with 7-Zip or a similar tool. Once the MSI file is extracted, we have the following executables:
AppSharingHookController.exe
AppSharingHookController64.exe
LWAPlugin.exe
LWAVersionPlugin.exe
b) So we need to create 5 additional rules (1 MSI rule and 4 executable file rules) in Software Restriction Policies in addition to your existing rules, as shown below:
Note: It’s best to create file hash rules for the MSI file itself and the 4 executables that are extracted from the MSI file
So if Software Restriction Policies is already deployed on your network, an exception can be created for the Lync Web App plugin so that users still comply with the application installation/execution policies.
AppLocker (applicable if the Terminal Server is Windows 2008 R2 or later)
a) As mentioned in the previous scenario, the Lync Web App plugin file (LWAPlugin64BitInstaller32.msi) includes a number of executables, each of which needs to be defined within the AppLocker rules. You can extract the MSI file with 7-Zip or a similar tool. Once the MSI file is extracted, we have the same executables listed above.
b) So we need to create 1 MSI rule and 4 executable file rules in AppLocker, as shown below:
So if AppLocker is deployed on your network, an exception can be created for the Lync Web App plugin so that users still comply with the application installation/execution policies.
The only drawback of file hash rules is that once the Lync Server web components are updated with a cumulative update on the Lync server side, you’ll have to create those file hash rules again (the content of the MSI file shared from the FE server will likely be different, and hence the file hash will change). Considering that the web components are not frequently updated, this may need to be done two or three times a year. Alternatively, a file path rule or publisher rule could be created instead of a file hash rule to avoid such maintenance.
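The maintenance cost of hash rules is easy to see: a hash rule identifies a file purely by a cryptographic digest of its bytes, so any cumulative update that changes the plug-in binaries invalidates the rule. A small sketch (SHA-256 and the byte strings are illustrative only; they don't represent the real MSI contents):

```python
import hashlib

def file_hash(data: bytes) -> str:
    # A hash rule identifies a file by a digest of its content
    # (SHA-256 here, purely for illustration).
    return hashlib.sha256(data).hexdigest()

# Hypothetical plug-in contents before and after a cumulative update:
v1 = file_hash(b"LWAPlugin binaries, RTM build")
v2 = file_hash(b"LWAPlugin binaries, post-CU build")

print(v1 == v2)  # False: the updated file no longer matches the old rule
```

A path or publisher rule sidesteps this because neither depends on the exact file bytes.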
In this blog post, I’ll be talking about a TMG-related issue. Actually, it’s not an issue that stems from TMG itself; rather, the way the TMG server is configured (using authenticated rules on the TMG server) triggers the problem.
This is already a known fact, and we have a KB article that explains the issue (JVM applications cannot send authentication information when requested); the workaround is to turn off authentication for the access rule that allows the client’s connection to external networks:
http://support.microsoft.com/kb/925881/ An ISA server or Forefront Threat Management Gateway server requests credentials when client computers in the same domain use Internet Explorer to access Web sites that contain Java programs
So if all or some parts of a web page are not displayed correctly, you see “Proxy authentication required” or similar messages on the client side, and you suspect that Java is somehow involved, you’ll have to implement the steps mentioned in the above article.
But sometimes it may not be that clear, which was the case in my scenario. The customer reported that videos at an on-demand video conference site couldn’t be viewed and the application running inside IE was displaying an unrelated error. I suspected that we were hitting the problem mentioned above and asked the customer to configure a temporary access rule allowing all outbound access for “All users”; the videos then started to play :)
Then we changed the rule target to the target web site only (you can do this via a URL set (for HTTP/HTTPS access) or via a domain name set (for any protocol); you can find more information below):
http://technet.microsoft.com/en-us/library/cc441706.aspx Processing domain name sets and URL sets
Since the customer was connecting to https://www.videoondemandwebsite.com, we added this domain to the rule target. But the video access was still failing afterwards. We then decided to collect more information on what kind of HTTP activity was taking place on the client side, so I asked the customer to install Fiddler on the client (you can download the tool at http://www.fiddler2.com).
You’ll find below a sample screenshot taken from Fiddler when accessing Microsoft’s web site:
As you can see from the above output, even if you see a certain address in IE (www.microsoft.com in this example), the browser might need to connect to other related web sites to load images, get scripts, and so on. In the above example, the browser also connects to ads1.msn.com and rad.msn.com.
That was the case in my customer’s problem: even though the customer was connecting to https://www.videoondemandwebsite.com, the browser was also connecting to a few other web sites like *.site1.com and *.site2.com. So we changed the relevant rule to cover these domain names as well, and the problem was resolved:
*.videoondemandwebsite.com
*.site1.com
*.site2.com
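As a rough sketch of why the extra entries were needed: the rule only permits hosts that match one of the listed patterns, so the helper sites the browser contacted in the background were blocked until they were added. This simplified matcher is not TMG's exact domain name set algorithm, just an illustration of wildcard matching.

```python
from fnmatch import fnmatch

rule_targets = ["*.videoondemandwebsite.com", "*.site1.com", "*.site2.com"]

def host_allowed(host: str) -> bool:
    # A host is allowed if it matches any pattern in the rule target.
    return any(fnmatch(host, pattern) for pattern in rule_targets)

print(host_allowed("www.videoondemandwebsite.com"))  # True
print(host_allowed("cdn.site1.com"))                 # True
print(host_allowed("ads1.msn.com"))                  # False: would be blocked
```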
I would like to go through an integration problem between Lync phone edition devices and Exchange 2010 that I worked on a while ago. Since the integration wasn’t working properly, users couldn’t access call logs, recorded voice mails, calendar information, etc. from their desktop phones (HP 4120).
To understand the problem in more detail, we asked our customer to collect Exchange-side configuration information, phone edition logs from a problem device, a network trace from the Lync FE server, etc.
Note: You can find full details on how to collect and analyze CELog data in the following Microsoft whitepaper:
http://www.microsoft.com/en-us/download/details.aspx?id=15668 Understanding and Troubleshooting Microsoft Exchange Server Integration
Troubleshooting:
===============
Note: IP addresses, server names, URLs etc are replaced for privacy purposes
1) The Lync phone edition device succeeds in obtaining autodiscovery information:
0:01:43.203.610 : Raw data 211 (char), UCD_LOG_INFO: 10/19/2012|14:50:27 Aries: 10/19/2012|14:50:27.214 4EC0006:5250012 INFO :: NAutoDiscover::DnsAutodiscoverTask::ParseSoapResponse: InternalEwsUrl is https://casarray.contoso.com/EWS/Exchange.asmx
0:01:43.204.593 : Raw data 212 (char), UCD_LOG_INFO: 10/19/2012|14:50:27 Aries: 10/19/2012|14:50:27.215 4EC0006:5250012 INFO :: NAutoDiscover::DnsAutodiscoverTask::ParseSoapResponse: ExternalEwsUrl is https://autodiscover.contoso.com/EWS/Exchange.asmx
2) But the Lync phone edition device fails to access the EWS site with the following error (the same error is seen in all device logs), so the integration error occurs:
0:01:43.840.224 : Raw data 206 (char), UCD_LOG_ERROR: 10/19/2012|14:50:27 Aries: 10/19/2012|14:50:27.850 4EC0006:5250012 ERROR :: WebServices::CSoapTransport::ExecuteSoapOperation: Insecure server. errorCode=12045, status=014C0220, hr=80f10043
0:01:43.840.617 : Raw data 196 (char), UCD_LOG_ERROR: 10/19/2012|14:50:27 Aries: 10/19/2012|14:50:27.851 4EC0006:5250012 ERROR :: WebServices::CSoapTransport::SendSoapRequest: ExecuteSoapOperation failed. status=014C0220, hr=80f10043
0:01:43.840.963 : Raw data 225 (char), UCD_LOG_INFO: 10/19/2012|14:50:27 Aries: 10/19/2012|14:50:27.851 4EC0006:5250012 INFO :: WebServices::WebRequestImpl::ExecuteCommon: after SendSoapRequest. hr=0x80f10043, url=https://casarray.contoso.com/EWS/Exchange.asmx
0:01:43.841.216 : Raw data 191 (char), UCD_LOG_INFO: 10/19/2012|14:50:27 Aries: 10/19/2012|14:50:27.851 4EC0006:5250012 INFO :: WebServices::CSoapTransport::GetHttpHeaderInfoFromHandle: hSoapHandle=014C0220, headerInfo=013E2708
0:01:43.841.548 : Raw data 197 (char), UCD_LOG_INFO: 10/19/2012|14:50:27 Aries: 10/19/2012|14:50:27.852 4EC0006:5250012 INFO :: WebServices::CSoapTransport::GetCredentialsInfoFromHandle: hSoapHandle=014C0220, credentialsInfo=013E2670
0:01:43.842.135 : Raw data 165 (char), UCD_LOG_INFO: 10/19/2012|14:50:27 Aries: 10/19/2012|14:50:27.852 4EC0006:5250012 INFO :: WebServices::CSoapTransportStatus::~CSoapTransportStatus: status=014C0220
0:01:43.842.492 : Raw data 182 (char), UCD_LOG_ERROR: 10/19/2012|14:50:27 Aries: 10/19/2012|14:50:27.853 4EC0006:5250012 ERROR :: WebServices::WebRequestImpl::ExecuteCommon: Could not execute SOAP request. hr=0x80f10043
err 014C0220
# as an HRESULT: Severity: SUCCESS (0), Facility: 0x14c, Code 0x220
# for hex 0x220 / decimal 544 :
SE_AUDITID_IPSEC_AUTH_FAIL_CERT_TRUST msaudite.h
# IKE security association establishment failed because peer
# could not authenticate.
# The certificate trust could not be established.%n
# Peer Identity: %n%1%n
err 0x80f10043
# (user-to-user)
MIDIERR_NOTREADY mmsystem.h
NMERR_INVALID_PACKET_LENGTH netmon.h
ERROR_BAD_NET_NAME winerror.h
# The network name cannot be found.
LDAP_NOT_ALLOWED_ON_RDN winldap.h
So the problem looks like a certificate issue.
=> When I check the certificate assigned to CAS array, I see the following:
AccessRules : {System.Security.AccessControl.CryptoKeyAccessRule, System.Security.AccessControl.CryptoKeyAccessRule}
CertificateDomains : {casarray.contoso.com, cas01.contoso.com}
HasPrivateKey : True
IsSelfSigned : False
Issuer : CN=Issuing-CA for contoso, DC=contoso, DC=com
NotAfter : 8/8/2014 3:36:18 PM
NotBefore : 8/8/2012 3:36:18 PM
PublicKeySize : 1024
RootCAType : Enterprise
SerialNumber : 362763723200000000524
Services : IMAP, POP, IIS
Status : Valid
Subject : CN=casarray.contoso.com
Thumbprint : EF32873628362BDA8326108875301F38504AB
Issuer is Issuing-CA for contoso, DC=contoso, DC=com
=> On the other hand, the CA that issues the Lync frontend certificate is the following: (this is taken from Lync server side network trace)
+ Tcp: Flags=...A...., SrcPort=5061, DstPort=50855, PayloadLen=1460, Seq=2076240595 - 2076242055, Ack=1109389286, Win=256 (scale factor 0x8) = 65536
TLSSSLData: Transport Layer Security (TLS) Payload Data
- TLS: TLS Rec Layer-1 HandShake: Server Hello. Certificate.
- TlsRecordLayer: TLS Rec Layer-1 HandShake:
ContentType: HandShake:
+ Version: TLS 1.0
Length: 4380 (0x111C)
- SSLHandshake: SSL HandShake Certificate(0x0B)
HandShakeType: ServerHello(0x02)
Length: 77 (0x4D)
+ ServerHello: 0x1
HandShakeType: Certificate(0x0B)
Length: 3423 (0xD5F)
- Cert: 0x1
CertLength: 3420 (0xD5C)
- Certificates:
CertificateLength: 1574 (0x626)
+ Signature: Sha1WithRSAEncryption (1.2.840.113549.1.1.5)
+ Issuer: Issuing CA for contoso,com
+ Validity: From: 09/10/12 10:05:05 UTC To: 09/10/14 10:05:05 UTC
+ Subject: casarray.contoso.com
+ SubjectPublicKeyInfo: RsaEncryption (1.2.840.113549.1.1.1)
+ Tag3:
+ Extensions:
+ SignatureAlgorithm: Sha1WithRSAEncryption (1.2.840.113549.1.1.5)
+ Signature:
Certificates:
=> So the certificates used by the Lync server itself and the casarray.contoso.com servers were issued by different CAs:
a) CA that issues Lync FE certificate:
Issuing CA for contoso,com
b) CA that issues CAS array certificate:
Issuing-CA for contoso, DC=contoso, DC=com
Apparently, two different CAs with similar names issued the Lync FE and Exchange CAS array certificates.
Under normal circumstances, if those two CAs are enterprise CAs, they automatically publish their CA certificates to AD so that clients can download and use them while verifying the certificate chain. But after some internal research, I found out that phone edition devices only trust the CA that issued the certificate to the Lync FE pool to which the phone edition device signs in (except for the public certificates listed in http://technet.microsoft.com/en-us/library/gg398270(OCS.14).aspx).
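The mismatch that breaks the chain comes down to comparing the two issuer names character by character (names copied from the outputs above):

```python
# Issuer of the certificate presented by the Lync FE pool:
fe_issuer = "Issuing CA for contoso,com"
# Issuer of the certificate presented by the Exchange CAS array:
cas_issuer = "Issuing-CA for contoso, DC=contoso, DC=com"

# The phone edition device trusts only the CA that issued the FE pool
# certificate, so any difference here makes EWS TLS validation fail.
print(fe_issuer == cas_issuer)  # False: EWS access fails with hr=80f10043
```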
RESULTS:
========
After issuing a new certificate to the CAS array from the same enterprise CA (Issuing CA for contoso,com) that also issued the Lync FE certificate, the integration problem was resolved.
Hope this helps you when dealing with similar problems...
In this blog entry, I will be discussing the TCP keepalive mechanism and will also provide some information about configuration options on Windows systems.
a) Definition
Let's first understand the mechanism. A TCP keep-alive packet is simply an ACK with the sequence number set to one less than the current sequence number for the connection. A host receiving one of these ACKs responds with an ACK for the current sequence number. Keep-alives can be used to verify that the computer at the remote end of a connection is still available. TCP keep-alives can be sent once every KeepAliveTime (defaults to 7,200,000 milliseconds or two hours) if no other data or higher-level keep-alives have been carried over the TCP connection. If there is no response to a keep-alive, it is repeated once every KeepAliveInterval seconds. KeepAliveInterval defaults to 1 second. NetBT connections, such as those used by other Microsoft networking components, send NetBIOS keep-alives more frequently, so normally no TCP keep-alives are sent on a NetBIOS connection. TCP keep-alives are disabled by default, but Windows Sockets applications can use the setsockopt() function to enable them.
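The "applications must opt in" point can be seen in a few lines of code. This sketch uses Python's socket module, which exposes the same setsockopt() option mentioned above:

```python
import socket

# Keep-alives are off by default; an application enables them per socket.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

keepalive_on = s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
print(keepalive_on)  # True: the OS will now probe this connection when idle
s.close()
```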
b) Configuration
Now let's talk a bit about configuration options. There are three registry values that affect the TCP keepalive mechanism on Windows systems:
KeepAliveInterval
Key: Tcpip\Parameters
Value Type: REG_DWORD (time in milliseconds)
Valid Range: 1-0xFFFFFFFF
Default: 1000 (one second)
Description: This parameter determines the interval between TCP keep-alive retransmissions until a response is received. Once a response is received, the delay until the next keep-alive transmission is again controlled by the value of KeepAliveTime. The connection is aborted after the number of retransmissions specified by TcpMaxDataRetransmissions have gone unanswered.
Notes: The TCPIP driver waits for a TCP keepalive ACK for the duration of time specified in this registry entry.
KeepAliveTime
Key: Tcpip\Parameters
Value Type: REG_DWORD (time in milliseconds)
Valid Range: 1-0xFFFFFFFF
Default: 7,200,000 (two hours)
Description: This parameter controls how often TCP attempts to verify that an idle connection is still intact by sending a keep-alive packet. If the remote system is still reachable and functioning, it acknowledges the keep-alive transmission. Keep-alive packets are not sent by default. This feature may be enabled on a connection by an application.
Notes: In order for a TCP session to stay idle, there should be no data sent or received.
c) If OS is Windows XP/2003 the following registry entry applies:
TcpMaxDataRetransmissions
Key: Tcpip\Parameters
Value Type: REG_DWORD (number)
Valid Range: 0-0xFFFFFFFF
Default: 5
Description: This parameter controls the number of times that TCP retransmits an individual data segment (not connection request segments) before aborting the connection. The retransmission time-out is doubled with each successive retransmission on a connection. It is reset when responses resume. The Retransmission Timeout (RTO) value is dynamically adjusted, using the historical measured round-trip time (Smoothed Round Trip Time) on each connection. The starting RTO on a new connection is controlled by the TcpInitialRtt registry value.
Notes: This registry entry determines the number of TCP retransmissions for an individual TCP segment. There's no dedicated registry entry for the retransmission behavior of TCP keepalives; this registry entry is used for the TCP keepalive scenario as well.
Important note: If the OS is Windows Vista/2008, the number of TCP keepalive attempts is hardcoded to 10 and cannot be adjusted via the registry.
d) Some special considerations
=> Even if the KeepAliveTime and KeepAliveInterval registry keys are set to specific values (the TCPIP driver uses the default values when these registry keys are not set), the TCPIP driver won't start sending TCP keepalives until keepalives are enabled via one of various methods at upper layers (layers above the TCPIP driver).
=> Native socket applications can enable TCP keepalives by using any of the following methods:
- setsockopt() with the SO_KEEPALIVE option
- WSAIoctl() with the SIO_KEEPALIVE_VALS option (it's also possible to change keepalive timers dynamically with this API call on a per-socket basis)
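As a sketch of per-socket tuning: on Windows this is done with WSAIoctl(SIO_KEEPALIVE_VALS), while on Linux the equivalent per-socket effect is achieved with the TCP_KEEPIDLE/TCP_KEEPINTVL/TCP_KEEPCNT socket options, which is what the portable example below uses (the timer values are arbitrary examples).

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

# Per-socket overrides of the system-wide timers (values in seconds):
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 300)  # idle time before first probe
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 1)   # interval between probes
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)     # unanswered probes before drop

idle_seconds = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE)
print(idle_seconds)  # 300
s.close()
```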
=> Managed (.NET) applications can use one of the following methods:
- The SetSocketOption method of the Socket class in the System.Net.Sockets namespace
- The GetSocketOption method of the Socket class in the System.Net.Sockets namespace
=> Effect of using keepalives on bandwidth usage
Since TCP keepalives are TCP segments without data (with the SEQ number set to one less than the current SEQ number), the bandwidth used by keepalives is negligible. Here's an example to give an idea of how big a TCP keepalive packet is:
- 14 bytes (L2 header, assuming Ethernet is used; this could be even lower for WAN protocols like PPP/HDLC/etc.)
- 20 bytes (IP header, assuming no IP options are used)
- 20 bytes (TCP header, assuming no TCP options are used)
Total: ~54 bytes
Even if the TCP keepalive interval is set to 5 minutes or so (the default is 2 hours), given that the TCP connection goes idle, the TCPIP driver will send a ~54-byte TCP keepalive message every 5 minutes, which, as you can see, is negligible.
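The arithmetic above can be checked directly:

```python
# Header sizes from the list above (Ethernet, no IP/TCP options):
l2_header, ip_header, tcp_header = 14, 20, 20
keepalive_bytes = l2_header + ip_header + tcp_header
print(keepalive_bytes)  # 54

# With a (fairly aggressive) 5-minute keepalive interval, counting both
# the probe and its similarly sized ACK:
probes_per_hour = 3600 // (5 * 60)
bytes_per_hour = 2 * keepalive_bytes * probes_per_hour
print(bytes_per_hour)  # 1296 bytes per hour per idle connection
```

Roughly 1.3 KB per hour per idle connection, which supports the "can simply be neglected" conclusion.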
You may also find some references below:
References
=============================
http://www.microsoft.com/downloads/details.aspx?FamilyID=06c60bfe-4d37-4f50-8587-8b68d32fa6ee&displaylang=en Microsoft Windows Server 2003 TCP/IP Implementation Details
http://www.microsoft.com/downloads/details.aspx?FamilyId=12AC9780-17B5-480C-AEF7-5C0BDE9060B0&displaylang=en TCP/IP Registry Values for Microsoft Windows Vista and Windows Server 2008
http://msdn.microsoft.com/en-us/library/ms740476(VS.85).aspx setsockopt Function
http://msdn.microsoft.com/en-us/library/ms741621(VS.85).aspx WSAIoctl Function
http://msdn.microsoft.com/en-us/library/system.net.sockets.socket.setsocketoption.aspx Socket.SetSocketOption Method
http://msdn.microsoft.com/en-us/library/system.net.sockets.socket.getsocketoption.aspx Socket.GetSocketOption Method
Hope this helps.
Recently I dealt with a connectivity issue that occurs when deploying OS images from a Windows 2008 R2 WDS server to PXE clients running in different subnets. As you may already know, there is a workaround for router MTU size incompatibilities seen when deploying OS images to remote subnets in Windows 2008:
http://support.microsoft.com/kb/975710 Operating system deployment over a network by using WDS fails in Windows Server 2008
The same router packet drop issue was present in my case. Even though we configured the MaximumBlockSize registry key on the Windows 2008 R2 WDS server, the issue was still present, and the TFTP server (part of the WDS server) was still sending data in chunks of the size requested by the client rather than the size we tried to set with the MaximumBlockSize registry key:
Note:
10.1.1.1 is the PXE client
10.2.2.2 is the TFTP server
=> The client is asking for a TFTP block size of 1456 bytes and the TFTP server (running as a part of the WDS server) is honoring that request:
10.1.1.1 10.2.2.2 TFTP Read Request, File: boot\x86\wdsnbp.com\000, Transfer type: octet\000, blksize\000=1456\000
10.2.2.2 10.1.1.1 TFTP Option Acknowledgement, blksize\000=1456\000
=> Then TFTP server begins sending the first 1456 bytes block of wdsnbp.com file:
10.2.2.2 10.1.1.1 TFTP Data Packet, Block: 1
=> Most likely because the router drops that packet, the TFTP client doesn’t receive it and hence re-sends the read request once more:
10.1.1.1 10.2.2.2 TFTP Read Request, File: boot\x86\wdsnbp.com\000, Transfer type: octet\000, blksize\000=1456\000
=> And TFTP server sends the same block once more and tries a number of times:
10.2.2.2 10.1.1.1 TFTP Data Packet, Block: 1
10.2.2.2 10.1.1.1 TFTP Data Packet, Block: 1
10.2.2.2 10.1.1.1 TFTP Data Packet, Block: 1
10.2.2.2 10.1.1.1 TFTP Data Packet, Block: 1
...
So the PXE boot fails in the end.
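To see why the negotiated block size matters: the datagram carrying one TFTP DATA block is the block itself plus the TFTP, UDP, and IP headers. A quick sketch of the arithmetic (header sizes assume IPv4 with no options):

```python
TFTP_DATA_HEADER = 4  # 2-byte opcode + 2-byte block number
UDP_HEADER = 8
IP_HEADER = 20

def tftp_datagram_size(blksize: int) -> int:
    # Size of the IP datagram carrying one full TFTP DATA block.
    return blksize + TFTP_DATA_HEADER + UDP_HEADER + IP_HEADER

print(tftp_datagram_size(1456))  # 1488: fits a 1500-byte MTU, but a router
                                 # with a smaller MTU can drop it
print(tftp_datagram_size(512))   # 544: TFTP's conservative default block size
```

This is exactly what the MaximumBlockSize workaround targets: capping the negotiated block size so the resulting datagram fits the smallest MTU on the path.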
The analysis showed us that the registry key didn't take effect on the Windows 2008 R2 WDS server. After a source code review, I saw that the feature wasn't integrated into the Windows 2008 R2 RTM code due to the release timing of Windows 2008 R2 relative to the fix for Windows 2008. I verified with internal resources and can say that the feature will be part of Windows 2008 R2 SP1. The SP1 public beta is being made available by the end of July:
http://blogs.technet.com/b/itproaustralia/archive/2010/06/08/windows-7-and-windows-server-2008-r2-sp1-beta-available-end-of-july.aspx Windows 7 and Windows Server 2008 R2 SP1 Beta available end of July
In this post, I’m going to talk about another issue where I helped a colleague of mine troubleshoot an SCCM package distribution scenario. The problem was that package distribution to clients was visibly slower compared to standard file copy methods (like xcopy, Windows Explorer, etc.). In the given setup, the SCCM client was accessing and retrieving the distribution package via the SMB protocol, so BITS was out of the picture. We asked the customer to collect the following logs while reproducing the problem:
a) Create a distribution package which simply includes a 100 MB executable file
b) Collect the following logs for two different scenarios:
=> For standard file copy scenario:
- Start Network traces on the SCCM server (Windows 2008 R2) and the SCCM agent (Windows 7)
- Start Process Monitor on the SCCM agent
- Start file copy by using xcopy from a command prompt on Windows 7 client
=> For SCCM package distribution scenario:
- Trigger package distribution
After the above logs were collected, I analyzed the network traces and Process Monitor logs for both scenarios. Let's take a closer look at each scenario:
A. SCCM PACKAGE DISTRIBUTION SCENARIO
The package download activity was seen as below in Process Monitor:
- Ccmexec posts about 4900 ReadFile() calls, each with a 64 KB buffer
- This is also supported by the behavior seen in the network trace collected for the ccmexec scenario:
No. Time Source Destination Info Protocol
16475 0.005513 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:16121856 File: TEST\100MBFile.txt SMB2
16476 0.000013 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:16187392 File: TEST\100MBFile.txt SMB2
16478 0.001872 192.168.2.77 192.168.1.7 Read Response SMB2
16538 0.005313 192.168.2.77 192.168.1.7 Read Response SMB2
16603 0.080443 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:16252928 File: TEST\100MBFile.txt SMB2
16604 0.000013 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:16318464 File: TEST\100MBFile.txt SMB2
16606 0.001229 192.168.2.77 192.168.1.7 Read Response SMB2
16666 0.005312 192.168.2.77 192.168.1.7 Read Response SMB2
16730 0.005827 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:16384000 File: TEST\100MBFile.txt SMB2
16731 0.000013 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:16449536 File: TEST\100MBFile.txt SMB2
16733 0.001193 192.168.2.77 192.168.1.7 Read Response SMB2
16795 0.005643 192.168.2.77 192.168.1.7 Read Response SMB2
16856 0.070364 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:16515072 File: TEST\100MBFile.txt SMB2
16857 0.000013 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:16580608 File: TEST\100MBFile.txt SMB2
16859 0.001037 192.168.2.77 192.168.1.7 Read Response SMB2
16919 0.005313 192.168.2.77 192.168.1.7 Read Response SMB2
16982 0.005789 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:16646144 File: TEST\100MBFile.txt SMB2
16983 0.000014 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:16711680 File: TEST\100MBFile.txt SMB2
16985 0.001043 192.168.2.77 192.168.1.7 Read Response SMB2
17045 0.005312 192.168.2.77 192.168.1.7 Read Response SMB2
17108 0.048421 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:16777216 File: TEST\100MBFile.txt SMB2
17109 0.000019 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:16842752 File: TEST\100MBFile.txt SMB2
17111 0.002061 192.168.2.77 192.168.1.7 Read Response SMB2
17171 0.005311 192.168.2.77 192.168.1.7 Read Response SMB2
17236 0.055958 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:16908288 File: TEST\100MBFile.txt SMB2
17237 0.000015 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:16973824 File: TEST\100MBFile.txt SMB2
17239 0.002242 192.168.2.77 192.168.1.7 Read Response SMB2
17300 0.005311 192.168.2.77 192.168.1.7 Read Response SMB2
Note: IP addresses are replaced for privacy purposes
B. STANDARD FILE COPY SCENARIO
The standard file copy with xcopy was seen as below in Process Monitor:
- The xcopy tool posts only 100 ReadFile() calls, each with a 1 MB buffer
- This is also seen in the network trace collected for the xcopy scenario:
5445 2010-09-21 15:59:29.436686 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:12582912 File: xcopytest\100MBFile.txt SMB2
5446 2010-09-21 15:59:29.436701 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:12648448 File: xcopytest\100MBFile.txt SMB2
5447 2010-09-21 15:59:29.436712 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:12713984 File: xcopytest\100MBFile.txt SMB2
5448 2010-09-21 15:59:29.436723 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:12779520 File: xcopytest\100MBFile.txt SMB2
5449 2010-09-21 15:59:29.436735 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:12845056 File: xcopytest\100MBFile.txt SMB2
5450 2010-09-21 15:59:29.436748 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:12910592 File: xcopytest\100MBFile.txt SMB2
5451 2010-09-21 15:59:29.436760 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:12976128 File: xcopytest\100MBFile.txt SMB2
5452 2010-09-21 15:59:29.436772 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:13041664 File: xcopytest\100MBFile.txt SMB2
5453 2010-09-21 15:59:29.436784 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:13107200 File: xcopytest\100MBFile.txt SMB2
5457 2010-09-21 15:59:29.436798 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:13172736 File: xcopytest\100MBFile.txt SMB2
5458 2010-09-21 15:59:29.436813 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:13238272 File: xcopytest\100MBFile.txt SMB2
5459 2010-09-21 15:59:29.436824 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:13303808 File: xcopytest\100MBFile.txt SMB2
5460 2010-09-21 15:59:29.436835 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:13369344 File: xcopytest\100MBFile.txt SMB2
5461 2010-09-21 15:59:29.436845 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:13434880 File: xcopytest\100MBFile.txt SMB2
5462 2010-09-21 15:59:29.436857 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:13500416 File: xcopytest\100MBFile.txt SMB2
5463 2010-09-21 15:59:29.436869 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:13565952 File: xcopytest\100MBFile.txt SMB2
5509 2010-09-21 15:59:29.441113 192.168.2.77 192.168.1.7 Read Response SMB2
5572 2010-09-21 15:59:29.446773 192.168.2.77 192.168.1.7 Read Response SMB2
5632 2010-09-21 15:59:29.452104 192.168.2.77 192.168.1.7 [Unreassembled Packet] SMB2
5694 2010-09-21 15:59:29.457766 192.168.2.77 192.168.1.7 Read Response SMB2
5755 2010-09-21 15:59:29.463095 192.168.2.77 192.168.1.7 Read Response SMB2
5817 2010-09-21 15:59:29.468755 192.168.2.77 192.168.1.7 Read Response SMB2
5878 2010-09-21 15:59:29.474076 192.168.2.77 192.168.1.7 Read Response SMB2
5940 2010-09-21 15:59:29.479738 192.168.2.77 192.168.1.7 Read Response SMB2
6002 2010-09-21 15:59:29.485400 192.168.2.77 192.168.1.7 Read Response SMB2
6063 2010-09-21 15:59:29.490729 192.168.2.77 192.168.1.7 Read Response SMB2
6125 2010-09-21 15:59:29.496387 192.168.2.77 192.168.1.7 Read Response SMB2
6187 2010-09-21 15:59:29.502044 192.168.2.77 192.168.1.7 Read Response SMB2
6248 2010-09-21 15:59:29.507367 192.168.2.77 192.168.1.7 Read Response SMB2
6310 2010-09-21 15:59:29.513024 192.168.2.77 192.168.1.7 Read Response SMB2
6372 2010-09-21 15:59:29.518677 192.168.2.77 192.168.1.7 Read Response SMB2
6433 2010-09-21 15:59:29.523999 192.168.2.77 192.168.1.7 Read Response SMB2
6447 2010-09-21 15:59:29.525133 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:13631488 File: xcopytest\100MBFile.txt SMB2
6448 2010-09-21 15:59:29.525148 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:13697024 File: xcopytest\100MBFile.txt SMB2
6449 2010-09-21 15:59:29.525159 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:13762560 File: xcopytest\100MBFile.txt SMB2
6450 2010-09-21 15:59:29.525170 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:13828096 File: xcopytest\100MBFile.txt SMB2
6451 2010-09-21 15:59:29.525183 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:13893632 File: xcopytest\100MBFile.txt SMB2
6452 2010-09-21 15:59:29.525196 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:13959168 File: xcopytest\100MBFile.txt SMB2
6453 2010-09-21 15:59:29.525207 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:14024704 File: xcopytest\100MBFile.txt SMB2
6454 2010-09-21 15:59:29.525219 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:14090240 File: xcopytest\100MBFile.txt SMB2
6455 2010-09-21 15:59:29.525231 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:14155776 File: xcopytest\100MBFile.txt SMB2
6456 2010-09-21 15:59:29.525243 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:14221312 File: xcopytest\100MBFile.txt SMB2
6457 2010-09-21 15:59:29.525255 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:14286848 File: xcopytest\100MBFile.txt SMB2
6458 2010-09-21 15:59:29.525267 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:14352384 File: xcopytest\100MBFile.txt SMB2
6459 2010-09-21 15:59:29.525280 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:14417920 File: xcopytest\100MBFile.txt SMB2
6460 2010-09-21 15:59:29.525292 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:14483456 File: xcopytest\100MBFile.txt SMB2
6461 2010-09-21 15:59:29.525304 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:14548992 File: xcopytest\100MBFile.txt SMB2
6462 2010-09-21 15:59:29.525316 192.168.1.7 192.168.2.77 Read Request Len:65536 Off:14614528 File: xcopytest\100MBFile.txt SMB2
6511 2010-09-21 15:59:29.529653 192.168.2.77 192.168.1.7 Read Response SMB2
6573 2010-09-21 15:59:29.534977 192.168.2.77 192.168.1.7 Read Response SMB2
6635 2010-09-21 15:59:29.540629 192.168.2.77 192.168.1.7 Read Response SMB2
6697 2010-09-21 15:59:29.546286 192.168.2.77 192.168.1.7 Read Response SMB2
6758 2010-09-21 15:59:29.551606 192.168.2.77 192.168.1.7 [Unreassembled Packet] SMB2
6821 2010-09-21 15:59:29.557255 192.168.2.77 192.168.1.7 Read Response SMB2
6883 2010-09-21 15:59:29.562576 192.168.2.77 192.168.1.7 Read Response SMB2
6945 2010-09-21 15:59:29.568234 192.168.2.77 192.168.1.7 Read Response SMB2
7007 2010-09-21 15:59:29.573893 192.168.2.77 192.168.1.7 Read Response SMB2
7068 2010-09-21 15:59:29.579219 192.168.2.77 192.168.1.7 Read Response SMB2
7130 2010-09-21 15:59:29.584876 192.168.2.77 192.168.1.7 Read Response SMB2
7192 2010-09-21 15:59:29.590530 192.168.2.77 192.168.1.7 Read Response SMB2
7253 2010-09-21 15:59:29.595858 192.168.2.77 192.168.1.7 Read Response SMB2
7315 2010-09-21 15:59:29.601517 192.168.2.77 192.168.1.7 Read Response SMB2
7377 2010-09-21 15:59:29.607173 192.168.2.77 192.168.1.7 Read Response SMB2
7438 2010-09-21 15:59:29.612499 192.168.2.77 192.168.1.7 Read Response SMB2
7500 2010-09-21 15:59:29.618155 192.168.2.77 192.168.1.7 Read Response SMB2
7561 2010-09-21 15:59:29.623478 192.168.2.77 192.168.1.7 Read Response SMB2
7623 2010-09-21 15:59:29.629132 192.168.2.77 192.168.1.7 Read Response SMB2
7685 2010-09-21 15:59:29.634785 192.168.2.77 192.168.1.7 Read Response SMB2
7746 2010-09-21 15:59:29.640111 192.168.2.77 192.168.1.7 Read Response SMB2
7808 2010-09-21 15:59:29.645771 192.168.2.77 192.168.1.7 Read Response SMB2
7871 2010-09-21 15:59:29.651433 192.168.2.77 192.168.1.7 Read Response SMB2
7932 2010-09-21 15:59:29.656750 192.168.2.77 192.168.1.7 Read Response SMB2
7996 2010-09-21 15:59:29.662406 192.168.2.77 192.168.1.7 Read Response SMB2
8058 2010-09-21 15:59:29.667728 192.168.2.77 192.168.1.7 Read Response SMB2
8120 2010-09-21 15:59:29.673385 192.168.2.77 192.168.1.7 Read Response SMB2
8182 2010-09-21 15:59:29.679045 192.168.2.77 192.168.1.7 Read Response SMB2
8243 2010-09-21 15:59:29.684367 192.168.2.77 192.168.1.7 Read Response SMB2
Note: The above 16 x 64 KB = 1 MB read requests were actually generated from single 1 MB read requests issued at the application layer (by xcopy)
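To make that 16-to-1 relationship concrete, here is a small sketch (function and variable names are mine, not from any trace tooling) that splits one application-layer read into the SMB2 wire-level requests, reproducing the offsets of frames 6447-6462 above:

```python
SMB2_MAX_READ = 64 * 1024       # per-request read size seen in the trace
APP_READ = 1024 * 1024          # xcopy's application-layer read buffer

def smb_read_requests(app_offset, app_len, max_read=SMB2_MAX_READ):
    """Split one application read into the (offset, length) wire requests."""
    return [(app_offset + o, min(max_read, app_len - o))
            for o in range(0, app_len, max_read)]

requests = smb_read_requests(13631488, APP_READ)
print(len(requests))              # 16 pipelined 64 KB reads
print(requests[0], requests[-1])  # (13631488, 65536) (14614528, 65536)
```

The sixteen offsets step by 65536 from 13631488 to 14614528, exactly as in the request burst captured above.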
SUMMARY:
=========
The performance difference between SCCM package distribution and xcopy stems from the fact that xcopy (and most probably Windows Explorer as well) posts read requests with larger buffers (1 MB) than the SCCM agent process, ccmexec (64 KB). The larger reads perform better in the xcopy scenario because more requests are in flight concurrently and the network bandwidth is utilized more effectively. This is visible both in the network trace and in the Process Monitor activity. We shared the results with our SCCM colleagues to see whether that behaviour could be changed; if I receive any update on that, I'll update this post.
In this blog post, I'll be talking about an interesting problem that I dealt with recently. Clients in a certain VLAN were not able to establish HTTPS connections through the TMG server. Due to the nature of the network, the clients had to be configured as SecureNAT clients (my customer could not configure them as web proxy clients or deploy the TMG client software because these machines are guest machines).
I asked for the usual data from our customer to find out what was happening during the problem:
- Network trace & HTTPWatch logs from a test client
- TMG data packager from the TMG server
1. After receiving the data, I started with the client side. There, the connection starts failing right after the initial TCP 3-way handshake:
Note: For privacy purposes, all IP addresses have been replaced with random addresses.
=> Client network trace:
(ip.addr eq 10.110.233.50 and ip.addr eq 172.16.1.1) and (tcp.port eq 50183 and tcp.port eq 443)
453 14:44:32 01.05.2012 27.1395045 0.0000000 10.110.233.50 172.16.1.1 TCP TCP: [Bad CheckSum]Flags=......S., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=0, Seq=367515135, Ack=0, Win=8192 ( Negotiating scale factor 0x2 ) = 8192
456 14:44:32 01.05.2012 27.1565249 0.0170204 172.16.1.1 10.110.233.50 TCP TCP:Flags=...A..S., SrcPort=HTTPS(443), DstPort=50183, PayloadLen=2, Seq=2180033885 - 2180033888, Ack=367515136, Win=64240 ( Scale factor not supported ) = 64240
457 14:44:32 01.05.2012 27.1565443 0.0000194 10.110.233.50 172.16.1.1 TCP TCP: [Bad CheckSum]Flags=...A...., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=0, Seq=367515136, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860
458 14:44:32 01.05.2012 27.1585176 0.0019733 10.110.233.50 172.16.1.1 TLS TLS:TLS Rec Layer-1 HandShake: Client Hello.
465 14:44:32 01.05.2012 27.4744962 0.3159786 10.110.233.50 172.16.1.1 TCP TCP:[ReTransmit #458] [Bad CheckSum]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860
476 14:44:33 01.05.2012 28.0828875 0.6083913 10.110.233.50 172.16.1.1 TCP TCP:[ReTransmit #458] [Bad CheckSum]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860
494 14:44:34 01.05.2012 29.2948179 1.2119304 10.110.233.50 172.16.1.1 TCP TCP:[ReTransmit #458]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860
512 14:44:35 01.05.2012 30.3815127 1.0866948 172.16.1.1 10.110.233.50 TLS TLS:Continued Data: 2 Bytes
513 14:44:35 01.05.2012 30.3815385 0.0000258 10.110.233.50 172.16.1.1 TCP TCP:Flags=...A...., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=0, Seq=367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860
516 14:44:35 01.05.2012 30.5019345 0.1203960 10.110.233.50 172.16.1.1 TCP TCP:[ReTransmit #458]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860
535 14:44:37 01.05.2012 31.7018989 1.1999644 10.110.233.50 172.16.1.1 TCP TCP:[ReTransmit #458] [Bad CheckSum]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860
568 14:44:38 01.05.2012 33.1872833 1.4853844 172.16.1.1 10.110.233.50 TLS TLS:Continued Data: 6 Bytes
- The client cannot connect to the remote web site because the SSL/TLS negotiation doesn't succeed: from the client's perspective, no response is received from the web server.
2. Then I decided to check things from the TMG server's perspective. I first checked the network trace collected on the internal interface of the TMG server, through which the client request was received:
6457 14:33:49 01.05.2012 39.8915584 0.0000000 10.110.233.50 172.16.1.1 TCP TCP:Flags=......S., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=0, Seq=367515135, Ack=0, Win=8192 ( Negotiating scale factor 0x2 ) = 8192
6461 14:33:49 01.05.2012 39.9079944 0.0164360 172.16.1.1 10.110.233.50 TCP TCP:Flags=...A..S., SrcPort=HTTPS(443), DstPort=50183, PayloadLen=0, Seq=2180033885, Ack=367515136, Win=64240 ( Scale factor not supported ) = 64240
6462 14:33:49 01.05.2012 39.9084181 0.0004237 10.110.233.50 172.16.1.1 TCP TCP:Flags=...A...., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=0, Seq=367515136, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860
6463 14:33:49 01.05.2012 39.9103936 0.0019755 10.110.233.50 172.16.1.1 TLS TLS:TLS Rec Layer-1 HandShake: Client Hello.
6489 14:33:49 01.05.2012 40.2259991 0.3156055 10.110.233.50 172.16.1.1 TCP TCP:[ReTransmit #6463]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860
6614 14:33:50 01.05.2012 40.8343158 0.6083167 10.110.233.50 172.16.1.1 TCP TCP:[ReTransmit #6463]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860
6806 14:33:51 01.05.2012 42.0457229 1.2114071 10.110.233.50 172.16.1.1 TCP TCP:[ReTransmit #6463]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860
6872 14:33:52 01.05.2012 43.1311020 1.0853791 172.16.1.1 10.110.233.50 TCP TCP:Flags=...A..S., SrcPort=HTTPS(443), DstPort=50183, PayloadLen=0, Seq=2180033885, Ack=367515136, Win=64240 ( Scale factor not supported ) = 64240
6873 14:33:52 01.05.2012 43.1314010 0.0002990 10.110.233.50 172.16.1.1 TCP TCP:Flags=...A...., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=0, Seq=367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860
6891 14:33:52 01.05.2012 43.2517820 0.1203810 10.110.233.50 172.16.1.1 TCP TCP:[ReTransmit #6463]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860
7110 14:33:54 01.05.2012 44.4514449 1.1996629 10.110.233.50 172.16.1.1 TCP TCP:[ReTransmit #6463]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860
7481 14:33:55 01.05.2012 45.9360680 1.4846231 172.16.1.1 10.110.233.50 TCP TCP:Flags=...A.R.., SrcPort=HTTPS(443), DstPort=50183, PayloadLen=0, Seq=2180033886, Ack=367515136, Win=8192 (scale factor 0x0) = 8192
- The TMG server doesn't send any responses back to the client
3. Then I decided to check the network trace collected on the external interface of the TMG server:
21 14:33:49 01.05.2012 39.6940232 0.0000000 10.110.235.202 172.16.1.1 TCP TCP:Flags=......S., SrcPort=55073, DstPort=HTTPS(443), PayloadLen=0, Seq=367515135, Ack=0, Win=8192 ( Negotiating scale factor 0x2 ) = 8192
22 14:33:49 01.05.2012 39.7097097 0.0156865 172.16.1.1 10.110.235.202 TCP TCP:Flags=...A..S., SrcPort=HTTPS(443), DstPort=55073, PayloadLen=0, Seq=2180033885, Ack=367515136, Win=64240 ( Scale factor not supported ) = 64240
23 14:33:52 01.05.2012 42.9329903 3.2232806 172.16.1.1 10.110.235.202 TCP TCP:Flags=...A..S., SrcPort=HTTPS(443), DstPort=55073, PayloadLen=0, Seq=2180033885, Ack=367515136, Win=64240 ( Scale factor not supported ) = 64240
24 14:33:55 01.05.2012 45.7379523 2.8049620 172.16.1.1 10.110.235.202 TCP TCP:Flags=...A.R.., SrcPort=HTTPS(443), DstPort=55073, PayloadLen=0, Seq=2180033886, Ack=367515136, Win=8192 (scale factor 0x0) = 8192
- The external web server responds to the TCP 3-way handshake request forwarded by TMG, but the TMG server doesn't proceed with the connection
4. Then I checked the Web Proxy/Firewall log on the TMG server:
01.05.2012 14:33 Denied 10.110.233.50 50183 172.16.1.1 443 0xc0040034 FWX_E_SEQ_ACK_MISMATCH
When I checked the details of that error code, I saw that the connection fails because TMG receives a TCP packet with an invalid sequence or acknowledgement number:
http://msdn.microsoft.com/en-us/library/ms812624.aspx/
FWX_E_SEQ_ACK_MISMATCH 0xC0040034 A TCP packet was rejected because it has an invalid sequence number or an invalid acknowledgement number.
So the TMG server drops the TCP ACK packet (the 3rd packet of the TCP 3-way handshake) coming from the client because it carries an invalid TCP ACK number.
5. The problem was also visible in the ETL trace:
…
… handshake packet is dropped becuase ACK (2180033888) no equal ISN(peer)+1 (2180033886)
… Warning:The packet failed TCP sequence validation
… Warning:The packet is dropped because of SEQ_ACK_MISMATCH
6. The mismatch is also visible in the TMG-side network trace:
Client SYN:    SequenceNumber: 367515135 (0x15E7D5FF)
TMG SYN/ACK:   SequenceNumber: 2180033885 (0x81F0AD5D)
               AcknowledgementNumber: 367515136 (0x15E7D600)
Client ACK:    SequenceNumber: 367515136 (0x15E7D600)
               AcknowledgementNumber: 2180033888 (0x81F0AD60)
That acknowledgement number SHOULD HAVE BEEN 2180033886 (0x81F0AD5E)
So TMG ignores the rest of the session (like TLS client hello coming from the client machine)
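The arithmetic behind the validation is simple to sketch: the ACK for a SYN/ACK should be the peer's ISN plus one (for the SYN flag), and the stray 2 payload bytes on this SYN/ACK push the client's ACK 2 higher. A minimal illustration with the numbers from this trace (not TMG's actual code):

```python
def expected_handshake_ack(isn, payload_len=0):
    """ACK a client should send for a SYN/ACK: ISN + 1 (SYN flag) + payload bytes."""
    return (isn + 1 + payload_len) % 2**32

server_isn = 2180033885
print(expected_handshake_ack(server_isn))                 # 2180033886 - what TMG expects
print(expected_handshake_ack(server_isn, payload_len=2))  # 2180033888 - what the client sent
```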
7. When I check the client-side trace, I see that the ACK number in the client's TCP ACK packet is indeed set to the wrong value (2180033888):
Frame: Number = 6462, Captured Frame Length = 60, MediaType = ETHERNET
+ Ethernet: Etype = Internet IP (IPv4),DestinationAddress:[00-11-22-33-44-55],SourceAddress:[00-12-34-56-78-1B]
+ Ipv4: Src = 10.110.233.50, Dest = 172.16.1.1, Next Protocol = TCP, Packet ID = 24041, Total IP Length = 40
- Tcp: Flags=...A...., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=0, Seq=367515136, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860
SrcPort: 50183
DstPort: HTTPS(443)
+ DataOffset: 80 (0x50)
+ Flags: ...A....
Window: 64860 (scale factor 0x0) = 64860
Checksum: 0x4B5D, Good
UrgentPointer: 0 (0x0)
8. One might think this is a problem with the TCP/IP stack on the client, but when we check the TCP SYN/ACK packet (the second packet of the handshake, sent by the TMG server before the client sends the TCP ACK with the wrong acknowledgement number), we see that the client receives that SYN/ACK with 2 extra bytes of data. This is unusual: a TCP SYN/ACK packet shouldn't carry any payload, just the TCP header:
Frame: Number = 456, Captured Frame Length = 60, MediaType = ETHERNET
+ DestinationAddress: Microsoft Corporation [00-11-22-33-44-55]
+ SourceAddress: Test Data [00-12-34-56-78-1B]
EthernetType: Internet IP (IPv4), 2048(0x800)
- Ipv4: Src = 172.16.1.1, Dest = 10.110.233.50, Next Protocol = TCP, Packet ID = 1342, Total IP Length = 46
+ Versions: IPv4, Internet Protocol; Header Length = 20
+ DifferentiatedServicesField: DSCP: 0, ECN: 0
TotalLength: 46 (0x2E)
Identification: 1342 (0x53E)
+ FragmentFlags: 0 (0x0)
TimeToLive: 120 (0x78)
NextProtocol: TCP, 6(0x6)
Checksum: 46957 (0xB76D)
SourceAddress: 194.53.208.72
DestinationAddress: 10.110.233.50
- Tcp: Flags=...A..S., SrcPort=HTTPS(443), DstPort=50183, PayloadLen=2, Seq=2180033885 - 2180033888, Ack=367515136, Win=64240 ( Scale factor not supported ) = 64240
SrcPort: HTTPS(443)
DstPort: 50183
+ DataOffset: 96 (0x60)
+ Flags: ...A..S.
Window: 64240 ( Scale factor not supported ) = 64240
Checksum: 0x365C, Good
- TCPOptions:
- MaxSegmentSize: 1
type: Maximum Segment Size. 2(0x2)
OptionLength: 4 (0x4)
MaxSegmentSize: 1380 (0x564)
- TCPPayload: SourcePort = 443, DestinationPort = 50183
UnknownData: Binary Large Object (2 Bytes)
That's why the TCP/IP stack on the client sends a TCP ACK number which is 2 higher than it should be:
ACK number sent by the client: AcknowledgementNumber: 2180033888 (0x81F0AD60)
The correct ACK number that should have been sent: AcknowledgementNumber: 2180033886 (0x81F0AD5E)
9. And when we check the TCP SYN/ACK packet leaving the TMG server, we don't see those extra 2 bytes:
Frame: Number = 6461, Captured Frame Length = 60, MediaType = ETHERNET
+ Ipv4: Src = 172.16.1.1, Dest = 10.110.233.50, Next Protocol = TCP, Packet ID = 1342, Total IP Length = 44
- Tcp: Flags=...A..S., SrcPort=HTTPS(443), DstPort=50183, PayloadLen=0, Seq=2180033885, Ack=367515136, Win=64240 ( Scale factor not supported ) = 64240
Checksum: 0x365E, Good
So those 2 extra bytes are added to the TCP SYN/ACK by something else: the NIC driver on the TMG server, a network device running in between (such as a wireless access point), or the NIC driver on the client machine.
==========
In summary, the HTTPS connectivity problem stems from an issue between the client and the TMG server (including the NIC layer or below on both machines and the network devices/links between the two).
My customer informed me that the issue was visible with all clients, which makes a client-side issue unlikely. I advised my customer to update the NIC drivers on the TMG server, check the network devices in the path, and upgrade their firmware where possible.
[Updated on 26th October 2013]
The following blog post is the newer version of this blog post:
http://blogs.technet.com/b/nettracer/archive/2013/10/12/decrypting-ssl-tls-sessions-with-wireshark-reloaded.aspx
In this blog post, I would like to talk about decrypting SSL/TLS sessions with Wireshark, provided that you have access to the server certificate's private key. In some cases it can be quite useful for troubleshooting to see what is exchanged under the hood of an SSL/TLS session. You'll find the complete steps for doing this on Windows systems below. Even though there are a couple of documents around (you can find the references at the end of this post), no single document's steps fully apply and you get stuck at some point. I tested the following steps a couple of times on a Windows 2008 server and they work fine.
Here are the details of the process:
First of all, we'll need the following tools (at least, these are the versions I tested with):
http://www.wireshark.org/download.html
Wireshark -> Version 1.2.8
http://www.slproweb.com/products/Win32OpenSSL.html
(Win32 OpenSSL v1.0.0.a Light)
openssl -> 1.0.0a
1) We first need to export the certificate that is used by the server side of the SSL/TLS session:
Note: The Certificate Export Wizard can be started by right-clicking the related certificate in the Certificates MMC snap-in and selecting "All Tasks > Export".
2) In the second stage, we need to convert the private key from PKCS#12 format to the PEM format used by Wireshark, in two steps with the openssl tool:
c:\OpenSSL-Win32\bin> openssl pkcs12 -nodes -in iis.pfx -out key.pem -nocerts
Enter Import Password: <<Password used when exporting the certificate in PKCS12 format>>
c:\OpenSSL-Win32\bin> openssl rsa -in key.pem -out keyout.pem
writing RSA key
=> After the last command, the output file "keyout.pem" should look like this:
-----BEGIN RSA PRIVATE KEY-----
jffewjlkfjelkjfewlkjfew.....
akfhakdfhsakfskahfksjhgkjsah
-----END RSA PRIVATE KEY-----
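If you want to sanity-check the converted file before feeding it to Wireshark, a trivial check of the PEM framing is enough. The sketch below is my own helper (the sample key body is placeholder text, not a real key):

```python
def looks_like_unencrypted_rsa_pem(text):
    """Check for the plain RSA PEM framing that Wireshark's SSL decryption expects."""
    lines = text.strip().splitlines()
    return (len(lines) >= 3
            and lines[0] == "-----BEGIN RSA PRIVATE KEY-----"
            and lines[-1] == "-----END RSA PRIVATE KEY-----"
            and "ENCRYPTED" not in text)

sample = ("-----BEGIN RSA PRIVATE KEY-----\n"
          "MIIEpAIBAAKC...\n"
          "-----END RSA PRIVATE KEY-----")
print(looks_like_unencrypted_rsa_pem(sample))  # True
```

If the file still contains an "ENCRYPTED" header, the `openssl rsa` step above was skipped or failed.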
3) Now we can use the private key file in Wireshark as given below:
Note: The following dialog box is reached by selecting Edit > Preferences, then choosing "Protocols" in the left pane and selecting SSL underneath it:
Notes:
- 172.17.1.1 is the server IP address. This is the server using the certificate whose private key we extracted.
- 443 is the TCP port at the server side.
- http is the protocol carried inside the SSL/TLS session
- c:\tls\keyout.pem is the name of the file which includes the converted private key
- c:\tls\debug2.txt is the name of the file which includes information about the decryption process
4) Once all is ready, you can click “Apply” to start the decryption process. Wireshark will show you the packets in the given session in an unencrypted fashion. Here is the difference between the encrypted and unencrypted versions:
a) How it is seen before Wireshark decrypts SSL/TLS session:
b) How it is seen after Wireshark decrypts SSL/TLS session:
5) Since the private key of a certificate can be considered equivalent to a password, we can't ask customers for it: we're troubleshooting a problem on their behalf, not in our own environment. The following alternatives can be used in that case:
Note: It appears that a capture file decrypted using the private key cannot be saved as a new capture file in unencrypted form.
- After decrypting the traffic, we could examine it in a live meeting session where the customer shares his desktop
- The decrypted packets could be printed to a file from File > Print option (by choosing the “Output to file” option)
- By right clicking one of the decrypted packets and selecting “Follow SSL Stream”, we can save the session content to a text file. The following is an example of such a file created that way:
6) More information could be found at the following links:
Citrix
http://support.citrix.com/article/CTX116557
Wireshark
http://wiki.wireshark.org/SSL
In this blog post, I’ll talk about another network trace analysis scenario.
The problem was that some Windows XP clients were copying files from a NAS device very slowly compared to others. As a network trace is one of the most useful logs for troubleshooting such problems, I requested one collected on a problematic Windows XP client. Normally it's best to collect simultaneous traces on both sides, but it was a bit difficult to capture at the NAS device, so we were limited to a client-side trace.
Before I start explaining how I got to the bottom of the issue, I would like to provide you with some background on how files are read by Windows via SMB protocol so that you’ll better understand the resolution part:
Windows XP and Windows 2003 use SMB v1 protocol for remote file system access (like creating/reading/writing/deleting/locking files over a network connection). Since it was a file read from the remote server in this scenario, the following SMB activity would be seen between the client and server:
Client Server
===== ======
The client will open the file at the server first:
SMB Create AndX request ---->
<---- SMB Create AndX response
Then the client will send SMB Read AndX requests to retrieve blocks of the file:
SMB Read AndX request ----> (61440 bytes)
<---- SMB Read AndX response
Note: With the buffer size negotiated here, the SMBv1 protocol requests at most 61440 bytes (~61 KB) of data in one SMB Read AndX request.
After this short overview, let’s get back to the original problem and analyze the packets taken from the real network trace:
Frame# Time delta Source IP Destination IP Protocol Information
===== ======== ========= ========== ====== ========
....
59269 0.000000 10.1.1.1 10.1.1.2 SMB Read AndX Request, FID: 0x0494, 61440 bytes at offset 263823360
59270 0.000000 10.1.1.2 10.1.1.1 SMB Read AndX Response, 61440 bytes
59271 0.000000 10.1.1.2 10.1.1.1 TCP [Continuation to #59270] microsoft-ds > foliocorp [ACK] Seq=65993793
59272 0.000000 10.1.1.2 10.1.1.1 TCP [Continuation to #59270] microsoft-ds > foliocorp [ACK] Seq=65995249
59273 0.000000 10.1.1.2 10.1.1.1 TCP [Continuation to #59270] microsoft-ds > foliocorp [ACK] Seq=65996705
59320 0.000000 10.1.1.2 10.1.1.1 TCP [Continuation to #59270] microsoft-ds > foliocorp [ACK] Seq=66049121
59321 0.000000 10.1.1.2 10.1.1.1 TCP [Continuation to #59270] microsoft-ds > foliocorp [ACK] Seq=66050577
59322 0.000000 10.1.1.2 10.1.1.1 TCP [Continuation to #59270] microsoft-ds > foliocorp [ACK] Seq=66052033
59323 0.000000 10.1.1.1 10.1.1.2 TCP foliocorp > microsoft-ds [ACK] Seq=67600 Ack=66053489 Win=65535
59325 0.406250 10.1.1.2 10.1.1.1 TCP [Continuation to #59270] microsoft-ds > folioc [PSH, ACK]Seq=66053489
59326 0.000000 10.1.1.1 10.1.1.2 SMB Read AndX Request, FID: 0x0494, 61440 bytes at offset 263884800
59327 0.000000 10.1.1.2 10.1.1.1 SMB Read AndX Response, 61440 bytes
59328 0.000000 10.1.1.2 10.1.1.1 TCP [Continuation to #59327] microsoft-ds > foliocorp [ACK] Seq=66055297
Now let’s take a closer look at some related frames:
Frame# 59269 => The client requests the next 61440 bytes of data at offset 263823360 from the file represented by FID 0x0494 (this FID is assigned by the server when the file is first opened/created)
Frame# 59270 => Server starts sending 61440 bytes of data back to the client in SMB Read AndX response.
Frame# 59271 => The remaining parts are sent in chunks of the negotiated TCP MSS, generally 1460 bytes (see frames 59272, 59273, etc.)
The most noticeable thing in the network trace was the sheer number of such 0.4-second delays (like the one at frame #59325). Those delays were always present at the last fragment of the 61 KB of data returned by the server.
Normally 0.4 seconds might seem like a very small delay, but considering that the client sends one SMB Read AndX request per 61 KB block, it quickly becomes clear that it's huge: for example, the client needs to send roughly 1,100 Read AndX requests to read a 64 MB file.
Generally we're used to seeing delays in network traces due to packet retransmissions (packet loss), link transfer delays, etc. But seeing a constant 0.4-second delay on the last fragment of every 61 KB block made me suspect that a QoS implementation was in place somewhere between the client and the server: by delaying every read by about 0.4 seconds, the file copy was being slowed down on purpose (traffic shaping/limiting).
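Back-of-the-envelope math shows why a fixed per-read delay is so devastating. The sketch below uses the numbers from this trace (the 0.4-second figure is the observed delay, not a protocol constant):

```python
MAX_READ = 61440          # bytes per SMBv1 Read AndX request in this trace
PER_READ_DELAY = 0.4      # constant delay observed on the last fragment of each read

def copy_penalty(file_size, max_read=MAX_READ, delay=PER_READ_DELAY):
    """Number of Read AndX requests needed and the total delay they accumulate."""
    requests = -(-file_size // max_read)   # ceiling division
    return requests, requests * delay

requests, extra_seconds = copy_penalty(64 * 1024 * 1024)
print(requests)       # 1093 Read AndX requests for a 64 MB file
print(extra_seconds)  # ~437 seconds of artificially added delay
```

Over 7 minutes of pure shaping delay on a 64 MB copy, before any actual transfer time.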
Since we didn’t have a network trace collected at the NAS device side, we couldn’t check if the QoS policy was in effect at the NAS device side or on a network device running in between the two. (we checked the client side and there was no QoS configuration in place). After further checking the network devices, it turned out that there was an incorrectly configured QoS policy on one of them. After making the required changes, the problem was resolved...
I would like to talk about an issue that I have dealt with recently regarding Internet Explorer and displaying TMG error messages.
The problem reported was that newer IE versions (8 and 9) didn't display the regular TMG error message that is shown when an access rule allows only certain users and the current user is not one of them (Error Code: 502 Proxy Error. The Forefront TMG denied the specified Uniform Resource Locator (URL). (12202)). Instead, a "Page not found" error was displayed. That caused help desk calls, because based on the displayed error message users thought that the target web site was unreachable, whereas the real problem was that the user was not allowed to access it.
IE6 didn’t have the same problem. Then we started investigating the problem from TMG perspective to make sure that it wasn’t something stemming from TMG server side. After some further troubleshooting (network traces), we found out that TMG was sending the regular error page back to the client but somehow it wasn’t displayed by the IE client.
Then we focused on the IE side. After some further investigation, I found out that it was the expected default behavior for newer Internet Explorer versions (8 and 9, we haven’t tested 7 but this might apply to 7 as well) for security reasons. You can find below more information about the vulnerability that could be exploited when IE uses Proxy servers to connect to target servers:
Pretty-Bad-Proxy: An Overlooked Adversary in Browsers’ HTTPS Deployments
Having said that, there’s a registry key which allows you to turn this enhanced security feature off in newer IE versions. You can see the details below on how to do this on the client machines:
http://msdn.microsoft.com/en-us/library/ms537184(VS.85).aspx Introduction to Feature Controls
- You’ll need to create the highlighted key at the given path on a client machine:
HKEY_LOCAL_MACHINE (or HKEY_CURRENT_USER)\SOFTWARE\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_SHOW_FAILED_CONNECT_CONTENT_KB942615 (Note: you'll also need to create the FEATURE_SHOW_FAILED_CONNECT_CONTENT_KB942615 key under "FeatureControl")
Value name: iexplore.exe
Type: REG_DWORD
Value: 0x00000001
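Putting the values above together, the change could be distributed as a .reg file like the following (a sketch based on the key and value described above; keep the security trade-off in mind before deploying it):

```reg
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_SHOW_FAILED_CONNECT_CONTENT_KB942615]
"iexplore.exe"=dword:00000001
```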
You can also get some more information at http://msdn.microsoft.com/en-us/library/dd565641(VS.85).aspx#eventLog Event 1065 - Web Proxy Error Handling Changes
I would like to re-emphasize that from a security perspective this feature normally shouldn't be turned off, so please implement this at your own risk.
Recently I dealt with a problem where a PDF file downloaded from a certain external web site was always corrupted, and I would like to talk about how I troubleshot it. The client was connected to the internet through a four-node TMG 2010 SP2 array.
We decided to collect the following logs to better understand why the file was corrupted:
- Network trace on the internal client
- TMG data packager on one of the TMG servers
(Since the problem was reproducible with any of the TMG servers set as the proxy, we pointed the client at a single array member to reduce the amount of logs to collect)
Note: TMG Data Packager is installed as part of the TMG Best Practices Analyzer installation
http://www.microsoft.com/en-us/download/details.aspx?id=17730
Microsoft Forefront Threat Management Gateway Best Practices Analyzer Tool
The results from the log analysis were as below:
- There weren't any connectivity problems in the TCP sessions (through which the file was downloaded) in the network traces collected on the client and on the internal and external interfaces of the TMG server
- The error code for the given file download was 13: (taken from Web proxy log)
Action: Failed
Client IP: 10.200.1.20
Destination Host IP: 10.1.1.1
Server Name: Proxy1
Operation: GET
Result Code: 13
URL: http://.../Report.pdf
Note: IP addresses/links/proxy names etc are deliberately changed
Error 13 is “The data is invalid”:
C:\>net helpmsg 13
The data is invalid.
So TMG server thinks that the received data was invalid. That also explains why the downloaded file was corrupted.
Then I decided to take a look at the ETL trace which was also collected with TMG Data packager. Actually the root cause behind why TMG server thought the data was invalid was clearly visible there:
... GZIP Dempression failed. Drop the request. (connection closed=0) 0x8007000d(ERROR_INVALID_DATA)
Because GZIP decompression of the file fails on the TMG server, it finalizes the session with ERROR_INVALID_DATA (error 13).
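You can reproduce this class of failure with a few lines of Python: corrupt a gzip stream's trailer and decompression fails with an invalid-data style error. This is just an illustration of the symptom, not TMG's actual code path:

```python
import gzip

payload = gzip.compress(b"pretend this is Report.pdf " * 100)
corrupted = payload[:-4] + b"\x00\x00\x00\x00"   # clobber the gzip trailer

try:
    gzip.decompress(corrupted)
    result = "ok"
except OSError as exc:            # bad CRC / length -> invalid data
    result = "decompression failed: %s" % exc
print(result)
```

When the compressed body arriving from the upstream side is damaged like this, the proxy has no way to recover the original file, so failing the request with "The data is invalid" is the expected outcome.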
Note: Please note that you have to contact Microsoft support for ETL trace conversion
Note: You can also collect a similar diagnostics log from TMG server’s console:
(Before reproducing the problem you have to enable logging from “Enable Diagnostic Logging” and once the problem is reproduced you have to disable logging by selecting “Disable Diagnostic Logging”)
For troubleshooting purposes, I suggested turning off compression on the TMG server:
(We remove “External” from the “Request compressed HTTP content when sending requests to these network elements” section.)
As expected, the corrupted file download problem was resolved. With the above configuration change we're asking the TMG server not to request compression when sending HTTP requests to external web servers, so the file is downloaded uncompressed. Note that by default the TMG server requests compression for HTTP requests sent to external web sites, which saves some bandwidth by minimizing the amount of data transferred.
We decided that the problem was somehow related to the target web site or upstream Web proxy because the same TMG server was able to successfully download HTTP content in compressed format from other external web sites.
Normally it’s possible to turn off compression for a specific web site (which could be configured from “Exceptions” tab in the above screen shot). But the TMG array in question was configured to use an upstream proxy for all external web traffic. So creating an exception wouldn’t make much difference here. Our customer decided to keep HTTP compression off (and re-enable it once the file downloads from the given web site were finished)
In this blog post, I would like to talk about a named pipe access issue on Windows 2008 that I had to deal with recently. One of our customers was having problems accessing named pipes anonymously on Windows 2008, and we were brought in to address the issue. Even though the required configuration for anonymous pipe access was in place, the pipe client was getting ACCESS_DENIED when trying to access the pipe.
The problem was easy to reproduce on any Windows Vista or later system. Just run a named pipe server application which creates a pipe, then try to connect to the pipe anonymously from a remote system. You can see more details below on how to reproduce this behavior:
a) Compile the sample pipe server & client applications given at the following MSDN links:
http://msdn.microsoft.com/en-us/library/aa365588(VS.85).aspx Multithreaded Pipe Server
http://msdn.microsoft.com/en-us/library/aa365592(VS.85).aspx Named Pipe Client
b) Add the named pipe created by Pipe server to the Null session pipe lists (configuring "Nullsessionpipes" registry key under LanmanServer)
c) Do not enable “Network access: Let Everyone permissions apply to anonymous users” from local GPO or domain GPO
d) Make sure that the pipe ACL includes the Anonymous user with Full Control permission. (You can do that by using a 3rd party tool like pipeacl.exe)
e) Start a command line within the Local System account security context by running a command similar to below:
at 12:40 /interactive cmd.exe
f) Run the pipe client application from the command line and try to connect to the pipe. You'll get an ACCESS_DENIED in response from the server.
Note that as soon as you re-enable “Network access: Let Everyone permissions apply to anonymous users”, pipe client starts successfully opening the pipe and reading from/writing to pipe.
UNDERSTANDING THE ROOT CAUSE:
==============================
Now that the problem is laid out, let's take a look at its root cause:
Note: Some of the outputs below are WinDBG (debugger) outputs.
1) From Vista onwards, in order to access an object, you need to pass two security checks:
a) Integrity check => For Vista onwards
b) Classical access check (checking object’s security descriptor against the desired access) => For all Windows versions
Note: The integrity check cannot be turned off even if you disable UAC (and we wouldn’t want to do that either)
2) In test pipe application, we see the following differences when “Network access: Let Everyone permissions apply to anonymous users” is enabled and disabled:
a) “Network access: Let Everyone permissions apply to anonymous users” ENABLED situation
=> Token of the thread:
- The user is anonymous (as expected)
- The token also contains the SID S-1-16-8192 (which represents the Medium integrity level), so the thread will be accessing the pipe object at the Medium integrity level
kd> !token -n
_ETHREAD 892ef810, _TOKEN 9a983b00
TS Session ID: 0
User: S-1-5-7 (Well Known Group: NT AUTHORITY\ANONYMOUS LOGON)
Groups:
00 S-1-0-0 (Well Known Group: localhost\NULL SID)
Attributes -
01 S-1-1-0 (Well Known Group: localhost\Everyone)
Attributes - Mandatory Default Enabled
02 S-1-5-2 (Well Known Group: NT AUTHORITY\NETWORK)
03 S-1-5-15 (Well Known Group: NT AUTHORITY\This Organization)
04 S-1-5-64-10 (Well Known Group: NT AUTHORITY\NTLM Authentication)
05 S-1-16-8192 Unrecognized SID
Attributes - GroupIntegrity GroupIntegrityEnabled
Primary Group: S-1-0-0 (Well Known Group: localhost\NULL SID)
Privs:
23 0x000000017 SeChangeNotifyPrivilege Attributes - Enabled Default
Authentication ID: (0,15a3fe2)
Impersonation Level: Impersonation
TokenType: Impersonation
Source: NtLmSsp TokenFlags: 0x2000
Token ID: 15a3fe5 ParentToken ID: 0
Modified ID: (0, 15a3fe8)
RestrictedSidCount: 0 RestrictedSids: 00000000
OriginatingLogonSession: 0
=> Security descriptor of the pipe (DACL & SACL of the pipe object)
- Anonymous Logon has full access to the pipe object
- The integrity levels of objects are stored in the SACL of the object’s security descriptor. If the integrity level is not explicitly assigned, the object’s integrity level is Medium.
kd> !sd 0x81f69848 1
->Revision: 0x1
->Sbz1 : 0x0
->Control : 0x8004
SE_DACL_PRESENT
SE_SELF_RELATIVE
->Owner : S-1-5-32-544 (Alias: BUILTIN\Administrators)
->Group : S-1-5-21-1181840707-4124064209-3703316816-513 (no name mapped)
->Dacl :
->Dacl : ->AclRevision: 0x2
->Dacl : ->Sbz1 : 0x0
->Dacl : ->AclSize : 0x5c
->Dacl : ->AceCount : 0x4
->Dacl : ->Sbz2 : 0x0
->Dacl : ->Ace[0]: ->AceType: ACCESS_ALLOWED_ACE_TYPE
->Dacl : ->Ace[0]: ->AceFlags: 0x0
->Dacl : ->Ace[0]: ->AceSize: 0x18
->Dacl : ->Ace[0]: ->Mask : 0x001f01ff
->Dacl : ->Ace[0]: ->SID: S-1-5-32-544 (Alias: BUILTIN\Administrators)
->Dacl : ->Ace[1]: ->AceType: ACCESS_ALLOWED_ACE_TYPE
->Dacl : ->Ace[1]: ->AceFlags: 0x0
->Dacl : ->Ace[1]: ->AceSize: 0x14
->Dacl : ->Ace[1]: ->Mask : 0x001f01ff
->Dacl : ->Ace[1]: ->SID: S-1-5-7 (Well Known Group: NT AUTHORITY\ANONYMOUS LOGON)
->Dacl : ->Ace[2]: ->AceType: ACCESS_ALLOWED_ACE_TYPE
->Dacl : ->Ace[2]: ->AceFlags: 0x0
->Dacl : ->Ace[2]: ->AceSize: 0x14
->Dacl : ->Ace[2]: ->Mask : 0x00120089
->Dacl : ->Ace[2]: ->SID: S-1-1-0 (Well Known Group: localhost\Everyone)
->Dacl : ->Ace[3]: ->AceType: ACCESS_ALLOWED_ACE_TYPE
->Dacl : ->Ace[3]: ->AceFlags: 0x0
->Dacl : ->Ace[3]: ->AceSize: 0x14
->Dacl : ->Ace[3]: ->Mask : 0x001f01ff
->Dacl : ->Ace[3]: ->SID: S-1-5-18 (Well Known Group: NT AUTHORITY\SYSTEM)
->Sacl : is NULL
So in this scenario, a thread with an integrity level of Medium is accessing an object with an integrity level of Medium, so the integrity check passes. Once the integrity check is passed, the DACL evaluation is made next (the classical access check that is done in all Windows versions). Since the Anonymous user has access in the DACL of the pipe, that stage passes as well and access to the pipe object is granted.
b) “Network access: Let Everyone permissions apply to anonymous users” DISABLED situation
- The token also contains the SID S-1-16-0 (which represents the Untrusted integrity level), so the thread will be accessing the pipe object at the Untrusted integrity level
_ETHREAD 892dab58, _TOKEN 9a81b7f8
01 S-1-5-2 (Well Known Group: NT AUTHORITY\NETWORK)
02 S-1-5-15 (Well Known Group: NT AUTHORITY\This Organization)
03 S-1-5-64-10 (Well Known Group: NT AUTHORITY\NTLM Authentication)
04 S-1-16-0 Unrecognized SID
Authentication ID: (0,15a3909)
Source: NtLmSsp TokenFlags: 0x0
Token ID: 15a390c ParentToken ID: 0
Modified ID: (0, 15a390f)
- The integrity levels of objects are stored in the SACL of the object’s security descriptor. If the integrity level is not explicitly assigned, the object’s integrity level is Medium.
So in this scenario, a thread with an integrity level of Untrusted is accessing an object with an integrity level of Medium, so the integrity check fails and access to the object is denied. The classical DACL evaluation is not even performed here.
In summary, anonymous pipe access fails because of the integrity check. When “Network access: Let Everyone permissions apply to anonymous users” is enabled, the EVERYONE SID is also added to the thread token and the token’s integrity level is therefore raised (to Medium in this scenario), so the integrity check succeeds while this policy is enabled.
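The decision rule can be summarized in a toy Python model (this sketches only the no-write-up logic described above, not the real SeAccessCheck implementation; the RID constants are the documented values behind the S-1-16-x SIDs):

```python
# Documented RIDs behind the S-1-16-x integrity SIDs.
UNTRUSTED, LOW, MEDIUM, HIGH, SYSTEM = 0x0000, 0x1000, 0x2000, 0x3000, 0x4000

def integrity_check(token_level, object_level=MEDIUM):
    """No-write-up rule: access is denied when the token's integrity
    level is below the object's label. This runs BEFORE the DACL check,
    which is why the pipe's Anonymous-full-control ACE never gets a say."""
    return token_level >= object_level

# Policy enabled: anonymous token carries S-1-16-8192 (Medium) -> passes
print(integrity_check(MEDIUM))
# Policy disabled: token carries S-1-16-0 (Untrusted) -> ACCESS_DENIED
print(integrity_check(UNTRUSTED))
# Labeling the pipe itself Untrusted makes the levels equal -> passes
print(integrity_check(UNTRUSTED, object_level=UNTRUSTED))
```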
HOW TO FIX IT:
============
1) The most meaningful solution here is to set the pipe object’s integrity level to Untrusted. If we achieve this, the integrity check should pass, because the integrity level of both the token and the object the token is trying to open (with Read/Write permissions) would be the same (Untrusted).
2) Changing the pipe object’s integrity level can be achieved in two different ways:
a) Setting the integrity level while creating the pipe from the server application (via the SECURITY_ATTRIBUTES parameter of the CreateNamedPipe() API)
b) Setting the integrity level after the pipe is created (via the SetSecurityInfo() API, either from the server application or from another application)
3) While searching for possible programmatic solutions, we came across a very good source code example of how to set the integrity level of the pipe to Untrusted while creating it. It’s also a good example of how to write pipe applications that use anonymous pipe access on Vista and later systems:
=> Blog link: (in German)
http://blog.m-ri.de/index.php/2009/12/08/windows-integrity-control-schreibzugriff-auf-eine-named-pipe-eines-services-ueber-anonymen-zugriff-auf-vista-windows-2008-server-und-windows-7/
Note: It's a 3rd party link so please connect to it at your own risk.
=> Just a few notes from the source code to further explain how it could be done:
a) When the pipe is created, a security attributes structure is passed:
hPipe = CreateNamedPipe(
server,
PIPE_ACCESS_DUPLEX,
PIPE_TYPE_MESSAGE | PIPE_READMODE_MESSAGE | PIPE_WAIT,
PIPE_UNLIMITED_INSTANCES,
sizeof(DWORD),
0,
NMPWAIT_USE_DEFAULT_WAIT,
&sa );
b) In particular, the integrity-level-related part of the security attributes structure is built as follows:
// We need this only if we have Windows Vista, Windows 7 or Windows 2008 Server
OSVERSIONINFO osvi;
osvi.dwOSVersionInfoSize = sizeof(osvi);
if (!GetVersionEx(&osvi))
{
DisplayError( L"GetVersionInfoEx" );
return FALSE;
}
// If Vista, Server 2008, or Windows7!
if (osvi.dwMajorVersion>=6)
// Now the trick with the SACL:
// We set SECURITY_MANDATORY_UNTRUSTED_RID to SYSTEM_MANDATORY_POLICY_NO_WRITE_UP
// Anonymous access is untrusted, and this process runs equal or above medium
// integrity level. Setting "S:(ML;;NW;;;LW)" is not sufficient.
_tcscat(szBuff,_T("S:(ML;;NW;;;S-1-16-0)"));
The S:(ML;;NW;;;S-1-16-0) part will cause the integrity level of the pipe object to be set to Untrusted when the pipe is created via CreateNamedPipe().
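For reference, the pieces of that SDDL string follow the standard ACE layout (type;flags;rights;object_guid;inherit_object_guid;sid). A small hypothetical helper to pull out the policy and the integrity RID:

```python
import re

def parse_mandatory_label(sddl):
    """Extract (policy, integrity RID) from a mandatory-label ACE
    such as S:(ML;;NW;;;S-1-16-0)."""
    m = re.search(r"S:\(ML;[^;]*;([^;]*);;;(S-1-16-\d+)\)", sddl)
    if not m:
        return None
    policy, sid = m.groups()
    # The last dash-separated component of S-1-16-x is the integrity RID.
    return policy, int(sid.rsplit("-", 1)[1])

policy, rid = parse_mandatory_label("S:(ML;;NW;;;S-1-16-0)")
print(policy, rid)
```

Here NW is SYSTEM_MANDATORY_LABEL_NO_WRITE_UP and RID 0 is SECURITY_MANDATORY_UNTRUSTED_RID, matching the comments in the blog's source code.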
You can find more information on the integrity check at the following links:
http://msdn.microsoft.com/en-us/library/bb625963.aspx Windows Integrity Mechanism Design
http://msdn.microsoft.com/en-us/library/aa379588(VS.85).aspx SetSecurityInfo Function
http://msdn.microsoft.com/en-us/library/aa363858(VS.85).aspx CreateFile Function
Hi there,
In today's blog, I would like to talk about NLB cluster access problems that our customers experience most of the time. When a Microsoft NLB cluster operates in multicast mode, in certain scenarios you may not be able to access the NLB cluster IP address from remote subnets, whereas same-subnet access keeps working fine. You can find more information in the two most common scenarios below:
Problem 1:
When an NLB cluster on Windows 2008 (before SP2) operates in multicast mode, remote subnets cannot access the NLB cluster IP address due to a problem with the NLB implementation on Windows 2008.
Solution 1:
- This problem stemmed from the NLB implementation itself
- It has been fixed by Microsoft with hotfix KB960916
- KB960916 is already included in Windows 2008 SP2
Problem 2:
When an NLB cluster on Windows 2003 or Windows 2008 operates in multicast mode, remote subnets cannot access the NLB cluster IP address. This second problem stems from the fact that some vendors (like Cisco) don't accept mapping an L3 unicast IP address to an L2 multicast MAC address, which is exactly what happens when an NLB cluster operates in multicast mode: the L3 unicast IP address is the NLB cluster IP address and the L2 MAC address is the multicast MAC address chosen by NLB. So you have to create a static mapping on the router to avoid the problem. You can find more information about this problem at the following link:
http://www.cisco.com/en/US/products/hw/switches/ps708/products_configuration_example09186a0080a07203.shtml
(Taken from the above link)
Multicast Mode
Another solution is to use multicast mode in MS NLB configuration GUI instead of Unicast mode. In Multicast Mode, the system admin clicks the IGMP Multicast button in the MS NLB configuration GUI. This choice instructs the cluster members to respond to ARPs for their virtual address using a multicast MAC address for example 0300.5e11.1111 and to send IGMP Membership Report packets. If IGMP snooping is enabled on the local switch, it snoops the IGMP packets that pass through it. In this way, when a client ARPs for the cluster’s virtual IP address, the cluster responds with multicast MAC for example 0300.5e11.1111. When the client sends the packet to 0300.5e11.1111, the local switch forwards the packet out each of the ports connected to the cluster members. In this case, there is no chance of flooding the ARP packet out of all the ports.
The issue with the multicast mode is virtual IP address becomes unreachable when accessed from outside the local subnet because Cisco devices do not accept an arp reply for a unicast IP address that contains a multicast MAC address. So the MAC portion of the ARP entry shows as incomplete. (Issue the command show arp to view the output.) As there is no MAC portion in the arp reply, the ARP entry never appeared in the ARP table. It eventually quit ARPing and returned an ICMP Host unreachable to the clients.
In order to override this, use static ARP entry to populate the ARP table as given below. In theory, this allows the Cisco device to populate its mac-address-table. For example, if the virtual ip address is 172.16.63.241 and multicast mac address is 0300.5e11.1111, use this command in order to populate the ARP table statically:
Solution 2:
In order to resolve that problem, you have two choices:
a) Adding a static ARP entry on the router
b) Changing NLB cluster mode to Unicast
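As background for option a): in plain (non-IGMP) multicast mode, NLB derives the cluster MAC as 03-BF followed by the four octets of the cluster IP (unicast mode uses 02-BF instead; IGMP multicast mode uses a 01-00-5E-7F-based MAC, which is the family the Cisco example above belongs to). A small hypothetical sketch that builds the MAC you would put in the static ARP command:

```python
def nlb_multicast_mac(cluster_ip, cisco_notation=True):
    """Build the multicast MAC NLB derives from the cluster IP in
    plain multicast mode: 03-BF-<octet1>-<octet2>-<octet3>-<octet4>."""
    octets = [int(o) for o in cluster_ip.split(".")]
    mac = bytes([0x03, 0xBF] + octets).hex()
    if cisco_notation:  # xxxx.xxxx.xxxx as used in the arp command
        return ".".join(mac[i:i + 4] for i in range(0, 12, 4))
    return "-".join(mac[i:i + 2] for i in range(0, 12, 2))

# e.g. for a cluster IP of 172.16.63.241 the static entry would look like:
#   arp 172.16.63.241 03bf.ac10.3ff1 ARPA
print(nlb_multicast_mac("172.16.63.241"))
```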
Also please always keep in mind the following when troubleshooting NLB problems:
1) Am I running the latest NLB driver available from Microsoft? We have released a few NLB driver updates for Windows 2003, Windows 2008 and Windows 2008 R2 to address various problems.
2) Am I running the latest NIC driver and teaming driver? We generally prefer not to run teaming on NLB clusters and may ask you to dissolve the teaming if needed, even though we don't have a strict "not supported" statement.
3) Are the NLB port rules correctly configured? The most common problem here is setting affinity to "None" for stateful protocols, which causes many NLB cluster access problems.
4) Am I running the latest TCP/IP driver? (preferably the latest security update that updates the TCP/IP driver)
5) Am I running the latest 3rd party filter drivers that operate at the NDIS layer? (for example, security drivers)
6) If the NLB cluster runs on Windows 2008 R2 Hyper-V, is the "Enable spoofing of MAC addresses" setting configured as required?
I'm going to talk about troubleshooting approaches in another blog post.
In today’s blog post, I’m going to show you how I found out why a domain controller was contacting random clients in the domain. The customer raised the case because of security concerns: they suspected that a suspicious process might be running on the DC. In general we don’t expect domain controllers to contact the clients running in the domain, so our customer wanted to understand the reason behind this behavior.
We first verified that the DC was really contacting some clients by collecting a network trace on the DC. You can see one of those clients (client1) contacted by the DC (DC1):
Note: DC and client IP addresses are replaced for data privacy.
11415 14:21:12 05.07.2010 DC1 CLIENT1 TCP TCP:Flags=......S., SrcPort=3912, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=70128947, Ack=0, Win=65535 ( ) = 65535 {TCP:515, IPv4:46}
11443 14:21:12 05.07.2010 DC1 CLIENT1 TCP TCP:Flags=......S., SrcPort=3913, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=3133793441, Ack=0, Win=65535 ( ) = 65535 {TCP:518, IPv4:46}
30922 14:33:17 05.07.2010 DC1 CLIENT1 TCP TCP:Flags=......S., SrcPort=4118, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=2414564040, Ack=0, Win=65535 ( ) = 65535 {TCP:1270, IPv4:46}
30950 14:33:17 05.07.2010 DC1 CLIENT1 TCP TCP:Flags=......S., SrcPort=4120, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=1797119693, Ack=0, Win=65535 ( ) = 65535 {TCP:1273, IPv4:46}
51472 14:45:22 05.07.2010 DC1 CLIENT1 TCP TCP:Flags=......S., SrcPort=4314, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=1834145861, Ack=0, Win=65535 ( ) = 65535 {TCP:1403, IPv4:46}
51500 14:45:22 05.07.2010 DC1 CLIENT1 TCP TCP:Flags=......S., SrcPort=4315, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=4278939251, Ack=0, Win=65535 ( ) = 65535 {TCP:1406, IPv4:46}
67096 14:57:26 05.07.2010 DC1 CLIENT1 TCP TCP:Flags=......S., SrcPort=4514, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=1707963693, Ack=0, Win=65535 ( ) = 65535 {TCP:1945, IPv4:46}
67126 14:57:26 05.07.2010 DC1 CLIENT1 TCP TCP:Flags=......S., SrcPort=4515, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=3807245641, Ack=0, Win=65535 ( ) = 65535 {TCP:1948, IPv4:46}
74691 15:09:30 05.07.2010 DC1 CLIENT1 TCP TCP:Flags=......S., SrcPort=4740, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=1036190517, Ack=0, Win=65535 ( ) = 65535 {TCP:1983, IPv4:46}
74721 15:09:31 05.07.2010 DC1 CLIENT1 TCP TCP:Flags=......S., SrcPort=4741, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=2281072822, Ack=0, Win=65535 ( ) = 65535 {TCP:1986, IPv4:46}
84937 15:21:35 05.07.2010 DC1 CLIENT1 TCP TCP:Flags=......S., SrcPort=4930, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=3190224054, Ack=0, Win=65535 ( ) = 65535 {TCP:2104, IPv4:46}
84965 15:21:35 05.07.2010 DC1 CLIENT1 TCP TCP:Flags=......S., SrcPort=4931, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=2774224583, Ack=0, Win=65535 ( ) = 65535 {TCP:2107, IPv4:46}
At first look, it drew my attention that the connection attempt was repeated every 12 minutes or so, which meant something had to be running periodically on the DC. Normally Network Monitor shows you the process that initiates those TCP sessions, but under heavy load it stops doing so in favor of performance, as it’s a costly operation. There are other methods to find out which process is sending a certain packet, but I decided to let the DC do whatever it would do against the client so we could see the whole activity.
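For illustration, the periodicity is easy to confirm from the capture timestamps alone (times taken from the first SYN of each connection burst above):

```python
from datetime import datetime

# First-SYN timestamps of each connection burst in the trace above.
stamps = ["14:21:12", "14:33:17", "14:45:22", "14:57:26", "15:09:30", "15:21:35"]
times = [datetime.strptime(s, "%H:%M:%S") for s in stamps]

# Inter-arrival gaps in seconds between consecutive bursts.
gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
print(gaps)  # every gap is ~725 s, i.e. roughly 12 minutes
```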
So the customer removed the firewall filters and allowed the DC to connect to Client1. After doing so, we collected a new network trace to see the latest situation, and it gave us the expected results:
a) The first interesting finding was that the client was sending a “Master Browser” announcement to the DC (DC1) shortly before one of these connection attempts from the DC side:
47140 07:30:31 08.07.2010 CLIENT1 DC1 BROWSER BROWSER:Master Announcement {SMB:351, UDP:350, IPv4:3}
b) After that browser announcement, the DC contacted the client at TCP port 139 to establish an SMB session:
47595 07:30:33 08.07.2010 DC1 CLIENT1 TCP TCP:Flags=......S., SrcPort=3787, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=2594372577, Ack=0, Win=65535 ( ) = 65535 {TCP:373, IPv4:3}
47596 07:30:33 08.07.2010 CLIENT1 DC1 TCP TCP:Flags=...A..S., SrcPort=NETBIOS Session Service(139), DstPort=3787, PayloadLen=0, Seq=2981880191, Ack=2594372578, Win=8192 ( Scale factor not supported ) = 8192 {TCP:373, IPv4:3}
47597 07:30:33 08.07.2010 DC1 CLIENT1 TCP TCP:Flags=...A...., SrcPort=3787, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=2594372578, Ack=2981880192, Win=65535 (scale factor 0x0) = 65535 {TCP:373, IPv4:3}
c) Then it initiated a NetBT session to the client:
47598 07:30:33 08.07.2010 DC1 CLIENT1 NbtSS NbtSS:SESSION REQUEST, Length =68 {NbtSS:374, TCP:373, IPv4:3}
47599 07:30:33 08.07.2010 CLIENT1 DC1 NbtSS NbtSS:POSITIVE SESSION RESPONSE, Length =0 {NbtSS:374, TCP:373, IPv4:3}
d) Then it established an SMB connection:
47600 07:30:33 08.07.2010 DC1 CLIENT1 SMB SMB:C; Negotiate, Dialect = PC NETWORK PROGRAM 1.0, LANMAN1.0, Windows for Workgroups 3.1a, LM1.2X002, LANMAN2.1, NT LM 0.12 {NbtSS:374, TCP:373, IPv4:3}
47602 07:30:33 08.07.2010 CLIENT1 DC1 SMB SMB:R; Negotiate, Dialect is NT LM 0.12 (#5), SpnegoToken (1.3.6.1.5.5.2) {NbtSS:374, TCP:373, IPv4:3}
47614 07:30:34 08.07.2010 DC1 CLIENT1 SMB SMB:C; Session Setup Andx, NTLM NEGOTIATE MESSAGE {NbtSS:374, TCP:373, IPv4:3}
47615 07:30:34 08.07.2010 CLIENT1 DC1 SMB SMB:R; Session Setup Andx, NTLM CHALLENGE MESSAGE - NT Status: System - Error, Code = (22) STATUS_MORE_PROCESSING_REQUIRED {NbtSS:374, TCP:373, IPv4:3}
47616 07:30:34 08.07.2010 DC1 CLIENT1 SMB SMB:C; Session Setup Andx, NTLM AUTHENTICATE MESSAGE, Workstation: DC1 {NbtSS:374, TCP:373, IPv4:3}
47621 07:30:34 08.07.2010 CLIENT1 DC1 SMB SMB:R; Session Setup Andx {NbtSS:374, TCP:373, IPv4:3}
e) Then it connected to the interprocess communication share (IPC$):
47625 07:30:34 08.07.2010 DC1 CLIENT1 SMB SMB:C; Tree Connect Andx, Path = \\CLIENT1\IPC$, Service = ????? {NbtSS:374, TCP:373, IPv4:3}
47626 07:30:34 08.07.2010 CLIENT1 DC1 SMB SMB:R; Tree Connect Andx, Service = IPC {NbtSS:374, TCP:373, IPv4:3}
f) Then it called RAP (Remote Administration Protocol) APIs like NetServerEnum2 etc:
47630 07:30:34 08.07.2010 DC1 CLIENT1 RAP RAP:NetServerEnum2 Request, InfoLevel = 1, LocalList in {SMB:379, NbtSS:374, TCP:373, IPv4:3}
47631 07:30:34 08.07.2010 CLIENT1 DC1 RAP RAP:NetServerEnum2 Response, Count = 1 {SMB:379, NbtSS:374, TCP:373, IPv4:3}
g) Once it got the requested info, it logged off and disconnected the TCP session:
47642 07:30:34 08.07.2010 DC1 CLIENT1 SMB SMB:C; Logoff Andx {NbtSS:374, TCP:373, IPv4:3}
47643 07:30:34 08.07.2010 CLIENT1 DC1 SMB SMB:R; Logoff Andx {NbtSS:374, TCP:373, IPv4:3}
47650 07:30:34 08.07.2010 DC1 CLIENT1 SMB SMB:C; Tree Disconnect {NbtSS:374, TCP:373, IPv4:3}
47651 07:30:34 08.07.2010 CLIENT1 DC1 SMB SMB:R; Tree Disconnect {NbtSS:374, TCP:373, IPv4:3}
47657 07:30:34 08.07.2010 DC1 CLIENT1 TCP TCP:Flags=...A...F, SrcPort=3787, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=2594373651, Ack=2981881320, Win=64407 (scale factor 0x0) = 64407 {TCP:373, IPv4:3}
47658 07:30:34 08.07.2010 CLIENT1 DC1 TCP TCP:Flags=...A...F, SrcPort=NETBIOS Session Service(139), DstPort=3787, PayloadLen=0, Seq=2981881320, Ack=2594373652, Win=15559 (scale factor 0x0) = 15559 {TCP:373, IPv4:3}
47662 07:30:34 08.07.2010 DC1 CLIENT1 TCP TCP:Flags=...A...., SrcPort=3787, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=2594373652, Ack=2981881321, Win=64407 (scale factor 0x0) = 64407 {TCP:373, IPv4:3}
h) Similar activity was seen every 12 minutes in the network trace.
After analyzing the second network trace, the reason behind the DC’s connections to the clients was clear:
Every 12 minutes or so, the master browser in a network segment informs the domain master browser (which is the DC) that it’s a master browser, and the DC then connects to that master browser to retrieve its browse list. You can find more details below:
Taken from http://technet.microsoft.com/en-us/library/cc737661(WS.10).aspx How Computer Browser Service Works:
When a domain spans multiple subnets, the master browse servers for each subnet use a unicast Master Announcement message to announce themselves to the domain master browse server. This message notifies the domain master browse server that the sending computer is a master browse server in the same domain. When the domain master browse server receives a Master Browse Server Announcement message, it returns to the “announcing” master browse server a request for a list of the servers in that master browse server’s subnet. When that list is received, the domain master browse server merges it with its own server list.
This process, repeated every 12 minutes, guarantees that the domain master browse server has a complete browse list of all the servers in the domain. Thus, when a client sends a browse request to a backup browse server, the backup browse server can return a list of all the servers in the domain, regardless of the subnet on which those servers are located.
I would like to talk about a few network trace analysis cases where we were asked to find out why certain packets (specifically ICMP and UDP) were being sent by Exchange servers. Below you'll find more details about how we found the processes sending those packets:
a) Exchange servers sending UDP packets with random source or destination ports to various clients
In one scenario, our customer’s security team wanted to find out why the Exchange servers were sending UDP packets to random clients on the network, again because of security concerns. There was no deterministic pattern in the source or destination UDP ports; the only consistency was that each UDP packet sent by the Exchange servers always had an 8-byte payload. You can see a sample network trace output below:
Note: Addresses were replaced for privacy purposes even though private IP address space was in use.
105528 2010-01-14 15:20:14.454856 10.1.1.1 172.1.10.14 UDP Source port: 35996 Destination port: mxomss
105530 2010-01-14 15:20:14.454856 10.1.1.1 172.18.10.27 UDP Source port: 35997 Destination port: edtools
105531 2010-01-14 15:20:14.454856 10.1.1.1 172.17.17.95 UDP Source port: 35998 Destination port: fiveacross
105535 2010-01-14 15:20:14.454856 10.1.1.1 172.17.11.51 UDP Source port: 36000 Destination port: kwdb-commn
105540 2010-01-14 15:20:14.454856 10.1.1.1 172.23.98.97 UDP Source port: 36003 Destination port: dicom-tls
105541 2010-01-14 15:20:14.454856 10.1.1.1 172.24.12.8 UDP Source port: 36004 Destination port: dkmessenger
105542 2010-01-14 15:20:14.454856 10.1.1.1 172.28.2.52 UDP Source port: 36005 Destination port: tragic
105545 2010-01-14 15:20:14.454856 10.1.1.1 172.31.5.14 UDP Source port: 36006 Destination port: xds
105546 2010-01-14 15:20:14.454856 10.1.1.1 172.2.10.63 UDP Source port: 36007 Destination port: 4642
105547 2010-01-14 15:20:14.454856 10.1.1.1 172.2.35.68 UDP Source port: 36008 Destination port: foliocorp
105552 2010-01-14 15:20:14.454856 10.1.1.1 172.18.12.55 UDP Source port: 36010 Destination port: saphostctrl
105553 2010-01-14 15:20:14.454856 10.1.1.1 172.48.199.45 UDP Source port: 36011 Destination port: slinkysearch
105554 2010-01-14 15:20:14.454856 10.1.1.1 172.27.133.42 UDP Source port: 36012 Destination port: oracle-oms
105555 2010-01-14 15:20:14.454856 10.1.1.1 172.27.121.40 UDP Source port: 36013 Destination port: proxy-gateway
105558 2010-01-14 15:20:14.454856 10.1.1.1 172.24.7.11 UDP Source port: 36016 Destination port: fcmsys
- Source UDP port is increasing and destination UDP port seems random at first sight
- The data part of the UDP datagrams is always 8 bytes. As an example:
Frame 105540 (50 bytes on wire, 50 bytes captured)
Ethernet II, Src: HewlettP_11:11:11 (00:1c:c4:11:11:11), Dst: All-HSRP-routers_15 (00:00:0c:07:ac:15)
Internet Protocol, Src: 10.1.1.1 (10.1.1.1), Dst: 172.23.98.97 (172.23.98.97)
User Datagram Protocol, Src Port: 36003 (36003), Dst Port: dicom-tls (2762)
Data (8 bytes)
=> To better understand which process might be sending those packets, we decided to collect a kernel TCPIP trace on the source Windows 2003 server. For more information about methods that can be used to identify the process sending a certain packet, please see my previous post on this topic.
After collecting a network trace and an accompanying kernel TCPIP trace as described in the other post (option 4), we managed to catch the UDP packet seen in the above network trace (the network trace above and the kernel TCPIP trace below were actually collected together). As an example:
UdpIp, Send, 0xFFFFFFFF, 129079416141424158, 0, 0, 2136, 8, 172.023.098.097, 010.001.001.001, 2762, 36003, 0, 0
UdpIp, Send, 0xFFFFFFFF, 129079416141424158, 0, 0, 2136, 8, 172.027.153.050, 172.023.021.024, 6004, 36009, 0, 0
UdpIp, Send, 0xFFFFFFFF, 129079416141424158, 0, 0, 2136, 8, 172.028.097.111, 172.023.021.024, 2344, 36016, 0, 0
UdpIp, Send, 0xFFFFFFFF, 129079416141424158, 0, 0, 2136, 8, 172.027.102.056, 172.023.021.024, 1116, 36022, 0, 0
- For example, in the first line, 10.1.1.1 (the Exchange server) is sending a UDP packet to 172.23.98.97. The packet length is 8 bytes, the source UDP port is 36003 and the destination UDP port is 2762. The process ID sending the UDP packet is 2136; in fact, in all such UDP packets the process ID is always 2136.
- That first line taken from the kernel trace is packet #105540 seen in the network trace
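A quick sketch of how such a UdpIp event line can be split into its fields (the positions of the PID, payload size, addresses and ports are inferred from this particular trace, not taken from a formal schema):

```python
# One UdpIp "Send" event from the kernel TCPIP trace above.
line = ("UdpIp, Send, 0xFFFFFFFF, 129079416141424158, 0, 0, "
        "2136, 8, 172.023.098.097, 010.001.001.001, 2762, 36003, 0, 0")

fields = [f.strip() for f in line.split(",")]
pid, size = int(fields[6]), int(fields[7])          # sending PID, payload bytes
dest_addr, src_addr = fields[8], fields[9]          # zero-padded dotted quads
dest_port, src_port = int(fields[10]), int(fields[11])

print(pid, size, src_port, dest_port)
```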
=> After checking the “tasklist /SVC” output, we saw that process ID 2136 was store.exe (the Exchange Information Store process):
wmiprvse.exe 5176 Console 0 2,168 K
mad.exe 7176 Console 0 45,792 K
AntigenStore.exe 10092 Console 0 200 K
store.exe 2136 Console 0 1,040,592 K
emsmta.exe 12020 Console 0 29,092 K
=> After further investigation on the Exchange side with the help of an Exchange expert, we found out that this traffic was expected and is used as an e-mail notification mechanism:
http://support.microsoft.com/default.aspx?scid=kb;EN-US;811061 XCCC: Exchange Clients Do Not Receive "New Mail" Notification Messages
The Information Store process (Store.exe) sends a User Datagram Protocol (UDP) packet for new mail notifications. However, because the Store process does not run on an Exchange virtual server but on the cluster node, the UDP packet is sent from the IP address of that node. If you fail over the cluster node, the data and Exchange 2000 Server virtual server configuration are moved to the Store process that is running on the other cluster server node. New mail notifications are sent from the IP address of that second cluster node.
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
b) Exchange servers sending 1-byte pings to DCs
One of our customers reported that their DCs were getting constant ICMP echo requests from a number of member servers, and because of security concerns they wanted help in finding the process behind them. After some analysis and testing with the help of an Exchange expert colleague of mine, we found out that those ICMP echo requests were sent by the Exchange-related services. The ICMP echo requests have the following characteristics:
=> The payload is always 1 byte
=> The payload itself is “3F”
Those ICMP echo requests cease once the Exchange-related services are stopped, which is another indication. This behavior is partly explained in the following article:
http://support.microsoft.com/kb/270836 Exchange Server static port mappings
Taken from the article:
Note In a perimeter network firewall scenario, there is no Internet Control Message Protocol (ICMP) connectivity between the Exchange server and the domain controllers. By default, Directory Access (DSAccess) uses ICMP to ping each server to which it connects to determine whether the server is available. When there is no ICMP connectivity, Directory Access responds as if every domain controller were unavailable
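For reference, the observed echo request (an 8-byte ICMP header plus the single 0x3F payload byte) can be reconstructed with a short sketch; the identifier and sequence values here are hypothetical, and the checksum is the standard RFC 1071 Internet checksum:

```python
import struct

def rfc1071_checksum(data):
    """Internet checksum: one's-complement sum of 16-bit words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad the odd trailing byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(ident=1, seq=1, payload=b"\x3f"):
    # type 8 (echo request), code 0, checksum placeholder, id, seq
    hdr = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    csum = rfc1071_checksum(hdr + payload)
    return struct.pack("!BBHHH", 8, 0, csum, ident, seq) + payload

pkt = build_echo_request()
print(len(pkt), hex(pkt[-1]))  # 8-byte header + 1 payload byte of 0x3F
```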
=> You can also see a sample network trace output collected on an Exchange server:
I was collaborating with a colleague of mine on a problem where SCCM client push installation was failing. They suspected network connectivity problems, collected simultaneous network traces from the SCCM server and from a problem client machine, and brought me in for further analysis.
When I checked the SCCM server and client side traces, I saw that the SCCM server was successfully reaching the client over TCP port 135.
=> SCCM server side trace:
- TCP three way handshake between SCCM server and client:
5851 14:42:47 05.09.2012 34.0337296 10.0.9.149 CLIENTNAME.company.com TCP TCP: [Bad CheckSum]Flags=......S., SrcPort=51763, DstPort=DCE endpoint resolution(135), PayloadLen=0, Seq=2250995253, Ack=0, Win=8192 ( Negotiating scale factor 0x8 ) = 8192 {TCP:861, IPv4:843}
5852 14:42:47 05.09.2012 34.0364843 CLIENTNAME.company.com 10.0.9.149 TCP TCP:Flags=...A..S., SrcPort=DCE endpoint resolution(135), DstPort=51763, PayloadLen=0, Seq=1315818582, Ack=2250995254, Win=65535 ( Negotiated scale factor 0x0 ) = 65535 {TCP:861, IPv4:843}
5853 14:42:47 05.09.2012 34.0365076 10.0.9.149 CLIENTNAME.company.com TCP TCP: [Bad CheckSum]Flags=...A...., SrcPort=51763, DstPort=DCE endpoint resolution(135), PayloadLen=0, Seq=2250995254, Ack=1315818583, Win=258 (scale factor 0x8) = 66048 {TCP:861, IPv4:843}
- SCCM server binds to SCMActivator and activates WMI component:
5877 14:42:47 05.09.2012 34.0610846 10.0.9.149 CLIENTNAME.company.com MSRPC MSRPC:c/o Bind: IRemoteSCMActivator(DCOM) UUID{000001A0-0000-0000-C000-000000000046} Call=0x3 Assoc Grp=0xBB15 Xmit=0x16D0 Recv=0x16D0 {MSRPC:865, TCP:861, IPv4:843}
5880 14:42:47 05.09.2012 34.0642128 CLIENTNAME.company.com 10.0.9.149 TCP TCP:Flags=...A...., SrcPort=DCE endpoint resolution(135), DstPort=51763, PayloadLen=0, Seq=1315818583, Ack=2250996747, Win=65535 (scale factor 0x0) = 65535 {TCP:861, IPv4:843}
5882 14:42:47 05.09.2012 34.0748352 CLIENTNAME.company.com 10.0.9.149 MSRPC MSRPC:c/o Bind Ack: Call=0x3 Assoc Grp=0xBB15 Xmit=0x16D0 Recv=0x16D0 {MSRPC:865, TCP:861, IPv4:843}
5883 14:42:47 05.09.2012 34.0750212 10.0.9.149 CLIENTNAME.company.com MSRPC MSRPC:c/o Alter Cont: IRemoteSCMActivator(DCOM) UUID{000001A0-0000-0000-C000-000000000046} Call=0x3 {MSRPC:865, TCP:861, IPv4:843}
5884 14:42:47 05.09.2012 34.0785470 CLIENTNAME.company.com 10.0.9.149 MSRPC MSRPC:c/o Alter Cont Resp: Call=0x3 Assoc Grp=0xBB15 Xmit=0x16D0 Recv=0x16D0 {MSRPC:865, TCP:861, IPv4:843}
5885 14:42:47 05.09.2012 34.0786863 10.0.9.149 CLIENTNAME.company.com DCOM DCOM:RemoteCreateInstance Request, DCOM Version=5.7 Causality Id={FEEE1975-B61E-42EB-B500-939EA5EE4B2A} {MSRPC:865, TCP:861, IPv4:843}
Frame: Number = 5885, Captured Frame Length = 923, MediaType = ETHERNET
+ Ethernet: Etype = Internet IP (IPv4),DestinationAddress:[00-22-90-E3-B7-80],SourceAddress:[00-22-64-08-91-A6]
+ Ipv4: Src = 10.0.9.149, Dest = 10.102.0.230, Next Protocol = TCP, Packet ID = 639, Total IP Length = 909
+ Tcp: [Bad CheckSum]Flags=...AP..., SrcPort=51763, DstPort=DCE endpoint resolution(135), PayloadLen=869, Seq=2250996924 - 2250997793, Ack=1315818870, Win=257 (scale factor 0x8) = 65792
+ Msrpc: c/o Request: IRemoteSCMActivator(DCOM) {000001A0-0000-0000-C000-000000000046} Call=0x3 Opnum=0x4 Context=0x1 Hint=0x318
- DCOM: RemoteCreateInstance Request, DCOM Version=5.7 Causality Id={FEEE1975-B61E-42EB-B500-939EA5EE4B2A}
+ HeaderReq: DCOM Version=5.7 Causality Id={FEEE1975-B61E-42EB-B500-939EA5EE4B2A}
+ AggregationInterface: NULL
- ActivationProperties: OBJREFCUSTOM - {000001A2-0000-0000-C000-000000000046}
+ MInterfacePointerPtr: Pointer To 0x00020000
- Interface: OBJREFCUSTOM - {000001A2-0000-0000-C000-000000000046}
+ Size: 744 Elements
InterfaceSize: 744 (0x2E8)
Signature: 1464812877 (0x574F454D)
Flags: OBJREFCUSTOM - Represents a custom marshaled object reference
MarshaledInterfaceIID: {000001A2-0000-0000-C000-000000000046}
- Custom:
ClassId: {00000338-0000-0000-C000-000000000046}
ExtensionSize: 0 (0x0)
ObjectReferenceSize: 704 (0x2C0)
- ActivationProperties:
TotalSize: 688 (0x2B0)
Reserved: 0 (0x0)
+ CustomHeader:
- Properties: 6 Property Structures
+ Special:
- Instantiation:
+ Header:
InstantiatedObjectClsId: {8BC3F05E-D86B-11D0-A075-00C04FB68820} => This is WMI
ClassContext: 20 (0x14)
ActivationFlags: 2 (0x2)
FlagsSurrogate: 0 (0x0)
- Server responds with success and provides the endpoint information for WMI service:
5886 14:42:47 05.09.2012 34.0848992 CLIENTNAME.company.com 10.0.9.149 DCOM DCOM:RemoteCreateInstance Response, ORPCFLOCAL - Local call to this computer {MSRPC:865, TCP:861, IPv4:843}
- ScmReply:
+ Ptr: Pointer To NULL
+ RemoteReplyPtr: Pointer To 0x00106E98
- RemoteReply:
ObjectExporterId: 13300677357152346811 (0xB8957F961925A2BB)
+ OxidBindingsPtr: Pointer To 0x00102FF0
IRemUnknownInterfacePointerId: {0000B400-0580-0000-9A5E-C2357038B9DF}
AuthenticationHint: 4 (0x4)
+ Version: DCOM Version=5.7
- OxidBindings:
+ Size: 378 Elements
- Bindings:
WNumEntries: 378 (0x17A)
WSecurityOffsets: 263 (0x107)
- StringBindings:
TowerId: 15 (0xF)
NetworkAddress: \\\\CLIENTNAME[\\PIPE\\atsvc]
NetworkAddress: \\\\CLIENTNAME[\\PIPE\\wkssvc]
NetworkAddress: \\\\CLIENTNAME[\\pipe\\keysvc]
NetworkAddress: \\\\CLIENTNAME[\\PIPE\\srvsvc]
NetworkAddress: \\\\CLIENTNAME[\\pipe\\trkwks]
NetworkAddress: \\\\CLIENTNAME[\\PIPE\\W32TIME]
NetworkAddress: \\\\CLIENTNAME[\\PIPE\\ROUTER]
TowerId: 7 (0x7)
NetworkAddress: CLIENTNAME[1431]
NetworkAddress: 10.102.0.230[1431]
Terminator1: 0 (0x0)
+ SecurityBindings:
Terminator2: 0 (0x0)
- Since WMI listens on TCP 1431, SCCM server tries to connect to that endpoint to access WMI subsystem:
8980 14:43:08 05.09.2012 55.1014127 10.0.9.149 CLIENTNAME.company.com TCP TCP: [Bad CheckSum]Flags=......S., SrcPort=51785, DstPort=1431, PayloadLen=0, Seq=1764982397, Ack=0, Win=8192 ( Negotiating scale factor 0x8 ) = 8192 {TCP:1203, IPv4:843}
9390 14:43:11 05.09.2012 58.1101896 10.0.9.149 CLIENTNAME.company.com TCP TCP:[SynReTransmit #8980] [Bad CheckSum]Flags=......S., SrcPort=51785, DstPort=1431, PayloadLen=0, Seq=1764982397, Ack=0, Win=8192 ( Negotiating scale factor 0x8 ) = 8192 {TCP:1203, IPv4:843}
11236 14:43:17 05.09.2012 64.1163158 10.0.9.149 CLIENTNAME.company.com TCP TCP:[SynReTransmit #8980] [Bad CheckSum]Flags=......S., SrcPort=51785, DstPort=1431, PayloadLen=0, Seq=1764982397, Ack=0, Win=8192 ( Negotiating scale factor 0x8 ) = 8192 {TCP:1203, IPv4:843}
- But this TCP session setup fails, because the SCCM server doesn’t get a response to its TCP SYN requests.
- When we check the client side network trace, we cannot see any of those TCP SYNs sent by the SCCM server.
This is, most of the time, a hardware router/firewall filtering problem. After our customer made the necessary configuration changes on the firewall, SCCM client push installation started working properly.
Since WMI is assigned a random TCP port from the dynamic RPC port range at every startup, network/firewall administrators need to allow that range as well, in addition to allowing TCP 135 traffic towards the clients. Another alternative in this instance could be fixing the TCP port that the WMI subsystem obtains at each startup. You can see the below article for more information on this:
http://support.microsoft.com/kb/897571 FIX: A DCOM static TCP endpoint is ignored when you configure the endpoint for WMI on a Windows Server 2003-based computer
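Before (or after) changing firewall rules, you can quickly verify from the SCCM server’s side whether both TCP 135 and the dynamic WMI endpoint are reachable. A minimal sketch; the host and port values in the comments are placeholders taken from this example’s trace:

```python
import socket

def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP three-way handshake to host:port completes."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholders from the trace above: first the RPC endpoint mapper,
# then the dynamic port that WMI reported back in the OxidBindings.
# print(tcp_port_open("10.102.0.230", 135))
# print(tcp_port_open("10.102.0.230", 1431))
```

A True on 135 combined with a False on the dynamic port reproduces exactly the failure pattern seen in the traces above.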
In this post, I would like to talk about some important points about network capturing. If a network trace is not collected appropriately, it won’t provide any useful information, and analyzing such a trace will be a waste of time.
Additionally, just collecting the network trace isn’t sufficient. If you intend to ask for help analyzing that network trace, you also have to provide some information about the trace itself. I generally collaborate with other colleagues on network trace analysis, and I have a standard template of questions for when a colleague approaches me for assistance in analyzing a trace:
- What is the exact problem definition
- Which network traces were collected on which system
- The IP addresses of the relevant systems (like client/server/DC/DNS)
- OS versions for relevant systems
- Network topology between the source and target systems on which network traces were collected
- The exact date & time of the problem & error seen
- The exact error message seen
- What were the exact actions taken when collecting the network traces (in as much detail as possible)
Now let’s talk about some important points that you need to be aware of to be able to collect a usable network trace that will really help you troubleshoot a given problem.
1) First of all, we need to make sure that it really makes sense to collect a network trace for the problem at hand. You can check the previous blog post to get a better idea on this:
http://blogs.technet.com/b/nettracer/archive/2012/06/22/when-do-we-need-collect-network-traces.aspx
2) Especially in switched networks, when we collect a network trace from a given node (a client or server), only the following traffic will be seen by the capturing agent (like Network monitor/Wireshark/...) running on the node:
- Packets sent out by the node itself
- Packets sent to the node’s unicast address
- Packets sent to unknown unicast addresses (switch doesn’t have that MAC address at its MAC address table yet so it floods the frame everywhere)
- Packets sent to broadcast address
- Packets sent to multicast addresses
So we won’t be able to see the packets sent to/received from client2 in a network trace collected on client1. If you really have to see the packets sent to/received from a node other than the one on which the network trace is collected, you have to configure port mirroring (and your LAN switch has to support it as well). Most of the LAN switches used in enterprise networks support port mirroring. Below is a link describing such a configuration on Cisco LAN switches:
http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note09186a008015c612.shtml
Switched Port Analyzer (SPAN) Configuration Example
3) If you are troubleshooting a communication or performance problem between two processes running on the same node, that traffic won’t leave the machine, and hence won’t be captured by the capturing agent (Network Monitor/Wireshark); it is looped back by the TCP/IP stack. As an example, you won’t be able to see the network activity taking place between Internet Explorer and a web server running on the same machine. If you need to troubleshoot such a scenario, you might try to collect an ETL trace instead, but the node will have to be running Windows 7 or Windows Server 2008 R2 for that. Please see the following post for more details on collecting such an ETL trace:
http://blogs.technet.com/b/nettracer/archive/2010/10/06/how-it-works-under-the-hood-a-closer-look-at-tcpip-and-winsock-etl-tracing-on-windows-7-and-windows-2008-r2-with-an-example.aspx
4) When collecting a network trace from a busy server, a capture filter might be applied to minimize the amount of traffic captured. We generally prefer not to capture with a capture filter, because when such a filter is applied we take the risk of excluding some of the traffic that might be really relevant to the issue. If you’re really sure about what you have to check, then you may want to apply such a filter. You can find below an example of capturing with a filter with nmcap (the command line version of Network Monitor).
Note: The following is taken from nmcap /examples output:
This example starts capturing network frames that DO NOT contain ARPs, ICMP, NBtNs and BROWSER frames. If you want to stop capturing, Press Control+C.
nmcap /network * /capture (!ARP AND !ICMP AND !NBTNS AND !BROWSER) /File NoNoise.cap
5) If you really need to capture network traffic from a very busy server and you don’t want to take the risk of excluding network traffic that might be relevant, you might want to capture, let’s say, only the first 256 bytes of each packet. Considering that a standard Ethernet frame is about 1500 bytes, this provides a saving of roughly 80%. You can find below an nmcap example where only the first 256 bytes of each packet are captured:
nmcap /network * /MaxFrameLength 256
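The rough saving from this kind of truncation is easy to estimate; a quick sketch, assuming full-size ~1500 byte frames:

```python
def truncation_saving(snaplen, frame_size=1500):
    """Fraction of capture-file size saved by keeping only `snaplen` bytes per frame."""
    return 1 - snaplen / frame_size

print(round(truncation_saving(256) * 100))  # prints 83
```

In a real trace the mix of small control packets (ACKs, handshakes) and full-size data frames brings the overall saving closer to the ~80% figure above.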
6) If network traces will be collected for an extended period, capturing all packets in a single file will make the trace nearly impossible to analyze (example: a 5 GB network trace). To be able to collect manageable and analyzable network traces, it’s suggested to collect chained, fragmented trace files. You can find below an nmcap example again:
nmcap /network * /capture /file ServerTest.chn:200M
Note: nmcap will create a new capture file once the first one is full (200 MB), and so on. So please make sure that you have enough free disk space on the related drive.
Note: The traces created will be named as ServerTest.cap, ServerTest(1).cap, ServerTest(2).cap,...
7) If you have to collect network traces for an unspecified period of time and you would like to see some activity taking place some time before the problem, you may have to collect network traces in a circular fashion, which is possible with dumpcap (the command line capture tool of Wireshark). You can see an example below:
dumpcap -i 2 -w c:\traces\servername.pcap -b filesize:204800 -b files:80
Note 1: Interface id "2" will be monitored, and each capture file will be 204800 KB (200 MB).
Note 2: The command assumes that the c:\traces folder already exists. Also, please make sure that there's enough free space on that drive (C: in this instance); about 16 GB of free space is required to create and save 80 x 200 MB traces.
Note 3: Eighty different files will be created with the "servername_0000n_Date&time.pcap" naming syntax.
Example:
servername_00001_20120622134811.pcap
servername_00002_20120622135617.pcap
servername_00003_20120622141512.pcap
...
Note 4: When all eighty files have been created and are full, dumpcap will start overwriting, beginning with the oldest trace file.
Note 5: The trace can be stopped at any time by pressing Ctrl+C.
8) It’s important to mark network traces with pings, to be able to narrow down the time period that you need to focus on in the trace. For example, you can ping the default gateway of the client just before and right after reproducing the problem.
Example1:
<<Start network trace on the client>>
ping -l 22 -n 2 IP-address-of-default-gateway
<<Reproduce the problem now. Example: Try to connect to www.microsoft.com from IE and once you get the “page not found” run the second ping>>
ping -l 33 -n 2 IP-address-of-default-gateway
<<Stop network trace on the client>>
Example2:
ping -l 22 -n 5 IP-address-of-the-file-server
start > run > \\server\share
<<assuming that it takes 5+ seconds to open up the share content. Once the share content is listed, please run the below command>>
ping -l 33 -n 5 IP-address-of-the-file-server
<<Please write down the following information : the exact date&time of this test / how long it took to display the share content / exact \\server\share that you accessed >>
dir \\server\share
<<Please write down the following information : how long it took to display the share content when you used "dir" command>>
ping -l 44 -n 5 IP-address-of-the-file-server
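If you run this kind of marked test often, the marker pings can be scripted. A minimal sketch with a hypothetical helper that builds the Windows-style `ping` command; the repro step is still performed manually in between:

```python
import subprocess

def marker_ping(host, payload_size, count=2):
    """Build a Windows-style ping command whose payload size acts as a trace marker."""
    return ["ping", "-l", str(payload_size), "-n", str(count), host]

# Start marker, then reproduce the problem, then end marker:
# subprocess.run(marker_ping("IP-address-of-default-gateway", 22))
# ... reproduce the problem here ...
# subprocess.run(marker_ping("IP-address-of-default-gateway", 33))
```

Using a distinct payload size per test step (22, 33, 44, ...) keeps each marker uniquely identifiable in the trace.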
When you start analyzing a network trace collected in that fashion, you can easily focus on a certain range of packets in the trace. Example:
Packet1
Packet2
Packet3
Packet4
Packet5
<<22 bytes ICMP echo request>>
Packet6
Packet7
Packet8
Packet9
Packet10
<<33 bytes ICMP echo request>>
Packet11
Packet12
We know that the issue was reproduced between the 22-byte and 33-byte ping markers, so we can focus only on the activity taking place between packet #6 and packet #10. Considering it was a 50000 packet trace, you have now narrowed the problem down to 5 packets. (You may not always be that lucky :))
You might be wondering "how can I identify those 22 and 33 byte ICMP packets in the network trace?". Here's a trick that I generally use: I apply the following Wireshark display filters to the network trace:
ip.len==50 and icmp (to identify the 22 byte ping)
ip.len==61 and icmp (to identify the 33 byte ping)
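The filter values come from simple arithmetic: total IP length = 20-byte IPv4 header + 8-byte ICMP header + ping payload. A small sketch that derives the filter for any marker size (assumes IPv4 with no IP options):

```python
def icmp_marker_filter(payload_bytes):
    """Wireshark display filter matching an ICMP ping with the given payload size
    (20-byte IPv4 header + 8-byte ICMP header + payload)."""
    return f"ip.len=={20 + 8 + payload_bytes} and icmp"

print(icmp_marker_filter(22))  # prints: ip.len==50 and icmp
print(icmp_marker_filter(33))  # prints: ip.len==61 and icmp
```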
9) One of the most important points to take into consideration is collecting simultaneous network traces where possible. By “simultaneous network traces” I mean collecting a network trace on the source and on the target system at the same time. That may not always be possible, especially if one of those systems is not controlled by you (for example, when you’re troubleshooting a connectivity problem to a web site that belongs to another company).
Other than that, I cannot stress enough how important it is to collect simultaneous network traces. When troubleshooting network connectivity issues, without simultaneous traces you cannot conclude whether the target server received the packet, whether it sent a response back to the source, or whether the source received the response. Similarly, in network performance issues, you cannot conclude whether a response delay stems from the network path in between or from the target/source systems. Let me try to explain what I mean with a couple of examples:
We look at a client side network trace and see that the client sends 3 TCP SYN segments to the target without getting a response:
No. Time Delta Source Destination Protocol Info
141154 2011-03-31 16:52:29.488847 0.000000 192.168.4.71 10.1.1.1 TCP 37389 > 443 [SYN] Seq=0 Win=65535 Len=0
141158 2011-03-31 16:52:29.488847 0.000000 192.168.4.71 10.1.1.1 TCP 37389 > 443 [SYN] Seq=0 Win=65535 Len=0
144808 2011-03-31 16:52:29.801347 0.312500 192.168.4.71 10.1.1.1 TCP 37389 > 80 [SYN] Seq=0 Win=65535 Len=0
By looking at the client side trace, can you answer the following?
=> Did the target server really receive the above 3 TCP SYN segments?
=> Did the target server send a response back to the above TCP SYN segment?
=> Did the target server really send the response and we didn’t see it at the client side?
None of these questions can be answered from the client side trace alone. You cannot say whether the target server really received those TCP SYNs, whether it received them and sent a response back, or whether it sent no response at all. To answer those questions correctly, you have to see the story from the target server’s perspective, by looking at a network trace collected on that system.
We look at a client side network trace and see that HTTP response is sent by the HTTP server after 4 seconds:
Time Delta Source Destination Protocol Info
16:57:37.537895 0.000000 192.168.4.71 10.17.200.49 TCP 45221 > 80 [SYN] Seq=0 Win=65535 Len=0 MSS=1460 SACK_PERM=1
16:57:37.787895 0.250000 192.168.4.71 10.17.200.49 TCP 45221 > 80 [ACK] Seq=1 Ack=1 Win=65535 [TCP CHECKSUM INCORRECT]
16:57:37.787895 0.000000 10.17.200.49 192.168.4.71 TCP 80 > 45221 [SYN, ACK] Seq=0 Ack=1 Win=5840 Len=0 MSS=1380
16:57:37.787895 0.000000 192.168.4.71 10.17.200.49 HTTP GET /images/downloads/cartoons/thumb_1.jpg HTTP/1.1
16:57:38.053520 0.265625 10.17.200.49 192.168.4.71 TCP 80 > 45221 [ACK] Seq=1 Ack=356 Win=6432 Len=0
16:57:42.084770 4.031250 10.17.200.49 192.168.4.71 HTTP HTTP/1.1 200 OK (JPEG JFIF image)
16:57:42.084770 0.000000 10.17.200.49 192.168.4.71 HTTP Continuation or non-HTTP traffic
16:57:42.084770 0.000000 192.168.4.71 10.17.200.49 TCP 45221 > 80 [ACK] Seq=356 Ack=2761 Win=65535 [TCP CHECKSUM
16:57:42.350395 0.265625 10.17.200.49 192.168.4.71 HTTP Continuation or non-HTTP traffic
=> Does the 4 second delay come from the target server or a network device running in between?
=> Did the target server wait for 4 seconds before responding or did it immediately send a response back but we see it after 4 seconds at the client side?
Again, none of these questions can be answered from the client side trace alone. You cannot say whether that 4 second delay really comes from the target web server or from a network device (a web proxy, for example) running in between. To answer those questions correctly, you have to see the story from the target server’s perspective, by looking at a network trace collected on that system.
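The Delta column in these listings is just the difference between consecutive timestamps; when a trace tool doesn’t show it, you can compute it yourself. A sketch using the two timestamps that bracket the 4-second gap in the example above:

```python
from datetime import datetime

def delta_seconds(t1, t2, fmt="%H:%M:%S.%f"):
    """Seconds elapsed between two same-day trace timestamps."""
    return (datetime.strptime(t2, fmt) - datetime.strptime(t1, fmt)).total_seconds()

# ACK of the GET vs. the HTTP 200 response from the trace above:
print(delta_seconds("16:57:38.053520", "16:57:42.084770"))  # prints 4.03125
```

Comparing the same delta in the client side and server side traces is what tells you on which side of the wire the delay was introduced.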