nettracer

  • Cumulative update version information for Lync client and Lync server

    [Last updated on 27th February, 2014]

     

    Hi there,

     

    Below you can find a list of CU versions compiled from the KB articles for the cumulative update packages released for the Lync client and Lync Server. Please note that the list only includes version information for Lync 2010 (desktop edition), Lync Server 2010, Lync 2013 (desktop edition) and Lync Server 2013. It excludes interim updates, security updates and updates for non-desktop clients (WP8, iOS devices, Android, Lync Phone Edition).

     

    Lync 2010

    CU # Description
    CU1

    http://support.microsoft.com/?kbid=2467763

    Description of the cumulative update package for Lync 2010: January 2011

    (Version: 4.0.7577.108)

    CU2

    http://support.microsoft.com/?kbid=2496325

    Description of the cumulative update package for Lync 2010: April 2011

    (Version: 4.0.7577.253)

    CU3

    http://support.microsoft.com/kb/2551268

    Description of the cumulative update package for Lync 2010: May 2011

    (Version: 4.0.7577.280)

    CU4

    http://support.microsoft.com/?kbid=2571543

    Description of the cumulative update package for Lync 2010: July 2011

    (Version: 4.0.7577.314)

    CU5

    http://support.microsoft.com/?kbid=2514982

    Description of the cumulative update package for Lync 2010: November 2011

    (Version: 4.0.7577.4051)

    CU6

    http://support.microsoft.com/kb/2670326

    Description of the cumulative update package for Lync 2010: February 2012

    (Version: 4.0.7577.4072)

    CU7

    http://support.microsoft.com/kb/2701664

    Description of the cumulative update package for Lync 2010: June 2012

    (Version: 4.0.7577.4103)

    CU8

    http://support.microsoft.com/kb/2737155

    Description of the cumulative update package for Lync 2010: October 2012

    (Version: 4.0.7577.4356)

    CU9

    http://support.microsoft.com/kb/2791382

    Description of the cumulative update package for Lync 2010: March 2013

    (Version: 4.0.7577.4378)

    CU10

    http://support.microsoft.com/kb/2815347

    Description of the cumulative update package for Lync 2010: April 2013

    (Version: 4.0.7577.4384)

    CU11

    http://support.microsoft.com/kb/2842627

    Description of the cumulative update package for Lync 2010: July 2013

    (Version: 4.0.7577.4398)

    CU12

    http://support.microsoft.com/kb/2884632

    Description of the cumulative update package for Lync 2010: October 2013

    (Version: 4.0.7577.4409)

    CU13

    http://support.microsoft.com/kb/2912208

    Description of the cumulative update package for Lync 2010: January 2014

    (Version: 4.0.7577.4419)
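
    If you want to map an installed Lync 2010 client version to its CU level, the list above can be turned into a small lookup. The following is only a minimal sketch: the version numbers are copied from the list above, while the helper names (parse_version, cu_level) are made up for illustration and are not part of any Lync tool.

```python
# Lync 2010 client CU levels, copied from the list above (oldest first).
LYNC_2010_CLIENT_CUS = [
    ("CU1", "4.0.7577.108"),
    ("CU2", "4.0.7577.253"),
    ("CU3", "4.0.7577.280"),
    ("CU4", "4.0.7577.314"),
    ("CU5", "4.0.7577.4051"),
    ("CU6", "4.0.7577.4072"),
    ("CU7", "4.0.7577.4103"),
    ("CU8", "4.0.7577.4356"),
    ("CU9", "4.0.7577.4378"),
    ("CU10", "4.0.7577.4384"),
    ("CU11", "4.0.7577.4398"),
    ("CU12", "4.0.7577.4409"),
    ("CU13", "4.0.7577.4419"),
]


def parse_version(text):
    """Turn '4.0.7577.4103' into (4, 0, 7577, 4103) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def cu_level(installed_version, cu_table=LYNC_2010_CLIENT_CUS):
    """Return the highest CU whose version is <= the installed version, or None."""
    installed = parse_version(installed_version)
    level = None
    for name, version in cu_table:  # table is sorted oldest to newest
        if parse_version(version) <= installed:
            level = name
    return level


print(cu_level("4.0.7577.4103"))  # CU7
```

    Comparing the versions as tuples of integers (rather than as strings) matters here, because 4.0.7577.314 is older than 4.0.7577.4051 even though it sorts later as text.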

     

     

    Lync Server 2010

    CU# Description
    CU1

    http://support.microsoft.com/kb/2467775

    Description of the cumulative update for Lync Server 2010, Core Components: January 2011

    (Version: 4.0.7577.108)

    CU2

    http://support.microsoft.com/kb/2500442

    Description of the cumulative update for Lync Server 2010: April 2011

    (Version: 4.0.7577.137)

    CU3

    http://support.microsoft.com/kb/2571546

    Description of the cumulative update for Lync Server 2010: July 2011

    (Version: 4.0.7577.166)

    CU4

    http://support.microsoft.com/kb/2514980

    Description of the cumulative update for Lync Server 2010: November 2011

    (Version: 4.0.7577.183)

    CU5

    http://support.microsoft.com/kb/2670352

    Description of the cumulative update for Lync Server 2010: February 2012

    (Version: 4.0.7577.190)

    CU6

    http://support.microsoft.com/kb/2701585

    Description of the cumulative update for Lync Server 2010: June 2012

    (Version: 4.0.7577.199)

    CU7

    http://support.microsoft.com/kb/2737915

    Description of the cumulative update for Lync Server 2010: October 2012

    (Version: 4.0.7577.203)

    CU8

    http://support.microsoft.com/kb/2791381

    Description of the cumulative update for Lync Server 2010: March 2013 

    (Version: 4.0.7577.216)

    CU9

    http://support.microsoft.com/kb/2860700

    Description of the cumulative update for Lync Server 2010: July 2013

    (Version: 4.0.7577.217)

    CU10

    http://support.microsoft.com/kb/2889610

    Description of the cumulative update for Lync Server 2010: October 2013 

    (Version: 4.0.7577.223)

    CU11

    http://support.microsoft.com/kb/2909888

    Description of the cumulative update for Lync Server 2010: January 2014

    (Version: 4.0.7577.225)

      

    Lync 2013

    CU# Description
    CU1

    http://support.microsoft.com/kb/2812461

    Description of the Lync 2013 updates 15.0.4454.1509: February 2013

    CU2

    http://support.microsoft.com/kb/2817465

    MS13-054: Description of the security update for Lync 2013: July 9, 2013

    (Version: 15.0.4517.1004)

    CU3

    http://support.microsoft.com/kb/2825630

    Description of the Lync 2013 update 15.0.4551.1005: November 7, 2013

    CU4

    http://support.microsoft.com/kb/2850057

    MS13-096: Description of the security update for Lync 2013: December 10, 2013

    (Version: 15.0.4551.1007)

    CU5

    http://support.microsoft.com/kb/2817430

    Description of Microsoft Office 2013 Service Pack 1 (SP1)

    (Version: 15.0.4569.1503)

     

     

    Lync Server 2013  

    CU# Description
    CU1

    http://support.microsoft.com/kb/2781547

    Description of the cumulative update 5.0.8308.291 for Lync Server 2013: February 2013

    CU2

    http://support.microsoft.com/kb/2819565

    Description of the cumulative update 5.0.8308.420 for Lync Server 2013: July 2013

    CU3

    http://support.microsoft.com/kb/2881684

    Description of the cumulative update 5.0.8308.556 for Lync Server 2013 (Front End Server and Edge Server): October 2013

    CU4

    http://support.microsoft.com/kb/2905048

    Description of the cumulative update 5.0.8308.577 for Lync Server 2013 (Front End Server and Edge Server): January 2014

     

    Hope this helps

     

    Thanks,

    Murat 

  • Syn attack protection on Windows Vista, Windows 2008, Windows 7, Windows 2008 R2, Windows 8/8.1, Windows 2012 and Windows 2012 R2

    [Last updated: 13th January 2014]

     

    Hi,

    In this blog entry, I want to talk about some changes made to Syn attack protection on Windows Vista and later systems.

    Syn attack protection has been in place since Windows 2000 and has been enabled by default since Windows 2003/SP1. In the earlier implementation (Windows 2000/Windows 2003), the syn attack protection mechanism was configurable via various registry keys (like SynAttackProtect, TcpMaxHalfOpen, TcpMaxHalfOpenRetried, TcpMaxPortsExhausted). With this previous version of syn attack protection, the TCPIP stack starts dropping new connection requests when the threshold values are met, regardless of how much system memory or CPU power is available to the system. As of Windows Vista (Vista/2008/Win 7/2008 R2/Windows 8/Windows 2012/Windows 2012 R2), the syn attack protection algorithm has been changed in the following ways:

    1) SynAttack protection is enabled by default and cannot be disabled!
     
    2) SynAttack protection dynamically calculates the thresholds (at which it considers an attack to have started) based on the number of CPU cores and the amount of memory available, and hence it doesn’t expose any configurable parameters via registry, netsh etc.
     
    3) Since the TCPIP driver goes into attack state based on the number of CPU cores and the amount of memory available, systems with more resources will start dropping new connection attempts later than systems with fewer resources. On pre-Vista systems the thresholds were fixed (as per the configured registry settings) and the system was moved to attack state regardless of how many resources were available to it. The new algorithm eliminates the need for any fine tuning, and the TCPIP stack self-tunes to the best possible values depending on the available resources.

    One of the most frequently asked questions about TCP Syn attack protection is how an administrator can identify whether a server has moved into attack state. Currently, no event is logged on Vista and later systems when the system enters attack state and starts dropping TCP Syn packets. The only way of confirming that syn attack protection has kicked in is to collect an ETL trace (and you need to start it before the attack starts so that you can see the relevant TCPIP ETL entry).

    Run the following command from an elevated command prompt (Note: the "netsh trace" command only works on Windows 7/Windows 2008 R2 and later systems):

     

    netsh trace start capture=yes provider=Microsoft-Windows-TCPIP level=0x05 tracefile=TCPIP.etl

     

    Once Syn attack starts, the ETL trace could be stopped with the below command:

     

    netsh trace stop
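
    If you need to start and stop the trace from a script rather than by hand, the two commands above can be wrapped as follows. This is only a sketch: the netsh arguments are exactly the ones shown above, while the function names are made up for illustration. Nothing is executed unless you call run_elevated() yourself from an elevated prompt on Windows 7/Windows 2008 R2 or later.

```python
import subprocess


def trace_start_command(tracefile="TCPIP.etl"):
    """Build the 'netsh trace start' command line shown above."""
    return [
        "netsh", "trace", "start",
        "capture=yes",
        "provider=Microsoft-Windows-TCPIP",
        "level=0x05",
        f"tracefile={tracefile}",
    ]


def trace_stop_command():
    """Build the 'netsh trace stop' command line shown above."""
    return ["netsh", "trace", "stop"]


def run_elevated(command):
    """Run one of the commands; raises CalledProcessError if netsh fails."""
    return subprocess.run(command, check=True)


# Only print the command lines here; call run_elevated() to actually execute them.
print(" ".join(trace_start_command()))
print(" ".join(trace_stop_command()))
```

    Either way, the result is the ETL trace file named in the tracefile parameter.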

     

    Then you can open it up with Network Monitor 3.4. The ETL entry that you should be looking for is the below one:

    Hope this helps

    Thanks,
    Murat

  • Why doesn't IPReassemblytimeout registry key take effect on Windows 2000 or later systems?

    Hi,

    I had to deal with a number of support cases where the IPReassemblytimeout registry key was set but didn't take effect on Windows 2003 or later systems, so I thought I should share more information about this here. Here are some details:

    IP fragmentation is needed when an upper layer packet whose payload is bigger than the IP MTU needs to be sent to a destination. This could happen when the packet initially leaves the host, or when a router needs to forward a packet that it received through one interface (with a bigger MTU) to another interface (with a smaller MTU). It also happens when packets need to traverse VPN links, where the additional VPN-related overhead causes the original packet to be fragmented.

    The final receiver of the fragmented IP packets re-assembles those fragments and forms the original packet before passing it to upper layer protocols. The receiver waits for a period called the "re-assembly timeout" for all the fragments that belong to the original packet to be received. If any one of the fragments is dropped on the way, the receiver discards the other fragments belonging to the original packet when that timeout expires.
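
    To make the fragmentation step more concrete, here is a minimal sketch of how an IPv4 payload is split for a given MTU. It assumes a 20-byte IP header without options, and the function name is made up for illustration; it is not how any particular TCPIP stack implements it.

```python
def fragment_ipv4_payload(payload_len, mtu, ip_header_len=20):
    """Return one (offset_in_8_byte_units, data_len, more_fragments) tuple per fragment."""
    # Every fragment except the last must carry a multiple of 8 data bytes,
    # because the fragment offset field in the IP header counts 8-byte units.
    max_data = (mtu - ip_header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU is too small to carry any data")
    fragments = []
    offset = 0
    while offset < payload_len:
        data_len = min(max_data, payload_len - offset)
        more_fragments = offset + data_len < payload_len
        fragments.append((offset // 8, data_len, more_fragments))
        offset += data_len
    return fragments


# A 4000-byte IP payload sent over a link with a 1500-byte MTU:
for fragment in fragment_ipv4_payload(4000, 1500):
    print(fragment)
# (0, 1480, True)
# (185, 1480, True)
# (370, 1040, False)
```

    The receiver needs all three fragments to rebuild the original packet; if, say, the second one never arrives, the other two are held until the re-assembly timeout expires and are then discarded.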

    In NT 3.1, there was a registry key called "IPReassemblytimeout" (which is referred to by KB 102973). But that registry key doesn't apply to Windows 2000, XP, 2003, Vista, 2008, Windows 7 or 2008 R2!

    Some more facts:

    1) IP Re-assembly timeout is hardcoded on Windows 2000, Windows 2003, Windows XP, Windows Vista, Windows 2008, Windows 7 and Windows 2008 R2 and cannot be changed by any means (registry, netsh etc)

    2) For Windows Vista, Windows 2008, Windows 7 and Windows 2008 R2, it's hardcoded to 60 seconds as per section 4.5 of RFC 2460:

    "If insufficient fragments are received to complete reassembly of a packet within 60 seconds of the reception of the first-arriving fragment of that packet, reassembly of that packet must be abandoned and all the fragments that have been received for that packet must be discarded."

    For more information, please see RFC2460 (http://www.ietf.org/rfc/rfc2460.txt)

    3) For Windows 2000, Windows XP and Windows 2003, it's at least 60 seconds but it may be higher depending on the value of TTL in the IP header and it may go up to 120 seconds.

    Hope this helps..

    Thanks,
    Murat

  • Do you still set EnablePMTUDiscovery to 0?

    Hi,

    In this blog post, I would like to talk about a misconfiguration which is still in place on many customer installations. I have dealt with many network performance issues where the problem stemmed from using a small MTU size (576 bytes) when communicating with off-subnet hosts.

    The PMTU discovery option helps communicating endpoints find the optimal MTU for a TCP session. If this feature is turned off, the MTU is set to 576 bytes for all communication with off-subnet hosts. This might badly impact performance while communicating with remote hosts.
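
    To get a feeling for the cost of the 576-byte MTU, here is a rough back-of-the-envelope sketch. It assumes 20-byte IP and TCP headers without options and compares against a 1500-byte MTU (a typical Ethernet value, used here only as an assumption); it ignores ACKs, retransmissions and everything else that happens in a real session.

```python
import math

IP_HEADER = 20   # bytes, IPv4 header without options
TCP_HEADER = 20  # bytes, TCP header without options


def segments_needed(payload_bytes, mtu):
    """Number of full-size TCP segments needed to carry payload_bytes."""
    mss = mtu - IP_HEADER - TCP_HEADER
    return math.ceil(payload_bytes / mss)


def header_overhead(payload_bytes, mtu):
    """Total IP + TCP header bytes sent alongside payload_bytes."""
    return segments_needed(payload_bytes, mtu) * (IP_HEADER + TCP_HEADER)


for mtu in (576, 1500):
    print(mtu, segments_needed(1_000_000, mtu), header_overhead(1_000_000, mtu))
# 576 1866 74640
# 1500 685 27400
```

    In this simplified model, sending 1,000,000 bytes with a 576-byte MTU takes almost three times as many packets as with a 1500-byte MTU, and each of those packets has to be processed by both endpoints and every device in between.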

    By default PMTU Discovery is enabled (EnablePMTUDiscovery is set to 1) but, due to some older security recommendations, it is often set to 0 as part of server hardening. The reason behind setting that registry key to 0 was to prevent an attacker from forcing Windows to use very small MTU values in order to degrade performance.

    But that security recommendation is no longer valid as of MS05-019. After that security update, an attacker cannot set the MTU size lower than 576 bytes even if PMTU Discovery is enabled. So it shouldn't be set to 0 for security reasons as part of server hardening: doing so causes a performance loss without addressing any remaining security concern around small MTU usage.

    You can find more information about the changed behavior at the below article:

    http://www.microsoft.com/technet/security/bulletin/MS05-019.mspx
    Vulnerabilities in TCP/IP Could Allow Remote Code Execution and Denial of Service (893066)

    (From General Information > Vulnerability Details > ICMP Path MTU Vulnerability > Faq for ICMP Path MTU Vulnerability at the above link)

    What is Path MTU Discovery?
    Path maximum transmission unit (PMTU) discovery is the process of discovering the maximum size of packet that can be sent across the network between two hosts without fragmentation (that is, without the packet being broken into multiple frames during transmission). It is described in RFC 1191. For more information, see RFC 1191. For additional information, see the following MSDN Web site.

    What is wrong with the Path MTU Discovery process?
    Path maximum transmission unit (PMTU) discovery allows an attacker to specify a value that can degrade network performance for other connections. On unsecured networks, allowing PMTU discovery carries the risk that an attacker might force the MTU to a very small value and overwork the local system's TCP/IP stack. Normally this behavior would be restricted to the single connection that an attacker could establish. However, this vulnerability allows an attacker to modify the MTU value on other connections beyond their own connection to the affected system.

    What does the update do?
    The update removes the vulnerability by restricting the minimum value of the MTU to 576 bytes. This update also modifies the way that the affected operating systems validate ICMP requests.

    Thanks,
    Murat

  • Why doesn't Windows 2008 server negotiate TCP MSS smaller than 536 bytes?

    Hi,

    In today's blog, I'll talk about an MTU issue that occurs on Windows Vista and later systems (Vista/7/2008/2008 R2).

    One of our customers reported that their SMTP server (running on Windows 2008) was failing to send e-mails to certain remote SMTP servers because e-mail delivery was disrupted at the transport layer.

    After analyzing the network trace collected on the source Windows 2008 Server, we found out that the remote system was offering a TCP MSS size of 512 bytes while the Windows 2008 server kept sending data packets with an MSS size of 536 bytes. As a result, those packets weren't successfully delivered to the remote system. You can find more details about the problem and root cause below:


    Note: IP addresses and mail server names are deliberately changed.

    Source SMTP server: 10.1.1.1 - mailgateway.contoso.com
    Target SMTP server: 10.1.1.5 - mailgateway2.contoso.com


    a) Source SMTP server establishes TCP 3-way handshake with the target SMTP server. Source server suggests an MSS size of 1460 bytes and the target suggests an MSS size of 512 bytes:


    No.     Time        Source        Destination    Protocol   Info
          1 0.000000    10.1.1.1       10.1.1.5        TCP      28474 > 25 [SYN] Seq=0 Win=8192 Len=0 MSS=1460 WS=8
          2 0.022001    10.1.1.5       10.1.1.1        TCP      25 > 28474 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=512
          3 0.000000    10.1.1.1       10.1.1.5        TCP      28474 > 25 [ACK] Seq=1 Ack=1 Win=65392 Len=0


    b) Then data starts flowing. Under normal circumstances, the minimum of MSS will be selected as the MSS of the given TCP session by both parties and it will be used throughout the session.


    No.     Time        Source        Destination    Protocol   Info
          4 0.075005    10.1.1.5       10.1.1.1        SMTP     S: 220 mailgateway2.contoso.com ESMTP Tue, 20 Apr 2010 15:18:42 +0200
          5 0.001000    10.1.1.1       10.1.1.5        SMTP     C: EHLO Mailgateway.contoso.com
          6 0.021001    10.1.1.5       10.1.1.1        SMTP     S: 250-mailgateway2.contoso.com Hello Mailgateway.contoso.com [10.1.1.1] | 250-SIZE 26214400 | 250-PIPELINING | 250 HELP
          7 0.001000    10.1.1.1       10.1.1.5        SMTP     C: MAIL FROM:<postmaster@contoso.com> SIZE=2616 | RCPT TO:<test@test.abc.com>
          8 0.183011    10.1.1.5       10.1.1.1        SMTP     S: 250 OK | 250 Accepted
          9 0.000000    10.1.1.1       10.1.1.5        SMTP     C: DATA
         10 0.022001    10.1.1.5       10.1.1.1        SMTP     S: 354 Enter message, ending with "." on a line by itself

    c) Even though an MSS size of 512 should be commonly agreed by both parties, Windows 2008 server doesn't seem to be using that value and keeps sending data with an MSS of 536 bytes:

    No.     Time        Source        Destination    Protocol   Info
         11 0.294017    10.1.1.1       10.1.1.5        SMTP     C: Message Body, 536 bytes

    d) Most likely the TCP segment with 536 bytes of data doesn't arrive at the target server and we don't get a TCP ACK back as a result so we start TCP packet retransmissions:

    No.     Time        Source        Destination    Protocol   Info
         12 0.600034    10.1.1.1       10.1.1.5        SMTP     [TCP Retransmission] C: Message Body, 536 bytes
         13 0.190011    10.1.1.5       10.1.1.1        SMTP     [TCP Retransmission] S: 354 Enter message, ending with "." on a line by itself
         14 0.000000    10.1.1.1       10.1.1.5        TCP      [TCP Dup ACK 12#1] 28474 > 25 [ACK] Seq=649 Ack=269 Win=65124 Len=0
         15 1.010058    10.1.1.1       10.1.1.5        SMTP     [TCP Retransmission] C: Message Body, 536 bytes
         16 2.400137    10.1.1.1       10.1.1.5        SMTP     [TCP Retransmission] C: Message Body, 536 bytes
         17 4.800274    10.1.1.1       10.1.1.5        SMTP     [TCP Retransmission] C: Message Body, 536 bytes

    e) Finally the source server closes the TCP session as it fails to successfully deliver the 536 bytes TCP segment to the target system:

    No.     Time        Source        Destination    Protocol   Info
         18 9.600550    10.1.1.1       10.1.1.5        TCP      28474 > 25 [RST, ACK] Seq=649 Ack=269 Win=0 Len=0


    The same problem doesn't happen if the source server is a Windows 2003 server.

    After explaining the problem, now let's try to understand the root cause:

    This issue stems from the fact that Windows Vista and later systems don't accept an MTU size lower than 576 bytes:


    TCP/IP Registry Values for Microsoft Windows Vista and Windows Server 2008
    http://www.microsoft.com/downloads/details.aspx?familyid=12AC9780-17B5-480C-AEF7-5C0BDE9060B0&displaylang=en

    MTU
    Key:  Tcpip\Parameters\Interfaces\interfaceGUID
    Value Type: REG_DWORD—number
    Valid Range: From 576 to the MTU of the underlying network
    Default: 0xFFFFFFFF
    Description: This value overrides the default Maximum Transmission Unit (MTU) for a network interface. The MTU is the maximum IP packet size, in bytes, that can be transmitted over the underlying network. For values larger than the default for the underlying network, the network default MTU is used. For values smaller than 576, the MTU of 576 is used. This setting only applies to IPv4.
    Note: Windows Vista TCP/IP uses path MTU (PMTU) detection by default and queries the network adapter driver to find out what local MTU is supported. Altering the MTU value is typically not necessary and might result in reduced performance.


    Since the minimum MTU that can be used by a Windows Vista or later system is 576 bytes, the TCP MSS (maximum segment size) is 536 bytes at minimum, which is why the Windows 2008 source server tries to send TCP segments with 536 bytes of data. The TCP MSS value is calculated as follows:

    TCP MSS = IP MTU - IP header size (20 bytes by default) - TCP header size (20 bytes by default)
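
    The arithmetic for the trace above can be sketched as follows. The header sizes and the 576-byte lower bound come from the text above; the function names are made up for illustration, and the last function is only a simplified model of the behavior seen in the trace, not the actual TCPIP stack logic.

```python
IP_HEADER = 20       # bytes, default IPv4 header size
TCP_HEADER = 20      # bytes, default TCP header size
MIN_IPV4_MTU = 576   # lowest MTU accepted by Windows Vista and later


def mss_from_mtu(mtu):
    """TCP MSS = IP MTU - IP header size - TCP header size."""
    return mtu - IP_HEADER - TCP_HEADER


def expected_session_mss(local_mss, peer_mss):
    """Under normal circumstances the smaller of the two offered MSS values is used."""
    return min(local_mss, peer_mss)


def observed_session_mss(local_mss, peer_mss):
    """Model of the trace: the sender never goes below the MSS implied by a 576-byte MTU."""
    return max(expected_session_mss(local_mss, peer_mss), mss_from_mtu(MIN_IPV4_MTU))


print(mss_from_mtu(1500))               # 1460, the MSS offered in the SYN
print(expected_session_mss(1460, 512))  # 512, what the target server expects
print(observed_session_mss(1460, 512))  # 536, what Windows 2008 actually sends
```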


    Hope this helps

    Thanks,
    Murat

  • Bogus IP packets and Wireshark

    Hi there,

     

    In today’s blog post, I’m going to talk about an issue that I have come across several times while analyzing network traces with Wireshark. Let’s take the following example:

     

    I apply the following filter on a network trace:

     

    ip.addr==192.168.100.23 and ip.addr==192.168.121.51 and tcp.port==3268 and tcp.port==8081

     

    And I get the following traffic:

     

    No.     Time        Source                Destination           Protocol Info

       8773 17.458870   192.168.100.23        192.168.121.51        TCP      3268 > 8081 [SYN] Seq=0 Win=65535 Len=0 MSS=1460

       8774 17.458988   192.168.121.51        192.168.100.23        TCP      8081 > 3268 [SYN, ACK] Seq=0 Ack=1 Win=8192 [TCP CHECKSUM INCORRECT] Len=0 MSS=1460

       8775 17.459239   192.168.100.23        192.168.121.51        TCP      3268 > 8081 [ACK] Seq=1 Ack=1 Win=65535 Len=0

       8776 17.459239   192.168.100.23        192.168.121.51        TCP      3268 > 8081 [PSH, ACK] Seq=1 Ack=1 Win=65535 Len=264

       8850 17.658922   192.168.121.51        192.168.100.23        TCP      8081 > 3268 [ACK] Seq=1 Ack=265 Win=64240 [TCP CHECKSUM INCORRECT] Len=0

       8851 17.659108   192.168.100.23        192.168.121.51        TCP      3268 > 8081 [PSH, ACK] Seq=265 Ack=1 Win=65535 Len=21

       8853 17.661356   192.168.100.23        192.168.121.51        TCP      [TCP ACKed lost segment] 3268 > 8081 [ACK] Seq=286 Ack=2581 Win=65535 Len=0

       8854 17.661404   192.168.121.51        192.168.100.23        TCP      8081 > 3268 [FIN, ACK] Seq=2581 Ack=286 Win=64219 [TCP CHECKSUM INCORRECT] Len=0

       8855 17.661605   192.168.100.23        192.168.121.51        TCP      3268 > 8081 [ACK] Seq=286 Ack=2582 Win=65535 Len=0

       8859 17.665981   192.168.100.23        192.168.121.51        TCP      3268 > 8081 [FIN, ACK] Seq=286 Ack=2582 Win=65535 Len=0

       8860 17.666013   192.168.121.51        192.168.100.23        TCP      8081 > 3268 [ACK] Seq=2582 Ack=287 Win=64219 [TCP CHECKSUM INCORRECT] Len=0

     

    When I take a closer look, I see that a TCP segment is missing from the list of packets and hence the next frame is displayed with a [TCP ACKed lost segment] comment by Wireshark. Interestingly if I apply the following filter, I can see the frame that’s missing from the TCP conversation:

     

    ip.addr==192.168.100.23 and ip.addr==192.168.121.51

     

    No.     Time        Source                Destination           Protocol Info

       8852 17.661030   HewlettP_12:34:56     Cisco_12:34:56        IP       Bogus IP length (0, less than header length 20)

     

    Frame 8852 (2634 bytes on wire, 2634 bytes captured)

    Ethernet II, Src: HewlettP_12:34:56 (00:17:a4:12:34:56), Dst: Cisco_12:34:56 (00:15:2c:12:34:56)

        Destination: Cisco_12:34:56 (00:15:2c:12:34:56)

        Source: HewlettP_12:34:56 (00:17:a4:12:34:56)

        Type: IP (0x0800)

    Internet Protocol

        Version: 4

        Header length: 20 bytes

        Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00)

            0000 00.. = Differentiated Services Codepoint: Default (0x00)

            .... ..0. = ECN-Capable Transport (ECT): 0

            .... ...0 = ECN-CE: 0

        Total length: 0 bytes (bogus, less than header length 20)

     

    0000  00 15 2c 31 48 00 00 17 a4 77 00 24 08 00 45 00   ..,1H....w.$..E.

    0010  00 00 57 d0 40 00 80 06 00 00 c0 a8 79 33 c0 a8   ..W.@.......y3..

    0020  64 17 1f 91 0c c4 52 83 a3 f2 a2 a2 06 be 50 18   d.....R.......P.

    0030  fa db 5e a2 00 00 48 54 54 50 2f 31 2e 31 20 32   ..^...HTTP/1.1 2

    0040  30 30 20 4f 4b 0d 0a 50 72 61 67 6d 61 3a 20 6e   00 OK..Pragma: n

    0050  6f 2d 63 61 63 68 65 0d 0a 43 6f 6e 74 65 6e 74   o-cache..Content

    0060  2d 54 79 70 65 3a 20 74 65 78 74 2f 68 74 6d 6c   -Type: text/html

    0070  3b 63 68 61 72 73 65 74 3d 75 74 66 2d 38 0d 0a   ;charset=utf-8..

    0080  53 65 72 76 65 72 3a 20 4d 69 63 72 6f 73 6f 66   Server: Microsof

    0090  74 2d 49 49 53 2f 37 2e 35 0d 0a 58 2d 50 6f 77   t-IIS/7.5..X-Pow

    ...

     

    Even though the total length field is set to 0, I see that the IP packet has some payload (probably including a TCP header).
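
    You can verify this by decoding the header bytes by hand. The sketch below parses the first 54 bytes of the hex dump above (Ethernet, IPv4 and TCP headers) with the Python standard library; the function names are made up for illustration. The 2634-byte frame size is taken from the frame details above.

```python
import socket
import struct

# First 54 bytes of frame 8852, copied from the hex dump above.
FRAME_START = bytes.fromhex(
    "00152c314800" "0017a4770024" "0800"             # Ethernet: dst, src, type
    "4500" "0000" "57d0" "4000" "80" "06" "0000"     # IPv4: ver/IHL .. checksum
    "c0a87933" "c0a86417"                            # IPv4: source, destination
    "1f91" "0cc4" "5283a3f2" "a2a206be"              # TCP: ports, seq, ack
    "5018" "fadb" "5ea2" "0000"                      # TCP: flags, win, csum, urg
)
FRAME_LENGTH_ON_WIRE = 2634   # "2634 bytes on wire" in the frame details
ETHERNET_HEADER = 14


def parse_ipv4_header(frame, offset=ETHERNET_HEADER):
    """Decode the fixed 20-byte part of the IPv4 header."""
    (ver_ihl, _tos, total_length, _ident, _flags_frag, _ttl, protocol,
     _checksum, src, dst) = struct.unpack_from("!BBHHHBBH4s4s", frame, offset)
    return {
        "version": ver_ihl >> 4,
        "header_length": (ver_ihl & 0x0F) * 4,
        "total_length": total_length,
        "protocol": protocol,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


def parse_tcp_ports(frame, ip_header_length, offset=ETHERNET_HEADER):
    """Decode the TCP source and destination ports that follow the IP header."""
    return struct.unpack_from("!HH", frame, offset + ip_header_length)


ip = parse_ipv4_header(FRAME_START)
ports = parse_tcp_ports(FRAME_START, ip["header_length"])
implied_ip_length = FRAME_LENGTH_ON_WIRE - ETHERNET_HEADER

print(ip["total_length"])        # 0
print(ip["src"], ip["dst"])      # 192.168.121.51 192.168.100.23
print(ports)                     # (8081, 3268)
print(implied_ip_length)         # 2620
```

    So the frame does belong to the filtered TCP session (8081 > 3268), and the real IP length implied by the frame size is 2620 bytes, that is 20 bytes of IP header, 20 bytes of TCP header and 2580 bytes of data.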

     

    The problem occurs because Wireshark doesn’t fully parse the IP and TCP headers when the total length field in the IP header is 0. This also explains why we don’t see the same packet when the TCP filter is applied.

     

    After some testing, I realized that this issue could be fixed by setting the following value in Wireshark settings:

     

      

    After I enabled “Support packet-capture from IP TSO-enabled hardware”, Wireshark started to display the frames correctly even when the TCP session filter was applied:

     

    ip.addr==192.168.100.23 and ip.addr==192.168.121.51 and tcp.port==3268 and tcp.port==8081

     

    No.     Time        Source                Destination           Protocol Info

       8771 17.458870   192.168.100.23        192.168.121.51        TCP      3268 > 8081 [SYN] Seq=0 Win=65535 Len=0 MSS=1460 SACK_PERM=1

       8772 17.458988   192.168.121.51        192.168.100.23        TCP      8081 > 3268 [SYN, ACK] Seq=0 Ack=1 Win=8192 [TCP CHECKSUM INCORRECT] Len=0 MSS=1460 SACK_PERM=1

       8773 17.459239   192.168.100.23        192.168.121.51        TCP      3268 > 8081 [ACK] Seq=1 Ack=1 Win=65535 Len=0

       8774 17.459239   192.168.100.23        192.168.121.51        TCP      3268 > 8081 [PSH, ACK] Seq=1 Ack=1 Win=65535 Len=264

       8848 17.658922   192.168.121.51        192.168.100.23        TCP      8081 > 3268 [ACK] Seq=1 Ack=265 Win=64240 [TCP CHECKSUM INCORRECT] Len=0

       8849 17.659108   192.168.100.23        192.168.121.51        TCP      3268 > 8081 [PSH, ACK] Seq=265 Ack=1 Win=65535 Len=21

       8850 17.661030   192.168.121.51        192.168.100.23        TCP      8081 > 3268 [PSH, ACK] Seq=1 Ack=286 Win=64219 [TCP CHECKSUM INCORRECT] Len=2580

       8851 17.661356   192.168.100.23        192.168.121.51        TCP      3268 > 8081 [ACK] Seq=286 Ack=2581 Win=65535 Len=0

       8852 17.661404   192.168.121.51        192.168.100.23        TCP      8081 > 3268 [FIN, ACK] Seq=2581 Ack=286 Win=64219 [TCP CHECKSUM INCORRECT] Len=0

       8853 17.661605   192.168.100.23        192.168.121.51        TCP      3268 > 8081 [ACK] Seq=286 Ack=2582 Win=65535 Len=0

       8857 17.665981   192.168.100.23        192.168.121.51        TCP      3268 > 8081 [FIN, ACK] Seq=286 Ack=2582 Win=65535 Len=0

       8858 17.666013   192.168.121.51        192.168.100.23        TCP      8081 > 3268 [ACK] Seq=2582 Ack=287 Win=64219 [TCP CHECKSUM INCORRECT] Len=0

     

    When TSO (TCP segmentation offloading) is in place, the TCPIP stack doesn’t deal with segmentation at the TCP layer and leaves it to the NIC driver for efficiency purposes. Since Wireshark sees the packet before the NIC does, we see the total length as 0 in the packet, but once that packet is segmented by the NIC, the correct length field will be set in each resulting packet. This can also be confirmed by collecting a network trace at the other end of the session.

     

    Note: Network Monitor already takes that into account and hence you don’t need to take any corrective action if you’re checking the trace with it.

     

    Hope this helps

     

    Thanks,

    Murat

     

     

  • TMG initiates active FTP connections to external servers even though it's configured for passive FTP - a problem with FTP over HTTP

    Hi there,

     

    In this blog post, I’ll be talking about another TMG problem where FTP over HTTP was failing through TMG server.

    Let me first summarize the scenario:

    - Internet Explorer clients need to connect to an external FTP site through TMG server

    - Due to some other requirements, this FTP site needs to be accessed passively

    FTP filter in TMG server already uses passive FTP when connecting to external FTP sites:

    (Note: This is the default behavior. Please see http://blogs.technet.com/b/yuridiogenes/archive/2010/03/16/error-502-active-ftp-not-allowed-when-trying-to-list-files-in-a-ftp-session-behind-forefront-tmg-2010.aspx for more information.)

     

    That was also the case in my customer’s scenario, but the passive FTP connection to the target FTP server was still failing. After some troubleshooting, we found out that the TMG server was trying to connect to the target FTP site actively even though the FTP filter was configured as above.

     

    Normally, when you type ftp://target-FTP-Server-FQDN in the IE address bar and IE is configured to use a Proxy server, the connection request will be sent as an HTTP request to the Proxy server (and the FTP GET request will be inside that HTTP request). This is also called FTP over HTTP. So the request flow will be similar to the one below:

     

    a) Client sends the request via FTP over HTTP to the Proxy server

    b) Proxy server connects to the target FTP server via the FTP protocol

     

    After some further troubleshooting with the TMG data packager and network trace analysis, I found out that the FTP filter isn’t involved when the Proxy server receives FTP over HTTP traffic from clients, and hence the FTP filter setting doesn’t apply to FTP over HTTP requests.

     

    The resolution was to set the NonPassiveFTPTransfer registry key on the TMG server and restart the firewall service:

     

    Note: You can find more information about that registry key at http://support.microsoft.com/kb/300641 How to enable passive CERN FTP connections through ISA Server 2000, 2004, or 2006

     

    As mentioned above, after the registry key is created, you’ll need to stop and then start firewall service from an elevated command prompt:

     

    net stop fwsrv

    net start fwsrv

     

    To summarize; even though “NonPassiveFTPTransfer” registry key shouldn’t be needed for TMG server, the exact requirements are as follows:

     

    a) If the internal client sends the FTP request directly through the FTP protocol, there’s no need to change anything on the TMG server side, as the FTP filter will kick in and the FTP connection to the external FTP server will be initiated passively (Examples: command prompt FTP client, 3rd party FTP client applications, IE which isn’t configured to use a Proxy server etc.)

     

    b) If the internal client sends the FTP request through FTP over HTTP, then the changes mentioned above need to be implemented on the TMG server side in order for the TMG server to initiate the outbound FTP connection passively (Example: IE which is configured to use a Proxy server)

     

    Hope this helps

     

    Thanks,

    Murat

     

  • Do we support "Policy based routing" on Windows Server operating systems?

    Hi there,

    In one of the past cases, one of our customers wanted to know if we supported policy based routing on Windows 2003 or later OSes. First of all, it might be useful to clarify what "policy based routing" means in this context. Let's take the following as an example:

    "A server is running as a router and have 3 network interfaces. When the server receives a packet from a specific host (let’s say running at a certain IP address) from one of its interfaces (say interface1), we would like the server to always route that host’s packets through interface2 without consulting the routing table. (The criteria might be different for different scenarios such as "all packets with a destination of TCP port 80 to be sent out from interface3 etc)"

    These kinds of advanced routing decisions are generally supported by network hardware vendors like Cisco. For example, by using a route-map configuration in Cisco IOS, you can override the conventional routing decisions made by looking up the routing table. You can find more information on that at the following link:

    http://www.cisco.com/en/US/docs/ios/12_0/qos/configuration/guide/qcpolicy.html Configuring Policy-Based Routing
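
    To make the idea a bit more concrete, here is a minimal Python sketch of the decision logic described in the example above. It is only an illustration: the interface names, addresses, policies and function names are all hypothetical, and it does not reflect how Cisco IOS (or any other product) actually implements route-maps. Policies are checked first, and the conventional longest-prefix lookup is used only when no policy matches.

```python
import ipaddress

# Hypothetical routing table: (destination prefix, outgoing interface).
ROUTING_TABLE = [
    (ipaddress.ip_network("0.0.0.0/0"), "interface1"),
    (ipaddress.ip_network("10.20.0.0/16"), "interface3"),
]

# Hypothetical policies, evaluated before the routing table.
# A field set to None means "match anything".
POLICIES = [
    {"src": ipaddress.ip_address("192.168.1.50"), "dst_port": None, "out": "interface2"},
    {"src": None, "dst_port": 80, "out": "interface3"},
]


def route_lookup(dst):
    """Conventional routing: longest-prefix match on the destination only."""
    dst = ipaddress.ip_address(dst)
    matches = [(net, ifc) for net, ifc in ROUTING_TABLE if dst in net]
    _, interface = max(matches, key=lambda m: m[0].prefixlen)
    return interface


def select_interface(src, dst, dst_port):
    """Policy based routing: the first matching policy wins, else fall back."""
    src = ipaddress.ip_address(src)
    for policy in POLICIES:
        if policy["src"] is not None and policy["src"] != src:
            continue
        if policy["dst_port"] is not None and policy["dst_port"] != dst_port:
            continue
        return policy["out"]
    return route_lookup(dst)
```

    With these made-up tables, a packet from 192.168.1.50 always leaves through interface2 regardless of its destination, while other hosts are only redirected when they send to TCP port 80.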


    And the answer to the original question is: No, we don't support policy based routing on Windows server OSes since this is generally a feature that would be needed on hardware routers whose main purpose is to do packet routing.

    Hope this helps

    Thanks,
    Murat

  • Running Lync 2013 WebApp plugin in locked down Terminal server environments

    Hi there,

     

    In this blog post, I would like to talk about running Lync 2013 Web App in Windows Terminal Server environments. The Lync 2013 Web App feature has a client-side plug-in which provides audio/video/application sharing functionality, and this plug-in is installed per user; in other words, the installation program installs files and creates registry settings in user-specific areas of the system. Most terminal server environments in production networks are locked down, and users are generally not allowed to install software.

    I recently dealt with a couple of cases where we had to find a solution to this problem. One possible solution is to create exceptions in your software restriction product (it could be a 3rd party product or a Microsoft solution such as Software Restriction Policies or AppLocker). You will find steps below to create such exceptions in Software Restriction Policies and AppLocker:

     

    Software Restriction policies (That could be applied if the Terminal server is Windows 2003 and later)

    Note: Please note that we don't support Lync Webapp on Windows 2003, it's supported on Windows 2008 or later. Please see the below link for more details:

    http://technet.microsoft.com/en-us/library/gg425820.aspx Lync Web App Supported Platforms

     

    a) First of all, Lync webapp plugin (LWAPlugin64BitInstaller32.msi) file includes a number of executables each of which needs to be defined within software restriction policy rules. You can extract the MSI file itself with 7zip or a similar tool. Once the msi file is extracted, we have the following executables:

     

    AppSharingHookController.exe

    AppSharingHookController64.exe

    LWAPlugin.exe

    LWAVersionPlugin.exe

     

    b) So we need to create 5 additional rules (1 MSI rule and 4 executable file rules) in Software restriction policies in addition to your existing software restriction policy rules as given below:

     

     

    Note: It’s best to create file hash rules for the MSI file itself and for the 4 executables that are extracted from the MSI file

     

    So if Software Restriction Policies are already deployed on your network, an exception can be created for the Lync Web App plug-in so that users still comply with the application installation/execution policies

     

    Applocker: (That could be applied if the Terminal server is Windows 2008 R2 or later)

     

    a) As mentioned in previous scenario, Lync webapp plugin (LWAPlugin64BitInstaller32.msi) file includes a number of executables each of which needs to be defined within applocker rules. You can extract the MSI file itself with 7zip or a similar tool. Once the msi file is extracted, we have the following executables:

     

    AppSharingHookController.exe

    AppSharingHookController64.exe

    LWAPlugin.exe

    LWAVersionPlugin.exe

     

    b) So we need to create 1 MSI rule and 4 executable file rules in applocker as given below:

     

     

      

    Note: It’s best to create file hash rules for the MSI file itself and for the 4 executables that are extracted from the MSI file

     

    So if AppLocker is deployed on your network, an exception can be created for the Lync Web App plug-in so that users still comply with the application installation/execution policies

     

    The only drawback of file hash rules is that once the Lync Server web components are updated with a cumulative update, you’ll have to create those file hash rules again (the content of the MSI file shared from the FE server will probably be different, and hence the file hash will change). Considering that the web components are not updated frequently, this may need to be done 2 or 3 times a year. Alternatively, a file path rule or publisher rule could be created instead of a file hash rule to avoid such maintenance.
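
    If you want a quick way to notice that the plug-in files have changed after a cumulative update, something like the following Python sketch could help. Please treat it as an assumption-laden illustration: it computes a plain SHA-256 over the file bytes, which is not necessarily the same hash that Software Restriction Policies or AppLocker calculate internally, and the function names are my own.

```python
import hashlib


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(recorded, paths):
    """Return the paths whose current hash differs from the recorded one.

    `recorded` maps a path to the digest noted when the rules were created.
    """
    return [p for p in paths if recorded.get(p) != file_sha256(p)]
```

    You would record the digests of the MSI file and the 4 extracted executables when creating the rules, and re-run the comparison after each cumulative update to see whether the hash rules need to be recreated.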

     

    Hope this helps

     

    Thanks,

    Murat

  • Java applications and TMG access rules that require authentication

    Hi there,

    In this blog post, I’ll be talking about a TMG related issue. Actually, it’s not an issue that stems from TMG itself; rather, the way the TMG server is configured (using authenticated rules on the TMG server) triggers the problem.

     

    This is already a known fact and we have a KB article that explains this issue (JVM applications cannot send authentication information when requested) and the workaround is to turn off authentication for the access rule that will allow the client’s connection to external networks:

     

    http://support.microsoft.com/kb/925881/ An ISA server or Forefront Threat Management Gateway server requests credentials when client computers in the same domain use Internet Explorer to access Web sites that contain Java programs

     

    So if you see that all or some parts of a web page are not displayed correctly, you see “Proxy authentication required” or similar messages on the client side, and you suspect that Java is involved somehow, you’ll have to implement the steps mentioned in the above article.

    But sometimes it may not be that clear, which was the case in my scenario. The customer reported that videos on an on-demand video conference site couldn’t be viewed and that the application running inside IE was displaying an unrelated error. I suspected that we were hitting the problem mentioned above and asked the customer to configure a temporary access rule allowing all outbound access for “All users”; then the videos started to play :)

    Then we narrowed the rule target to the target web site only (you can do this via a URL set for HTTP/HTTPS access or via a domain name set for any protocol); you can find more information below:

    http://technet.microsoft.com/en-us/library/cc441706.aspx Processing domain name sets and URL sets

    Since the customer was connecting to https://www.videoondemandwebsite.com , we added this domain to the rule target. But afterwards the video access was still failing. Then we decided to collect more information on what kind of HTTP activity was taking place on the client side. I asked the customer to install Fiddler on the client to see this activity (you can download the tool at http://www.fiddler2.com)

    You’ll find below a sample screen shot taken from Fiddler which was taken when accessing Microsoft’s web site:

     

     

     

    As you can see from the above output, even if you see a certain address in IE (www.microsoft.com in this example), the browser might need to connect to other related web sites to load some images, to get a script, etc. In the above example, the browser also connects to ads1.msn.com and rad.msn.com ...

     

    That was also the case in my customer’s problem: even though the customer was connecting to https://www.videoondemandwebsite.com, the browser was connecting to a few other web sites like *.site1.com and *.site2.com. So we changed the relevant rule to cover these domain names as well and the problem was resolved:

     

    *.videoondemandwebsite.com

    *.site1.com

    *.site2.com
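
    If the list of hosts seen in Fiddler is long, a small script can help to spot which ones are not yet covered by the domain name set. The Python sketch below is only an approximation under my own assumptions: it uses simple shell-style wildcard matching, which may not be identical to how TMG evaluates domain name sets, and the function name is hypothetical.

```python
from fnmatch import fnmatchcase

# The domain name set from the rule above.
DOMAIN_NAME_SET = ["*.videoondemandwebsite.com", "*.site1.com", "*.site2.com"]


def uncovered_hosts(hosts, patterns=DOMAIN_NAME_SET):
    """Return the sorted, de-duplicated hosts that match none of the patterns."""
    missing = set()
    for host in hosts:
        name = host.lower()
        if not any(fnmatchcase(name, pattern) for pattern in patterns):
            missing.add(name)
    return sorted(missing)
```

    Any host name returned by uncovered_hosts() is a candidate to be added to the rule target (after checking that it really belongs to the application).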

     

    Hope this helps

     

    Thanks,

    Murat

  • Exchange integration error on HP4120 desktop phones - A CA trust issue

    I would like to go through an integration problem between Lync phone edition devices and Exchange 2010 that I worked on a while ago. Since the integration wasn’t working properly, users couldn’t access call logs, recorded voice mails, calendar information etc from their desktop phones (HP4120).

     

    To understand the problem in more detail, we asked our customer to collect Exchange side configuration information, phone edition logs from a problem device, a network trace from the Lync FE server, etc.

     

    Note: You can find full details on how to collect and analyze CELog data in the following Microsoft whitepaper:

    http://www.microsoft.com/en-us/download/details.aspx?id=15668 Understanding and Troubleshooting Microsoft Exchange Server Integration

     

    Troubleshooting:

    ===============

    Note: IP addresses, server names, URLs etc are replaced for privacy purposes

     

    1) The Lync phone edition device successfully obtains the autodiscover information:

    ...

    0:01:43.203.610 : Raw data  211 (char), UCD_LOG_INFO: 10/19/2012|14:50:27 Aries: 10/19/2012|14:50:27.214 4EC0006:5250012 INFO  :: NAutoDiscover::DnsAutodiscoverTask::ParseSoapResponse: InternalEwsUrl is https://casarray.contoso.com/EWS/Exchange.asmx

                   

    0:01:43.204.593 : Raw data  212 (char), UCD_LOG_INFO: 10/19/2012|14:50:27 Aries: 10/19/2012|14:50:27.215 4EC0006:5250012 INFO  :: NAutoDiscover::DnsAutodiscoverTask::ParseSoapResponse: ExternalEwsUrl is https://autodiscover.contoso.com/EWS/Exchange.asmx

    ...

     

    2) But the Lync phone edition device fails to access the EWS site with the following error (the same error is seen in all device logs) so the integration error occurs:

     

    0:01:43.840.224 : Raw data  206 (char), UCD_LOG_ERROR: 10/19/2012|14:50:27 Aries: 10/19/2012|14:50:27.850 4EC0006:5250012 ERROR :: WebServices::CSoapTransport::ExecuteSoapOperation: Insecure server. errorCode=12045, status=014C0220, hr=80f10043

    0:01:43.840.617 : Raw data  196 (char), UCD_LOG_ERROR: 10/19/2012|14:50:27 Aries: 10/19/2012|14:50:27.851 4EC0006:5250012 ERROR :: WebServices::CSoapTransport::SendSoapRequest: ExecuteSoapOperation failed. status=014C0220, hr=80f10043

    0:01:43.840.963 : Raw data  225 (char), UCD_LOG_INFO: 10/19/2012|14:50:27 Aries: 10/19/2012|14:50:27.851 4EC0006:5250012 INFO  :: WebServices::WebRequestImpl::ExecuteCommon: after SendSoapRequest. hr=0x80f10043, url=https://casarray.contoso.com/EWS/Exchange.asmx

    0:01:43.841.216 : Raw data  191 (char), UCD_LOG_INFO: 10/19/2012|14:50:27 Aries: 10/19/2012|14:50:27.851 4EC0006:5250012 INFO  :: WebServices::CSoapTransport::GetHttpHeaderInfoFromHandle: hSoapHandle=014C0220, headerInfo=013E2708

    0:01:43.841.548 : Raw data  197 (char), UCD_LOG_INFO: 10/19/2012|14:50:27 Aries: 10/19/2012|14:50:27.852 4EC0006:5250012 INFO  :: WebServices::CSoapTransport::GetCredentialsInfoFromHandle: hSoapHandle=014C0220, credentialsInfo=013E2670

    0:01:43.842.135 : Raw data  165 (char), UCD_LOG_INFO: 10/19/2012|14:50:27 Aries: 10/19/2012|14:50:27.852 4EC0006:5250012 INFO  :: WebServices::CSoapTransportStatus::~CSoapTransportStatus: status=014C0220

    0:01:43.842.492 : Raw data  182 (char), UCD_LOG_ERROR: 10/19/2012|14:50:27 Aries: 10/19/2012|14:50:27.853 4EC0006:5250012 ERROR :: WebServices::WebRequestImpl::ExecuteCommon: Could not execute SOAP request. hr=0x80f10043

     

     

    err 014C0220

    # as an HRESULT: Severity: SUCCESS (0), Facility: 0x14c, Code 0x220

    # for hex 0x220 / decimal 544 :

      SE_AUDITID_IPSEC_AUTH_FAIL_CERT_TRUST                         msaudite.h

    # IKE security association establishment failed because peer

    # could not authenticate.

    # The certificate trust could not be established.%n

    # Peer Identity: %n%1%n

    ...

     

    err 0x80f10043

    ...

    # (user-to-user)

      MIDIERR_NOTREADY                                              mmsystem.h

      NMERR_INVALID_PACKET_LENGTH                                   netmon.h

      ERROR_BAD_NET_NAME                                            winerror.h

    # The network name cannot be found.

      LDAP_NOT_ALLOWED_ON_RDN                                       winldap.h

    ...
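
    The field breakdown shown in the output above (severity, facility, code) can be reproduced with a few bit operations. The following Python sketch assumes the standard HRESULT layout (1 severity bit, an 11-bit facility and a 16-bit code); the function name is my own and it does not look up the symbolic names.

```python
def decode_hresult(value):
    """Split a 32-bit HRESULT-style value into its main fields."""
    return {
        "severity": (value >> 31) & 0x1,    # 0 = success, 1 = failure
        "facility": (value >> 16) & 0x7FF,  # 11-bit facility
        "code": value & 0xFFFF,             # 16-bit code
    }


for raw in (0x014C0220, 0x80F10043):
    fields = decode_hresult(raw)
    print(f"0x{raw:08X}: severity={fields['severity']}, "
          f"facility=0x{fields['facility']:X}, "
          f"code=0x{fields['code']:X} ({fields['code']})")
```

    For 014C0220 this gives facility 0x14c and code 0x220 (decimal 544), as in the output above; for 0x80f10043 the code part is 0x43 (decimal 67).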

     

    So the problem looks like a certificate issue.

     

    => When I check the certificate assigned to CAS array, I see the following:

     

    AccessRules        : {System.Security.AccessControl.CryptoKeyAccessRule, System.Security.AccessControl.CryptoKeyAccessRule}

    CertificateDomains : {casarray.contoso.com, cas01.contoso.com}

    HasPrivateKey      : True

    IsSelfSigned       : False

    Issuer             : CN=Issuing-CA for contoso, DC=contoso, DC=com

    NotAfter           : 8/8/2014 3:36:18 PM

    NotBefore          : 8/8/2012 3:36:18 PM

    PublicKeySize      : 1024

    RootCAType         : Enterprise

    SerialNumber       : 362763723200000000524

    Services           : IMAP, POP, IIS

    Status             : Valid

    Subject            : CN=casarray.contoso.com

    Thumbprint         : EF32873628362BDA8326108875301F38504AB

     

    Issuer is Issuing-CA for contoso, DC=contoso, DC=com

     

    => On the other hand, the CA that issues the Lync frontend certificate is the following: (this is taken from Lync server side network trace)

     

    ...

    + Tcp: Flags=...A...., SrcPort=5061, DstPort=50855, PayloadLen=1460, Seq=2076240595 - 2076242055, Ack=1109389286, Win=256 (scale factor 0x8) = 65536

      TLSSSLData: Transport Layer Security (TLS) Payload Data

    - TLS: TLS Rec Layer-1 HandShake: Server Hello. Certificate.

      - TlsRecordLayer: TLS Rec Layer-1 HandShake:

         ContentType: HandShake:

       + Version: TLS 1.0

         Length: 4380 (0x111C)

       - SSLHandshake: SSL HandShake Certificate(0x0B)

          HandShakeType: ServerHello(0x02)

          Length: 77 (0x4D)

        + ServerHello: 0x1

          HandShakeType: Certificate(0x0B)

          Length: 3423 (0xD5F)

        - Cert: 0x1

           CertLength: 3420 (0xD5C)

         - Certificates:

            CertificateLength: 1574 (0x626)

    ...

            + Signature: Sha1WithRSAEncryption (1.2.840.113549.1.1.5)

            + Issuer: Issuing CA for contoso,com

            + Validity: From: 09/10/12 10:05:05 UTC To: 09/10/14 10:05:05 UTC

            + Subject: casarray.contoso.com

            + SubjectPublicKeyInfo: RsaEncryption (1.2.840.113549.1.1.1)

            + Tag3:

            + Extensions:

           + SignatureAlgorithm: Sha1WithRSAEncryption (1.2.840.113549.1.1.5)

           + Signature:

           Certificates:

     

     

    => So the issuers of certificates used by Lync server itself and casarray.contoso.com servers are different CAs:

     

    a) CA that issues Lync FE certificate:

    Issuing CA for contoso,com

     

    b) CA that issues CAS array certificate:

    Issuing-CA for contoso, DC=contoso, DC=com

     

    Apparently two different CAs with similar names issued Lync FE and Exchange CAS array certificates.

     

    Under normal circumstances, if those two CAs are enterprise CAs, they automatically publish their own CA certificates to AD so that clients can download and use them while verifying the certificate chain. But after some internal research I found out that phone edition devices only trust the CA that issued the certificate of the Lync FE pool to which the phone edition device signs in (except for the public certificates listed in http://technet.microsoft.com/en-us/library/gg398270(OCS.14).aspx).

     

    RESULTS:

    ========

    After issuing a new certificate to CAS array from the same enterprise CA (Issuing CA for contoso,com) that issues Lync FE certificate as well, the integration problem was resolved.

     

    Hope this helps you when dealing with similar problems...

     

    Thanks,

    Murat

  • Things that you may want to know about TCP Keepalives

    Hi,

    In this blog entry, I will be discussing TCP keepalive mechanism and will also provide some information about configuration options on Windows systems.

    a) Definition

    Let's first understand the mechanism. A TCP keep-alive packet is simply an ACK with the sequence number set to one less than the current sequence number for the connection. A host receiving one of these ACKs responds with an ACK for the current sequence number. Keep-alives can be used to verify that the computer at the remote end of a connection is still available. TCP keep-alives can be sent once every KeepAliveTime (defaults to 7,200,000 milliseconds or two hours) if no other data or higher-level keep-alives have been carried over the TCP connection. If there is no response to a keep-alive, it is repeated once every KeepAliveInterval (defaults to 1,000 milliseconds or one second). NetBT connections, such as those used by other Microsoft networking components, send NetBIOS keep-alives more frequently, so normally no TCP keep-alives are sent on a NetBIOS connection. TCP keep-alives are disabled by default, but Windows Sockets applications can use the setsockopt() function to enable them.

    b) Configuration

    Now let's talk a little bit about configuration options. There are 3 registry values with which you can affect the TCP keepalive mechanism on Windows systems:

    KeepAliveInterval
    Key: Tcpip\Parameters
    Value Type: REG_DWORD-time in milliseconds
    Valid Range: 1-0xFFFFFFFF
    Default: 1000 (one second)
    Description: This parameter determines the interval between TCP keep-alive retransmissions until a response is received. Once a response is received, the delay until the next keep-alive transmission is again controlled by the value of KeepAliveTime. The connection is aborted after the number of retransmissions specified by TcpMaxDataRetransmissions have gone unanswered.

    Notes:
    TCPIP driver waits for a TCP Keepalive ACK for the duration of time specified in this registry entry.

    KeepAliveTime
    Key: Tcpip\Parameters
    Value Type: REG_DWORD-time in milliseconds
    Valid Range: 1-0xFFFFFFFF
    Default: 7,200,000 (two hours)
    Description: The parameter controls how often TCP attempts to verify that an idle connection is still intact by sending a keep-alive packet. If the remote system is still reachable and functioning, it acknowledges the keep-alive transmission. Keep-alive packets are not sent by default. This feature may be enabled on a connection by an application

    Notes:
    In order for a TCP session to stay idle, there should be no data sent or received.

    c) If OS is Windows XP/2003 the following registry entry applies:

    TcpMaxDataRetransmissions
    Key: Tcpip\Parameters
    Value Type: REG_DWORD-number
    Valid Range: 0-0xFFFFFFFF
    Default: 5
    Description: This parameter controls the number of times that TCP retransmits an individual data segment (not connection request segments) before aborting the connection. The retransmission time-out is doubled with each successive retransmission on a connection. It is reset when responses resume. The Retransmission Timeout (RTO) value is dynamically adjusted, using the historical measured round-trip time (Smoothed Round Trip Time) on each connection. The starting RTO on a new connection is controlled by the TcpInitialRtt registry value.

    Notes:
    This registry entry determines the number of TCP retransmissions for an individual TCP segment. There's no special registry entry to determine the retransmission behavior of TCP Keepalives and this registry entry is also used for the TCP keepalive scenario.

    Important note: If the OS is Windows Vista/2008, the number of TCP keepalive attempts is hardcoded to 10 and cannot be adjusted via the registry.
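
    To get a feeling for how these values combine, here is a rough Python estimate of how long an idle connection to a dead peer may survive once keepalives are enabled. This is only a back-of-the-envelope sketch based on the descriptions above (one idle period, then one KeepAliveInterval per unanswered attempt); the exact count of probes and the final wait depend on the TCP/IP implementation, and the function name is hypothetical.

```python
def idle_abort_estimate_ms(keep_alive_time_ms=7_200_000,
                           keep_alive_interval_ms=1_000,
                           attempts=5):
    """Rough time (ms) from the last traffic until the connection is aborted.

    Assumes the first keepalive is sent after keep_alive_time_ms of idle time
    and that each unanswered attempt costs one keep_alive_interval_ms.
    """
    return keep_alive_time_ms + keep_alive_interval_ms * attempts


# Windows XP/2003 defaults (TcpMaxDataRetransmissions = 5)
print(idle_abort_estimate_ms())             # 7205000
# Windows Vista/2008 (attempts hardcoded to 10)
print(idle_abort_estimate_ms(attempts=10))  # 7210000
```

    As you can see, with the default values the two-hour KeepAliveTime dominates the result, and the number of attempts hardly matters.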

    d) Some special considerations

    => Even if the KeepAliveTime and KeepAliveInterval registry values are set to specific values (the TCPIP driver uses the default values if we don't set them in the registry), the TCPIP driver won't start sending TCP keepalives until keepalives are enabled via one of various methods at upper layers (layers above the TCPIP driver).

    => Native socket applications can enable TCP keepalives by using any one of the following methods:

    - setsockopt() with SO_KEEPALIVE option
    - WSAIoctl() with SIO_KEEPALIVE_VALS option (it's also possible to change Keepalive timers with this API call dynamically on a per-socket basis)
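
    The same two options are also reachable from scripting languages that wrap the socket API. As a minimal sketch (not taken from the references in this post), the Python snippet below enables SO_KEEPALIVE and, only where the Python socket module exposes SIO_KEEPALIVE_VALS (that is, on Windows), also sets per-socket timers. The helper name and the timer values in the usage note are my own choices.

```python
import socket


def enable_keepalive(sock, idle_ms=None, interval_ms=None):
    """Enable TCP keepalives on a socket and return True if they are on.

    If both timers are given and the platform offers SIO_KEEPALIVE_VALS,
    the per-socket keepalive time and interval are set as well.
    """
    # Equivalent of setsockopt() with the SO_KEEPALIVE option.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Equivalent of WSAIoctl() with SIO_KEEPALIVE_VALS (Windows only).
    if (idle_ms is not None and interval_ms is not None
            and hasattr(socket, "SIO_KEEPALIVE_VALS")):
        sock.ioctl(socket.SIO_KEEPALIVE_VALS, (1, idle_ms, interval_ms))

    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
```

    For example, enable_keepalive(sock, idle_ms=300000, interval_ms=1000) would ask for a first keepalive after 5 idle minutes on Windows, while on other platforms only SO_KEEPALIVE is turned on and the system-wide timers apply.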

    => Managed applications (.NET), can use one of the following methods:

    - SetSocketOption method from Socket Class in System.Net.Sockets namespace (to enable keepalives)
    - GetSocketOption method from Socket Class in System.Net.Sockets namespace (to read back the current setting)

    => Effect of using Keepalives on bandwidth usage

    Since TCP keepalives are TCP segments without data (and with the SEQ number set to one less than the current SEQ number), the bandwidth used by keepalives can simply be neglected. There's an example below to give an idea of how big a TCP keepalive packet could be:

    - 14 bytes (L2 header - Assuming that Ethernet protocol is used. This could be even lower for other WAN protocols like PPP/HDLC/etc)
    - 20 bytes (IP header - assuming no IP options are used)
    - 20 bytes (TCP header - assuming no TCP options are used)

    Total: ~54 bytes

    Even if the keepalive time (KeepAliveTime) is set to 5 minutes or so (default is 2 hours), given that the TCP connection goes idle, the TCPIP driver will send a ~54-byte TCP keepalive message every 5 minutes, and as can be seen this can simply be neglected.
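
    The arithmetic behind that statement can be sketched in a few lines of Python. The header sizes are the ones listed above; the function name is my own, and the figure covers only the keepalive probes in one direction (the ACK sent in reply is a frame of similar size).

```python
ETHERNET_HEADER = 14  # bytes, assuming Ethernet at L2
IP_HEADER = 20        # bytes, assuming no IP options
TCP_HEADER = 20       # bytes, assuming no TCP options

KEEPALIVE_FRAME = ETHERNET_HEADER + IP_HEADER + TCP_HEADER  # 54 bytes


def keepalive_bits_per_second(interval_seconds, frame_bytes=KEEPALIVE_FRAME):
    """Average bandwidth used by keepalive probes on one idle connection."""
    return frame_bytes * 8 / interval_seconds


# One 54-byte probe every 5 minutes on an idle connection.
print(KEEPALIVE_FRAME, keepalive_bits_per_second(5 * 60))
```

    With a 5-minute keepalive time this works out to roughly 1.44 bits per second per idle connection, so even thousands of idle connections would generate only a few kilobits per second of keepalive traffic.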

    You may also find some references below:

    References
    =============================
    http://www.microsoft.com/downloads/details.aspx?FamilyID=06c60bfe-4d37-4f50-8587-8b68d32fa6ee&displaylang=en
    Microsoft Windows Server 2003 TCP/IP Implementation Details

    http://www.microsoft.com/downloads/details.aspx?FamilyId=12AC9780-17B5-480C-AEF7-5C0BDE9060B0&displaylang=en
    TCP/IP Registry Values for Microsoft Windows Vista and Windows Server 2008

    http://msdn.microsoft.com/en-us/library/ms740476(VS.85).aspx setsockopt Function
    http://msdn.microsoft.com/en-us/library/ms741621(VS.85).aspx WSAIoctl Function
    http://msdn.microsoft.com/en-us/library/system.net.sockets.socket.setsocketoption.aspx Socket..::.SetSocketOption Method
    http://msdn.microsoft.com/en-us/library/system.net.sockets.socket.getsocketoption.aspx Socket..::.GetSocketOption Method


    Hope this helps.

    Thanks,
    Murat

  • Is MaximumBlockSize registry key supported on Windows 2008 R2 RTM WDS server?

    Hi there,

    Recently I dealt with a connectivity issue that occurs when deploying OS images with Windows 2008 R2 WDS server to PXE clients running in different subnets. As you may already know, we have a workaround for router MTU size incompatibilities seen when deploying OS images to remote subnets in Windows 2008:

    http://support.microsoft.com/kb/975710 Operating system deployment over a network by using WDS fails in Windows Server 2008

    The same router packet drop issue was present in my case. Even though we configured the MaximumBlockSize registry key on the Windows 2008 R2 WDS server, the issue was still in place, and the TFTP server (part of the WDS server) was still sending data in chunks of the size requested by the client rather than the size we tried to enforce with the MaximumBlockSize registry key:

    Note:
    10.1.1.1 is PXE client
    10.2.2.2 is TFTP server

    => Client is asking for a TFTP block size of 1456 bytes and TFTP server (that’s running as a part of WDS server) is honoring that request:

    10.1.1.1          10.2.2.2            TFTP        Read Request, File: boot\x86\wdsnbp.com\000, Transfer type: octet\000, blksize\000=1456\000
    10.2.2.2          10.1.1.1            TFTP        Option Acknowledgement, blksize\000=1456\000

    => Then TFTP server begins sending the first 1456 bytes block of wdsnbp.com file:

    10.2.2.2            10.1.1.1          TFTP        Data Packet, Block: 1

    => Most likely because the router drops that packet, TFTP client doesn’t receive it and hence it re-sends the Read file request once more:

    10.1.1.1          10.2.2.2            TFTP        Read Request, File: boot\x86\wdsnbp.com\000, Transfer type: octet\000, blksize\000=1456\000

    => And TFTP server sends the same block once more and tries a number of times:

    10.2.2.2         10.1.1.1      TFTP      Data Packet, Block: 1
    10.2.2.2         10.1.1.1      TFTP      Data Packet, Block: 1
    10.2.2.2         10.1.1.1      TFTP      Data Packet, Block: 1
    10.2.2.2         10.1.1.1      TFTP      Data Packet, Block: 1
    ...

    So PXE boot fails at the end.
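
    To see why the block size matters for a router with a smaller MTU, it helps to calculate how big the resulting IP packet is. The Python sketch below assumes IPv4 without options; the function names and the 1400-byte path MTU in the usage note are purely hypothetical (the actual MTU of the customer's router isn't part of this post).

```python
TFTP_DATA_HEADER = 4  # 2-byte opcode + 2-byte block number
UDP_HEADER = 8
IPV4_HEADER = 20      # assuming no IP options

OVERHEAD = TFTP_DATA_HEADER + UDP_HEADER + IPV4_HEADER  # 32 bytes


def ip_packet_size(block_size):
    """Size of the IP packet that carries one full TFTP data block."""
    return block_size + OVERHEAD


def largest_block_size(path_mtu):
    """Largest TFTP block size that fits into one unfragmented IP packet."""
    return path_mtu - OVERHEAD


print(ip_packet_size(1456))      # 1488
print(largest_block_size(1400))  # 1368
```

    So the 1456-byte block requested by the PXE client results in a 1488-byte IP packet, and on a path with a hypothetical MTU of 1400 bytes the block size would have to be limited to 1368 bytes or less to avoid fragmentation or drops.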

    The analysis showed us that the reg key didn't take effect on Windows 2008 R2 WDS server. After a source code review, I saw that the feature wasn't integrated into Windows 2008 R2 RTM code due to release timing of Windows 2008 R2 and the fix for Windows 2008. I verified with internal resources and can say that the feature will be a part of Windows 2008 R2 SP1. The SP1 public beta is being made available by the end of July:

    http://blogs.technet.com/b/itproaustralia/archive/2010/06/08/windows-7-and-windows-server-2008-r2-sp1-beta-available-end-of-july.aspx  Windows 7 and Windows Server 2008 R2 SP1 Beta available end of July

    Hope this helps

    Thanks,
    Murat

  • SCCM packages may be distributed slower than standard file copy (xcopy/Windows Explorer)

    Hi there,

     

    In this post, I’m going to talk about another issue where I helped a colleague of mine troubleshoot an SCCM package distribution scenario. The problem was that package distribution to clients was visibly slower than standard file copy methods (like xcopy, Windows Explorer, etc.). In the given setup, the SCCM client was accessing and retrieving the distribution package via the SMB protocol, so BITS was out of the picture. We asked the customer to collect the following logs while reproducing the problem:

     

    a) Create a distribution package which simply includes a 100 MB executable file

    b) Collect the following logs for two different scenarios:

     

    => For standard file copy scenario:

    - Start Network traces on the SCCM server (Windows 2008 R2) and the SCCM agent (Windows 7)

    - Start Process Monitor on the SCCM agent

    - Start file copy by using xcopy from a command prompt on Windows 7 client

     

    => For SCCM package distribution scenario:

    - Start Network traces on the SCCM server (Windows 2008 R2) and the SCCM agent (Windows 7)

    - Start Process Monitor on the SCCM agent

    - Trigger package distribution

     

    After the above logs were collected, I analyzed the network traces and Process Monitor logs for both scenarios. Let us take a closer look at each scenario:

     

    A. SCCM PACKAGE DISTRIBUTION SCENARIO

     

    The package download activity was seen as below in Process Monitor:

    - Ccmexec posts about 4900 ReadFile()s with 64KB buffers each

    - This is also supported by the behavior seen in the network trace collected for ccmexec scenario:

     

     

    No.     Time        Source                Destination           Info                                                            Protocol

    ...

      16475 0.005513    192.168.1.7         192.168.2.77         Read Request Len:65536 Off:16121856 File: TEST\100MBFile.txt SMB2

      16476 0.000013    192.168.1.7         192.168.2.77         Read Request Len:65536 Off:16187392 File: TEST\100MBFile.txt SMB2

      16478 0.001872    192.168.2.77         192.168.1.7         Read Response                                                   SMB2

      16538 0.005313    192.168.2.77         192.168.1.7         Read Response                                                   SMB2

      16603 0.080443    192.168.1.7         192.168.2.77         Read Request Len:65536 Off:16252928 File: TEST\100MBFile.txt SMB2

      16604 0.000013    192.168.1.7         192.168.2.77         Read Request Len:65536 Off:16318464 File: TEST\100MBFile.txt SMB2

      16606 0.001229    192.168.2.77         192.168.1.7         Read Response                                                   SMB2

      16666 0.005312    192.168.2.77         192.168.1.7         Read Response                                                   SMB2

      16730 0.005827    192.168.1.7         192.168.2.77         Read Request Len:65536 Off:16384000 File: TEST\100MBFile.txt SMB2

      16731 0.000013    192.168.1.7         192.168.2.77         Read Request Len:65536 Off:16449536 File: TEST\100MBFile.txt SMB2

      16733 0.001193    192.168.2.77         192.168.1.7         Read Response                                                   SMB2

      16795 0.005643    192.168.2.77         192.168.1.7         Read Response                                                   SMB2

      16856 0.070364    192.168.1.7         192.168.2.77         Read Request Len:65536 Off:16515072 File: TEST\100MBFile.txt SMB2

      16857 0.000013    192.168.1.7         192.168.2.77         Read Request Len:65536 Off:16580608 File: TEST\100MBFile.txt SMB2

      16859 0.001037    192.168.2.77         192.168.1.7         Read Response                                                   SMB2

      16919 0.005313    192.168.2.77         192.168.1.7         Read Response                                                   SMB2

      16982 0.005789    192.168.1.7         192.168.2.77         Read Request Len:65536 Off:16646144 File: TEST\100MBFile.txt SMB2

      16983 0.000014    192.168.1.7         192.168.2.77         Read Request Len:65536 Off:16711680 File: TEST\100MBFile.txt SMB2

      16985 0.001043    192.168.2.77         192.168.1.7         Read Response                                                   SMB2

      17045 0.005312    192.168.2.77         192.168.1.7         Read Response                                                   SMB2

      17108 0.048421    192.168.1.7         192.168.2.77         Read Request Len:65536 Off:16777216 File: TEST\100MBFile.txt SMB2

      17109 0.000019    192.168.1.7         192.168.2.77         Read Request Len:65536 Off:16842752 File: TEST\100MBFile.txt SMB2

      17111 0.002061    192.168.2.77         192.168.1.7         Read Response                                                   SMB2

      17171 0.005311    192.168.2.77         192.168.1.7         Read Response                                                   SMB2

      17236 0.055958    192.168.1.7         192.168.2.77         Read Request Len:65536 Off:16908288 File: TEST\100MBFile.txt SMB2

      17237 0.000015    192.168.1.7         192.168.2.77         Read Request Len:65536 Off:16973824 File: TEST\100MBFile.txt SMB2

      17239 0.002242    192.168.2.77         192.168.1.7         Read Response                                                   SMB2

      17300 0.005311    192.168.2.77         192.168.1.7         Read Response                                                   SMB2

    ...

     

    Note: IP addresses are replaced for privacy purposes

     

    B. STANDARD FILE COPY SCENARIO

     

    The standard file copy with xcopy was seen as below in Process Monitor:

    - The xcopy tool posts only 100 ReadFile()s, each with a 1 MB buffer

    - This is also seen in the network trace collected for the xcopy scenario:

     

    No.     Time                       Source                Destination           Info                                                            Protocol

    ...

       5445 2010-09-21 15:59:29.436686 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:12582912 File: xcopytest\100MBFile.txt SMB2

       5446 2010-09-21 15:59:29.436701 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:12648448 File: xcopytest\100MBFile.txt SMB2

       5447 2010-09-21 15:59:29.436712 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:12713984 File: xcopytest\100MBFile.txt SMB2

       5448 2010-09-21 15:59:29.436723 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:12779520 File: xcopytest\100MBFile.txt SMB2

       5449 2010-09-21 15:59:29.436735 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:12845056 File: xcopytest\100MBFile.txt SMB2

       5450 2010-09-21 15:59:29.436748 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:12910592 File: xcopytest\100MBFile.txt SMB2

       5451 2010-09-21 15:59:29.436760 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:12976128 File: xcopytest\100MBFile.txt SMB2

       5452 2010-09-21 15:59:29.436772 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:13041664 File: xcopytest\100MBFile.txt SMB2

       5453 2010-09-21 15:59:29.436784 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:13107200 File: xcopytest\100MBFile.txt SMB2

       5457 2010-09-21 15:59:29.436798 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:13172736 File: xcopytest\100MBFile.txt SMB2

       5458 2010-09-21 15:59:29.436813 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:13238272 File: xcopytest\100MBFile.txt SMB2

       5459 2010-09-21 15:59:29.436824 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:13303808 File: xcopytest\100MBFile.txt SMB2

       5460 2010-09-21 15:59:29.436835 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:13369344 File: xcopytest\100MBFile.txt SMB2

       5461 2010-09-21 15:59:29.436845 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:13434880 File: xcopytest\100MBFile.txt SMB2

       5462 2010-09-21 15:59:29.436857 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:13500416 File: xcopytest\100MBFile.txt SMB2

       5463 2010-09-21 15:59:29.436869 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:13565952 File: xcopytest\100MBFile.txt SMB2

       5509 2010-09-21 15:59:29.441113 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       5572 2010-09-21 15:59:29.446773 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       5632 2010-09-21 15:59:29.452104 192.168.2.77         192.168.1.7         [Unreassembled Packet]                                          SMB2

       5694 2010-09-21 15:59:29.457766 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       5755 2010-09-21 15:59:29.463095 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       5817 2010-09-21 15:59:29.468755 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       5878 2010-09-21 15:59:29.474076 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       5940 2010-09-21 15:59:29.479738 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       6002 2010-09-21 15:59:29.485400 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       6063 2010-09-21 15:59:29.490729 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       6125 2010-09-21 15:59:29.496387 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       6187 2010-09-21 15:59:29.502044 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       6248 2010-09-21 15:59:29.507367 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       6310 2010-09-21 15:59:29.513024 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       6372 2010-09-21 15:59:29.518677 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       6433 2010-09-21 15:59:29.523999 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       6447 2010-09-21 15:59:29.525133 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:13631488 File: xcopytest\100MBFile.txt SMB2

       6448 2010-09-21 15:59:29.525148 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:13697024 File: xcopytest\100MBFile.txt SMB2

       6449 2010-09-21 15:59:29.525159 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:13762560 File: xcopytest\100MBFile.txt SMB2

       6450 2010-09-21 15:59:29.525170 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:13828096 File: xcopytest\100MBFile.txt SMB2

       6451 2010-09-21 15:59:29.525183 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:13893632 File: xcopytest\100MBFile.txt SMB2

       6452 2010-09-21 15:59:29.525196 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:13959168 File: xcopytest\100MBFile.txt SMB2

       6453 2010-09-21 15:59:29.525207 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:14024704 File: xcopytest\100MBFile.txt SMB2

       6454 2010-09-21 15:59:29.525219 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:14090240 File: xcopytest\100MBFile.txt SMB2

       6455 2010-09-21 15:59:29.525231 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:14155776 File: xcopytest\100MBFile.txt SMB2

       6456 2010-09-21 15:59:29.525243 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:14221312 File: xcopytest\100MBFile.txt SMB2

       6457 2010-09-21 15:59:29.525255 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:14286848 File: xcopytest\100MBFile.txt SMB2

       6458 2010-09-21 15:59:29.525267 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:14352384 File: xcopytest\100MBFile.txt SMB2

       6459 2010-09-21 15:59:29.525280 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:14417920 File: xcopytest\100MBFile.txt SMB2

       6460 2010-09-21 15:59:29.525292 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:14483456 File: xcopytest\100MBFile.txt SMB2

       6461 2010-09-21 15:59:29.525304 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:14548992 File: xcopytest\100MBFile.txt SMB2

       6462 2010-09-21 15:59:29.525316 192.168.1.7         192.168.2.77         Read Request Len:65536 Off:14614528 File: xcopytest\100MBFile.txt SMB2

       6511 2010-09-21 15:59:29.529653 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       6573 2010-09-21 15:59:29.534977 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       6635 2010-09-21 15:59:29.540629 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       6697 2010-09-21 15:59:29.546286 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       6758 2010-09-21 15:59:29.551606 192.168.2.77         192.168.1.7         [Unreassembled Packet]                                          SMB2

       6821 2010-09-21 15:59:29.557255 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       6883 2010-09-21 15:59:29.562576 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       6945 2010-09-21 15:59:29.568234 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       7007 2010-09-21 15:59:29.573893 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       7068 2010-09-21 15:59:29.579219 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       7130 2010-09-21 15:59:29.584876 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       7192 2010-09-21 15:59:29.590530 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       7253 2010-09-21 15:59:29.595858 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       7315 2010-09-21 15:59:29.601517 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       7377 2010-09-21 15:59:29.607173 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       7438 2010-09-21 15:59:29.612499 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       7500 2010-09-21 15:59:29.618155 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       7561 2010-09-21 15:59:29.623478 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       7623 2010-09-21 15:59:29.629132 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       7685 2010-09-21 15:59:29.634785 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       7746 2010-09-21 15:59:29.640111 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       7808 2010-09-21 15:59:29.645771 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       7871 2010-09-21 15:59:29.651433 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       7932 2010-09-21 15:59:29.656750 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       7996 2010-09-21 15:59:29.662406 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       8058 2010-09-21 15:59:29.667728 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       8120 2010-09-21 15:59:29.673385 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       8182 2010-09-21 15:59:29.679045 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

       8243 2010-09-21 15:59:29.684367 192.168.2.77         192.168.1.7         Read Response                                                   SMB2

    ...

     

    Note: IP addresses are replaced for privacy purposes

    Note: The 16 x 64 KB = 1 MB read requests above are generated from a single 1 MB read request at the application layer (xcopy)
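
    The arithmetic behind that note can be reproduced with a few lines of Python. This is only a sketch of the splitting logic as it appears on the wire, not the actual redirector code; the function name split_read is made up for illustration, and the 64 KB chunk size is simply the Len:65536 value seen in the trace.

```python
SMB2_READ_SIZE = 64 * 1024  # per-request read length seen in the trace (Len:65536)

def split_read(offset, length, chunk=SMB2_READ_SIZE):
    """Return the (offset, length) pairs that cover one application-level read."""
    requests = []
    end = offset + length
    while offset < end:
        size = min(chunk, end - offset)
        requests.append((offset, size))
        offset += size
    return requests

# One 1 MB ReadFile() starting at the 12 MB mark, as in the xcopy trace.
reqs = split_read(12 * 1024 * 1024, 1024 * 1024)
print(len(reqs), reqs[0], reqs[-1])
```

    The first and last offsets produced this way (12582912 and 13565952) match frames 5445 and 5463 in the xcopy trace.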

     

     

    SUMMARY:

    =========

    The performance difference between SCCM package distribution and xcopy stems from the read buffer size. The xcopy tool (and most probably Windows Explorer as well) posts read requests with 1 MB buffers, whereas the SCCM agent process (ccmexec) posts them with 64 KB buffers. With larger buffers, more SMB2 read requests are outstanding at the same time, so concurrency is higher and the network bandwidth is better utilized in the xcopy scenario. This is seen both in the network trace and in the Process Monitor activity. We shared the results with our SCCM colleagues to see whether that behaviour could be changed; if I receive any update on that, I’ll update this post.
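
    To put rough numbers on the difference, the batch timings from the two trace excerpts can be turned into a throughput estimate. This is a back-of-the-envelope sketch only: it uses a single batch from each excerpt, and for the ccmexec trace it assumes that the Time column is the delta from the previously displayed frame (the column header is not visible in the excerpt, so that is an assumption).

```python
MIB = 1024 * 1024

def throughput_mib_per_s(bytes_per_cycle, cycle_seconds):
    """Bytes moved in one request/response cycle, expressed in MiB per second."""
    return bytes_per_cycle / cycle_seconds / MIB

# xcopy: frames 5445 and 6447 start two consecutive batches of 16 x 64 KB reads.
xcopy_cycle = 29.525133 - 29.436686
xcopy_rate = throughput_mib_per_s(16 * 65536, xcopy_cycle)

# ccmexec: frames 17108 -> 17236 span one batch of 2 x 64 KB reads.
# ASSUMPTION: the Time column holds deltas from the previous displayed frame.
ccmexec_cycle = 0.000019 + 0.002061 + 0.005311 + 0.055958
ccmexec_rate = throughput_mib_per_s(2 * 65536, ccmexec_cycle)

print(f"xcopy   ~{xcopy_rate:.1f} MiB/s")
print(f"ccmexec ~{ccmexec_rate:.1f} MiB/s")
```

    Under those assumptions the xcopy batch moves data several times faster than the ccmexec batch, which is in line with the behaviour described in the summary.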

     

    Hope this helps

     

    Thanks,

    Murat

  • HTTPS access through TMG fails from a certain VLAN with a very unusual error: FWX_E_SEQ_ACK_MISMATCH

    In this blog post, I’ll be talking about an interesting problem that I dealt with recently. Clients running in a certain VLAN were not able to establish HTTPS connections through the TMG server. Due to the nature of the network, the clients have to be configured as SecureNAT clients (my customer cannot configure them as web proxy clients or use the TMG client software because these machines are guest machines).

     

    I asked for the usual data from our customer to find out what was happening during the problem:

     

    - Network trace & HTTPWatch logs from a test client

    - TMG data packager from the TMG server

     

    1. After receiving the data, I started from the client side. There, the connection starts failing right after the initial TCP 3-way handshake:

     

    Note: For privacy purposes, all IP addresses have been replaced with random addresses.

     

    => Client network trace:

    (ip.addr eq 10.110.233.50 and ip.addr eq 172.16.1.1) and (tcp.port eq 50183 and tcp.port eq 443)

     

    453          14:44:32 01.05.2012               27.1395045             0.0000000                               10.110.233.50         172.16.1.1         TCP          TCP: [Bad CheckSum]Flags=......S., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=0, Seq=367515135, Ack=0, Win=8192 ( Negotiating scale factor 0x2 ) = 8192

    456          14:44:32 01.05.2012               27.1565249             0.0170204                               172.16.1.1         10.110.233.50         TCP                TCP:Flags=...A..S., SrcPort=HTTPS(443), DstPort=50183, PayloadLen=2, Seq=2180033885 - 2180033888, Ack=367515136, Win=64240 ( Scale factor not supported ) = 64240

    457          14:44:32 01.05.2012               27.1565443             0.0000194                               10.110.233.50         172.16.1.1         TCP          TCP: [Bad CheckSum]Flags=...A...., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=0, Seq=367515136, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

    458          14:44:32 01.05.2012               27.1585176             0.0019733                               10.110.233.50         172.16.1.1         TLS          TLS:TLS Rec Layer-1 HandShake: Client Hello.

    465          14:44:32 01.05.2012               27.4744962             0.3159786                               10.110.233.50         172.16.1.1         TCP                TCP:[ReTransmit #458] [Bad CheckSum]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

    476          14:44:33 01.05.2012               28.0828875             0.6083913                               10.110.233.50         172.16.1.1         TCP                TCP:[ReTransmit #458] [Bad CheckSum]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

    494          14:44:34 01.05.2012               29.2948179             1.2119304                               10.110.233.50         172.16.1.1         TCP                TCP:[ReTransmit #458]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

    512          14:44:35 01.05.2012               30.3815127             1.0866948                               172.16.1.1         10.110.233.50         TLS          TLS:Continued Data: 2 Bytes

    513          14:44:35 01.05.2012               30.3815385             0.0000258                               10.110.233.50         172.16.1.1         TCP                TCP:Flags=...A...., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=0, Seq=367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

    516          14:44:35 01.05.2012               30.5019345             0.1203960                               10.110.233.50         172.16.1.1         TCP                TCP:[ReTransmit #458]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

    535          14:44:37 01.05.2012               31.7018989             1.1999644                               10.110.233.50         172.16.1.1         TCP                TCP:[ReTransmit #458] [Bad CheckSum]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

    568          14:44:38 01.05.2012               33.1872833             1.4853844                               172.16.1.1         10.110.233.50         TLS          TLS:Continued Data: 6 Bytes 

     

    - The client cannot connect to the remote web site: the SSL/TLS negotiation doesn’t succeed because, from the client’s perspective, no response is received from the Web server

     

    2. Then I decided to check things from the TMG server’s perspective. I first checked the network trace that was collected on the internal interface of the TMG server, through which the client request was received:

     

    6457        14:33:49 01.05.2012               39.8915584             0.0000000                               10.110.233.50         172.16.1.1         TCP                TCP:Flags=......S., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=0, Seq=367515135, Ack=0, Win=8192 ( Negotiating scale factor 0x2 ) = 8192

    6461        14:33:49 01.05.2012               39.9079944             0.0164360                               172.16.1.1         10.110.233.50         TCP                TCP:Flags=...A..S., SrcPort=HTTPS(443), DstPort=50183, PayloadLen=0, Seq=2180033885, Ack=367515136, Win=64240 ( Scale factor not supported ) = 64240

    6462        14:33:49 01.05.2012               39.9084181             0.0004237                               10.110.233.50         172.16.1.1         TCP                TCP:Flags=...A...., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=0, Seq=367515136, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

    6463        14:33:49 01.05.2012               39.9103936             0.0019755                               10.110.233.50         172.16.1.1         TLS          TLS:TLS Rec Layer-1 HandShake: Client Hello.

    6489        14:33:49 01.05.2012               40.2259991             0.3156055                               10.110.233.50         172.16.1.1         TCP                TCP:[ReTransmit #6463]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

    6614        14:33:50 01.05.2012               40.8343158             0.6083167                               10.110.233.50         172.16.1.1         TCP                TCP:[ReTransmit #6463]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

    6806        14:33:51 01.05.2012               42.0457229             1.2114071                               10.110.233.50         172.16.1.1         TCP                TCP:[ReTransmit #6463]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

    6872        14:33:52 01.05.2012               43.1311020             1.0853791                               172.16.1.1         10.110.233.50         TCP                TCP:Flags=...A..S., SrcPort=HTTPS(443), DstPort=50183, PayloadLen=0, Seq=2180033885, Ack=367515136, Win=64240 ( Scale factor not supported ) = 64240

    6873        14:33:52 01.05.2012               43.1314010             0.0002990                               10.110.233.50         172.16.1.1         TCP                TCP:Flags=...A...., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=0, Seq=367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

    6891        14:33:52 01.05.2012               43.2517820             0.1203810                               10.110.233.50         172.16.1.1         TCP                TCP:[ReTransmit #6463]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

    7110        14:33:54 01.05.2012               44.4514449             1.1996629                               10.110.233.50         172.16.1.1         TCP                TCP:[ReTransmit #6463]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

    7481        14:33:55 01.05.2012               45.9360680             1.4846231                               172.16.1.1         10.110.233.50         TCP                TCP:Flags=...A.R.., SrcPort=HTTPS(443), DstPort=50183, PayloadLen=0, Seq=2180033886, Ack=367515136, Win=8192 (scale factor 0x0) = 8192 

     

    - TMG server doesn’t really send responses back to the client

     

    3. Then I decided to check the network trace collected on the external interface of the TMG server:

     

    21            14:33:49 01.05.2012               39.6940232             0.0000000                               10.110.235.202       172.16.1.1         TCP                TCP:Flags=......S., SrcPort=55073, DstPort=HTTPS(443), PayloadLen=0, Seq=367515135, Ack=0, Win=8192 ( Negotiating scale factor 0x2 ) = 8192

    22            14:33:49 01.05.2012               39.7097097             0.0156865                               172.16.1.1         10.110.235.202       TCP                TCP:Flags=...A..S., SrcPort=HTTPS(443), DstPort=55073, PayloadLen=0, Seq=2180033885, Ack=367515136, Win=64240 ( Scale factor not supported ) = 64240

    23            14:33:52 01.05.2012               42.9329903             3.2232806                               172.16.1.1         10.110.235.202       TCP                TCP:Flags=...A..S., SrcPort=HTTPS(443), DstPort=55073, PayloadLen=0, Seq=2180033885, Ack=367515136, Win=64240 ( Scale factor not supported ) = 64240

    24            14:33:55 01.05.2012               45.7379523             2.8049620                               172.16.1.1         10.110.235.202       TCP                TCP:Flags=...A.R.., SrcPort=HTTPS(443), DstPort=55073, PayloadLen=0, Seq=2180033886, Ack=367515136, Win=8192 (scale factor 0x0) = 8192

     

    - The external Web server responds to the initial TCP SYN that is forwarded by TMG, but the TMG server doesn’t proceed with the connection

     

    4. Then I checked the Web Proxy/Firewall log on the TMG server:

     

    01.05.2012 14:33 Denied 10.110.233.50     50183    172.16.1.1 443 0xc0040034 FWX_E_SEQ_ACK_MISMATCH 

    01.05.2012 14:33 Denied 10.110.233.50     50183    172.16.1.1 443 0xc0040034 FWX_E_SEQ_ACK_MISMATCH 

    01.05.2012 14:33 Denied 10.110.233.50     50183    172.16.1.1 443 0xc0040034 FWX_E_SEQ_ACK_MISMATCH 

     

    When I check more details on that error code, I see that the connection fails because TMG receives a TCP packet with an invalid sequence or acknowledgement number:

     

    http://msdn.microsoft.com/en-us/library/ms812624.aspx/

    FWX_E_SEQ_ACK_MISMATCH 0xC0040034 A TCP packet was rejected because it has an invalid sequence number or an invalid acknowledgement number.
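
    When there are many such entries, it can help to count the denials per client and error name. The sketch below parses the three log lines shown above; the column order (date, time, action, source IP, source port, destination IP, destination port, result code, error name) is assumed from that excerpt only, and a real TMG log export contains more fields, so the parser would need to be adapted.

```python
from collections import Counter

# The three firewall log lines from the excerpt above.
LOG_LINES = [
    "01.05.2012 14:33 Denied 10.110.233.50     50183    172.16.1.1 443 0xc0040034 FWX_E_SEQ_ACK_MISMATCH",
    "01.05.2012 14:33 Denied 10.110.233.50     50183    172.16.1.1 443 0xc0040034 FWX_E_SEQ_ACK_MISMATCH",
    "01.05.2012 14:33 Denied 10.110.233.50     50183    172.16.1.1 443 0xc0040034 FWX_E_SEQ_ACK_MISMATCH",
]

def parse_line(line):
    """Split one whitespace-separated log line (assumed 9-column layout)."""
    date, time, action, src_ip, src_port, dst_ip, dst_port, code, name = line.split()
    return {
        "action": action,
        "src_ip": src_ip,
        "src_port": int(src_port),
        "dst_ip": dst_ip,
        "dst_port": int(dst_port),
        "code": int(code, 16),
        "name": name,
    }

def count_denials(lines):
    """Count denied connections per (client IP, error name)."""
    entries = (parse_line(line) for line in lines)
    return Counter((e["src_ip"], e["name"]) for e in entries if e["action"] == "Denied")

denials = count_denials(LOG_LINES)
for (client, error), count in denials.items():
    print(client, error, count)
```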

     

    So the TMG server drops the TCP ACK packet (the 3rd packet of the TCP 3-way handshake) coming from the client because it carries an invalid TCP ACK number.

     

    5. The problem was also visible in the ETL trace:

     

    … handshake packet is dropped becuase ACK (2180033888) no equal ISN(peer)+1 (2180033886)

    Warning:The packet failed TCP sequence validation

    … Warning:The packet is dropped because of SEQ_ACK_MISMATCH
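
    The check that the ETL trace reports can be expressed in a few lines. This is a minimal sketch of the rule itself (the final ACK of the handshake must acknowledge the peer's initial sequence number plus one, modulo 2^32), not TMG's actual implementation; the function names are made up for illustration.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers wrap around at 32 bits

def expected_handshake_ack(peer_isn):
    """ACK number that acknowledges only the peer's SYN."""
    return (peer_isn + 1) % SEQ_MOD

def validate_handshake_ack(peer_isn, ack):
    """True if the third handshake packet acknowledges exactly ISN(peer)+1."""
    return ack == expected_handshake_ack(peer_isn)

server_isn = 2180033885   # Seq of the SYN ACK (frame 6461)
client_ack = 2180033888   # Ack of the client's ACK (frame 6462)

print(validate_handshake_ack(server_isn, client_ack))
print(expected_handshake_ack(server_isn), hex(expected_handshake_ack(server_isn)))
```

    With the values from the trace the validation fails, and the expected value comes out as 2180033886, the same number the ETL trace mentions.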

     

    6. The same mismatch is visible in the network trace collected on the TMG server:

     

    6457       14:33:49 01.05.2012         39.8915584          0.0000000                            10.110.233.50     172.16.1.1     TCP                TCP:Flags=......S., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=0, Seq=367515135, Ack=0, Win=8192 ( Negotiating scale factor 0x2 ) = 8192

    SequenceNumber: 367515135 (0x15E7D5FF)

     

    6461       14:33:49 01.05.2012         39.9079944          0.0164360                            172.16.1.1     10.110.233.50     TCP                TCP:Flags=...A..S., SrcPort=HTTPS(443), DstPort=50183, PayloadLen=0, Seq=2180033885, Ack=367515136, Win=64240 ( Scale factor not supported ) = 64240

    SequenceNumber: 2180033885 (0x81F0AD5D)

    AcknowledgementNumber: 367515136 (0x15E7D600)

     

    6462       14:33:49 01.05.2012         39.9084181          0.0004237                            10.110.233.50     172.16.1.1     TCP                TCP:Flags=...A...., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=0, Seq=367515136, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

    SequenceNumber: 367515136 (0x15E7D600)

    AcknowledgementNumber: 2180033888 (0x81F0AD60) 

    That Acknowledgement number SHOULD HAVE BEEN 2180033886 (0x81F0AD5E)

     

    So TMG ignores the rest of the session (like TLS client hello coming from the client machine)

     

    6463        14:33:49 01.05.2012               39.9103936             0.0019755                               10.110.233.50         172.16.1.1         TLS          TLS:TLS Rec Layer-1 HandShake: Client Hello.

    6489        14:33:49 01.05.2012               40.2259991             0.3156055                               10.110.233.50         172.16.1.1         TCP                TCP:[ReTransmit #6463]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

    6614        14:33:50 01.05.2012               40.8343158             0.6083167                               10.110.233.50         172.16.1.1         TCP                TCP:[ReTransmit #6463]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

    6806        14:33:51 01.05.2012               42.0457229             1.2114071                               10.110.233.50         172.16.1.1         TCP                TCP:[ReTransmit #6463]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

    6872        14:33:52 01.05.2012               43.1311020             1.0853791                               172.16.1.1         10.110.233.50         TCP                TCP:Flags=...A..S., SrcPort=HTTPS(443), DstPort=50183, PayloadLen=0, Seq=2180033885, Ack=367515136, Win=64240 ( Scale factor not supported ) = 64240

    6873        14:33:52 01.05.2012               43.1314010             0.0002990                               10.110.233.50         172.16.1.1         TCP                TCP:Flags=...A...., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=0, Seq=367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

    6891        14:33:52 01.05.2012               43.2517820             0.1203810                               10.110.233.50         172.16.1.1         TCP                TCP:[ReTransmit #6463]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

    7110        14:33:54 01.05.2012               44.4514449             1.1996629                               10.110.233.50         172.16.1.1         TCP                TCP:[ReTransmit #6463]Flags=...AP..., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=127, Seq=367515136 - 367515263, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

    7481        14:33:55 01.05.2012               45.9360680             1.4846231                               172.16.1.1         10.110.233.50         TCP                TCP:Flags=...A.R.., SrcPort=HTTPS(443), DstPort=50183, PayloadLen=0, Seq=2180033886, Ack=367515136, Win=8192 (scale factor 0x0) = 8192 

     

    7. When I check the client side trace, I see that the ACK number in the TCP ACK packet is really set to the wrong value (2180033888) by the client:

     

      Frame: Number = 6462, Captured Frame Length = 60, MediaType = ETHERNET

    + Ethernet: Etype = Internet IP (IPv4),DestinationAddress:[00-11-22-33-44-55],SourceAddress:[00-12-34-56-78-1B]

    + Ipv4: Src = 10.110.233.50, Dest = 172.16.1.1, Next Protocol = TCP, Packet ID = 24041, Total IP Length = 40

    - Tcp: Flags=...A...., SrcPort=50183, DstPort=HTTPS(443), PayloadLen=0, Seq=367515136, Ack=2180033888, Win=64860 (scale factor 0x0) = 64860

        SrcPort: 50183

        DstPort: HTTPS(443)

        SequenceNumber: 367515136 (0x15E7D600)

        AcknowledgementNumber: 2180033888 (0x81F0AD60)

      + DataOffset: 80 (0x50)

      + Flags: ...A....

        Window: 64860 (scale factor 0x0) = 64860

        Checksum: 0x4B5D, Good

        UrgentPointer: 0 (0x0)

     

    8. One might think that it’s a problem with the TCP/IP stack on the client. However, when we check the TCP SYN ACK packet (the second packet of the handshake, forwarded by the TMG server before the client sends the TCP ACK with the wrong acknowledgement number), we see that the client receives it with 2 bytes of extra data. That is unusual for a TCP SYN ACK packet, which normally carries only a TCP header and no data:

     

      Frame: Number = 456, Captured Frame Length = 60, MediaType = ETHERNET

    + Ethernet: Etype = Internet IP (IPv4),DestinationAddress:[00-11-22-33-44-55],SourceAddress:[00-12-34-56-78-1B]

      + DestinationAddress: Microsoft Corporation [00-11-22-33-44-55]

      + SourceAddress: Test Data [00-12-34-56-78-1B]

        EthernetType: Internet IP (IPv4), 2048(0x800)

    - Ipv4: Src = 172.16.1.1, Dest = 10.110.233.50, Next Protocol = TCP, Packet ID = 1342, Total IP Length = 46

      + Versions: IPv4, Internet Protocol; Header Length = 20

      + DifferentiatedServicesField: DSCP: 0, ECN: 0

        TotalLength: 46 (0x2E)

        Identification: 1342 (0x53E)

      + FragmentFlags: 0 (0x0)

        TimeToLive: 120 (0x78)

        NextProtocol: TCP, 6(0x6)

        Checksum: 46957 (0xB76D)

        SourceAddress: 172.16.1.1

        DestinationAddress: 10.110.233.50

    - Tcp: Flags=...A..S., SrcPort=HTTPS(443), DstPort=50183, PayloadLen=2, Seq=2180033885 - 2180033888, Ack=367515136, Win=64240 ( Scale factor not supported ) = 64240

        SrcPort: HTTPS(443)

        DstPort: 50183

        SequenceNumber: 2180033885 (0x81F0AD5D)

        AcknowledgementNumber: 367515136 (0x15E7D600)

      + DataOffset: 96 (0x60)

      + Flags: ...A..S.

        Window: 64240 ( Scale factor not supported ) = 64240

        Checksum: 0x365C, Good

        UrgentPointer: 0 (0x0)

      - TCPOptions:

       - MaxSegmentSize: 1

          type: Maximum Segment Size. 2(0x2)

          OptionLength: 4 (0x4)

          MaxSegmentSize: 1380 (0x564)

      - TCPPayload: SourcePort = 443, DestinationPort = 50183

         UnknownData: Binary Large Object (2 Bytes)

     

    That’s why the TCP/IP stack running on the client sends a TCP ACK number that is 2 higher than it should be:

     

    ACK number sent by the client:                                                AcknowledgementNumber: 2180033888 (0x81F0AD60)

    The correct ACK number that should have been sent:        AcknowledgementNumber: 2180033886 (0x81F0AD5E)
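
    The two values follow directly from how a receiver computes the next sequence number it expects: the segment's sequence number, plus one for a SYN (or FIN) flag, plus the payload length. A small sketch of that calculation, with a hypothetical helper name:

```python
def next_expected_seq(seq, payload_len, syn=False, fin=False):
    """Sequence number the receiver acknowledges after accepting this segment."""
    return (seq + payload_len + int(syn) + int(fin)) % 2 ** 32

# SYN ACK as the client received it (frame 456): 2 bytes of payload.
ack_as_received = next_expected_seq(2180033885, payload_len=2, syn=True)

# SYN ACK as it left the TMG server (frame 6461): no payload.
ack_as_sent = next_expected_seq(2180033885, payload_len=0, syn=True)

print(ack_as_received, ack_as_sent)
```

    So the client behaves consistently with what it received; the 2 extra bytes alone account for the difference between 2180033888 and 2180033886.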

      

    9. And when we check the TCP SYN ACK packet as it leaves the TMG server, we don’t see those extra 2 bytes:

     

      Frame: Number = 6461, Captured Frame Length = 60, MediaType = ETHERNET

    + Ethernet: Etype = Internet IP (IPv4),DestinationAddress:[00-11-22-33-44-55],SourceAddress:[00-12-34-56-78-1B]

    + Ipv4: Src = 172.16.1.1, Dest = 10.110.233.50, Next Protocol = TCP, Packet ID = 1342, Total IP Length = 44

    - Tcp: Flags=...A..S., SrcPort=HTTPS(443), DstPort=50183, PayloadLen=0, Seq=2180033885, Ack=367515136, Win=64240 ( Scale factor not supported ) = 64240

        SrcPort: HTTPS(443)

        DstPort: 50183

        SequenceNumber: 2180033885 (0x81F0AD5D)

        AcknowledgementNumber: 367515136 (0x15E7D600)

      + DataOffset: 96 (0x60)

      + Flags: ...A..S.

        Window: 64240 ( Scale factor not supported ) = 64240

        Checksum: 0x365E, Good

        UrgentPointer: 0 (0x0)

      - TCPOptions:

       - MaxSegmentSize: 1

          type: Maximum Segment Size. 2(0x2)

          OptionLength: 4 (0x4)

          MaxSegmentSize: 1380 (0x564)

     

    So those 2 extra bytes of data are somehow added to the TCP SYN ACK by something else, like the NIC driver on the TMG server, a network device in between (like a wireless access point) or the NIC driver on the client machine.

     

    SUMMARY:

    ==========

    In summary, the HTTPS connectivity problem stems from an issue between the client and the TMG server (including the NIC layer or below on the client and server, and the network devices/links in between the two).

     

    My customer informed me that the issue was visible on all clients, which makes a client-side cause unlikely. I advised my customer to update the NIC drivers on the TMG server and to check the network devices in the path, upgrading firmware where possible.

     

    Hope this helps

     

    Thanks,

    Murat

  • How to decrypt an SSL or TLS session by using Wireshark

    [Updated on 26th October 2013]

    A newer version of this blog post is available at:

    http://blogs.technet.com/b/nettracer/archive/2013/10/12/decrypting-ssl-tls-sessions-with-wireshark-reloaded.aspx

    Hi there,

     

    In this blog post, I would like to talk about decrypting SSL/TLS sessions by using Wireshark, provided that you have access to the server certificate’s private key. In some cases it may be quite useful to see what is exchanged under the hood of an SSL/TLS session for troubleshooting purposes. You’ll find the complete steps to do this on Windows systems. Even though there are a couple of documents around (you can find the references at the end of the blog post), the steps from any single document don’t fully apply and you get stuck at some point. I tested the following steps a couple of times on a Windows 2008 server and they seem to work fine.

     

    Here are the details of the process:

     

    First of all, we’ll need the following tools (these are the versions I tested with):

     

    http://www.wireshark.org/download.html

    Wireshark -> Version 1.2.8

     

    http://www.slproweb.com/products/Win32OpenSSL.html

    (Win32 OpenSSL v1.0.0.a Light)

    openssl                  -> 1.0.0a

     

    1) We first need to export the certificate that is used by the server side in the SSL/TLS session with the following steps:

     

    Note: The Certificate Export Wizard can be started by right-clicking the related certificate in the Certificates MMC snap-in and selecting the “All Tasks > Export” option.

     

     

     

    2) Next, we’ll need to convert the private key from PKCS12 format to PEM format (which is used by Wireshark). This is done in two steps with the openssl tool:

     

    c:\OpenSSL-Win32\bin> openssl pkcs12 -in iis.pfx -out key.pem -nocerts -nodes

    Enter Import Password: <<Password used when exporting the certificate in PKCS12 format>>

     

    c:\OpenSSL-Win32\bin> openssl rsa -in key.pem -out keyout.pem

    writing RSA key

     

    => After the last command, the output file “keyout.pem” should have the following format:

     

    -----BEGIN RSA PRIVATE KEY-----

    jffewjlkfjelkjfewlkjfew.....

    ...

    akfhakdfhsakfskahfksjhgkjsah

    -----END RSA PRIVATE KEY-----
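
    Before pointing Wireshark at the file, it can be worth checking that the conversion really produced the header shown above. The following is only a sketch (the function names are mine; the file name comes from the steps above). It skips any leading non-PEM lines, such as the “Bag Attributes” text that openssl pkcs12 may write before the key.

```python
def pem_label(pem_text: str) -> str:
    """Return the label of the first PEM block, e.g. 'RSA PRIVATE KEY'."""
    prefix, suffix = "-----BEGIN ", "-----"
    for line in pem_text.splitlines():
        line = line.strip()
        if line.startswith(prefix) and line.endswith(suffix):
            return line[len(prefix):-len(suffix)]
    raise ValueError("no PEM BEGIN line found")


def is_traditional_rsa_key(pem_text: str) -> bool:
    """True if the first PEM block is a traditional 'RSA PRIVATE KEY' block."""
    return pem_label(pem_text) == "RSA PRIVATE KEY"


# Shortened, made-up sample content for illustration only.
sample = "-----BEGIN RSA PRIVATE KEY-----\nabc...\n-----END RSA PRIVATE KEY-----\n"
print(is_traditional_rsa_key(sample))

# Usage against the real file from step 2:
#   with open(r"c:\OpenSSL-Win32\bin\keyout.pem") as f:
#       print(is_traditional_rsa_key(f.read()))
```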

     

    3) Now we can use the private key file in Wireshark as given below:

     

    Note: The following dialog box can be opened by selecting Edit > Preferences, expanding “Protocols” in the left pane and then selecting SSL:

     

    Notes:

     

    - 172.17.1.1 is the server IP address. This is the server using the certificate that we extracted the private key from.

    - 443 is the TCP port on the server side.

    - http is the protocol carried inside the SSL/TLS session.

    - c:\tls\keyout.pem is the file that contains the converted private key.

    - c:\tls\debug2.txt is the file to which Wireshark writes information about the decryption process.
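
    The first four values above go into the “RSA keys list” field of the dialog. As far as I know, Wireshark versions of that generation expect them as one comma-separated entry (ip,port,protocol,key file); newer versions use a table dialog instead, so please treat the format below as an assumption to verify against your version. The helper name is hypothetical.

```python
def rsa_keys_entry(ip: str, port: int, protocol: str, key_file: str) -> str:
    """Build one 'ip,port,protocol,keyfile' entry (assumed Wireshark 1.x format)."""
    return f"{ip},{port},{protocol},{key_file}"


entry = rsa_keys_entry("172.17.1.1", 443, "http", r"c:\tls\keyout.pem")
print(entry)
```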

     

    4) Once everything is ready, you can click “Apply” to start the decryption process. Wireshark will show you the packets of the given session in decrypted form. Here is the difference between the encrypted and decrypted versions:

     

    a) How it is seen before Wireshark decrypts SSL/TLS session:

     

     

    b) How it is seen after Wireshark decrypts SSL/TLS session:

      

     

    5) Since the private key of a certificate should be treated like a password, we can’t ask our customers for it when we’re troubleshooting a problem on their behalf rather than in our own environment. The following alternatives could be used in that case:

     

    Note: It looks like a capture file decrypted by using the private key can’t be saved as a separate capture file in decrypted form.

     

    - After decrypting the traffic, we could examine it in a live meeting session where the customer shares his desktop

    - The decrypted packets could be printed to a file from File > Print option (by choosing the “Output to file” option)

    - By right clicking one of the decrypted packets and selecting “Follow SSL Stream”, we can save the session content to a text file. The following is an example of such a file created that way:

     

    6) More information could be found at the following links:

     

    Citrix

    http://support.citrix.com/article/CTX116557

     

    Wireshark

    http://wiki.wireshark.org/SSL

     

     

    Hope this helps

     

    Thanks,

    Murat

  • Effects of incorrect QoS policies: A story behind a slow file copy...

    Hi there,

     

    In this blog post, I’ll talk about another network trace analysis scenario.

     

    The problem was that some Windows XP clients were copying files from a NAS device very slowly compared to others. Since a network trace is one of the most useful logs for troubleshooting such problems, I requested one to be collected on an affected Windows XP client. Normally it’s best to collect simultaneous network traces on both ends, but it was a bit difficult to collect a trace on the NAS device side, so we were limited to a client-side trace.

     

    Before I start explaining how I got to the bottom of the issue, I would like to provide you with some background on how files are read by Windows via SMB protocol so that you’ll better understand the resolution part:

     

    Windows XP and Windows 2003 use SMB v1 protocol for remote file system access (like creating/reading/writing/deleting/locking files over a network connection). Since it was a file read from the remote server in this scenario, the following SMB activity would be seen between the client and server:

     

    Client                                      Server

    =====                                     ======

    The client will open the file at the server first:

     

    SMB Create AndX request ---->

                                              <---- SMB Create AndX response

     

    Then the client will send SMB Read AndX requests to retrieve blocks of the file:

     

    SMB Read AndX request   ----> (61440 bytes)

                                              <---- SMB Read AndX response

     

    SMB Read AndX request   ----> (61440 bytes)

                                              <---- SMB Read AndX response

     

    SMB Read AndX request   ----> (61440 bytes)

                                              <---- SMB Read AndX response

     

    SMB Read AndX request   ----> (61440 bytes)

                                              <---- SMB Read AndX response

    ...

     

    Note: SMBv1 can request at most 61440 bytes (60 KB) of data in one SMB Read AndX request.

     

     

    After this short overview, let’s get back to the original problem and analyze the packets taken from the real network trace:

     

    Frame#  Time delta    Source IP             Destination IP     Protocol        Information

    =====     ========     =========           ==========         ======            ========

    ....

    59269    0.000000              10.1.1.1                10.1.1.2                SMB       Read AndX Request, FID: 0x0494, 61440 bytes at offset 263823360

    59270    0.000000              10.1.1.2                10.1.1.1                SMB       Read AndX Response, 61440 bytes

    59271    0.000000              10.1.1.2                10.1.1.1                TCP        [Continuation to #59270] microsoft-ds > foliocorp [ACK] Seq=65993793

    59272    0.000000              10.1.1.2                10.1.1.1                TCP        [Continuation to #59270] microsoft-ds > foliocorp [ACK] Seq=65995249

    59273    0.000000              10.1.1.2                10.1.1.1                TCP        [Continuation to #59270] microsoft-ds > foliocorp [ACK] Seq=65996705

    ...

    59320    0.000000              10.1.1.2                10.1.1.1                TCP        [Continuation to #59270] microsoft-ds > foliocorp [ACK] Seq=66049121

    59321    0.000000              10.1.1.2                10.1.1.1                TCP        [Continuation to #59270] microsoft-ds > foliocorp [ACK] Seq=66050577

    59322    0.000000              10.1.1.2                10.1.1.1                TCP        [Continuation to #59270] microsoft-ds > foliocorp [ACK] Seq=66052033

    59323    0.000000              10.1.1.1                10.1.1.2                TCP        foliocorp > microsoft-ds [ACK] Seq=67600 Ack=66053489 Win=65535

    59325    0.406250              10.1.1.2                10.1.1.1                TCP       [Continuation to #59270] microsoft-ds > folioc [PSH, ACK]Seq=66053489

     

    59326    0.000000              10.1.1.1                10.1.1.2                SMB       Read AndX Request, FID: 0x0494, 61440 bytes at offset 263884800

    59327    0.000000              10.1.1.2                10.1.1.1                SMB       Read AndX Response, 61440 bytes

    59328    0.000000              10.1.1.2                10.1.1.1                TCP        [Continuation to #59327] microsoft-ds > foliocorp [ACK] Seq=66055297

    ...

     

    Now let’s take a closer look at some related frames:

     

    Frame# 59269 => Client requests the next 61440 bytes (60 KB) of data at offset 263823360 from the file represented by FID 0x0494 (this FID is assigned by the server when the file is first opened/created)

     

    Frame# 59270 => Server starts sending 61440 bytes of data back to the client in SMB Read AndX response.

     

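    Frame# 59271 => The remaining data is sent in consecutive TCP segments whose size is limited by the negotiated TCP MSS, which is generally 1460 bytes on Ethernet (like frame# 59272, frame# 59273 etc). In this trace the sequence numbers advance by 1456 bytes per segment.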
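
    The size of each continuation segment can be read off the trace by subtracting consecutive TCP sequence numbers. Here is a tiny sketch of that check, using only the Seq values listed in the trace above:

```python
# Seq values of consecutive continuation frames, copied from the trace.
first_run = [65993793, 65995249, 65996705]             # frames 59271-59273
last_run = [66049121, 66050577, 66052033, 66053489]    # frames 59320-59322 and 59325


def segment_sizes(seqs):
    """Bytes carried by each segment = difference between consecutive sequence numbers."""
    return [b - a for a, b in zip(seqs, seqs[1:])]


print(segment_sizes(first_run))
print(segment_sizes(last_run))
```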

     

    The most noticeable thing in the network trace was the large number of such 0.4-second delays (like the one we see at frame #59325). Those delays were always present on the last segment of each 60 KB block of data returned by the server.

     

    Normally 0.4 seconds could be seen as a very small delay, but considering that the client has to send many SMB Read AndX requests to the server to read the file, it quickly becomes clear that a 0.4-second delay per request is huge (for example, the client needs to send roughly 1,100 SMB Read AndX requests to read a 64 MB file, which adds up to more than 7 minutes of delay).
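
    To get a feel for how quickly the per-request delay adds up, here is a back-of-the-envelope sketch. It assumes every read is a full 61440-byte request and that each one suffers the delay seen at frame 59325; the function name is mine.

```python
import math

READ_SIZE = 61440          # bytes requested per SMB Read AndX in the trace
DELAY_PER_READ = 0.40625   # seconds, time delta shown for frame 59325


def added_delay(file_size_bytes: int):
    """Return (number of read requests, total extra seconds spent waiting)."""
    reads = math.ceil(file_size_bytes / READ_SIZE)
    return reads, reads * DELAY_PER_READ


reads, delay = added_delay(64 * 1024 * 1024)   # a 64 MB file
print(reads, round(delay / 60, 1))             # number of requests, added delay in minutes
```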

     

    Generally we’re used to seeing some delays in network traces due to packet retransmissions (caused by packet loss), link transfer delays etc. But seeing a constant delay of 0.4 seconds on the last segment of every 60 KB block made me suspect that a QoS implementation was in place somewhere between the client and server. By delaying every read request by about 0.4 seconds, the file copy was effectively being slowed down on purpose: traffic shaping/limiting.

     

    Since we didn’t have a network trace collected on the NAS device side, we couldn’t tell whether the QoS policy was in effect on the NAS device itself or on a network device in between the two (we checked the client side and there was no QoS configuration in place). After further checking the network devices, it turned out that there was an incorrectly configured QoS policy on one of them. After making the required changes, the problem was resolved...

     

    Hope this helps

     

    Thanks,

    Murat

  • Internet Explorer doesn't display ISA or TMG error message 502 when connecting to HTTPS servers

    Hi there,

    I would like to talk about an issue that I have dealt with recently regarding Internet Explorer and displaying TMG error messages.

    The problem reported was that newer IE versions (like 8 or 9) didn’t display the regular TMG error message that is shown when an access rule allows only certain users and the current user is not one of them (Error Code: 502 Proxy Error. The Forefront TMG denied the specified Uniform Resource Locator (URL). (12202)). Instead, the "Page not found" error was displayed. That was causing help desk calls, because users concluded from the displayed error that the target web site was unreachable, whereas the real problem was that the user was not allowed to access the given web site.

    IE6 didn’t have the same problem. Then we started investigating the problem from TMG perspective to make sure that it wasn’t something stemming from TMG server side. After some further troubleshooting (network traces), we found out that TMG was sending the regular error page back to the client but somehow it wasn’t displayed by the IE client.

    Then we focused on the IE side. After some further investigation, I found out that this is the expected default behavior of newer Internet Explorer versions (8 and 9; we haven’t tested 7, but this might apply to 7 as well) for security reasons. You can find more information below about the vulnerability that could be exploited when IE uses proxy servers to connect to target servers:

    Pretty-Bad-Proxy: An Overlooked Adversary in Browsers’ HTTPS Deployments

    Having said that, there’s a registry key which allows you to turn this enhanced security feature off in newer IE versions. You can see the details below on how to do this on the client machines:

    http://msdn.microsoft.com/en-us/library/ms537184(VS.85).aspx Introduction to Feature Controls

    - You’ll need to create the following key and value at the given path on a client machine:

    HKEY_LOCAL_MACHINE (or HKEY_CURRENT_USER)
         SOFTWARE
              Microsoft
                   Internet Explorer
                        Main
                             FeatureControl
                                          FEATURE_SHOW_FAILED_CONNECT_CONTENT_KB942615     (Note: you’ll also need to create that registry key under “FeatureControl”)

                                                   Reg value name: iexplore.exe

                                                   Type:               REG_DWORD

                                                   Value:              0x00000001
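
    If the setting has to be rolled out to several clients, the same key and value can be expressed as a .reg file. The following is only a sketch that renders the file text from the path given above (the function name and the choice of the HKEY_LOCAL_MACHINE hive are my assumptions; please test the resulting file on a lab machine before importing it anywhere else).

```python
def feature_control_reg(hive: str, feature: str, process: str, enabled: bool) -> str:
    """Render the text of a .reg file that sets one FeatureControl DWORD value."""
    key = "\\".join([
        hive, "SOFTWARE", "Microsoft", "Internet Explorer",
        "Main", "FeatureControl", feature,
    ])
    value = 1 if enabled else 0
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{key}]\n"
        f'"{process}"=dword:{value:08x}\n'
    )


reg_text = feature_control_reg(
    "HKEY_LOCAL_MACHINE",
    "FEATURE_SHOW_FAILED_CONNECT_CONTENT_KB942615",
    "iexplore.exe",
    enabled=True,
)
print(reg_text)
```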

     

     

    You can also get some more information at http://msdn.microsoft.com/en-us/library/dd565641(VS.85).aspx#eventLog Event 1065 - Web Proxy Error Handling Changes

     

    I would like to re-emphasize that this feature normally shouldn’t be turned off from a security perspective, so please implement the change at your own risk.

     

    Hope this helps

    Thanks,

    Murat

  • PDF file corrupted when downloaded through TMG server

    Recently I dealt with a problem where a PDF file downloaded from a certain external web site was always corrupted, and I would like to talk about how I troubleshot that problem. The client was connected to the internet through a four-node TMG 2010/SP2 array.

    We decided to collect the following logs to better understand why the file was corrupted:

    - Network trace on the internal client

    - TMG data packager on one of the TMG servers

    (Since the problem was reproducible with any of the TMG servers set as the proxy server, we configured a single array member as the proxy server in order to collect fewer logs)

    Note: TMG data packager is installed as part of TMG Best practices Analyzer installation

    http://www.microsoft.com/en-us/download/details.aspx?id=17730

    Microsoft Forefront Threat Management Gateway Best Practices Analyzer Tool

     

    The results from the log analysis were as below:

    - There weren’t any connectivity problems present in the TCP sessions (through which the file was downloaded) in the network trace collected on the client, internal and external interfaces of TMG server

    - The error code for the given file download was 13: (taken from Web proxy log)

    Action   Client IP     Destination Host IP   Server Name   Operation   Result Code   URL

    ======   =========     ===================   ===========   =========   ===========   ===

    Failed   10.200.1.20   10.1.1.1              Proxy1        GET         13            http://.../Report.pdf

    Failed   10.200.1.20   10.1.1.1              Proxy1        GET         13            http://.../Report.pdf

     

    Note: IP addresses/links/proxy names etc are deliberately changed

    Error 13 is “The data is invalid”:

    C:\>net helpmsg 13

    The data is invalid.

    So the TMG server thinks that the received data was invalid. That also explains why the downloaded file was corrupted.

    Then I decided to take a look at the ETL trace, which was also collected with the TMG Data Packager. The reason why the TMG server considered the data invalid was clearly visible there:

    ... GZIP Dempression failed. Drop the request. (connection closed=0) 0x8007000d(ERROR_INVALID_DATA)

    Because the file decompression fails on the TMG server, the TMG server ends the session with ERROR_INVALID_DATA (error 13).
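
    The link between the 0x8007000d value in the ETL trace and result code 13 in the Web proxy log becomes visible once the HRESULT is split into its fields. Below is a small sketch of that decoding (the function name is mine; the bit layout mirrors the HRESULT_FACILITY and HRESULT_CODE macros in winerror.h):

```python
FACILITY_WIN32 = 7   # HRESULTs in this facility wrap a plain Win32 error code


def decode_hresult(hr: int):
    """Split a 32-bit HRESULT into (severity, facility, code)."""
    severity = (hr >> 31) & 0x1      # 1 = failure
    facility = (hr >> 16) & 0x1FFF   # which component defined the code
    code = hr & 0xFFFF               # the error code itself
    return severity, facility, code


severity, facility, code = decode_hresult(0x8007000D)
print(severity, facility, code)
```

    A failure HRESULT in the Win32 facility carries the Win32 error number in its low 16 bits, which is the same number that “net helpmsg” translates above.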

    Note: Please note that you have to contact Microsoft support for ETL trace conversion

    Note: You can also collect a similar diagnostics log from TMG server’s console:


    (Before reproducing the problem you have to enable logging from “Enable Diagnostic Logging” and once the problem is reproduced you have to disable logging by selecting “Disable Diagnostic Logging”)

    For troubleshooting purposes, I suggested turning off compression on the TMG server:

    (We remove “External” from the “Request compressed HTTP content when sending requests to these network elements” section.)

    As expected, the corrupted file download problem was resolved. With the above configuration change, we tell the TMG server not to ask for compression when sending HTTP requests out to external web servers, so the file was downloaded in uncompressed format. Please note that the TMG server asks for compression for HTTP requests sent to external web sites by default, which saves some bandwidth by reducing the amount of data transferred.

     

    We decided that the problem was somehow related to the target web site or upstream Web proxy because the same TMG server was able to successfully download HTTP content in compressed format from other external web sites.

     

    Normally it’s possible to turn off compression for a specific web site (which could be configured from the “Exceptions” tab in the above screen shot). But the TMG array in question was configured to use an upstream proxy for all external web traffic, so creating an exception wouldn’t make much difference here. Our customer decided to keep HTTP compression off (and re-enable it once the file downloads from the given web site were finished).

     

    Hope this helps

    Thanks,

    Murat

  • Why does anonymous PIPE access fail on Windows Vista, 2008, Windows 7 or Windows 2008 R2

    Hi there,

    In this blog post, I would like to talk about a named pipe access issue on Windows 2008 that I had to deal with recently. One of our customers was having problems accessing named pipes anonymously on Windows 2008, and we were engaged to address the issue. Even though the required configuration for anonymous pipe access was in place, the pipe client was getting ACCESS DENIED when trying to access the pipe.

    The problem was easy to reproduce on any Windows Vista or later system. Just run a named pipe server application which creates a pipe, then try to connect to the pipe anonymously from a remote system. You can see more details below on how to reproduce this behavior:

    a) Compile the sample pipe server and client applications given at the following MSDN links:

    http://msdn.microsoft.com/en-us/library/aa365588(VS.85).aspx Multithreaded Pipe Server
    http://msdn.microsoft.com/en-us/library/aa365592(VS.85).aspx Named Pipe Client

    b) Add the named pipe created by the pipe server to the null session pipes list (by configuring the "NullSessionPipes" registry value under the LanmanServer parameters key)

    c) Do not enable “Network access: Let Everyone permissions apply to anonymous users” from local GPO or domain GPO

    d) Make sure that the pipe ACL includes the Anonymous user with Full Control permission. (You can do that by using a 3rd party application like pipeacl.exe)

    e) Start a command line within the Local System account security context by running a command similar to below:

    at 12:40 /interactive cmd.exe

    f) Run the pipe client application from the command line and try to connect to the pipe. You'll get an ACCESS_DENIED in response from the server.

    Note that as soon as you re-enable “Network access: Let Everyone permissions apply to anonymous users”, the pipe client can successfully open the pipe and read from/write to it.

    UNDERSTANDING THE ROOT CAUSE:
    ==============================

    Now that the problem is explained, let's take a look at its root cause:

    Note: Some of the outputs below are WinDBG (debugger) outputs.

     

    1) From Vista onwards, in order to access an object, you need to pass two security checks:

     

    a) Integrity check  => For Vista onwards

    b) Classical access check (checking object’s security descriptor against the desired access) => For all Windows versions

     

    Note: The integrity check can’t be turned off even if you disable UAC (and we wouldn’t want to do that either)

     

    2) In the test pipe application, we see the following differences between having “Network access: Let Everyone permissions apply to anonymous users” enabled and disabled:

     

    a) “Network access: Let Everyone permissions apply to anonymous users” ENABLED situation

     

    => Token of the thread:

    - The user is anonymous (as expected)

    - The token also has the SID S-1-16-8192 (which represents Medium integrity level). So the thread will be accessing the pipe object while its integrity level is Medium

     

    kd> !token -n

    _ETHREAD 892ef810, _TOKEN 9a983b00

    TS Session ID: 0

    User: S-1-5-7 (Well Known Group: NT AUTHORITY\ANONYMOUS LOGON)

    Groups:

     00 S-1-0-0 (Well Known Group: localhost\NULL SID)

        Attributes -

     01 S-1-1-0 (Well Known Group: localhost\Everyone)

        Attributes - Mandatory Default Enabled

     02 S-1-5-2 (Well Known Group: NT AUTHORITY\NETWORK)

        Attributes - Mandatory Default Enabled

     03 S-1-5-15 (Well Known Group: NT AUTHORITY\This Organization)

        Attributes - Mandatory Default Enabled

     04 S-1-5-64-10 (Well Known Group: NT AUTHORITY\NTLM Authentication)

        Attributes - Mandatory Default Enabled

     05 S-1-16-8192 Unrecognized SID

        Attributes - GroupIntegrity GroupIntegrityEnabled

    Primary Group: S-1-0-0 (Well Known Group: localhost\NULL SID)

    Privs:

     23 0x000000017 SeChangeNotifyPrivilege           Attributes - Enabled Default

    Authentication ID:         (0,15a3fe2)

    Impersonation Level:       Impersonation

    TokenType:                 Impersonation

    Source: NtLmSsp            TokenFlags: 0x2000

    Token ID: 15a3fe5          ParentToken ID: 0

    Modified ID:               (0, 15a3fe8)

    RestrictedSidCount: 0      RestrictedSids: 00000000

    OriginatingLogonSession: 0

     

    => Security descriptor of the pipe (DACL & SACL of the pipe object)

    - Anonymous Logon has full access to the pipe object

    - Integrity levels of objects are stored in the SACL of the object’s security descriptor. If the integrity level is not explicitly assigned, the object’s integrity level is Medium.

     

    kd> !sd 0x81f69848 1

    ->Revision: 0x1

    ->Sbz1    : 0x0

    ->Control : 0x8004

                SE_DACL_PRESENT

                SE_SELF_RELATIVE

    ->Owner   : S-1-5-32-544 (Alias: BUILTIN\Administrators)

    ->Group   : S-1-5-21-1181840707-4124064209-3703316816-513 (no name mapped)

    ->Dacl    :

    ->Dacl    : ->AclRevision: 0x2

    ->Dacl    : ->Sbz1       : 0x0

    ->Dacl    : ->AclSize    : 0x5c

    ->Dacl    : ->AceCount   : 0x4

    ->Dacl    : ->Sbz2       : 0x0

    ->Dacl    : ->Ace[0]: ->AceType: ACCESS_ALLOWED_ACE_TYPE

    ->Dacl    : ->Ace[0]: ->AceFlags: 0x0

    ->Dacl    : ->Ace[0]: ->AceSize: 0x18

    ->Dacl    : ->Ace[0]: ->Mask : 0x001f01ff

    ->Dacl    : ->Ace[0]: ->SID: S-1-5-32-544 (Alias: BUILTIN\Administrators)

     

    ->Dacl    : ->Ace[1]: ->AceType: ACCESS_ALLOWED_ACE_TYPE

    ->Dacl    : ->Ace[1]: ->AceFlags: 0x0

    ->Dacl    : ->Ace[1]: ->AceSize: 0x14

    ->Dacl    : ->Ace[1]: ->Mask : 0x001f01ff

    ->Dacl    : ->Ace[1]: ->SID: S-1-5-7 (Well Known Group: NT AUTHORITY\ANONYMOUS LOGON)

     

    ->Dacl    : ->Ace[2]: ->AceType: ACCESS_ALLOWED_ACE_TYPE

    ->Dacl    : ->Ace[2]: ->AceFlags: 0x0

    ->Dacl    : ->Ace[2]: ->AceSize: 0x14

    ->Dacl    : ->Ace[2]: ->Mask : 0x00120089

    ->Dacl    : ->Ace[2]: ->SID: S-1-1-0 (Well Known Group: localhost\Everyone)

     

    ->Dacl    : ->Ace[3]: ->AceType: ACCESS_ALLOWED_ACE_TYPE

    ->Dacl    : ->Ace[3]: ->AceFlags: 0x0

    ->Dacl    : ->Ace[3]: ->AceSize: 0x14

    ->Dacl    : ->Ace[3]: ->Mask : 0x001f01ff

    ->Dacl    : ->Ace[3]: ->SID: S-1-5-18 (Well Known Group: NT AUTHORITY\SYSTEM)

     

    ->Sacl    :  is NULL

     

    So in this scenario, a thread with an integrity level of Medium is accessing an object with an integrity level of Medium. Hence the access is OK from the integrity check perspective. Once the integrity check is passed, the DACL is evaluated next (the classical access check that is done in all Windows versions). Since the DACL of the pipe grants access to the Anonymous Logon user, the thread passes that stage as well and access to the pipe object is granted.
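
    The Mask values in the DACL above are easier to read once they are broken into the individual rights. Here is a small sketch that does this with the file-object access right constants from winnt.h (the decoding function itself is mine; note that on pipe objects bit 0x4 is also known as FILE_CREATE_PIPE_INSTANCE):

```python
# Access right bits for file and pipe objects, values as defined in winnt.h.
ACCESS_BITS = [
    (0x00000001, "FILE_READ_DATA"),
    (0x00000002, "FILE_WRITE_DATA"),
    (0x00000004, "FILE_APPEND_DATA"),
    (0x00000008, "FILE_READ_EA"),
    (0x00000010, "FILE_WRITE_EA"),
    (0x00000020, "FILE_EXECUTE"),
    (0x00000040, "FILE_DELETE_CHILD"),
    (0x00000080, "FILE_READ_ATTRIBUTES"),
    (0x00000100, "FILE_WRITE_ATTRIBUTES"),
    (0x00010000, "DELETE"),
    (0x00020000, "READ_CONTROL"),
    (0x00040000, "WRITE_DAC"),
    (0x00080000, "WRITE_OWNER"),
    (0x00100000, "SYNCHRONIZE"),
]


def decode_mask(mask: int):
    """Return the names of all known access rights set in an ACE mask."""
    return [name for bit, name in ACCESS_BITS if mask & bit]


print(decode_mask(0x001F01FF))   # Administrators, ANONYMOUS LOGON and SYSTEM ACEs
print(decode_mask(0x00120089))   # Everyone ACE
```

    The first mask contains every right in the table (full control), while the Everyone ACE only carries the read-related rights.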

     

    b) “Network access: Let Everyone permissions apply to anonymous users” DISABLED situation

     

    => Token of the thread:

    - The user is anonymous (as expected)

    - The token also has the SID S-1-16-0 (which represents Untrusted integrity level). So the thread will be accessing the pipe object while its integrity level is Untrusted

     

    kd> !token -n

    _ETHREAD 892dab58, _TOKEN 9a81b7f8

    TS Session ID: 0

    User: S-1-5-7 (Well Known Group: NT AUTHORITY\ANONYMOUS LOGON)

    Groups:

     00 S-1-0-0 (Well Known Group: localhost\NULL SID)

        Attributes -

     01 S-1-5-2 (Well Known Group: NT AUTHORITY\NETWORK)

        Attributes - Mandatory Default Enabled

     02 S-1-5-15 (Well Known Group: NT AUTHORITY\This Organization)

        Attributes - Mandatory Default Enabled

     03 S-1-5-64-10 (Well Known Group: NT AUTHORITY\NTLM Authentication)

        Attributes - Mandatory Default Enabled

     04 S-1-16-0 Unrecognized SID

        Attributes - GroupIntegrity GroupIntegrityEnabled

    Primary Group: S-1-0-0 (Well Known Group: localhost\NULL SID)

    Privs:

    Authentication ID:         (0,15a3909)

    Impersonation Level:       Impersonation

    TokenType:                 Impersonation

    Source: NtLmSsp            TokenFlags: 0x0

    Token ID: 15a390c          ParentToken ID: 0

    Modified ID:               (0, 15a390f)

    RestrictedSidCount: 0      RestrictedSids: 00000000

    OriginatingLogonSession: 0

     

     

    => Security descriptor of the pipe (DACL & SACL of the pipe object)

    - Anonymous Logon has full access to the pipe object

    - Integrity levels of objects are stored in the SACL of the object’s security descriptor. If the integrity level is not explicitly assigned, the object’s integrity level is Medium.

     

    kd> !sd 0x81f69848 1

    ->Revision: 0x1

    ->Sbz1    : 0x0

    ->Control : 0x8004

                SE_DACL_PRESENT

                SE_SELF_RELATIVE

    ->Owner   : S-1-5-32-544 (Alias: BUILTIN\Administrators)

    ->Group   : S-1-5-21-1181840707-4124064209-3703316816-513 (no name mapped)

    ->Dacl    :

    ->Dacl    : ->AclRevision: 0x2

    ->Dacl    : ->Sbz1       : 0x0

    ->Dacl    : ->AclSize    : 0x5c

    ->Dacl    : ->AceCount   : 0x4

    ->Dacl    : ->Sbz2       : 0x0

    ->Dacl    : ->Ace[0]: ->AceType: ACCESS_ALLOWED_ACE_TYPE

    ->Dacl    : ->Ace[0]: ->AceFlags: 0x0

    ->Dacl    : ->Ace[0]: ->AceSize: 0x18

    ->Dacl    : ->Ace[0]: ->Mask : 0x001f01ff

    ->Dacl    : ->Ace[0]: ->SID: S-1-5-32-544 (Alias: BUILTIN\Administrators)

     

    ->Dacl    : ->Ace[1]: ->AceType: ACCESS_ALLOWED_ACE_TYPE

    ->Dacl    : ->Ace[1]: ->AceFlags: 0x0

    ->Dacl    : ->Ace[1]: ->AceSize: 0x14

    ->Dacl    : ->Ace[1]: ->Mask : 0x001f01ff

    ->Dacl    : ->Ace[1]: ->SID: S-1-5-7 (Well Known Group: NT AUTHORITY\ANONYMOUS LOGON)

     

    ->Dacl    : ->Ace[2]: ->AceType: ACCESS_ALLOWED_ACE_TYPE

    ->Dacl    : ->Ace[2]: ->AceFlags: 0x0

    ->Dacl    : ->Ace[2]: ->AceSize: 0x14

    ->Dacl    : ->Ace[2]: ->Mask : 0x00120089

    ->Dacl    : ->Ace[2]: ->SID: S-1-1-0 (Well Known Group: localhost\Everyone)

     

    ->Dacl    : ->Ace[3]: ->AceType: ACCESS_ALLOWED_ACE_TYPE

    ->Dacl    : ->Ace[3]: ->AceFlags: 0x0

    ->Dacl    : ->Ace[3]: ->AceSize: 0x14

    ->Dacl    : ->Ace[3]: ->Mask : 0x001f01ff

    ->Dacl    : ->Ace[3]: ->SID: S-1-5-18 (Well Known Group: NT AUTHORITY\SYSTEM)

     

    ->Sacl    :  is NULL

     

    So in this scenario, a thread with an integrity level of Untrusted is accessing an object with an integrity level of Medium. Hence the access is NOT OK from the integrity check perspective and the integrity check mechanism denies access to the object. The classical DACL evaluation is not even performed here.

     

     

    In summary, anonymous pipe access fails because of the integrity check. When “Network access: Let Everyone permissions apply to anonymous users” is enabled, the EVERYONE SID is also added to the thread token and the token’s integrity level is raised (to Medium in this scenario), so the integrity check succeeds when this policy is enabled.
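
    The two-stage check described above can be modelled in a few lines. This is a deliberately simplified sketch of the behavior, not how the Windows kernel implements it: it only models the default “no write up” rule followed by the DACL result, and all function and parameter names are mine.

```python
# Well-known mandatory integrity level RIDs (the last part of an S-1-16-x SID).
INTEGRITY_LEVELS = {0: "Untrusted", 4096: "Low", 8192: "Medium", 12288: "High", 16384: "System"}


def integrity_rid(sid: str) -> int:
    """Extract the integrity RID from a SID such as 'S-1-16-8192'."""
    parts = sid.split("-")
    if len(parts) != 4 or parts[:3] != ["S", "1", "16"]:
        raise ValueError(f"not an integrity SID: {sid}")
    return int(parts[3])


def write_access_granted(token_sids, object_rid=8192, dacl_allows=True) -> bool:
    """Stage 1: integrity check (no write up). Stage 2: classical DACL check."""
    token_rid = next(integrity_rid(s) for s in token_sids if s.startswith("S-1-16-"))
    if token_rid < object_rid:
        return False          # denied by the integrity check, DACL is never evaluated
    return dacl_allows


policy_enabled = ["S-1-5-7", "S-1-1-0", "S-1-16-8192"]   # token from scenario a)
policy_disabled = ["S-1-5-7", "S-1-16-0"]                # token from scenario b)

print(write_access_granted(policy_enabled))
print(write_access_granted(policy_disabled))
print(write_access_granted(policy_disabled, object_rid=0))   # pipe labelled Untrusted
```

    The last call corresponds to lowering the pipe object’s integrity level to Untrusted, in which case the Untrusted token passes the integrity check and the DACL decides.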

     

     

     

    HOW TO FIX IT:

    ============

    1) The most meaningful solution here is to set the pipe object’s integrity level to Untrusted. If we can achieve this, we pass the integrity check because the token and the object that the token is trying to open (with Read/Write permissions) have the same integrity level (Untrusted).

     

    2) Changing the pipe object’s integrity level could be achieved in two different ways:

     

    a) Setting the integrity level while creating the pipe from the server application (via the security attributes passed to the CreateNamedPipe() API)

    b) Setting the integrity level after the PIPE is created (via SetSecurityInfo() API)  (Either from the server application or from another application)

     

    3) While searching for possible programmatic solutions, we came across a very good source code example that shows how to set the integrity level of the pipe to Untrusted while creating the pipe. It’s also a good example of how to write pipe applications that accept anonymous access on Windows Vista and later systems:

    => Blog link: (in German)

    http://blog.m-ri.de/index.php/2009/12/08/windows-integrity-control-schreibzugriff-auf-eine-named-pipe-eines-services-ueber-anonymen-zugriff-auf-vista-windows-2008-server-und-windows-7/

    Note: It's a 3rd party link so please connect to it at your own risk.

    =>  Just a few notes from the source code to further explain how it could be done:

    a) While the pipe is created, a security attributes structure is passed:

     

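            hPipe = CreateNamedPipe(

                        server,                     // pipe name

                        PIPE_ACCESS_DUPLEX,         // open mode: read and write

                        PIPE_TYPE_MESSAGE | PIPE_READMODE_MESSAGE | PIPE_WAIT,  // message mode, blocking

                        PIPE_UNLIMITED_INSTANCES,   // maximum number of instances

                        sizeof(DWORD),              // output buffer size

                        0,                          // input buffer size

                        NMPWAIT_USE_DEFAULT_WAIT,   // default time-out

                        &sa );                      // security attributes (carry the integrity label)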

     

     

    b) The integrity-level-related part of that security attributes structure is built as follows:

     

     

    ...

          // We need this only on Windows Vista, Windows 7 or Windows Server 2008

          OSVERSIONINFO osvi;

          osvi.dwOSVersionInfoSize = sizeof(osvi);

          if (!GetVersionEx(&osvi))

          {

                DisplayError( L"GetVersionEx" );

                return FALSE;

          }

     

          // Windows Vista, Windows Server 2008, Windows 7 or later

          if (osvi.dwMajorVersion>=6)

          {

                // Now the trick with the SACL:

                // We set SECURITY_MANDATORY_UNTRUSTED_RID to SYSTEM_MANDATORY_POLICY_NO_WRITE_UP

                // Anonymous access is untrusted, and this process runs equal or above medium

                // integrity level. Setting "S:(ML;;NW;;;LW)" is not sufficient.

                _tcscat(szBuff,_T("S:(ML;;NW;;;S-1-16-0)"));

          }

         

    ...

     

    The SDDL string "S:(ML;;NW;;;S-1-16-0)" appended in the last step causes the integrity level of the pipe object to be set to Untrusted when the pipe is created via CreateNamedPipe().
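
    To make the structure of that SDDL fragment easier to read, here is a small Python sketch that builds and parses a mandatory label entry. It only manipulates strings and is not part of the referenced sample. The helper names (mandatory_label_sacl, parse_mandatory_label) are hypothetical; the RIDs and the LW/ME/HI/SI aliases are the well-known integrity level values, and there is no two-letter alias for Untrusted, which is why the SID is spelled out as S-1-16-0.

```python
import re

# RIDs of the well-known mandatory integrity SIDs (S-1-16-<RID>).
INTEGRITY_RIDS = {
    "untrusted": 0,
    "low": 4096,
    "medium": 8192,
    "high": 12288,
    "system": 16384,
}

# SDDL two-letter aliases for integrity levels (none exists for Untrusted).
SDDL_ALIASES = {"LW": 4096, "ME": 8192, "HI": 12288, "SI": 16384}


def mandatory_label_sacl(level, policy="NW"):
    """Build the SACL part of an SDDL string carrying a mandatory label ACE."""
    rid = INTEGRITY_RIDS[level]
    return f"S:(ML;;{policy};;;S-1-16-{rid})"


def parse_mandatory_label(sddl):
    """Return (policy, rid) of the first mandatory label ACE, or None."""
    match = re.search(r"S:\(ML;;([A-Z]*);;;([^)]+)\)", sddl)
    if match is None:
        return None
    policy, sid = match.group(1), match.group(2)
    if sid in SDDL_ALIASES:
        return policy, SDDL_ALIASES[sid]
    if sid.startswith("S-1-16-"):
        return policy, int(sid.rsplit("-", 1)[1])
    return None


print(mandatory_label_sacl("untrusted"))
print(parse_mandatory_label("S:(ML;;NW;;;LW)"))
```

    Here "ML" marks a mandatory label ACE and "NW" stands for the no-write-up policy, matching the comment in the C code.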

     

    You can find more information on the integrity check at the following links:

     

    http://msdn.microsoft.com/en-us/library/bb625963.aspx Windows Integrity Mechanism Design

    http://msdn.microsoft.com/en-us/library/aa379588(VS.85).aspx SetSecurityInfo Function

    http://msdn.microsoft.com/en-us/library/aa363858(VS.85).aspx CreateFile Function

     

    Thanks,

    Murat

  • Why can't we access NLB Clusters from remote subnets?

    Hi there,

    In today's blog, I would like to talk about NLB cluster access problems that our customers experience frequently.

    When a Microsoft NLB cluster operates in multicast mode, in certain scenarios you may not be able to access the NLB cluster IP address from remote subnets whereas access from the same subnet keeps working fine. You can find more information about the two most common scenarios below:

     

    Problem 1:

    When an NLB cluster on Windows 2008/SP2 operates in multicast mode, remote subnets cannot access the NLB cluster IP address because of a problem in the NLB implementation on Windows 2008.

     

    Solution 1:

    - This problem stemmed from the NLB implementation.

    - Microsoft fixed it with hotfix KB960916.

    - KB960916 is already included in Windows 2008 SP2.

     

    Problem 2:

    When an NLB cluster on Windows 2003 or Windows 2008 operates in multicast mode, remote subnets cannot access the NLB cluster IP address. This second problem stems from the fact that some vendors' devices (Cisco, for example) don't accept a mapping of an L3 unicast IP address to an L2 multicast MAC address. That mapping is exactly what happens when the NLB cluster operates in multicast mode: the L3 unicast IP address is the NLB cluster IP address and the L2 MAC address is the multicast MAC address chosen by NLB. You therefore have to create a static mapping on the router to avoid the problem. You can find more information about this problem at the following link:

     

    http://www.cisco.com/en/US/products/hw/switches/ps708/products_configuration_example09186a0080a07203.shtml

     

    (Taken from the above link)

     

    Multicast Mode

     

    Another solution is to use multicast mode in MS NLB configuration GUI instead of Unicast mode. In Multicast Mode, the system admin clicks the IGMP Multicast button in the MS NLB configuration GUI. This choice instructs the cluster members to respond to ARPs for their virtual address using a multicast MAC address for example 0300.5e11.1111 and to send IGMP Membership Report packets. If IGMP snooping is enabled on the local switch, it snoops the IGMP packets that pass through it. In this way, when a client ARPs for the cluster’s virtual IP address, the cluster responds with multicast MAC for example 0300.5e11.1111. When the client sends the packet to 0300.5e11.1111, the local switch forwards the packet out each of the ports connected to the cluster members. In this case, there is no chance of flooding the ARP packet out of all the ports. The issue with the multicast mode is virtual IP address becomes unreachable when accessed from outside the local subnet because Cisco devices do not accept an arp reply for a unicast IP address that contains a multicast MAC address. So the MAC portion of the ARP entry shows as incomplete. (Issue the command show arp to view the output.) As there is no MAC portion in the arp reply, the ARP entry never appeared in the ARP table. It eventually quit ARPing and returned an ICMP Host unreachable to the clients. In order to override this, use static ARP entry to populate the ARP table as given below. In theory, this allows the Cisco device to populate its mac-address-table. For example, if the virtual ip address is 172.16.63.241 and multicast mac address is 0300.5e11.1111, use this command in order to populate the ARP table statically:

     

    Solution 2:

    In order to resolve that problem, you have two choices:

     

    a) Adding a static ARP entry on the router

    b) Changing NLB cluster mode to Unicast
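
    For option a) you need the cluster MAC address that goes into the static ARP entry. The Python sketch below derives it from the cluster IP address, assuming the commonly documented NLB convention (02-BF plus the IP octets for unicast, 03-BF plus the IP octets for multicast, 01-00-5E-7F plus the last two IP octets for IGMP multicast). This convention and the function names are not taken from the post, so always confirm the real value in NLB Manager before configuring the router.

```python
import ipaddress


def nlb_cluster_mac(cluster_ip, mode="multicast"):
    """Derive the NLB cluster MAC address from the cluster IP (assumed convention)."""
    octets = ipaddress.IPv4Address(cluster_ip).packed
    if mode == "unicast":
        return bytes([0x02, 0xBF]) + octets
    if mode == "multicast":
        return bytes([0x03, 0xBF]) + octets
    if mode == "igmp":
        return bytes([0x01, 0x00, 0x5E, 0x7F]) + octets[2:]
    raise ValueError(f"unknown NLB mode: {mode}")


def dotted_mac(mac):
    """Format a 6-byte MAC in the dotted notation used in the Cisco excerpt."""
    digits = mac.hex()
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))


# Cluster IP taken from the Cisco example above.
print(dotted_mac(nlb_cluster_mac("172.16.63.241", "multicast")))
```

    Note that the MAC address 0300.5e11.1111 in the quoted Cisco text is only an example value and does not follow this derivation.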

     

    Also please always keep in mind the following when troubleshooting NLB problems:

     

     1) Do I run the latest NLB driver available from Microsoft? We have released a few updates to the NLB driver on Windows 2003, Windows 2008 and Windows 2008 R2 to address a few problems.

     2) Do I run the latest NIC driver and teaming driver? We generally prefer not to run teaming on NLB clusters and may ask you to dissolve the team if needed, even though we don't have a strict "not supported" statement.

     3) Are the NLB rules correctly configured? The most common mistake is setting affinity to "None" for stateful protocols, which causes many NLB cluster access problems.

     4) Do I run the latest TCPIP driver? (preferably from the latest security update that updates the TCPIP driver)

     5) Do I run the latest versions of 3rd party filter drivers that run at the NDIS layer? (for example security drivers)

     6) If the NLB cluster runs on Windows 2008 R2 Hyper-V, have you enabled "Enable spoofing of MAC addresses" on the virtual network adapters?

     I'm going to talk about troubleshooting approaches in another blog post.

     Hope this helps

     Thanks,
    Murat

  • Why should a DC contact clients in the domain?

    Hi there,

     

    In today’s blog post, I’m going to show you how I found out why a domain controller was contacting random clients in the domain. The customer reported this issue because of security concerns: they suspected that a suspicious process might be running on the DC. In general we don’t expect domain controllers to contact the clients running in the domain, so our customer wanted to understand the reason behind that.

     

    We first verified that the DC was really contacting some clients by collecting a network trace on the DC. You can see one of those clients (client1) contacted by the DC (DC1):

    Note: DC and client IP addresses are replaced for data privacy.

     

    ...

    11415      14:21:12 05.07.2010              DC1  CLIENT1         TCP          TCP:Flags=......S., SrcPort=3912, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=70128947, Ack=0, Win=65535 (  ) = 65535     {TCP:515, IPv4:46}

    11443      14:21:12 05.07.2010              DC1  CLIENT1         TCP          TCP:Flags=......S., SrcPort=3913, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=3133793441, Ack=0, Win=65535 (  ) = 65535 {TCP:518, IPv4:46}

    30922      14:33:17 05.07.2010              DC1  CLIENT1         TCP          TCP:Flags=......S., SrcPort=4118, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=2414564040, Ack=0, Win=65535 (  ) = 65535 {TCP:1270, IPv4:46}

    30950      14:33:17 05.07.2010              DC1  CLIENT1         TCP          TCP:Flags=......S., SrcPort=4120, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=1797119693, Ack=0, Win=65535 (  ) = 65535 {TCP:1273, IPv4:46}

    51472      14:45:22 05.07.2010              DC1  CLIENT1         TCP          TCP:Flags=......S., SrcPort=4314, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=1834145861, Ack=0, Win=65535 (  ) = 65535 {TCP:1403, IPv4:46}

    51500      14:45:22 05.07.2010              DC1  CLIENT1         TCP          TCP:Flags=......S., SrcPort=4315, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=4278939251, Ack=0, Win=65535 (  ) = 65535 {TCP:1406, IPv4:46}

    67096      14:57:26 05.07.2010              DC1  CLIENT1         TCP          TCP:Flags=......S., SrcPort=4514, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=1707963693, Ack=0, Win=65535 (  ) = 65535 {TCP:1945, IPv4:46}

    67126      14:57:26 05.07.2010              DC1  CLIENT1         TCP          TCP:Flags=......S., SrcPort=4515, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=3807245641, Ack=0, Win=65535 (  ) = 65535 {TCP:1948, IPv4:46}

    74691      15:09:30 05.07.2010              DC1  CLIENT1         TCP          TCP:Flags=......S., SrcPort=4740, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=1036190517, Ack=0, Win=65535 (  ) = 65535 {TCP:1983, IPv4:46}

    74721      15:09:31 05.07.2010              DC1  CLIENT1         TCP          TCP:Flags=......S., SrcPort=4741, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=2281072822, Ack=0, Win=65535 (  ) = 65535 {TCP:1986, IPv4:46}

    84937      15:21:35 05.07.2010              DC1  CLIENT1         TCP          TCP:Flags=......S., SrcPort=4930, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=3190224054, Ack=0, Win=65535 (  ) = 65535 {TCP:2104, IPv4:46}

    84965      15:21:35 05.07.2010              DC1  CLIENT1         TCP          TCP:Flags=......S., SrcPort=4931, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=2774224583, Ack=0, Win=65535 (  ) = 65535 {TCP:2107, IPv4:46}

    ...

     

    At first glance, I noticed that the connection attempt was repeated every 12 minutes or so, which suggested something running periodically on the DC. Normally Network Monitor shows you the process that initiates those TCP sessions, but under heavy load it stops doing so in favor of performance because it’s a costly operation. There are other methods to find out which process sends a certain packet, but I decided to let the DC do whatever it would do against the client to see the whole activity.
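
    The 12-minute period can be checked quickly from the timestamps of the first SYN in each pair shown in the trace above. The following Python sketch is just a convenience calculation on those six timestamps; the function name is hypothetical.

```python
from datetime import datetime

# Time of the first SYN of each pair in the trace above (same day).
syn_times = ["14:21:12", "14:33:17", "14:45:22", "14:57:26", "15:09:30", "15:21:35"]


def intervals_seconds(times):
    """Return the gaps in seconds between consecutive HH:MM:SS timestamps."""
    parsed = [datetime.strptime(t, "%H:%M:%S") for t in times]
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(parsed, parsed[1:])]


gaps = intervals_seconds(syn_times)
print(gaps)  # [725, 725, 724, 724, 725]
print("average gap in minutes:", sum(gaps) / len(gaps) / 60)
```

    Each gap is within a second of 12 minutes and 5 seconds, which is what points to a timer-driven activity rather than user-driven traffic.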

     

    So the customer removed firewall filters and allowed the DC to connect to Client1. After doing so we collected a new network trace to see the latest situation. We got the expected results by examining the new network trace:

     

    a) The first interesting finding was that the client was sending a “Master Browser” announcement to the DC (DC1) shortly before one of these connection attempts from the DC side:

     

    47140      07:30:31 08.07.2010              CLIENT1             DC1 BROWSER              BROWSER:Master Announcement       {SMB:351, UDP:350, IPv4:3}

     

    b) After that browser announcement, the DC contacted the client at TCP port 139 to establish an SMB session:

     

    47595      07:30:33 08.07.2010              DC1 CLIENT1             TCP          TCP:Flags=......S., SrcPort=3787, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=2594372577, Ack=0, Win=65535 (  ) = 65535              {TCP:373, IPv4:3}

    47596      07:30:33 08.07.2010              CLIENT1  DC1            TCP          TCP:Flags=...A..S., SrcPort=NETBIOS Session Service(139), DstPort=3787, PayloadLen=0, Seq=2981880191, Ack=2594372578, Win=8192 ( Scale factor not supported ) = 8192   {TCP:373, IPv4:3}

    47597      07:30:33 08.07.2010              DC1 CLIENT1             TCP          TCP:Flags=...A...., SrcPort=3787, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=2594372578, Ack=2981880192, Win=65535 (scale factor 0x0) = 65535    {TCP:373, IPv4:3}

     

    c) Then it initiated a NetBT session to the client:

     

    47598      07:30:33 08.07.2010              DC1 CLIENT1             NbtSS      NbtSS:SESSION REQUEST, Length =68 {NbtSS:374, TCP:373, IPv4:3}

    47599      07:30:33 08.07.2010              CLIENT1             DC1 NbtSS      NbtSS:POSITIVE SESSION RESPONSE, Length =0                {NbtSS:374, TCP:373, IPv4:3}

     

    d) Then it established an SMB connection:

     

    47600      07:30:33 08.07.2010              DC1 CLIENT1             SMB        SMB:C; Negotiate, Dialect = PC NETWORK PROGRAM 1.0, LANMAN1.0, Windows for Workgroups 3.1a, LM1.2X002, LANMAN2.1, NT LM 0.12       {NbtSS:374, TCP:373, IPv4:3}

    47602      07:30:33 08.07.2010              CLIENT1             DC1 SMB        SMB:R; Negotiate, Dialect is NT LM 0.12 (#5), SpnegoToken (1.3.6.1.5.5.2)         {NbtSS:374, TCP:373, IPv4:3}

    47614      07:30:34 08.07.2010              DC1 CLIENT1             SMB        SMB:C; Session Setup Andx, NTLM NEGOTIATE MESSAGE                {NbtSS:374, TCP:373, IPv4:3}

    47615      07:30:34 08.07.2010              CLIENT1             DC1 SMB        SMB:R; Session Setup Andx, NTLM CHALLENGE MESSAGE - NT Status: System - Error, Code = (22) STATUS_MORE_PROCESSING_REQUIRED                {NbtSS:374, TCP:373, IPv4:3}

    47616      07:30:34 08.07.2010              DC1 CLIENT1             SMB        SMB:C; Session Setup Andx, NTLM AUTHENTICATE MESSAGE, Workstation: DC1           {NbtSS:374, TCP:373, IPv4:3}

    47621      07:30:34 08.07.2010              CLIENT1             DC1 SMB        SMB:R; Session Setup Andx  {NbtSS:374, TCP:373, IPv4:3}

     

    e) Then it connected to the interprocess communication share (IPC$):

     

    47625      07:30:34 08.07.2010              DC1 CLIENT1             SMB        SMB:C; Tree Connect Andx, Path = \\CLIENT1\IPC$, Service = ?????      {NbtSS:374, TCP:373, IPv4:3}

    47626      07:30:34 08.07.2010              CLIENT1             DC1 SMB        SMB:R; Tree Connect Andx, Service = IPC            {NbtSS:374, TCP:373, IPv4:3}

     

    f) Then it called RAP (Remote Administration Protocol) APIs like NetServerEnum2 etc:

     

    47630      07:30:34 08.07.2010              DC1 CLIENT1             RAP         RAP:NetServerEnum2 Request, InfoLevel = 1,  LocalList in                 {SMB:379, NbtSS:374, TCP:373, IPv4:3}

    47631      07:30:34 08.07.2010              CLIENT1             DC1 RAP         RAP:NetServerEnum2 Response, Count = 1         {SMB:379, NbtSS:374, TCP:373, IPv4:3}

     

    g) Once it got the requested info, it logged off and disconnected the TCP session:

     

    47642      07:30:34 08.07.2010              DC1 CLIENT1             SMB        SMB:C; Logoff Andx               {NbtSS:374, TCP:373, IPv4:3}

    47643      07:30:34 08.07.2010              CLIENT1             DC1 SMB        SMB:R; Logoff Andx               {NbtSS:374, TCP:373, IPv4:3}

    47650      07:30:34 08.07.2010              DC1 CLIENT1             SMB        SMB:C; Tree Disconnect        {NbtSS:374, TCP:373, IPv4:3}

    47651      07:30:34 08.07.2010              CLIENT1             DC1 SMB        SMB:R; Tree Disconnect        {NbtSS:374, TCP:373, IPv4:3}

    47657      07:30:34 08.07.2010              DC1 CLIENT1             TCP          TCP:Flags=...A...F, SrcPort=3787, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=2594373651, Ack=2981881320, Win=64407 (scale factor 0x0) = 64407    {TCP:373, IPv4:3}

    47658      07:30:34 08.07.2010              CLIENT1             DC1 TCP          TCP:Flags=...A...F, SrcPort=NETBIOS Session Service(139), DstPort=3787, PayloadLen=0, Seq=2981881320, Ack=2594373652, Win=15559 (scale factor 0x0) = 15559  {TCP:373, IPv4:3}

    47662      07:30:34 08.07.2010              DC1 CLIENT1             TCP          TCP:Flags=...A...., SrcPort=3787, DstPort=NETBIOS Session Service(139), PayloadLen=0, Seq=2594373652, Ack=2981881321, Win=64407 (scale factor 0x0) = 64407    {TCP:373, IPv4:3}

     

    h) Similar activity was seen every 12 minutes in the network trace.
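
    When a conversation like the one in steps a) to g) is long, it helps to reduce the text export to the sequence of protocols involved. The Python sketch below does that for a few lines copied from the trace above. It assumes the column layout seen in this export (frame number, time, date, source, destination, protocol, description); the function names are hypothetical and other Network Monitor column layouts would need a different split.

```python
SAMPLE = r"""
47140      07:30:31 08.07.2010   CLIENT1   DC1       BROWSER   BROWSER:Master Announcement       {SMB:351, UDP:350, IPv4:3}
47598      07:30:33 08.07.2010   DC1       CLIENT1   NbtSS     NbtSS:SESSION REQUEST, Length =68 {NbtSS:374, TCP:373, IPv4:3}
47625      07:30:34 08.07.2010   DC1       CLIENT1   SMB       SMB:C; Tree Connect Andx, Path = \\CLIENT1\IPC$, Service = ?????      {NbtSS:374, TCP:373, IPv4:3}
47630      07:30:34 08.07.2010   DC1       CLIENT1   RAP       RAP:NetServerEnum2 Request, InfoLevel = 1,  LocalList in                 {SMB:379, NbtSS:374, TCP:373, IPv4:3}
47642      07:30:34 08.07.2010   DC1       CLIENT1   SMB       SMB:C; Logoff Andx               {NbtSS:374, TCP:373, IPv4:3}
"""


def parse_netmon_line(line):
    """Split one exported frame line into its columns, or return None."""
    parts = line.split(None, 6)
    if len(parts) < 7 or not parts[0].isdigit():
        return None
    frame, time, date, src, dst, protocol, description = parts
    return {
        "frame": int(frame),
        "time": time,
        "date": date,
        "src": src,
        "dst": dst,
        "protocol": protocol,
        "description": description.strip(),
    }


def protocol_steps(lines):
    """Return the protocols in order, collapsing consecutive repeats."""
    steps = []
    for line in lines:
        record = parse_netmon_line(line)
        if record and (not steps or steps[-1] != record["protocol"]):
            steps.append(record["protocol"])
    return steps


print(protocol_steps(SAMPLE.splitlines()))
```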

     

    RESULTS:

    ========

    1) After analyzing the second network trace, the reason behind the DC’s connections to clients was clear:

    Every 12 minutes or so, the master browser in a network segment informs the domain master browser (which is the DC) that it’s a master browser. The DC then connects to that master browser in turn to retrieve its browse list. You can find more details below:

    Taken from http://technet.microsoft.com/en-us/library/cc737661(WS.10).aspx How Computer Browser Service Works:

    When a domain spans multiple subnets, the master browse servers for each subnet use a unicast Master Announcement message to announce themselves to the domain master browse server. This message notifies the domain master browse server that the sending computer is a master browse server in the same domain. When the domain master browse server receives a Master Browse Server Announcement message, it returns to the “announcing” master browse server a request for a list of the server’s in that master browse server’s subnet. When that list is received, the domain master browse server merges it with its own server list.

    This process, repeated every 12 minutes, guarantees that the domain master browse server has a complete browse list of all the servers in the domain. Thus, when a client sends a browse request to a backup browse server, the backup browse server can return a list of all the servers in the domain, regardless of the subnet on which those servers are located.

    ...

     

    Hope this helps

     

    Thanks,

    Murat

     

  • Exchange servers send ICMP and UDP packets to clients or Domain Controllers, why?

    Hi,

     

    I would like to talk about a few network trace analysis cases where we were asked to find out why certain packets (specifically ICMP and UDP) were sent by Exchange servers. You’ll find more details below about how we found the processes sending those packets:

     

    a) Exchange servers sending UDP packets with random source or destination ports to various clients

     

    In one scenario, our customer’s security team wanted to find out why the Exchange servers were sending UDP packets to random clients on the network. There was no deterministic pattern in the source or destination UDP ports. The only consistency was that each UDP packet sent by the Exchange servers always had an 8-byte payload. You can see a sample network trace output below:

     

    Note: Addresses were replaced for privacy purposes even though private IP address space was in use.

     

    No.     Time                       Source                Destination           Protocol Info

    ...

    105528 2010-01-14 15:20:14.454856 10.1.1.1     172.1.10.14        UDP      Source port: 35996  Destination port: mxomss

    105530 2010-01-14 15:20:14.454856 10.1.1.1     172.18.10.27        UDP      Source port: 35997  Destination port: edtools

    105531 2010-01-14 15:20:14.454856 10.1.1.1     172.17.17.95         UDP      Source port: 35998  Destination port: fiveacross

    105535 2010-01-14 15:20:14.454856 10.1.1.1     172.17.11.51         UDP      Source port: 36000  Destination port: kwdb-commn

    105540 2010-01-14 15:20:14.454856 10.1.1.1     172.23.98.97          UDP      Source port: 36003  Destination port: dicom-tls

    105541 2010-01-14 15:20:14.454856 10.1.1.1     172.24.12.8         UDP      Source port: 36004  Destination port: dkmessenger

    105542 2010-01-14 15:20:14.454856 10.1.1.1     172.28.2.52         UDP      Source port: 36005  Destination port: tragic

    105545 2010-01-14 15:20:14.454856 10.1.1.1     172.31.5.14        UDP      Source port: 36006  Destination port: xds

    105546 2010-01-14 15:20:14.454856 10.1.1.1     172.2.10.63         UDP      Source port: 36007  Destination port: 4642

    105547 2010-01-14 15:20:14.454856 10.1.1.1     172.2.35.68          UDP      Source port: 36008  Destination port: foliocorp

    105552 2010-01-14 15:20:14.454856 10.1.1.1     172.18.12.55         UDP      Source port: 36010  Destination port: saphostctrl

    105553 2010-01-14 15:20:14.454856 10.1.1.1     172.48.199.45         UDP      Source port: 36011  Destination port: slinkysearch

    105554 2010-01-14 15:20:14.454856 10.1.1.1     172.27.133.42         UDP      Source port: 36012  Destination port: oracle-oms

    105555 2010-01-14 15:20:14.454856 10.1.1.1     172.27.121.40         UDP      Source port: 36013  Destination port: proxy-gateway

    105558 2010-01-14 15:20:14.454856 10.1.1.1     172.24.7.11         UDP      Source port: 36016  Destination port: fcmsys

    ...

     

    - The source UDP port keeps increasing and the destination UDP port seems random at first sight

    - The data part of the UDP datagrams is always 8 bytes. As an example:

     

    Frame 105540 (50 bytes on wire, 50 bytes captured)

    Ethernet II, Src: HewlettP_11:11:11 (00:1c:c4:11:11:11), Dst: All-HSRP-routers_15 (00:00:0c:07:ac:15)

    Internet Protocol, Src: 10.1.1.1 (10.1.1.1), Dst: 172.23.98.97 (172.23.98.97)

    User Datagram Protocol, Src Port: 36003 (36003), Dst Port: dicom-tls (2762)

    Data (8 bytes)

     

     

    => To better understand which process might be sending that packet, we decided to collect a kernel TCPIP trace on the source Windows 2003 server. For more information about the methods that can be used to identify the process sending a certain packet, please see my previous post on this.

     

    After collecting a network trace and an accompanying kernel TCPIP trace as described in the other post (option 4), we managed to catch the UDP packet that we see in the above network trace (actually the above network trace and the below kernel TCPIP trace were collected together). As an example:

     

    ...

           UdpIp,       Send, 0xFFFFFFFF,   129079416141424158,          0,          0,     2136,        8, 172.023.098.097, 010.001.001.001, 2762, 36003, 0, 0

           UdpIp,       Send, 0xFFFFFFFF,   129079416141424158,          0,          0,     2136,        8, 172.027.153.050, 172.023.021.024, 6004, 36009, 0, 0

           UdpIp,       Send, 0xFFFFFFFF,   129079416141424158,          0,          0,     2136,        8, 172.028.097.111, 172.023.021.024, 2344, 36016, 0, 0

           UdpIp,       Send, 0xFFFFFFFF,   129079416141424158,          0,          0,     2136,        8, 172.027.102.056, 172.023.021.024, 1116, 36022, 0, 0

    ...

     

    - For example, in the first line above, 10.1.1.1 (the Exchange server) is sending a UDP packet to 172.23.98.97. The packet length is 8 bytes, the source UDP port is 36003 and the destination UDP port is 2762. The ID of the process sending the UDP packet is 2136. In fact, the process ID is 2136 in all such UDP packets.

    - That first line taken from the kernel trace corresponds to packet #105540 seen in the network trace
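
    Reading the kernel trace lines by eye gets tedious on a busy server, so a few lines of Python can pull out the process ID and the addresses. The field positions in this sketch are an assumption inferred from how the sample lines above were read (process ID in the 7th field, size in the 8th, then destination address, source address, destination port, source port); the function name is hypothetical.

```python
def parse_udpip_event(line):
    """Parse one comma-separated UdpIp line from the kernel trace text output."""
    fields = [field.strip() for field in line.split(",")]
    if len(fields) < 12 or fields[0] != "UdpIp":
        return None

    def plain_ip(text):
        # "172.023.098.097" -> "172.23.98.97" (drop the zero padding)
        return ".".join(str(int(octet)) for octet in text.split("."))

    return {
        "direction": fields[1],
        "pid": int(fields[6]),
        "size": int(fields[7]),
        "dst_ip": plain_ip(fields[8]),
        "src_ip": plain_ip(fields[9]),
        "dst_port": int(fields[10]),
        "src_port": int(fields[11]),
    }


line = ("       UdpIp,       Send, 0xFFFFFFFF,   129079416141424158,"
        "          0,          0,     2136,        8, 172.023.098.097,"
        " 010.001.001.001, 2762, 36003, 0, 0")
event = parse_udpip_event(line)
print(event["pid"], event["src_ip"], event["src_port"],
      "->", event["dst_ip"], event["dst_port"], event["size"])
```

    Grouping the parsed events by "pid" is then enough to confirm that a single process is responsible for all of the 8-byte datagrams.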

     

    => After checking the “tasklist /SVC” output, we saw that process ID 2136 was store.exe (which is Exchange Information store process):

     

    ...

    wmiprvse.exe                  5176 Console                    0      2,168 K

    mad.exe                       7176 Console                    0     45,792 K

    AntigenStore.exe             10092 Console                    0        200 K

    store.exe                     2136 Console                    0  1,040,592 K

    emsmta.exe                   12020 Console                    0     29,092 K

    ...

     

     

    => After further investigation on the Exchange side with the help of an Exchange expert, we found out that this traffic was expected and was used as an e-mail notification mechanism:

     

    http://support.microsoft.com/default.aspx?scid=kb;EN-US;811061  XCCC: Exchange Clients Do Not Receive "New Mail" Notification Messages

     

    The Information Store process (Store.exe) sends a User Datagram Protocol (UDP) packet for new mail notifications. However, because the Store process does not run on an Exchange virtual server but on the cluster node, the UDP packet is sent from the IP address of that node. If you fail over the cluster node, the data and Exchange 2000 Server virtual server configuration are moved to the Store process that is running on the other cluster server node. New mail notifications are sent from the IP address of that second cluster node.

     

    ///////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

     

    b) Exchange servers sending 1 byte pings to DCs

     

    One of our customers reported that their DCs were getting constant ICMP Echo requests from a number of member servers and they wanted to get help in finding the process behind it because of security concerns. After some analysis and testing with the help of an Exchange expert colleague of mine, we found out that those ICMP echo requests were sent by the Exchange server related services. The ICMP echo request has the following characteristics:

     

    => Its payload is always 1 byte

    => The payload itself is “3F”
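
    These two characteristics are easy to test for when filtering captured ICMP messages. The Python sketch below builds an ICMP echo request in memory and checks for the 1-byte 0x3F payload. It works on raw ICMP bytes only (no capture library), the function names are hypothetical, and the 32-byte payload in the second sample is just an arbitrary counterexample.

```python
import struct

ICMP_ECHO_REQUEST = 8


def inet_checksum(data):
    """Standard Internet checksum (one's complement sum of 16-bit words)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(ident, seq, payload):
    """Build an ICMP echo request (header plus payload) for testing."""
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, ident, seq)
    checksum = inet_checksum(header + payload)
    return struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, ident, seq) + payload


def is_one_byte_ping(icmp_message, expected_payload=b"\x3f"):
    """True for an echo request whose payload is exactly the expected byte."""
    if len(icmp_message) < 8:
        return False
    icmp_type, code, _checksum, _ident, _seq = struct.unpack("!BBHHH", icmp_message[:8])
    return (icmp_type == ICMP_ECHO_REQUEST and code == 0
            and icmp_message[8:] == expected_payload)


print(is_one_byte_ping(build_echo_request(1, 1, b"\x3f")))
print(is_one_byte_ping(build_echo_request(1, 2, b"a" * 32)))
```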

     

    Those ICMP echo requests cease once the Exchange server related services are stopped, which is another indication. This behavior is partly explained in the following article:

     

    http://support.microsoft.com/kb/270836  Exchange Server static port mappings

     

    Taken from the article:

     

    Note In a perimeter network firewall scenario, there is no Internet Control Message Protocol (ICMP) connectivity between the Exchange server and the domain controllers. By default, Directory Access (DSAccess) uses ICMP to ping each server to which it connects to determine whether the server is available. When there is no ICMP connectivity, Directory Access responds as if every domain controller were unavailable

    => You can also see a sample network trace output collected on an Exchange server: 

     

    Hope this helps

     

    Thanks,

    Murat 

  • SCCM client push installation may fail due to firewall problems

    I was collaborating with a colleague of mine on a problem where SCCM client push installation was failing. They suspected network connectivity problems, collected simultaneous network traces from the SCCM server and from a problem client machine, and involved me for further analysis.

     

    When I checked the SCCM server and client side traces, I saw that the SCCM server was successfully reaching the client on TCP port 135

     

    => SCCM server side trace:

     

    - TCP three way handshake between SCCM server and client:

     

    5851            14:42:47 05.09.2012                       34.0337296                                            10.0.9.149                       CLIENTNAME.company.com    TCP                TCP: [Bad CheckSum]Flags=......S., SrcPort=51763, DstPort=DCE endpoint resolution(135), PayloadLen=0, Seq=2250995253, Ack=0, Win=8192 ( Negotiating scale factor 0x8 ) = 8192                   {TCP:861, IPv4:843}

    5852            14:42:47 05.09.2012                       34.0364843                                            CLIENTNAME.company.com    10.0.9.149                       TCP                TCP:Flags=...A..S., SrcPort=DCE endpoint resolution(135), DstPort=51763, PayloadLen=0, Seq=1315818582, Ack=2250995254, Win=65535 ( Negotiated scale factor 0x0 ) = 65535                        {TCP:861, IPv4:843}

    5853            14:42:47 05.09.2012                       34.0365076                                            10.0.9.149                       CLIENTNAME.company.com    TCP                TCP: [Bad CheckSum]Flags=...A...., SrcPort=51763, DstPort=DCE endpoint resolution(135), PayloadLen=0, Seq=2250995254, Ack=1315818583, Win=258 (scale factor 0x8) = 66048     {TCP:861, IPv4:843}

     

    - SCCM server binds to SCMActivator and activates WMI component:

     

    5877            14:42:47 05.09.2012                       34.0610846                                            10.0.9.149                       CLIENTNAME.company.com    MSRPC        MSRPC:c/o Bind: IRemoteSCMActivator(DCOM) UUID{000001A0-0000-0000-C000-000000000046}  Call=0x3  Assoc Grp=0xBB15  Xmit=0x16D0  Recv=0x16D0          {MSRPC:865, TCP:861, IPv4:843}

    5880            14:42:47 05.09.2012                       34.0642128                                            CLIENTNAME.company.com    10.0.9.149                       TCP                TCP:Flags=...A...., SrcPort=DCE endpoint resolution(135), DstPort=51763, PayloadLen=0, Seq=1315818583, Ack=2250996747, Win=65535 (scale factor 0x0) = 65535                        {TCP:861, IPv4:843}

    5882            14:42:47 05.09.2012                       34.0748352                                            CLIENTNAME.company.com    10.0.9.149                       MSRPC        MSRPC:c/o Bind Ack:  Call=0x3  Assoc Grp=0xBB15  Xmit=0x16D0  Recv=0x16D0        {MSRPC:865, TCP:861, IPv4:843}

    5883            14:42:47 05.09.2012                       34.0750212                                            10.0.9.149                       CLIENTNAME.company.com    MSRPC        MSRPC:c/o Alter Cont: IRemoteSCMActivator(DCOM)  UUID{000001A0-0000-0000-C000-000000000046}  Call=0x3     {MSRPC:865, TCP:861, IPv4:843}

    5884            14:42:47 05.09.2012                       34.0785470                                            CLIENTNAME.company.com    10.0.9.149                       MSRPC        MSRPC:c/o Alter Cont Resp:  Call=0x3  Assoc Grp=0xBB15  Xmit=0x16D0  Recv=0x16D0                {MSRPC:865, TCP:861, IPv4:843}

    5885            14:42:47 05.09.2012                       34.0786863                                            10.0.9.149                       CLIENTNAME.company.com    DCOM                        DCOM:RemoteCreateInstance Request, DCOM Version=5.7  Causality Id={FEEE1975-B61E-42EB-B500-939EA5EE4B2A}                  {MSRPC:865, TCP:861, IPv4:843}

      Frame: Number = 5885, Captured Frame Length = 923, MediaType = ETHERNET

    + Ethernet: Etype = Internet IP (IPv4),DestinationAddress:[00-22-90-E3-B7-80],SourceAddress:[00-22-64-08-91-A6]

    + Ipv4: Src = 10.0.9.149, Dest = 10.102.0.230, Next Protocol = TCP, Packet ID = 639, Total IP Length = 909

    + Tcp:  [Bad CheckSum]Flags=...AP..., SrcPort=51763, DstPort=DCE endpoint resolution(135), PayloadLen=869, Seq=2250996924 - 2250997793, Ack=1315818870, Win=257 (scale factor 0x8) = 65792

    + Msrpc: c/o Request: IRemoteSCMActivator(DCOM) {000001A0-0000-0000-C000-000000000046}  Call=0x3  Opnum=0x4  Context=0x1  Hint=0x318

    - DCOM: RemoteCreateInstance Request, DCOM Version=5.7  Causality Id={FEEE1975-B61E-42EB-B500-939EA5EE4B2A}

      + HeaderReq: DCOM Version=5.7  Causality Id={FEEE1975-B61E-42EB-B500-939EA5EE4B2A}

      + AggregationInterface: NULL

      - ActivationProperties: OBJREFCUSTOM - {000001A2-0000-0000-C000-000000000046}

       + MInterfacePointerPtr: Pointer To 0x00020000

       - Interface: OBJREFCUSTOM - {000001A2-0000-0000-C000-000000000046}

        + Size: 744 Elements

          InterfaceSize: 744 (0x2E8)

        - Interface: OBJREFCUSTOM - {000001A2-0000-0000-C000-000000000046}

           Signature: 1464812877 (0x574F454D)

           Flags: OBJREFCUSTOM - Represents a custom marshaled object reference

           MarshaledInterfaceIID: {000001A2-0000-0000-C000-000000000046}

         - Custom:

            ClassId: {00000338-0000-0000-C000-000000000046}

            ExtensionSize: 0 (0x0)

            ObjectReferenceSize: 704 (0x2C0)

          - ActivationProperties:

             TotalSize: 688 (0x2B0)

             Reserved: 0 (0x0)

           + CustomHeader:

           - Properties: 6 Property Structures

            + Special:

            - Instantiation:

             + Header:

               InstantiatedObjectClsId: {8BC3F05E-D86B-11D0-A075-00C04FB68820} => This is WMI

               ClassContext: 20 (0x14)

               ActivationFlags: 2 (0x2)

               FlagsSurrogate: 0 (0x0)

     

    - Server responds with success and provides the endpoint information for WMI service:

     

    5886            14:42:47 05.09.2012                       34.0848992                                            CLIENTNAME.company.com    10.0.9.149                       DCOM                        DCOM:RemoteCreateInstance Response, ORPCFLOCAL - Local call to this computer                        {MSRPC:865, TCP:861, IPv4:843}

            - ScmReply:

             + Header:

             + Ptr: Pointer To NULL

             + RemoteReplyPtr: Pointer To 0x00106E98

             - RemoteReply:

                ObjectExporterId: 13300677357152346811 (0xB8957F961925A2BB)

              + OxidBindingsPtr: Pointer To 0x00102FF0

                IRemUnknownInterfacePointerId: {0000B400-0580-0000-9A5E-C2357038B9DF}

                AuthenticationHint: 4 (0x4)

              + Version: DCOM Version=5.7

              - OxidBindings:

               + Size: 378 Elements

               - Bindings:

                  WNumEntries: 378 (0x17A)

                  WSecurityOffsets: 263 (0x107)

                - StringBindings:

                   TowerId: 15 (0xF)

                   NetworkAddress: \\\\CLIENTNAME[\\PIPE\\atsvc]

                - StringBindings:

                   TowerId: 15 (0xF)

                   NetworkAddress: \\\\CLIENTNAME[\\PIPE\\wkssvc]

                - StringBindings:

                   TowerId: 15 (0xF)

                   NetworkAddress: \\\\CLIENTNAME[\\pipe\\keysvc]

                - StringBindings:

                   TowerId: 15 (0xF)

                   NetworkAddress: \\\\CLIENTNAME[\\PIPE\\srvsvc]

                - StringBindings:

                   TowerId: 15 (0xF)

                   NetworkAddress: \\\\CLIENTNAME[\\pipe\\trkwks]

                - StringBindings:

                   TowerId: 15 (0xF)

                   NetworkAddress: \\\\CLIENTNAME[\\PIPE\\W32TIME]

                - StringBindings:

                   TowerId: 15 (0xF)

                   NetworkAddress: \\\\CLIENTNAME[\\PIPE\\ROUTER]

                - StringBindings:

                   TowerId: 7 (0x7)

                   NetworkAddress: CLIENTNAME[1431]

                - StringBindings:

                   TowerId: 7 (0x7)

                   NetworkAddress: 10.102.0.230[1431]

                  Terminator1: 0 (0x0)

                + SecurityBindings:

                + SecurityBindings:

                + SecurityBindings:

                + SecurityBindings:

                + SecurityBindings:

                  Terminator2: 0 (0x0)

     

    - Since WMI listens on TCP 1431, SCCM server tries to connect to that endpoint to access WMI subsystem:

     

    ...

    8980            14:43:08 05.09.2012                       55.1014127                                            10.0.9.149                       CLIENTNAME.company.com    TCP                TCP: [Bad CheckSum]Flags=......S., SrcPort=51785, DstPort=1431, PayloadLen=0, Seq=1764982397, Ack=0, Win=8192 ( Negotiating scale factor 0x8 ) = 8192                        {TCP:1203, IPv4:843}

    9390            14:43:11 05.09.2012                       58.1101896                                            10.0.9.149                       CLIENTNAME.company.com    TCP                TCP:[SynReTransmit #8980] [Bad CheckSum]Flags=......S., SrcPort=51785, DstPort=1431, PayloadLen=0, Seq=1764982397, Ack=0, Win=8192 ( Negotiating scale factor 0x8 ) = 8192                        {TCP:1203, IPv4:843}

    11236         14:43:17 05.09.2012                       64.1163158                                            10.0.9.149                       CLIENTNAME.company.com    TCP                TCP:[SynReTransmit #8980] [Bad CheckSum]Flags=......S., SrcPort=51785, DstPort=1431, PayloadLen=0, Seq=1764982397, Ack=0, Win=8192 ( Negotiating scale factor 0x8 ) = 8192                        {TCP:1203, IPv4:843}

     

    - But this TCP session request fails because SCCM server doesn’t get a response to TCP SYN requests.

    - When we check the client side network trace, we cannot see any of those TCP SYNs sent by the SCCM server.

     

    This is most often a hardware router/firewall filtering problem, and that was the case here. After our customer made the necessary configuration changes on the firewall, SCCM client push installation started working properly.

     

    Since WMI is assigned a random TCP port from the dynamic RPC port range at every startup, network/firewall administrators need to allow that range in addition to allowing TCP 135 traffic towards the clients. An alternative in this instance is to fix the TCP port that the WMI subsystem obtains at each startup. See the article below for more information:

     

    http://support.microsoft.com/kb/897571  FIX: A DCOM static TCP endpoint is ignored when you configure the endpoint for WMI on a Windows Server 2003-based computer
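
Before and after a firewall change, it can help to confirm from the SCCM server side which TCP ports on the client actually answer. The following is only a minimal sketch: the helper names `can_connect` and `check_ports` are made up for illustration, and the port list you pass in is an assumption (135 for the endpoint mapper, plus whatever dynamic port WMI currently uses, which was 1431 in the trace above).

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP three-way handshake to host:port completes."""
    try:
        # create_connection raises OSError (including timeouts and
        # name resolution failures) when no connection can be made.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def check_ports(host, ports, timeout=3.0):
    """Map each port to True (handshake completed) or False (no connection)."""
    return {port: can_connect(host, port, timeout) for port in ports}
```

For example, `check_ports("CLIENTNAME.company.com", [135, 1431])` would report whether each of the two ports seen in the trace accepts a connection. A `False` result does not tell you whether the SYN was filtered or refused; a network trace is still needed for that.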

     

    Hope this helps

     

    Thanks,

    Murat

     

     

  • Network traffic capturing hints

    In this post, I would like to talk about some important points about network capturing. If a network trace is not collected appropriately, it won’t provide any useful information and it will be a waste of time analyzing such a network trace.

     

    Additionally, just collecting the network trace isn’t sufficient if you intend to ask for help analyzing it; you also have to provide some information about the trace itself. I often collaborate with colleagues on network trace analysis, and I have a standard set of questions that I ask when a colleague approaches me for assistance:

     

    - What is the exact problem definition

    - Which network traces were collected on which system

    - The IP addresses of the relevant systems (like  client/server/DC/DNS)

    - OS versions for relevant systems

    - Network topology between the source and target systems on which network traces were collected

    - The exact date & time of the problem & error seen

    - The exact error message seen

    - What were the exact actions taken when collecting the network traces (in as much detail as possible)

     

     

    Now let’s talk about some important points that we need to be aware of to be able to collect a useable network trace that will really help you troubleshoot a given problem.

     

    1) First of all, we need to make sure that it really makes sense to collect a network trace for the problem at hand. You can check the previous blog post to get a better idea on this:

    http://blogs.technet.com/b/nettracer/archive/2012/06/22/when-do-we-need-collect-network-traces.aspx

     

    2) Especially in switched networks, when we collect a network trace from a given node (a client or server), only the following traffic will be seen by the capturing agent (like Network monitor/Wireshark/...) running on the node:

     

    - Packets sent out by the node itself

    - Packets sent to the node’s unicast address

    - Packets sent to unknown unicast addresses (the switch doesn’t have that MAC address in its MAC address table yet, so it floods the frame to all ports)

    - Packets sent to broadcast address

    - Packets sent to multicast addresses

     

    So we won’t be able to see the packets sent to/received from client2 in a network trace collected on client1. If you really have to see the packets sent to/received from a node other than the node on which the network trace is collected, you have to configure port mirroring (and your LAN switch has to support it). Most of the LAN switches used in enterprise networks support port mirroring. The link below shows how to make such a configuration on Cisco LAN switches:

     

    http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note09186a008015c612.shtml

    Switched Port Analyzer (SPAN) Configuration Example

     

    3) If you troubleshoot a communication or performance problem between two processes running on the same node, that traffic won’t leave the machine and hence won’t be captured by the capturing agent (Network Monitor/Wireshark). The traffic will be looped back by the TCP/IP stack. As an example, you won’t be able to see the network activity taking place between Internet Explorer and a Web server running on the same machine. If you need to troubleshoot such a scenario, you might try to collect an ETL trace, but the node will have to be running Windows 7 or Windows 2008 R2 for that. Please see the following post for more details on collecting such an ETL trace:

     

    http://blogs.technet.com/b/nettracer/archive/2010/10/06/how-it-works-under-the-hood-a-closer-look-at-tcpip-and-winsock-etl-tracing-on-windows-7-and-windows-2008-r2-with-an-example.aspx

     

    4) When collecting a network trace from a busy server, a capture filter might be applied to minimize the amount of traffic captured. We generally prefer not to capture network traffic with a capture filter, because when such a filter is applied we take the risk of excluding some traffic that might be relevant to the issue. If you’re really sure about what you have to check, then you may want to apply such a filter. You can find below an example of capturing with a filter using nmcap (the command line version of Network Monitor):

    Note: The following is taken from nmcap /examples output:

     

    This example starts capturing network frames that DO NOT contain ARPs, ICMP, NBtNs and BROWSER frames.  If you want to stop capturing, Press Control+C.

     

    nmcap /network * /capture  (!ARP AND !ICMP AND !NBTNS AND !BROWSER) /File NoNoise.cap

     

    5) If you really need to capture network traffic from a very busy server and you don’t want to take the risk of excluding some network traffic that might be relevant, you might want to capture, let’s say, only the first 256 bytes of each packet. Considering that a full-sized standard Ethernet frame is about 1500 bytes, this will save you roughly 80% on such frames. You can find below an example for nmcap where only the first 256 bytes of each packet are captured:

     

    nmcap /network * /MaxFrameLength 256
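
The saving figure above can be checked with a quick calculation. This is only a sketch under the assumption that every frame is a full 1500 bytes; real traffic contains many small frames (pure ACKs, for example), so the actual saving will be lower. The function name `truncation_saving` is made up for illustration.

```python
def truncation_saving(full_len, snap_len):
    """Fraction of bytes saved when only the first snap_len bytes of a frame are kept."""
    kept = min(full_len, snap_len)  # frames shorter than snap_len are kept whole
    return 1.0 - kept / full_len

# Full-sized frame truncated to 256 bytes
print(f"{truncation_saving(1500, 256):.0%}")  # 83%
# A 64-byte frame is not truncated at all, so nothing is saved
print(f"{truncation_saving(64, 256):.0%}")    # 0%
```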

     

    6) If network traces will be collected for an extended period, capturing all packets inside the same file will make it nearly impossible to analyze it (example: 5 GB network trace). To be able to collect manageable and analyzable network traces, it’s suggested to collect chained and fragmented network traces. You can find below an example for nmcap again:

     

    nmcap /network * /capture /file ServerTest.chn:200M

     

    Note: nmcap will create a new capture file once the first one is full (200 MB), and so on. So please make sure that you have enough free disk space on the related drive.

    Note: The traces created will be named as ServerTest.cap, ServerTest(1).cap, ServerTest(2).cap,...

     

    7) If you have to collect network traces for an unspecified period of time and you would like to see some activity taking place some time before the problem, you may have to collect network traces in a circular fashion which is possible with dumpcap (command line version of Wireshark for trace collection). You can see an example below:

     

    dumpcap -i 2 -w c:\traces\servername.pcap -b filesize:204800 -b files:80

     

    Note 1: Interface id "2" will be monitored and each capture file will be 204800 KB (200 MB).

    Note 2: The command assumes that the c:\traces folder already exists. Also please make sure that there's enough free space on that drive (C: in this instance). About 16 GB of free space will be required to create and save 80 x 200 MB traces.

    Note 3: Eighty different files will be created with the "servername_0000n_Date&time.pcap" syntax.

    Example:

    servername_00001_20120622134811.pcap

    servername_00002_20120622135617.pcap

    servername_00003_20120622141512.pcap

    .

    .

    .

     

    Note 4: When all eighty files are created and full, dumpcap will start overwriting them, starting from the oldest trace file.

    Note 5: The trace can be stopped at any time by pressing Ctrl+C.
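
The disk space requirement for a circular capture is simply the file size multiplied by the number of files. A small sketch of that arithmetic follows, assuming (as the notes above do) that 1 MB = 1024 KB; the function name `ring_buffer_size_kb` is made up for illustration.

```python
def ring_buffer_size_kb(filesize_kb, files):
    """Maximum disk space, in KB, used by a circular capture of `files` files."""
    return filesize_kb * files

total_kb = ring_buffer_size_kb(204800, 80)   # values from the dumpcap example
print(total_kb / 1024, "MB")                 # 16000.0 MB
print(total_kb / 1024 / 1024, "GB")          # 15.625 GB
```

If you change either `-b filesize` or `-b files`, rerun the calculation before starting the capture so the drive does not fill up.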

     

     

    8) It’s important to mark network traces with pings to be able to narrow down the time period that you need to focus on in the trace. For example, you can ping the default gateway of the client just before and right after reproducing the problem.

     

    Example1:

     

    <<Start network trace on the client>>

    ping -l 22 -n 2 IP-address-of-default-gateway

    <<Reproduce the problem now. Example: Try to connect to www.microsoft.com from IE and once you get the “page not found” run the second ping>>

    ping -l 33 -n 2 IP-address-of-default-gateway

    <<Stop network trace on the client>>

     

    Example2:

     

    <<Start network trace on the client>>

    ping -l 22 -n 5 IP-address-of-the-file-server

    start > run > \\server\share

    <<assuming that it takes 5+ seconds to open up the share content. Once the share content is listed, please run the below command>>

    ping -l 33 -n 5 IP-address-of-the-file-server

    <<Please write down the following information : the exact date&time of this test / how long it took to display the share content / exact \\server\share that you accessed >>

    dir \\server\share

    <<Please write down the following information : how long it took to display the share content when you used "dir" command>>

    ping -l 44 -n 5 IP-address-of-the-file-server

    <<Stop network trace on the client>>

     

     

    When you start analyzing a network trace collected in that fashion, you can easily focus on a certain range of packets in the trace. Example:

     

    Packet1

    Packet2

    Packet3

    Packet4

    Packet5

    <<22 bytes ICMP echo request>>

    Packet6

    Packet7

    Packet8

    Packet9

    Packet10

    <<33 bytes ICMP echo request>>

    Packet11

    Packet12

    ...

     

     

    Since we know that the issue was reproduced between the 22-byte and 33-byte ping markers, we can focus only on the activity taking place between packet #6 and packet #10. If this were a 50000-packet trace, you would now have isolated the problem down to 5 packets (you may not always be that lucky).

    You might be wondering "how can I identify those 22-byte and 33-byte ICMP packets in the network trace". Here's a trick that I generally use. I apply the following Wireshark filters to the network trace (ip.len is the ping payload size plus 20 bytes of IPv4 header and 8 bytes of ICMP header):

    ip.len==50 and icmp (to identify the 22 bytes ping)

    ip.len==61 and icmp (to identify the 33 bytes ping)
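
The same idea can be sketched in code. The helpers below are hypothetical and only illustrate the arithmetic behind the filters and how the markers bound the range of interest. They assume an IPv4 header without options (20 bytes) and frames exported as simple `(frame_number, ip_len, is_icmp)` tuples in capture order.

```python
IPV4_HEADER_LEN = 20  # assumption: no IP options present
ICMP_HEADER_LEN = 8   # ICMP echo request/reply header

def marker_ip_len(ping_payload_len):
    """IP total length of a ping sent with 'ping -l <ping_payload_len>'."""
    return IPV4_HEADER_LEN + ICMP_HEADER_LEN + ping_payload_len

def marker_filter(ping_payload_len):
    """Wireshark display filter that matches the marker pings."""
    return f"ip.len=={marker_ip_len(ping_payload_len)} and icmp"

def between_markers(frames, start_payload, end_payload):
    """Frame numbers after the last start marker and before the next end marker.

    Raises ValueError if either marker is missing from the frame list.
    """
    start_len = marker_ip_len(start_payload)
    end_len = marker_ip_len(end_payload)
    start_idx = max(i for i, (_, length, icmp) in enumerate(frames)
                    if icmp and length == start_len)
    end_idx = min(i for i, (_, length, icmp) in enumerate(frames)
                  if icmp and length == end_len and i > start_idx)
    return [number for number, _, _ in frames[start_idx + 1:end_idx]]

print(marker_filter(22))  # ip.len==50 and icmp
print(marker_filter(33))  # ip.len==61 and icmp
```

Echo replies carry the same payload as the requests, so each ping shows up as several marker frames; taking the last start marker and the first end marker after it keeps only the frames captured while the problem was being reproduced.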

     

    9) One of the most important points to take into consideration is collecting simultaneous network traces where possible. By “simultaneous network traces” I mean “collecting a network trace on the source and on the target systems at the same time”. That may not always be possible, especially if one of those systems is not controlled by you (for example, you’re troubleshooting a connectivity problem to a web site that belongs to another company).

     

    Other than that, I cannot stress enough how important it is to collect simultaneous network traces. When troubleshooting network connectivity issues, without simultaneous network traces you cannot conclude whether the target server received the packet, whether it sent a response back to the source, or whether the source received the response. Similarly, in network performance issues, you cannot conclude whether the response delay stems from the network path in between or from the target/source systems. Let me try to explain what I mean with a couple of examples:

     

    Example1:

    We look at a client side network trace and see that the client sends 3 x TCP SYN segments to the target without getting a response:

     

    No.     Time                       Delta       Source                Destination           Protocol Info

    ...

    141154 2011-03-31 16:52:29.488847 0.000000    192.168.4.71          10.1.1.1         TCP      37389 > 443 [SYN] Seq=0 Win=65535 Len=0

    141158 2011-03-31 16:52:29.488847 0.000000    192.168.4.71          10.1.1.1         TCP      37389 > 443 [SYN] Seq=0 Win=65535 Len=0

    144808 2011-03-31 16:52:29.801347 0.312500    192.168.4.71          10.1.1.1         TCP      37389 > 80 [SYN] Seq=0 Win=65535 Len=0

     

    By looking at the client side trace, can you answer the following?

     

    => Did the target server really receive the above 3 TCP SYN segments?

    => Did the target server send a response back to the above TCP SYN segment?

    => Did the target server really send the response and we didn’t see it at the client side?

     

    You cannot answer any of them. From the client side trace alone, you cannot say whether the target server never received those TCP SYNs, received them and sent a response back, or received them and didn’t send any response at all. To be able to answer those questions correctly, you will have to see the story from the target server’s perspective by looking at a network trace collected on that system.
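
Once traces from both sides are available, the reasoning can be sketched as a simple comparison. The function below is hypothetical and assumes you have already extracted, from each trace, the set of flows for which a SYN or SYN-ACK was seen. It also assumes the flow tuple `(client_ip, client_port, server_ip, server_port)` is identical on both sides, which is not true if a NAT device sits in between.

```python
def diagnose_syn(flow, server_saw_syn, server_sent_synack, client_saw_synack):
    """Classify an unanswered SYN using evidence from both traces.

    flow: (client_ip, client_port, server_ip, server_port)
    server_saw_syn / server_sent_synack: sets of flows from the server side trace
    client_saw_synack: set of flows from the client side trace
    """
    if flow not in server_saw_syn:
        return "SYN never reached the server"
    if flow not in server_sent_synack:
        return "server received the SYN but did not respond"
    if flow not in client_saw_synack:
        return "SYN-ACK was sent but lost on the way back"
    return "SYN-ACK reached the client"

# Flow taken from the client side trace above; the server side sets are made up.
flow = ("192.168.4.71", 37389, "10.1.1.1", 443)
print(diagnose_syn(flow, set(), set(), set()))
```

With empty server side sets, as in this made-up call, the function reports that the SYN never reached the server, which would point to a device in the path rather than to the server itself.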

     

    Example2:

    We look at a client side network trace and see that HTTP response is sent by the HTTP server after 4 seconds:

     

    Time            Delta       Source                Destination           Protocol Info

    16:57:37.537895 0.000000    192.168.4.71          10.17.200.49          TCP      45221 > 80 [SYN] Seq=0 Win=65535 Len=0 MSS=1460 SACK_PERM=1

    16:57:37.787895 0.250000    192.168.4.71          10.17.200.49          TCP      45221 > 80 [ACK] Seq=1 Ack=1 Win=65535 [TCP CHECKSUM INCORRECT]

    16:57:37.787895 0.000000    10.17.200.49          192.168.4.71          TCP      80 > 45221 [SYN, ACK] Seq=0 Ack=1 Win=5840 Len=0 MSS=1380

    16:57:37.787895 0.000000    192.168.4.71          10.17.200.49          HTTP     GET /images/downloads/cartoons/thumb_1.jpg HTTP/1.1

    16:57:38.053520 0.265625    10.17.200.49          192.168.4.71          TCP      80 > 45221 [ACK] Seq=1 Ack=356 Win=6432 Len=0

    16:57:42.084770  4.031250    10.17.200.49          192.168.4.71          HTTP     HTTP/1.1 200 OK  (JPEG JFIF image)

    16:57:42.084770 0.000000    10.17.200.49          192.168.4.71          HTTP     Continuation or non-HTTP traffic

    16:57:42.084770 0.000000    192.168.4.71          10.17.200.49          TCP      45221 > 80 [ACK] Seq=356 Ack=2761 Win=65535 [TCP CHECKSUM

    16:57:42.350395 0.265625    10.17.200.49          192.168.4.71          HTTP     Continuation or non-HTTP traffic

    ....

     

    By looking at the client side trace, can you answer the following?

     

    => Does the 4 second delay come from the target server or a network device running in between?

    => Did the target server wait for 4 seconds before responding or did it immediately send a response back but we see it after 4 seconds at the client side?

     

    You cannot answer either of them. From the client side trace alone, you cannot say whether that 4-second delay really comes from the target web server or from a network device (a web proxy, for example) running in between. To be able to answer those questions correctly, you will have to see the story from the target server’s perspective by looking at a network trace collected on that system.
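
With a server side trace in hand, the delay can be split into a server part and a network part. The sketch below is an illustration only: `split_delay` is a made-up helper, and the server side timestamps in the example are invented. Each difference is taken within a single trace, so the clocks of the two machines do not need to be synchronized.

```python
def split_delay(client_req, client_resp, server_req, server_resp):
    """Split the delay seen by the client into server time and network time.

    All arguments are timestamps in seconds. The client pair comes from the
    client side trace and the server pair from the server side trace.
    """
    observed = client_resp - client_req     # what the client experienced
    server_time = server_resp - server_req  # time spent inside the server
    network_time = observed - server_time   # remainder: path in both directions
    return server_time, network_time

# Client side: GET at 16:57:37.787895, 200 OK at 16:57:42.084770 (from the trace above),
# written here as seconds past 16:57:00.
# Server side: hypothetical values where the server answers after 0.125 seconds.
server_time, network_time = split_delay(37.787895, 42.084770, 10.000, 10.125)
print(round(server_time, 6), round(network_time, 6))
```

In this made-up case almost all of the delay would be attributed to the path between client and server, for example a web proxy, rather than to the web server.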

     

     

    Hope this helps

     

    Thanks,

    Murat