A blog by Jose Barreto, a member of the File Server team at Microsoft.
© Copyright 2004-2012 by Jose Barreto. All rights reserved.
1) Introduction
We have covered the basics of SMB Direct and some of the use cases in previous blog posts and TechNet articles. You can find them at http://smb3.info.
However, I get a lot of questions about specifically which cards work with this new feature and how exactly you set those up. This is one in a series of blog posts that cover specific instructions for RDMA NICs. In this specific post, we’ll cover all the details to deploy the Mellanox ConnectX-2 and ConnectX-3 adapters, using the InfiniBand “flavor” of RDMA.
2) Hardware and Software
To implement and test this technology, you will need:
Mellanox states support for Windows Server 2012 SMB Direct and Kernel-mode RDMA capabilities on the following adapter models:
You can find more information about these adapters on Mellanox’s web site.
Important note: The older Mellanox InfiniBand adapters (including the first generation of ConnectX adapters and the InfiniHost III adapters) won't work with SMB Direct in Windows Server 2012.
There are many options in terms of adapters, cables and switches. On the Mellanox web site you can find more information about these InfiniBand adapters (http://www.mellanox.com/content/pages.php?pg=infiniband_cards_overview&menu_section=41) and InfiniBand switches (http://www.mellanox.com/content/pages.php?pg=switch_systems_overview&menu_section=49). Here are some examples of configurations you can use to try Windows Server 2012:
2.1) Two computers using QDR
If you want to set up a simple pair of computers to test SMB Direct, you simply need two InfiniBand cards and a back-to-back cable. This could be used for simple testing, such as one file server and one Hyper-V server. If you want the most affordable InfiniBand solution, you can use a single-port QDR card, which operates at a 32Gbps data rate. Here are the parts you will need:
2.2) Eight computers using QDR
If you want to try a more realistic configuration with InfiniBand, you could set up a two-node file server cluster connected to a six-node Hyper-V cluster. In this setup, you will need 8 computers, each with an InfiniBand card. You will also need a switch with at least 8 ports (Mellanox offers an 8-port model). Using QDR speeds, you’ll need the following parts:
2.3) Two computers using FDR
You may also try the faster FDR speeds (54Gbps data rate). The minimum setup in this case would again be two cards and a cable. Please note that the QDR and FDR cables are different, although they use similar connectors. Here’s what you will need:
Please note that you will need a system with PCIe Gen3 slots to achieve the rated speed with this card. These slots are available on newer systems, such as the ones equipped with an Intel Romley motherboard. If you use an older system, the card will be limited by the speed of the older PCIe Gen2 bus.
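The data rates quoted above follow from the link signaling rate and the line encoding. Here is a minimal PowerShell sketch of that arithmetic. It assumes 4-lane InfiniBand links (10 Gbps per lane with 8b/10b encoding for QDR, 14.0625 Gbps per lane with 64b/66b encoding for FDR) and, as an assumption not stated in this post, a PCIe x8 slot for the card. The function name is made up for illustration.

```powershell
# Effective data rate = per-lane signaling rate * lanes * encoding efficiency.
function Get-EffectiveGbps {
    param(
        [double]$LaneGbps,     # raw signaling rate per lane
        [int]$Lanes,           # number of lanes in the link
        [int]$PayloadBits,     # data bits per encoded block
        [int]$EncodedBits      # total bits per encoded block
    )
    [math]::Round($LaneGbps * $Lanes * $PayloadBits / $EncodedBits, 1)
}

# InfiniBand 4x links
$qdr  = Get-EffectiveGbps -LaneGbps 10      -Lanes 4 -PayloadBits 8   -EncodedBits 10
$fdr  = Get-EffectiveGbps -LaneGbps 14.0625 -Lanes 4 -PayloadBits 64  -EncodedBits 66

# PCIe x8 slots (assumed slot width)
$gen2 = Get-EffectiveGbps -LaneGbps 5       -Lanes 8 -PayloadBits 8   -EncodedBits 10
$gen3 = Get-EffectiveGbps -LaneGbps 8       -Lanes 8 -PayloadBits 128 -EncodedBits 130

"QDR 4x      : $qdr Gbps"
"FDR 4x      : $fdr Gbps"
"PCIe Gen2 x8: $gen2 Gbps"
"PCIe Gen3 x8: $gen3 Gbps"
```

Under these assumptions, an FDR link carries more data than a Gen2 x8 slot can deliver but fits within a Gen3 x8 slot, which is why the older bus becomes the bottleneck.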
2.4) Ten computers using dual FDR cards
If you’re interested in experiencing great throughput in a private cloud setup, you could configure a two-node file server cluster plus an eight-node Hyper-V cluster. You could also use two InfiniBand cards for each system, for added performance and fault tolerance. In this setup, you would need 20 FDR cards and a 20-port FDR switch (Mellanox sells a model with 36 FDR ports). Here are the parts required:
3) Download and update the drivers
Windows Server 2012 RC includes an inbox driver for the Mellanox ConnectX-2 and ConnectX-3 cards. However, Mellanox provides updated firmware and drivers for download. You should be able to use the inbox driver to access the Internet to download the updated driver.
The latest Mellanox drivers for Windows Server 2012 RC can be downloaded from the Windows Server 2012 tab on this page on the Mellanox web site: http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=32&menu_section=34.
The package is provided to you as a single executable file. Simply run the EXE file to update the firmware and driver. This package will also install Mellanox tools on the server. Please note that this package is different from the Windows Server 2012 Beta package. Make sure you grab the latest version.
After the download, simply run the executable file and choose one of the installation options (complete or custom). The installer will automatically detect whether at least one card has old firmware and offer to update it. You should always update to the latest firmware provided.
Note 1: This package does not update firmware for OEM cards. If you are using this type of card, contact your OEM for an update.
Note 2: Certain Intel Romley systems won't boot Windows Server 2012 when old Mellanox firmware is present. You might need to update the firmware of the Mellanox card using another system before you can use that card on the Intel Romley system. In certain cases, that issue might also be addressed by updating the firmware/BIOS of the Intel Romley system.
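After the update, you may want to confirm which driver version each Mellanox adapter is actually using. The sketch below is one way to do that, not an official procedure. It assumes the adapter's InterfaceDescription contains the word "Mellanox" and that the objects returned by Get-NetAdapter expose DriverVersion and DriverDate properties. The function name is hypothetical, and it reports the driver only, not the firmware.

```powershell
# Filters adapter objects down to Mellanox cards and keeps the driver details.
# Assumption: the InterfaceDescription text contains "Mellanox".
function Select-MellanoxAdapterInfo {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $Adapter
    )
    process {
        if ($Adapter.InterfaceDescription -match 'Mellanox') {
            [pscustomobject]@{
                Name          = $Adapter.Name
                Description   = $Adapter.InterfaceDescription
                DriverVersion = $Adapter.DriverVersion
                DriverDate    = $Adapter.DriverDate
            }
        }
    }
}

# On a server with the cards installed:
# Get-NetAdapter | Select-MellanoxAdapterInfo | Format-Table -AutoSize
```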
4) Configure a subnet manager
When using an InfiniBand network, you are required to have a subnet manager running. The best option is to use a managed InfiniBand switch (which runs a subnet manager), but you can also install a subnet manager on a computer connected to an unmanaged switch. Here are some details:
4.1) Best option – Using managed switches with a built-in subnet manager
For this option, make sure you use managed switches. These switches come ready to run their own subnet manager and all you have to do is enable that option using the switch’s web interface.
4.2) Using OpenSM with a single unmanaged switch
If you don’t have a managed switch, you can use one of the computers running Windows Server 2012 to run your subnet manager. When you installed the Mellanox tools in step 3, you also installed the OpenSM.EXE tool, which is a subnet manager that runs on Windows Server. You want to make sure you install it as an auto-starting service.
Although the installation program configures OpenSM to run as a service, it misses the parameter to limit the log size. Here are a few commands to remove the default service and add a new one that has all the right parameters and starts automatically. Run them from a PowerShell prompt running as Administrator:
SC.EXE delete OpenSM
New-Service -Name "OpenSM" -BinaryPathName "`"C:\Program Files\Mellanox\MLNX_VPI\IB\Tools\opensm.exe`" --service -L 128" -DisplayName "OpenSM" -Description "OpenSM" -StartupType Automatic
Start-Service OpenSM
Note 1: This assumes that you installed the tools to the default location: C:\Program Files\Mellanox\MLNX_VPI
Note 2: For fault tolerance, make sure you have two computers on your network configured to run OpenSM. It is not recommended to run OpenSM on more than two computers connected to a switch.
4.3) Using OpenSM with two unmanaged switches
For complete fault tolerance, you want to have two switches and have two cards (or a dual-ported card) per computer, one going to each switch. With SMB Multichannel, you get fault tolerance in case a single card, cable or switch has a problem. However, each instance of OpenSM can only handle a single switch. In this case, you need two instances of OpenSM.EXE running on the computer, one for each card, working as a subnet manager for each of the two unmanaged switches.
First, you need to identify the two ports you have on the system (either on a single dual-ported card or on two single-ported cards). To do this, run the IBSTAT tool from Mellanox, which shows the identification for each InfiniBand port in your system (look for the line showing the port GUID). Here’s a sample; the two port GUIDs appear on the “Port GUID” line of each port:
PS C:\> ibstat
CA 'ibv_device0'
    CA type:
    Number of ports: 2
    Firmware version: 0x20009209e
    Hardware version: 0xb0
    Node GUID: 0x0002c903000f9956
    System image GUID: 0x0002c903000f9959
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 40
        Base lid: 1
        LMC: 0
        SM lid: 1
        Capability mask: 0x90580000
        Port GUID: 0x0002c903000f9957
    Port 2:
        State: Down
        Physical state: Polling
        Rate: 70
        Base lid: 0
        LMC: 0
        SM lid: 0
        Capability mask: 0x90580000
        Port GUID: 0x0002c903000f9958
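If you would rather not copy the GUIDs by hand, the port GUID lines can be pulled out of the IBSTAT text with a regular expression. This is only a sketch: the function name is hypothetical, and it assumes the output uses the "Port GUID: 0x..." format shown in the sample above.

```powershell
# Extracts every "Port GUID" value from ibstat output (array of lines or one string).
function Get-IbPortGuid {
    param(
        [Parameter(Mandatory = $true)]
        [string[]]$IbstatOutput
    )
    $text = $IbstatOutput -join "`n"
    # Matches "Port GUID:" only, so the Node GUID and System image GUID lines are skipped.
    [regex]::Matches($text, 'Port GUID:\s*(0x[0-9a-fA-F]+)') |
        ForEach-Object { $_.Groups[1].Value }
}

# A few lines from the sample output above, used here as test input.
$sample = @(
    'Node GUID: 0x0002c903000f9956'
    'System image GUID: 0x0002c903000f9959'
    'Port 1:'
    'Port GUID: 0x0002c903000f9957'
    'Port 2:'
    'Port GUID: 0x0002c903000f9958'
)
Get-IbPortGuid -IbstatOutput $sample

# On a server with the Mellanox tools installed:
# Get-IbPortGuid -IbstatOutput (ibstat)
```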
Once you have identified the two port GUIDs, you can run the following commands from a PowerShell prompt running as Administrator:
SC.EXE delete OpenSM
New-Service -Name "OpenSM1" -BinaryPathName "`"C:\Program Files\Mellanox\MLNX_VPI\IB\Tools\opensm.exe`" --service -g 0x0002c903000f9957 -L 128" -DisplayName "OpenSM1" -Description "OpenSM for the first IB subnet" -StartupType Automatic
New-Service -Name "OpenSM2" -BinaryPathName "`"C:\Program Files\Mellanox\MLNX_VPI\IB\Tools\opensm.exe`" --service -g 0x0002c903000f9958 -L 128" -DisplayName "OpenSM2" -Description "OpenSM for the second IB subnet" -StartupType Automatic
Start-Service OpenSM1
Start-Service OpenSM2
Note: For fault tolerance, make sure you have two computers on your network, both configured to run two instances of OpenSM. It is not recommended to run OpenSM on more than two computers connected to a switch.
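The trickiest part of the New-Service commands in this section is the quoting of the BinaryPathName value, because the path to opensm.exe contains a space. A small helper can build that string so you only pass the port GUID. This is a sketch with a hypothetical function name; it assumes the default install location and reuses the --service, -g and -L 128 arguments exactly as they appear in this post.

```powershell
# Builds the BinaryPathName string for an OpenSM service.
# The executable path is wrapped in double quotes because it contains a space.
function Get-OpenSmBinaryPath {
    param(
        [string]$ToolsPath = 'C:\Program Files\Mellanox\MLNX_VPI\IB\Tools',
        [string]$PortGuid,          # optional: binds this instance to one port
        [int]$LogLimit = 128        # value passed to -L, as used in this post
    )
    $exe   = '"{0}\opensm.exe"' -f $ToolsPath
    $parts = @($exe, '--service')
    if ($PortGuid) { $parts += @('-g', $PortGuid) }
    $parts += @('-L', $LogLimit)
    $parts -join ' '
}

Get-OpenSmBinaryPath
Get-OpenSmBinaryPath -PortGuid '0x0002c903000f9957'

# From an elevated prompt, the result could then be passed to New-Service:
# New-Service -Name 'OpenSM1' -BinaryPathName (Get-OpenSmBinaryPath -PortGuid '0x0002c903000f9957') -DisplayName 'OpenSM1' -StartupType Automatic
```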
5) Configure IP Addresses
After you have the drivers in place, you should configure the IP address for your NIC. If you’re using DHCP, that should happen automatically, so just skip to the next step.
For those doing manual configuration, assign an IP address to your interface using either the GUI or something similar to the PowerShell below. This assumes that the interface is called RDMA1, that you’re assigning the IP address 192.168.1.10 to the interface and that your DNS server is at 192.168.1.2.
Set-NetIPInterface -InterfaceAlias RDMA1 -DHCP Disabled
Remove-NetIPAddress -InterfaceAlias RDMA1 -AddressFamily IPv4 -Confirm:$false
New-NetIPAddress -InterfaceAlias RDMA1 -IPAddress 192.168.1.10 -PrefixLength 24 -Type Unicast
Set-DnsClientServerAddress -InterfaceAlias RDMA1 -ServerAddresses 192.168.1.2
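When assigning addresses by hand, it can help to double-check that the addresses you plan to use fall in the same subnet for the chosen prefix length. The sketch below does that check with plain arithmetic, so it does not touch the network configuration. The function names are hypothetical, it handles IPv4 only, and 192.168.1.11 is a made-up address for a second computer.

```powershell
# Returns the IPv4 network ID for an address and prefix length, e.g. 192.168.1.0 for /24.
function Get-NetworkId {
    param(
        [string]$IPAddress,
        [int]$PrefixLength
    )
    $bytes     = ([ipaddress]$IPAddress).GetAddressBytes()
    $remaining = $PrefixLength
    $net = foreach ($b in $bytes) {
        # Number of mask bits that apply to this octet (0 to 8).
        $bits = [math]::Min(8, [math]::Max(0, $remaining))
        $remaining -= 8
        $mask = (0xFF -shl (8 - $bits)) -band 0xFF
        $b -band $mask
    }
    $net -join '.'
}

# True when both addresses share the same network ID for the given prefix length.
function Test-SameSubnet {
    param(
        [string]$AddressA,
        [string]$AddressB,
        [int]$PrefixLength
    )
    (Get-NetworkId $AddressA $PrefixLength) -eq (Get-NetworkId $AddressB $PrefixLength)
}

Get-NetworkId -IPAddress 192.168.1.10 -PrefixLength 24
Test-SameSubnet -AddressA 192.168.1.10 -AddressB 192.168.1.11 -PrefixLength 24
```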
6) Verify everything is working
Follow the steps below to confirm everything is working as expected:
6.1) Verify network adapter configuration
Use the following PowerShell cmdlets to verify Network Direct is globally enabled and that you have NICs with the RDMA capability. Run on both the SMB server and the SMB client.
Get-NetOffloadGlobalSetting | Select NetworkDirect
Get-NetAdapterRDMA
Get-NetAdapterHardwareInfo
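If you are checking several servers, the first two results can be reduced to a single yes/no answer. The following is a sketch with a hypothetical function name. It assumes the NetworkDirect setting reads "Enabled" when turned on and that each object from Get-NetAdapterRDMA has an Enabled property.

```powershell
# Returns $true when Network Direct is globally enabled and at least one adapter has RDMA enabled.
function Test-RdmaReady {
    param(
        $NetworkDirect,   # value of (Get-NetOffloadGlobalSetting).NetworkDirect
        $Adapters         # output of Get-NetAdapterRdma
    )
    $enabledAdapters = @($Adapters | Where-Object { $_.Enabled })
    ("$NetworkDirect" -eq 'Enabled') -and ($enabledAdapters.Count -gt 0)
}

# On the SMB server and the SMB client:
# Test-RdmaReady -NetworkDirect (Get-NetOffloadGlobalSetting).NetworkDirect -Adapters (Get-NetAdapterRdma)
```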
6.2) Verify SMB configuration
Use the following PowerShell cmdlets to make sure SMB Multichannel is enabled, confirm the NICs are being properly recognized by SMB and that their RDMA capability is being properly identified.
On the SMB client, run the following PowerShell cmdlets:
Get-SmbClientConfiguration | Select EnableMultichannel
Get-SmbClientNetworkInterface
On the SMB server, run the following PowerShell cmdlets:
Get-SmbServerConfiguration | Select EnableMultichannel
Get-SmbServerNetworkInterface
netstat.exe -xan | ? {$_ -match "445"}
Note: The NETSTAT command confirms whether the File Server is listening on the RDMA interfaces.
6.3) Verify the SMB connection
On the SMB client, start a long-running file copy to create a lasting session with the SMB Server. While the copy is ongoing, open a PowerShell window and run the following cmdlets to verify the connection is using the right SMB dialect and that SMB Direct is working:
Get-SmbConnection
Get-SmbMultichannelConnection
netstat.exe -xan | ? {$_ -match "445"}
Note: If you have no activity while you run the commands above, it’s possible you get an empty list. This is likely because your session has expired and there are no current connections.
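To narrow the Multichannel output down to the connections where both ends report RDMA capability, you could filter on the capability flags. This is a sketch, not part of the official steps. The function name is hypothetical, and it assumes the objects returned by Get-SmbMultichannelConnection expose ClientRdmaCapable and ServerRdmaCapable properties.

```powershell
# Passes through only the connections where client and server interfaces are both RDMA capable.
function Select-SmbDirectConnection {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $Connection
    )
    process {
        if ($Connection.ClientRdmaCapable -and $Connection.ServerRdmaCapable) {
            $Connection
        }
    }
}

# On the SMB client, while the file copy is running:
# Get-SmbMultichannelConnection | Select-SmbDirectConnection
```

An empty result while a copy is running would suggest that SMB fell back to a non-RDMA path for that session.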
7) Review Performance Counters
There are several performance counters that you can use to verify that the RDMA interfaces are being used and that the SMB Direct connections are being established. You can also use the regular SMB Server and SMB Client performance counters to verify the performance of SMB, including IOPS (data requests per second), latency (average seconds per request) and throughput (data bytes per second). Here's a short list of the relevant performance counters.
On the SMB Client, watch for the following performance counters:
On the SMB Server, watch for the following performance counters:
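The throughput counters report bytes per second, while the InfiniBand link rates earlier in this post are given in Gbps. The sketch below converts between the two and derives the average request size from the throughput and IOPS values. The function names and the sample numbers are made up for illustration.

```powershell
# Converts a "data bytes per second" reading to gigabits per second.
function ConvertTo-Gbps {
    param([double]$BytesPerSecond)
    [math]::Round($BytesPerSecond * 8 / 1e9, 2)
}

# Average bytes per request, from throughput and IOPS readings.
function Get-AverageRequestSize {
    param(
        [double]$BytesPerSecond,
        [double]$RequestsPerSecond
    )
    if ($RequestsPerSecond -le 0) { return 0 }
    [math]::Round($BytesPerSecond / $RequestsPerSecond)
}

# Hypothetical readings: 3,000,000,000 bytes/sec at 6,000 requests/sec.
ConvertTo-Gbps -BytesPerSecond 3000000000
Get-AverageRequestSize -BytesPerSecond 3000000000 -RequestsPerSecond 6000
```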
8) Review the connection log details (optional)
SMB 3.0 now offers an “Object State Diagnostic” event log that can be used to troubleshoot Multichannel (and therefore RDMA) connections. Keep in mind that this is a debug log, so it’s very verbose and requires a special procedure for gathering the events. You can follow the steps below:
First, enable the log in Event Viewer:
After the log is enabled, perform the operation that requires an RDMA connection. For instance, copy a file or run a specific operation. If you’re using mapped drives, be sure to map them after you enable the log, or else the connection events won’t be properly captured.
Next, disable the log in Event Viewer:
Finally, review the events in the log in Event Viewer. You can filter the log to include only the SMB events that confirm that you have an SMB Direct connection or only error events.
The “Smb_MultiChannel” keyword will filter for connection, disconnection and error events related to SMB. You can also filter by event numbers 30700 to 30706.
You can also view the events from a PowerShell window. For example, to list any RDMA-related connection events, you can use the following:
Get-WinEvent -LogName Microsoft-Windows-SMBClient/ObjectStateDiagnostic -Oldest |? Message -match "RDMA"
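The event number range mentioned above (30700 to 30706) can be applied as a filter in PowerShell as well. This is a minimal sketch with a hypothetical function name; it relies only on the Id property of each event record.

```powershell
# Keeps only the events whose Id falls in the 30700-30706 range.
function Select-SmbMultichannelEvent {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $LogEvent
    )
    process {
        if ($LogEvent.Id -ge 30700 -and $LogEvent.Id -le 30706) {
            $LogEvent
        }
    }
}

# After the log has been enabled, used and disabled:
# Get-WinEvent -LogName Microsoft-Windows-SMBClient/ObjectStateDiagnostic -Oldest |
#     Select-SmbMultichannelEvent |
#     Format-List TimeCreated, Id, LevelDisplayName, Message
```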
9) Conclusion
I hope this helps you with your testing of the Mellanox InfiniBand adapters. I wanted to cover all the different angles to make sure you don’t miss any relevant steps. I also wanted to have enough troubleshooting guidance here to get you covered for any known issues. Let us know how your experience was by posting a comment.