Recently I've run into several issues caused by ephemeral port exhaustion.
The issues come to us with several different symptoms and behaviors, some of which are listed here.
The server is hung/frozen/unresponsive.
I'm unable to access the internet/network file share.
I can't log on to the domain.
There are various other issues that may occur but those seem to be the common complaints. A reboot will fix the problem.
A memory dump will confirm the cause, but it is a one-shot chance at data gathering. In the case of port exhaustion we can use one or two tools to quickly pinpoint the problem without taking down the system (and needing to wait for the issue to reoccur).
Let's dig into more details of "What Works/What doesn’t" and highlight the tools to confirm or discount Ephemeral Port Exhaustion (EPE for short, since I cannot spell Ephemeral or Exhaustion).
(Note: In Performance troubleshooting there are just a few things that should be banned from our vocabulary ;)
Some of the tools we have monitor systems over time. Typically, if the issue is not happening right now and cannot be reproduced on demand, we'll have to gather data for a few hours/days/weeks until the issue does return.
Perfmon will show a high number of handles in an application and/or in System that gradually increases (Process\Handle Count\*).
Poolmon (in the APP_HandleCount file, Poolmon3vbs version): you may see a high number here:
Live Troubleshooting tools:
Without going into too much detail: there are three basic communication routes to the server, and three from the server. Each communication requires a socket connection. Just like any three-prong power strip, there are three things that make a socket: Port, Protocol, and IP.
When we test the responsiveness of the computer, the first tests should be to see what works in to the system and also out from the system.
NSLookup is a simple command-line tool that checks DNS records and resolves names for us. It goes off the box to the domain controller or DNS server, makes a socket connection, and returns the information requested.
Ping basically checks to see if the network is 'alive'.
In this scenario Ping will work to the system and from the system.
NSLookup will check to see if outbound UDP works
NSLookup -v will force TCP and check to see if outbound TCP connections work
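If NSLookup is not handy, the same outbound TCP check can be approximated with a few lines of script. This is only a sketch: the function name `check_outbound_tcp` is made up for illustration, and the host and port you pass in are whatever DNS server or domain controller you would normally test against. Any failure to connect is reported back, so it does not by itself prove port exhaustion; it only tells you outbound TCP is not working from this box.

```python
import socket


def check_outbound_tcp(host, port, timeout=3.0):
    """Try one outbound TCP connection and return (ok, detail)."""
    try:
        # create_connection asks the OS for an ephemeral local port,
        # which is the resource that runs out during port exhaustion.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # Timeouts, refusals and resource errors all end up here; the
        # message is kept so the operator can read what the OS said.
        return False, f"{type(exc).__name__}: {exc}"
```

For example, `check_outbound_tcp("dns.example.local", 53)` (a hypothetical server name) returns a pair such as `(True, "connected")` when the connection succeeds, or `False` plus the error text when it does not.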
If outbound TCP connections fail then we move to…
Netstat -ano:
A warning about netstat -ano
While helpful, it may not always show all the open ports. You can use it to look, but if it doesn't show a ton of connections, don't be fooled… dig deeper.
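When netstat does return a long list, it helps to total it up per process rather than scroll through it. The sketch below assumes you have saved the output of netstat -ano to text in the usual Windows column layout (Proto, Local Address, Foreign Address, State, PID). The function name and the sample rows are invented for illustration, and, per the warning above, a low count here does not rule out EPE.

```python
from collections import Counter


def count_tcp_by_pid(netstat_text):
    """Count TCP rows per (PID, state) in netstat -ano style output."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # A TCP row has five columns: Proto, Local, Foreign, State, PID.
        # UDP rows have no State column, so they are skipped here.
        if len(fields) == 5 and fields[0] == "TCP":
            state, pid = fields[3], int(fields[4])
            counts[(pid, state)] += 1
    return counts


# Made-up sample rows, not captured from a real system.
SAMPLE = """\
  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING       888
  TCP    10.0.0.5:49160         10.0.0.9:445           ESTABLISHED     4
  TCP    10.0.0.5:49161         10.0.0.9:445           CLOSE_WAIT      1234
  TCP    10.0.0.5:49162         10.0.0.9:445           CLOSE_WAIT      1234
  UDP    0.0.0.0:123            *:*                                    1020
"""

# Print the busiest (PID, state) pairs first.
for (pid, state), n in count_tcp_by_pid(SAMPLE).most_common():
    print(pid, state, n)
```

A PID with a large and growing pile of connections in one state is the one to look up in Process Explorer next.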
If the issue is currently happening, we can check handle information with Process Explorer.
In the Main Process Explorer window we have to make a few changes to see the information we want:
2. From the menu list add Show Lower Pane and select Handles
Tip: Move the Handle Column closer to the Process Name (for ease of use) and sort by handle count.
For this example, pretend svchost is the highest consumer. Once it is selected you will see the Type of handles listed, and if we have EPE you could see several thousand File handles with Name \Device\AFD or \Device\TCP:
Restarting that application will instantly resolve the issue.
If the high handle count is in the System process, then there is probably another application that is telling the System process to do all its work. A reboot (and, if possible, a full memory dump) is the only way to clear System handles and get more data.
If the handles are not in TCP or AFD - keep digging! It still is a valid test to restart the application, and if it's a 3rd party application, restarting it and confirming the server returns to "normal" should be enough proof that the application is at fault.
Port numbers 1024 through 5000
Port numbers 49152 to 65535
For example, many communications will start on fixed port numbers (3389, 145, 25, 110 are all examples of known fixed ports), and if the application needs additional connections it will then spawn a conversation on one or more dynamic ports.
If the applications do not close the conversation correctly, the port will be left connected, using a handle and possibly other resources (NPP, PP, threads, etc.). Since there are a limited number of ephemeral ports, we can eventually run out.
Imagine someone in the office picking up every phone, making a call and not hanging up. Every phone in use means no one else can call out. You can still work, if you do not make any outbound phone calls.
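To put a number on "limited", here is the arithmetic for the two ranges quoted above, treating both ends as inclusive. This is a rough sketch: the helper names are made up, and `ports_left` simply subtracts whatever local ports you feed it (for example, ones pulled from netstat), so it inherits any gaps in that data.

```python
# The two dynamic ranges as quoted in this article (inclusive ends).
RANGES = {
    "1024-5000": (1024, 5000),
    "49152-65535": (49152, 65535),
}


def range_size(low, high):
    """Number of ports in an inclusive range."""
    return high - low + 1


def ports_left(used_ports, low, high):
    """Ports in the range that are not in the used_ports collection."""
    in_range = {p for p in used_ports if low <= p <= high}
    return range_size(low, high) - len(in_range)


for label, (low, high) in RANGES.items():
    print(label, range_size(low, high))
```

The smaller range works out to 3977 ports and the larger to 16384, which is why a leaky application can drain the smaller range so much faster.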
In the case of this type of Server "Hangs":
The mouse works on the console
Keyboard works on the console
Local logon will likely work on the console and RDP
Existing connections where no authentication takes place (where Kerberos is going off the box for verification) will work (file shares, currently connected RDP users)
Ping will work (ICMP)
UDP connections will work (NSLookup)
TCP Connections Into the box will work
TCP connections from the box to the outside will fail (NSLookup -v)
Always dig into the exact behavior of Hang, Fail, Frozen, and Unresponsive by testing mouse, keyboard, and inbound and outbound network connectivity on various protocols.
How would I monitor the number of ports available? Say I want to set a threshold that warns me when only 1000 ports are left,
so I can get early detection via an event etc.
The ports themselves are hard to track that way. Netstat -ano will only report the ports that it knows the status for (closed, waiting, established, etc.) and will miss counting ones that it doesn't know how to qualify. The best option, if you think you have this issue, is to have Performance Monitor trigger an event/alert on a high number of handles. One caveat here is that it can get noisy if the threshold is too low, as some applications will have high handle counts. The threshold can alert you to high usage, but you would still need to confirm in Process Explorer that the handles are of type File with Name \Device\AFD or \Device\TCP.
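One way to keep such an alert from getting noisy is to fire only when a process stays above the threshold for several samples in a row. The sketch below shows just that logic. It does not talk to Performance Monitor; the function names are invented, and it assumes you already collect handle counts per process at some interval by whatever means you prefer.

```python
def sustained_breach(samples, threshold, min_consecutive=3):
    """True if the last min_consecutive samples are all above threshold."""
    if len(samples) < min_consecutive:
        return False
    return all(s > threshold for s in samples[-min_consecutive:])


def processes_to_flag(history, threshold, min_consecutive=3):
    """history maps process name -> handle counts, oldest sample first."""
    return sorted(
        name
        for name, samples in history.items()
        if sustained_breach(samples, threshold, min_consecutive)
    )
```

With a threshold of 10000 (an arbitrary number picked for illustration), a process that spikes once and drops back is ignored, while one that climbs and stays high gets flagged for a closer look in Process Explorer.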
If I used the 'handle' utility and specified looking for \device\tcp and/or \device\afd, would that accurately depict the number of TCP ports being consumed? Granted, it doesn't tell me if the ports are in ESTABLISHED or CLOSE_WAIT status. Is it just a counter I can monitor? (By the way, great article.)
Example:
handle -a \device\tcp
Output:
System pid: 4 type: File 188: \Device\Tcp
System pid: 4 type: File 18C: \Device\Tcp
System pid: 4 type: File 190: \Device\Tcp
System pid: 4 type: File 198: \Device\Tcp
System pid: 4 type: File 19C: \Device\Tcp
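If you do want to turn output shaped like the lines quoted above into a counter, a small parser is enough. This is a sketch that assumes each line follows the "name pid: N type: T handle: path" layout shown in the comment; the function name and regular expression are my own, and whether the resulting count maps one-to-one onto consumed ports is exactly the open question being asked.

```python
import re
from collections import Counter

# Matches lines shaped like the quoted output, e.g.
#   System pid: 4 type: File 188: \Device\Tcp
LINE_RE = re.compile(
    r"^(?P<proc>.+?)\s+pid:\s*(?P<pid>\d+)\s+type:\s*(?P<type>\S+)\s+"
    r"(?P<handle>[0-9A-Fa-f]+):\s*(?P<name>.+)$"
)


def count_device_handles(handle_output,
                         devices=(r"\device\tcp", r"\device\afd")):
    """Count matching handles per (process name, pid)."""
    wanted = tuple(d.lower() for d in devices)
    counts = Counter()
    for line in handle_output.splitlines():
        m = LINE_RE.match(line.strip())
        # Compare case-insensitively, since the tool prints \Device\Tcp.
        if m and m.group("name").strip().lower().startswith(wanted):
            counts[(m.group("proc"), int(m.group("pid")))] += 1
    return counts


# The five lines from the comment above, used as sample input.
SAMPLE = r"""System pid: 4 type: File 188: \Device\Tcp
System pid: 4 type: File 18C: \Device\Tcp
System pid: 4 type: File 190: \Device\Tcp
System pid: 4 type: File 198: \Device\Tcp
System pid: 4 type: File 19C: \Device\Tcp
"""

print(count_device_handles(SAMPLE))
```

Sampling this count on a schedule and watching for steady growth gives you a trend to alert on, even though it says nothing about connection state.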