The Microsoft HPC & Batch Team Blog

The blog for the Big Compute team in Microsoft Azure, working on HPC & Batch services and technologies

Authentication Failure error on Windows HPC caused by port conflict


I recently ran into an issue where every attempt to run a cluster command from the command line resulted in an authentication failure. We could connect to the cluster from the GUI and from PowerShell, but every attempt to connect from the command line failed with the same terse error: Authentication failure.

I was pretty stumped, as I could not understand why the failure occurred only on the command line and not via the GUI or PowerShell. The TechNet article at http://technet.microsoft.com/en-us/library/cc719008.aspx#BKMK_Firewall shows that specific ports are used for communication between the cluster services on the head node and the compute nodes. For example, the command-line tools use port 5800 to communicate with the HPC Job Scheduler Service on the head node, and port 5969 is used by the client tools on the enterprise network to connect to the HPC Job Scheduler Service on the head node. If you're having trouble communicating with the Job Scheduler Service on the head node, it is always a good idea to check which process is listening on which port. A useful tool for this is netstat.exe.

Running netstat -ano displays all connections and listening ports, shows addresses and port numbers in numerical form, and includes the PID of the process that owns each connection or listening port. Compare this with the output of tasklist.exe, which lists each running process alongside its PID, and you can work out which process is listening on which port.
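The netstat/tasklist cross-check can also be scripted. The following is only a rough sketch in Python, not part of the HPC tooling. It assumes an English-language Windows install (it matches the literal state text LISTENING) and that the process list comes from tasklist /FO CSV /NH. The function names, the sample output, PID 1234 and the image name vncserver.exe are all made up for illustration.

```python
import csv
import io
import subprocess

# Made-up sample output for illustration; real PIDs and names will differ.
SAMPLE_NETSTAT = """
  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING       812
  TCP    0.0.0.0:5800           0.0.0.0:0              LISTENING       1234
  TCP    [::]:5800              [::]:0                 LISTENING       1234
  UDP    0.0.0.0:5355           *:*                                    1020
"""

SAMPLE_TASKLIST = (
    '"svchost.exe","812","Services","0","9,876 K"\n'
    '"vncserver.exe","1234","Services","0","5,120 K"\n'
)


def listeners_on_port(netstat_output, port):
    """Return the set of PIDs that are LISTENING on the given TCP port."""
    pids = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected TCP row: Proto, Local Address, Foreign Address, State, PID
        if len(fields) == 5 and fields[0] == "TCP" and fields[3] == "LISTENING":
            # rsplit handles both 0.0.0.0:5800 and [::]:5800
            local_port = fields[1].rsplit(":", 1)[-1]
            if local_port == str(port):
                pids.add(int(fields[4]))
    return pids


def process_names(tasklist_csv):
    """Map PID to image name from `tasklist /FO CSV /NH` output."""
    names = {}
    for row in csv.reader(io.StringIO(tasklist_csv)):
        if len(row) >= 2 and row[1].isdigit():
            names[int(row[1])] = row[0]
    return names


def find_owners(port):
    """Run both tools on the local machine (Windows only) and join on PID."""
    netstat = subprocess.run(["netstat", "-ano"],
                             capture_output=True, text=True, check=True).stdout
    tasklist = subprocess.run(["tasklist", "/FO", "CSV", "/NH"],
                              capture_output=True, text=True, check=True).stdout
    names = process_names(tasklist)
    return {pid: names.get(pid, "<unknown>") for pid in listeners_on_port(netstat, port)}
```

On the head node, calling find_owners(5800) from an elevated prompt should return the PID and image name of whatever is holding the port. If you only need a one-off check, reading the netstat -ano and tasklist output by eye works just as well.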

Doing this in my scenario revealed that a different application (VNC Server) was listening on port 5800, so the command-line interface was unable to connect to the scheduler service on that port. The solution was simply to reconfigure the VNC application to listen on a different port and then restart the HPC Job Scheduler Service.

After this, the command-line interface to the job scheduler worked as expected.

I hope someone out there in Windows HPC land finds this post helpful someday.

 

Comments
  • Hi ihimmiar (tried to look up your profile for your name but it doesn't come up anymore), thanks a lot for this article. It has helped me in overcoming a similar issue. Although in my case I was not even able to connect through the GUI. And yes, the port 5800 was taken up by VNC server. So here is a Thank You note. Last line of your article came out to be true finally, even though after 5 years. :)
