Last week I helped a customer troubleshoot OpsMgr certificate issues affecting the Agents they monitor in a non-trusted domain. More information on monitoring an Agent in a non-trusted domain can be found here: http://blogs.technet.com/smsandmom/archive/2008/09/10/opsmgr-2007-monitoring-an-agent-in-a-non-trusted-domain.aspx
These were the events in the Operations Manager event log:
Event Type: Warning
Event Source: OpsMgr Connector
Event Category: None
Event ID: 20067
Date: 6/17/2009
Time: 3:33:31 PM
User: N/A
Computer: computername
Description:
A device at IP 192.168.1.1:5723 attempted to connect but the certificate presented by the device was invalid. The connection from the device has been rejected. The failure code on the certificate was 0x800B010A (A certificate chain could not be built to a trusted root authority.).
For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.
Event Type: Warning
Event Source: OpsMgr Connector
Event Category: None
Event ID: 21002
Date: 6/17/2009
Time: 3:33:31 PM
User: N/A
Computer: computername
Description:
The OpsMgr Connector could not accept a connection from 192.168.1.1:5723 because mutual authentication failed.
Event Type: Error
Event Source: OpsMgr Connector
Event Category: None
Event ID: 20070
Date: 6/17/2009
Time: 3:33:31 PM
User: N/A
Computer: computername
Description:
The OpsMgr Connector connected to MS01.support.local, but the connection was closed immediately after authentication occurred. The most likely cause of this error is that the agent is not authorized to communicate with the server, or the server has not received configuration. Check the event log on the server for the presence of 20000 events, indicating that agents which are not approved are attempting to connect.
Event Type: Error
Event Source: OpsMgr Connector
Event Category: None
Event ID: 21016
Date: 6/17/2009
Time: 3:33:33 PM
User: N/A
Computer: computername
Description:
OpsMgr was unable to set up a communications channel to MS01.support.local and there are no failover hosts. Communication will resume when MS01.support.local is available and communication from this computer is allowed.
For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp
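The failure code in event 20067, 0x800B010A, is a standard Windows HRESULT, and splitting it into its fields shows which subsystem raised it. The following is a minimal Python sketch of that decoding; the names decode_hresult and FACILITY_NAMES are made up for this illustration and are not part of OpsMgr or Certutil.

```python
# Minimal sketch: split a 32-bit HRESULT into its standard bit fields.
# decode_hresult and FACILITY_NAMES are hypothetical names for this example.

# Only the two facilities relevant to certificate and URL-retrieval errors.
FACILITY_NAMES = {
    7: "FACILITY_WIN32",   # wraps a plain Win32 error code
    11: "FACILITY_CERT",   # certificate / trust errors
}


def decode_hresult(value: int) -> dict:
    """Return the severity bit, facility and code of an HRESULT."""
    facility = (value >> 16) & 0x7FF   # bits 16-26
    return {
        "failure": bool((value >> 31) & 1),   # bit 31 set means failure
        "facility": facility,
        "facility_name": FACILITY_NAMES.get(facility, "unknown"),
        "code": value & 0xFFFF,               # bits 0-15
    }


if __name__ == "__main__":
    # The code reported in event 20067.
    print(decode_hresult(0x800B010A))
```

For 0x800B010A the facility is 11 (FACILITY_CERT), which fits the description text in the event: the problem lies in the certificate chain, not in the network connection between agent and server.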
It was clear that the agent could not communicate with the Management Server in the untrusted domain using certificates, so we needed to check whether the certificates were OK. In this case Certutil turned out to be our friend ;-). Certutil.exe is a command-line program that is installed as part of Certificate Services in the Windows Server 2003 family (and later). Here are the steps we took to verify that there was a certificate issue, and how we solved it.
Issue: Agents that need a certificate to communicate with the Management Server generate “A certificate chain could not be built to a trusted root authority” errors (event IDs 20067, 20070, 21016) in the Operations Manager event log.
Reason: The proxy settings were wrong, so the (intermediate) root CA could not be contacted.
See the following line in the output of certutil -urlfetch -verify <cert.cer>:
Failed "AIA" Time: 0
Error retrieving URL: The server name or address could not be resolved 0x80072ee7 (WIN32: 12007)
http://cert.domain.local/aia/SUPPORT.WEB%20ROOT%20CA.crt
For the complete certutil output, see the attachment certutil_output.txt.
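Because the full certutil output can be long, it may help to pull out only the failed URL retrievals. The sketch below is a rough Python helper for that. It assumes the output contains entries shaped like the "Failed ... Error retrieving URL ..." line quoted above; other certutil versions may format this differently. The function names are hypothetical, and run_certutil only works on a Windows machine where certutil.exe is on the PATH.

```python
import re
import subprocess

# Pattern modeled on the failure entry quoted in this post (an assumption,
# not a documented certutil output format). \s+ lets it match whether the
# entry is on one line or spread over several.
FAILED_FETCH = re.compile(
    r'Failed "(?P<kind>[^"]+)" Time: \d+\s+'
    r'Error retrieving URL: (?P<message>.+?) (?P<hresult>0x[0-9a-fA-F]+) '
    r'\((?P<win32>[^)]*)\)\s+(?P<url>\S+)'
)


def find_failed_fetches(output: str) -> list:
    """Return one dict per failed AIA/CDP retrieval found in the output."""
    return [m.groupdict() for m in FAILED_FETCH.finditer(output)]


def run_certutil(cert_path: str) -> str:
    """Run 'certutil -urlfetch -verify <cert>' and return its text output."""
    result = subprocess.run(
        ["certutil", "-urlfetch", "-verify", cert_path],
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Sample taken from the output line quoted in this post.
    sample = (
        'Failed "AIA" Time: 0\n'
        '    Error retrieving URL: The server name or address could not be '
        'resolved 0x80072ee7 (WIN32: 12007)\n'
        '    http://cert.domain.local/aia/SUPPORT.WEB%20ROOT%20CA.crt\n'
    )
    for failure in find_failed_fetches(sample):
        print(failure["kind"], failure["hresult"], failure["url"])
```

A name-resolution failure like this one on an AIA URL suggests the machine cannot reach the CA's publication point, which is consistent with the proxy problem described above.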
Steps to solve issue:
I got this error when I went through the above steps:
Error retrieving URL: More data is available. 0x800700ea (WIN32/HTTP: 234)
And I am still getting the 20070 error. I have verified the certificate, and have even reinstalled and gone through the gateway setup process.