Here's a troubleshooting tip for anyone struggling to work out what is behind the message: "Error in the sitedata webservice"

Say you are crawling a site and you receive the above error. What can you do to find out what it means?

1. The nice and easy approach:

You can set the logging to verbose on the following categories:

GatherStatus
MS Search Indexing
PHSts

Open the ULS log with ULS Viewer (http://code.msdn.microsoft.com/ULSViewer), look around the time the error occurred, and watch for messages like:

GetWebDefaultPage fail

Init fails

InitURLType fails

Also search for the phrase "Return error to caller" and, of course, the obvious "error" and "exception" keywords.

For a list of all the crawl-related categories, see my post: Indexing process verbose logging.
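If you would rather not eyeball the log, the keyword search can be scripted. The following is only a minimal sketch: the function names are made up for illustration, and it assumes the ULS log is a plain text file that is either UTF-16 with a byte order mark or UTF-8. It simply prints every line that contains one of the phrases listed above.

```python
import re
import sys
from pathlib import Path

# Phrases the post suggests looking for in the ULS log.
PHRASES = [
    "GetWebDefaultPage fail",
    "Init fails",
    "InitURLType fails",
    "Return error to caller",
    "error",
    "exception",
]

# One case-insensitive pattern that matches any of the phrases.
PATTERN = re.compile("|".join(re.escape(p) for p in PHRASES), re.IGNORECASE)


def read_log_lines(path):
    """Read a log file into a list of lines (assumption: UTF-16 with BOM, else UTF-8)."""
    data = Path(path).read_bytes()
    encoding = "utf-16" if data[:2] in (b"\xff\xfe", b"\xfe\xff") else "utf-8"
    return data.decode(encoding, errors="replace").splitlines()


def find_suspicious_lines(lines):
    """Yield (line_number, text) for every line containing one of PHRASES."""
    for number, text in enumerate(lines, start=1):
        if PATTERN.search(text):
            yield number, text


if __name__ == "__main__":
    # Hypothetical usage: python scan_uls.py <one or more ULS log files>
    for log_path in sys.argv[1:]:
        for number, text in find_suspicious_lines(read_log_lines(log_path)):
            print(f"{log_path}:{number}: {text}")
```

Expect plenty of noise from the generic "error" keyword; narrowing the output to the minutes around the failed crawl is still up to you.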

2. The network way:

You could get a network trace from the indexer if you are crawling a remote site. (This won't help much if your indexer crawls itself, since all the traffic goes through the loopback and most sniffers would miss it.)

Look in the trace for HTTP traffic, mostly SOAP actions.
See if you can spot SOAP responses that contain the keyword "error".

Ex: filter for Network Monitor:

(http | SOAP) & (contains(Http.payload,"error") | http.Response.StatusCode==500)
 

3. The down and dirty way:

Enable .NET tracing as per http://support.microsoft.com/kb/947285#

Create the file mssdmn.exe.config under %ProgramFiles%\Microsoft Office Servers\12.0\Bin (mind the path if you changed the default installation location).


============FILE CONTENTS=============

<?xml version="1.0" encoding="utf-8"?>
<configuration>
 <system.diagnostics>
  <trace autoflush="true" />
  <sources>
   <source name="System.Net">
    <listeners>
     <add name="System.Net"/>
    </listeners>
   </source>
   <source name="System.Net.HttpListener">
    <listeners>
     <add name="System.Net"/>
    </listeners>
   </source>
   <source name="System.Net.Sockets">
    <listeners>
     <add name="System.Net"/>
    </listeners>
   </source>
   <source name="System.Net.Cache">
    <listeners>
     <add name="System.Net"/>
    </listeners>
   </source>
  </sources>
  <sharedListeners>
   <add
    name="System.Net"
    type="System.Diagnostics.TextWriterTraceListener"
    initializeData="C:\Tracing\mssdmn_trace.log"
    traceOutputOptions="DateTime" />
  </sharedListeners>
  <switches>
   <add name="System.Net" value="Verbose" />
   <add name="System.Net.Sockets" value="Verbose" />
   <add name="System.Net.Cache" value="Verbose" />
   <add name="System.Net.HttpListener" value="Verbose" />
  </switches>
 </system.diagnostics>
</configuration>


============END FILE CONTENTS=======

Make sure that the output folder (in our case C:\Tracing) exists, and restart the NT service:

net stop osearch & net start osearch
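The folder check is easy to forget, so it can be scripted and run before the restart. The sketch below is only an illustration under stated assumptions: the function name is hypothetical, and it assumes the config follows the layout shown above, with the trace file named in an initializeData attribute. It reads the config, works out the folder of each trace file, and creates any folder that is missing.

```python
import os
import sys
import xml.etree.ElementTree as ET
from pathlib import PureWindowsPath


def trace_folders(config_bytes):
    """Return the folder of every file named by an initializeData attribute."""
    root = ET.fromstring(config_bytes)
    folders = []
    for listener in root.iter("add"):
        target = listener.get("initializeData")
        if target:
            # PureWindowsPath parses Windows paths on any platform.
            folders.append(str(PureWindowsPath(target).parent))
    return folders


if __name__ == "__main__":
    # Hypothetical usage: python check_trace_folder.py <path to mssdmn.exe.config>
    with open(sys.argv[1], "rb") as handle:
        for folder in trace_folders(handle.read()):
            if os.path.isdir(folder):
                print(f"OK: {folder} exists")
            else:
                os.makedirs(folder)
                print(f"Created {folder}")
```

Run it on the indexer itself, under an account that is allowed to create the folder.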

...and start hunting

See the following blog post for pointers on how to analyze the traces and what information you can get out of them: http://blogs.msdn.com/dgorti/archive/2005/09/18/471003.aspx

For detailed steps on tracing configuration, see the KB article http://support.microsoft.com/kb/947285#