I recently came across a very nice troubleshooting methodology while trying to debug some authentication issues that were occurring during a SharePoint 2010 crawl. I was getting errors and was also having difficulty pulling the information I needed out of the crawl log about some other issues that were occurring. Strangely enough, enter Fiddler to the rescue (www.fiddler2.com).
I’m sure most everyone is familiar with Fiddler, so I won’t bother explaining it here. The trick, though, was getting it to capture what was happening during the crawl. A very slick way to do this, I found, is to configure Fiddler as a reverse proxy for the crawl account. The instructions for configuring Fiddler as a reverse proxy can be found here: http://www.fiddler2.com/Fiddler/help/reverseproxy.asp. The way I used it was as follows:
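For reference, the reverse-proxy setup described at that link boils down to pointing the crawler at Fiddler's listening port and having Fiddler forward the traffic on to the real site. A rough sketch of the CustomRules.js tweak (Rules > Customize Rules in Fiddler), where "webserver" is a placeholder for your actual host name, not something from the original post:

```js
// FiddlerScript (CustomRules.js) — reverse-proxy sketch.
// Requests that arrive addressed to Fiddler's listening port
// (8888 by default) are rewritten to target the real site on port 80,
// so every crawler request passes through Fiddler's session list first.
static function OnBeforeRequest(oSession: Session) {
    if (oSession.host.toLowerCase() == "webserver:8888") {
        oSession.host = "webserver:80";
    }
}
```

With something like that in place, you point the crawler at webserver:8888 and watch each request, including its authentication handshake, appear in Fiddler before it is forwarded to the site.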
I had isolated my trouble sites into a separate content source, so once I followed these steps I was able to see each request from the crawler to that content source, how it was authenticating, and what was happening. Overall it proved very useful in understanding much more clearly what was going on during the crawl of those sites.
Very few people realize how useful a proxy server can be. For example, you can also have it intercept SOAP conversations if necessary. WAY easier to read than a network trace.
I found your tip really useful and I'm trying to apply it, but I can't understand how you got it working without changing the port of the website you want to crawl in the content source?
This article might help you troubleshoot your issue.