Recently I dealt with a problem where a PDF file downloaded from a certain external web site was always corrupted, and I would like to talk about how I troubleshot that problem. The client was connected to the internet through a four-node TMG 2010/SP2 array.
We decided to collect the following logs to better understand why the file was corrupted:
- Network trace on the internal client
- TMG data packager on one of the TMG servers
(Since the problem was reproducible with any of the TMG servers set as the proxy server, we pointed the client at a single array member so that we would have fewer logs to collect)
Note: The TMG Data Packager is installed as part of the TMG Best Practices Analyzer installation
Microsoft Forefront Threat Management Gateway Best Practices Analyzer Tool
The results from the log analysis were as below:
- There weren’t any connectivity problems in the TCP sessions through which the file was downloaded, according to the network traces collected on the client and on the internal and external interfaces of the TMG server
- The Web proxy log showed error code 13 for the given file download:
Destination Host IP
Note: IP addresses/links/proxy names etc are deliberately changed
Error 13 is “The data is invalid”:
C:\>net helpmsg 13
The data is invalid.
So the TMG server considered the received data invalid, which also explains why the downloaded file was corrupted.
Then I decided to take a look at the ETL trace, which was also collected with the TMG Data Packager. The root cause of why the TMG server considered the data invalid was clearly visible there:
... GZIP Dempression failed. Drop the request. (connection closed=0) 0x8007000d(ERROR_INVALID_DATA)
Because the GZIP decompression of the file failed on the TMG server, the TMG server ended the session with ERROR_INVALID_DATA (error 13).
Note: You have to contact Microsoft support for ETL trace conversion
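The link between the 0x8007000d value in the ETL trace and error 13 in the Web proxy log can be checked by splitting the HRESULT into its standard fields. The following is a minimal Python sketch, not part of TMG or its tooling; the function name decode_hresult is made up for illustration:

```python
def decode_hresult(hresult: int) -> dict:
    """Split a 32-bit HRESULT into its severity, facility, and code fields."""
    hresult &= 0xFFFFFFFF
    return {
        "failure": bool(hresult >> 31),        # top bit set means failure
        "facility": (hresult >> 16) & 0x7FF,   # 11-bit facility field
        "code": hresult & 0xFFFF,              # low 16 bits carry the error code
    }


FACILITY_WIN32 = 7  # facility used when a Win32 error is wrapped in an HRESULT

fields = decode_hresult(0x8007000D)
print(fields)  # {'failure': True, 'facility': 7, 'code': 13}
```

Facility 7 is FACILITY_WIN32, so the low 16 bits (13) are the same Win32 error code that net helpmsg 13 describes.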
Note: You can also collect a similar diagnostics log from TMG server’s console:
(Before reproducing the problem you have to enable logging from “Enable Diagnostic Logging” and once the problem is reproduced you have to disable logging by selecting “Disable Diagnostic Logging”)
For troubleshooting purposes, I suggested turning off compression on the TMG server:
(We removed “External” from the “Request compressed HTTP content when sending requests to these network elements” section.)
As expected, the corrupted file download problem was resolved. With the above configuration change, we tell the TMG server not to ask for compression when sending HTTP requests out to external web servers, so the file was downloaded in uncompressed format. Please note that by default the TMG server asks for compression in HTTP requests sent to external web sites, which saves some bandwidth by reducing the amount of data transferred.
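To make the difference concrete, here is a minimal Python sketch of the general idea, assuming standard HTTP behaviour where a compressed response is marked with Content-Encoding: gzip. It is not TMG's actual implementation, and the names decode_body and is_valid_gzip are hypothetical. A damaged gzip body cannot be decoded, while a body that was never compressed is passed through as-is:

```python
import gzip
import zlib
from typing import Optional


def decode_body(body: bytes, content_encoding: Optional[str]) -> bytes:
    """Return the decoded body; decompress only when the response says gzip."""
    if content_encoding and content_encoding.lower() == "gzip":
        return gzip.decompress(body)  # raises if the gzip stream is damaged
    return body  # no compression requested or applied: pass through unchanged


def is_valid_gzip(body: bytes) -> bool:
    """Check whether a byte string is a complete, decodable gzip stream."""
    try:
        gzip.decompress(body)
        return True
    except (OSError, EOFError, zlib.error):
        return False


# Sample data standing in for a PDF; not taken from the real download.
original = b"%PDF-1.4 sample content\n" * 50
compressed = gzip.compress(original)
truncated = compressed[:-12]  # simulate a stream damaged in transit

print(is_valid_gzip(compressed))
print(is_valid_gzip(truncated))
print(decode_body(original, None) == original)
```

In this sketch the intact stream decodes, the truncated one is rejected, and the uncompressed body is returned without any decoding step that could fail.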
We decided that the problem was somehow related to the target web site or upstream Web proxy because the same TMG server was able to successfully download HTTP content in compressed format from other external web sites.
Normally it’s possible to turn off compression for a specific web site (this can be configured from the “Exceptions” tab in the above screen shot). But the TMG array in question was configured to use an upstream proxy for all external web traffic, so creating an exception wouldn’t make much difference here. Our customer decided to keep HTTP compression off (and re-enable it once the file downloads from the given web site were finished).
Hope this helps