One of the more subtle aspects of converting (n)varchar or (n)text data to XML is the fact that XML is choosy about which characters are permitted and (n)/varchar/(n)text is not. Any T-SQL programmer who runs conversions of this type is likely to run into this issue. Here's a code block that resolves the issue.
The characters in question are what are commonly called "lower-order ASCII" characters, those below CHAR(32). Of these, only the TAB (CHAR(9)), LF (CHAR(10)), and CR(CHAR(13) are valid within XML. This solution uses trigger code to call a user-defined function to scrub the nvarchar columns, and a loop within the trigger to an ntext column.
Here's the UDF code:
Here's the trigger code:
The trigger code is built to maximize performance in that the NULLIF tests in the UPDATE statement will only run the (relatively expensive) UDF if the inserted and deleted images of a particular column differ (if they don't differ, we can guarantee that the value has already been scrubbed). The UDF and the loop in the trigger for the ntext SupplementDescription column employ the same basic strategy of looping through the source value looking for any invalid character and replacing it with a new character (NCHAR(164)) until the last invalid character is found.
This code was developed for a SQL Server 2000 environment. It would function in a SQL Server 2005 environment, but better performance would likely be had with a CLR-based UDF.