Full text search has been around in SQL Server since the 2000 edition and has gradually evolved in each subsequent release and it is included in SQL Server express so you don’t even have to buy it! In CTP6 (February) the latest revision appeared, The basic difference is that there is no longer a separate service and all of the indexes etc. are included in the database. This has two important advantages:
Indexing a big lump of unicode text in a column on a table requires little effort, but the real power of this feature is its ability to index various types of document stored in the database in varbinary(max) columns. For this to work the engine has to understand the type of document that’s in there and to match this to a set of filters. There are in built filters for office documents PDF’s etc. However When SQL Server 2005 came out it came with a base set of these filters for the products around at the time such as Office 2003. Since then Office 2007 has shipped and we have the new file types such as docx for word and xlsx for excel etc. The good news is that you can add to the set of filters and the new office 2007 filters are here. The not so good news is that if you’re working on SQL Server 2008 you’ll need to do the same thing.
How does all this work? Let’s have a look at the Adventure Works Production.document table as installed (note I am using SQL Server 2008):
In this table the document itself is in the [document] column and its type is in the aptly named [FileExtension] column. Because the table has not been setup for full text searching all of the options for it are greyed out in the table designer. To fix this right click on the table and start the wizard:
and select design full text index. The wizard will take you though what you need to do. For this demo I:
So now the index is ready to use and in my next post I’ll show you how to use it. In the meantime there are lots of resources on this including: