Search in Exchange 2007 (Exchange Search) is completely different from earlier versions of Exchange. Improvements were made to performance, Content Indexing, and Search. New items are indexed almost immediately after they arrive in a mailbox, giving end users a faster, more reliable, and more stable search function. In Exchange 2000 and Exchange 2003, indexing was not enabled by default. In Exchange 2007, indexing is enabled by default on all mailbox databases, and no initial setup or configuration is required.
In Exchange 2007, the indexes are held in a catalog, which is created in the same location as the database files. Extensive research and experience show that the catalog can be expected to take 5-10 percent of the database size. This is a great improvement over Exchange 2003, which created a content index that was 35-45 percent of the database size.
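To make those percentages concrete, here is a minimal sketch of the arithmetic in Python. The 200 GB database size and the function name are hypothetical and chosen only for illustration; the percentage ranges are the ones quoted above.

```python
def catalog_size_range_gb(database_gb, low_pct, high_pct):
    """Return the (low, high) expected catalog size in GB for a database."""
    return (database_gb * low_pct / 100, database_gb * high_pct / 100)


database_gb = 200  # hypothetical mailbox database size, for illustration only

# Percentage ranges quoted in the text above.
exchange_2007 = catalog_size_range_gb(database_gb, 5, 10)
exchange_2003 = catalog_size_range_gb(database_gb, 35, 45)

print(f"Exchange 2007 catalog: {exchange_2007[0]:.0f}-{exchange_2007[1]:.0f} GB")
print(f"Exchange 2003 catalog: {exchange_2003[0]:.0f}-{exchange_2003[1]:.0f} GB")
```

For this hypothetical 200 GB database, the Exchange 2007 catalog would be expected to land between 10 and 20 GB, compared with 70 to 90 GB for an Exchange 2003 content index.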
Note that for planning purposes, you should reserve 20 percent of the database size, since the catalog can momentarily double in size during a Master Index Merge. MSSearch maintains the inverted index for a particular mailbox database. It is more efficient for MSSearch to add to a small index than to a large one, so it creates new small indexes on a regular basis as part of its normal operation. At regular intervals, MSSearch merges the smaller indexes into the master index. This process is known as the Master Merge and is completely different from a re-index.
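The idea of adding to small indexes and periodically folding them into a master index can be illustrated with a toy inverted index in Python. This is only a conceptual sketch, not how MSSearch is implemented: the function names, the in-memory dictionaries, and the sample messages are all assumptions made for illustration.

```python
from collections import defaultdict


def build_index(items):
    """Build an inverted index: word -> set of item IDs containing it."""
    index = defaultdict(set)
    for item_id, text in items.items():
        for word in text.lower().split():
            index[word].add(item_id)
    return index


def master_merge(master, small_indexes):
    """Fold every small index into the master index, then empty the list."""
    for small in small_indexes:
        for word, item_ids in small.items():
            master[word] |= item_ids
    small_indexes.clear()
    return master


# The master index covers items that were indexed earlier.
master = build_index({1: "quarterly budget review", 2: "budget approved"})

# New items go into cheap, small indexes instead of the large master.
small_indexes = [
    build_index({3: "review the budget draft"}),
    build_index({4: "draft approved"}),
]

# The periodic merge combines the small indexes with the master.
master_merge(master, small_indexes)
print(sorted(master["budget"]))  # item IDs containing "budget"
```

Note that the merge only combines postings that already exist; no message is read or tokenized again, which is one way to see why a merge is different from a re-index.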
Indexing is triggered by event notifications from the information store whenever an item is new or changed. This means that the catalog should never be more than a few minutes out of date. On average, new messages are indexed within ten seconds, and a query finds the new messages within seconds. This process is totally new for Exchange 2007. In older versions of Exchange, the catalog was never current until it was brought up to date by periodic indexing, which ran in off-hours on a schedule and could take several hours if there were many changes. That meant that in Exchange 2003, new items might not appear in a search until up to a week later, depending on the crawling schedule. In Exchange 2007, new items appear in search results shortly after their arrival in the database. Note that the word crawling generally refers to the first time a database is indexed, although it is sometimes used as a synonym for indexing.
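The difference between notification-driven indexing and a scheduled crawl can be sketched in a few lines of Python. The `Catalog` and `MailboxStore` classes below are hypothetical stand-ins invented for this illustration; they are not Exchange APIs and do not reflect how the information store actually delivers its notifications.

```python
class Catalog:
    """Toy catalog: maps each word to the IDs of the items containing it."""

    def __init__(self):
        self.words = {}

    def index_item(self, item_id, text):
        for word in text.lower().split():
            self.words.setdefault(word, set()).add(item_id)

    def search(self, word):
        return sorted(self.words.get(word.lower(), set()))


class MailboxStore:
    """Toy store that notifies subscribers whenever an item is delivered."""

    def __init__(self):
        self.items = {}
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def deliver(self, item_id, text):
        self.items[item_id] = text
        for notify in self.subscribers:
            notify(item_id, text)  # event notification for the new item


event_catalog = Catalog()      # updated through notifications
scheduled_catalog = Catalog()  # updated only when a crawl runs

store = MailboxStore()
store.subscribe(event_catalog.index_item)

store.deliver(1, "Lunch on Friday")

found_by_event = event_catalog.search("friday")          # found right away
found_before_crawl = scheduled_catalog.search("friday")  # not indexed yet

# The scheduled approach only catches up when the periodic crawl runs.
for item_id, text in store.items.items():
    scheduled_catalog.index_item(item_id, text)
found_after_crawl = scheduled_catalog.search("friday")

print(found_by_event, found_before_crawl, found_after_crawl)
```

In this sketch the notification-driven catalog can answer the query as soon as the item is delivered, while the scheduled catalog returns nothing until its crawl has run.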
Exchange 2007 does not perform Content Indexing of the public folder store. Additional information on searching and indexing Exchange public folders with Exchange 2007 can be found in Part 3 of this blog series, in the section on Public Folders in Exchange 2007 Search - Part 3: The Search Process.
The performance of Exchange Search has dramatically improved in Exchange 2007. Several improvements were made to optimize its use of system resources such as CPU, memory, disk I/O, and the disk space required for its indexes. A new throttling feature in Exchange Search automatically backs off indexing (throttles) for a particular MDB or set, reducing disk I/O and CPU utilization. For further information on the new throttle mechanism for Exchange 2007 content indexing, see the following reference:
For performance reasons, many administrators were reluctant to use content indexing on Exchange 2003 mailbox databases. In Exchange 2007, the performance cost of content indexing is generally low, and content indexing is now recommended as a best practice for all Exchange 2007 mailbox databases. Together, these improvements have made Exchange 2007 Search much faster and much less resource intensive than previous versions.
In addition to the speed gains, improvements were made to the client search experience. There is now an easily accessible search bar in OWA 2007 and Query Builder support in Outlook 2007. Also new to Exchange 2007 Search is the ability to search attachments in Outlook online mode and OWA. Note that Exchange 2007 Search searches only the attachment types supported by the installed filters. Filters will be covered in more detail in Part 2 of this blog series, in the section on the Microsoft Search Filter Daemon in Exchange 2007 Search - Part 2: Content Indexing.
Windows Desktop Search (WDS), which is a file system indexing application, can also index mail items located on the Exchange server and in the user's local OST/PST file. When Outlook 2007 is used with WDS installed, searches performed within Outlook use the local index on the user's machine to query and return search results.
This first blog post provided an overview of Exchange 2007 Search and Content Indexing. Exchange Search is automatically installed and configured for all mailbox databases, and is significantly faster than previous versions. The catalog is now typically up to date within seconds rather than days behind, it is much smaller than in previous versions, and Exchange 2007 Search is much less resource intensive.
Exchange 2007 Search includes new features such as searching attachments and a new search bar for OWA 2007. Exchange 2007 Content Indexing is now recommended as a Best Practice for all Exchange 2007 mailbox databases.
The second post in this blog series will review Exchange 2007 Content Indexing, explain the crawling process, examine the components of Exchange 2007 Content Indexing, show how to implement a Noise Word file in Exchange 2007, and detail Content Indexing and Exchange 2007 High Availability.
The third post will detail the Exchange 2007 Search process, list the basic Exchange 2007 Search methods, list further details for Mailbox Search, explain the differences between Exchange Search and Store Search in online mode, examine the client search process without WDS and with WDS installed, and present how to search and index Exchange 2007 public folder databases. The third post will also contain useful links that provide additional information.
-- Bob Want and Jack French