Want to know more about data mining? Try CRISP-DM. No, it’s not a breakfast cereal; it’s a methodology for your data mining projects. In case you hadn’t realized it, SQL Server 2005 includes a great deal of data mining functionality for clustering and forecasting. CRISP-DM (Cross-Industry Standard Process for Data Mining) is a step-by-step process model to follow on those projects.

A typical data mining project consists of six phases. It begins with Business Understanding: determine what the project objectives are and then translate them into a data mining problem. The next two phases, Data Understanding and Data Preparation, consist of identifying the pertinent attributes and cleaning the data where needed. Modeling consists of building models and, if necessary, going back for more data. For example, when you build a decision tree you might also build a clustering model and a Naïve Bayes model. In the Evaluation phase you validate the model, for example with a lift chart, and make sure it meets the original business objectives. Finally, in Deployment, the model is put to use in production.
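
To make the Evaluation phase a little more concrete, here is a minimal sketch of the arithmetic behind a lift chart, written in plain Python. The function name cumulative_lift and the sample scores and outcomes are made up for illustration; this is not how SQL Server draws its chart, just the general idea: rank cases by model score, then compare the hit rate in the top slice against the overall hit rate.

```python
def cumulative_lift(scores, labels, buckets=10):
    """Return the cumulative lift at each bucket boundary.

    scores: model scores, higher meaning "more likely positive" (hypothetical data)
    labels: actual outcomes, 1 for positive and 0 for negative
    """
    if not scores or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and the same length")
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("need at least one positive case to compute lift")

    # Rank the actual outcomes from highest score to lowest.
    ranked = [label for _, label in
              sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)]
    n = len(ranked)
    overall_rate = total_positives / n

    lifts = []
    for b in range(1, buckets + 1):
        cutoff = round(n * b / buckets)
        if cutoff == 0:
            continue  # too few cases to fill this bucket
        top_rate = sum(ranked[:cutoff]) / cutoff
        lifts.append(top_rate / overall_rate)
    return lifts


if __name__ == "__main__":
    # Made-up example: ten customers, three of whom actually responded.
    sample_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
    sample_labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
    for i, lift in enumerate(cumulative_lift(sample_scores, sample_labels, buckets=5), 1):
        print(f"top {i * 20}% of cases: lift {lift:.2f}")
```

A lift above 1 in the early buckets means the model finds positives faster than picking cases at random; by the last bucket the lift always comes back to 1, because every case has been included.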

CRISP-DM is a generic methodology and should help guide you on your data mining adventures. More information is available at http://www.crisp-dm.org/

Val Matison