Introduction to the Data Warehouse: Custom Fact Tables, Dimensions and Outriggers

Introduction to the Data Warehouse: Custom Fact Tables, Dimensions and Outriggers

  • Comments 2
  • Likes

So you’ve deployed the data warehouse and tried out the reports but now you want to want to make it your own. You may be trying to recreate some reports you’ve been using forever and now need to run them against the Service Manager platform, or perhaps you simply want to take full advantage of the customizations you’re doing in the Incident or Change Management solutions and want those changes to flow through to the reports. Either way, if you’re stumped on how to proceed our latest posts from the Platform team will help you extend and customize the Data Warehouse to enable the in-depth analyses you’re aiming for.

Danny Chen, one of the Developers on our common Platform team, did a great job writing up how to create fact tables, dimensions and outriggers. If you’re not familiar with data warehousing principles, I’ll provide some clarity as to what these terms mean and how they apply to Service Manager below. If you understand the principles well enough and are chomping at the bit to dig into the details, here are the links:

1. A Deep Dive on Creating Relationship Facts in the Data Warehouse

2. A Deep Dive on Creating Outriggers and Dimensions in the Data Warehouse

Principles behind the Platform: Dimensional modeling and the star schema

The data warehouse is a set of databases and processes to populate those databases automatically. At a high level, the end goal is to populate the data mart where users will run reports and perform analyses to help them manage their business. We keep this data around longer in the warehouse than in the CMDB because it’s usefulness for trending and analysis generally outlives it’s usefulness for normal transactional processing needs.

A data warehouse is optimized for aggregating and analyzing a lot of data at once in a lot of different, unpredictable ways. This differs from transactional processing systems which are optimized for write access on few records in any given transaction , and those transactions are more predictable in behavior.

To optimize the data warehouse for performance and ease of use, we use the Kimball approach to dimensional modeling. What this means to you is that tables in the DWDataMart database are logically grouped into subject matter areas which resemble a star when laid out in a diagram, so these groupings are often called “star schemas”.

  1. In the center of the star is a Fact table. Fact tables represent relationships, measures & key performance indicators. They are normally long and skinny as they have relatively few columns but contain a large number of transactions.
  2. The fact table joins to Dimension tables, which represent classes, properties & enumerations. Dimension tables usually contain far fewer rows than fact tables but are wider as they have the interesting attributes by which users slice and dice reports (ie status, classifications, date attributes of a class like Created Date or Resolved Date, etc).
  3. An outrigger is a special kind of dimension table which hangs off another dimension table for performance and/or usability reasons.

Generalized representation of a star schema:

Star Schema

Consider what a star schema for a local coffee shop might look like. The transactions are the coffee purchases themselves, whereas the dimensions might include:

  1. Date dimension (to rollup the transaction by both gregorian and fiscal calendars)
  2. Customer dimension (bought the coffee)
  3. Employee dimension (made the coffee)
  4. Product dimension (espresso, drip, latte, breve,  etc etc…. and this could get quite complicated if you track the details of Seattleites drink orders)
  5. Store dimension
  6. And more

What measures might the fact table have? You could easily imagine:

  1. Quantity sold
  2. Price per Unit
  3. Total Sales
  4. Total Discounts
  5. etc

IT processes aren’t so different from the local coffee shop when it comes time to designing your dimensional model. There are a set of transactions which happen, like incident creation/resolution/closure which produce some interesting and useful metrics (time to resolution, resolution target adherence, billable time incurred by analysts, duration in status, etc).

When thinking about extending and customizing your data warehouse, think about the business questions you’d like to be able to answer, read up on dimensional modeling for some tips on best practices, and then check out Danny’s posts on creating fact tables, dimensions and outriggers for the technical know-how.

 

And of course, we’re always here to help so feel free to send me your questions.

Your comment has been posted.   Close
Thank you, your comment requires moderation so it may take a while to appear.   Close
Leave a Comment
  • What are the key measurements in a Greenplum Data Warehouse?

  • What do you do if you want a fact table of which you don't have a class(for example TierQueues and Status(of which there is a fact table)), but you do have an enum-field? Current entering of incidents is not to be changed, only the underlying database(s)

    greetings