Robert's SQL Blog

My thoughts on SQL Server, PowerShell and Microsoft products in general

July, 2012

  • I got my MCM Lab result.... And I failed!

    A couple of weeks ago I got the result from my SQL 2008 MCM Lab exam and I failed.

    I have been working with SQL Server for almost 12 years. I have been a developer, DBA and architect, and last year I joined Microsoft to become a Premier Field Engineer on SQL Server. This means I have broad experience with several aspects of SQL Server, and since joining Microsoft I have learned something new every week, if not every day.

    So how did I prepare for the MCM Lab exam? Well, I did not. And I chose not to. I wanted to know where I stood. Talking to other people who were thinking of doing the MCM and reading through the materials, I was not sure what to expect. And for me that makes it hard to prepare. Although one could argue this is exactly what you would do in real life, it just doesn't feel the same. Failing the MCM Lab exam made me more experienced. Every day I waited for the result, I understood better what I did wrong and how it should have been done. So even though I failed, I have learned a lot. I think you can prepare yourself perfectly by going through the material listed on the MCM site. I just didn't have the time to go through all of it, although I did watch some of the videos. So in a sense I did prepare myself, but for me, every day is preparation for the MCM. I get challenged every week by different customers, with different SQL Server setups and different challenges. Some are hard, some are easy, some bring up a lot of discussion and some are just a lot of fun, but every engagement has made me more knowledgeable on SQL.

    Of course I am retaking the exam. And the fact that I did not pass but did well strengthens my belief that I can achieve this. But this time I will come prepared. I know my weak points. This means I can develop a proper battle plan. I will watch a few of the MCM videos, especially those in my weak areas (http://technet.microsoft.com/en-us/sqlserver/ff977043.aspx). I will set up a lab myself and try out some scenarios I think are common or hard to set up. Being able to apply knowledge from experience helps you get the job done.

    I love the exam. It tests your skills so much more than the "regular" exams. But then again, this is the master exam. If you are pursuing the exam yourself, challenge yourself with peers; get a group of people together trying to achieve the same goal. And if you think you are ready, go for it. The worst thing that could happen is that you fail. But that wouldn't be the end of the world. You can always retry.

  • Using table variables (or temp tables) to speed up… or slow down

    In SQL Server you can use temporary tables to store intermediate results. This is a commonly used technique to speed up query processing. Recently I came across a problem where the temporary tables were causing the performance degradation. I'm not going into the different types of temporary tables; there are good posts on those. I'm going to walk you through a real-life example and show that the most given answer in the SQL Server world holds true: "It Depends".

    In this case the desired result was a single stored procedure that would return multiple result sets based on some or no search criteria. When a user navigated to the page, the complete set was returned with only one filter applied. The result consisted of a row count, a paged result set and some distinct fields from the total results to provide additional filtering. Think of these as the slicers and dicers you use in Excel. I will demonstrate the general idea using the AdventureWorks database; I'm using the 2008 version here.

    DECLARE @t table (id int IDENTITY, salesorderid int)

    INSERT INTO @t (salesorderid)

    SELECT soh.SalesOrderID FROM Sales.SalesOrderHeader soh

    INNER JOIN Purchasing.ShipMethod sm ON soh.ShipMethodID = sm.ShipMethodID

    WHERE sm.Name = 'CARGO TRANSPORT 5'

    ORDER BY soh.OrderDate DESC;

     

    -- 1 Get Rowcount

    SELECT COUNT(*) AS TotalRecords FROM @t;

     

    -- 2 Get Current page

    SELECT soh.OrderDate, soh.SubTotal, st.Name, p.LastName, c.AccountNumber FROM Sales.SalesOrderHeader soh

    INNER JOIN @t t ON soh.SalesOrderID = t.salesorderid

    INNER JOIN Sales.SalesTerritory st ON soh.TerritoryID = st.TerritoryID

    INNER JOIN Sales.SalesPerson sp ON soh.SalesPersonID = sp.BusinessEntityID

    INNER JOIN Sales.Customer c ON soh.CustomerID = c.CustomerID

    INNER JOIN Person.Person p ON sp.BusinessEntityID = p.BusinessEntityID

    WHERE t.id BETWEEN 1 AND 20

    ORDER BY soh.OrderDate DESC;

     

    -- 3 Get SalesTerritory values

    SELECT DISTINCT st.TerritoryID, st.Name FROM Sales.SalesTerritory st

    INNER JOIN Sales.SalesOrderHeader soh ON st.TerritoryID = soh.TerritoryID

    INNER JOIN @t t ON soh.SalesOrderID = t.salesorderid

    ORDER BY st.Name;

     

    -- 4 Get OrderDate Year values

    SELECT DISTINCT YEAR(soh.OrderDate) as OrderDateYear FROM Sales.SalesOrderHeader soh

    INNER JOIN @t t ON soh.SalesOrderID = t.salesorderid

    ORDER BY OrderDateYear;

     

    In preparation, a table variable is filled with the filtered set of IDs, sorted in the desired order. The first query retrieves the row count for the set. The second query returns the actual result set the user will see in the application; in this case it's the first page with 20 results. Steps 3 and 4 return additional results that the application uses to provide quick filters for the user. When I run this batch on my database I get 4 sets with a total of 35 rows, at a cost of 25628 reads. That seemed like a lot to me. The execution plan showed some missing indexes and some clustered index scans here and there, so I added a few indexes. If you are following along, here they are:

    CREATE NONCLUSTERED INDEX temp_SALESORDERHEADER_ShipMethodId ON Sales.SalesOrderHeader(ShipMethodId) INCLUDE(SalesOrderID, OrderDate);

    CREATE NONCLUSTERED INDEX temp_SALESORDERHEADER_Select ON Sales.SalesOrderHeader(SalesOrderId, OrderDate) INCLUDE(CustomerID, SalesPersonID, TerritoryID, SubTotal);

    CREATE NONCLUSTERED INDEX temp_SALESORDERHEADER_Territory ON Sales.SalesOrderHeader(SalesOrderID, TerritoryID);

    CREATE NONCLUSTERED INDEX temp_TERRITORY_Name ON Sales.SalesTerritory(TerritoryID, Name);

    CREATE NONCLUSTERED INDEX temp_PERSON_Lastname ON Person.Person(LastName);

     

    This got my reads down to 900. That goes to show you need a proper indexing strategy. But still, more work could be done. One of the first things I tried was using a temporary table instead of a table variable, but this increased my reads. Although I could add indexes to it, the index creation itself would cause even more reads. In this example the set is not that big, but when I was working with the real set, things got ugly pretty quickly.
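    For reference, the temp-table variant I tried looked roughly like this (a sketch; the rest of the queries stay the same, only @t becomes #t):

    ```sql
    -- Temp table instead of a table variable: same shape, but creating,
    -- populating and (optionally) indexing it all generate extra reads in tempdb.
    CREATE TABLE #t (id int IDENTITY, salesorderid int);

    INSERT INTO #t (salesorderid)
    SELECT soh.SalesOrderID
    FROM Sales.SalesOrderHeader soh
    INNER JOIN Purchasing.ShipMethod sm ON soh.ShipMethodID = sm.ShipMethodID
    WHERE sm.Name = 'CARGO TRANSPORT 5'
    ORDER BY soh.OrderDate DESC;

    -- An index would help the joins in queries 2-4, but building it costs reads too:
    CREATE CLUSTERED INDEX cix_t ON #t (id);

    -- ... run queries 1 through 4 against #t instead of @t ...

    DROP TABLE #t;
    ```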

    The last thing I tried was to remove the temporary tables altogether and use a view. My idea was to put the search criteria in a view, possibly index that view, and reuse it easily for the other result sets. The definition of the view and the adjusted queries are below:

    CREATE VIEW vSearch AS

    SELECT soh.SalesOrderID, soh.TerritoryID, soh.OrderDate, soh.SalesPersonID, soh.CustomerID, soh.SubTotal, sm.Name

    FROM Sales.SalesOrderHeader soh

    INNER JOIN Purchasing.ShipMethod sm ON soh.ShipMethodID = sm.ShipMethodID

    GO

     

    -- 1 Get Rowcount

    SELECT COUNT(*) AS TotalRecords FROM vSearch WHERE Name = 'CARGO TRANSPORT 5';

     

    -- 2 Get Current page

    SELECT v.OrderDate, v.SubTotal, st.Name, p.LastName, c.AccountNumber FROM

    (SELECT ROW_NUMBER() OVER (ORDER BY OrderDate DESC) AS row_id, SalesOrderID,

    TerritoryID, SalesPersonID, CustomerID, OrderDate, SubTotal

    FROM vSearch WHERE Name = 'CARGO TRANSPORT 5') v

    INNER JOIN Sales.SalesTerritory st ON v.TerritoryID = st.TerritoryID

    INNER JOIN Sales.Customer c ON v.CustomerID = c.CustomerID

    INNER JOIN Person.Person p ON v.SalesPersonID = p.BusinessEntityID

    WHERE v.row_id BETWEEN 1 AND 20

    ORDER BY v.OrderDate DESC;

     

    -- 3 Get SalesTerritory values

    SELECT DISTINCT st.TerritoryID, st.Name FROM Sales.SalesTerritory st

    INNER JOIN vSearch v ON v.TerritoryID = st.TerritoryID

    WHERE v.Name = 'CARGO TRANSPORT 5'

    ORDER BY st.Name;

     

    -- 4 Get OrderDate Year values

    SELECT DISTINCT YEAR(v.OrderDate) as OrderDateYear FROM vSearch v

    WHERE v.Name = 'CARGO TRANSPORT 5'

    ORDER BY OrderDateYear;

     

    An additional index was needed:

    CREATE NONCLUSTERED INDEX temp_IX_SALESORDERHEADER_ShipMethodID ON Sales.SalesOrderHeader (ShipMethodID) INCLUDE (OrderDate,CustomerID,SalesPersonID,TerritoryID,SubTotal);

     

    This resulted in 466 reads. More optimizations could be done, but a big win was gained by using a view and retrieving the data directly instead of using temporary sets.
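    As one such further optimization: if you can run on SQL Server 2012 or later, the ROW_NUMBER() subquery in query 2 can be replaced with OFFSET/FETCH. A sketch (I have not measured this against the 2008 database used above):

    ```sql
    -- SQL Server 2012+ only: page directly with OFFSET/FETCH
    -- instead of wrapping vSearch in a ROW_NUMBER() derived table.
    SELECT v.OrderDate, v.SubTotal, st.Name, p.LastName, c.AccountNumber
    FROM vSearch v
    INNER JOIN Sales.SalesTerritory st ON v.TerritoryID = st.TerritoryID
    INNER JOIN Sales.Customer c ON v.CustomerID = c.CustomerID
    INNER JOIN Person.Person p ON v.SalesPersonID = p.BusinessEntityID
    WHERE v.Name = 'CARGO TRANSPORT 5'
    ORDER BY v.OrderDate DESC
    OFFSET 0 ROWS FETCH NEXT 20 ROWS ONLY;  -- first page of 20
    ```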

    A big lesson lies in the creation of the temporary objects themselves. If you create a simple table with two columns and just 10 records and select from it, you will only get a few reads. If you then create a table variable, insert that table into the variable and select from it again, you get a couple of hundred reads. On small sets you won't notice a big difference, but on larger sets the difference in CPU, reads, writes and duration is really noticeable.
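    You can see this effect for yourself with SET STATISTICS IO. A minimal sketch (the table name and row count are made up for illustration; exact read counts will differ per system):

    ```sql
    SET STATISTICS IO ON;

    -- A small permanent table: selecting from it costs only a few logical reads.
    CREATE TABLE dbo.DemoSmall (id int IDENTITY PRIMARY KEY, val int);
    INSERT INTO dbo.DemoSmall (val)
    SELECT TOP (10) object_id FROM sys.objects;

    SELECT id, val FROM dbo.DemoSmall;

    -- Copy the same 10 rows into a table variable: the insert plus the select
    -- show up in the STATISTICS IO output with noticeably more reads,
    -- because the work happens against a hidden object in tempdb.
    DECLARE @copy table (id int, val int);
    INSERT INTO @copy (id, val)
    SELECT id, val FROM dbo.DemoSmall;

    SELECT id, val FROM @copy;

    DROP TABLE dbo.DemoSmall;
    SET STATISTICS IO OFF;
    ```

    Compare the "logical reads" lines in the Messages tab for the two selects and the insert; that difference is the overhead this lesson is about.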