Insufficient data from Andrew Fryer

The place where I page to when my brain is full up of stuff about the Microsoft platform

December, 2008

  • SQL Server 2008 - Hierarchy ID Data Type

    To show this we need to create a table with the new data type..

    create table [dbo].Organisation (
      DepartmentID int primary key nonclustered,
      DepartmentName varchar(100) not null,   
      DepartmentHierarchyNode hierarchyid not null,
      DepartmentHierarchyLevel as DepartmentHierarchyNode.GetLevel() persisted
      -- plus any other columns you need
    )

    Note the level is a derived function of getLevel() which returns how deep a node is in the structure in this case it’s the only node and so will return 0.

    Now we need to put the some data in starting at the top of the tree i.e the whole company – which in this example is Contoso..

    insert Organisation(DepartmentID, DepartmentHierarchyNode, DepartmentName)
    values (1, hierarchyid::GetRoot(), 'Contoso')

    note the getroot() function which does what is says and gets the root node of the hierarchy and since this is the first node in the hierarchy this is simply itself.

    Adding in new rows to this table is best done with a stored procedure like this one which needs to know the id for the new department the name of it and the id of the parent department it belongs to ..

    create

    procedure AddDepartment

    @DepartmentID int,

    @DepartmentName varchar(100),

    @ParentID int

    as

    begin

    declare @ParentNode hierarchyid,

    @MaxChildNode hierarchyid

    begin transaction

    -- get the node for the parent of the new

    -- department

    select @ParentNode = DepartmentHierarchyNode

    from Organisation

    where DepartmentID = @ParentID

    -- get the last department for the parent

    select @MaxChildNode = max(DepartmentHierarchyNode)

    from Organisation

    where @ParentNode = DepartmentHierarchyNode.GetAncestor(1)

    -- add the new department

    -- Parent.GetDescendant(MaxChild, null) returns the

    -- next descendent

    insert Organisation (DepartmentID, DepartmentHierarchyNode, DepartmentName)

    values (@DepartmentID, @ParentNode.GetDescendant(@MaxChildNode, null), @DepartmentName)

     

    commit

    end

    Here you can se more obviously named functions of a HeirarchyID like GetAncestor, GetDescendant and MaxChild.  I have to admit I have no problem with these as they are quite similar to what you use in MDX in Analysis Services.

    Here’s some data to go into the new table..

    exec AddDepartment 2, 'Operations', 1
    exec AddDepartment 3, 'Development', 1
    exec AddDepartment 4, 'Parking', 1
    exec AddDepartment 5, 'Home Operations', 2
    exec AddDepartment 6, 'Field Operations', 2
    exec AddDepartment 7, 'North Territory', 6
    exec AddDepartment 8, 'South Territory', 6
    exec AddDepartment 9,   'Database', 3
    exec AddDepartment 10,  'Services', 3
    exec AddDepartment 11,  'Applications', 3
    exec AddDepartment 12, 'Windows', 11
    exec AddDepartment 13, 'Internet', 11
    exec AddDepartment 14, 'Self', 4
    exec AddDepartment 15, 'Valet', 4

    selecting from this will give you the levels and so on:

    image 

    This procedure will show a tree of the structure using a single select statement

    create procedure ShowDepartmentChart
    @DepartmentName varchar(50)
    as
    begin

      declare @TopNode hierarchyid,
              @TopLevel int

      select @TopNode = DepartmentHierarchyNode,
             @TopLevel = DepartmentHierarchyLevel
        from Organisation
       where DepartmentName = @DepartmentName

      -- note that parent is descendent of itself
      select DepartmentHierarchyNode.ToString() NodeText, space((DepartmentHierarchyLevel - @TopLevel) * 5) + DepartmentName Department
        from Organisation
       where DepartmentHierarchyNode.IsDescendantOf(@TopNode) = 1
       order by DepartmentHierarchyNode

    end

    go

     

    and then use this like this..

    exec ShowDepartmentChart 'Contoso'

    image

    The common toSting() function returns the human readable form of the hierarchyID node.

    You can (and should!) index Hierarchy ID’s in two ways..

    1. Create a "breadth first" index,  to speed up selecting of nodes at the same level, all the sub departments that belong to a parent department


    create clustered index Organisation_Breadth_First
    on [dbo].Organisation( DepartmentHierarchyLevel, DepartmentHierarchyNode )

    2. Create a "depth first" index, to speed up selecting a sub-tree
    of nodes, e.g. all the departments beneath a particular parent department


    create unique index Orgisation_Depth_First
    on [dbo].Organisation(DepartmentHierarchyNode )

    Hierarchy ID’s are faster than self joins, not least because no recursive SQL is needed and in my opinion more intuitive.  However one important thing to remember is that don’t enforce any kind of integrity so like with self joins you can end up with orphans. One area like this where you can stuff things up is if you use the reparent function. this moves a node in the structure to a differnet parent ( for example if a mmeber of staff has a new manager), but it doesn't move the nodes underneath the moved node as well, you must do this yourself. 

  • SQL Server Advent Calendar 11 – Report Builder v2

    Day 11 of my virtual advent calendar, about stuff I like in SQL Server 2008..

    …and I mentioned redesigning charts in my last post and the best answer might be to get your users to do it for themselves in Report Builder 2.

    image

    Leaving you (the DBA BI expert) to concentrate on getting them the data they need.  Report Builder2  is built of the same code as the Report designer in BI development studio, but has a different query builder.  If you want to use a report model then the model needs to be deployed to report manager and then report builder can see it from the data source dialog..

    image

    Report Builder v1 is still available as a ClickOnce deployment from Report Manager in SQL Server 2008, but if you want Report Builder 2 it needs to be downloaded from the Feature pack here.

  • SQL Server Advent Calendar 15 – Integration Services Profiling

    Day 15 of my virtual advent calendar, about stuff I like in SQL Server 2008..

    If there were no issues with data quality in data warehouses I would have completed many of my BI projects in a third of the time.  The client would always insist they had good data, after all it would often be coming form their billing or ERP systems, but I rarely found this to be the case.

    Why do I get these assurances? Because while the business user might understand their high level data (in terms of sales by store for example), they probably won’t know how it’s stored or the details of each order, while the DBA knows the storage but not necessarily anything about the data. It’s only when a BI project kicks off that these questions start to be asked.  So how does a BI professional quickly get an understanding of the data? Enter the data profiling task in SQL Server 2008 Integration Services (SSIS)…

    image

    Not that the tool is a control flow, not a data flow task, but nonetheless requires a data connections.  In this demo we are going to look at customer data and output our findings to an xml file.  This task has several panes:

    image

    The general tab allows you to specify where the profiling data ends up, in this case NewCustomerProfile.xml.  BTW the quick profile button here allows you to run a profiling job on the fly with default settings.

    The Profile Requests pane is where you setup what you want to analyse..

    image

     

    Here I have three different kinds of analysis and I have focused on the last one to look at Value distribution.  Note the connection and table name can be individually specified.

    The task can then be run to generate the file, and then you can take a look at the answers. I am not to good at quickly making sense of xml and so it’s good to know there is a special data profile viewer go to ..

    All Programs -> Microsoft SQL Server 2008 -> Integration Services -> Data Profile Viewer.

    Looking at the distribution of values for gender …

    image

    you can see that we have three M F and Male, something we might want to clean up as part of the ETL process. The other interesting thing to do is look at NULL distribution as there is little po9int in loading in a little used column no matter how important it is.

    Armed with this tool you can go back to the business and show them how clean there data is and then collaborate on how to resolve each issue.

  • SQL Server Advent Calendar 13 – Grouping Sets

    Day 13 of my virtual advent calendar, about stuff I like in SQL Server 2008..

    The SQL Group by clause has been around since I can remember, but in SQL Server 2008 we know have grouping sets.  Imagine you wanted to sum sales amount from the adventure works reseller sales table by year, by Product Category and by Year AND Product Category perhaps as the first stage in writing a report.  You also need to have all the results in one query so you end up doing something like this in SQL Server 2005:

    SELECT
        NULL AS [year],
        dpc.EnglishProductCategoryName AS Category,
        SUM(frs.salesamount)
    FROM FactResellerSales frs
    INNER JOIN DimProduct dp ON dp.ProductKey = frs.ProductKey
    INNER JOIN DimProductSubcategory dps ON dps.ProductSubcategoryKey = dp.ProductSubcategoryKey
    INNER JOIN DimProductCategory dpc ON dpc.ProductCategoryKey = dps.ProductCategoryKey
    GROUP BY dpc.EnglishProductCategoryName

    UNION ALL

    SELECT
        dt.CalendarYear AS [year],
        NULL AS Category,
        SUM(frs.salesamount)
    FROM FactResellerSales frs
    INNER JOIN DimTime dt ON dt.TimeKey = frs.OrderDateKey
    GROUP BY dt.CalendarYear

    UNION ALL

    SELECT
        dt.CalendarYear AS [year],
        dpc.EnglishProductCategoryName AS Category,
        SUM(frs.salesamount)
    FROM FactResellerSales frs
    INNER JOIN DimProduct dp ON dp.ProductKey = frs.ProductKey
    INNER JOIN DimProductSubcategory dps ON dps.ProductSubcategoryKey = dp.ProductSubcategoryKey
    INNER JOIN DimProductCategory dpc ON dpc.ProductCategoryKey = dps.ProductCategoryKey
    INNER JOIN DimTime dt ON dt.TimeKey = frs.OrderDateKey
    GROUP BY dpc.EnglishProductCategoryName,
            dt.CalendarYear

    Grouping sets makes this a far simpler exercise which also executes only once against the fact table..

    SELECT
        dt.CalendarYear AS [year],
        dpc.EnglishProductCategoryName AS Category,
        SUM(frs.salesamount)
    FROM FactResellerSales frs
    INNER JOIN DimProduct dp ON dp.ProductKey = frs.ProductKey
    INNER JOIN DimProductSubcategory dps ON dps.ProductSubcategoryKey = dp.ProductSubcategoryKey
    INNER JOIN DimProductCategory dpc ON dpc.ProductCategoryKey = dps.ProductCategoryKey
    INNER JOIN DimTime dt ON dt.TimeKey = frs.OrderDateKey
    GROUP BY
        GROUPING SETS
        (    (dpc.EnglishProductCategoryName),
            (dt.CalendarYear),
            (dpc.EnglishProductCategoryName,dt.CalendarYear),
            ()
        )

    Each row under grouping sets groups the above query in a different way  equivalent to the previous query except that I have added an extra set, the last row, () which gives the grand total,  (which shows up as  NULL,NULL, 804505969823) in the results below..

    image

    I think this plays very well with Reporting Services and for generating aggregated fact tables in large relational data warehouses, but will obviously need development work to implement.

  • SQL Server Advent Calendar 10 – Reporting Services Charts

    Day 10 of my virtual advent calendar, about stuff I like in SQL Server 2008..

    I have been using Reporting Services since beta 2 in SQL Server 2000, and for me the weak link was the charts which even  then looked pretty basic…

    image

    Many people resorted to using add-ons including the ubiquitous Dundas charts.  Microsoft noticed this trend and simply bought those tools and put them into Reporting services 2008. The chart below shows different axes and chart types for different data series, and the improved control there is over formatting the chart.

    image

    There are also gauges available now and at some point there will hopefully be the maps to leverage the spatial spatial data types in the SQL Server 2008 database engine.

    If you are using Dundas charts already then they'll be updated automatically to pick up the versions embedded in reporting services.  If you weren't then your charts will look the same, but will have been migrated to the new chart type as well and you can then change them if you want and usethe new features like I have show above, or….

  • SQL Server Advent Calendar 16 – Change Data Capture

    Day 16 of my virtual advent calendar, about stuff I like in SQL Server 2008..

    One of the main hurdles in refreshing a data warehouse is to work out what’s changed since the last refresh.  If you’re lucky the source will have a last modified date and if you’re really lucky this is actually populated and working.  There is a capability in SQL Server 2008 to help as well, Changed Data Capture (CDC), but be aware this is an enterprise edition only feature.

    What it does is to create a separate set of tables in the same database (but different schema) as the source.  It needs to be turned on first like this..

    USE AdventureWorks
    EXEC sys.sp_cdc_enable_db

    BTW there is a new column in the sys.databases dmv called is_cdc_enabled so you can check if CDC is enabled with…

    SELECT name, is_cdc_enabled FROM sys.databases

    Now you can enable a table for CDC..

    EXEC sys.sp_cdc_enable_table
        @Source_Schema = N'HumanResources',
        @Source_Name = N'Employee',
        @Captured_Column_List =
        N'EmployeeID,NationalIDNumber, ManagerID, Title, BirthDate, MaritalStatus, Gender, HireDate, SalariedFlag, VacationHours, SickLeaveHours, CurrentFlag',
        @Supports_Net_Changes = 1,
        @Role_Name = 'CDC_Admin'

    Hopefully all but the last 2 arguments are pretty obvious.  Supports Net changes is a bit flag which either shows you only the overall change between two periods as opposed all of the changes .  The role name is a database role which runs the code and a new one is created if you specify a new name.  Also you can specify other parameters here like @filegroup.

    Note you don’t have to include the captured column list (it will track all columns by default, but if you do you must include the primary key or another argument @index_name to refer to a unique index on the table if @supports_net_changes is set to 1.

    You can now check this has been enabled..

    EXECUTE sys.sp_cdc_help_change_data_capture
        @source_schema = N'HumanResources',
        @source_name = N'Employee';

    Behind the scenes a new system table has been created to track changes, [cdc].[HumanResources_Employee_CT] which has got all the columns we specified we wanted to track plus 5 extra ones ..

    [__$start_lsn]
    ,[__$end_lsn]
    ,[__$seqval]
    ,[__$operation]
    ,[__$update_mask]

    so lets make a simple change and see what happens in this table

    UPDATE HumanResources.Employee
    SET Title = 'DBA' where Title = 'Database Administrator'

    and then query the change table like this..

    SELECT TOP 1000 [__$start_lsn]
          ,[__$end_lsn]
          ,[__$seqval]
          ,[__$operation]
          ,[__$update_mask]
          ,[EmployeeID]
          ,[Title]

      FROM [AdventureWorks].[cdc].[HumanResources_Employee_CT]

    to see what was captured.

    image

    The LSN columns show the log sequence number of the commit for the change and the -$seqval orders the sequence of changes that can occur in the same transaction, so a possible lower level of detail. _$operation shows what happened to the row, 1 = delete, 2 = insert, 3 = update (before change), and 4 = update (after change).

    To help you use this table there are functions to get the min and max LSN’s in the table.

    You should also notice that you have got 2 new SQL Agent jobs to capture and cleanup the tables.  Rather than modify these jobs directly there are also specific functions to manage them.

    I have skimmed over this to give you an idea of the various parts of CDC, so you should check books on line for a fuller explanation here, and there is also a CDC TechNet virtual lab here, which you can try, before you have a go in your own environment.

  • SQL Server Advent Calendar 18 – Audit

    Day 18 of my virtual advent calendar, about stuff I like in SQL Server 2008..

    Rounding out ways of keeping an eye on what’s happening to your data in SQL Server 2008 is the new auditing feature in Enterprise edition.  As the name suggest it’s there for a specific purpose, to make your compliance work as easy as possible.

    In SQL Server 2005 you could monitor changes to to the data in a database using triggers and you could also track changes to permissions and schemas.  However tracking the results of queries was not easy or you had to resort to profiler to trap the SQL.  Audit not only handles all of this it, also allows you to direct the audit information to the application or security log as well as to file.

    You also get a nice UI to help you set it up and consume the information.

    The first step is to create an Audit in the Security node of SSMS..

    image 

    This will consume i.e. be the target of the database audit specification.  I am going to use a file here.

    Now I can setup a database audit specification again using the UI which I can get to by expanding security under the database I am interested in and right clicking on Database Audit Specifications..

    image

    I am going for an action type of select on an object and then I can select which object(s) to track. In this case I am worried about people looking up employee addresses so I have just selected the HumanResources.EmployeeAddress table. I can also filter by the security principal  (in my case dbo) as I may only be worried about certain groups of users.  Having created it I can right click on it to enable it and I am done.

    To test it I can run some queries against it..

    select
        E.LoginID,
        A.AddressLine1,
        A.City
    from HumanResources.EmployeeAddress EA
    inner join Person.Address A on EA.AddressID = A.AddressID
    inner join HumanResources.Employee E on E.EmployeeID = EA.EmployeeID

    Select
        VE.FirstName,
        VE.AddressLine1,
        VE.City
    from HumanResources.vEmployee VE

    Note the second query is a view based on the table we have the audit on.

    Now I can see the output by going back to the audit and right clicking to view the audit log..

    image

    scanning along to the right I can see the SQL I used…

    image

    so we got both selects back which is good to know, and I set this up in the the time it took my wife to grill tonight’s Salmon! 

    For further reading check books on-line here and this whitepaper on SQL Server 2008 security.

  • SQL Server Advent Calendar 12 - Tablix

    Day 12 of my virtual advent calendar, about stuff I like in SQL Server 2008..

    One of the frustrations I used to have with Reporting Services was to decide whether to use a table or a matrix data control, and if I got too far in to a design and it turned out to be wrong I had to pretty much start again.

    Although in looks like nothing has changed in this regard in SQL Server 2008 there is in fact only one kind of data grid which is called Tablix (TABL e + matr IX).  Here’s one I made earlier in Report Builder v2

    image

    No only does it replace table and matrix reports it is also a lot more powerful – as a tablix control can have multiple independent row and column groups as you can see in this simple example..

    I can add an independent group by right clicking on the column group area (bottom right)..

    image

    then add  Group –> Adjacent after, and then select group by Year.  That will create a new column into which I can drag the sales amount from the data set on the left into the detail row and then again into the total.  I need to have the year as the column header so I need to enter !Fields.[Order Year].value. in there (the teal coloured box).

    If I run it now I get this.

    image 

    This is simply not doable in SQL Server 2005 reporting services.  Be aware however that I had all of this data in the one data set, as any data region can only depend one dataset.

    If you want to have a go, you’ll need adventureworks as the database and then the original report is here and the version I changed it to is here

  • SQL Server Advent Calendar 9 – Management with Powershell

    Day 9 of my virtual advent calendar, about stuff I like in SQL Server 2008..

    Many SQL Server databases are managed by part time or accidental DBA’s, who have a whole bunch of other duties possibly including networking, active directory, SharePoint and Exchange.   As well as learning about what these all do part time DBA’s also have to get to grips with the different tools used to manage them. 

    This is one of the many areas addressed by Microsoft’s new scripting language, Powershell.  Bizarrely from the company that bought you wizards and tries to get you to do everything in  a slick UI, this initially looks like a throw back to the days of DOS or the Unix command shells. This is not as odd as it might seem, because although UIs are good for ‘do once’ operations,  they are not nearly so good for repeating tasks particularly if these tasks take arguments like server name, database name, today’s date etc.

    Powershell allows you to create scripts where everything you use is an object using all the power of WMI and .Net, to directly control Windows 2008, Vista, SQL Server 2008 and Exchange (SharePoint will come in Office 14). What does that mean for the DBA?

    • Manageability and repeatability
    • Unified scripting to encompass database management tasks with file operations, security permissions.
    • Using other new tools in SQL Server 2008, like scripting evaluation of policies to a table across multiple servers
    • No more reliance on VB Script and odd looking batch jobs
    • A huge amount of resources like codeplex, real books, TechNet, and the Powershell User Group run by Richard Siddaway 

    A good place to start is this http://www.microsoft.com/emea/spotlight/sessionh.aspx?videoid=984blogcast from Dan Jones on the SQL Server Management team followed by this introductory Powershell guide on Codeplex

    Finally while Powershell is built in to SQL Server 2008, you can retrofit Powershell onto older versions of SQL Server (back to 2000) using this kit on Codeplex.

    Technorati Tags: ,
  • SQL Server Advent Calendar 14 – Integration Services Lookups

    Day 14 of my virtual advent calendar, about stuff I like in SQL Server 2008..

    If it aint broke don’t fix it, is probably best aimed at Integration Services in SQL Server 2008. So what was ‘fixed’?  The most important thing was the performance of lookups, both in terms of speed and memory usage.  To address this there is a new lookup cache feature to allow frequent lookups from static data to be persisted to a special file (*.caw) .

    This cache need to be populated from the new cache transform task in the data flow designer:

    image

    To use this you need to tell the inevitable wizard what connection, table and columns you want to cache…

    image

    Note the 1 in the index position for the SalesTerritoryAlternateKey row to indicate this is indexed (at least one column must be indexed).

    So that’s how you populate the cache, now it can be used in the standard lookup transformation by selecting cache connection manager on the general tab for the lookup

    image

    and then putting in the cache location on the connection tab (You will be warned to do this). You can then use the columns form the cache in the same way as for a normal lookup. As ever I’ve done this in AdventureWorks so you should be able to follow this.

    This is going to be useful in applying the same (conformed) dimension to multiple fact tables, particularly if the dimension doesn’t change that often.  I would also point out that there has been a lot of work behind the scenes to improve the parallelism in integration services, which will kick in when you upgrade.

  • SQL Server Advent Calendar 7 – Partitions & Indexed Views

    Day 6 of my virtual advent calendar, about stuff I like in SQL Server 2008..

    Indexed Views allow result sets to be persistent as they are materialized on physical storage which avoids performing costly joins and/or aggregations at execution time.  However in SQL Server 2005 they would have to be dropped and recreated if one of the underlying tables was partitioned and a new partition was added.

    This is fixed with the cunningly named indexed aligned partition views (that’s a Microsoft mouthful so I will abbreviate that to PAIV). This allows you to swap partitions in and out of your large table to add new data and archive older data.

    Partition-Aligned Index Views Switching

    The key to this is the ALTER TABLE  ..SWITCH statement.  There’s a really simple example to follow in books on-line on the TechNet SQL Server TechCenter.

  • SQL Server Advent Calendar 8 – Analysis Services Named Sets

    Day 8 of my virtual advent calendar, about stuff I like in SQL Server 2008..

    Analysis Services allows you to create a set of things you are interested in and persist this in the cube.  MDX is used to define the set e.g.

    CREATE SET CURRENTCUBE.[Top 10 Resellers]
    AS TopCount([Reseller].[Reseller].[Reseller].Members, 10, [Measures].[Sales Amount]);

    creates a set Top 10 resellers of the resellers having the 10 highest sales amounts

    If you do this in SQL Server 2005 then that statement would be evaluated when the set was defined and members in the set would be fixed for the life of the named set regardless of any changes to the data in the cube.  This can be a good thing but as I business user I would expect this to be dynamic e.g. if in our example a new reseller meets the rules at the expense of one of the others then I expect to see that reflected in the set in the same way as if I used a SQL view on a table with a top  X count clause.

    SQL Server 2008 has the option to create dynamic sets that behave like this and this is simply achieved by altering the create statement to

    CREATE DYNAMIC SET CURRENTCUBE.[Dynamic Top 10 Resellers]
    AS TopCount([Reseller].[Reseller].[Reseller].Members, 10, [Measures].[Sales Amount]);
     

    I like the upgrade aspect of this as the set you originally created in will behave as before in Analysis Services 2005 and you simply need to flip the dynamic switch in Analysis Services 2008 to make it behave in what I believe is the more logical fashion.

  • SQL Server Advent Calendar 23 - Filestream

    Day 23 of my virtual advent calendar, about stuff I like in SQL Server 2008..

    Filestream isn’t one of the new data types in SQL Server 2008 it a variant of varbinary(MAX), and it is Microsoft’s solution to the problem of storing large unstructured data as part of a database. 

    In SQL Server 2005/8  varbinary(max) can store a file of up to 2Gb (a limitation inherited by SharePoint)  and as this limit is approached the performance of streaming this data back to a client becomes worse and worse.  On the other hand referential integrity is easy to maintain and backups disaster recovery is straightforward as all that’s needed is to restore the database.

    The other common approach to this problem is to have a table with pointers to the file locations of the documents on disk.  This is quite hard to write and implement and referential integrity and backup are also difficult as it’s important to have the right file at the end of each pointer and not to loose any on the way. However it does have the advantage of performance over internal storage of the data in the database. 

    Filestream is the best of both worlds.  Documents are stored as files on disk, but they are in a special SQL Server filegroup.  When a request for a file is made SQL resolves this into a filehandle that can be use by ordinary win32api’s so performance is as good as it can be.  However because all of this is in a filegroup backup is also straightforward. This mechanism also respects transactions so if you rollback you get back to the previous version of the file. A word of warning here is that this can involve some serious disk usage if the files are large as internally SQL Server keeps the older version of the file =until the transaction is committed.

    Using Filestream is a developer task , so any example I put here will be littered with very dodgy C#. Rather than do that I will instead point you to this excellent  example developed by one of my French (it is in English) colleagues Patrick Guimonet, and it also shows Powershell integration in action!

    Finally Filestream is included in all editions of SQL Server while not being part of size limitations in these editions.

     

  • 3rd day of Christmas – Ducks in a Row

    This blog is written using Windows Live Writer as are many of the others on TechNet and MSDN.  In this video Giorgio Sardo shows how he writes his and how to control your PC with a Samsung Omnia..

  • 7th Day of Christmas Tagging

    Properly indexing databases is generally seen as a good thing, and is one of the top causes of poor performance.  However when we get home from a hard days SQL we seem to forget all of this and our home lives are less well organised, especially our photographs.

    In my latest Christmas video, my wife Juliet shows you how to tag your photographs using Live Photo Gallery ..

    Juliet also wanted to mention that a lot of telemetry from your photos gets stored in the image (date, camera settings), automatically by a digital camera, so you just need to add the who and what, if you have one of those.  

    Technorati Tags: ,

  • 2nd Day of Christmas – Brighton 360

    Although Photosynth has been widely publicised in the general and IT press I still come across load of people who don’t know what it is.  In this short video James O’Neill in fine Blue Peter tradition shows a synth (the buzz word for what you make in Photosynth)  he made earlier of Brighton Pavilion..

    Hopefully this will inspire you to have a go with that new camera you got for Christmas

  • 6th Day of Christmas – Armchair Astronomy

    I still love to watch the Royal Institute Lectures at Christmas, and I am please to see Microsoft supporting them this year.  In this spirit I wanted to show the Microsoft World Wide Telescope so if you haven’t tried it yet hopefully this will convince you…

    Technorati Tags: ,

    On a serious note,  this is a very demonstrable example of Software + Services that Microsoft keeps banging on about, in this case the client does all of the rendering and caching while the service provides the rest including the up to date imagery, search, and repository of the guided tours.

  • 4th Day of Christmas

    I love PC Gaming, so I haven’t got an XBox 360 yet.  However there’s no denying its popularity and unlike other games consoles Microsoft have released a simple toolkit , the XNA Game Studio to enable anyone to write a game for the XBox/PC/Zune.

    In today’s video Kevin Pfister takes us through this free toolset..

  • 1st Day of Christmas – Looking for Vulcans

    clip_image001

    I hope you like the card (a result of a miss spent youth at Art College) and are enjoying your well earned break.

    My 12 days of Christmas this year will take a look at what sort of free stuff you can download from Microsoft, some of which you may already know but maybe not all of them.  The series are all short videos which are intended as a bit of fun.  So without further ado here’s James O’Neill talking to me about Vulcans…

    Technorati Tags: ,
  • SQL Server Advent Calendar 19 – Data Collection

    Day 19 of my virtual advent calendar, about stuff I like in SQL Server 2008..

    I often get asked how much faster is SQL Server 2008 than SQL Server 2000, what impact will virtualisation have on my performance, and my answer in fine Microsoft tradition is it depends.  So how can we begin to answer these questions and put some science into this area?

    As a Business Intelligence expert I would suggest building a data mart to capture the performance of the system you want to measure then apply any changes you want to it and then evaluate the result.    The good news is that there is a tool called data collection built into SQL Server 2008 to do a lot of the work for you.  Before you ask no you can’t retro fit this onto earlier versions although you could collect similar data yourself.

    Data Collection comprises of:

    • a Collection mechanism which writes out telemetry information to temp files on local storage to minimise the performance overhead of collecting performance data.
    • Collection Sets define what you are measuring, there are three installed:
      • Disk Usage
      • Query Statistics
      • Server Activity
    • Agent jobs to collect this data and drop it into
    • A predefined data mart i.e. a central database, called the Management Data Warehouse (MDW).  This means you can gather information form multiple instances and store all of it in one place.
    • A set of rich standard reports that link together to tell you what’s happening.

    Let’s have a look at each bit in turn..

    First of all we need to define where the data is to be stored, which can be done by opening SQL Server management studio (SSMS) expanding management –> data collection –> Configure Management Data Warehouse. The wizard will either allow you to create a new one or point to one you made earlier. You can also set permissions to the predefined MDW roles from here.

    Having done that you can then rerun the same wizard and select the Data Collection option.  From here you specify the location of the MDW and the local cache where performance data initially gets written to.

    Now you can configure each collection set by right clicking on it and selecting properties..

    image

     

    This is the Disk Usage Collection.  You can see it has two items in it one for data files and one for log files.  There is a predefined script for each of these and the rest of the screen allows you to specify how long to keep the data for how often the data is collected and whether to use a local cache or not.

    You can add to all of this, but IMO try it ‘as is’ and see what else you need. 

    Once it’s up and running you can then use the predefined reports that come with it by

    right clicking on data collection node and selecting one of the three reports listed in this case server activity gives you this…

    image

    The graphs are interactive so clicking on the plateau on the bottom graph me more info..

    image

    The time access controls these graphs as well.  If you are looking at queries you can get down to the individual sql statements, and these also have linked reports embedded in them

    Hopefully you get the idea, and simply turning this on with defaults will go a long way to helping you with your consolidation decisions.

  • SQL Server Advent Calendar 20 – Accidental DBA’s

    Day 20 of my virtual advent calendar, about stuff I like in SQL Server 2008..

    The accidental DBA is a DBA who ends up looking after SQL Server as part of a number of other duties e.g. Exchange admin, developer etc. This often occurs because SQL Server can survive pretty well despite long periods of neglect, but then suddenly something goes wrong and then someone gets nominated to look after it post fix. 

    As databases grow in complexity and size they do eventually need some attention, and some tasks can be bit daunting if you not immersed in SQL Server every day.  There are excellent add on tools from the likes of Idera, Quest and Redgate to ease the management burden, but SQL Server 2008 itself has a few things to help with this:

    • Powershell (which I have posted about here) gives a common scripting language for the Microsoft platform.
    • Policy Based Management works a bit like group policy in active directory to put a range of SQL Server into a set state.  I have already made a short screencast of this…

    image

    • Data Collection which I covered yesterday
    • Central Management Servers,  Are simply a special registered server which adds to the power of registered servers that we are used to using by allowing T_SQL or the Policies I just mentioned to be fired at multiple servers.  The central management server is also a good candidate for your data collection management data warehouse. I have screencast on this too, although when I did it the feature was called configuration servers..
    • image

    A lot of this stuff (Policy Management, Central Management Servers, Powershell) will work against SQL Server 2000/2005 so all you need to do get started is to have one instance of SQL Server 2008 as a base (for the Central Management Server) or just the client tools for the rest.

  • SQL Server Advent Calendar 17 – Change Tracking

    Day 17 of my virtual advent calendar, about stuff I like in SQL Server 2008..

    Today’s topic is the confusingly named change tracking i.e. nothing to do with Change Data Capture (CDC) I mentioned in my last post. CDC is only in enterprise edition only and is there to primarily support population of data warehouses while Change Tracking  is a developer orientated tool designed to support synchronisation across different platforms and is therefore included in all editions of SQL Server 2008.

    Like CDC you need to turn it on either via the Management Studio (which you won’t have if your using Express) or via T-SQL..

    ALTER DATABASE AdventureWorks
    SET CHANGE_TRACKING = ON
    (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON)

    The process relies on version numbers and every time a change is made this gets incremented and then this is used in various functions to get at the data..

    • CHANGE_TRACKING_CURRENT_VERSION()  represents the version of the last committed transaction. Before any application can obtain changes for the first time, the application must send a query to obtain the initial data and the synchronization version. The application must obtain the appropriate data directly from the table, and then use CHANGE_TRACKING_CURRENT_VERSION() to obtain the initial version like this..

    @CurrentVersion = CHANGE_TRACKING_CURRENT_VERSION()

    • CHANGE_TRACKING_MIN_VALID_VERSION() tells you the earliest version ID that is held (based on the retention period you define).  So you would use this to check that you have the right version before getting the changed data.
    • CHANGETABLE(CHANGES [source Table], @Synchronisation version) . Note you will want to RIGHT join this as a table to the source table to get the data that actually changed as this function merely tells you which row changed. I mention right join because the row may have been deleted.

    This is more a developer thing and so the virtual lab for this is on MSDN, and there is also more about it in Books On-Line here

  • 5th Day of Christmas – Deep Zoom

    Deep Zoom actually does what it says on the tin.  However while many people have seen demos of it I am not sure many have actually tried it or know how it works and that it’s free.

    To rectify that Viral Tarpara (he’s the good looking one in the shot below) explains all in this short tour..

  • SQL Server Advent Calendar 22 – Spatial Data

    Day 22 of my virtual advent calendar, about stuff I like in SQL Server 2008..

    image

    SQL Server 2008 now has data types to store points lines and polygons in 2 spatial data type, Geometry and Geography. I have already posted quite extensively on spatial data, but it is is one of my favourite new things in SQL Server so all I am going to do here is point you at my blogcasts:

    There are also some excellent developer resources on spatial data and virtual earth from Mike Ormond, and Johannes Kebeck.

    Finally all of this is in all editions of SQL Server 2008, including the free Express editions.

  • SQL Server Advent Calendar 24 – Upgrade

    It’s Christmas Eve and the last day of my virtual advent calendar on SQL Server 2008..

    Hopefully the last 23 posts have given you some idea of what’s in SQL Server 2008 so I thought it might be useful to conclude with a word on upgrade.  Firstly SQL Server 2008 has a lot in common with SQL server 2005 so if that’s where you are then upgrade is pretty painless and straightforward.

    Don’t panic if your still on SQL Server 2000, you can upgrade straight to SQL Server 2008, and there is a massive amount of support to help you. Probably the best place to start is the SQL Server 2008 Technical Reference Guide.  This covers all aspects of the SQL Server (DBEngine, Reporting Services, Analysis Services and Data Transformation Services/Integration Services) and provides a very detailed (490 pages) of help to ensure upgrade is painless as possible.

    There’s also more help on TechNet and a tool to discover where your servers are:

    So take the next few days to read over the guide so you can get started on your upgrade next week, unless that is, you have other plans tomorrow!