Sparse Columns are another new feature of SQL Server 2008 and are included in the February CTP (CTP6). They pretty much do what they say on the tin; offering a trade-off between taking more space to hold data, but none at all when they are empty. They don't get you over the 1024 column limit, but could mean you can squeeze more columns into the 8,060 byte row limit for SQL Server.
Like everything in SQL Server you need to know when they add value and when to avoid them like the plague. Fortunately one of the non-sparse areas of Books On-Line is the section covering sparse columns here.
So the good news first:
And the downsides.
As you can see from Books On-Line there is a really useful guide to when to use them for a particular data type e.g. if more than 64% of your values are null in an int column then use sparse columns, and basically the longer the data type the lower the threshold for using sparse columns.
So how does it work? Just put the keyword SPARSE into a create table statement:
CREATE TABLE CustomerDemographics
(CusomterID int PRIMARY KEY,
Gender varchar(7) NOT NULL,
EducationLevel varchar(20) SPARSE NULL,
SalaryBand varchar(10) SPARSE NULL)
Selects against this table will work exactly as for normal columns whether the sparse column is included as a column in the select column or a filter in a where clause.
Optionally to create a column set for this table append this to the end of the create table statement:
DemographicSet XML COLUMN_SET FOR ALL_SPARSE_COLUMNS
The column set DemographicSet is then treated like any xml column i.e. it can be selected and also be used for updates and inserts, Note if you do use a column set for updating data sparse columns not specifically declared in the update well be set to null.
Finally if you are wondering why this feature was developed, the simple answer is to support future versions SharePoint which was also one of the drivers behind FileStream. I can see it being applied to any content management system over SQL Server and also as I have mentioned before for reducing the overhead of storing customer demographics or product catalogs where not every column applies to every product or customer.
One of the biggest headaches in SQL Server 2005 is trying to manage resources where there are competing workloads on a shared server. A typical approach is to use instances, but this completely hides memory and CPU from other instances which are then not available to an instance under stress whether the other instances are using them or not.
The resource governor in SQL Server 2008 fixes that and I have a short blogcast here on TechNet Edge to demo how this works...
One of the worst reporting headaches is producing output for the finance department. The reports need to meet strict standards, be penny perfect, and the end user always wants the report to be in excel. With reporting services in SQL Server 2005 this could be done by the IT department but the real solution is to let the finance types do it for themselves. Performance Point went some way to addressing this and for those customers with Dynamics there was FRx. So three ways of doing things - a typical Microsoft approach.
Then very quietly, in fact so quietly that you would only know about it by searching for it by name, Management Reporter quietly launched down the Microsoft slipway on 29th January. Management Reporter is a free add-on to Performance Point users with software assurance. It's specifically designed for finance users to do reporting in Excel. It has a lot of financial intelligence built in and support for regulatory compliance e.g. Sarbannes Oxley (SOX), International Financial Reporting Standards (IFRS), and Generally Accepted Accounting Practices (GAAP) to name a few.
Unlike FRx, your accounts don't have to be in Dynamics to use it and you can save the design as a reporting services report so information can be disseminated to other parts of the business who don't have Performance Point.
There's a lot of buzz around Software plus Services (S+S) from Microsoft and others. It's all very simple as Eileen will tell you - it's when you use a service like GPS that's remote and public and combine it with some software like CoPilot or TomTom on your GPS to find directions.
In a similar vein SQL Server 2008 can store geo-spatial data but that's of no value to a business unless they can see it and the best way to do that is on a map. If you don't have mapping software of your own then you can use a web service like Virtual Earth.
Johannes Kebeck a Microsoft Virtual Earth expert spent a morning in Visual Studio 2008 to produce an application for an insurance company which I have demonstrated in this blogcast:
I have seen the filtered indexing in SQL Server 2008 mentioned a couple of times over the last couple of weeks but I only got it properly when I got to try it myself in ctp6.
I have often needed to constrain a column to a only allow unique values but also allow the column to allow nulls and with filtered indexes you can put in a where clause like this ...
create unique index production.nullidx on production.product(code) where production.product(code) is not null
create unique index production.nullidx
where production.product(code) is not null
So the filter takes out the nulls. Apparently this is also really useful if you are migrating your database from another provider to SQL Server.
I have seen some strange schemas in my time which look like a good idea on paper but not on disk. A common scenario is the schema that is created by a tool controlled by a user and so we end up with columns like user21 in usertabel7 and so on. Then I got an e-mail about the poor performance of user defined functions (UDFs) following Simon Sabin's talk on the CLR and how good they could be for replacing T-SQL UDF's at the SQL Community event last Thursday .
The database involved in the problem had several tables where generic columns where used for different purposes. Taking customer as an example there was a separate table CustomerStrings that looked like this:
CustomerID is foreign key to the main customer table and CustomerStringTypeID is another foreign key, this time to CustomerStringType which has an ID and a description, with values in the description such as 'Fax', 'Mobile', 'e-mail' etc.
So the CustomerString table would have multiple rows for the same customer for example a row for Fax, mobile, email and so on. The good thing about this is that rows would only be created where a given attribute for a customer was used and so would be efficient on space.
Getting back all the e-mail addresses for a customer wouldn't be too bad either depending on how this was implemented and this is where the developer tried to implement a generic solution with a UDF:
CREATE FUNCTION dbo.GetCustomerString(@customerID int, @customerStringTypeID int) RETURNS varchar(50) AS BEGIN
DECLARE @returnValue varchar(50);
SELECT @returnValue = CustomerStrings.StringValue
WHERE CustomerStrings.CustomerID = @customerID
AND CustomerStrings.CustomerStringTypeID = @customerStringTypeID
The problem arises when this function is used in-line on multiple occasions to get three attributes for every customer:
dbo.GetCustomerString(Customers.CustomerID, 1) AS Fax,
dbo.GetCustomerString(Customers.CustomerID, 2) AS mobile,
dbo.GetCustomerString(Customers.CustomerID, 3) AS email FROM dbo.Customers cst;
So the UDF gets called three times for every row in the customer table which is always going to be much slower than using a set based operation for example:
fax.StringValue as Fax
mob.StringValue as Mobile
eml.StringValue as email
FROM dbo.Customers cst
LEFT JOIN dbo.CustomerStrings fax ON cst.CustomerID = fax.CustomerID AND fax.CustomerStringTypeID =1
LEFT JOIN dbo.CustomerStrings mob ON cst.CustomerID = mob.CustomerID AND fax.CustomerStringTypeID =2
LEFT JOIN dbo.CustomerStrings fax ON cst.CustomerID = eml.CustomerID AND eml.CustomerStringTypeID =3;,
However maintaining all the code for the possible combinations of attributes required is a big headache for this company so how can they get the speed of the second query without writing mountains of code. In my opinion the schema is the problem here. I would either create sufficient columns in the customer table for all the attributes required or using another group of tables with one row per customer to store contact details etc. Of course if there loads of columns required (as in SharePoint) many of which are usually empty then it's time to have a look at Sparse columns in SQL Server 2008 and that is what I will be posting about next..
SQL Server 2008 enables the use of geo-spatial data or location intelligence in four ways:
Below is a short blogcast to show how these come together following on from my last two blogcast's on this interesting aspect of SQL Server 2008...
Excel is the number one business intelligence tool; It fronts Hyperion, Cognos Financial Planning, and of course SQL Server Analysis Services and Performance Point Server Planning. In Office 2007 there is now excel services so that information workers can park their work on SharePoint where it can be properly secured.
One issue I have seen is where you decide to publish an excel worksheet to excel services with a connection to an SSAS cube in it. By default the worksheet will cache the data the user sees when they publish it as well as the layout and the connection to the relevant cube. That might not be a good idea if the creating user has complete access to the underlying cube, while the intended users of the cube, have very restricted permissions. In this scenario when a restricted user opens the worksheet from excel services they will see everything that the creating user saw when they saved it.
It's simple to correct this behaviour, but very difficult to find where to do this, and in my opinion the default behaviour should be the other way round! Anyway all you need to do is to force a refresh of the data when opening the file in the connection properties dialogue:
This situation also applies to surfacing an excel worksheet in Performance Point Server.
Another issue with Excel services is that it doesn't support the publication of worksheets with the new data mining add-ins installed, and I am assured that this will be fixed as part of future changes to excel services rather than by the new version of the data mining add-in that will ship with SQL Server 2008.
CTP6 is now available for download here. This has all the functionality of the final product and amongst the new stuff for this CTP are:
One thing to note is that the link above still looks like the November CTP but does actually link to the February one, and the TechNet and msdn pages will change over the next day or so to reflect this.
Following all the acquisition mania in the BI marketplace last year, it's interesting to see that the Microsoft approach of growing your own BI platform seems to be paying off according to this article by Gartner in Information Week.
It has taken a while though - I well remember my first foray into olap services in SQL Server 7.0, we were using a market leading olap tool to build a cube from a fact table with 75 million rows in for an electricity company. It was taking 26 hours to build and then failed and we used the same hardware and SQL Server 7.0 olap services to build the same cube and it successfully completed processing after three hours, and because it was an open solution we were able to use the same front-end for the users.
That was in 1999 and since then there has been loads of work on the Microsoft BI platform so by last year I worked on a project that could load 15 millions rows of new fact data a night into a cube with two years of history.
In the latest version there aren't the big changes that we saw moving from 2000 to 2005 as the hard work was done then. However as I have posted before the key things are analysis services 2008 are:
The key white paper on these new SQL Server 2008 analysis services features is here.
British Airways not only run Vista for the in flight entertainment system, They also serve up TechNet on it...
I am not sure if this is an exclusive service on the Heathrow - Seattle flights or just part of the normal service. I also noticed that the keynote guest speaker at Techready 06, Michio Kaku, (one of the co-founders of string theory) also had a slot in this IT Pro orientated system as well.
Of course it might be cheaper to subscribe to TechNet than get it for free this way!
The geeks in Microsoft have taken over down town Seattle for the week as we all get our six monthly fix of the latest stuff that's coming out. This is one of the ways that Microsoft has to keep it's technical people happy and it certainly works for me. We get to meet the product teams and network with all the different sorts of communities we have , such as Eileen's women in IT, the bloggers, SQL experts etc.
There also lots of social events and on Tuesday night all the evangelists took over a game-works an amusement arcade, which I have to say I found great fun but not as engaging as PC Gaming. So my favourite evening of the week is when the entire Hands-on-Labs area (some 1500 PCs in one football field size room) is setup for a bit of a LAN Party.
Obviously what we actually do in Redmond stays in Redmond, but what's this all got to do with Vista sp1?
The whole party and all of the Hands on Labs is run on PCs with sp1 installed on without incident.
I often get asked about whether SQL server is compliant with some standard or other e.g. Sarbannes-Oxley (SOX) , Data Protection etc. The answer is yes, but there is no flip switch in SQL to turn any of these on , and as anyone knows who's tried it turning on c2 security is a great way of slowing your data base down, so caution is advised here.
For SQL Server 2005 there are configuration packs for SOX here and for the European Union Data Protection Directive (EUDPD) here. However these packs are applied to System Center Configuration Manager 2007 (SCCM) as these they are a set of configurations which can be used to evaluate compliance and then sent into SCCM for reporting, so you are going to need to invest in this technology or take the more expensive option of going around every server in your organisation and checking each one in turn manually.
For SQL Server 2008 this approach will continue, but is much easier as there are a whole bunch of new features to make compliance a lot easier and my top three would be:
Of course the rules of the game will also change as new legislation comes in, usually following hot on the heels of the latest compliance fiasco, so I wouldn't be surprised to see something bubbling up from the UK government's scant regard for the protection of personal data or the fallout from Societe Generale.
The next round of road shows are going to focus on Windows Server 2008 and Vista sp1 so no SQL Server 2008 this time. But no peace for the wicked like, many dba's, when not doing my day job I have to get my head around all of the other stuff that sits around the database, such as SharePoint, the operating system and networking. So I will be on stage with the rest of my compadres talking about the new ways we can support the branch offices that exist in many organisations.
In order for you able to see all the demos including James' 8pt PowerShell scripts Georgina has booked up cinemas up and down the country. Also we're adding Cardiff to the list this time, at the expense of Reading, so remember to get on the right platform if you are coming form Swindon or Bristol!
The day pans out like this:
09:00 – 09:30 Registration 09:30 – 09:45 Keynote
Consider the way you work today. Think about what’s changed in the last few years. You can be sure that there are plenty of changes along the way. How can you best position yourself to help your business make the most of change and improve your career at the same time?
During this session we well take a look at the opportunities offered by key technologies including software plus services and innovative approaches such as virtualisation and Infrastructure Optimisation. Change seems to be the only constant in our industry. Keeping ahead of the curve brings benefits to everyone. This session will help you plan for the next generation of infrastructure and your own personal development. 09:45 – 11:00 Windows Top Features
So many features. So little time. Where should you focus? This session will help you to decide where to invest your time and effort.
People’s expectations are changing. They expect to be able to work “their way”, free of the limitations of being “tied to a desk” or a particular device. Minimising the risk of security breaches while enabling flexible working can create interesting challenges. Windows Server 2008 can help – enter Network Access Protection.
Thankfully the days of one line of business application per server are numbered thanks to virtualisation. Lazy servers waste power, data centre space and maintenance charges. Windows Server 2008 can help – enter Hyper-V.
Traditionally ensuring consistent configuration of your web servers can be time consuming. Windows Server 2008 can help – enter IIS 7. 11:00 - 11:20 Break 11:20 – 12:35 How to Manage Windows Server 2008
Where should you use Windows Server Core instead of the full installation of the operating system? If you use “core” then how on Earth are you supposed to manage it? Regardless of whether you use “core” or “full” where should you use Powershell and where does Windows Server Manager make more sense?
We’ll show you! 12:35 – 13:25 Lunch 13:25 – 14:40 How to Deploy and Manage the desktop? What’s in Vista SP1?
One desktop no cry? As anyone who’s ever managed a large number of desktops knows it’s well worth making the most of automation to ensure consistent user experiences and reduce support costs. We’ll dive into the range of operating system features available to you including WDS and WSUS plus we’ll investigate MDOP and SoftGrid’s application virtualisation.
Waiting for SP1 before deploying Vista? Wait no more! We’ll open the lid and explain exactly what’s in there. 14:40 - 15:00 Break 15:00 – 16:15 How to overcome the challenges of Small Office AND branch offices
It’s often less likely anyone would notice a server going “missing” in a small or remote branch office. These servers hold replica copies of user account credentials and your production data. Windows Server 2008’s Read Only Domain Controller features and BitLocker on the server can help!
Moving to a World where small/branch offices connect via the Internet to your main server resources can reduce both your attack surface AND expense/complexity. Windows Server 2008’s much improved Terminal Services can help! 16:15 – 17:15 Q&A with the full team of presenters throughout the day 17.15 Close
Like Jimi Hendrix we love feedback so step 1: come along and step 2: tell us what you think. As usual you need to register here, but it is filling up following on from the launch event at the ICC in Birmingham on 19th March.
I thought you might find this article in the Data Warehousing Institute (TDWI) interesting. Basically Stephen Swoyer has discovered that users like Performance Point because of the way it acts as a piece of glue to bring all of the Microsoft BI stuff together rather than prescribing an approach. Integration with Excel is of course top of this list, but every BI vendor, even open source olap tools like palo have that. What makes Performance Point a little bit different is its use of Excel services and SharePoint to deploy and manage Excel, as well as report design in the new Management Reporter I mentioned in my previous post.
Another odd thing is that early adopters don't seem to mind that the product has a few rough edges, they trust Microsoft to evolve Performance Point in the same way they did with SQL Server Analysis and Reporting services, listening to them in the process.
How does that listening process work? Well the most common routes are:
I have been to lots sessions on SQL Server 2008 at Microsoft's internal technical training fest (TechReady) over the last couple of weeks, and I wanted to post about but the new for CTP 6 stuff , but only when you can try it for yourselves. I wondered if there was any buzz in the public domain around CTP6 while I have been off the radar and this looked good..
..until you read the date of October 16 2008!
So the good news is that Live Search does hit MSDN, and TechNet OK, the bad news is that there must be tons of stuff on these sites that might not be relevant any more, and the really bad news is that I still can't talk about how good full text search in SQL Server 2008 yet doh!
I am like c# - I am event driven. I like to meet people and I like to present or be presented to in person. Last night I watched one of our MVPs, Simon Sabin, give us (the UK SQL Community) the DBA's guide to the CLR - when to use it and when not to. Great stuff and you get free beer and pizza as well.
However I do understand that evenings are precious and we all have busy lives, so sometimes you just have to watch stuff off line. I have just discovered TechNet spotlight; which is packed with recordings from most of the big IT Pro events over the last coupe of years.
However there isn't anything on SQL Server 2008, although 2005 returns about 30 hits and even SQL Server 2000 has 2! So I have mainly been using it to mug up on stuff that isn't SQL Server such as System Center, and virtualisation in all its forms.
We also have TechNet Edge which is content made by people like me as well as MVP's and tends to focus on what people are doing with our stuff although I could only find a couple of SQL Server posts on here.
So if you are doing anything cool with SQL server and you want to share it, either in person or as a blogcast please get in touch. The SQL community would welcome your ideas for sessiosn at their meetings even if it's only a short slot, and I would be happy to help you get your blog casting career launched.
One of the problems with our TechNet road shows is getting at the content after the event. Of course we do provide the slide decks, but these can be abut thin as many IT Pro's (including us!) are power point averse, preferring see stuff in action. We can record it all but I am not sure anyone is going to find an hour long demo much use so one of my new year's resolution is to do a few blog casts to show particular bits of SQL Server 2008 work.
So following on from my last blogcast on Friday I thought it would be good to do another one to show what's happening behind the scenes using a simple geometry based example..
I am in soggy Seattle this week with a bunch of like minded SQL squirrels, so our super Tuesday was watching the product teams including, Ted Kummert, Donald Farmer and Christian Kleinerman putting the latest refresh of SQL Server 2008 through its paces.
One thing that made me smile as Ted was going through the history of SQL Server was that everyone complained that there was such a big gap between SQL Server 2000 and SQL Server 2005 and now everyone is complaining that SQL Server 2008 is coming too soon after SQL Server 2005! So if you are in the latter camp hopefully the fact that the product is now scheduled for Q3 2008 rather than Q2 will make you feel a little happier.