I am in the middle of SQL relay this week, but I just realised that you might like to have a copy of my Denali slide deck, which I have now posted to SkyDrive. You are welcome to have a look at it even if you couldn’t make one of the four venues I am presenting at this week, but there isn’t a lot of narrative included so you really need to be there.
Anyway the deck is on SkyDrive, please feel free to reuse but an acknowledgement of where you got it from would be a nice courtesy, and if you are considering looking at some of the cool stuff in SQL Server Denali please contact me to share your stories if you can.
Anyway tonight I am in Edinburgh, and tomorrow night at NEBytes, Newcastle University.
Before I answer that question I want to be clear on what exactly self service BI means as you could argue that users have been doing BI for themselves since the spreadsheet was invented. In those early days users would often have to rekey data into whatever analysis tool they were using as the source information was often a printout from a mainframe or mini. With the advent of odbc, users could be given permission to access source systems directly and get at the data they needed. However this lead to a number of problems:
· The end user would need a good knowledge of SQL
· The data returned might well be inaccurate as a knowledge of the structure and contents of the source system would be required.
· Their queries might perform really poorly and possibly affect other user of the system and overall performance.
· The resultant data was a point in time snapshot and often it would be impossible to track where a given set of data had come from and when the query had been run, as the end user had to consciously record that.
So IT had to help out. We had to provide appropriate access to the data, educate and train the users about SQL and the source systems and possibly provide canned views or stored procedures so they could get the data they needed with minimal impact on the source systems.
This was all fine and well if the data required was only in one system, however this was rarely the case, and there were also problems in getting historical data for trend analysis as this might not be in the line of business systems but in separate archives. The answer to this was the the data warehouse and tools to provide a business friendly view of this data so end users didn’t need to understand SQL or how the data warehouse itself was constructed. IT created the data warehouse and maintained that business friendly view (aka the semantic layer), and the users had various tools to report and analyse its contents.
However there was still a problem, the time and effort need to do this meant there was always a lag behind what users wanted and what IT could deliver. For example there are sets of data on the internet like data.gov.uk and social media metrics that are external to a business. Not only that the rate of change in modern business is more rapid than ever in response to unforeseen external factors like the economy, natural disasters, and the on-demand existence that consumers expect .
This has seen the rise of in-memory analysis tools that can handle sizable chunks of data on even a modest laptop. These tools have simple to use import procedures and can consume data from web services via Odata, from xml as well as traditional data sources, and because they capture where the data comes from it is a simple matter to refresh the data from source, on demand. Coupled to this they have built in intelligence to mash-up disparate sets of data into sophisticated analytics. So does this mean that IT is no longer involved in the process?
No, that data warehouse still has its place as a central trusted repository of internal information. Strategic scorecards and dashboards will still have their place too. However when exceptions occur or there is an external factor which could have a major impact (positive or negative) on the business then the end user BI tools will provide the analysis needed to make the necessary correction to the decision being made.
So what does IT provide in the self-service BI world?
It may be stating the obvious but any end-user BI tool is only as good as the data that is fed into it, and so IT has a key role to play in maintaining data quality. This is partly about cleansing the data and partly about augmenting it for more meaningful analysis. Cleansing means de-duplication, correcting keying errors while augmentation processes will add in missing data. In both cases IT will be working closely with users to do this, to establish the rules and provide interfaces for the users to enter missing data if automated processes detect this but cannot fix it.
In addition to maintaining the data warehouse I can see the need for IT to offer a data market of Odata services to the end user from which they can self-select sets of data they need to make timely decisions. This would to some extent replace the need to produce piles of reports, but in either case it’s important to track usage of reports to weed out those no longer in use.
If some of the users self-service BI analytics move from being tactical to strategic; from being just for their team/department to being enterprise wide, then IT will pick these up and scale them using server rather than PC based technologies.
As Ralph Kimball noted many years ago in his seminal work the Data Warehouse Lifecycle Toolkit, collaboration between IT and business is a critical success factor in any BI project and this is as true today with our modern tools as it was when he first started in BI.
There is a lot of advice on whether to virtualise SQL Server and many dbas are under pressure to do this. There’s good advice here from SQL CAT on best practice on hyper-V and if you are in the business of acquiring hardware to run SQL Server virtualised then there are two reference architectures that are relevant depending on how many databases in your organisation:
Hyper-V Fast Track is a generic hyper-v solution available from several hardware vendors including HP designed to maximise generic workloads running on Hyper-v, including SQL Server. Up from this you might consider putting all your SQL Server virtual machines on one optimised server and HP have their Database Consolidation Solution for this. Like the Hyper-V Fast Track, this is preconfigured with System Center Virtual Machine Manager (SCVMM). What makes it different is that there are ready to use template SQL Server VMs of different sizes which you can then quickly deploy.
However there comes a point when a SQL Server database is so large and so widely used that it is consuming all the resources of a physical server an obvious example of this is business intelligence. Rather than work out how to set this up there Fast Track reference architectures and appliances for this.
The Microsoft Fast Track Data Warehouse reference architecture built into the HP Business data Warehouse provides for data warehouses of up to 5Tb of compressed data. It has a bigger brother the Parallel Data Warehouse (available from HP as the Enterprise Data warehouse and also from Dell) and this scales to 500Tb, but does so by using multiple physical servers which behave as one.
The other interesting appliance is the is the Business Decision Appliance which is specifically configured for PowerPivot for SharePoint as this can be a little tricky to set up and even if you have a SharePoint 2010 farm you’ll probably want a dedicated server for PowerPivot if it is to be used widely in your organisation. So this appliance has SQL Server enterprise with all the BI components installed plus SharePoint 2010 enterprise with all the integration configured for you. All you have to do is to use the HP web config tool to join it to your network domain and you are ready to use it.
So these appliances might seem a little bit different from a scalable elastic private cloud, and that’s because what you typically want from a database platform is predictable performance.
The Accountants, sales guys and lawyers in your business don’t understand or care about how fast your hypervisor is or how many nodes there are in your cluster. What they care about is that the e-mail works the intranet is up and the business can trade with its customers. In short they care about applications, and they are paying you so they don’t have to worry too much about them.
Applications come in all shapes and sizes from a single install to the sort of server farm used by SharePoint, they can be written in house or bought in from a vendor (ISV). So how is it possible to understand them all and keep them all running?
Let me answer this and the seemingly unrelated question: “Why are there so may tools in System Center?”
Are your applications working?
You could answer this by repeatedly trawling around all the event logs in your servers either manually or with the mother of all PowerShell scripts, but you need to know about problems, before your users do or at least as soon as they do. A simpler black box approach would be to create a synthetic transaction against a key application e.g. to replicate a user logging into a site and opening a page, and then test how long this takes once every 15 minutes to check its working as planned. This capability is built into System Center Operations Manager (SCOM) today and while it will tell you there’s a problem it doesn’t really tell you what’s going on under the covers unless there are good errors in the event log from the application. A new approach in SCOM 2012 is to peer inside the code and pickup issues coming from calls in the application code and show performance problems and errors in a simple interactive interface. This built in capability comes form an acquisition, AVIcode, so actually that’s one less tool you’ll need to stay on top of things.
Managing the application
It is important to understand the dependencies on an application e.g. database, web service and of course compute network & storage, especially if this is all running as virtual machines. So in several of the System Center tools you can now diagram what these services look like in Virtual Machine Manager and in the new App Controller. Templates in SCVMM let you scale these services or deploy more of them and a change to the template is automatically reflected in the instances created from it.
Moving the application
Application virtualisation , the business of separating the application from the operating system has been available for some time for desktop applications. However server applications are often multi-tier and also run as services, so delivering server application virtualisation is harder and is only now part of Virtual Machine Manager 2012. The clever bit about this is that you can deploy your non Microsoft developed app (e.g. java, PHP) onto a windows virtual machine that could be running on XEN Server, Vmware or even Hyper-V.
Joining the dots
The days of changing things at random in a data centre are largely gone, what is needed today is extreme automation and standardisation coupled with a set of standard operating procedures. While PowerShell can certainly do this, interop with lots of disparate systems can be hard, event handling is rudimentary and the scripts can be hard to debug and maintain. The processes are much better designed in Orchestrator as this has deep integration to the key vendors in systems management, it is very visual and easy to understand and it can be driven from or drive to your help desk, be that Service Manager, Remedy or similar. This then ensures that all changes are properly tracked and all processes are consistent.
Back Up Plan
I often get told off for not mentioning Data Protection Manager, not by the Microsoft sales team but by its loyal fans at events I go to. This is because it backs up all kinds of things in the data centre from VMs to databases, SharePoint sites and Exchange but intelligently so that it understands them and can restore individual documents, mails etc.
All of the System Center 2012 suite is still under development at the moment with most of it available as a public beta:
and you can get all of these from the System Center Evaluation Center. However if you want to see this stuff in action rather than install it yourself then there’s a System Center 2012 road show coming to a city near you..
16th November 2011
6th December 2011
Imperial War Museum
14th December 2011
18th January 2012
If you didn’t manage to fight your fight your way through the Everton & Liverpool fans to SQL Bits I thought you might like to see some of the random questions I got asked..
How do I manage SQL Azure?
I think the key thing about developing for the cloud is to ensure that your application can handle retries as session disconnect and reconnect. You will also want to ensure your code is well optimised and ideally your application that’s using SQL Azure is in Azure itself.
Having done that you should be checking your application is running and the best way to do this is to use the built in management views (DMVs) that Azure has. These track such diverse information as:
Evaluation & Testing
There are time-bombed evaluation copies of most of the Microsoft platform, but in order to look at something like the new Crescent reporting tool in SQL Server Denali, on a virtual machine you’ll need a copy of windows server, Office , SharePoint as well as the Denali beta itself. So abetter way to this is to use either a TechNet or MSDN subscription. Apart form the fact that MSDN is more expensive they are there for different purposes and have different uses as this grid from MSDN shows:
So although licenses included in both subscriptions aren’t time bombed, only MSDN subscription licenses are allowed to be used for testing, as well as evaluation.
When will Denali be released
There is no specific release date for SQL Server Denali, maybe there will be some announcements at PASS this week but I doubt the date will be fixed. Suppose I could tell you this would you really plan your project around that given that I still talk to customers on Windows XP with SQL Server 2005.
OLAP & Tabular BISM
The shiny new Tabular based analysis services has not replaced OLAP. In the current beta (ctp3) Crescent only works against a tabular BISM model, and I am hoping this will be fixed fore the release of Denali. However OLAP offers a more sophisticated language (MDX), the ability to model many to many relationships to deliver more complex analytics, and has data mining built in. There are also enhancements to OLAP in Denali, so you need to decide how and when to use each technology in your organisation.
I could go on but I am about to jump in my car to head off to Birmingham for tonight’s SQL Relay in Birmingham, so if you are there tonight or in Leeds, Edinburgh, Newcastle later this week then come along and say hi and bring your questions.
Yesterday I spent a great day at the Leeds Virtual Machine User Group, with attendees varying from the IT Professionals from most of the cities law firms, to local universities and colleges where both the IT guys and the students took time out to attend, so a great talking shop and the subject of IT education came up. The students I spoke to seemed very keen to learn about the IT Professional world despite that not being the focus of their degree and on the employer side there were many there who felt that industry certification at the right level was more valuable than a degree and obviously at a virtualisation event this mean a qualification from Vmware and/or Microsoft is what matters. In the Vmware world you need to do the course and then do the exam. In the Microsoft world you can just do the exam, but there’s a catch, those exams have gotten a lot harder, possibly harder than the equivalent VCP certification from VMware. That’s not because Hyper-V is tougher to learn than Vmware especially if you have a windows server background already, rather it is to set the bar higher and build trust in the certification with employers
I also think hard exams are a good thing; they sort the wheat from the chaff, so you stand out from other candidates at interview time, and anyway what’s the point of doing them unless there is a true sense of achievement. Of course knowing what to get certified in is important, and given that virtualisation is relatively new and growing steadily in popularity in the UK I am going to stick my neck out and suggest that the first exam to go for would be 70-659: Windows Server 2008 R2, and Server Virtualization, especially if you have already got your head round windows server itself. However as I said the exam is hard so I would suggest that you
a. try and setup some sort of demo environment (I have a 101 guide here plus there’s loads on TechNet)
b. get you head around the concepts by subscribing to the Microsoft Virtual Academy
Then you can decide if the exams are for you, and either persuade your boss to let you do them or failing that mug up from the numerous books out there buy or buy the e-learning kit from Microsoft that gives you everything you need, do the exams and join a more enlightened company once you’re qualified.