Insufficient data from Andrew Fryer

The place where I page to when my brain is full up of stuff about the Microsoft platform

December, 2011

  • NO SQL Server

    I spent last Thursday at Black Marble's Architect Forum, where my slot was on SQL, NoSQL, some SQL. It went down very well, but I used OneNote on my tablet rather than PowerPoint, and like all modern art my drawing needs a bit of explaining.

    Databases were originally used to store transactions, which are highly structured: lots of little fields grouped together in tables, which in turn had hard relationships, plus an environment where a transaction was either committed in its entirety or rolled back. These simple transactions got more complex, and other kinds of data got associated with the transaction: the contract, a picture of the product and so on. Nowadays all sorts of stuff gets thrown into databases like SQL Server, for example all the content from a SharePoint site. SQL Server has evolved to make this a lot easier and a lot more useful with technologies like FileStream, and the new FileTable in SQL Server 2012, so that the contents can still be used as individual files while remaining part of a SQL Server database (in a special filegroup). The point of storing all this data is to be able to retrieve it, and if those unstructured files have text in them then they can be included in a full-text index, so the contents can be searched as well as the metadata about the file.
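
    To make that concrete, here's a minimal sketch of a FileTable and a content search, assuming the SQL Server 2012 beta with FILESTREAM already enabled and a FILESTREAM filegroup in place; the database, table and search term are all invented for illustration:

        # Load the SQL Server PowerShell module (ships with SQL Server 2012)
        Import-Module sqlps -DisableNameChecking

        # Create the FileTable; files copied into its Windows share appear as rows
        Invoke-Sqlcmd -ServerInstance '.' -Database 'Docs' -Query "
            CREATE TABLE Documents AS FILETABLE
            WITH (FILETABLE_DIRECTORY = N'Documents');"

        # With a full-text index built on the file_stream column, the contents of
        # the files can be searched as well as the metadata about them
        Invoke-Sqlcmd -ServerInstance '.' -Database 'Docs' -Query "
            SELECT name, file_type FROM Documents
            WHERE CONTAINS(file_stream, N'contract');"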

    The next thing to consider is how we are using the database:

    I have already mentioned transactions and content management; two other key uses are business intelligence and acting as the back end for websites. This extended usage doesn't really cause any problems in itself until we consider the users:

    The Developer. Developers don't like SQL, and this phobia started the No SQL movement, which has since softened into Not Only SQL (NOSQL). You don't have to use SQL to develop against SQL Server: there's Entity Framework and LINQ to bridge the gap between the relational world I know and the object world of the modern developer. You might still need SQL to create objects or modify them, and of course to keep that under source control, but the SQL Server Developer Tools mean that it's not necessary to get your hands dirty if you don't want to. Another problem for corporate developers is the need to hand off the project to the IT department for deployment, as developers are rarely allowed access to a production environment; the Data-Tier Application exists for that, wrapping up all the database code for deployment by the DBA.

    The DBA. The challenge of SQL Server for the modern DBA is that there can be quite a lot of databases, which are only getting larger, and many businesses don't have a dedicated DBA anyway. So there's extensive tooling for managing lots of database servers using policies (much as there are policies in Windows), plus PowerShell support, which should be familiar to a part-time DBA and allows SQL Server to be scripted as part of a bigger script that provisions virtual machines, creates logins in Active Directory and so on.
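
    For example, a part-time DBA could run the same health check across a whole estate from one console. A minimal sketch, assuming the SQL Server PowerShell module is loaded; the server names are purely hypothetical:

        # Run one query against a list of instances and collect the results
        $servers = 'SQL01', 'SQL02', 'SQL03'
        foreach ($s in $servers) {
            Invoke-Sqlcmd -ServerInstance $s -Query "
                SELECT @@SERVERNAME AS server_name, name, recovery_model_desc
                FROM sys.databases;"
        }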

    The Information Worker (IW). This is Microsoft-speak for the end user, and they should not have to learn SQL. They should also be insulated from having to know too much about the detailed data structures of the systems they want to analyse and report on, so many BI solutions have a semantic layer that lets users drag and drop data without understanding the SQL or the relationships. Having said that, power users often do have some of this knowledge, and they do need to understand how to join sets of data together. In PowerPivot 2012 for Excel these users effectively create a model of entities and relationships which they can then share with their less technical colleagues, either via SharePoint or as a BI Semantic Model deployed to Analysis Services in SQL Server.

    There's a bunch of technical developments which have helped to keep databases in general, and SQL Server in particular, relevant and fast while data volumes continue to explode:

    Column-based indexing, rather than storing data in blocks of rows, gives great read performance; the indexes can be seriously compressed, which makes it possible to cache them in memory for further performance gains. Solid-state storage just makes things run faster with no need to change the design of a database, but it's expensive and possibly less resilient, so a good first step is to use it for caching, e.g. to put tempdb on it while the actual database still resides on a SAN behind a cluster.
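
    Trying a columnstore index is a one-line change, since it's just another kind of index. A sketch against an invented fact table (note that in SQL Server 2012 the base table becomes read-only while the columnstore index exists):

        # Nonclustered columnstore index over the columns a typical BI query touches
        Invoke-Sqlcmd -ServerInstance '.' -Database 'Sales' -Query "
            CREATE NONCLUSTERED COLUMNSTORE INDEX csx_FactSales
            ON dbo.FactSales (OrderDateKey, ProductKey, Quantity, SalesAmount);"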

    I have covered FileTable above, and spatial is actually a type of structured data which I should have drawn on the left; however, it's worth noting again that this isn't just about storing the data, it's about having a rich set of functions to query it and fast indexing to ensure those queries run quickly.
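
    For instance, the classic spatial question "what's near me?" is answered with the built-in geography type and its methods; the Stores table and the coordinates below are invented for illustration:

        # Find stores within 10km of a point. STDistance returns metres for
        # geography data with SRID 4326, and a spatial index on the location
        # column is what keeps the query fast.
        Invoke-Sqlcmd -ServerInstance '.' -Database 'Sales' -Query "
            DECLARE @here geography = geography::Point(53.8, -1.5, 4326);
            SELECT name, @here.STDistance(location) AS metres
            FROM dbo.Stores
            WHERE @here.STDistance(location) <= 10000
            ORDER BY metres;"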

    The current hot topic is Big Data: the ability to store anything and everything without worrying too much about structure. What's important here, as with any data store, is the ability to search it, analyse it and make decisions from it. In an agile world this needs to be done quickly, and by the business rather than the techies who understand the complexities of that data. So while it's nice to see that Hadoop is going to run on Azure to store mountains of data, what interests me is the tooling that will be put into Excel to make that store of data useful and directly accessible to the user.

    Another problem for a database is velocity: the ability to react quickly to incoming data and make decisions from it. StreamInsight is actually not really anything to do with SQL Server, in that you don't need a database to use it, but it is part of the product, and there are occasions where the end product could be a feed into a database, so I think I can include it here. It is a set of classes that processes feeds of data in near real time (i.e. sub-second) by aggregating them and raising events off them to other systems. It's a sort of real-time map-reduce, and it uses LINQ, so no actual SQL is required to code against it either.

    Putting all this together, we have Not Only SQL Server (NO SQL Server?): an ecosystem around the actual database engine, which is still the heart of the platform in SQL Server, where you can elect how to store and process data using familiar tools in unfamiliar ways, to create information and insight. Most of the stuff I have mentioned here is doable in the current version (SQL Server 2008 R2); however, FileTable is new for SQL Server 2012 and PowerPivot has been enhanced to work with the new BI Semantic Model, so you'll need to look at the current beta if that sounds interesting.

  • IT Camps

    We aren't all the same: we learn differently, we work in a wide range of businesses that have very different needs, and we learn at different speeds. So spending a day being lectured to on technical stuff isn't going to be the right answer for everyone, and no matter how good the speaker is, he has to tread a middle line to keep the experts in a topic interested while ensuring those new to it aren't left behind. Talks on overall positioning, or showing off some cool new stuff, work well; but if you want to know how to get stuff done and get inside the technology, then a different approach is needed.

    So Simon May and I have cooked up a different style of event, an IT Camp, where the content is driven by the audience, but within a general topic area. We wanted to test this out by running a limited public beta, so we invited a select group of IT professional guinea pigs to a day in London to test the idea. We thought a basic day of clustering and server virtualisation would have the broadest appeal, as Hyper-V is being more and more widely adopted.

    One of the problems with this kind of unstructured event is that we didn't have the usual pile of PowerPoint decks as hand-outs: Simon manned the whiteboard and I manned the keyboard. One thing I thought would be useful is to share some of what we built, as we plan to run a lot more events like this next year.


    My rough guide to installing Hyper-V Server and adding it to a cluster with iSCSI storage is up on SkyDrive (there's also a short scripted sketch of the clustering step at the end of this list),

    and the resources we used were:

    Software:

    • Hyper-V Server 2008 R2 with SP1 is a lightweight edition of Windows Server (based on Server Core, for those that know it) that is free of charge and can only be used to run Hyper-V.
    • Windows Server 2008 R2 with SP1 evaluation edition. If you aren't comfortable using the minimalist interface in Server Core then you could use this and enable the Hyper-V role on it. You will also need it to create some VMs with.
    • Core Configurator is a PowerShell-based interface that makes Hyper-V Server and Server Core easier to manage directly than SConfig does.
    • iSCSI Target software allows you to emulate shared storage using virtual hard disks (.VHD).
    • RDCMan allows you to control a lot of remote desktop sessions
    • ZoomIt (part of Windows Sysinternals) allows you to pan and zoom around your screen when presenting.
    • System Center Virtual Machine Manager 2012 manages large server virtualisation deployments and is now at release candidate. There's also a prebuilt VHD, which can integrate into your test sandbox/demo rig, here.

    Learning resources

    Microsoft Virtual Academy

    Installation and setup guides:

    Hyper-V Survival Guide on TechNet; this has sections on dynamic memory, networking, clustering with iSCSI and just about anything else you'll need.
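
    To give a flavour of the scripted approach, the clustering step we walked through boils down to very little PowerShell on Windows Server 2008 R2. A rough sketch with invented node names, assuming both Hyper-V hosts can already see the same iSCSI disk:

        # The FailoverClusters module comes with the Failover Clustering feature
        Import-Module FailoverClusters

        Test-Cluster -Node HV01, HV02      # always validate before you build
        New-Cluster -Name HVCLUSTER -Node HV01, HV02 -StaticAddress 192.168.1.50
        Get-ClusterAvailableDisk | Add-ClusterDisk   # bring the shared iSCSI disk in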

    Our IT Camp guinea pigs seemed to like the event, but also gave us a lot of helpful feedback which will be baked into the next events we do, so keep an eye out for IT Camps coming to a location near you in 2012.

  • Virtual machine density in your data centre

    I can only run 11 server-based virtual machines on my laptop, but all bar three of them are running SQL Server:

    • 3 x VMs running the SQL Server 2012 beta and the new AlwaysOn clustering. Note that one of these is running SQL Server 2012 on Windows Server Core.
    • 1 x VM running SQL Server 2012: the database engine plus 3 x instances covering Analysis Services, Master Data Services and Data Quality Services, not to mention SharePoint with Office Web Apps enabled.
    • 1 x Windows 7 VM with Office 2010, Visual Studio 2010 Ultimate, all the System Center client tools and the Remote Server Administration Tools.
    • 2 x System Center Service Manager 2012 beta VMs, one for the service and one for the data warehouse.
    • 1 x VM for System Center Orchestrator 2012 RC.
    • 1 x VM for System Center Virtual Machine Manager 2012 RC, together with the new System Center App Controller.
    • 1 x VM for System Center Operations Manager 2012 beta.
    • 1 x VM for Red Hat Linux.
    • 1 x VM as my domain controller and DHCP server.

    The limiting factor I face is RAM: the minimum memory requirements of many of the System Center tools limit what I can cram into 16GB, but dynamic memory is a great help here. Anyway, it's a fair increase over the four-VMs-per-server density that was discussed when Hyper-V first came out. That ratio of virtual to physical can of course be pushed much harder on 'proper' servers designed for Hyper-V rather than my laptop mash-up. A good example of this was the labs run at various big events like the Microsoft Management Summit in May, where they were able to run 225 VMs per host, although with 128GB of RAM that only gives each machine a basic 512MB or so.

    However, there is another way, and that's what Microsoft does in its newer data centres, like the one I visited last week. The whole data centre runs on a modified Hyper-V, but what's different is that there are thousands of low-cost basic servers rather than hundreds of huge monsters. Blogging in more detail about how these work is more than my job's worth, so if you want to know more then the Global Infrastructure Services site is the place to go (there's a video tour of one of the data centres here). However, what I can say is that all the lessons learnt from operating at this scale are then put into the next releases of Hyper-V and System Center, for example:

    • the bare-metal host provisioning in Virtual Machine Manager 2012
    • the separation of duties in Virtual Machine Manager 2012, where the team who look after the physical servers don't control the services that run on those servers; that's down to the application teams
    • the integration of AVIcode into Operations Manager 2012 to understand what problems are affecting the applications themselves

    So if you want to get an idea of how to run a data centre at scale, you'll want to spend your downtime over Christmas learning Virtual Machine Manager, either by watching the new content on the Microsoft Virtual Academy or by pulling down the release candidate (which you can install, or use as a preconfigured Hyper-V virtual machine).

  • Merry Christmas and a Happy 2012

    [Christmas e-card image]

    My Christmas card for 2011 is inspired by a frequent visitor to my back garden, the Goldcrest. The trick to seeing them is to resist the temptation to tidy up and dead-head too much, whereas that's exactly the sort of thing you should be thinking about in your data centre to keep it clean and tidy. The problem there is often knowing what you've got and why it's needed, especially with an explosion of virtual machines. So before you hit the delete key I would suggest you download and run the Microsoft Assessment and Planning (MAP) Toolkit. This not only reports on Microsoft stuff; it looks at third-party software and at your hardware and virtualisation environment too.

    I will certainly be doing lots of tidying up on my demo rigs to free up enough resources for System Center 2012, now that much of it is at release candidate, and of course the release candidate of SQL Server 2012 is now available as well. The team have planned lots of events for next year: IT Camps, the System Center 2012 preview tour (still on the road), and of course I'll be at SQLBits to keep up my perfect attendance.

    Until then, whatever you'll be doing over the Christmas break, have fun and don't hog the Xbox.


    Andrew

  • Private Cloud - Nothing to see, please move along

    One of the reasons the term Private Cloud is getting a bad press is the perception that it's all marketing fluff and isn't real. Yet in any data centre you are going to be doing things like:

    • deploying applications
    • fixing applications
    • performance tuning
    • load balancing
    • resource planning
    • decommissioning applications
    • patching and maintenance
    • backup! …and you might also need to do a restore on occasion
    • audit and compliance
    • bid for more resources from management

    This list isn't much different to what I used to do as a Unix admin back in the nineties; however, how this stuff gets done is now totally different. I used to send patches and fixes out on CD to branch offices, and had to visit those offices to set up TCP/IP. If a server or desktop died, rebuilds were tortuous and painful, and if that server had an application on it then we would have to reinstall on another server and break out the backup.

    Later on we could cluster servers, but this was painful and expensive, and only a few services, like SQL Server, could fail over properly.

    Virtualisation changed things a lot, but I feel this was a bit like moving to a bigger house: you pack up everything and get rid of a lot of clutter, but a year after you've moved in all the extra space has gone, and in some cases there is more mess than there was before. What matters in a post-virtualised world is how much effort is required to manage those virtual machines. This takes me back to another old discipline, systems analysis: every entity needs a process to create, read, update and delete it (CRUD), and this should apply to VMs as well as to data stores. Applying CRUD to VMs means that there should be processes in place to:

    Create. Use a self-service portal or a service desk request. Another private cloud scenario is that VMs might be created automatically to meet demand when a service gets busy.

    Read. Access them and continuously monitor them to ensure they are healthy.

    Update. Apply fixes and patches to keep them current.

    Delete. Remove them when they aren't needed any more; the service they are providing might be scaled back, or the whole application might have been superseded.
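
    As a speculative sketch of what this looks like in tooling terms, here are those verbs mapped onto the PowerShell cmdlets in the Virtual Machine Manager 2012 release candidate; the VM name is invented and the parameters are illustrative only:

        Import-Module virtualmachinemanager   # installed with the VMM console

        # Create: a self-service portal request, or New-SCVirtualMachine from a template

        # Read: continuously check what exists and whether it is healthy
        Get-SCVirtualMachine | Where-Object { $_.Status -ne 'Running' }

        # Update: patching, via update baselines in VMM 2012 or Orchestrator runbooks

        # Delete: retire a VM once its service is scaled back or superseded
        Get-SCVirtualMachine -Name 'WEB07' | Stop-SCVirtualMachine | Remove-SCVirtualMachine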

    None of this is new to the public cloud vendors - Amazon, Google, Microsoft and so on; it's what their data centre staff set up long ago for online services like mail, search and shopping. What is new is that the best practices arising from doing this at scale (e.g. one data centre admin per 1,000-2,000 VMs) are being built into software like System Center 2012, so you can operate your own infrastructure just as efficiently. For example, patch management is automatic, a new VM is a mouse click away, and you are fixing the problem before the user realises there is one.

    So, to quote that well-known group of IT pundits the Fun Boy Three/Bananarama: "It ain't what you do, it's the way that you do it… and that's what gets results".


    Further reading:

    Microsoft Virtual Academy (which now has a separate module on System Center Virtual Machine Manager 2012).

    The System Center 2012 road show touring the country:

    • Birmingham on 17th January (register here).
    • Edinburgh on 28th February (register here).