With SQL Azure, you're no longer in charge of the Instance of SQL Server that you're running on. You're dropped into your environment at a database level. Also, the size of those databases is more restricted than you might be used to on your on-premise system (keep in mind that SQL Azure has a set of use-cases, and those aren't always the same as for an on-premise installation of SQL Server or other RDBMS systems). Because of these two reasons, you want to start "thinking at the database level". In one case, this means a shift in thinking to a lower level, and in the other, to a higher level.
First, because you enter the system at the database level, you don't need to control the Instance-level settings, or work with security and so on at that higher level. That means you should focus "down-level" on the database settings you do control.
Second, because of those size limits you need to think differently about the strategies you use for dealing with "big" data - in this case, as of this writing, the 1-50GB databases you can create on SQL Azure. In SQL Server on-premise installations, you can "partition" large sets of data by breaking them out using a Partition Scheme and a Partition Function - more on that here, with a great explanation of partitioning in previous versions of SQL Server. You also have access to FileGroups, which point to files that can be placed on different physical devices. In SQL Azure, you can think of the database as a container, much the way tables are in on-premise systems - in effect, "thinking up" to the database level.
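To make the on-premise side of that concrete, here's a minimal sketch of a date-based Partition Function and Partition Scheme - the object names, boundary dates and FileGroups are just placeholders for illustration, and none of this applies inside SQL Azure:

    -- Define the boundary points for the date-based partitions
    CREATE PARTITION FUNCTION pfOrderDate (datetime)
    AS RANGE RIGHT FOR VALUES ('2009-01-01', '2010-01-01');

    -- Map each partition to a FileGroup (fgOld, fg2009 and fg2010 would already exist)
    CREATE PARTITION SCHEME psOrderDate
    AS PARTITION pfOrderDate TO (fgOld, fg2009, fg2010);

    -- Create the table on the Partition Scheme, partitioned by OrderDate
    CREATE TABLE dbo.Orders
    (
        OrderID   int      NOT NULL,
        OrderDate datetime NOT NULL,
        Amount    money    NOT NULL
    ) ON psOrderDate (OrderDate);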
This actually has some advantages - by placing data sets that you might partition by date, customer ID and so on into different databases, each database runs on a separate logical system, which brings its own CPU, memory and so on. My friend Wayne Berry has an excellent series of articles dealing with this, starting here. He develops a strategy for partitioning in SQL Azure in that series.
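As a rough sketch of that "database as partition" idea, you might create one SQL Azure database per date range and have the application route each connection to the one it needs (cross-database queries aren't supported, so the routing happens in the application). The database names, edition and sizes here are just examples:

    -- Run against the master database on your SQL Azure server;
    -- each "partition" is its own database with its own resources
    CREATE DATABASE Orders_2009 (EDITION = 'business', MAXSIZE = 10 GB);
    CREATE DATABASE Orders_2010 (EDITION = 'business', MAXSIZE = 10 GB);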
The point is that using SQL Azure requires understanding the way it holds and processes data. While it's ideally suited to data in the sub-50GB range, that doesn't mean that you don't have options to work with larger sets.
Since the introduction of SQL Azure, there has been some confusion about how it should be used. Some think that the primary goal for SQL Azure is simply “SQL Server Somewhere Else”. They evaluate a multi-terabyte database in their current environment, along with the maintenance, backups, disaster recovery and more, and try to apply those patterns to SQL Azure. SQL Azure, however, has another set of use-cases that are very compelling.
What SQL Azure represents is a change in how you consider your options for a solution. It allows you to think about adding a relational storage engine to an application whose storage needs are below 50 gigabytes (although you could federate databases to go larger than that – stay tuned for a post on that process) and that needs to be accessible from web locations, or used as a web service accessible from Windows Azure or other cloud provider programs. That’s one solution pattern.
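For the web-accessibility piece, a SQL Azure connection from an application looks much like any other SQL Server connection string, just pointed at the service - the server name, database, user and password below are placeholders, not real values:

    Server=tcp:yourserver.database.windows.net;Database=yourdb;User ID=youruser@yourserver;Password=yourpassword;Trusted_Connection=False;Encrypt=True;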
Another pattern is a “start there, come here” solution. In this case, you want to rapidly create and deploy a relational database for an application using someone else’s hardware. SQL Azure lets you spin up an instance that is available worldwide in a matter of minutes with a simple credit-card transaction. Once the application is up, the usage monitoring is quite simple – you get a bill at the end of the month with a list of what you’ve used. From there, you can re-deploy the application locally, based on the usage pattern, to the “right” server. This gives your organization a “tower of Hanoi” approach for systems architecture.
There’s also a “here and there” approach. This means that you can place the initial set of data in SQL Azure, and use Sync Services or other replication mechanisms to roll off a window of data to a local store, using the larger storage capabilities for maintenance, backups, and reporting, while leveraging the web for distribution and access. This protects the local store of data from the web while providing the largest access footprint for the data you want to provide.
These are only a few of the options you have – and they are only that, options. SQL Azure isn’t meant to replace a large on-premise solution, and the future for SQL Server installations remains firm. This is another way of providing your organization the application data they need.
There are some valid questions about a “cloud” offering like SQL Azure. Some of these include things like security, performance, disaster recovery and bringing data back in-house should you ever decide to do that. In future posts here I’ll address each of these so that you can feel comfortable in your choice. I’ve found that the more you know about a technical solution, the better your design will be. It’s like cooking with more ingredients rather than the same two or three items you’re used to.
Since my team started work on Project Houston, hardly a day goes by that I’m not in some discussion about the Cloud. It also hits me when I go through my RSS feeds each day: cloud this and cloud that. There is obviously a significant new trend here; it may even qualify as a wave, as in the Client/Server Wave or the Internet Wave or the SOA Wave, to name a few – ok, to name all the ones I can name. Almost 100% of what I read is focused on how the Cloud Wave impacts IT and the “business” in general. Don’t get me wrong, I completely agree it’s real and it’s going to have a profound impact. I think this posting by my friend Buck Woody (and fellow BIEB blogger) sums it up pretty succinctly. The primary point being that it doesn’t matter what size your business is today or tomorrow, the Cloud will impact you in a very meaningful way.
What I don’t see much discussion about is how the Cloud Wave (is it just me or does that sound like a break dancing move?) changes the way software vendors (ISVs) develop software. In addition to Project Houston, my team is responsible for keeping SQL Server Management Studio (SSMS) up to date with the SQL Azure feature set. If we step back for a second and look at what we have, we have a product that runs as a service on Windows Azure (running in multiple data centers across the globe) and a traditional thick client application that is downloaded, installed, and run locally. These are very different beasts, but they are both challenged with keeping pace with SQL Azure.
SSMS has traditionally been on the release rhythm of the boxed product. This means a release every three years. The engineering system we use to develop software is finely tuned to the three year release cycle. The way we package up and distribute the software is also tuned to the three year release cycle. It’s a pretty good machine and by and large it works. When I went to the team who manages our release cycle and explained to them that I needed to release SSMS more frequently, as in at least once a quarter if not more often, they didn’t know what to say. This isn’t to say these aren’t smart people, they are. But they had never thought about how to adjust the engineering system to release a capability like SSMS more often than the boxed product, let alone every quarter. I hate to admit it but it took a couple of months of discussion just to figure out how we could do this. It challenged almost every assumption made about how we develop, package and release software. But the team came through and now we’re able to release SSMS almost any time we want. There are still challenges, but at least we have the backing of the engineering system. I’m pretty confident we would have eventually arrived at this solution even without SQL Azure. But given the rapid pace of innovation in the Cloud we were forced to arrive at it sooner.
Project Houston is an entirely different story. There is no download for Project Houston; it only runs in the Cloud. The SQL Azure team runs a very different engineering system (although it is a derivation) than what we run for the boxed product. It’s still pretty young and not as tuned, but it’s tailored to suit the needs of a service offering. When we first started Project Houston we tried to use our current engineering system. During development it worked pretty well. However, when we got to our first deployment it was a complete mess. We had no idea what we were doing. We worked with an Azure deployment team and we spoke completely different languages. It took a few months of constant discussion and troubleshooting to figure out what we were doing wrong and how we needed to operate to be successful. Today we snap more closely to the SQL Azure engineering system and we leverage release resources on their side to bridge the gap between our dev team and the SQL Azure operations team. It used to take us weeks to get a deployment completed. Now we can do it, should we have to, in a matter of hours. That’s a huge accomplishment by my dev team, the Azure ops team, and our release managers.
There’s another aspect to this as well. Releasing a product that runs as a service introduces an additional set of requirements. One in particular completely blindsided us. Sure, when I tell you it’ll seem obvious, but it caught my team completely off guard. As we get close to releasing any software (pre-release or GA) we do a formal security review. We have a dedicated team that leads this. It’s a very thorough investigation of the design in an attempt to identify problems. And it works – let me just leave it at that. In the case of Project Houston we hit a situation no one anticipated. The SQL Azure gateway has built-in functionality to guard against DOS (Denial of Service) attacks. Project Houston runs on the opposite side of the gateway from SQL Azure; it runs on the Windows Azure platform. Since Project Houston handles multiple users connecting to different servers & databases, there’s an opportunity for a DOS. During the security review the security team asked how we were guarding against a DOS. As you can imagine, our response was a blank stare, and the words D-O-what were uttered a few times.
We had been heads down for 10 months with never a mention of handling a DOS. We were getting really close to releasing the first CTP. We could barely spell D-O-S, much less design and implement a solution in a matter of a few weeks. The team jumped on it, calling on experts from across the company. We reviewed 5 or 6 different designs, each with its own set of pros and cons. The team finally landed on a design and got it implemented. We did miss the original CTP date, but not by much.
You’re probably an IT person wondering why this is relevant to you. The point in all this is simple. When you’re dealing with a vendor who claims their product is optimized for the Cloud or designed for the Cloud, do they really know what they’re talking about, or did the marketing team simply change the product name and redesign the logo and graphics to make it appear Cloud enabled? Moving from traditional boxed software to the Cloud is easy. Doing it right is hard – I know, I’m living through it every day.
I follow the Database Trends and Applications magazine because it is pretty platform independent, with a section for Oracle, SQL Server and DB2. I found it interesting that when I turned there yesterday and today, "Cloud" was among the top stories - front and center. It's easy to see why. When we think about technology today, we've got three big "circles" of components, like a Venn diagram. In one circle is the "Platform", which includes hardware and the software that runs on it. Developers live here, as do System Administrators and even Data Professionals. In another circle is the "Connectivity", which includes network and security. In yet another circle is "Data", which includes data professionals of course, developers, SAN admins and Sysadmins, from file shares to SharePoint.
The first circle - Platform - has all of the hardware in it. That's a capital expense that also has operational expenses, like people who maintain, manage and monitor it. Companies really want to tone that down wherever they can. That's where this whole "Cloud" thing comes into play. The devs are still there, writing apps for the organization. The data specialists are still there, managing and designing data. The network will always be there. But hardware and software? Does your CEO really want to pay to have that lying around? And what about capacity growth and shrink - not easy to manage when you own everything. So many of them are asking you about the Cloud. And you're probably wondering where it fits in.
Well, Microsoft is certainly one of the main players in this area. From Windows Azure to the Application Fabric and SQL Azure, we have real, live offerings you can leverage right now. But is SQL Azure ready? For many applications, yes. No, I don't recommend that you grab your 1 Terabyte database that has a 500GB ETL process running on it and just toss that unchanged into SQL Azure. I think there's a place for it - and it keeps growing every day. There's a great article over on Softpedia where you can read more about the changes, improvements and enhancements to SQL Azure that we've just released. But here's a side of that you might not have considered: you don't have to do anything to get those improvements. There's no install. There's no patching. To be sure, there is still testing and so on that you need to do, but it just changes and gets better. Certainly this is one argument for the Cloud. Again, it's another tool in the toolbox - not meant to replace every on-premise environment today, but something you should learn about to see where it does fit.
So check out that article, and post your comments here and there. I'm curious about what you think.
As Mary Jo Foley recently reported, migrating to SQL Azure just got a little easier with the latest version of the SQL Server Migration Assistant (SSMA), released on August 12. Included in the update is the typical cast of source databases: Oracle, Sybase, and Access. But there are two new things. First, MySQL was added as a source database. And second, SQL Azure was added as a target. If you have an existing MySQL database, there’s no better way to get started with SQL Azure than migrating your existing db. The process is fast and painless. You can read more about this release of SSMA on the SSMA blog.
I know, SQL Server 2008 R2 was released some time back, but with so much content out there I thought I might simplify it a bit. I've written a complete overview that you can read through in about 10 minutes, and there's a great new whitepaper on one of the main features, PowerPivot. Thought I'd share here:
My article: http://www.informit.com/guides/content.aspx?g=sqlserver&seqNum=359
PowerPivot Whitepaper: http://whitepapers.zdnet.com/abstract.aspx?docid=1911939&promo=100303
At this point there should be no question that Microsoft is fully embracing the cloud. Windows Azure and SQL Azure launched earlier this year and have been receiving positive feedback. I’ve been happily using SQL Azure for several months. One of the great aspects of SQL Azure is the integration with the standard tooling, e.g. SQL Server Management Studio (SSMS). This is sort of like how I can connect Outlook to my personal e-mail account with my service provider.
However, I‘m not always on a machine with Outlook, or with Outlook configured for my mail account. In these cases I opt for the browser email client provided by my service provider. Up until last week, the only option provided by Microsoft for using SQL Azure was SSMS. If you didn’t already have SSMS you could download and install it, which isn’t always a practical solution. You’ll notice I said up until last week. Last week we announced a Community Technology Preview (CTP) for Project Houston. In short, Houston is a Silverlight client for managing your SQL Azure database, developed by none other than my team.
Mary-Jo Foley recently wrote about Project Houston: http://www.zdnet.com/blog/microsoft/microsoft-delivers-test-build-of-tool-for-cloud-database-development/6910
In addition, a simple Twitter search on the tag #SQLHouston will yield a plethora of results. We’re also tweeting about it under the user name @SQLHouston.
You can access the CTP of Project Houston @ https://manage.sqlazurelabs.com/ and be sure to send us your feedback and suggestions. You can learn more about Project Houston and how to send us your feedback here: http://blogs.msdn.com/b/sqlazure/archive/2010/07/26/10042571.aspx
Most of us have some help we can provide back to the community. Even if you're new, you can write down some of the things you've learned. And there are several easy ways to do that - you can certainly jump in on the forums (http://social.msdn.microsoft.com/Forums/en-US/category/sqlserver/) and answer any questions you know the answer to, and you can even participate by helping to write a form of documentation - did you know we run a Wiki, where you can edit the documentation? Check it out here: http://social.technet.microsoft.com/wiki/
So take some time, peruse those resources, and write something up. We learn best when we teach others, so don't think that you can't give back - you can!
The SQL Server Best Practices Analyzer (BPA) came out for SQL Server 2008 R2 recently, and I’ve been asked what the difference is between the BPA and Policy Based Management (PBM) that was introduced in SQL Server 2008.
While it’s true both of these tools can do similar things, each has strengths and weaknesses. The Best Practices Analyzer has a long history, and has various “rules” that compare settings on a server and provide guidance through some very nice reports. Many of these rules became Policies in SQL Server 2008. The BPA requires a separate install; PBM is installed with SQL Server 2008, but its reports are something you would have to create yourself. PBM can be run on a schedule, from a SQL Server Agent Job step or inside PowerShell; BPA doesn’t do that out of the box. PBM also has a “SQL” task where you can define whatever you would like - BPA doesn’t have that capability in exactly the same way.
Probably the biggest difference between the two tools, however, is that PBM can be set (under certain circumstances) to prevent an action from being taken. For instance, you can actually stop a developer from naming a database object in a certain way. Again, there are restrictions on this feature, but you can use it from time to time.
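To give a feel for the kind of enforcement that "prevent" mode provides, here's a rough hand-rolled equivalent using a DDL trigger - the trigger name and the "tbl" prefix rule are just examples for illustration, not what PBM actually generates under the covers:

    -- Reject any new table whose name starts with the "tbl" prefix
    CREATE TRIGGER ddl_EnforceTableNaming
    ON DATABASE
    FOR CREATE_TABLE
    AS
    BEGIN
        DECLARE @data xml, @name nvarchar(128);
        SET @data = EVENTDATA();
        SET @name = @data.value('(/EVENT_INSTANCE/ObjectName)[1]', 'nvarchar(128)');
        IF @name LIKE 'tbl%'
        BEGIN
            RAISERROR('Table names may not start with the tbl prefix.', 16, 1);
            ROLLBACK;
        END
    END;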
So which is better? Neither! Both have their uses, and in fact I use them both. One of the greatest strengths of Microsoft products is that you can usually do the same task in multiple ways. Of course, it’s one of our great weaknesses as well!
So as usual, the answer is “it depends”. You should learn more about both, and figure out what works best for you.
Today I'm here in Washington DC, demo-ing in Bob Muglia's keynote at our Worldwide Partner Conference. This time, I'm showing something quite new - not just a new product, but a new concept in data - Microsoft's premium information marketplace codenamed "Dallas."
"Dallas" may be better known as a village of 200 or so people in northern Scotland, but now it has a new claim to fame. "Dallas" is a place where developers on any platform as well as information workers can find data from commercial content providers as well as public domain data. They can consume this data and construct new BI scenarios and new apps…
To illustrate how easy it will be for ISVs to integrate this content into their apps, and to showcase a BI scenario, in the keynote demo I'll be using United Nations data from "Dallas" integrated into an early build of Tableau Software's solution to create a visualization that answers questions about study-abroad trends around the world. This took only a few minutes to create using "Dallas" and Tableau Public.
To break down the demo, here are the three key pillars behind this scenario:
Discover Data of All Types
Microsoft Codename "Dallas" makes data very easy to find and consume. The vision is to be able to post and access any data set, including rich metadata, using a common format.
Explore and Consume
Using Tableau and "Dallas" together means you can explore any data set simply by dragging and dropping fields to visualize it. This is a very powerful idea: anyone can easily explore and understand data without doing any programming.
Publish and Share
Once you've found a story in your data you want to share it... Using Tableau Public you can embed a live visualization in your blog, just like the one above.
"Dallas" creates a lot of opportunities for a company like Tableau. It makes it possible for bloggers, interested citizens and journalists to more easily find public data and tell important stories, creating a true information democracy. "Dallas" also makes it easier for Tableau's corporate customers to find relevant data to mash up with their own company data, making Tableau's corporate tools that much more compelling.
For more information on how you can be a provider or how an ISV can plug into the Dallas partnership opportunities, send mail to DallasBD@Microsoft.com.