Klout (www.klout.com) measures influence across the social web by analyzing social network user data. Klout uses the impact of opinions, links, and recommendations to identify influential individuals. Every day Klout scans 15 social networks, scores hundreds of millions of profiles, and processes over 12 billion data points.
The Klout data warehouse, which relies on Apache Hadoop-based technology, exceeds 800 terabytes of data. But Klout doesn’t just crunch large data volumes; Klout takes advantage of Microsoft SQL Server 2012 Analysis Services to deliver reliable scores and actionable insights at the speed of thought.
Microsoft and Klout collaborated to build this Big Data Analytics solution. The goal for this solution was to find a cost-effective way to combine the power of Hadoop with the power of Analysis Services. The result is a solution that connects Analysis Services to Hadoop/Hive via the relational SQL Server engine, enabling Klout to reduce data latencies, eliminate maintenance overhead and costs, move aggregation processing to Hadoop, and shorten development cycles dramatically. Organizations in any industry and business sector can adopt the solution presented in this technical case study to exploit the benefits of Hadoop while preserving existing investments in SQL Server technology. This case study discusses the necessary integration techniques and lessons learned.
To review the document, please download the SQL Server Analysis Services to Hive Word document.
This is an invite-only technical boot camp for Hosting Service Providers that offers 300-level training on SQL Server 2012. This two-day training with labs shows how Hosting Service Providers can use Microsoft SQL Server 2012 to enhance their existing database service offerings and build out new offerings such as high availability and database-as-a-service solutions. The event costs $500 to attend, and space will be reserved on a first-come, first-served basis.
Even though this training is aimed at service providers, it would also be worthwhile to attend if you are planning to implement SQL Server as a service or a database private cloud in your corporate environment.
Should you be interested in attending this training, please contact me via the blog.
Due to the enormous interest already received we ask that you only register one person per camp per company at this time. Thanks for your understanding.
Sydney – October 8-9: SQL Server 2012 for Service Providers
https://msevents.microsoft.com/cui/EventDetail.aspx?culture=en-AU&EventID=1032524722
Sydney – October 10-12: Cloud Infrastructure (Windows Server 2012 and System Center 2012)
https://msevents.microsoft.com/cui/EventDetail.aspx?culture=en-AU&EventID=1032524718
Melbourne – October 15-16: SQL Server 2012 for Service Providers
https://msevents.microsoft.com/cui/EventDetail.aspx?culture=en-AU&EventID=1032524723
Melbourne – October 17-19: Cloud Infrastructure (Windows Server 2012 and System Center 2012)
https://msevents.microsoft.com/cui/EventDetail.aspx?culture=en-AU&EventID=1032524719
Today marks the General Availability of Windows Server 2012. Windows Server 2012 is a cornerstone of the Cloud OS, providing customers and partners with a modern platform for their applications. It is the world’s first server that is “built from the cloud up”.
Windows Server 2012 expands the definition of a server operating system from the single server up to the datacenter and out to the cloud, while also incorporating breakthroughs across advanced storage, software-defined networking, virtualization and automation. Windows Server 2012 delivers hundreds of new features that will help customers achieve a transformational leap in the speed, scale and power of their datacenters and applications. In combination with Windows Azure and System Center 2012, Windows Server 2012 empowers customers to manage and deliver applications and services across private, hosted and public clouds.
Where can I learn more and download it?
Go to the Windows Server 2012 online launch event and, more importantly, download and evaluate Windows Server 2012.
How does this affect SQL Server?
The first question I usually get is which versions of SQL Server are supported on Windows Server 2012. Bob Ward from the support team published a great blog post on this at http://blogs.msdn.com/b/psssql/archive/2012/09/01/installing-sql-server-on-windows-8.aspx. From this post you can see the supported versions:
The next question is whether there are any benefits to running SQL Server on Windows Server 2012, and the good news is that there is a better-together story here. Many enhancements in Windows Server 2012 have a positive effect on SQL Server's performance; below are some of them.
Better Scaling
Better Performance
Better Availability
Better Networking & DR options
Better Storage Support
The above diagram depicts the areas in which Windows Server 2012 and System Center 2012 positively impact the performance, manageability, and scalability of SQL Server 2012.
Self-service BI is getting a lot of attention, and there are many tools on the market from multiple vendors, some with great visualizations that leverage the new in-memory processing models. Remember that when it comes to self-service BI, you want to make it as easy as possible for the people in your organization to explore and mine the data. They shouldn't have to spend a lot of time learning new tools, new syntax, and new ways to load data; instead, they should work with familiar tools they already know. This is where the strengths of Excel 2010 and the free PowerPivot add-in lie: you can build on the deep Excel knowledge base within your organization and provide a faster, easier way to deal with large amounts of data from multiple sources.
I would recommend taking a look at the case study below from The Weather Channel, which had gone down the route of selecting QlikView, backtracked on that implementation, and has now implemented PowerPivot.
http://www.microsoft.com/casestudies/Microsoft-SQL-Server-2012-Enterprise/The-Weather-Channel/Leading-Global-Weather-Company-Saves-160-Hours-Monthly-Boosts-Efficiency-and-Control/710000001251.
Initially, The Weather Channel planned to implement QlikView as a BI tool. However, after experiencing scalability issues, the company decided to switch to a solution based on Microsoft BI tools including Power View, a new BI tool in Microsoft SQL Server 2012 Reporting Services. “We wanted to bring in data from multiple sources and add features that were either not possible or not easy to implement with QlikView,” says Drooker. “Another benefit of the Microsoft BI solution is that training is minimal. Why add complexity and increase costs with a third-party solution when we can use SQL Server 2012 and Power View to simplify integration with our existing technologies?”
Remember that for your self-service BI solution to be truly self-service, you want to make it as easy as possible for the people in your organization who work with the data to derive insight from it.
I recently attended internal training on Microsoft Big Data presented by our partner in this space, Hortonworks. I expected a steep learning curve, since the world of Big Data seemed like a paradigm shift from the relational world; it turns out it isn't that remarkably different. The Hortonworks team did a fantastic job of walking us through the Apache Hadoop world and its various components. I would definitely recommend attending one of their courses through Hortonworks University.
So you may be asking: what did I learn? And, more importantly, how can you start exploring the world of Big Data/Hadoop, and what are some practical scenarios?
Let's start with the main question first: what is Big Data?
To most people, Big Data is about the size of the data: terabytes or petabytes of it. The volume of data is only one aspect, however; Big Data is commonly summarized as the three Vs:
1) Volume – typically thought of in the range of petabytes and exabytes, although it could just as well be GBs or TBs of data that happens to be semi- or multi-structured.
2) Variety – any type of data.
3) Velocity – the speed at which the data is collected.
There are many different technological approaches to help organizations deal with Big Data. Like Betamax vs. VHS in the videotape world, or Blu-ray vs. HD DVD, they achieve the same outcome through different technological implementations. And just as Blu-ray won out over HD DVD, the industry now sees Apache Hadoop as the standard for Big Data solutions. Hence Microsoft has also adopted it and is partnering heavily to implement Hadoop on both Windows Server (on-premises) and Windows Azure (public cloud). This is despite Microsoft having its own Big Data implementations, such as what we run for Bing.com and our High Performance Computing group's work on LINQ for Dryad.
So what is Hadoop?
Basically, Hadoop is the open source platform that enables you to store and work with Big Data. It grew out of work at Google and Yahoo and is now maintained as an open source Apache project, with Hortonworks being one of the leading contributors. Its main components are:
As you will see from the diagram below, many other components work alongside the core components above. The main ones we will explore in this blog post are:
Where does Microsoft fit into this picture?
From the diagram above you can see that there are three main areas where Microsoft is extending and contributing to the Apache Hadoop project. They are as follows:
Having Hadoop run on Windows means it can easily be managed like the rest of your existing Windows-based infrastructure, through management tools like System Center and integration with Active Directory for security and single sign-on. Having it run in the cloud means you don't have to manage the infrastructure at all; you can simply start using its power for your data analysis, with all provisioning and management taken care of for you.
In addition to these direct technological investments, another major area where Microsoft is adding value is data enrichment, making it easier for you to create insights and make decisions. This means connecting your data, whether it resides in Hadoop or in the structured (RDBMS) world, to services such as those being made available in Excel 2013 that suggest other data sets, such as address or demographic data, to enrich your internal data.
Is there use for Hadoop in my Enterprise?
As the Apache Hadoop project is open source and came out of Web 2.0/startup companies, the question of its suitability for the enterprise does arise. With the Microsoft and Hortonworks partnership, questions around the supportability and stability of open source projects should no longer be an issue; you should instead be thinking of scenarios where these technologies can give your business a competitive advantage.
Analyzing your organization's structured and multi-structured data can help you create differentiation: crafting new offers and better managing marketing campaigns, leading to reduced waste and improved returns, and helping you keep your customers longer (remember that acquiring a customer costs more than retaining one). It is analytics that can transform your business; the rest of IT is just about making it more efficient to run. Below is a table outlining some scenarios for using Big Data across different industries.
Web & E-tailing
Hadoop really shouldn't be put in the too-hard basket; it is like using any other data platform product. The mind-shift is that instead of modeling the data up front, you ingest data in its raw format into HDFS and then run MapReduce jobs over it. The iterations now take place in your MapReduce jobs instead of in the ETL/data model design. This flexibility makes it easier to work with large, multi-structured data.
Enough blurb, let's get our hands dirty!
Go to https://www.hadooponazure.com/ and sign up for access to the free preview of the Hadoop on Windows Azure. All you need is a Microsoft Account (formerly Live, Hotmail accounts).
In this blog post I showcase how the Hadoop equivalent of a Hello World program – Word Count – can be leveraged to profile and segment customers. In the example below I look at five years of banking transactions to see if we can profile the customer.
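Before running it on the cluster, it helps to see the word-count logic itself. Below is a minimal Python sketch of what the WordCount job computes, applied to a few made-up transaction descriptions (the sample strings are hypothetical; the real demo reads the TranData*.csv files from HDFS):

```python
from collections import Counter

# Hypothetical bank transaction descriptions; the real demo uses
# TranData*.csv files uploaded to HDFS.
transactions = [
    "WOOLWORTHS SYDNEY",
    "CALTEX FUEL SYDNEY",
    "WOOLWORTHS SYDNEY",
    "NETFLIX SUBSCRIPTION",
]

# The "map" step emits (word, 1) pairs; the "reduce" step sums the
# counts per word. Counter does both in one pass here.
counts = Counter(word for line in transactions for word in line.split())

for word, freq in counts.most_common(3):
    print(word, freq)
```

Frequent description words are exactly what we will later use to profile the customer.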
1) Sign in and Request a Cluster
About 10 minutes later the cluster is created and deployed with all the right software. This automated deployment does quite a lot in the background: because the Apache Hadoop project is open source, there are many forks in its development and many versions. The Microsoft and Hortonworks partnership tests and ensures the appropriate combinations of versions of the different components, such as the HDFS version most compatible with the Hive and Pig versions, etc.
2) Remote Desktop to the head node of the Hadoop cluster
3) View the financial transactions
4) For this demo we will use the Word Count MapReduce Java code. I have made it available at the following location - WordCount.java - http://sdrv.ms/PdsFP7. It is also available in the samples once you log in to your cluster.
a) Open the Hadoop command prompt and set the environment paths
b) Test it out by invoking the javac compiler
c) Create a directory in HDFS to upload the financial transaction files into
c:\Demo>hadoop fs -mkdir demo/input/
d) Upload the files into HDFS
c:\Demo>hadoop fs -put TranData* demo/input/
e) View the listing (ls) of the files uploaded to HDFS
c:\Demo>hadoop fs -ls demo/input/
f) To output the contents of the files uploaded to HDFS use the –cat option
hadoop fs -cat demo/input/TranData1.csv
g) Let’s compile the code and make a jar file from the CLASS files.
javac WordCount.java
h) Create a jar file that includes the java program you just compiled:
jar cvf wc.jar *.class
i) Execute the MapReduce job; its results will be created in the demo/output/wordcount directory in HDFS
hadoop jar wc.jar WordCount demo/input/FinancialCSVData.csv demo/output/wordcount
j) View the output file
hadoop fs -cat demo/output/wcDesc/part-00000
As you can see, we were able to query the data quickly; we didn't have to create tables, build ETL, and then write the appropriate queries. You can also see there was a lot of rubbish in the output, so it is not a silver bullet either; this is where you now iterate on your MapReduce program to improve the results and output as required.
Does this mean I have to be an expert Java programmer?
No, it doesn't, and this is where Pig and Hive come into play. In fact, you will find that about 70% of your queries can be expressed in Pig or Hive; only for more complex processing will you be required to use Java.
1) To access Pig, type pig at the Hadoop command prompt:
a) Load the file into PigStorage, which sits on top of HDFS. You can also optionally assign a schema to the data, as defined after the AS keyword below. There are limited data types; don't expect the full range available to you in the relational world, and remember that Hadoop is relatively new (Hadoop 2.0 is in the works).
grunt> A = LOAD 'pigdemo/input/FinancialCSVData.csv' using PigStorage(',') AS (Trandate:chararray, Amount:chararray, Description: chararray,Balance:chararray) ;
b) Use the DESCRIBE command to view the schema of your data set labeled A.
grunt> DESCRIBE A;
A: {Trandate: chararray, Amount: chararray, Description: chararray, Balance: chararray}
c) Use the DUMP command to execute a MapReduce job and view the data
grunt> DUMP A;
d) Use STORE command to output the data into a directory in HDFS
grunt> STORE A INTO 'pigdemo/output/A';
e) You can FILTER and JOIN the data, and keep creating additional data sets to analyze the data as required.
grunt> B = FILTER A BY Trandate == '29/12/2011';
grunt> DUMP B;
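Conceptually, the Pig pipeline above is just "load with a schema, then filter". For readers more comfortable in a general-purpose language, here is a rough Python equivalent; the sample rows are made up, since the contents of FinancialCSVData.csv are not shown:

```python
import csv
from io import StringIO

# Made-up rows in the same shape as the Pig schema:
# (Trandate, Amount, Description, Balance), all treated as strings.
raw = """29/12/2011,-54.20,WOOLWORTHS SYDNEY,1200.55
30/12/2011,-12.00,CALTEX FUEL,1188.55
29/12/2011,-9.99,NETFLIX,1178.56
"""

# LOAD ... USING PigStorage(',') AS (Trandate, Amount, Description, Balance)
A = [dict(zip(("Trandate", "Amount", "Description", "Balance"), row))
     for row in csv.reader(StringIO(raw))]

# B = FILTER A BY Trandate == '29/12/2011';
B = [r for r in A if r["Trandate"] == "29/12/2011"]

for r in B:  # DUMP B;
    print(r["Trandate"], r["Description"])
```

The difference, of course, is that Pig compiles the same logic down to MapReduce jobs that run across the cluster rather than on one machine.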
For full capabilities of Pig refer to the Pig Latin Reference Manual - http://pig.apache.org/docs/r0.7.0/index.html
If all this command-line interaction is making you queasy, you might ask: is there a graphical interface?
There are UIs that are provided on the cluster and through the portal on Azure.
Below are the default pages made available by the Hadoop implementation; they allow you to view the status of the cluster and the jobs that have executed, and also to browse the HDFS file system.
The screenshots below showcase the ability to browse the file system through a web browser, including viewing the output.
Hadoop on Azure Portal
The portal has an interactive console that enables you to run MapReduce jobs and Hive queries without having to remote desktop into the head node of the cluster. This provides a more user-friendly interface than just a command prompt. It also includes some rudimentary graphing capability.
Below is a screenshot of the JavaScript console, allowing you to upload files and Map Reduce jobs to execute.
Below is an interface that allows you to create, execute, and monitor MapReduce jobs.
Hive
For running the Hive queries I used the Interactive Console in the Hadoop on Azure portal.
1) As Hive provides a DW-style infrastructure on top of MapReduce, the first thing required is to create a table and load data into it.
CREATE TABLE TransactionAnalysis(TransactionDescription string, Freq int) row format delimited fields terminated by "\t";
Logging initialized using configuration in file:/C:/Apps/dist/conf/hive-log4j.properties
Hive history file=C:\Apps\dist\logs\history/hive_job_log_spawar_201208260041_1603736117.txt
OK
Time taken: 4.141 seconds
2) View the schema of the table
describe TransactionAnalysis;
transactiondescription	string
freq	int
Logging initialized using configuration in file:/C:/Apps/dist/conf/hive-log4j.properties
Hive history file=C:\Apps\dist\logs\history/hive_job_log_spawar_201208260042_957531449.txt
OK
Time taken: 3.984 seconds
3) Load the data into the table
load data inpath '/user/spawar/demo/output/wcDesc/part-00000' into table TransactionAnalysis;
Logging initialized using configuration in file:/C:/Apps/dist/conf/hive-log4j.properties
Hive history file=C:\Apps\dist\logs\history/hive_job_log_spawar_201208260046_424886560.txt
Loading data to table default.transactionanalysis
OK
Time taken: 4.203 seconds
4) Run HiveQL queries:
select * from TransactionAnalysis;
select * from TransactionAnalysis order by Freq DESC;
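Part of Hive's appeal is that these are ordinary SQL-style statements, so the same query shape works against any SQL engine. As a quick local illustration, here is the equivalent run against SQLite from Python, with invented frequency values standing in for the word-count output:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Stand-in for the Hive table loaded from the word-count output;
# the rows here are invented for illustration.
cur.execute("CREATE TABLE TransactionAnalysis"
            "(TransactionDescription TEXT, Freq INTEGER)")
cur.executemany("INSERT INTO TransactionAnalysis VALUES (?, ?)",
                [("WOOLWORTHS", 42), ("CALTEX", 17), ("NETFLIX", 60)])

# Same shape as the HiveQL query in step 4.
rows = cur.execute("SELECT * FROM TransactionAnalysis "
                   "ORDER BY Freq DESC").fetchall()
print(rows[0])  # highest-frequency description first
```

The difference is purely in scale: Hive compiles this query into MapReduce jobs over HDFS data rather than reading a local file.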
For more documentation on Hive see - http://hive.apache.org/
Use Excel to access data in Hadoop
1) On the Hadoop on Azure portal, go to the Downloads link and download the ODBC Driver for Hive (use the appropriate version for your OS, i.e. 32-bit vs. 64-bit):
2) Install the ODBC Driver
3) Open the ODBC Server port for your Hadoop cluster by going to the Open Ports link on the Hadoop on Azure portal
4) Create a User DSN using ODBC Data Source Administrator
5) Enter the name of the Hadoop cluster you created, along with your username
6) Start Excel. On the ribbon, select Data and you will see the Hive pane added; click it and a task pane will open on the right-hand side.
Note: I am using Excel 2013, however the ODBC Hive Driver also works on Excel 2010.
7) After selecting the ODBC User DSN in the drop-down box, enter the password. Once connected, select the table from which you wish to get the data. Use the task pane to build the query, or open the last drop-down, labeled HiveQL, and write your own query.
8) Click Execute Query; this retrieves the data and makes it available in Excel
Now that the data is in Excel, you can analyse it and mash it up with your other structured data using PowerPivot to answer questions as required. As I used Excel 2013, it guides me through the process by providing Quick Explore options to visualize the data with data bars and appropriate graphs.
Looking at the frequency of transactions you can quickly tell the makeup of this customer:
Very quickly we were able to get a lot of information about this customer.
With the power of Power View in Excel, you can now also do powerful analysis, such as geospatial analysis, without having to know geographic coordinates.
For more detailed information on the versions of Hadoop running on Azure, and for further great tutorials, see this blog post from the product team - http://blogs.msdn.com/b/hpctrekker/archive/2012/08/22/hadoop-on-windowsazure-updated.aspx
When talking to customers about Big Data, a question I often get is: how do you back up Big Data?
This is a very valid question. One thing to note about HDFS is that it provides high availability by default by automatically creating three replicas of the data. This protects you against hardware failure, data corruption, etc.; however, it doesn't protect you against logical errors such as someone accidentally overwriting the data.
To protect against logical errors you need to keep the raw data so that you can load it again if required, and/or keep multiple copies of the data in HDFS at different states. This requires you to size your Hadoop cluster appropriately, so that it has space not only for your requirements but also for your backups.
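For reference, the replication factor is a standard HDFS setting. A typical hdfs-site.xml entry looks like the sketch below (3 is the HDFS default; it is shown here only to illustrate where the knob lives, and any higher value multiplies your storage sizing accordingly):

```xml
<property>
  <name>dfs.replication</name>
  <value>3</value>
  <description>Number of block replicas HDFS keeps for each file.</description>
</property>
```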
Thank you to those who submitted some great sessions this year. It was always going to be a difficult task to select the final 11 sessions for the Database and BI track. Congratulations to those who have been selected; we look forward to working with you on finalizing the presentations.
Attendees, please ensure you start building your schedule and selecting your sessions. You can view the full catalog at - https://australia.msteched.com/topic/list/. Below are the Database & BI related ones:
Look forward to seeing you in Gold Coast.
Data Quality Services in Enterprise Info Management
Track Database and Business Intelligence
Type Track Session
James Beresford
This session introduces the basic concepts of SQL Server Data Quality Services (DQS) and how to use it as part of building Credible, Consistent Data. It will cover the concepts and terminology that the product uses and the basics of how to use it to cleanse data. The session will also showcase the automation of Data Quality processes via its integration with SQL Server Integration Services (SSIS), and how it can be used to help manage reference data in Master Data Services (MDS).
DBI224
Killer real world PowerPivot examples part II
Grant Paisley
This session is even more jam-packed with amazing PowerPivot report examples than last year's entertaining presentation. This is *the* session to attend if you want to get the most out of PowerPivot visualizations. Essentially you will see real-world PowerPivot reports from clients in Education, Retail, Banking, and Telco, and learn a variety of visualization techniques including sparklines, slicers, and charts. You will leave this fun session armed with plenty of ideas for your next personal BI project. Oh, and watch out for flying koalas.
DBI225
Big Data for Relational Practitioners
Len Wyatt
You’ve heard the hype about Big Data, Hadoop, and so on… How does this relate to you? This session will describe how the Hadoop ecosystem relates to the ETL, Data Warehousing and BI that you are already doing. We’ll discuss what no-SQL means in practice: What is gained and what is lost. There will be some examples showing Hadoop tools like Pig, Hive and Sqoop, and when you might use them. Finally, what skills should you focus on to make the transition from relational expert to Big Data expert?
DBI226
Data Mining for Fun and Profit
Kevin Clarke, Russ Blake
SQL Server Data Mining is the crown jewel of the Microsoft BI stack, delivering Best of Breed data mining algorithms packaged for ready use. But these powerful tools remain neglected on the shelf of many shops in spite of the fact that, according to Gartner Group, BI is the top priority of IT organisations over the next 5 years, with the biggest skills gap in the data mining area. We’ll demystify the data mining algorithms, terminology and tools so you can apply the technology with ease to help your organisation look into its future and adapt as needed in the present, all without any need for a Cube. EB Games will join us to discuss how they have used Data Mining to improve their countrywide retailing operations.
DBI234
Scalable SQL Server design with just a Credit Card
Peter Ward
The traditional approach to database scalability has been to simply buy a bigger box. However with the ongoing data explosion that is occurring in organisations today there is a need to re-think how to achieve scalability for the database platform. In this session we will explore how to deploy Microsoft SQL Server into a Private Cloud and how to extend this scalability as a Hybrid solution using the Azure platform. With nothing more than a credit card and the Azure platform you will see how it is now possible to create highly-available, infinitely scalable SQL Server environments that support automated deployment and elastic scale.
DBI312
Keeping the Lights On with SQL Server 2012 AlwaysOn
Warwick Rudd
In today’s ever-changing environment, the requirements for high availability and disaster recovery are becoming more complex. Are you able to meet these requirements with your current SQL Server stack? With the release of SQL Server 2012, the high availability/disaster recovery feature AlwaysOn provides a more granular approach and greater control over your environment. In this presentation we will look at how to configure, set up, and monitor SQL Server 2012 AlwaysOn.
DBI313
Turbocharging a Warehouse with Columnstore Indexes
Do you have data warehouse queries that run too long? Adding a columnstore index can provide a tremendous performance boost. Do you know why? Do you know when to create a columnstore index and how to get the best performance from your columnstore indexes? In this session we’ll address how columnstore indexes speed up queries, best practices for creating and using columnstore indexes, and how to diagnose and treat potential issues.
DBI315
Develop a SSAS Tabular Project BI Semantic Model
Peter Myers
In this session learn how to develop and deploy a tabular project BI Semantic Model – a new data modelling approach made available to developers in SQL Server 2012 Analysis Services. The session will cover all design features including how to load and relate data, how to enrich the model with hierarchies and calculations, and how to secure and process the model data. Additionally, it will introduce and cover Data Analysis Expressions (DAX) and how it can be used to define calculated columns, measures and KPIs. A specific emphasis will be made on the new DAX functions available in the SQL Server 2012 release. Finally, and with due emphasis, theory and discussion to help you decide when the tabular project BI Semantic Model is the appropriate modelling choice will be covered.
DBI321
SQL Server Database Private Cloud Deep Dive
Danny Tambs
In this session, learn about the DPC (Database Private Cloud) Reference Architecture and how it can be used to consolidate thousands of databases on a single scalable platform. The session will drill into the hardware and software configuration and cover how its wired together. We will cover the use cases, savings, workload types and migrating workloads to the solution as well as discuss management options.
DBI332
SQL Server Warehousing (Fast Track 4.0 & PDW)
Matthew Winter, Stephen Strong
What's in Windows Server 2012 for SQL Server
James Crawshaw, Raja Narayanaswamy
Learn about the key new Windows Server 2012 features that complement SQL Server in high availability, performance and scalability, mobility, and security. The session will drill into the major enhancements, such as SMB (SMB Transparent Failover, SMB Multichannel, SMB Direct (Remote Direct Memory Access), SMB Scale-Out, increased performance, and more) and Hyper-V (replication for HA, plus scalability, mobility, and security enhancements).
In case you missed this announcement, I am including the links below to free ebooks from Microsoft Press. Don't worry: just because they are free doesn't mean they cover outdated technologies. They are actually about current versions such as Windows Server 2012, SQL Server 2012, etc.
Please leverage this great resource as you start to master the new Microsoft technologies.
http://blogs.msdn.com/b/mssmallbiz/archive/2012/07/27/large-collection-of-free-microsoft-ebooks-for-you-including-sharepoint-visual-studio-windows-phone-windows-8-office-365-office-2010-sql-server-2012-azure-and-more.aspx
and
http://blogs.msdn.com/b/mssmallbiz/archive/2012/07/30/another-large-collection-of-free-microsoft-ebooks-and-resource-kits-for-you-including-sharepoint-2013-office-2013-office-365-duet-2-0-azure-cloud-windows-phone-lync-dynamics-crm-and-more.aspx?wa=wsignin1.0
After setting up Red Hat Linux 6.3 in a Hyper-V image, follow the instructions outlined on TechNet - http://technet.microsoft.com/en-us/library/hh568449.aspx - to install the driver manager.
1) Go to unixODBC (http://www.unixodbc.org/) page and select Download
Download the unixODBC-2.3.0 version.
4) Start a Terminal session
Note: Ensure that gcc (C compiler) is installed and in PATH
Type the following commands, ensuring you are running as the root user and are in the directory where the ODBC driver files were extracted.
CPPFLAGS="-DSIZEOF_LONG_INT=8"
No messages will be seen after executing this command
export CPPFLAGS
./configure --prefix=/usr --libdir=/usr/lib64 --sysconfdir=/etc --enable-gui=no --enable-drivers=no --enable-iconv --with-iconv-char-enc=UTF8 --with-iconv-ucode-enc=UTF16LE
There will be compilation activity taking place and you will see the following screen:
Type make
There will be lots of messages similar to those on the screen
Type make install (this requires that you are logged in as the root user)
Now we can install the SQL Server ODBC Driver for Linux. For more information see - http://technet.microsoft.com/en-us/library/hh568454.aspx
1) Extract the native client files. These can be downloaded from - http://www.microsoft.com/en-us/download/details.aspx?id=28160. Ensure you download the appropriate version for your Red Hat version.
2) Open a Terminal window – change directory to the location where the files were extracted
Type ./install.sh – it will output information as per the screenshot above. Next, type ./install.sh verify; it will indicate whether the computer has all the components required to install the driver for Linux.
If it comes back with OK for everything, we can proceed with the installation.
Type ./install.sh install
License agreement displays
Type YES if you agree to the license and to complete the installation
Run the following command to verify that the SQL Server ODBC Driver for Linux was registered successfully: odbcinst -q -d -n "SQL Server Native Client 11.0"
Test out connectivity to SQL Server using sqlcmd or Sample C++ code which can be found at - http://blogs.msdn.com/b/sqlblog/archive/2012/01/26/use-existing-msdn-c-odbc-samples-for-microsoft-linux-odbc-driver.aspx.
I used sqlcmd as per below screenshot:
Now your Linux-based applications can enjoy the use of SQL Server 2012, so don't let a Linux application requirement stop you from using your favorite database system – SQL Server 2012.
Well there are many reasons for this question and I could go on. However in short it is not only because of the innovative technology we bring out such as Windows, Kinect, Surface, Windows Phone and of course SQL Server along with many others – it is the solutions that people dream and implement using this great technology.
This week I have the pleasure to be a volunteer at the Imagine Cup finals taking place in Sydney, Australia. I am really excited about this as I get to meet some of the brilliant university student minds from over 80 countries and see the brilliant solutions they have created leveraging Microsoft technology. It is going to a humbling experience, one I hope to learn a lot from.
Take a look at the brilliant solution by the Australian students called StethoCloud, which enables better detection of pneumonia and could potentially help prevent the death of a child every two minutes. Another example is SmartHouse, which assists the elderly with medical consultation, emergency situations and monitoring. Take a look at all of the great entries at http://www.imaginecup.com/ and be inspired.
If you happen to be in and around the Darling Harbour (Sydney) area this weekend (6-9th July 2012), chances are you will see some of these brilliant university students as they prepare for the final presentations. Sydneysiders, please make them feel welcome to our great country – for some of these students it's their first trip overseas, so show them how great Australians are.
With the launch of the Virtual Machine preview on Windows Azure, along with Media Services and Websites, I have been busy setting up a few environments. It has been really easy to spin up new VMs quickly – literally in a few clicks once you have registered for the Virtual Machine preview. What's even better is that it is free for 90 days.
To quickly start experiencing the benefits of SQL Server 2012, go to www.windowsazure.com, click the Free Trial button and sign up – it's that easy.
Once you have signed up for the free trial, you will get access to the Metro Portal:
Click +New at the bottom left-hand corner.
Then you can click either Quick Create (which, as the name implies, asks only for minimal information such as the name, template and sizing) or the gallery option, which provides more advanced options. Clicking Quick Create allows you to spin up a SQL Server 2012 VM very quickly:
Once you click Create Virtual Machine, it takes about 5-10 minutes until the VM is ready to connect to.
You can then select the virtual machine from the portal to get a dashboard view of the configuration and performance stats; at the bottom are the various commands you can run, including connecting (RDP) to the VM.
Once connected, you can start working with familiar tools such as SQL Server Management Studio. The image also contains the full setup files (C:\SQLServer_11.0_Full) of SQL Server 2012 Evaluation edition, so you can add or remove features as needed.
Hope this quick tour of the new Virtual Machine capability inspires you to create quick lab/development environments in Windows Azure. For more information, please visit our wiki:
http://social.technet.microsoft.com/wiki/contents/articles/11554.sql-server-in-windows-azure-virtual-machine-early-adoption-cook-book-en-us.aspx
Planning for TechEd Australia 2012 is well underway, and this year I have the opportunity to be the technical track owner for the Database & BI track, along with my colleagues Raja and Simon Brown.
TechEd wouldn't be a success without great content from the field, and as such it presents an opportunity to showcase your expertise, creativity and passion for Database & BI implementations and learnings.
Please submit your sessions using the Call for Topics tool at: http://teau12.eventpoint.com/cft
As you know this year is a big year for Microsoft in regards to product releases and focus areas. With that in mind we have changed our track listing slightly to be more focused on 2 areas:
Also feel free to let your colleagues know about the other track sessions that will be available at TechEd Australia this year:
· Windows Azure
· Database and Business Intelligence
· Developer Tools, Languages and Frameworks
· Office, Office 365 and SharePoint
· Security, Identity and Management
· Exchange and Lync
· Virtualization
· Windows Client
· Windows Phone
· Windows Server
The Call for Topics tool is open from today, May 22nd 2012, until Monday, June 18th 2012. Get those creative juices flowing and start submitting sessions. When you are submitting sessions, please keep the following in mind:
To enhance the success of your session submission, please prepare your content to ensure it provides TechEd attendees with comprehensive technical readiness that inspires them to succeed in their careers. Your content should also align with the content themes for TechEd 2012, which are:
For Developers: TechEd is the place to get ready to create Windows 8 Metro style apps.
For IT Professionals: TechEd will provide an accelerated path to the skills they need to deploy, manage and secure a Private Cloud within their organisation.
Submissions that complement the 2012 content themes will be recognised throughout the approval process.
Below are some additional tips for a successful submission:
Good luck with your submissions – we look forward to seeing some great sessions submitted!
This document takes you through the essential technical details for planning and testing an upgrade of existing SQL Server 2005, 2008, and 2008 R2 instances to SQL Server 2012. You will be presented with best practices for preparation, planning, pre-upgrade tasks, and post-upgrade tasks. All the SQL Server components are covered, each in its own chapter.
Download the technical guide.
The managed self-service business intelligence (BI) capabilities of SQL Server 2012 and SharePoint Server 2010 make it easier than ever for business users to create and share rich, powerful BI solutions through familiar Microsoft Office applications, while allowing IT administrators to efficiently monitor the BI infrastructure.
Download the whitepaper to explore the scenarios of implementing managed self-service BI capabilities using private cloud technologies within System Center 2012. This can reduce the time and resources that organizations need to rapidly provision virtualized BI solutions—and to return the resources when they are no longer needed.
Thank you to those who attended my SQL Saturday presentation last Saturday at Epping Boys High School. It was a very well organised event by Grant Paisley, pictured below enjoying one of the many boxes of pizza made available to attendees during the lunch break.
There are still more SQL Saturday events to go – Adelaide and Perth – so make sure you are out there. For more information, see http://www.sqlsaturday.com/
Link to download the deck from my presentation: https://skydrive.live.com/redir.aspx?cid=895a9e8d1c0f1cd0&resid=895A9E8D1C0F1CD0!13531&parid=895A9E8D1C0F1CD0!3278&authkey=!AKldpKhT-kvHEeE
If you really have a requirement to scale out your database workload today, one option available to you is the public cloud with SQL Azure. This lets you future-proof your architecture and manage scale without the complexity and overhead of managing the infrastructure: through Federation you can scale to a much larger number of nodes, such as hundreds, as opposed to Oracle RAC's limitation of fewer than 10-20 nodes.
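To make the Federation idea concrete: each federation member owns a range of the federation key, and requests are routed to the member whose range covers the key. Here is a toy routing sketch (the member names and ranges are invented for illustration – in SQL Azure itself you route with the USE FEDERATION T-SQL statement):

```python
# Invented member databases and key ranges, for illustration only
MEMBERS = [
    (0, 100_000, "member_db_0"),
    (100_000, 200_000, "member_db_1"),
    (200_000, 300_000, "member_db_2"),
]

def route(customer_id):
    """Return the federation member whose key range covers customer_id."""
    for low, high, member in MEMBERS:
        if low <= customer_id < high:
            return member
    raise KeyError(f"no federation member covers key {customer_id}")

print(route(150_000))  # member_db_1
```

Splitting a range into two members is how this model scales out to hundreds of nodes without the application needing to know where each row physically lives.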
On-premises SQL Server also scales tremendously well: the latest 10-core Intel architecture lets you address 256 logical processors, and beyond that when Windows Server 8 becomes available. SQL Server 2012's AlwaysOn capability provides high availability, disaster recovery and active secondaries, enabling your application to be up for the required nines.
Following are some customers who switched to SQL Server after implementing or evaluating Oracle RAC:
· City of Virginia Beach: “Based on our experience with Oracle RAC, we knew that a Microsoft high-availability solution would be much easier to implement, simpler to support, and less expensive. The licensing cost for SQL Server 2012 for our particular configuration is about $120,000 less than for an Oracle RAC solution.” - Elena Balitsky, Team Leader, Database Administration Group
· Carter Holt Harvey: Building manufacturer chose SQL Server after looking at Oracle RAC on Linux, Oracle RAC on Unix, and SQL Server on Windows. “SQL Server on Windows has proven to be highly cost effective compared with our previous system” – Chris Lowe, Architect
· Powerco: “When we first set up Oracle we configured Oracle RAC to support active/active clusters, high availability and automatic failover. However, Microsoft SQL Server has many of the same features and we can get the same redundancy from SQL Server as we can from Oracle. We’ve saved a significant amount of money, almost $390,000 per annum, in migrating from Oracle to SQL Server. We’ve been able to save on rack space, reduce hardware, maintenance and licensing costs and have created numerous efficiencies through simplifying the server environment” – Huw Griffiths, Infrastructure Manager
RELATED CONTENTS:
· Why Not Oracle RAC white paper
· Scaling Out SQL Server 2012 white paper
· SQL Server 2012 Gives You More Advanced Features (Out-Of-The-Box) white paper
· 3rd party Oracle Exadata Comparison white paper
· SQL Server 2012 Licensing Value vs. Oracle white paper
· 3rd party SQL Server 2012 Has Better Value Than Oracle article
· 3rd party Oracle Database Appliance Compete white paper
Exciting news! SQL Server 2012 has been released to manufacturing. You can download an evaluation of the product today, and general availability begins on April 1. (Note: the site is still being updated, so try again tomorrow for the link to the Evaluation Edition.)
We also announced today an additional preview of an Apache Hadoop-based service on Windows Azure that will be available in the first half of 2012 to connect SQL Server and integrated business intelligence tools with unstructured data. For more information about today’s news, check out Microsoft corporate vice president Ted Kummert’s post on the Official Microsoft Blog, and for even more information, check out the STB News Bytes Blog.
New benchmarks published by hardware and software partners on the SQL Server 2012 release further validate SQL Server as the data platform on which you can bet your business. To find out more about why SQL Server 2012 is the industry leader in performance, refer here.
SQL Server 2012 continues to be a value-leading data platform. A new Forrester Consulting Total Economic Impact (TEI) study, commissioned by Microsoft, examines the potential benefits of upgrading to SQL Server 2012. The study reports a potential return on investment (ROI) of up to 189% with a 12-month payback period.
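As a reminder of what a TEI-style ROI figure means, the calculation is simply net benefit over cost. A quick sketch (the dollar amounts below are invented for illustration, not taken from the Forrester study):

```python
def roi_percent(total_benefits, total_costs):
    """ROI: net benefit expressed as a percentage of total cost."""
    return (total_benefits - total_costs) / total_costs * 100

# Hypothetical figures: an ROI of 189% means benefits of roughly 2.89x the cost
benefits = 578_000
costs = 200_000
print(roi_percent(benefits, costs))  # 189.0
```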
Tomorrow, don’t forget to attend the SQL Server 2012 Virtual Launch Event, featuring keynotes by Ted Kummert and Quentin Clark and a number of video presentations about SQL Server 2012 by other SQL Server experts.
Start evaluating SQL Server 2012 today.
PASS SQLSaturdays are free one-day training events for SQL Server professionals that focus on local speakers, provide a variety of high-quality technical sessions, and make it all happen through the efforts of volunteers. Whether you're attending a SQLSaturday or thinking about hosting your own, we think you'll find it's a great way to spend a Saturday – or any day.
Here are the registration links, dates and locations planned for SQLSaturday events in Australia. Admission is free; all costs are covered by donations and sponsorships. Please register soon, as seating is limited, and let friends and colleagues know about the event:
SQLSaturday #137 - Canberra 2012 Canberra Club, 45 West Row, Canberra City, ACT, 2601, Australia
SQLSaturday #138 - Sydney 2012 Epping Boys High School (EBHS), 213 Vimiera Road, Eastwood, NSW, 2122, Australia
SQLSaturday #139 - Adelaide 2012 Edge Church, Hindley St, Adelaide, SA, 5000, Australia
SQLSaturday #140 - Perth 2012 New Horizons Learning Centre, Level 6, 5 Mill St, Perth, WA, 6000, Australia
Have SQLSaturday questions or suggestions? Feel free to contact the PASS SQLSaturday HQ Team at any time!
HP and Microsoft engineering teams have worked together to create a reference architecture that accelerates online transaction processing (OLTP) database workloads with a fully flash-based HP/Microsoft architecture, achieving significant performance increases, simplified database manageability and industry-leading TCO.
The HP OLTP RA, built on HP ProLiant DL980 servers and HP VMA-series 3210 Memory Arrays, provides:
• 350–700K sustained input/output operations per second (IOPS)
• As much as 10x faster time to database production, with faster warm-up times and reduced back-up and re-indexing times
• Reduced operational costs, with a single rack saving on power and cooling requirements
Featuring the PREMA architecture in the HP ProLiant DL980 server:
• Balanced scaling – world-record performance enabled by Smart CPU Caching
• Self-healing resiliency – resilient system fabric to maximize uptime
• Breakthrough efficiencies – a common and integrated experience with iLO3, Insight Control, and Thermal Logic
The HP OLTP RA is an enterprise-class scalable architecture optimized for Microsoft SQL Server 2008 R2. With the Non-Uniform Memory Access (NUMA) architecture, you can deploy SQL Server instances or databases in NUMA nodes. Powered by SQL Server 2008 R2, the HP OLTP RA combines the advantage of multi-processor servers with query parallelism to bring Tier 1 SQL database features optimized for OLTP workloads. With sufficient CPU power to handle VLDB compression and large re-indexing jobs, the HP ProLiant DL980 server is optimized specifically for SQL Server 2008 R2, yet will be able to take advantage of SQL Server 2012 features when available.
How does it compare to appliances?
Download Datasheet:
HP Enterprise Transaction Processing Reference
Designing a Private Cloud Strategy that meets Business Demands
Your enterprise supports hundreds (or thousands) of applications to meet business demands, and as demand increases, so does the number of applications. This growth increases the cost of deploying and managing data sources for these applications, putting stress on the IT budget in a way that can stifle further growth. According to IDC, these pressures are setting the stage for major industry shifts, driving the convergence of virtualization, cloud, automation, analytics and the consumerisation of IT. During this event we will tackle this challenge by helping you design a datacenter strategy that provides the scalability of the private cloud needed to support the growing demands of the business with SQL Server 2012.
AGENDA
8:30am - Welcome/Registration
9:00am – Session 1 Commences (120mins)
Session 1: An Application-Focused Approach to Designing your Private Cloud Infrastructure
Why do I need a Private Cloud in my organization? Can the Private Cloud help me deliver applications against service levels demanded by the business? How can I embrace these new technologies while containing my development, deployment and management costs?
In this infrastructure-focused session, we will show you how System Center 2012 can help you take control of your environment and deliver IT as a service to the business. We will show you a self-service model that allows the business to deploy applications and consume data independently of the complexity of the underlying infrastructure. At the same time, we will show you how to design a datacenter infrastructure that benefits from the scalability and efficiency of the Private Cloud.
12:30pm - Welcome/Registration
1:00pm – Session 2 Commences (120mins)
Session 2: Support Growing Application Demands with a Scalable Database Platform
Come to this session to learn about the products and tools available today from Microsoft that can help you adopt a database platform that provides the scalability and efficiency of the Private Cloud.
Learn how to consolidate SQL database workloads to make more efficient use of your compute, network and storage resources, decrease your physical footprint, and reduce your capital and operational expenses. Learn how to scale your resources efficiently and deploy them on demand, while providing standardization and improving compliance. We will discuss how SQL Server will evolve with the release of SQL Server 2012 to support the Private Cloud, as well as the faster deployment and performance provided by the Database Consolidation Appliance.
EVENT DATES
Canberra 28th February
Melbourne 1st March
Adelaide 7th March
Brisbane 8th March
Sydney 9th March
Perth 14th March
CANBERRA
Microsoft Canberra Theatre 1 Walter Turnbull Building Level 2, 44 Sydney Ave Barton, Canberra ACT 2600
MELBOURNE
Microsoft Melbourne Exhibition Room Level 5, 4 Freshwater Pl Southbank, Melbourne VIC 3006
PERTH
Microsoft Perth Ennex 100 Seminar Room Level 3, 100 Georges Tce Perth, WA 6000
ADELAIDE
Microsoft Adelaide Coonawarra Room Level 12, Aurora Building 147 Pirie St Adelaide, SA 5000
SYDNEY
Microsoft Sydney Federation Room 1 Epping Road North Ryde, Sydney NSW 2113
BRISBANE
Microsoft Brisbane North Stradbroke Island Meeting Room Level 28, 400 George St Brisbane, QLD 4000
With the December update to SharePoint 2010, you can now view your existing BI solutions in SharePoint on iPad devices that use the iOS 5 Safari browser.
Users can view PerformancePoint scorecards, analytic charts and grids, Excel Services reports, and SQL Server Reporting Services reports by using the Safari browser on iPad. However, not all kinds of reports can be viewed on iPad. For example, users will be unable to view Visio Services reports or PerformancePoint strategy maps because of client-side requirements that iPad devices cannot support.
For more detailed information, see the following TechNet article: http://technet.microsoft.com/en-au/library/hh697482.aspx#part2
Just in case you missed it over the holiday period, Pej from the SQL Server product team has made Power View available over the Internet, so you can interact with and create your very own reports without having to install anything. All you need is a browser with Silverlight 5 and a Windows Live ID. For more information, see the link below:
http://blogs.msdn.com/b/oneclickbi/archive/2011/12/27/more-demos-of-power-view-available.aspx
To help you evaluate your SQL Server databases' readiness to move to our public offering, SQL Azure, we have made available a new experimental online service – the SQL Azure Compatibility Assessment on SQL Azure Labs. This service checks the compatibility of your schema. To use this service you need:
What you do not need:
What you will get is a report listing the database objects not supported on SQL Azure and the objects that need fixing:
For detailed step-by-step instructions on how to use this service, see the wiki article - http://social.technet.microsoft.com/wiki/contents/articles/6246.aspx
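To give a feel for the kind of check the assessment performs, here is a much-simplified sketch of my own (this is not the actual service logic, and the feature list is an illustrative subset, not exhaustive):

```python
# Illustrative subset of T-SQL features not supported on SQL Azure
UNSUPPORTED_FEATURES = {
    "FILESTREAM": "FILESTREAM storage is not available on SQL Azure",
    "CREATE ASSEMBLY": "CLR assemblies are not supported",
    "CREATE FULLTEXT INDEX": "full-text indexing is not supported",
}

def check_schema(script):
    """Return findings for schema features that would block a SQL Azure migration."""
    text = script.upper()
    return [msg for kw, msg in UNSUPPORTED_FEATURES.items() if kw in text]

schema = "CREATE TABLE Orders (Id INT PRIMARY KEY, Doc VARBINARY(MAX) FILESTREAM)"
for finding in check_schema(schema):
    print(finding)  # FILESTREAM storage is not available on SQL Azure
```

The real service runs a far richer set of rules against your actual schema, which is why uploading a DAC package to it is the way to get a trustworthy answer.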
I recently had the opportunity to present on the topic of BigData at several forums, including the DW 2.0 Asia-Pac Summit and the Microsoft Big Picture event. At both events there were sceptics of BigData in the audience, and so I thought I would write this blog post.
I must partly agree with some of the audience members that “BigData” is a buzzword, or at least that is how it is often portrayed – it can be a good way of signing off bigger budgets on hardware, especially storage. However, if we take some time to understand the origins of the term, it does make sense. Briefly, BigData arose in organisations working with amounts of data beyond the realm of the typical relational data warehouse (beyond terabytes) – such as Internet search engines like Yahoo, Google and Bing (many petabytes across 40,000 compute nodes). They have to store not only the index to the Internet, but also the activity/log information produced by users of their services. Petabytes of information must be stored and queried quickly; the relational paradigm (ACID) does not suit this, and the trade-offs described by the CAP theorem are a better fit. I am not going to go into detail on these here, but a quick Wikipedia search covers both.
One of the popular implementations of this non-relational movement – or NoSQL (“not only SQL”), as it has come to be known – is Hadoop, which started as an open source project driven by Yahoo and based on ideas published by Google. Its basis is a storage layer – a highly distributed file system, HDFS – and a querying/processing methodology known as MapReduce. These NoSQL implementations were required because of the following characteristics of the data:
I have captured some of the above points in a video I also recorded on this topic, so if you would rather just listen, view the video:
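To make the MapReduce part concrete, here is a minimal word-count sketch of the map/shuffle/reduce idea (my own single-machine illustration – a real Hadoop job distributes these phases across HDFS nodes):

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in a line of input."""
    return [(word.lower(), 1) for word in line.split()]

def reduce_phase(pairs):
    """Reduce: sum the counts emitted for each distinct word."""
    counts = defaultdict(int)
    for word, n in pairs:  # in Hadoop, the shuffle groups pairs by key first
        counts[word] += n
    return dict(counts)

log_lines = ["big data big insights", "big compute"]
pairs = [p for line in log_lines for p in map_phase(line)]
print(reduce_phase(pairs))  # {'big': 3, 'data': 1, 'insights': 1, 'compute': 1}
```

Because both phases work on independent chunks, the map step can run on thousands of nodes in parallel, which is what makes petabyte-scale log processing tractable.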
Hopefully that provides you with an insight into BigData. The next question that arises is about the use case in your organisation. The case is clear for Internet startups that generate large web logs or use lots of media (audio and video files), but what about a bricks-and-mortar organisation? In response I give many examples; a simple one is the ability to harvest the vast amounts of social media data out there about a company's brand, products, service and so on, and mine it for great insights. I then refer to a survey conducted by Avanade on the sources of data within organisations. Below is the graph for your reference:
From this you can see that a lot of data comes from unstructured systems such as email, documents and spreadsheets, as opposed to line-of-business application databases and other structured sources. I explain this with a day-to-day example from my role on the sales team at Microsoft. If I were to analyse the lifecycle of a sales opportunity, it would be wrong to take only the data from the CRM/ERP system, as that contains the data only from the point I entered it into the system. Prior to that there would have been many emails exchanged between myself, the account team and the customer, as well as spreadsheets with models reflecting the opportunity size. So the beginnings of the opportunity are completely missed if I analyse the structured data alone. In some cases this might be okay – it could just be noise – but having access to it could reveal, for example, a best practice the account team is using that could be replicated across the organisation. Another example is contracts: most are PDF or Word documents, and being able to extract key metrics from them could help you manage them better.
Below is a table of other scenarios and uses of BigData across different industries/verticals. Hopefully you find something that resonates with your situation.
So what is Microsoft doing in the space of BigData? Well, I am glad you asked, as we have been doing quite a lot here, and we made some major announcements at the 2011 SQL PASS Summit. These were as follows:
What does this all mean? Basically, we will continue to provide end users access to their data in familiar tools such as Excel, Power View and SharePoint to gain insight from it, whether it resides in the structured world of relational databases such as SQL Server or is stored unstructured in Hadoop. Below is a high-level architecture:
To get a better understanding of our strategy and roadmap, I would also refer you to the following Channel 9 video:
To see the Windows Azure Hadoop CTP implementation in action refer to:
Lastly, I think this is an exciting time to be involved with data, with technologies like Hadoop and its implementation in Azure making it very easy to store and then consume BigData and come up with new insights and analytics. Below is a cartoon from geek&poke about leveraging the NoSQL boom:
The new Upgrade Assistant for SQL Server 2012 from our partner Scalability Experts is now available to download and use. It is a free, web-downloadable tool that performs application compatibility testing and detects potential functional and performance issues that may impact a database upgrade from an earlier version of SQL Server (SQL Server 2005, SQL Server 2008 or SQL Server 2008 R2) to SQL Server 2012. Compared to the earlier version of the Upgrade Assistant tool, it significantly enhances replay scalability and performance by building on top of the SQL Server 2012 Distributed Replay (D-Replay) feature. In addition, this version provides a user-friendly configuration interface as well as enhanced reporting and analysis features.
Report Highlights
· Statistics info: summary info in the report gives the user an intuitive view of the playback success rate and the performance difference between the baseline server and the test server
· Enhanced filter capability: filter and error categories enable the user to narrow down the playback result set and improve analysis efficiency
· View events: locate the same event sequence to allow the user to do a 1:1 comparison and analysis
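As a rough sketch of the comparison the summary report performs (my own illustration – the event shape and field names below are invented, not the tool's actual schema):

```python
def playback_summary(baseline, test):
    """Compare two replay runs: success rates and average-duration difference.

    Each run is a list of (succeeded, duration_ms) tuples, one per replayed event.
    """
    def success_rate(events):
        return sum(ok for ok, _ in events) / len(events) * 100

    def avg_duration(events):
        return sum(ms for _, ms in events) / len(events)

    return {
        "baseline_success_pct": success_rate(baseline),
        "test_success_pct": success_rate(test),
        "avg_duration_diff_ms": avg_duration(test) - avg_duration(baseline),
    }

# Invented sample data: three events replayed against each server
baseline = [(True, 10.0), (True, 12.0), (False, 30.0)]
test = [(True, 8.0), (True, 9.0), (True, 25.0)]
print(playback_summary(baseline, test))
```

A negative duration difference like the one above is the kind of signal the report surfaces: the test server replayed the same workload faster than the baseline.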
Resource Links
Tool Download link: http://www.scalabilityexperts.com/tools/downloads.html
Tool Wiki page: http://social.technet.microsoft.com/wiki/contents/articles/sql-server-upgrade-assistant-tool-for-denali.aspx
Provide feedback to the product development teams via http://connect.microsoft.com/sqlserver
Ask questions about this tool or upgrades in general at http://social.msdn.microsoft.com/Forums/en-US/sqlsetupandupgrade/threads
I would like to take this opportunity to thank you for your support & custom during 2011 and wish you a happy and safe festive season, and a prosperous 2012!
I look forward to working with you to make your database and BI dreams blossom with the availability of SQL Server 2012 in the New Year - http://tinyurl.com/sql2012
Happy SQL’ing
sqlman
P.S: Thank you to my mum for letting me put her latest painting in this message.