• SQL Server 2014 In Memory OLTP: Memory-Optimized Table Types and Table Variables

    SQL Server 2014 In-Memory OLTP introduces two new widely advertised kinds of database objects: memory-optimized tables for efficient data access, and natively compiled stored procedures for efficient query processing and business logic execution. However, there is a third kind of object introduced by In-Memory OLTP: the memory-optimized table type.

    The use of memory-optimized table variables has a number of advantages over traditional table variables:

    • The variables are truly in memory: they are guaranteed to never spill to disk.
    • Data access is more efficient due to the same memory-optimized algorithm and data structures used for memory-optimized tables, especially when the variables are used in natively compiled stored procedures.
    • Finally, with memory-optimized table variables there is no tempdb utilization: table variables are not in tempdb, and do not use any resources in tempdb.

    The typical usage scenarios for memory-optimized table variables are:

    • Storing intermediate results and creating single result sets based on multiple queries in natively compiled stored procedures.
    • Passing table-valued parameters into natively compiled stored procedures, as well as traditional interpreted procedures.
    • As replacement for traditional table variables, and in some cases for #temp tables that are local to a stored procedure.  This is particularly useful if there is a lot of tempdb contention in the system.

    Like memory-optimized tables, SQL Server generates a DLL for each memory-optimized table type. As is the case for memory-optimized tables, the DLL includes the functions for access indexes and retrieving data from the table variables. When a memory-optimized table variable is declared based on the table type, an instance of the table and index structures corresponding to the table type is created in the user session. The table variable can then be used in the same way as traditional table variables: you can insert/update/delete rows in the table variable, and you can use the variables in T-SQL queries. You can also pass them into natively compiled as well as traditional stored procedures, as table-valued parameters (TVP).

    The following sample shows a memory-optimized table type from the AdventureWorks-based In-Memory OLTP sample.

    CREATE TYPE [Sales].[SalesOrderDetailType_inmem] AS TABLE(

          [OrderQty] [smallint] NOT NULL,

          [ProductID] [int] NOT NULL,

          [SpecialOfferID] [int] NOT NULL,

          [LocalID] [int] NOT NULL,

          INDEX [IX_ProductID] HASH ([ProductID])

                WITH ( BUCKET_COUNT = 8),

          INDEX [IX_SpecialOfferID] HASH ([SpecialOfferID])

                WITH ( BUCKET_COUNT = 8)

    )

    WITH ( MEMORY_OPTIMIZED = ON )

    As you can see, the syntax is very similar to traditional table variables. A few things to keep in mind:

    • The option MEMORY_OPTIMIZED=ON indicates that the table type is indeed memory-optimized.
    • The type must have at least one index. Like with tables, supported index types are hash and nonclustered.
      • Keep in mind that for a hash index the bucket_count should be about 1-2X the number of expected unique index keys. Over-sizing is usually better than under-sizing: if sometimes you insert only 2 values in the variables, but sometimes insert up to 1000 values, it’s usually better to specify BUCKET_COUNT=1000.
    • The restrictions on memory-optimized tables related to data types and constraints also apply to memory-optimized table types. For example, in SQL Server 2014 default constraints are supported, but check constraints are not.
    • Like memory-optimized tables, table variables do not support parallel plans.
    • Like memory-optimized tables, table variables must fit completely in memory – there is no spill to disk.
    • Traditional table variables exist in tempdb. In contrast, memory-optimized table variables exist in the user database.
    • Memory-optimized table types need to be specified explicitly using a CREATE TYPE statement. Specification in-line with the variable declaration is not supported.

    Table-Valued Parameters

    The following example script shows declaration of a table variable as the memory-optimized table type Sales.SalesOrderDetailType_inmem, insert of three rows into the variable, and passing the variable as a TVP into Sales.usp_InsertSalesOrder_inmem.

    DECLARE @od Sales.SalesOrderDetailType_inmem,

          @SalesOrderID uniqueidentifier,

          @DueDate datetime2 = SYSDATETIME()

     INSERT @od (LocalID, ProductID, OrderQty, SpecialOfferID) VALUES

          (1,888,2,1),

          (2,450,13,1),

          (3,841,1,1)   

    EXEC Sales.usp_InsertSalesOrder_inmem

          @SalesOrderID = @SalesOrderID,

          @DueDate = @DueDate,

          @OnlineOrderFlag = 1,

          @SalesOrderDetails = @od

    Note that memory-optimized table types can be used as the type for stored procedure table-valued parameters (TVPs), and can be referenced by clients in exactly the same way as traditional table types and TVPs. Therefore, invocation of stored procedures with memory-optimized TVPs, and indeed natively compiled stored procedure, works exactly the same way as the invocation of traditional stored procedures with traditional TVPs.

    Cross-Database Queries

    In SQL Server 2014, memory-optimized tables do not support cross-database transactions. This means you cannot access another database from the same transaction or the same query that also accesses a memory-optimized table. This means you cannot straightforwardly copy data from a table in one database, to a memory-optimized table in another database.

    Table variables are not transactional. Therefore, memory-optimized table variables can be used in cross-database queries, and can thus facilitate moving data from one database into memory-optimized tables in another. The idea is to use two transactions: in the first transaction you insert the data from the remote table into the variable; in the second transaction you insert the data into the local memory-optimized table from the variable.

    For example, if you want to copy the row from table t1 in database db1 to table t2 in db2, using variable @v1 of type dbo.tt1, you would use something like:

    USE db2

    GO

    DECLARE @v1 dbo.tt1

    INSERT @v1 SELECT * FROM db1.dbo.t1

    INSERT dbo.t2 SELECT * FROM @v1

    GO

  • Introducing the Microsoft Analytics Platform System – the turnkey appliance for big data analytics

    At the Accelerate your Insights event last week, Satya Nadella introduced the new Microsoft Analytics Platform System (APS) as Microsoft’s solution for delivering “Big Data in a box.” APS is an evolution of our SQL Server Parallel Data Warehouse (PDW) appliance which builds upon the high performance and scale capabilities of that MPP version of SQL Server, and now introduces a dedicated region to the appliance for Hadoop in addition to the SQL Server PDW capabilities. The Hadoop region within the appliance is based on the Hortonworks Data Platform for Windows but adds key capabilities enterprises expect for a Tier 1 appliance such as high availability through the appliance design and Windows Server failover clustering, security through Active Directory and a unified appliance management experience through Systems Center. Completing the APS package and seamlessly unifying the data in SQL Server PDW with data in Hadoop is PolyBase, a ground breaking query technology developed by Dr. David DeWitt and his team in Microsoft’s Grey Systems Labs.

    Microsoft continues to work with industry leading hardware partners Dell, HP and Quanta to deliver APS as a turnkey appliance that also delivers the best value in the industry for a data warehouse appliance.

    Go to the APS product site to learn more or watch the short product introduction video here: 

  • Sentiment Analysis with Microsoft APS and StreamInsight

    In this overview and demo, we will show you what sentiment analysis is and how to build a quick mashup that combines real-time access to multiple data sources using tools from Microsoft.

    Sentiment analysis is one of the hottest topics in the Big Data space. Sentiment analysis is the process of analyzing customer comments and feedback from Facebook, Twitter, Email, and more. The purpose of the analysis is to understand the overall sentiment the customer is trying to convey. This could be a negative sentiment, when the customer is unhappy with a company or its product. Neutral sentiment, when the customer is only mentioning a company or product, in passing, without a good or a bad feeling. The last is positive sentiment, when a customer is happy or excited about a company or its product.

    Traditionally sentiment analysis was complicated because it required a mixture of very complex platforms and tools. Each component required for sentiment analysis was offered from a different company and required a large amount of custom work. The difficulty is further exasperated by hard-to- achieve business requirements. When we discuss sentiment analysis there are 3 key business requirements we see repeated:

    • Real-time access
    • Full granular data set (structured & unstructured)
    • BI and SQL front-end

    Real-time Access

    In the case of real-time access, business users need access to fresh data. In the world of social media, customer sentiment can change rapidly. With images and videos quickly being posted with re-tweets and Facebook ‘like’ capabilities, a good or bad aspect of a company’s product can go viral in minutes. Business users need to have the ability to analyze data as it comes in, in real-time. We will show in our overview video and demo, how we can utilize Microsoft’s StreamInsight technology for real-time data analysis and complex-event processing.

    Full Granular Data Set

    In the case of full granular data, in practice we have seen that using a traditional database system can hinder development. This is because a lot of the data that comes in for sentiment analysis such as email, is in a semi-structured or unstructured format. This means the data is not easily modeled into a database. The data does not come in a simple row/column format. Thus we utilize our Big Data technology that is meant for this type of data:  HDInsight (Hadoop). HDInsight is essentially Hortonworks Data Platform running on Windows. In our case we utilize HDInsight to land all of the data, in its raw original format, into the distributed file system HDFS. This allows us to ingest any kind of data, regardless of structure, and store that data online for further analysis at low cost. The Hadoop software is open-source and readily available.

    BI and SQL Front-End

    The most important area around delivering sentiment analysis to the business is access, making sure we are able to provide the data both in real-time (and high-fidelity) within the tools that our business users know and love. Previously when our customers were doing sentiment analysis on Hadoop systems, BI and SQL access was not available. This was not because the tools could not integrate with Hadoop systems. This was because they could not scale or have the same level of functionality. Some BI users have chosen Hive ODBC in Hadoop, which many claim to be slow and ‘buggy’. Instead here we utilize one of our flagship technologies: PolyBase. With PolyBase we expose the data in Hadoop, and relational SQL Server, with one T-SQL query. What this means is users can use BI tools like Excel, SSAS, or other 3rd party tools. They can then utilize PolyBase within Analytics Platform System (APS) to query that data either in Hadoop, or Parallel Data Warehouse (SQL Server), or mash up the data from both systems!

    How It Works

    Now we will show you how to use all of the tools from the SQL Server data platform to achieve sentiment analysis. This will allow you to quickly deploy and meet all 3 business requirements through a set of tools and platforms that are very easy to use, fully integrated, and ‘just work’ together.

    Let’s get started with the first video (~5 minutes) where we present sentiment analysis using Microsoft technologies. We show you how sentiment analysis works, and how the Microsoft products fit. We then follow up by discussing the architecture in detail surrounding StreamInsight, HDInsight, and Analytics Platform System.

    Watch the overview video:

    Demo

    In the second video (~7 minutes), we show you sentiment analysis in action. The demo will include a full sentiment-analysis engine running in real-time against Twitter data along with a web dashboard. We then stream Twitter data to both HDInsight and Parallel Data Warehouse. Finally, we end the demo by showcasing PolyBase, our flagship technology. With Polybase we can do data mashups combining data from relational and non-relational systems. We will use Polybase to write standard T-SQL queries against this data to determine tweet analytics and how social sentiment is fairing for our marketing campaigns and products.

    Watch the demo video:

  • Microsoft Brings Innovations From SQL Server To Hadoop With Stinger

    Over the last two years, Microsoft has been sharing our progress in Big Data by working to bring Hadoop to the masses through the simplicity and manageability of Windows and Azure. In 2012, Microsoft communicated our expanded partnership with Hortonworks by making Hortonworks Data Platform the core of our Hadoop solution on-premise and in the cloud.

    As part of this partnership, Microsoft has collaborated with Hortonworks and the open source software (OSS) community to contribute directly to Apache Hadoop. While our first wave of contributions with Hortonworks has been to port Hadoop to Windows, we’ve recently contributed to other projects like the Stinger initiative to dramatically speed up the performance of Hive and make Hadoop enterprise-ready.

    About The Stinger Initiative

    In collaboration with Hortonworks and others from the OSS community, Microsoft has brought some of the technological breakthroughs of SQL Server to Hadoop. In SQL Server 2012, we introduced the in-memory columnstore which included a vectorized query execution engine and a columnar format that demonstrated 10-100x performance gains on data warehouse queries. While these improvements varied by customer scenarios, some achieved upwards of 600x. With the Stinger initiative, Microsoft is collaborating with Hortonworks to bring similar query execution and columnstore technologies to Hadoop so that we can collectively improve the performance of Hive up to 100x.

    The first fruits of this has already been realized with Hortonworks Data Platform 2.0 for Windows and with HDInsight previewing Hadoop 2.2 clusters. Both of these Hadoop solutions leverage phase 2 of the Stinger project which has up to 40x improvements to query response times and up to 80% in data compression.

    Microsoft is pleased to be a part of the open source software (OSS) Big Data community for the past year and a half. We’ve gained a lot from the community and are delighted to continue our partnership with Hortonworks and bringing more innovations to Hadoop.

    We invite you to learn more about Microsoft’s Hadoop offers below:

  • Join the 2014 24 Hours of PASS Business Analytics Challenge and You Could Win a Free Ticket!

    Calling all data junkies and business analysts! Play along with @SQLServer on Twitter, February 5th during the 24 Hours of PASS, Business Analytics Edition for your chance to win a cool Microsoft T-shirt and a ticket to the 2014 PASS Business Analytics Conference in San Jose, CA! Beginning at 8 am PT on February 5th, you can benefit from twelve live 1-hour sessions (followed by 12 hours of on-demand replay) that will cover data analytics and visualization, big data innovations and integration, information delivery, advanced analytics and more! 

    Near the end of each of the twelve live sessions (approximately 55-59 minutes after the session starts), you’ll have the opportunity to win a T-shirt by being the first to correctly answer a question related to the session you just finished watching in this #pass24HOP Challenge.  (We will tweet 5 minutes prior to each question to give you time to be prepare.) What’s more, we will randomly select one lucky winner from all the eligible entries received to receive a free ticket to PASS BAC!

    To take part in the #pass24Hop Challenge, you must:

      • Sign in to your Twitter account. If you do not have an account, visit www.twitter.com to create one. Twitter accounts are free.
      • Once logged into your Twitter account, follow the links and instructions to become a follower of SQL Server at @SQLServer.
      • From your account, as you see the question sent from @SQLServer, send a reply to @SQLServer and include your answer to the posted question
      • Your reply tweet must contain the hashtags #pass24hop and #msbi to be eligible for entry
      • The first person to tweet a correct reply will win a Microsoft-branded T-shirt.
      • In addition, one Grand Prize winner will be randomly selected from among all eligible entries received, limit one Grand Prize drawing entry per person, per question. The Grand Prize winner will receive a ticket to the 2014 PASS Business Analytics Conference in San Jose, CA. (Please note: The prize consists of Event registration only, the winner solely responsible for any travel or expenses not included in registration. This award is non-transferable.)
      • See official rules below.

    Register now for 24 Hours of Pass and get ready for 24 hours of play!  Learn more about the 24 Hours of Pass here and read the official rules below.

    *     *     *     *     *

    2014 PASS BUSINESS ANALYTICS #pass24HOP CHALLENGE - OFFICIAL RULES

    NO PURCHASE NECESSARY.

    COMMON TERMS USED IN THESE RULES:

    These are the official rules that govern how #pass24HOP Challenge (“Challenge”) promotion will operate. This promotion will be simply referred to as the “Challenge” throughout the rest of these rules.

    In these rules, “we,” “our,” and “us” refer to Microsoft Corporation, the sponsor of the Challenge. “You” refers to an eligible Challenge entrant.

    WHAT ARE THE START AND END DATES? 
    This Challenge starts at 8:00 a.m. PT on February 5, 2014 and ends at 8:00 p.m. PT on February 5, 2014 (“Entry Period”).  The Entry Period consists of 12 separate Prize Periods as further described below.

    CAN I ENTER? 
    You are eligible to enter this Challenge if you meet the following requirements at time of entry:

      • You are a Business Analytics or IT Professional 18 years of age or older, who is participating in the free 24 Hours of PASS: Business Analytics event “(Entrant”). Details of the event are available here.

    and

        • If you are 18 of age or older, but are considered a minor in your legal place of residence, you must ask your parent’s or legal guardian’s permission prior to entering this Challenge; and
      • You are NOT a resident of any of the following countries: Cuba, Iran, North Korea, Sudan, and Syria.
        • PLEASE NOTE: U.S. export regulations prohibit the export of goods and services to Cuba, Iran, North Korea, Sudan and Syria. Therefore residents of these countries / regions are not eligible to participate.
      • You are NOT an employee of Microsoft Corporation or an employee of a Microsoft subsidiary; and

      • You are NOT involved in any part of the administration and execution of this Challenge; and

      • You are NOT an immediate family (parent, sibling, spouse, child) or household member of a Microsoft employee, an employee of a Microsoft subsidiary, or a person involved in any part of the administration and execution of this Challenge

    This Challenge is void wherever prohibited by law.

    HOW DO I ENTER? 

    At the end of each of the first twelve live 24 Hours of PASS sessions (at approximately 58-59 minutes past each hour) a new trivia question regarding the most recent PASS session will be posted online on the @SQLServer Twitter account.

    To enter, you must do all of the following:

    1. Sign in to your Twitter account. If you do not have an account, visit www.twitter.com to create one. Twitter accounts are free.
    2. Once logged into your Twitter account, follow the links and instructions to become a follower of @SQLServer.
    3. From your own account, reply to @SQLServer and include in the tweet your answer to the posted question.
    4. Your tweet must contain both the #msbi #pass24hop hashtags to be eligible for entry
    5. Your tweet must include the complete answer to the question, or it will be disqualified.
    6. The first person to correctly tweet a correct reply to each question will win the prize described below.  In addition, all eligible entries received will be entered into the Grand Prize drawing. Limit one Grand Prize drawing entry per person per question.   

    We are not responsible for entries that we do not receive for any reason, or for entries that we receive but are not decipherable for any reason, or for entries that do not include both the #PASS24hop and #msbi hashtags, or do not provide a complete answer.

    We will automatically disqualify:

      • Any incomplete or illegible entry; and
      • Any entries that we receive from you that are in excess of the entry limit described above; and
      • Any entries that do not comply with Twitter.com terms of use. 

    WINNER SELECTION AND PRIZES 
    The first person to correctly respond will receive a Microsoft branded T-shirt.  Approximate Retail Value each $20.  A total of twelve prizes are available.

    In addition, at the close of the Entry Period one Grand Prize winner will be randomly selected from among all eligible entries received.  The Grand Prize is complimentary registration for the PASS Business Analytics Conference http://www.sqlpass.org/bac/2014/Home.aspx. The prize consists of Event registration only, winner solely responsible for any travel or expenses not included in registration.  Value $1,985

    Limit of one T-shirt and one Grand Prize per person or Twitter account for the duration of the Challenge, regardless of the number of correct answers submitted.

    If you are a potential winner, we will notify you via Direct Message on Twitter to the account provided at time of entry within seven (7) days following the end of the Entry Period.  If the notification that we send is returned as undeliverable, or you are otherwise unreachable for any reason, we may award the prize to an alternate, randomly selected winner.  Winners can expect to receive their prize within four (4) to six (6) weeks following verification of eligibility.

    If there is a dispute as to who is the potential winner, we reserve the right to make final decisions on who is the winner based on the accuracy of the answer provided, ensuring that the rules of including hashtags is followed, and the times the answers arrives based on what times are listed on www.twitter.com.

    If you are a potential winner, we may require you to sign an Affidavit of Eligibility, Liability/Publicity Release within 10 days of notification.  If you are a potential winner and you are 18 or older, but are considered a minor in your place of legal residence, we may require your parent or legal guardian to sign all required forms on your behalf.  If you do not complete the required forms as instructed and/or return the required forms within the time period listed on the winner notification message, we may disqualify you and select an alternate, randomly selected winner.

    If you are confirmed as a winner of this Challenge:

      • You may not exchange your prize for cash or any other merchandise or services.  However, if for any reason an advertised prize is unavailable, we reserve the right to substitute a prize of equal or greater value; and
      • You may not designate someone else as the winner.  If you are unable or unwilling to accept your prize, we will award it to an alternate potential winner; and
      • If you accept a prize, you will be solely responsible for all applicable taxes related to accepting the prize; and
      • If you are otherwise eligible for this Challenge, but are considered a minor in your place of residence, we may award the prize to your parent/legal guardian on your behalf; and 

    WHAT ARE YOUR ODDS OF WINNING? 
    Your odds of winning this Challenge depend on the number of eligible entries we receive and they time the entries are posted to www.twitter.com.

    WHAT OTHER CONDITIONS ARE YOU AGREEING TO BY ENTERING THIS CHALLENGE? 
    By entering this Challenge you agree:

      • To abide by these Official Rules; and
      • To release and hold harmless Microsoft, and its respective parents, subsidiaries, affiliates, employees and agents from any and all liability or any injury, loss or damage of any kind arising from or in connection with this Challenge or any prize won; and
      • That Microsoft’s decisions will be final and binding on all matters related to this Challenge; and
      • That by accepting a prize, Microsoft may use of your proper name and state of residence online and in print, or in any other media, in connection with this Challenge, without payment or compensation to you, except where prohibited by law

    WHAT LAWS GOVERN THE WAY THIS CHALLENGE IS EXECUTED AND ADMINISTRATED? 
    This Challenge will be governed by the laws of the State of Washington, and you consent to the exclusive jurisdiction and venue of the courts of the State of Washington for any disputes arising out of this Challenge. 

    WHAT IF SOMETHING UNEXPECTED HAPPENS AND THE CHALLENGE CAN’T RUN AS PLANNED? 
    If cheating, a virus, bug, catastrophic event, or any other unforeseen or unexpected event that cannot be reasonably anticipated or controlled, (also referred to as force majeure) affects the fairness and / or integrity of this Challenge, we reserve the right to cancel, change or suspend this Challenge.  This right is reserved whether the event is due to human or technical error. If a solution cannot be found to restore the integrity of the Challenge, we reserve the right to select winners from among all eligible entries received before we had to cancel, change or suspend the Challenge.

    If you attempt to compromise the integrity or the legitimate operation of this Challenge by hacking or by cheating or committing fraud in ANY way, we may seek damages from you to the fullest extent permitted by law.  Further, we may ban you from participating in any of our future Challenge, so please play fairly.

    HOW CAN YOU FIND OUT WHO WON? 
    To find out who won, send an email to v-daconn@microsoft.com by March 6, 2014 with the subject line: “#PASS24HOP Business Analytics Challenge Winners List”.

    WHO IS SPONSORING THIS CHALLENGE? 
    Microsoft Corporation 
    One Microsoft Way 
    Redmond, WA 98052