All things Data Related....

On this site I post entries about data platforms and analytics: things I have learned that I believe could be valuable to others who want insight from their data.

Technical Case Study: Klout implementation of Analysis Services over Hive (Big Data)


Klout measures influence across the social web by analyzing social network user data. Klout uses the impact of opinions, links, and recommendations to identify influential individuals. Every day Klout scans 15 social networks, scores hundreds of millions of profiles, and processes over 12 billion data points.

The Klout data warehouse, which relies on Apache Hadoop-based technology, exceeds 800 terabytes of data. But Klout doesn’t just crunch large data volumes; Klout takes advantage of Microsoft SQL Server 2012 Analysis Services to deliver reliable scores and actionable insights at the speed of thought.

Microsoft and Klout collaborated to build this Big Data Analytics solution. The goal for this solution was to find a cost-effective way to combine the power of Hadoop with the power of Analysis Services. The result is a solution that connects Analysis Services to Hadoop/Hive via the relational SQL Server engine, enabling Klout to reduce data latencies, eliminate maintenance overhead and costs, move aggregation processing to Hadoop, and shorten development cycles dramatically. Organizations in any industry and business sector can adopt the solution presented in this technical case study to exploit the benefits of Hadoop while preserving existing investments in SQL Server technology. This case study discusses the necessary integration techniques and lessons learned.
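The summary above does not spell out how the relational engine sits between Analysis Services and Hive. One plausible arrangement, offered here only as a sketch, is a SQL Server linked server pointing at Hive, with views that wrap pass-through queries in T-SQL's `OPENQUERY`, so that Analysis Services sees ordinary SQL Server views. The helper below merely builds such a view definition as a string; the function name, the linked server name `HIVE`, the view name, and the Hive table and columns are all hypothetical and not taken from the case study.

```python
def hive_view_sql(view_name: str, linked_server: str, hive_query: str) -> str:
    """Build T-SQL that exposes a HiveQL query as a SQL Server view.

    Assumption: a linked server named `linked_server` already exists and
    points at Hive (for example through an ODBC driver).
    """
    # OPENQUERY takes the pass-through query as a string literal,
    # so single quotes inside the HiveQL text must be doubled.
    escaped = hive_query.replace("'", "''")
    return (
        f"CREATE VIEW {view_name} AS\n"
        f"SELECT * FROM OPENQUERY({linked_server}, '{escaped}');"
    )


# Hypothetical names throughout: dbo.vw_FactScores, HIVE, scores, ds.
sql = hive_view_sql(
    "dbo.vw_FactScores",
    "HIVE",
    "SELECT user_id, score FROM scores WHERE ds = '2012-01-01'",
)
print(sql)
```

Pushing the `SELECT` (and any `GROUP BY`) into the pass-through query is what would let the aggregation work run on the Hadoop side rather than in SQL Server, which matches the goal stated above of moving aggregation processing to Hadoop.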

To review the document, please download the SQL Server Analysis Services to Hive Word document.

  • Good Article.

  • thank you

  • Great post! Apache Hive supports analysis of large datasets stored in Hadoop's HDFS and compatible file systems such as Amazon S3 filesystem. It provides an SQL-like language called HiveQL with schema on read and transparently converts queries to map/reduce, Apache Tez and Spark jobs. All three execution engines can run in Hadoop YARN. To accelerate queries, it provides indexes, including bitmap indexes. More at
