Are you an ambassador on SQL Server but a pleb when it comes to technologies using Big Data on Hadoop? I have been working with SQL Server in some fashion since I graduated from college. For those of you who know my age, there have been many rotations around the sun since that day. While I wouldn’t consider myself an ambassador, I definitely have the capabilities to make a database purr like a kitten and I am dangerous enough to bring a server to its knees. While I have worked with multiple database technologies, none have interested me as much as those revolving around Big Data.
Let’s face it: due to the decreasing cost of storage, the increasing need to keep and archive historical data, and the massive amounts of unstructured data available, Big Data solutions are here to stay. As consultants, we will eventually be required to work with them, in some capacity, as part of our day-to-day responsibilities. If you are like me and have a ton of experience in the Microsoft stack but limited experience in Unix, Hadoop, and the programs that allow the querying of data from Hadoop, Big Data can appear somewhat daunting. Well, maybe not for us Scorpians.
I was excited to learn that one of the features introduced in SQL Server 2016 is PolyBase. Prior to PolyBase, SQL Server developers had to use Sqoop to move data from the Hadoop cluster into SQL Server. This method required copying the data to the SQL Server instance, and the transfers themselves hurt performance. With the introduction of PolyBase, the data can be queried in place, directly from HDFS on the Hadoop cluster. Analysts and developers can query semi-structured data in Hadoop or Azure Blob Storage, and join it to relational data in SQL Server, using ordinary Transact-SQL statements.
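To make that a little more concrete, here is a rough sketch of what querying Hadoop data through PolyBase can look like. It assumes PolyBase is already installed and configured for your Hadoop distribution. Every name in it is made up for illustration: the data source HadoopCluster, the name node address, the /data/webclicks/ folder, the file format CsvFormat, and the dbo.WebClicks and dbo.Customers tables. Treat it as a starting point, not a tested deployment script.

```sql
-- Hypothetical example: all object names, paths, and addresses are placeholders.

-- 1. Tell SQL Server where the Hadoop cluster lives.
CREATE EXTERNAL DATA SOURCE HadoopCluster
WITH (
    TYPE = HADOOP,
    LOCATION = 'hdfs://namenode.example.com:8020'  -- assumed name node address
);

-- 2. Describe how the files in HDFS are laid out (comma-delimited text here).
CREATE EXTERNAL FILE FORMAT CsvFormat
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = ',')
);

-- 3. Define an external table over a folder in HDFS.
--    Only the metadata is stored in SQL Server; the data stays in Hadoop.
CREATE EXTERNAL TABLE dbo.WebClicks (
    CustomerId INT,
    Url        NVARCHAR(400),
    ClickDate  DATE
)
WITH (
    LOCATION    = '/data/webclicks/',  -- assumed HDFS folder
    DATA_SOURCE = HadoopCluster,
    FILE_FORMAT = CsvFormat
);

-- 4. Join the Hadoop data to a regular SQL Server table with plain T-SQL.
SELECT c.CustomerName,
       COUNT(*) AS Clicks
FROM dbo.Customers AS c            -- assumed local relational table
JOIN dbo.WebClicks AS w
    ON w.CustomerId = c.CustomerId
GROUP BY c.CustomerName;
```

Once the external table exists, it can be referenced like any other table, so existing queries and BI tools do not need to know the data lives in Hadoop.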
PolyBase was initially introduced with SQL Server Parallel Data Warehouse (PDW) but wasn’t readily available to most organizations due to the price tag of an MPP license. With the addition of PolyBase to SQL Server 2016, organizations can now leverage their existing SQL Server technology to start interacting with semi-structured data using their existing skill sets and BI tools.
Visit the following MSDN article for how to get started with PolyBase in SQL Server 2016: