Azure Data Lake Analytics Quick Start

UPDATE (19-01-2016):  Have a look at the Azure Data Lake series for more posts on Azure Data Lake.

Azure Data Lake (both Storage & Analytics) has been in public preview for a month or two.

It already has surprisingly good documentation:

Azure Data Lake Analytics (ADLA) is a really great technology.  It combines the power of Hadoop with the simplicity of the likes of Azure SQL.  It’s super productive and easy to use while still being pretty powerful.

At the core of this productivity is a new language:  U-SQL.  U-SQL is based on SCOPE, an internal (Microsoft) research language, and aims to unify the declarative power of SQL with the imperative capacity of C#.

I like to call it Hive for .NET developers.

It’s the ultimate managed Hadoop:  you submit a U-SQL script and the number of processing units you want it to run on, and that’s it.  No cluster to configure, no patching, no nodes to take up or down, etc.  Nodes are provisioned for you to run your script and returned to a pool afterwards.

I would recommend it for the following scenarios:

I thought I would kick off some posts about more complex scenarios to show what’s possible with that technology.

I won’t cover the basics-basics, so please read the Logistic / Get Started articles.
