Azure Data Lake Analytics Quick Start

UPDATE (19-01-2016):  Have a look at the Azure Data Lake series for more posts on Azure Data Lake.

Azure Data Lake (both Storage & Analytics) has been in public preview for a month or two.

It already has surprisingly good documentation.

Azure Data Lake Analytics (ADLA) is a really great technology.  It combines the power of Hadoop with the simplicity of the likes of Azure SQL.  It’s super productive and easy to use while still being pretty powerful.

At the core of this productivity is a new language:  U-SQL.  U-SQL is based on SCOPE, an internal Microsoft research language, and aims at unifying the declarative power of SQL with the imperative capacity of C#.

I like to call it Hive for .NET developers.
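To give a flavour of how U-SQL mixes SQL with C#, here is a minimal sketch.  It assumes a tab-separated search log with the columns listed below; the input path, the output path and the column names are assumptions made for illustration, so adjust them to your own data.  The SELECT is plain SQL, while Region.ToUpper() is an ordinary C# string method called inline.

```
// Read a TSV file into a rowset (path and schema are assumed for this sketch).
@searchlog =
    EXTRACT UserId int,
            Start DateTime,
            Region string,
            Query string,
            Duration int,
            Urls string,
            ClickedUrls string
    FROM "/Samples/Data/SearchLog.tsv"
    USING Extractors.Tsv();

// C# expression inside a SQL SELECT: normalise the region name.
@normalized =
    SELECT Region.ToUpper() AS Region,
           Duration
    FROM @searchlog;

// Declarative aggregation over the normalised rowset.
@result =
    SELECT Region,
           SUM(Duration) AS TotalDuration
    FROM @normalized
    GROUP BY Region;

// Write the result out as CSV (output path is assumed).
OUTPUT @result
TO "/output/DurationByRegion.csv"
USING Outputters.Csv();
```

The expression is computed in its own rowset before the GROUP BY so that the aggregation only groups on a plain column.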

It’s the ultimate managed Hadoop:  you submit a U-SQL script and the number of processing units you want it to run on, and that’s it.  No cluster to configure, no patching, no nodes to bring up or take down, etc.  Nodes are provisioned for you to run your script and returned to a pool afterwards.

I would recommend it for the following scenarios:

  • Exploration of data sets:  load your data in and start running ad hoc queries on it to learn what your data is made of
  • Data processing:  process (or pre-process) your data into a shape useful for Machine Learning, reporting, search or online algorithms

I thought I would kick off a few posts about more complex scenarios to show what’s possible with this technology.

I won’t cover the very basics, so please read the Logistic / Get Started articles.

