U-SQL Cross Apply


UPDATE (19-01-2016):  Have a look at Azure Data Lake series for more posts on Azure Data Lake.

Azure Data Lake (both Storage & Analytics) has been in public preview for a month or two.

You can get started by reading this.

I thought I would kick some posts about more complex scenarios to display what’s possibile with that technology.

In this post I’ll write about a specific U-SQL operation since its syntax is a bit tricky.

The Problem

U-SQL supports arrays in the form of the .NET type SQL.Array<T>.  This is very convenient in many scenarios, especially when you are parsing inputs.

We often want to take an array and explode it on different rows.  This is useful not only with arrays but with SQL.MAP<T,V>, or other complex types, e.g. XML or JSON.

This is accomplished by using the CROSS APPLY & EXPLODE operators.

The Solution

Let’s look at the solution.

@content =
	SELECT *
	FROM(
		VALUES
		(
			12,
			"a, b, c"
		),
		(
			75,
			"f, g, h, i, j"
		)) AS t([ID], LetterList);

@inArray =
	SELECT [ID],
		SqlArray.Create(LetterList.Split(',')) AS LetterArray
	FROM @content;

@exploded =
	SELECT [ID],
		letter.Trim() AS Letter
	FROM @inArray
	CROSS APPLY
	EXPLODE(LetterArray) AS r(letter);

OUTPUT @exploded
TO "/Outputs/explosion.tsv"
USING Outputters.Tsv();

In the first expression, I create a table from a string:

ID LetterList
12 “a, b, c”
75 “f, g, h, j, k”

In the second expression, I transform the second column into an array:

ID LetterList
12 {“a”, ” b”, ” c”}
75 {“f”, ” g”, ” h”, ” j”, ” k”}

In the third and final expression, I explode the arrays on different rows (and trim the string to remove trailing spaces):

ID Letter
12 “a”
12 “b”
12 “c”
75 “f”
75 “g”
75 “h”
75 “i”
75 “j”

Once you know it the syntax is pretty simple but it isn’t trivial to deduce that from the documentation, hence this post.

Conclusion

Very useful to basically pivot raw arrays to rows, CROSS APPLY with EXPLODE is another tool to help you parse semi-structured data into structured data (tables).

4 thoughts on “U-SQL Cross Apply

  1. Pingback: U-SQL Cross Apply | Dinesh Ram Kali.

  2. Pingback: Azure Data Lake Analytics – Loading files with custom C# code | Vincent-Philippe Lauzon's blog

  3. dhruba

    hi Vincent,

    How can I do the reverse thing I create a file of final expression and now I need to create In the first expression

    Reply

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s