Automated deployments in Azure Data Explorer

Automated deployment is a critical part of DevOps / DataOps and enables us to:

In this article, we’ll look at different facets of deployment with Azure Data Explorer.

An Azure Data Explorer cluster is the main Azure Resource Manager (ARM) resource. It has a few child resources, such as databases and data connections.

Kusto Schema Entities such as tables, materialized views, functions, policies, etc. aren’t controlled by ARM. They are controlled by the Azure Data Explorer data plane and can be created and updated via Control Commands.

ARM Templates

Azure Resource Manager Templates (ARM Templates) are JSON files defining the infrastructure and configuration of a deployment.
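To make the shape of such a template concrete, here is a minimal sketch that builds one as a Python dictionary and prints it as JSON. The cluster name, database name, region, SKU and API version are placeholder assumptions chosen for illustration; the helper `build_adx_template` is hypothetical and not part of any SDK.

```python
import json


def build_adx_template(cluster_name, database_name, location):
    """Return a minimal ARM template (as a dict) with one cluster and one database."""
    cluster = {
        "type": "Microsoft.Kusto/clusters",
        "apiVersion": "2022-02-01",  # assumption: use the API version you target
        "name": cluster_name,
        "location": location,
        # Assumption: a dev/test SKU, picked only to keep the example small.
        "sku": {"name": "Dev(No SLA)_Standard_E2a_v4", "tier": "Basic", "capacity": 1},
    }
    database = {
        "type": "Microsoft.Kusto/clusters/databases",
        "apiVersion": "2022-02-01",
        # Child resources are named "<parent>/<child>".
        "name": f"{cluster_name}/{database_name}",
        "location": location,
        "kind": "ReadWrite",
        # The database can only be created once the cluster exists.
        "dependsOn": [f"[resourceId('Microsoft.Kusto/clusters', '{cluster_name}')]"],
    }
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "resources": [cluster, database],
    }


if __name__ == "__main__":
    # Hypothetical names; replace with your own.
    template = build_adx_template("myadxcluster", "mydb", "eastus")
    print(json.dumps(template, indent=2))
```

The printed JSON could be saved to a file and handed to whichever ARM deployment mechanism the pipeline already uses.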

Bicep is a domain-specific language (DSL) that uses declarative syntax to deploy Azure resources. It provides concise syntax, reliable type safety, and support for code reuse.

Terraform

TODO (Reference a GitHub repo with a Terraform configuration?)

Imperative deployment

ARM & Terraform are the two main declarative ways to deploy Azure Data Explorer infrastructure (cluster, databases, etc.).

Infrastructure can also be created imperatively using different platforms:

Kusto Schema Entities

Kusto Schema Entities are created / updated by running Kusto scripts consisting of Control Commands.
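As a rough sketch of what "running a Kusto script" can mean in a pipeline, the snippet below splits a script into individual control commands and sends each one to an injected `execute` callable. It assumes commands are separated by blank lines, which is a convention of this sketch rather than a rule of the service, and both `split_commands` and `deploy_script` are hypothetical helper names.

```python
from typing import Callable, List


def split_commands(script: str) -> List[str]:
    """Split a script into commands, assuming blank lines separate them."""
    commands = []
    current = []
    for line in script.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            # A blank line closes the command collected so far.
            commands.append("\n".join(current))
            current = []
    if current:
        commands.append("\n".join(current))
    return commands


def deploy_script(script: str, execute: Callable[[str], object]) -> int:
    """Run each command in order through `execute`; return how many ran."""
    commands = split_commands(script)
    for command in commands:
        execute(command)
    return len(commands)


if __name__ == "__main__":
    # Hypothetical table and function names, for illustration only.
    script = """.create-merge table Events (Timestamp: datetime, Message: string)

.create-or-alter function RecentEvents() {
    Events | where Timestamp > ago(1d)
}
"""
    # `print` stands in for a real client call.
    deploy_script(script, execute=print)
```

In a real pipeline, `execute` could wrap a client call such as `KustoClient.execute_mgmt(database, command)` from the `azure-kusto-data` package; injecting it keeps the splitting logic testable without a cluster.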

There are many ways to automate this:

TODO: Need to validate the ‘scripts’ resource approach

Sync Kusto is an interactive developer tool. In the context of automated deployment, it can extract the schema / control commands script of an ADX Database (this step is manual). That script could then be deployed automatically.

Delta Kusto is a Command Line Interface (CLI) tool designed to be invoked in a CI/CD pipeline. It can compare two sources (each either a control commands script or an ADX database) and compute a delta script, i.e. a script of control commands that would make one source structurally identical to the other. It can also push that script to an ADX database.
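The idea of a delta script can be illustrated with a toy example. The sketch below is not Delta Kusto's implementation: it only handles functions, represents each state as a dictionary from function name to definition text, and uses a hypothetical helper named `compute_delta`.

```python
from typing import Dict, List


def compute_delta(current: Dict[str, str], target: Dict[str, str]) -> List[str]:
    """Return control commands that make `current` match `target`.

    Both arguments map a function name to its full `name(params) { body }` text.
    """
    commands = []
    # Drop functions that exist only in the current state.
    for name in sorted(current.keys() - target.keys()):
        commands.append(f".drop function {name}")
    # Create or update functions that are new or whose definition changed.
    for name in sorted(target):
        if current.get(name) != target[name]:
            commands.append(f".create-or-alter function {target[name]}")
    return commands


if __name__ == "__main__":
    # Hypothetical function definitions, for illustration only.
    current = {"A": "A() { T }", "B": "B() { T | take 1 }"}
    target = {"B": "B() { T | take 2 }", "C": "C() { T | count }"}
    for command in compute_delta(current, target):
        print(command)
```

When the two states are already identical the delta is empty, which is what makes this kind of deployment safe to re-run.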

Data Connections

TODO… Discuss the chicken & egg issue with data connections

Data

After deploying ADX infrastructure and its schema entities, we often want to deploy data to it (e.g. to run tests or recreate an environment).

There are a few ways to deploy data on Azure Data Explorer. We recommend scheduling a Copy activity in Azure Data Factory / Azure Synapse or the equivalent in.

