Primer on Azure Monitor


pexels-photo-42384[1]

Azure Monitor is the latest evolution of a set of technologies allowing Azure resources monitoring.

I’ve written about going the extra mile to be able to analyze logs in the past.

The thing is that once our stuff is in production with tons of users hitting it, it might very well start behaving in unpredictable ways.  If we do not have a monitoring strategy, we’re going to be blind to problems and only see unrelated symptoms.

Azure Monitor is a great set of tools.  It doesn’t try to be the end all solution.  On the contrary, although it offers analytics out of the box, it let us export the logs wherever we want to go further.

I found the documentation of Azure Monitor (as of November 2016) a tad confusing, so I thought I would give a summary overview here.  Hopefully it will get you started.

Three types of sources

First thing we come across in Azure Monitor’s literature is the three types of sources:  Activity Logs, Diagnostic Logs & Metrics.

There is a bit of confusion between Diagnostic Logs & Metrics, some references hinting towards the fact that metrics are generated by Azure while diagnostics are generated by the resource itself.  That is confusing & beside the point.  Let’s review those sources here.

Activity logs capture all operations performed on Azure resources.  They used to be called Audit Logs & Operational Logs.  Those comes directly from Azure APIs.  Any operations done on an Azure API (except HTTP-GET operations) traces an activity log.  Activity log are in JSON and contain the following information:  action, caller, status & time stamp.  We’ll want to keep track of those to understand changes done in our Azure environments.

Metrics are emitted by most Azure resources.  They are akin to performance counter, something that has a value (e.g. % CPU, IOPS, # messages in a queue, etc.) over time ; hence Azure Monitor, in the portal, allows us to plot those against time.  Metrics typically comes in JSON and tend to be emitted at regular interval (e.g. every minute) ; see this articles for available metrics.  We’ll want to check those to make sure our resources operate within expected bounds.

Diagnostic logs are logs emitted by a resource that provide detailed data about the operation of that particular resource.  That one is specific to the resource in terms of content:  each resource will have different logs.  Format will also vary (e.g. JSON, CSV, etc.), see this article for different schemas.  They also tend to be much more voluminous for an active resource.

That’s it.  That’s all there is to it.  Avoid the confusion and re-read the last three paragraphs.  It’s a time saver.  Promised.

We’ll discuss the export mechanisms & alerts below, but for now, here’s a summary of the capacity (as of November 2016) of each source:

Source Export to Supports Alerts
Activity Logs Storage Account & Event Hub Yes
Metrics Storage Account, Event Hub & Log Analytics Yes
Diagnostics Logs Storage Account, Event Hub & Log Analytics No

Activity Log example

We can see the activity log of our favorite subscription by opening the monitor blade, which should be on the left hand side, in the https://portal.azure.com.

image

If you do not find it there, hit the More Services and search for Monitor.

Selecting the Activity Logs, we should have a search form and some results.

image

ListKeys is a popular one.  Despite being conceptually a read operation, the List Key action, on a storage account, is done through a POST in the Azure REST API, specifically to trigger an audit trail.

We can select one of those ListKeys and, in the tray below, select the JSON format:


{
"relatedEvents": [],
"authorization": {
"action": "Microsoft.Storage/storageAccounts/listKeys/action",
"condition": null,
"role": null,
"scope": "/subscriptions/<MY SUB GUID>/resourceGroups/securitydata/providers/Microsoft.Storage/storageAccounts/a92430canadaeast"
},
"caller": null,
"category": {
"localizedValue": "Administrative",
"value": "Administrative"
},
"claims": {},
"correlationId": "6c619af4-453e-4b24-8a4c-508af47f2b26",
"description": "",
"eventChannels": 2,
"eventDataId": "09d35196-1cae-4eca-903d-6e9b1fc71a78",
"eventName": {
"localizedValue": "End request",
"value": "EndRequest"
},
"eventTimestamp": "2016-11-26T21:07:41.5355248Z",
"httpRequest": {
"clientIpAddress": "104.208.33.166",
"clientRequestId": "ba51469e-9339-4329-b957-de5d3071d719",
"method": "POST",
"uri": null
},

I truncated the JSON here.  Basically, it is an activity event with all the details.

Metrics example

Metrics can be accessed from the “global” Monitor blade or from any Azure resource’s monitor blade.

Here I look at the CPU usage of an Azure Data Warehouse resource (which hasn’t run for months, hence flat lining).

image

Diagnostic Logs example

For diagnostics, let’s create a storage account and activate diagnostics on it.  For this, under the Monitoring section, let’s select Diagnostics, make sure the status is On and then select Blob logs.

image

We’ll notice that all metrics were already selected.  We also noticed that the retention is controlled there, in this case 7 days.

Let’s create a blob container, copy a file into it and try to access it via its URL.  Then let’s wait a few minutes for the diagnostics to be published.

We should see a special $logs container in the storage account.  This container will contain log files, stored by date & time.  For instance for the first file, just taking the first couple of lines:


1.0;2016-11-26T20:48:00.5433672Z;<strong>GetContainerACL</strong>;Success;200;3;3;authenticated;monitorvpl;monitorvpl;blob;"https://monitorvpl.blob.core.windows.net:443/$logs?restype=container&amp;comp=acl";"/monitorvpl/$logs";295a75a6-0001-0021-7b26-48c117000000;0;184.161.153.48:51484;2015-12-11;537;0;217;62;0;;;"&quot;0x8D4163D73154695&quot;";Saturday, 26-Nov-16 20:47:34 GMT;;"Microsoft Azure Storage Explorer, 0.8.5, win32, Azure-Storage/1.2.0 (NODE-VERSION v4.1.1; Windows_NT 10.0.14393)";;"9e78fc90-b419-11e6-a392-8b41713d952c"
1.0;2016-11-26T20:48:01.0383516Z;<strong>GetContainerACL</strong>;Success;200;3;3;authenticated;monitorvpl;monitorvpl;blob;"https://monitorvpl.blob.core.windows.net:443/$logs?restype=container&amp;comp=acl";"/monitorvpl/$logs";06be52d9-0001-0093-7426-483a6d000000;0;184.161.153.48:51488;2015-12-11;537;0;217;62;0;;;"&quot;0x8D4163D73154695&quot;";Saturday, 26-Nov-16 20:47:34 GMT;;"Microsoft Azure Storage Explorer, 0.8.5, win32, Azure-Storage/1.2.0 (NODE-VERSION v4.1.1; Windows_NT 10.0.14393)";;"9e9c6311-b419-11e6-a392-8b41713d952c"
1.0;2016-11-26T20:48:33.4973667Z;<strong>PutBlob</strong>;Success;201;6;6;authenticated;monitorvpl;monitorvpl;blob;"https://monitorvpl.blob.core.windows.net:443/sample/A.txt";"/monitorvpl/sample/A.txt";965cb819-0001-0000-2a26-48ac26000000;0;184.161.153.48:51622;2015-12-11;655;7;258;0;7;"Tj4nPz2/Vt7I1KEM2G8o4A==";"Tj4nPz2/Vt7I1KEM2G8o4A==";"&quot;0x8D4163D961A76BE&quot;";Saturday, 26-Nov-16 20:48:33 GMT;;"Microsoft Azure Storage Explorer, 0.8.5, win32, Azure-Storage/1.2.0 (NODE-VERSION v4.1.1; Windows_NT 10.0.14393)";;"b2006050-b419-11e6-a392-8b41713d952c"

Storage Account diagnostics obviously log in semicolon delimited values (variant of CSV), which isn’t trivial to read the way I pasted it here.  But basically we can see the logs contain details:  each operation done around the blobs are logged with lots of details.

Querying

As seen in the examples, Azure Monitor allows us to query the logs.  This can be done in the portal but also using Azure Monitor REST API, cross platform Command-Line Interface (CLI) commands, PowerShell cmdlets or the .NET SDK.

Export

We can export the sources to a Storage Account and specify a retention period in days.  We can also export them to Azure Event Hubs & Azure Log Analytics.  As specified in the table above, Activity logs can’t be sent to Log Analytics.  Also, Activity logs can be analyzed using Power BI.

There are a few reasons why we would export the logs:

  • Archiving scenario:  Azure Monitor keeps content for 30 days.  If we need more retention, we need to archive it ourselves.  We can do that by exporting the content to a storage account ; this also enables big data scenario where we keep the logs for future data mining.
  • Analytics:  Log Analytics offers more capacity for analyzing content.  It also offers 30 days of retention by default but can be extended to one year.  Basically, this would upgrade us to Log Analytics.
  • Alternatively, we could export the logs to a storage account where they could be ingested by another SIEM (e.g. HP Arcsight).  See this article for details about SIEM integration.
  • Near real time analysis:  Azure Event Hubs allow us to send the content to many different places, but also we could analyze it on the fly using Azure Stream Analytics.

Alerts (Notifications)

Both Activity Logs & Metrics can trigger alerts.  Currently (as of November 2016), only Metrics alert can be set in the portal ; Activity Logs alerts must be set by PowerShell, CLI or REST API.

Alerts are a powerful way to automatically react to our Azure resource behaviors ; when certain conditions are met (e.g. for a metric, when a value exceeds a threshold for a given period of time), the alert can send an email to a specified list of email addresses but also, it can invoke a Web Hook.

Again, the ability to invoke a web hook opens up the platform.  We could, for instance, expose an Azure Automation runbook as a Web Hook ; it therefore means an alert could trigger whatever a runbook is able to do.

Security

There are two RBAC roles around monitoring:  Reader & Contributor.

There are also some security considerations around monitoring:

  • Use a dedicated storage account (or multiple dedicated storage accounts) for monitoring data.  Basically, avoid mixing monitoring and “other” data, so that people do not gain access to monitoring data inadvertently and, vis versa, that people needing access to monitoring data do not gain access to “other” data (e.g. sensitive business data).
  • For the same reasons, use a dedicated namespace with Event Hubs
  • Limit access to monitoring data by using RBAC, e.g. by putting them in a separate resource group
  • Never grant ListKeys permission across a subscription as users could then gain access to reading monitoring data
  • If you need to give access to monitoring data, consider using a SAS token (for either Storage Account or Event Hubs)

Summary

Azure Monitor brings together a suite of tools to monitor our Azure resources.  It is an open platform in the sense it integrates easily with solutions that can complement it.

2 thoughts on “Primer on Azure Monitor

  1. Pingback: Azure Weekly: Nov 28, 2016 – Build Azure

  2. Pingback: Azure Monitoring – SIEM integration | Vincent-Philippe Lauzon's blog

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s