Tag Archives: Integration

Integration of applications (e.g. API, messaging, storage) & networks (e.g. hybrid connectivity).

Invoking a Stored Procedure from a partitioned CosmosDB collection from Logic Apps

I struggled a little to make this work, so I thought I would share what I learned in order to accelerate your own endeavours.

I was looking for a way to quickly populate a Cosmos DB collection with random data.

Stored procedures came to mind since they skip client-server latency:  a single call can create hundreds of documents with random data.

Each stored procedure runs within a single partition, so we need something external to the stored procedure to loop and decide on the partition key.

Enter Logic Apps:  cheap to run and quick to set up.

Stored Procedure

Something important to realize is that some portal features aren’t supported when we deal with a partitioned collection.

One of them is to update the content of a stored procedure (same thing for triggers).  We therefore need to delete it and re-create it.

Here is the stored procedure we used:

function createRecords(recordCount) {
    // Runs server-side, inside the partition targeted by the caller's partition key.
    var context = getContext();
    var collection = context.getCollection();
    var createdIds = [];

    for (var i = 0; i < recordCount; i++) {
        // Trivial document;  "part" must match the partition key value supplied by the caller.
        var documentToCreate = { part: "abc", name: "sample" + i };
        var accepted = collection.createDocument(
            collection.getSelfLink(),
            documentToCreate,
            function (err, documentCreated) {
                if (err) {
                    throw new Error('Error' + err.message);
                }
                else {
                    createdIds.push(documentCreated.id);
                }
            });

        // The server stopped accepting work (execution budget exhausted):
        // return what has been created so far.
        if (!accepted) {
            context.getResponse().setBody(createdIds);
            return;
        }
    }

    context.getResponse().setBody(createdIds);
}

The stored procedure takes the number of documents to create as a parameter, loops and creates the documents.  It returns the list of created document IDs as its output.

The documents we create are trivial:  no random data.

Logic App

On the canvas, let’s type Cosmos in the search box for actions.


Let’s choose Execute stored procedure.

We are prompted to create a new Cosmos DB connection.  We need to:

  • Type a name for the connection (purely for readability, can be anything)
  • Select an existing Cosmos DB collection

We can then pick the database ID, the collection ID & the stored procedure ID.


Stored procedure parameters are expressed as a JSON array.  For instance, here we want to pass 1000 as the recordCount parameter, so we type [1000]:  no parameter name and always square brackets.

If we ran the app now, we would get an error stating that the operation requires a partition key.

In order to set the partition key, we need to Show advanced options.


In order to specify the partition key value, we simply type the value itself:  no square brackets, no quotes.

Now we can run the Logic App:  it should execute the stored procedure and surface the stored procedure’s result in the action’s output.
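
For reference, the Logic App action is roughly equivalent to the following sketch using the .NET DocumentDB SDK (the account endpoint, key, database, collection and result type are illustrative assumptions).  It shows how the [1000] parameter array and the partition key value map onto the SDK call:

using System;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

class Program
{
    static async Task Main()
    {
        // Illustrative endpoint & key:  use your own account values.
        var client = new DocumentClient(new Uri("https://my-account.documents.azure.com:443/"), "<account-key>");
        var sprocUri = UriFactory.CreateStoredProcedureUri("my-db", "my-collection", "createRecords");

        // The partition key value ("abc") plays the same role as the advanced option in the Logic App action.
        var options = new RequestOptions { PartitionKey = new PartitionKey("abc") };

        // 1000 is the recordCount parameter, i.e. the [1000] JSON array in the Logic App action.
        var response = await client.ExecuteStoredProcedureAsync<string[]>(sprocUri, options, 1000);

        Console.WriteLine(response.Response.Length + " documents created");
    }
}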

Summary

Invoking a Cosmos DB stored procedure from a Logic App isn’t rocket science, but there are a few items to get straight in order for it to work properly.


Multi-Tenant SaaS with Azure Active Directory B2B & B2C

Scenario:  I’m creating a Software as a Service (SaaS) offering.  I have multiple customers & I want to manage their identities.  For some of my customers, the users won’t have a corporate identity;  I would like to let them log in with their social identity (e.g. Facebook) if they want to, or create an account on my site otherwise.  Other customers will come in with a corporate AAD (likely synchronized with an on-premise AD) and I would like to integrate with that.  How do I do that in Azure?

Azure Active Directory (AAD) is an Identity & Access Management (IAM) service in Azure.

AAD can be used in a number of ways:

  • As a simple identity store where you keep user accounts and passwords
  • To authenticate those users against a web site or a web API
  • Synchronized with an on-premise Active Directory, using Azure Active Directory Connect
  • To enable multi-factor authentication
  • As an identity provider for third party SaaS applications (e.g. SalesForce.com), enabling SSO
  • To project on-premise web applications in the cloud with Application Proxy, optionally with conditional access on those proxies

Lots of ways to use it.

Two new ways to use it are AAD B2C & AAD B2B.  Both services are in preview at this time (mid-March 2016).

Business 2 Consumer

AAD B2C allows AAD to integrate with social network identity providers (e.g. Facebook).

This means your AAD won’t be authenticating the users;  the social networks will.  AAD still handles the sign-in flow, though.  Your application therefore integrates with only one identity provider, which in turn integrates with many:  it federates identities from different sources.


You stay in control of the onboarding, login & other user experience (UX) flows with the social networks by customizing those experiences.

AAD B2C still allows you to create “local users”, i.e. users existing solely in your AAD tenant.  This supports the scenario of “falling back” to creating accounts for your site only.  Those users can have access to a self-service password reset.

On top of that, AAD B2C allows you to customize the user schema, i.e. add custom attributes on top of the standard ones (e.g. Given Name, Surname, City, etc.).

Since user accounts reside in your tenant, you can put users coming from different social networks into AAD groups, to manage application roles for instance.

You can see an introduction video here, see the pricing here (based on number of users & number of authentications).  You should also consult the up-to-date list of known limitations here.

AAD B2C is actually a special type of AAD tenant.  You have to decide at creation time whether you want an AAD B2C tenant or not.


Business 2 Business

AAD B2B allows you to bring identities from other AADs into your tenant and give them access to your applications.

This feature is free.

Currently (mid-March 2016), the only way to import user accounts is by using CSV files.  This is done via the “Add User” window:  simply select Users in partner companies as the type of user.


You need to supply a CSV file with a given format, i.e. with the following column names (a minimal example follows the list):

  • Email
  • DisplayName
  • InviteAppID
  • InviteReplyUrl
  • InviteAppResources
  • InviteGroupResources
  • InviteContactUsUrl
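
For illustration, here is what a minimal CSV could look like (the email address and display name are made up and the optional invite columns are left empty):

Email,DisplayName,InviteAppID,InviteReplyUrl,InviteAppResources,InviteGroupResources,InviteContactUsUrl
vince@contoso.com,Vince,,,,,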

The way it works is that the email you give is basically the key to find the user in the right tenant.  For instance, vince@contoso.com will use contoso.com as the identity provider.  The user is then imported into your directory and their attributes are copied into it.  The only thing that remains in the foreign AAD is the authentication.

Note that only non-Microsoft accounts are accepted.

Optionally, you can specify the “InviteAppID” column.  This will add the user to the given application within your tenant.


One of the advantages of importing the user account into your tenant is that you can assign it to groups within your tenant.

There is excellent documentation on this feature.  You can find a detailed walkthrough and learn about the current limitations.

Multi-tenant SaaS strategy

The strategy I’m giving here leverages Azure Active Directory and isolates each SaaS tenant in a separate AAD tenant.

You could use one AAD tenant and throw the user accounts of all your customers in it;  you would then rely solely on applications within your AAD to segregate your SaaS tenants.  Although this would work, it would be much harder to manage users this way.

Having separate AADs segregates your SaaS tenants more strongly.  For instance, you could wipe out an entire SaaS tenant by deleting the corresponding AAD.

Here is how it could look.


Cust-X & Cust-Y represent the same scenario:  the customers do not have a corporate AAD, hence you rely on AAD B2C to store and federate identities.  You can then use AAD groups within the AAD B2C tenant to control user membership and implement application roles.

Cust-Z is a scenario where the customer does have a corporate AAD and you leverage AAD B2B to import the users that will use your application.  The Cust-Z AAD tenant becomes a proxy for the corporate AAD.

A variant on this scenario is with Cust-W, where there is no proxy AAD and the AAD application used to authenticate your application’s users is within the corporate AAD of your customer.

So when would you choose the Cust-Z scenario over Cust-W?

Cust-Z allows the owner of the application (you) to create groups within the proxy AAD and include different users in them.  With Cust-W you can’t do that, since you do not control the AAD.  The advantage of the Cust-W scenario is that your customer can more easily control which users should access your application, since the application lives within their AAD.

Conclusion

So there you have it:  a strategy to manage identities within a multi-tenant SaaS solution.

As usual with architecture, there is more than one way to solve the problem, so your mileage will vary as you make different compromises.

Nevertheless, the combination of AAD B2C & AAD B2B adds a lot of flexibility to how AAD can manage external identities.

Integration with Azure Service Bus


I’ve been consulting for 1.5 years with a customer embarking on a journey leveraging Microsoft Azure as an enterprise platform, helping them rethink their application portfolio.

Characteristics of that customer:

  • Lots of Software as a Service (Saas) third parties
  • Business is extremely dynamic, in terms of requirements, transitions, partnerships, restructuring, etc.
  • Medium operational budget:  they needed to get it pretty much right the first time
  • Little transaction volume

One of the first things we did was to think about the way different systems would integrate, given the constraints of the organization’s IT landscape.

We settled on using Azure Service Bus for a lot of the integrations.  Since then, I’ve worked to help them actually implement that in their applications, all the way to the details of operationalization.

Here I wanted to share my lessons learned on what worked well and what didn’t.  Hopefully, this will prove useful to others setting out on a similar integration program.

Topics vs Queues

The first thing we decided was to use Topics & Subscriptions as opposed to queues.  Event Hubs didn’t exist when we started so it wasn’t considered.

They work in similar ways with one key difference:  a topic can have many subscribers.

This ended up being a really good decision.  It costs nearly nothing:  configuring a subscription takes seconds longer than just configuring a queue.  But it bought us the flexibility to add subscribers along the way as we evolved without disrupting existing integrations.

A big plus.

Metadata

In order to implement a meaningful publish / subscribe mechanism, you need a way to filter messages.  In Azure Service Bus, subscriptions filter topic messages on metadata, for instance:

  • Content Type
  • Label
  • To
  • Custom Properties

If you want your integration architecture to have long-term value and what you build today to be forward compatible, i.e. to avoid rework when implementing new solutions, you need to make it possible for future consumers to filter today’s messages.

It’s hard to know what future consumers will need, but you can populate the obvious.  Also, make sure your consumers don’t mind if new metadata is added along the way.

For instance, you want to be able to publish new types of messages.  A topic might start with orders published on it, but over time you might want to publish price-correction messages too.  If a subscription just takes everything from the topic, it will swallow the price corrections and potentially blow up the consumer.

One thing we standardized was the use of content type.  The content type tells consumers what type of message the content represents.  It would actually contain the major version of the message, so an old consumer wouldn’t break when we bumped a message version.

We used labels to identify the system publishing a message.  This was often useful to stop a publishing loop:  if a system subscribes to a topic it also publishes to, you don’t want it to consume its own messages and potentially re-publish information.  This field allowed us to filter those messages out.

Custom properties were more business specific and the hardest to guess in advance.  They should probably carry the main attributes contained in the message itself.  For an order message, the product ID, product category ID, etc. should probably be in there.
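
To make this concrete, here is a minimal sketch of publishing a message with that metadata, using the classic .NET Service Bus SDK (Microsoft.ServiceBus / WindowsAzure.ServiceBus);  the connection string, topic name, content type, label and property names are illustrative choices, not prescriptions:

using Microsoft.ServiceBus.Messaging;

class MetadataPublisher
{
    static void Main()
    {
        var connectionString = "<service bus connection string>";  // illustrative
        var topicClient = TopicClient.CreateFromConnectionString(connectionString, "orders");

        var message = new BrokeredMessage("{ \"orderId\": 42, \"productId\": 1234 }")
        {
            ContentType = "order-v1",      // message type + major version
            Label = "order-entry-system"   // publishing system, used to break publishing loops
        };

        // Business-specific attributes that future subscriptions can filter on.
        message.Properties["ProductId"] = 1234;
        message.Properties["ProductCategoryId"] = 7;

        topicClient.Send(message);
    }
}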

Filtering subscriptions

Always filter subscriptions!  This is the only way to ensure future compatibility.  Make sure you specify what you want to consume.

Also, something I noticed only too late while going into production:  filtering gives you a massive efficiency boost under load.

One of the biggest integrations we developed did a lot of filtering on the consumer side, i.e. the consumer C# code reading messages would discard them based on criteria that could have been implemented in subscription filters.  That caused the subscriptions to catch way more messages than they should have and take way more time to process them.

Filtering is cheap on Azure Service Bus.  It takes minutes more to configure but accelerates your solution.  Use it!
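
As a sketch with the same classic .NET SDK (topic, subscription and filter values are illustrative), creating a filtered subscription could look like this;  the filter runs on the broker, so non-matching messages never even reach the consumer:

using Microsoft.ServiceBus;
using Microsoft.ServiceBus.Messaging;

class FilteredSubscriptionSetup
{
    static void Main()
    {
        var connectionString = "<service bus connection string>";  // illustrative
        var namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);

        // Only "order-v1" messages not published by this system land in the subscription;
        // the "sys." prefix addresses system properties (ContentType, Label, ...) in SQL filter syntax.
        var filter = new SqlFilter("sys.ContentType = 'order-v1' AND sys.Label <> 'order-processing-system'");

        if (!namespaceManager.SubscriptionExists("orders", "order-processing"))
        {
            namespaceManager.CreateSubscription(
                new SubscriptionDescription("orders", "order-processing"),
                filter);
        }
    }
}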

Message Content

You’d better standardize the format of the messages you’re going to carry around.  Is it XML, JSON, .NET binary serialization?

Again, you want your systems to be decoupled, so having a standard message format is a must.

Automatic Routing

There is a nice feature in Azure Service Bus:  Forward To.  This is a property of a subscription specifying which topic (or queue) every message landing in the subscription should be routed to.

Why would you do that?

Somebody had a very clever idea that turned out to pay lots of dividends down the road.  You see, you may want to replay messages when they fail and eventually fall into the dead letter queue.  The problem with a publish / subscribe model is that when you replay a message, you replay it on the topic and all subscriptions get it.  Now if you have a topic with, say, 5 subscriptions and only one subscription struggles with a message and you replay it (after, for instance, changing the code of the corresponding consumer), then the 4 subscriptions that previously processed the message successfully will receive it again.

So the clever idea was to forward messages from every subscription to other topics where they could be replayed.  Basically we had two ‘types’ of topics:  topics to publish messages to and topics to consume messages from.
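
Here is a minimal sketch of that setup with the classic .NET SDK (all names are made up):  the subscription on the ‘publish’ topic forwards everything to a per-consumer topic, and the consumer reads from (and replays into) that second topic without affecting the other subscribers:

using Microsoft.ServiceBus;
using Microsoft.ServiceBus.Messaging;

class ForwardingSetup
{
    static void Main()
    {
        var connectionString = "<service bus connection string>";  // illustrative
        var namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);

        // Per-consumer 'consume' topic:  replays happen here and only touch this consumer.
        if (!namespaceManager.TopicExists("orders-billing"))
        {
            namespaceManager.CreateTopic("orders-billing");
        }

        // Subscription on the 'publish' topic that auto-forwards everything to the consumer's topic.
        if (!namespaceManager.SubscriptionExists("orders", "to-billing"))
        {
            namespaceManager.CreateSubscription(
                new SubscriptionDescription("orders", "to-billing") { ForwardTo = "orders-billing" });
        }
    }
}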

Semantics of Topics

While you are at it, you probably want to define what your topics represent.

Why not put all messages under one topic?  Well, performance for one thing, but probably also manageability at some point.  At the other end of the spectrum, why not one topic per message type?

Order.

Service Bus guarantees order within the same topic, i.e. messages will be presented in the order they were sent.  That is because you can choose to consume your messages (on your subscription) one by one.  But if messages are in different topics, you’ll consume them through different subscriptions and the order can be altered.

If order is important for some messages, group them under the same topic.

We ended up segmenting topics along enterprise data domains and it worked fine.  It really depends on what type of data transits on your bus.

Multiplexing on Sessions

A problem we faced early on was actually due to caring a bit too much about order.

We consumed messages one at a time.  That could have had performance implications, but the volume wasn’t big, so it didn’t hit us.

The problems start when you encounter a poison message, though.  What do you do with it?  If you let it reach the dead letter queue, you’ll process the next message and violate the order.  So we put a huge retry count on it so this would never happen.

But then that meant blocking the entire subscription until somebody got tired and looked into it.

A suggestion came from the Azure Service Bus product team itself.  You can assign a session ID to each message.  Messages with the same session ID are grouped together and ordered properly, while messages from different sessions can be processed independently.  Your subscription needs to be session-ful for this to work.

This allowed a single session to fail while messages in the other sessions kept being processed.

Now how do you choose your session ID?  You need to group together messages that depend on each other (order-wise).  That typically boils down to the identifier of an entity in the message.

This can also speed up message processing since you are no longer bound to one-by-one consumption.

After that, failing messages will keep failing, but that only holds up correlated messages.  That is a nice “degraded service level” as opposed to failing completely.
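
Here is a rough sketch of the pattern with the classic .NET SDK (topic, subscription and session values are illustrative):  the subscription requires sessions, the publisher sets the session ID to the entity identifier, and the consumer processes one session at a time while other sessions keep flowing:

using System;
using Microsoft.ServiceBus;
using Microsoft.ServiceBus.Messaging;

class SessionSample
{
    static void Main()
    {
        var connectionString = "<service bus connection string>";  // illustrative

        // The subscription must be session-ful for session-based consumption to work.
        var namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);
        if (!namespaceManager.SubscriptionExists("orders", "fulfillment"))
        {
            namespaceManager.CreateSubscription(
                new SubscriptionDescription("orders", "fulfillment") { RequiresSession = true });
        }

        // Publisher side:  group messages that must stay ordered under the same session ID.
        var topicClient = TopicClient.CreateFromConnectionString(connectionString, "orders");
        var message = new BrokeredMessage("{ \"orderId\": 42 }") { SessionId = "order-42" };
        topicClient.Send(message);

        // Consumer side:  accept one session and process its messages in order;
        // other sessions are processed independently (and keep flowing if this one fails).
        var subscriptionClient =
            SubscriptionClient.CreateFromConnectionString(connectionString, "orders", "fulfillment");
        var session = subscriptionClient.AcceptMessageSession(TimeSpan.FromSeconds(30));
        BrokeredMessage received;
        while ((received = session.Receive(TimeSpan.FromSeconds(5))) != null)
        {
            Console.WriteLine(received.SessionId);
            received.Complete();
        }
    }
}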

Verbose Message Content

One of the things we changed midway was the message content we passed.  At first we used the bus to really send data, not only events.

There are advantages to doing so:  you really are decoupled, since the consumer gets the data with the message and the publishing system doesn’t even need to be up when the consumer processes the message.

It has one main disadvantage when you use the bus to synchronize or duplicate data, though:  the bus becomes a train of data, and any time you disrupt the train (e.g. a failing message, a replayed message, etc.) you run the risk of breaking things.  For instance, if you swap two updates, you’ll end up with stale data in your target system.  It sounds far-fetched, but in operations it happens all the time.

Our solution was to simply send identifiers in the message.  The consumer would interrogate the source system to get the real data.  This way the data it got was always up to date.

I wouldn’t recommend that approach all the time, since you lose some of the benefits of the publish / subscribe mechanism.  For instance, if your message represents an action you want another system to perform (e.g. process an order), then having all the data in the message is fine.
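
A minimal sketch of the ‘identifiers only’ approach (the source-system client below is hypothetical):  the message only says which entity changed and the consumer fetches the current state when it processes it:

using System;
using Microsoft.ServiceBus.Messaging;

class ThinMessageConsumer
{
    static void Main()
    {
        var connectionString = "<service bus connection string>";  // illustrative
        var subscriptionClient =
            SubscriptionClient.CreateFromConnectionString(connectionString, "orders", "crm-sync");

        subscriptionClient.OnMessage(message =>
        {
            // The message only says "order X changed";  there is no payload to go stale.
            var orderId = (int)message.Properties["OrderId"];

            // Hypothetical client for the source system:  always returns up-to-date data.
            var order = new SourceSystemClient().GetOrder(orderId);

            // ... update the target system with the fresh data ...
        });

        Console.ReadLine();  // keep the message pump alive in this sketch
    }
}

// Hypothetical wrapper around the source system's API.
class SourceSystemClient
{
    public object GetOrder(int orderId) { /* call the source system here */ return null; }
}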

Summary

These were the key points I learned from working with Azure Service Bus.

I hope they can be useful to you & your organization.  If you have any questions or comments, do not hesitate to hit the comments section!

Azure Data Factory Editor (ADF Editor)

Azure Data Factory is still in preview but obviously has a committed team behind it.


When I looked at the service when the preview was made available last November, the first thing that struck me was the lack of an editor, of a design surface.  Instead, you had to configure your factory entirely in JSON.  You could visualize the resulting factory in a graphical format, but the entry mode was JSON.

Now don’t get me wrong.  I love JSON as much as the next guy, but it’s not exactly intuitive when you don’t know a service, and as the service was nascent, the documentation was pretty slim.

Obviously Microsoft was aware of that, since a few months later they released a light-weight web editor.  It is a stepping stone in the right direction.  You still have to type a lot of JSON, but the tool provides templates and offers you some structure to package the different objects you use.

It isn’t a real design surface yet, but it’s getting there.

I actually find this approach quite good.  Get the back-end out there first, get early adopters to toy with it and provide feedback, improve the back-end, iterate.  This way, by the time the service team develops a real designer, the back-end will be stable.  In the meantime the team can be quite nimble with the back-end, avoiding decisions such as “let’s keep that not-so-good feature because it has too many ramifications in the editor to change at this point”.

I still haven’t played much with ADF given I didn’t need it.  Actually, I could have used it on a project to process nightly data, but it wasn’t straightforward given the samples available at launch time, and the project was too high visibility and had too many political problems to add an R&D edge to it.  Since then, I have been looking at the service with interest and can’t wait for tooling to democratize its usage!

Azure Key Vault

Has somebody been peeking at my X-mas list?

Indeed, one of the weaknesses of the current Azure PaaS offering I pointed out last year was that on non-trivial solutions you end up with plenty of secrets (e.g. user name / password, SAS, account keys, etc.) stored insecurely in your web.config (or a similar store).

I was suggesting, as a solution, to create a Secret Gateway between your application and a secret vault.

Essentially, Azure Key Vault fulfils the secret vault part and a part of the Secret Gateway too.

Azure Key Vault, a new Azure service currently (as of early February 2015) in preview, allows you to store keys and other secrets in a vault.

One of the interesting features of Azure Key Vault is that, as a consumer, you authenticate as an Azure Active Directory (AAD) application to access the vault and are authorized as that application. You can therefore easily foresee scenarios where the only secret stored in your configuration is your AAD application credentials.
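
As a rough sketch of that pattern with the current .NET SDK (Azure.Identity & Azure.Security.KeyVault.Secrets, which post-date the preview discussed here; the vault URI, AAD application credentials and secret name are illustrative):

using System;
using Azure.Identity;
using Azure.Security.KeyVault.Secrets;

class VaultSample
{
    static void Main()
    {
        // The AAD application credentials are the only secrets kept in configuration.
        var credential = new ClientSecretCredential(
            "<tenant-id>", "<aad-application-id>", "<aad-application-secret>");

        var client = new SecretClient(new Uri("https://my-vault.vault.azure.net/"), credential);

        // Everything else (storage keys, connection strings, ...) lives in the vault.
        KeyVaultSecret secret = client.GetSecret("StorageAccountKey");
        Console.WriteLine(secret.Value);
    }
}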

The vault also allows you to perform some cryptographic operations on your behalf, e.g. encrypting data using a key stored in the vault. This enables scenarios where the consuming application never knows the encryption keys. This is why I say that Azure Key Vault performs some of the functions I described for the Secret Gateway.

I see many advantages to using Azure Key Vault. Here are the ones that come to the top of my head:

  • Limit the amount of secrets stored in your application configuration file
  • Centralize the management of secrets: if a key is compromised and you want to change it, there is no need to chase the config files storing it; simply change it in one place in the vault.
  • Put secrets in the right place: what is unique to your application? Your application identity itself, i.e. the AAD application credentials. That goes in your app config file; everything else goes in the vault.
  • Audit secret access
  • Easy to revoke access to secrets
  • Etc.

I think that, to be airtight, the Secret Gateway would still be interesting, i.e. an agent that authenticates on your behalf and returns you only a token. This way, if your application is compromised, only temporary tokens are leaked, not the master keys.

But with Azure Key Vault, even if you do not have a Secret Gateway, if your master keys are compromised you can centrally rotate them (i.e. change them) without touching N applications.

I’m looking forward to seeing how this service grows and will certainly consider it in future PaaS architectures.

Large Projects

There is something about large projects that you’ll never find, hence never learn, in smaller projects. The complexity, both technical and in terms of people dynamics, creates an entirely new set of challenges.

I read the article I Survived an ERP Implementation – Top 10 Gems of Advice I Learned the Hard Way at the beginning of the week. I was interested in the title since, although I often work in companies where the ERP occupies a central place (don’t they always?), I’ve never been part of an ERP implementation.

As I read the article though, I found many similarities between what the author was saying about the dynamics of an ERP implementation and large projects I’ve been on.

I therefore recommend it even if you don’t plan on implementing an ERP anytime soon 😉

For instance, here are off-the-top-of-my-head comments on some of her ERP gems that apply to large projects in general:

10. Don’t be fooled by the system sales team. If they tell you “of course our system can do that” or “absolutely, with small modifications”, have your technical experts talk to their technical experts. Go in with eyes wide open.

 

Overselling a product isn’t a monopoly of the ERP sub industry, unfortunately!

Any complex product can easily be oversold by sampling its feature sheet and matching it with a project’s requirements. The devil is often in the details, and on a large project you can’t dive into all the little details right from the beginning.

If you are the technical expert, or the one assigned to evaluate a product, here are a couple of tips:

  • Ask questions, lots of questions
  • You won’t be able to cover everything, so try to do horizontal and vertical sweeps: walk through an entire business process, look at an entire slice of data, look at an end-to-end identity journey, etc.

9. Whatever time period (and budget) you think will be required to go-live, you are most likely underestimating it. It’s tough enough to be immersed in an implementation, but continually pushing back the go-live date is deflating to the entire organization.

 

Large projects take more time than small projects, right? 😉 Well, no, they take even longer!

Large projects have explicit steps that are implicit or much smaller in scale on smaller projects: data migration, change management, business process optimization, user experience, etc. Each takes on a life of its own and shouldn’t be underestimated.

8. There is no possible way you can over-communicate. Regardless of forum, of timeliness, of method, there is no such thing as too much.

 

Large projects involve more people and last longer, which gives staff time to churn. Your message will get distorted through layers of teams, time, etc. Repeating your message ensures that pieces of it will reach their destination.

6. Data is sexy. Learn to love it; treat it with respect and care. It’s the backbone of a successful implementation. You don’t want to experience go-live with a broken back.

 

Amen.

 

5. A lack of change management will bite you in the butt. Post go-live, the speed at which you get through the hangover period will be heavily dependent on how well you managed change throughout the project.

 

You think the finish line is the delivery of the project? No, it’s the acceptance of the new product by the broad user base. If they reject it, whatever you have done won’t matter.

 

1. ERP implementations are equal parts politics and emotions. Ignore the effect of either of these at your own peril.

 

Expectations, perceptions, unspoken assumptions… ghosts that can harm you as much as the real thing. Do not ignore them!

Azure ACS fading away

ACS has been on life support for quite a while now.  It was never fully integrated into the Azure Portal, keeping the UI it had in its Azure Labs days (circa 2010, for those who were around back then).

In an article last summer, Azure Active Directory is the future of ACS, Vittorio Bertocci announced the roadmap:  the demise of ACS as Windows Azure Active Directory (WAAD) beefs up its feature set.

In a more recent article about the Active Directory Authentication Library (ADAL), it is explained that ACS didn’t get feature parity with WAAD on refresh token capabilities.  So it has started.

For me, the big question is Azure Service Bus.  The Service Bus uses ACS as its Access Control mechanism.  As I explained in a past blog, the Service Bus has a quite elegant and granular way of securing its different entities through ACS.

Now, what is going to happen to that when ACS goes away?  It is anyone’s guess.

Hopefully the same mechanisms will be transposed to WAAD.