Large Projects

There is something about large projects that you’ll never find, hence never learn, in smaller projects. The complexity, both technical and in terms of people dynamics, creates an all new set of challenges.

I read the article I Survived an ERP Implementation – Top 10 Gems of Advice I Learned the Hard Way at the beginning of the week. I was interested by the title since although I often work in companies where the ERP occupies a central place (don’t they always?), I’ve never been part of the implementation of an ERP.

As I read the article, though, I found many similarities between what the author says about the dynamics of an ERP implementation and the large projects I’ve been on.

I therefore recommend it even if you don’t plan on implementing an ERP anytime soon ;)

For instance, here are comments off the top of my head on some of her gems about ERP that apply to large projects in general:

10. Don’t be fooled by the system sales team. If they tell you “of course our system can do that” or “absolutely, with small modifications”, have your technical experts talk to their technical experts. Go in with eyes wide open.


Overselling a product isn’t a monopoly of the ERP sub industry, unfortunately!

Any complex product can easily be oversold by sampling its feature sheet and matching it against a project’s requirements. The devil is often in the details, and on a large project you can’t dive into all the little details right from the beginning.

If you are the technical expert or the one assigned to evaluate a product here are a couple of tips:

  • Ask questions, lots of questions
  • You won’t be able to cover everything, so try to do horizontal and vertical sweeps: walk through an entire business process, look at an entire slice of data, look at an end-to-end identity journey, etc.

9. Whatever time period (and budget) you think will be required to go-live, you are most likely underestimating it. It’s tough enough to be immersed in an implementation, but continually pushing back the go-live date is deflating to the entire organization.


Large projects take more time than small projects, right? ;) Well, no, they take even longer!

Large projects have explicit steps that are implicit, or much smaller in scale, on smaller projects: data migration, change management, business process optimization, user experience, etc. Each takes on a life of its own and shouldn’t be underestimated.

8. There is no possible way you can over-communicate. Regardless of forum, of timeliness, of method, there is no such thing as too much.


Large projects have more people involved and last longer, which gives staff time to churn. Your message will get distorted through layers of teams, time, etc. So repeating your message ensures that pieces of it will reach their destination.

6. Data is sexy. Learn to love it; treat it with respect and care. It’s the backbone of a successful implementation. You don’t want to experience go-live with a broken back.




5. A lack of change management will bite you in the butt. Post go-live, the speed at which you get through the hangover period will be heavily dependent on how well you managed change throughout the project.


You think the finish line is the delivery of the project? No, it’s the acceptance of the new product by its large user base. If they reject it, whatever you have done won’t matter.


1. ERP implementations are equal parts politics and emotions. Ignore the effect of either of these at your own peril.


Expectations, perceptions, unspoken assumptions… ghosts that can harm you as much as the real thing. Do not ignore them!

WADL in a bottle eating noodles

In my last entry about REST web services I talked about its biggest weakness for me: the lack of a description model for REST services.

The idea of hitting an HTTP endpoint as a shot in the dark is for me quite a leap of faith, and very likely an invitation to spend hours troubleshooting.

But despair no more: enter WADL! If it sounds like WSDL, it’s because it’s essentially the same acronym:

Web Services Description Language -> WSDL

Web Application Description Language -> WADL

So WADL aims to be the WSDL of REST.

But… it was submitted to the W3C in December 2009 by Sun Microsystems… one month before it was acquired by Oracle. Since then, it hasn’t budged… coincidence?

No other parties seem to have backed it, so it seems doomed to join the junkyard of unilateral attempts at standardizing global assets!

You can look up an example on Wikipedia.

Maybe we’ll have another standard one day. Or maybe it’s a non-issue and I’m the only one to worry about it.

REST style with Hypermedia APIs

Once upon a time there was SOAP. SOAP really was a multi-vendor response to CORBA. It even shares the same type of acronym, derived from object. Objects are so 90’s dude… The S in SOAP stands for Simple by the way. Have a go at a bare WSDL and try to repeat in your head that it is simple…

Then REST came along. I remember reading about REST back in 2002. It was a little after Roy Fielding’s seminal article (actually his PhD thesis). Then there were a few articles about how SOAP bastardized the web and how XML RPC was so much better. But like the VHS vs Betamax battle before it, the winner wasn’t going to be chosen on technical prowess. At least not at the beginning.

Then I stopped hearing about REST in 2003 and started seeing SOAP everywhere. We implemented it like COM+ interfaces really. A classic in the .NET community was to throw DataSets on the wire via SOAP services. That really was a great way to misuse a technology… Ah… the youth… (a tear).

Microsoft tried to correct the trajectory by introducing WCF, which enforced, or at least strongly suggested, a more SOA approach with a stronger focus on contracts and making boundaries more explicit. But somehow it was too late… something else was brewing beneath the SOA world…

In 2007, REST came back into fashion but now it was mainstream, i.e. people didn’t understand it, misquoted it and threw it everywhere. Basically, it was: cool man, no more bloody contracts, I just send you an XML document, it’s so much simpler! Which of course works awesomely for 2-3 operations, then you start to get lost without a service repository because there is no explicit documentation!

If you see a parallel with the No-SQL movement (cool man, no more bloody schema, I just throw data in a can without ceremony, it’s so much simpler), I got no idea what you are talking about.

Anyway, if it wasn’t obvious, I’m not at all convinced that REST services solve that many issues by themselves. Ok, they don’t require a SOAP stack, which makes them appealing for a broader reach (read browser & mobile). But without the proverbial Word document next to you telling you which service to call and what to do with it, they aren’t that easy to use.

Then, finally, came Hypermedia APIs… I’ve read a few articles about those, including the very good Designing and Implementing Hypermedia APIs by Mike Amundsen. I found in Hypermedia APIs the same magic I found when looking at HTML the first time: simple, intuitive & useful.

Hypermedia APIs are basically REST Web Services where you have one (or a few) entry-point operations from which you can find links to other operations. For instance, a list operation would return a list of items and each item would contain a URL pointing to the detail of the item. Sounds familiar? That’s how a portal (or dashboard) works in HTML.

Actually, you already know the best Hypermedia API there is: OData. With OData, you group many entities under a service. The root operation returns you a list of entities with a URL to an operation listing the instances of those entities.

The magic with Hypermedia APIs is that you just need to know your entry points and then the service becomes self-documented. It replaces a meta data entry (a la WSDL) with the service content itself.
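
To make the idea concrete, here is a minimal sketch of a client that only knows the entry point and discovers everything else by following links. The URLs, field names and the in-memory RESOURCES table are invented for illustration; a real client would issue HTTP requests instead of dictionary lookups.

```python
# Hypothetical in-memory "service": each URL maps to the representation it returns.
# All URLs and field names below are assumptions made for this sketch.
RESOURCES = {
    "/": {"links": {"orders": "/orders"}},
    "/orders": {
        "items": [
            {"id": 1, "links": {"detail": "/orders/1"}},
            {"id": 2, "links": {"detail": "/orders/2"}},
        ]
    },
    "/orders/1": {"id": 1, "total": 40},
    "/orders/2": {"id": 2, "total": 15},
}


def get(url):
    """Stand-in for an HTTP GET that returns a parsed representation."""
    return RESOURCES[url]


def order_totals(entry_point="/"):
    """Start at the entry point and follow links; no other URL is hard-coded."""
    root = get(entry_point)
    orders = get(root["links"]["orders"])
    return [get(item["links"]["detail"])["total"] for item in orders["items"]]


print(order_totals())
```

The client depends on link names ("orders", "detail") rather than on URL structure, so the service could move its resources around without breaking it.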

The difference between now and the 2000’s when SOAP was developed is that now we really do need Services. We need them to integrate different systems within and across companies.

SOAP failed to deliver because of its complexity but mostly because it’s a nightmare to interoperate (ever tried to get a System.DateTime .NET type into a Java system? Sounds trivial, doesn’t it?).

REST seems easier on the surface because it’s just XML (or JSON). But you do lose a lot. The meta-data but also the WS-* protocols. Ok it was nearly impossible to interoperate with those but at least there was a willingness, a push, to standardise on things such as security & transactions. With REST, you’re on your own. You want atomicity between many operations? No worries, I’ll bake that into my services! It won’t look like anything else you’ve ever seen or are likely to see though.

Mostly, you lose the map. You lose the ability to say ‘Add Web Reference’ and have your favorite IDE pump the metadata and generate nice strongly typed proxies that will show up in IntelliSense as you interact with the proxy. Sounds like a gadget but how much is IntelliSense responsible for the discovery of APIs for you? For me, it must be above 80%.

Hypermedia APIs won’t give you IntelliSense, but they will guide you in how to use the API. If you use them in your designs, you’ll also quickly find out that they drive you to standardise on representations.

Applied SOA: Part 9–Service Versioning

This is part of a series of blog posts about Applied SOA.

In this article, I’ll cover Service Versioning.

In SOA a Service usually has many consumers & it should be able to evolve independently of its consumers.  By that I mean that when a Service evolves, consumers shouldn’t break.  Otherwise we would have a hard time evolving services.

We can make quite a few changes to a Service without breaking existing consumers’ logic. For instance:

  • Changing Service implementation details, i.e. not its behaviour, e.g. tapping into another Database, adding logging, routing information to different systems, etc.
  • Adding a new Service Operation to a Service
  • Adding optional data-items in message payload
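
As an illustration of the last point, here is a small sketch of a consumer that keeps working when an optional data item is added to a message. The message shape and the field names are hypothetical, and the sketch assumes the consumer ignores fields it doesn’t know about (which isn’t the case with strict schema validation).

```python
def read_customer(payload):
    """A consumer written against the first version of a hypothetical message."""
    # Only the fields this consumer knows about are read; anything else is ignored.
    return payload["id"], payload["name"]


# Message as originally published.
message_v1_0 = {"id": 7, "name": "Ada"}

# Same message after the service added an optional data item.
message_v1_1 = {"id": 7, "name": "Ada", "nickname": "Countess"}

print(read_customer(message_v1_0))
print(read_customer(message_v1_1))
```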

What needs to remain stable for consumers to continue to operate normally is called the Service contract:  Service behaviours and information exchange patterns, i.e. schemas & bindings.  This definition of Service contract is broader than WCF Service Contracts.

A Service Contract can be backward compatible, i.e. a new Service Contract can support the same consumer base (for instance when we add a Service Operation without altering existing ones).

We need to version Service Contracts in order to accommodate breaking changes.  When the evolution of a Service requires introducing a breaking change in its Contract, we need to support the current version until all consumers can be moved to the new version.

We want to keep track of two types of Service Versions:  breaking and non-breaking.  Breaking versions require multiple versions to co-exist in order to support different consumers, while non-breaking versions are useful to determine the capabilities of a Service.

A typical versioning scheme is to track breaking change versions with a major version number while non-breaking versions are tracked with a minor version number.  This is the scheme used by Windows Azure REST services for instance.
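
Under that scheme, deciding whether a deployed Service can serve a given consumer becomes a simple comparison.  The sketch below assumes versions are written as "major.minor" strings; the function names are mine and don’t come from any particular product.

```python
def parse_version(text):
    """Split a 'major.minor' string into a pair of integers."""
    major, minor = text.split(".")
    return int(major), int(minor)


def can_serve(service_version, required_version):
    """True if a service can serve a consumer built against required_version.

    Same major number: no breaking change between the two contracts.
    Service minor >= required minor: every addition the consumer relies on exists.
    """
    service_major, service_minor = parse_version(service_version)
    required_major, required_minor = parse_version(required_version)
    return service_major == required_major and service_minor >= required_minor


print(can_serve("2.3", "2.1"))  # newer minor, same major
print(can_serve("2.3", "1.9"))  # different major
```

A consumer built against 1.9 would need version 1.x of the Service to remain deployed side by side with 2.3, which is the co-existence problem discussed next.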

Concurrent Service Versions

Maintaining more than one Service Contract version at the same time is notoriously hard.  Typically you introduce breaking changes because either a business process or a back-end system imposes that change on you.  Those changes are often hard to bury under a legacy Service Contract version.

That problem isn’t unlike its equivalent in the component world.  How do you evolve a component?  You version its interface and try to keep old interfaces backward compatible as long as you can before deprecating them.

Service Behaviours

Just a word about Service Behaviours:  those are encapsulated in the Service Contract alongside operation signatures.  Behaviours are any logic surfaced to the consumer, as opposed to implementation details which aren’t surfaced.

For example, if a Service encapsulates access to a Database, a consumer won’t know which Database the Service reads from / writes to.  The exact Database is therefore an implementation detail.

On the other hand, if a Service operation takes an integer as input and validates that it is greater than zero, raising a fault if that pre-condition isn’t met, that becomes a behaviour.  A consumer depends on that validation:  if a change requires the input to be greater than 5, consumers sending values between 1 and 5 will break.
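
Here is that example as a minimal sketch.  The operation name and the Fault class are hypothetical; the point is only that the signature stays identical while the behaviour changes.

```python
class Fault(Exception):
    """Stand-in for a service fault returned to the consumer."""


def make_reserve_operation(minimum):
    """Build an operation whose pre-condition is 'count must be greater than minimum'."""

    def reserve(count):
        if count <= minimum:
            raise Fault(f"count must be greater than {minimum}")
        return f"reserved {count}"

    return reserve


reserve_v1 = make_reserve_operation(0)  # original behaviour: input > 0
reserve_v2 = make_reserve_operation(5)  # changed behaviour: input > 5

print(reserve_v1(3))
try:
    reserve_v2(3)  # same signature, same input, but the consumer now gets a fault
except Fault as fault:
    print("fault:", fault)
```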

Service Behaviours are hard to document with tools.  For instance, WCF Service Contracts do not capture pre & post conditions.  This typically requires documentation on the side.

Applied SOA: Part 8–Security

This is part of a series of blog posts about Applied SOA.

In this article, I’ll cover security.  Security is a very broad topic and by no means unique to SOA.  I’ll focus on what is specific about service security.

I once had a colleague whose mantra around security was the triple ‘A’:

  • Authentication
  • Authorization
  • Audit

I actually find that triple ‘A’ a very good summary of security concerns in most systems:  who are you, what can you do with the system and keeping traces of what you did.


When a service is invoked, it is done by something (e.g. a browser, another service, etc.) on behalf of somebody (e.g. an end user, a system user, an anonymous agent).

Some concerns jump in:

  • How do we identify users in your system, i.e. what is the identity model in your system?  Is the identity a username?  A SID?  A set of claims?
  • What is the authentication process, what are the credentials involved?  User name / password?  Certificate?
  • Where and when is the authentication done?  Once and a proof of that authentication (e.g. token, cookie) is transported thereafter or is it done at each tier boundary?  Service boundary?
  • How does it flow through service composition?  Does the end user identity flow all the way or does the identity of services take over somewhere in the pipeline?  How do you address the case where a service composition is done through asynchronous means, i.e. can the invoker identity be stored with the asynchronous message?
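
For the asynchronous case in the last question, one option is to carry the invoker identity inside the message itself.  The sketch below uses an in-process queue and invented names (Message, on_behalf_of, submit_order); a real system would also have to protect that identity from tampering, for instance by carrying a signed token rather than a bare user name.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    operation: str
    payload: dict
    on_behalf_of: str  # identity of the original invoker, stored with the message


queue = deque()


def submit_order(end_user, order):
    """Front service: enqueue the work together with the end-user identity."""
    queue.append(Message("process_order", order, on_behalf_of=end_user))


def worker_step():
    """Back-end service: runs later, yet still knows who asked for the work."""
    message = queue.popleft()
    return f"{message.operation} for {message.on_behalf_of}"


submit_order("alice", {"sku": "A1"})
print(worker_step())
```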

In the Microsoft realm of technology, some classical approaches exist.  Windows Authentication is often used within a corporate network as a secure way to authenticate users with little friction (the user is already logged in on his / her workstation).  Identity can flow via impersonation or caller context.  The database is often considered a trusted subsystem, i.e. it doesn’t require the end-user identity and trusts the calling services.  In some cases, you might want to flow the identity right through to the database though, in order to strictly enforce security policy.

Outside corporate firewalls, identity is often managed using custom means, i.e. username / password and a combination of cookie & custom token.  The Claims-based security model is rising and becoming more common but is still a young technology.  Interoperability is a challenge with Claims-based security as very few products support it, hence you would have problems flowing the identity everywhere.  This is often managed by having a security policy of controlling the user identity at the Enterprise boundary and then flowing the service identity.


Now that you know the identity of the agent invoking your service, how do you control access?  Many models are possible given the following variables:

  • Are you going to be context insensitive (e.g. this operation requires role X to execute) or context sensitive (e.g. this operation requires role X and role Y.<the name of the department you are trying to access> to execute)?
  • Are you going to control the access at the operation border only?  If so, a user can either invoke your service operation or not.  You might also need to control what the user can see and perform security trimming even once the user has been given access to the service operation.
  • Are you going to grant access based on role (RBAC), on a set of claims or on a set of permissions, themselves determined from roles, claims and other data about the user identity?
  • Where are you going to control access?  Everywhere?  At an entry point only?
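
The first variable is easier to see side by side.  Below is a small sketch with hypothetical role names, where a context-sensitive role is written as "manager.<department>".

```python
def can_approve_expenses(roles):
    """Context insensitive: holding the role is enough."""
    return "manager" in roles


def can_approve_expenses_for(roles, department):
    """Context sensitive: the role must also apply to the department being accessed."""
    return "manager" in roles and f"manager.{department}" in roles


alice = {"manager", "manager.sales"}

print(can_approve_expenses(alice))
print(can_approve_expenses_for(alice, "sales"))
print(can_approve_expenses_for(alice, "hr"))
```

The context-sensitive check needs to know which resource is being touched, so it usually has to live closer to the service logic than a simple check at the operation border.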

Typically, you’ll have a mix of those variables in your system.  Just being aware of them and of which compromises you are making in your architecture will go a long way towards ensuring the clarity of the security model, how well it will be understood by different stakeholders and hence the strength of its eventual implementation.


Often forgotten in security is the audit.  Indeed, we can secure a system with access control (authorization), but another big requirement for a secure system is to be able to trace back what a given user did or who did what in a system.

Audit is the ability to trace back actions.  This is typically enabled by logging actions to a store.

Now the first question you need to ask yourself is whether you’ll need to audit easily, i.e. via some UI with search facilities, etc., in which case you want the logging information to be easily mined.  Or is audit going to come as a consequence of some rare legal problem, in which case you just want the information to be available so that a professional can go and mine it on an as-needed basis?

Logging can be done at different levels:

  • At the service boundary you can log information about an operation, payload in / out, time of request, identity of the user, etc.  In an SOA architecture, this should be sufficient to reconstruct user actions.
  • You can continue to log within a service, at different layers, even going as far as logging DB access.  This is useful if your service implementations change often or if their logic is strongly influenced by some configurable sources (e.g. business rule engine in BizTalk, custom application configuration, etc.) in which case service operation treatment might vary over time and it could be hard to reconstruct actions just by looking at the service log.
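
Logging at the service boundary can be sketched as a wrapper around each operation.  The decorator, the in-memory AUDIT_LOG list and the set_salary operation below are all hypothetical; a real service would write to a durable store.

```python
import functools
from datetime import datetime, timezone

AUDIT_LOG = []  # stand-in for a durable audit store


def audited(operation):
    """Record operation name, user identity, request time and payload in / out."""

    @functools.wraps(operation)
    def wrapper(user, **payload):
        entry = {
            "operation": operation.__name__,
            "user": user,
            "at": datetime.now(timezone.utc).isoformat(),
            "request": payload,
        }
        try:
            entry["response"] = operation(user, **payload)
            return entry["response"]
        finally:
            # The entry is written even when the operation raises.
            AUDIT_LOG.append(entry)

    return wrapper


@audited
def set_salary(user, employee, amount):
    return {"employee": employee, "salary": amount}


set_salary("alice", employee="bob", amount=50000)
print(AUDIT_LOG[-1]["operation"], AUDIT_LOG[-1]["user"])
```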

You can log actions within your system, e.g. by stamping changes with a user name, by keeping all changes, etc.  This is done by systems such as TFS which never forget anything.  It is easier to do in such systems because they have a generic entity management system.  On systems managing many types of entities (e.g. many tables in a DB), it can become difficult to keep all changes in the system without cluttering the information schema.  In those cases, you might want to perform logging on the side, e.g. in log files.

Regardless of where you log, it might be challenging to reconstruct user actions from logs.  So it’s better to think in advance about what type of reconstruction you would need to do to make sure it will be feasible.  Often the reconstruction scenarios are simple, e.g. who changed the salary of X?  But sometimes, it can be harder, for instance, if you need to reconstruct a long running business process where different people can have played different roles.
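
The simple scenario boils down to a filter over the log, provided the entries carry the right fields.  The log entries below are made-up sample data and the field names are assumptions; the sketch only shows that the question is answerable when operation, target and user are logged in a structured way.

```python
# Made-up sample entries, in the order they were written.
LOG = [
    {"at": "day 1", "user": "alice", "operation": "set_salary", "employee": "bob", "amount": 50000},
    {"at": "day 2", "user": "carol", "operation": "set_title", "employee": "bob", "title": "Lead"},
    {"at": "day 3", "user": "dave", "operation": "set_salary", "employee": "bob", "amount": 55000},
    {"at": "day 4", "user": "alice", "operation": "set_salary", "employee": "erin", "amount": 48000},
]


def who_changed_salary(log, employee):
    """Answer 'who changed the salary of X?' from structured log entries."""
    return [
        (entry["at"], entry["user"], entry["amount"])
        for entry in log
        if entry["operation"] == "set_salary" and entry["employee"] == employee
    ]


for change in who_changed_salary(LOG, "bob"):
    print(change)
```

If the payload had been logged as free text instead, the same question would require parsing, which is why it pays to decide on the reconstruction scenarios up front.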

Again, you’ll often end up with a mix of the options presented here, i.e. logging on the side + keeping last changes in store which you can then expose at the application level.

Golden rule

Architect with security in mind from day 1!

We repeat that everywhere all the time but more often than not, security comes second.  This is actually quite natural since as an architect you typically sit down with business people to spec out an application.  Those stakeholders often have very remote concerns about security.  A good way to get around this tendency is to involve security-aware people (e.g. legal, your Enterprise Security guru) early in the process.

Applied SOA: Part 7–System Consistency

This is part of a series of blog posts about Applied SOA.

In this article, I’ll cover system consistency.  In every distributed solution, a major concern should be consistency.  How do you ensure that different moving parts in your solution create a consistent state?

Many patterns and solutions exist to address that problem.  We’ll look at a few and see how they impact other attributes of a SOA solution.

Distributed ACID Transaction

The first solution that comes to mind is ACID transactions.  ACID transactions are a very elegant and very mature technology allowing distributed parts of a system to remain consistent.  I won’t go into details here and just assume you know the basics of ACID transactions.

ACID transactions can actually solve the consistency problem in a SOA solution as well.  We would need to use distributed transactions (through the WS-AtomicTransaction protocol for instance) where the transaction would span the calling tree of a service.  This way, everything in the calling tree would be part of the same transaction and would either commit or rollback as a whole.  It would conflict with some solution attributes of SOA though.

The main attribute it would conflict with is service autonomy / isolation.  A distributed transaction implies that a calling service holds locks in a composed service for a certain duration.  This means the composed service can’t guarantee to other callers that it can process their requests.  That control has been partially externalized by the transaction.  It breaks the trust boundaries between services:  a service now needs to trust the services invoking it, since they hold locks on its resources.  Locks also limit the scalability of services.

ACID transactions are excellent at ensuring data consistency but they come at a high cost.  This cost is deemed unacceptable in Cloud Computing because of scalability.  The same goes for services invoked across trust boundaries (e.g. B2B services).


The typical fall-back pattern when transactions aren’t appropriate is Workflow + Compensation.  With this pattern, each service exposes a rollback, or compensation, logic so that a calling service can call an operation and later on call another operation in order to rollback the effect of the previous one.  This typically assumes that some sort of workflow engine is at the root of all those calls, orchestrating the service call chain.

This pattern also works and solves the consistency problem.  It introduces different compromises.  For instance, services must trust the workflow engine to take care of the consistency.  Mostly, it is a more complicated solution.  Instead of relying on a system (the transaction coordinator) to take care of compensation, you basically must implement it yourself.
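
To give an idea of what implementing it yourself means, here is a bare-bones sketch of an orchestrator that calls each service operation in turn and, on failure, calls the compensations of the completed steps in reverse order.  The step names are invented, and the sketch leaves out everything a real workflow engine adds: persistence, retries and failures inside the compensations themselves.

```python
class StepFailed(Exception):
    """Raised by a step that could not complete."""


def run_workflow(steps):
    """Run (action, compensation) pairs; undo completed steps if one fails.

    Returns True if every action succeeded, False if compensation was triggered.
    """
    completed = []
    for action, compensation in steps:
        try:
            action()
        except StepFailed:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True


journal = []  # records what the hypothetical services were asked to do


def book_flight():
    journal.append("flight booked")


def cancel_flight():
    journal.append("flight cancelled")


def book_hotel():
    journal.append("hotel booked")


def cancel_hotel():
    journal.append("hotel cancelled")


def charge_card():
    raise StepFailed("card declined")


def refund_card():
    journal.append("card refunded")


succeeded = run_workflow(
    [(book_flight, cancel_flight), (book_hotel, cancel_hotel), (charge_card, refund_card)]
)
print(succeeded, journal)
```

The failed step is not compensated since it never took effect; only the two bookings that completed are undone, most recent first.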

Beyond the technical difficulty, it might be difficult to compensate after a certain lapse of time.  This would be a business problem.  A typical example is A puts $100 in B’s account, B pays a $90 bill, A wants to rollback its original transaction, but the money is already spent, what can we do?  This is where the orchestrating workflow will get more and more complicated.
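
The example works out as follows in a sketch that assumes B’s account started empty and can’t be overdrawn (both are assumptions of mine, as are the class and function names).

```python
class InsufficientFunds(Exception):
    pass


class Account:
    def __init__(self, balance=0):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        if amount > self.balance:
            raise InsufficientFunds(f"balance is {self.balance}, cannot withdraw {amount}")
        self.balance -= amount


def compensate_deposit(account, amount):
    """Naive compensation for a deposit: take the same amount back."""
    try:
        account.withdraw(amount)
        return True
    except InsufficientFunds:
        return False


b_account = Account()
b_account.deposit(100)   # A puts $100 in B's account
b_account.withdraw(90)   # B pays a $90 bill

compensated = compensate_deposit(b_account, 100)  # A wants to roll back
print(compensated, b_account.balance)
```

Only $10 is left, so the naive compensation can’t succeed.  What happens next (partial recovery, a debt on B’s account, a manual step) is a business decision the workflow has to model.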


We’ve seen the two basic patterns for consistency.  Typically we mix them together.  For instance, a service implementation will use ACID transactions internally to ensure consistency internally.  For read-only operations, composition without distributed transaction can be ok in many systems where changes aren’t happening too quickly.  This typically leaves you with only a small set of services requiring state-altering composition.  Those often are business processes and are best handled in a Workflow engine.