Sunday, April 13, 2014

There is always a single point of failure

Everyone these days is [or, at least, wants to show they are] concerned about system availability. And, as we all know, the best way to ensure availability is through redundancy. So most of the architecture documents going past me in recent years are careful to point out the lack of the dreaded single point of failure. But appearances can be deceiving. For example, please spot the single point of failure in the following classic 3-tier application:

A classic 3-tier application

Come on! It’s easy – it’s right there in red:

A classic 3-tier application in real life

If an airplane falls from the sky or the nearby river floods the datacenter your entire application is gone. Ok, no problem: we can start designing around that by spreading our application across two datacenters, but then the connectivity between them becomes the weak link as CAP Theorem raises its ugly head. And on, and on it goes…
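
To put rough numbers on this, here is a minimal sketch of the arithmetic. It assumes failures are independent, and the availability figures are hypothetical values picked only for illustration; `serial` and `redundant` are helper names invented for this example.

```python
from math import prod


def serial(*availabilities: float) -> float:
    """Availability of a chain in which every component must be up."""
    return prod(availabilities)


def redundant(availability: float, copies: int) -> float:
    """Availability of N independent copies when one surviving copy is enough."""
    return 1 - (1 - availability) ** copies


# Hypothetical figures, not measurements.
server = 0.99        # a single web, application or database node
datacenter = 0.999   # the building, power and network that all tiers share

# Three tiers, each doubled for redundancy.
tiers = serial(*(redundant(server, 2) for _ in range(3)))

# The shared datacenter sits in series with everything inside it.
application = serial(tiers, datacenter)

print(f"redundant tiers alone: {tiers:.6f}")
print(f"with the datacenter:   {application:.6f}")
```

However many copies of each tier we add, the application can never be more available than the one thing all the copies share.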

The moral of the story is: beware of architects who claim their system does not have a single point of failure – they were probably just not paying attention. There is always a single point of failure (and sometimes more than one). We need to make sure that we have identified all of them and are ready to live with the risk they pose. Or go back to the drawing board and design a solution to mitigate them, which will likely introduce another one or, if we are not careful, more than one.

Friday, February 14, 2014

On the Subtle Art of Technical Communications

What I have been doing for the last decade or so can be summed up in one word: Innovation. The main product of innovation is information and we communicate information via documents. I wanted to share some lessons I learned about creating documents that, I think, some of you might want to adopt or, at least, consider.

Remember the three main goals of technical writing – a good document balances all 3:

Educate – make people understand more than they did before reading your stuff
Inform – make sure people know more than they did before reading your stuff
Entertain – make sure no one ever feels bored while reading/listening to you. This last one is an art in its own right, and requires a combination of practice and talent. Intrigue can often serve as a good substitute: make a claim/statement/promise that will grab readers’ attention early on and then weave your story to substantiate it.

There is a fourth, ever-present, implicit goal: Influence. If you write or say something, there has to be a reason for it. In many cases it is to make or change your readers’ minds and everything you do should support this purpose.
Documents should be carefully architected, designed and built – just like software!
There should always be a coherent story – make sure you construct your narrative before you start your work.
People love stories and anecdotes: weave them into your story line to support each part of your narrative. A colleague of mine once tried to understand why I deliver a presentation that grips, engages and influences the audience every time and he never gets the same result when telling the same story using the same deck. He used a scientific approach and gathered data over a number of my presentations and noticed that I have a library of about 35 different stories, anecdotes and vignettes and each time I deliver a presentation I use a different subset of about 25 – roughly one per slide. He tried to imitate that approach and got much better traction with the audience.
Text is boring and tedious – use pictures. If you don't have a relevant technical picture, put a humorous one related to the key concept of the slide:

Talking about the dangers of exclusively relying on SSL for Web Service Security – use an image of leaky pipes:
Covering SLA Enforcement – show a speed trap:

Describing ways to interconnect virtual private clouds through an IPsec bridge – awe the reader with an image of colliding galaxies exchanging a stream of stars:

Couple of interesting posts by Nancy Duarte on the subject:

Saturday, November 9, 2013

SOA Governance Evaluation Guidelines

Original publication date: Oct 29, 2007

A pragmatic approach to assess and compare alternative solutions

Today I will try to outline the general principles which might be useful when selecting or architecting a comprehensive SOA Governance Solution. They have helped my team in the past and hopefully could be of some use to others as well.

Depth and Scope

By definition, Service Oriented Architecture represents an array of business and technical strategies that the IT organization of an enterprise pursues to promote the exposure of business functionality within and between enterprises in a consistent, published, secure and contractual way. Yet most “SOA enabler” products, including Governance Solutions, are centered around Web Services, WSDL and UDDI – none of which is an intrinsic part of SOA. The reason is that, due to its recent popularity, SOA has become a must-have on all product spec sheets and architecture proposals, and the vendors are trying to limit the scope of SOA to match their offerings. The following table summarizes the common misconceptions about the substance, scope and depth of SOA and some of its aspects:

SOA ≠ Web Services or Any Other Technology
SOA ≠ Integration or OOP or MOM
Service Registry ≠ UDDI
Service Description ≠ WSDL
Service Monitoring and Control ≠ JMX
Service Management ≠ Component Deployment
Service Versioning ≠ Repository Source Control
Service Security ≠ SSL or XML Firewall or Security Appliance
Service Accountability ≠ System Logs

Service QOS ≠ HA Solutions
Service Metering, Chargeback ≠ Usage Logs
Service Routing ≠ BPEL or MOM or Java Code
Service Interoperability ≠ HTTP and SOAP
Service Composability, Orchestration ≠ Published WSDL
Service Discoverability, Dynamic Binding ≠ UDDI
Service Fault Handling & Analysis ≠ Java Exception Stack or WSDL Fault or System Logs
Service Throttling ≠ Firewall or JMS or Application Server Configuration

When considering investment in a SOA Governance Solution, it is important to make sure that the prospective vendor or architecture team has not artificially narrowed the scope, and thus the future usability, of the solution by repeating any of these mistakes.

Solution Drivers

The following drivers should be decisive in evaluating the architecture and functionality of a governance solution:

Does it define a comprehensive Governance Information Model (GIM) which contains information required by all stakeholders and actors in the SOA landscape? Such a service model should be radically different from the metadata-based approaches to SOA Governance adopted by the majority of existing solutions: it should define a strongly typed, easily validated view of the governance domain with enforced referential integrity, while the latter simply adorn the service WSDL. The SOA service model should be extensible with metadata to allow users to capture additional service attributes, such as geography or pricing, which are pertinent to their business model, but the model itself should be defined as first-class data.
Does it separate the views of a service from the Technical, Business and Governance domains, allowing them to be expressed and maintained independently?
Does it provide support for all core SOA characteristics as well as any additional ones required by the enterprise without forcing the users to implement the characteristics that have no current business value in their domain?
Does it maintain complete separation between service implementations and service aspects implementing the SOA characteristics, allowing incremental adoption and rollout of governance solutions?
Does it facilitate the support of all existing and future SOA-related standards without forcing users to lock into any particular standards or profiles? Does it allow users to switch between competing standards or between proprietary and standard aspect implementations? In other words: can it be used as hedging against uncertainty in the SOA standards space?
Does it support packaging policies into reusable Governance Contracts to prevent uncontrollable policy application and proliferation? Does it help to prevent unnecessary service proliferation by exposing the same service implementation to different constituencies with different contracts?
Does it facilitate true service reusability through direct support for service virtualization and refinement? Service Refinement refers to the ability to slightly change the interface, vocabulary, granularity, semantics or behavior of enterprise services without changing the service implementation or affecting existing service consumers.
Does it provide complete support for the Web Services technology stack (HTTP, SOAP, WSDL, UDDI), while allowing service producers and consumers to use the technologies of their choice when implementing and consuming enterprise services? Does it minimally support JMS, EJB and Java service bindings, and allow users to define and implement new binding types?
Is it Aspect-based? Does it provide SOA adopters with the means to support all aspects of service governance without forcing them to accept any arbitrary decisions on which governance aspects they should implement or how these aspects should be implemented? Does it come out of the box with a reasonable governance model and a set of aspect implementations providing a usable starting point for recent adopters of SOA? At the same time, is it equally applicable to advanced service-oriented enterprises with an established governance model, standards and toolsets?
Does it support aspect delegation? Many service governance aspects, including security, versioning, monitoring and management, have existing packaged solutions broadly adopted in the prospective SOA marketplace. Examples of such solutions are Sun’s Access and Identity Managers and CA’s eTrust for security, IBM’s Tivoli and CA’s Unicenter for monitoring and management, and CVS and ClearCase for version control. Does it allow users who have already adopted a packaged solution in one or more aspect spaces to delegate the implementation of these aspects to those solutions, provided that the solutions themselves support such delegation? At the same time it should not rely on any external packaged solutions, providing independent implementations for all core aspects.
Are the services exposed through the solution consumable without any custom client code? Is it compatible with industry-standard tools like SoapScope? There is a contention between the need for interoperability and decoupling versus convenience and the need to hide complexities. Can the solution resolve this contention by offering service consumers a spectrum of options for invoking governed services? These options should include:
- Access points, which allow service invocation without any client-side components, but require client implementation to manage the complexities of dealing with find-bind-execute cycle and compliance with the governance policies.
- Client interfaces, which provide service-generic but client platform-specific solution components that insulate consumers from most of the complexities when invoking any exposed service.
- Convenience APIs, which provide both service- and client platform-specific layer for zero-effort invocation of designated services.
Does it manage the cost of compliance? Satisfying all of the above drivers would require a solution of a significant level of complexity - does that complexity compromise the overall viability of the SOA implementations it is designed to support? Specifically:
- Does it introduce performance bottlenecks to underlying service implementations?
- Does it introduce scalability bottlenecks to underlying service implementations?
- Does it introduce a single point of failure or negatively affect the availability of underlying service implementations?
- Does it introduce any additional vulnerabilities or negatively affect the security of underlying service implementations?
- Does it negatively affect the testability of underlying service implementations?

Brownfield-friendly SOA Governance

Original publication date: Nov 07, 2007

The time is now!

The real-life usage of such governance solutions is evolving even compared to last year. Until recently most companies that introduced SOA Governance did so in some form of pilot project, which usually represented a "greenfield" environment: service consumers, composite applications and often the services themselves were being developed at the same time. So the key factors on which SOA Governance solutions were evaluated were centered around their design- and run-time capabilities and operational characteristics but did not include the "brownfield" environment friendliness factor. Consequently SOA Governance vendors had little incentive to invest in those capabilities of their products. But all of this is about to change, and the ability to support effective and painless introduction into existing IT environments will soon become one of the key differentiators in the SOA Governance marketplace. We have experienced this first hand during a recent implementation of Sun Service Governance Framework (SGF) at a large European media company, during which it turned out that the majority of issues with development, implementation and rollout of the governance solution were directly related to the "brownfield" category. Specifically they included:

The need to reconcile and integrate Service Governance with the SDLC used by the client’s IT organization.
The need to support service governance across multiple development, testing, staging and production environments.
The need to provide support for “uncooperative” clients – the ones which are impossible or not feasible to change to accommodate governance-related service changes.
The need to quickly and efficiently bring large numbers of existing services under the control of the Governance Solution.
The need to support safe and effective sharing and migration of governance data between multiple environments.

It was estimated that without the above capabilities, the total effort required to introduce the Governance solution into the IT SOA landscape would exceed half of a man-year.

State of the union

I have not seen any full-spectrum SOA Governance products or technologies that provide noteworthy versions of the "brownfield" environment friendly capabilities described above. There are a number of design-time only governance products (Registries) that provide federation capabilities which could be utilized to support some form of migration and reconciliation of governance data across multiple environments. There are also some run-time only governance products which allow import and export of governance data and can be used to ease some of the pains of reconciling Governance with the SDLC. But that’s about it! Existing governance methods and solutions are focused on governing services in the context of an established SOA environment complete with underlying governance infrastructure. Let me give an example: WebMethods in their definitive whitepaper on the subject write: "Ensure that governance capability-related milestones are synchronized with SOA adoption milestones so that you do not end up trying to retro-fit governance after the fact" and "the right time for governance is before you put any services into place" - great advice if you only deal with clients that have never played with SOA before! This situation is most representative of a "greenfield" environment and is highly atypical for real-life enterprise IT. This static nature of service governance can become a significant barrier to its introduction (and consequently to the success of SOA overall) in "brownfield" environments with their existing sets of services, legacy consumers, third-party composite applications and established software development processes and practices.

The Answer

When we first recognized this problem (and the shortcomings of our own governance solution) we set out to define the list of capabilities and enhancements to SGF that would solve the challenge of [near] painless introduction of SOA Governance into existing SOA-based IT environments. This is what we ended up with:

Staging-aware SOA Governance which aims to resolve the disconnect between the fact that governance is essentially an oversight activity, and thus should happen in (or at least as close as possible to) production, and the need to put governance artifacts through the same QA processes as the rest of the IT assets.
SDLC support in Governance which addresses the fact that the transition from so-called monolithic or siloed applications to SOA has in fact, from the SDLC point of view, made the entire IT environment even more monolithic than it was before that transition. In the past at least these applications were independent from one another and could be taken through SDLC phases one at a time. As companies embrace SOA they face a potentially infinitely connected mesh of services, consumers and composite applications, and the only guaranteed safe option becomes taking the entire snapshot of enterprise IT through the SDLC, making it more difficult and costly than ever to introduce new changes required by the business. Extending the Governance solution with SDLC capabilities makes it possible to take individual services, consumers and entire composite applications through the lifecycle stages as required by the IT practices and procedures.
Transparent Governance Mode which resolves the tension between the need for a Governance platform to transform services and the need to support legacy clients that cannot (easily) change to accommodate governance-related changes to service interfaces. For example, declaring that a certain service has to be authenticated with WS-Security requires changing the WSDL to reflect the fact that it now needs a wsse-compliant header.
Bulk operations which would make it possible to quickly and consistently bring groups of existing services, spread throughout the Enterprise, under the umbrella of governance.

I believe that brownfield-friendliness will be a decisive differentiator amongst the SOA Governance products in the coming year so I am planning to talk about each of these capabilities in more detail in future posts.

Tuesday, November 5, 2013

SOA Governance Scorecard

I stumbled on an unanswered question, "how can we evaluate a governance solution on all the factors given by you?", posted for the SOA Governance Evaluation Guidelines article on my old blog, which somehow survived my separation from Sun a few years ago and its subsequent swallowing by Oracle. In fact, I have been getting a lot of inquiries lately about the stuff described in that blog. And, although I have not been focusing on SOA Governance and Security for the past two years, every time I do cursory research to answer a question I get the impression that things have not changed much since I left that scene.

So, in case Amit still needs the answer to his question, I have dug out the SOA Governance Scorecard which I defined five years ago to measure and track us against the competition. It uses 62 evaluation criteria organized in a three-level taxonomy. Quite a few people have found it useful and a year later Accenture even adopted it almost verbatim as their Accenture SOA Governance Vendor Analysis v 1.0. As I completed my archeological excavations, extracted the scorecard from the dig and gave it a gentle cleaning with my trusted brush, the criteria that emerged seemed surprisingly relevant: the only things I would change if I were writing it today would be to expand the aspect list in section 2.1 and add a new top-level section on Operational Readiness along the lines of the requirements outlined in my article on Brownfield-friendly Governance.

A sample scorecard based on these criteria can be downloaded here.
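
For readers who want to automate the roll-up, here is a minimal sketch of how an "overall weighted rating" could be computed across the three levels. The weighted-average formula, the weights and the scores are all assumptions made for illustration; the `Criterion` class is hypothetical and is not taken from the downloadable scorecard.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Criterion:
    """A node in the scorecard taxonomy: a scored leaf or a weighted group."""
    name: str
    weight: float = 1.0
    score: Optional[float] = None                 # set on leaf criteria only
    children: List["Criterion"] = field(default_factory=list)

    def rating(self) -> float:
        """Leaf score, or the weighted average of the children's ratings."""
        if not self.children:
            if self.score is None:
                raise ValueError(f"leaf criterion {self.name!r} has no score")
            return self.score
        total_weight = sum(child.weight for child in self.children)
        weighted_sum = sum(child.weight * child.rating() for child in self.children)
        return weighted_sum / total_weight


# Hypothetical weights and scores for a small slice of the taxonomy.
model = Criterion("1.1. Defines a comprehensive SOA service model", weight=2, children=[
    Criterion("1.1.1. Service Metadata", weight=1, score=4),
    Criterion("1.1.2. Strongly typed model", weight=3, score=2),
])
governance = Criterion("1. Governance", children=[
    model,
    Criterion("1.6. Focus of Governance Solution", weight=1, score=5),
])

print(f"{model.name}: {model.rating():.2f}")
print(f"{governance.name}: {governance.rating():.2f}")
```

Because each group normalizes by the sum of its own children's weights, a section can be re-weighted without touching the rest of the tree.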

Scorecard Criteria

1. Governance

overall weighted rating for the entire Governance Section.

1.1. Defines a comprehensive SOA service model

overall weighted rating for the Governance Model used by the solution.

1.1.1. Service Metadata

does the solution have the capability to associate metadata with individual services?

1.1.2. Strongly typed model

does the solution define a comprehensive SOA service model which contains information required by all stakeholders and actors in the SOA landscape? Such a service model would be radically different from the metadata-based approaches to SOA Governance adopted by the majority of existing solutions: it should define a strongly typed, easily validated view of the governance domain with enforced referential integrity, while the latter simply adorn the service WSDL.

1.1.3. Model validation

does the solution have the specific capability to validate the model and, most importantly, the conformance of services and other artifacts? Also takes into account whether the model only allows after-the-fact validation of published services or, through referential integrity, prevents users from creating invalid entries.
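
The difference between after-the-fact validation and referential integrity can be sketched as follows. This is only an illustration under assumed names: `Contract`, `Service` and `GovernanceModel` are hypothetical types, not part of any particular product.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Contract:
    name: str


@dataclass(frozen=True)
class Service:
    name: str
    contract: str   # name of the governance contract this service is bound to


class GovernanceModel:
    """Typed model that refuses entries which would break a reference."""

    def __init__(self) -> None:
        self.contracts: Dict[str, Contract] = {}
        self.services: Dict[str, Service] = {}

    def add_contract(self, contract: Contract) -> None:
        self.contracts[contract.name] = contract

    def add_service(self, service: Service) -> None:
        # Referential integrity: the invalid entry is rejected up front,
        # so there is nothing left to discover in a later validation pass.
        if service.contract not in self.contracts:
            raise ValueError(f"unknown contract {service.contract!r}")
        self.services[service.name] = service

    def remove_contract(self, name: str) -> None:
        # A contract cannot disappear while services still refer to it.
        users = [s.name for s in self.services.values() if s.contract == name]
        if users:
            raise ValueError(f"contract {name!r} is still used by {users}")
        del self.contracts[name]


model = GovernanceModel()
model.add_contract(Contract("gold"))
model.add_service(Service("CustomerLookup", contract="gold"))
```

A metadata-only approach would instead accept any string as the contract name and leave it to a later check, if any, to notice that nothing by that name exists.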

1.1.4. Enforceable Model

does the solution have the specific capability to enforce the governance information captured by the model during all parts of the governance cycle?

1.1.5. Extendable Model

does the solution offer the capability to extend the model to accommodate customer-specific requirements?

1.1.6. Separate consumer and provider views

does the solution differentiate between the governance data which should be disclosed to the consumers and the data which should only be available to the service providers and infrastructure? For example, the fact that a certain service requires a valid lease token should be made available to both the consumer and the enforcement agent, while the policy on how to deal with missing and expired lease tokens (whether to fail requests, include warnings, and / or raise alerts) should not be disclosed to the consumers.

1.2. Separates Technical, Business and Governance views of a service

does the solution separate the views of a service from the Technical, Business and Governance domains, allowing them to be expressed and maintained independently? A developer should not be concerned with the cost and subscription model of the services they are developing and consuming, and Business Actors should not worry about the bindings, protocols and lease requirements of the services their organization provides and utilizes. Such separation of concerns and un-concerns should be maintained throughout all aspects of a governance solution starting with the model.

1.3. Separates Service Lifecycle from SDLC

does the solution's governance offering maintain clear separation between the Service Lifecycle (including service definition, validation, approvals, publication, evolution and retirement) and the Software Development Lifecycle of the service implementations (coding, testing, QA, staging, production and maintenance)?

1.4. Breadth of Governance Solution

overall weighted rating for the breadth of Governance functionality addressed by the solution.

1.4.1. Run-time Governance

does the solution support run-time governance functions such as service invocation cycle (find-bind-execute), policy enforcement, security, service virtualization, protocol and binding adaptation, monitoring, management, etc?

1.4.2. Design-time Governance

does the solution support design-time governance functions such as contract, service and service offering definition and lookups, registry-repository access, etc?

1.4.3. Analysis-time Governance

does the solution support service categorization and discovery? Does it support definition and cataloging of reusable governance artifacts such as service and customer tiers, contracts, SLAs, etc?

1.4.4. Lifecycle Governance

does the solution support service lifecycle activities, approval workflows, lease renewal, service evolution and retirement?

1.5. Manages Policies as Reusable Governance Contracts

does the solution support packaging policies into reusable Governance Contracts to prevent uncontrollable policy application and proliferation? Does it help to prevent unnecessary service proliferation by exposing the same service implementation to different constituencies with different contracts?

1.6. Focus of Governance Solution

how focused is the solution on SOA Governance? Higher ratings are given to pure-play solutions focused exclusively on solving the service governance problems, versus other types of solutions such as Management Platforms, ESBs, EAI and Identity Management products and Composite Application Suites that have some SOA Governance capabilities.

2. Support for SOA characteristics

overall weighted rating for the capability of the solution to support various SOA characteristics. Takes into account support for core and additional characteristics, ability of the product to define and support client-specific characteristics and customer's flexibility in choosing which characteristics to implement and how to support them.

2.1. Supports all core SOA characteristics

overall weighted rating for support of the core characteristics.

2.1.1. Security

solution's support for SOA security. Takes into account support for different security aspects: Authentication, Authorization, Integrity, Confidentiality, Accountability, Identity Management and Security Policies; as well as the facets where they manifest themselves: Transport, Message, Application, Asset (Data), Knowledge and Control Security.

2.1.2. Versioning

solution's support for service versioning, including the ability to support multiple concurrent versions of the same service, designate primary versions, define version compatibility lists and adapt requests to deprecated versions. Also takes into account the ability to distinguish between version changes that change the service interface and those which only affect the governance contract.

2.1.3. Lease

ability to support service lease to assure true decoupling of service consumers and producers. Ability to issue, publish and validate lease tokens and implement service lease enforcement policies.
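
A minimal sketch of issuing and validating lease tokens might look like the following. It assumes a lease is simply a time-limited grant for one consumer and one service; `LeaseManager`, the in-memory store, and the `POLICY` table mapping outcomes to the fail / warn reactions mentioned in 1.1.6 are all hypothetical.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Lease:
    token: str
    consumer: str
    service: str
    expires_at: float


class LeaseManager:
    """Issues lease tokens and classifies the tokens presented with requests."""

    def __init__(self) -> None:
        self._leases: Dict[str, Lease] = {}

    def issue(self, consumer: str, service: str, ttl_seconds: float,
              now: Optional[float] = None) -> Lease:
        now = time.time() if now is None else now
        lease = Lease(secrets.token_hex(16), consumer, service, now + ttl_seconds)
        self._leases[lease.token] = lease
        return lease

    def validate(self, token: Optional[str], service: str,
                 now: Optional[float] = None) -> str:
        """Return 'valid', 'missing' or 'expired' for the presented token."""
        now = time.time() if now is None else now
        lease = self._leases.get(token) if token else None
        if lease is None or lease.service != service:
            return "missing"
        if now >= lease.expires_at:
            return "expired"
        return "valid"


# Provider-side enforcement policy (assumed values); not shown to consumers.
POLICY = {"valid": "allow", "expired": "warn", "missing": "fail"}

manager = LeaseManager()
lease = manager.issue("billing-app", "CustomerLookup", ttl_seconds=60, now=1000.0)
outcome = manager.validate(lease.token, "CustomerLookup", now=1030.0)
print(outcome, "->", POLICY[outcome])
```

Keeping the classification (`validate`) apart from the reaction (`POLICY`) lets the enforcement policy change without reissuing any tokens.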

2.1.4. Monitoring

solution's support for service monitoring (both real-time and historic).

2.1.5. Metering and Chargeback

solution's support for collecting service usage information suitable for generating consumer bills. Takes into account support for advanced billing capabilities, such as real-time checks and ability to deny requests for delinquent and overdrawn accounts.

2.1.6. Throttling

ability to define and implement different throttling policies to protect service providers and ensure sustainable service levels. Throttling can include limiting TPS per service, instance, client, etc.
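
As an illustration of one such policy, here is a minimal token-bucket sketch that limits TPS per key, where the key can be a service, an instance, a client or any combination of them. The token bucket is just one common technique chosen for the example; `TpsThrottle` and its parameters are hypothetical names.

```python
import time
from typing import Dict, Hashable, Optional, Tuple


class TpsThrottle:
    """Token bucket per key: each key may spend up to `tps` requests per second."""

    def __init__(self, tps: float, burst: Optional[float] = None) -> None:
        self.tps = tps
        self.capacity = tps if burst is None else burst
        # key -> (tokens currently available, time of the last request)
        self._buckets: Dict[Hashable, Tuple[float, float]] = {}

    def allow(self, key: Hashable, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        tokens, last = self._buckets.get(key, (self.capacity, now))
        # Refill in proportion to the elapsed time, never above capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.tps)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self._buckets[key] = (tokens, now)
        return allowed


# Limit each (service, client) pair to 2 requests per second.
throttle = TpsThrottle(tps=2)
key = ("CustomerLookup", "billing-app")
decisions = [throttle.allow(key, now=0.0) for _ in range(3)]
print(decisions)
```

Choosing what goes into the key is what turns the same mechanism into a per-service, per-instance or per-client policy.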

2.2. Supports any additional SOA characteristics

solution's ability to define, provision and enforce additional client-specific SOA characteristics.

2.3. Characteristics supported out of the box

number of SOA characteristics which solution supports out of the box (without any additional development).

2.4. Freedom to select any subset of supported characteristics

as the number of SOA characteristics supported by the solution increases, so does potential for overhead. This criterion assesses the ability to support a subset of relevant characteristics during the implementation to ensure that the solution does not introduce any unnecessary overhead.

2.5. Allows incremental rollout of characteristics

solution's ability to change the set of supported and implemented characteristics without affecting any existing service consumers or providers.

3. Architecture

overall weighted rating for architectural maturity and flexibility of the solution and its ability to support all aspects of SOA Governance as the notion of governance continues to evolve.

3.1. Aspect-Based Governance

support for aspect-oriented governance model, which decouples different SOA characteristics and implements them as independent aspect methods.

3.2. Governance Delegation Capability

overall weighted rating for support of aspect delegation. Many service governance aspects, including security, versioning, monitoring and management, have existing packaged solutions broadly adopted in the prospective SOA marketplace. Does the solution allow users who have already adopted a packaged implementation in one or more aspect spaces to delegate the implementation of these aspects to those packages, provided that the packages themselves support such delegation?

3.2.1. Complete Delegation

can the solution delegate entire aspects to an external package?

3.2.2. Partial Delegation

can the solution delegate parts of aspect implementations to an external package?

3.2.3. Orchestrated Delegation

can the solution delegate the entire aspect functionality to an external package but "rearrange" the default functionality through low-level APIs?

3.3. Supports SOA beyond Web Services

SOA is often confused with web services, which limits its applicability and adoption. This criterion evaluates how applicable the solution is to other SOA implementation technologies.

3.4. Supports Different Binding Types

overall weighted rating for support of different types of service bindings.

3.4.1. Web Services

support for the standard web service bindings (SOAP over HTTP/HTTPS).

3.4.2. JMS

support for JMS service bindings.

3.4.3. EJB

support for EJB service bindings.

3.4.4. POJO (Java)

support for POJO (pure java) service bindings.

3.5. Customer Defined Binding Types

ability to add new customer-specific binding types, such as .NET, AJAX, COBOL, etc.

3.6. Supports Dual Binding

does the solution support dual (consumer and provider side) bindings? This allows service creation and consumption to occur in multiple environments characterized by different platforms, technology preferences, skill sets, stages of lifecycle, etc. If the path to interoperability is seen as the use of a single underlying service implementation technology, such as Web Services, it would take an unacceptably long time to reach the level of adoption necessary for SOA to succeed.

3.7. Provides Complete Spectrum of Client Solutions

there is a contention between the need for interoperability and decoupling versus convenience and the need to hide complexities. Can the solution resolve this contention by offering service consumers a spectrum of options for invoking governed services?

3.7.1. Compatible with industry-standard tools

are the services exposed through the solution consumable without any custom client code? Is it compatible with industry-standard tools like Mindreef SOAPScope?

3.7.2. Provides Access points

does the solution provide access points, which allow service invocation without any client-side components, but require client implementation to manage the complexities of dealing with find-bind-execute cycle and compliance with the governance policies?

3.7.3. Provides Client interfaces

does the solution provide client interfaces, which provide service-generic but client platform-specific solution components that insulate consumers from most of the complexities when invoking any exposed service?

3.7.4. Provides Convenience APIs

does the solution provide convenience APIs, which provide both service- and client platform-specific layer for zero-effort invocation of designated services?

3.8. Manages the cost of compliance

does the solution help to manage the cost of compliance? Satisfying all of the above criteria would require a solution of a significant level of complexity - does that complexity compromise the overall viability of the SOA implementations it is designed to support? Specifically:

Does it introduce performance bottlenecks to underlying service implementations?
Does it introduce scalability bottlenecks to underlying service implementations?
Does it introduce a single point of failure or negatively affect the availability of underlying service implementations?
Does it introduce any additional vulnerabilities or negatively affect the security of underlying service implementations?
Does it negatively affect the testability of underlying service implementations?

3.9. Service Virtualization

does the solution provide capabilities for building task-specific “virtual” services from existing services? Does it allow you to:

Consolidate one or more operations from different services into one
Hide selected operations of an existing service
Rename selected operations of an existing service
Hide bindings and other details of the service implementation
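The capabilities listed above can be sketched as a small facade. This is only an illustration under stated assumptions: VirtualService, BillingService, CustomerService and all operation names are hypothetical and do not come from any particular product.

```python
from typing import Any, Callable, Dict, List

Operation = Callable[..., Any]


class VirtualService:
    """A task-specific facade assembled from operations of existing services."""

    def __init__(self) -> None:
        self._operations: Dict[str, Operation] = {}

    def expose(self, name: str, operation: Operation) -> "VirtualService":
        """Publish an existing operation, possibly under a new name."""
        self._operations[name] = operation
        return self

    def operations(self) -> List[str]:
        return sorted(self._operations)

    def invoke(self, name: str, *args: Any, **kwargs: Any) -> Any:
        # Anything that was not explicitly exposed stays hidden.
        if name not in self._operations:
            raise LookupError(f"operation not exposed: {name}")
        return self._operations[name](*args, **kwargs)


# Two hypothetical existing services.
class BillingService:
    def invoice_customer(self, customer_id: str) -> str:
        return f"invoice for {customer_id}"

    def purge_invoices(self) -> None:
        raise RuntimeError("should not be reachable through the virtual service")


class CustomerService:
    def get_customer_info(self, customer_id: str) -> Dict[str, str]:
        return {"id": customer_id, "name": "Example Customer"}


billing = BillingService()
customers = CustomerService()

order_desk = (
    VirtualService()
    .expose("lookupCustomer", customers.get_customer_info)  # renamed operation
    .expose("invoiceCustomer", billing.invoice_customer)    # consolidated from a second service
)  # purge_invoices is never exposed, so it is hidden
```

Consumers of order_desk see two operations and nothing about which underlying service, binding or method name stands behind each of them.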

3.10. Service Refinement

does the solution provide capabilities to slightly change the interface, vocabulary, granularity, semantics or behavior of enterprise services without changing the service implementation or affecting existing service consumers?

3.11. Non-invasive Architecture

is the solution built on a non-invasive architecture, or does it require existing services to be rebuilt, re-instrumented or redeployed to bring them under the aegis of SOA Governance?

4. SOA Standard Support

overall weighted rating for standard compliance and capability to support SOA related standards.

4.1. Complete support for the Web Services technology stack

overall weighted rating for standard compliance of the existing solution (HTTP, SOAP, WSDL, UDDI / ebXML).

4.1.1. Standard Service Registry

service registry supports UDDI and/or ebXML standards.

4.1.2. Service Repository

solution includes a Service Repository (there is no standard for repositories except ebXML Registry-Repository).

4.2. Capable of supporting all existing SOA-related standards

does the solution have an architecture in place to add support for any existing SOA-related standard not currently supported out of the box?

4.3. Capable of supporting any future SOA-related standards

does the solution have an architecture in place to add support for any future or custom SOA-related standard not currently supported out of the box?

4.4. Provides hedging against uncertainty in the SOA standards space

overall weighted rating of how the solution helps to address and mitigate the uncertainty in the WS standards space and the lack of many needed SOA standards.

4.4.1. Allows switching between competing standards

does the solution allow the implementation of a particular aspect to be switched between competing/complementary standards?

4.4.2. Allows switching between proprietary and standard aspect implementations

does the solution allow the implementation of a particular aspect to be switched between a proprietary one and one compliant with a standard or a draft?
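One way to picture the switching asked about in 4.4.1 and 4.4.2 is a registry that keeps several implementations of the same aspect and lets an administrator pick the active one. The sketch below is an assumption-laden illustration: AspectRegistry, the "addressing" aspect and both message formats are invented for this example and do not represent any real standard or vendor format.

```python
from typing import Callable, Dict

AspectImpl = Callable[[str], str]


class AspectRegistry:
    """Keeps interchangeable implementations of each aspect (hypothetical)."""

    def __init__(self) -> None:
        self._implementations: Dict[str, Dict[str, AspectImpl]] = {}
        self._active: Dict[str, str] = {}

    def register(self, aspect: str, name: str, impl: AspectImpl) -> None:
        self._implementations.setdefault(aspect, {})[name] = impl
        # The first registered implementation becomes the default.
        self._active.setdefault(aspect, name)

    def switch(self, aspect: str, name: str) -> None:
        if name not in self._implementations.get(aspect, {}):
            raise KeyError(f"no implementation {name!r} for aspect {aspect!r}")
        self._active[aspect] = name

    def apply(self, aspect: str, message: str) -> str:
        return self._implementations[aspect][self._active[aspect]](message)


registry = AspectRegistry()
# Both formats are made up; they only stand for "proprietary" and "standard".
registry.register("addressing", "proprietary", lambda m: f"[vendor-to:billing]{m}")
registry.register("addressing", "draft-standard", lambda m: f"<To>billing</To>{m}")


def send_invoice_request(customer_id: str) -> str:
    """Service code names the aspect only, never a concrete implementation."""
    return registry.apply("addressing", f"InvoiceCustomer({customer_id})")


before = send_invoice_request("c-1")
registry.switch("addressing", "draft-standard")
after = send_invoice_request("c-1")
```

The service code in send_invoice_request is untouched by the switch, which is the property the two criteria above are probing for.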

Monday, July 30, 2012

Evolution of Architecture part II: what is evolving?

I guess I should have been more specific in yesterday’s post when I wrote about the "forgiving environment". In order for evolution to work well, the environment has to be sufficiently (but not overly, to avoid complete extinction) unforgiving in providing evolutionary feedback to the species that are evolving. Feedback provided to other species and the ecosystem as a whole does not count! If a change in the design of bees causes one of the plant species that depend on them for pollination to fail, it will not act as evolutionary feedback on the bees themselves (provided they have alternative food sources) and will not force them to change further or abandon this evolutionary path.

If we take a closer look at what is actually evolving in the enterprise IT ecosystem, the answer will most likely be ideas and views (architectural memes) and practices. Both of these are carried by humans in a symbiotic (or parasitic – depending on the way one looks at it) relationship and the latter usually survive the death of organizations and projects: when have we last heard of an architect committing seppuku when his creation failed to attain the required systemic properties?

From the Wikimedia Commons. Copyright Kalyan Varma

So, similarly to the above bee analogy, the failure of a project or even an entire organization as a result of absent or subpar architecture will in most cases not provide noticeable evolutionary feedback to the ideas, views and practices that perpetrated it. On the contrary, such failure might even help to spread them as their hosts disperse through the organization and beyond like spores of a burst puffball mushroom. In an ironic twist, since the majority of people rarely see themselves at fault and their actions as a cause of failures, these architectural memes often mutate into best practices as they spread through and colonize the IT ecosystem.

So even though the Enterprise IT environment can be unforgiving to failures from a business point of view, it is, in most cases, mild and sheltering to the people and practices that cause them.

Friday, July 27, 2012

Is Agile a true Darwinian Force?

I have just come across a very interesting post on Igor Lobanov’s Enterprise Systems Engineering blog: Can an architecture emerge? The gist of the article, as I was able to understand it, is in answering the question posed in the title by highlighting the analogy between agile development (on both micro- and macro-levels) and the Darwinian natural selection process. Igor’s conclusion is that both are capable of architecture emergence but produce results burdened with evolutionary/architectural debt. I would wholeheartedly agree with the debt part, but let us take a closer look at the analogy itself. It is an original and very intriguing idea. There is no doubt that evolution drives outcomes that clearly look architected (hence propagating the meme of intelligent design). I would however argue that, although somewhat similar, agile development has a much weaker drive towards the emergence of architecture than evolution, for three primary reasons:

1. Lack of parallelism - nature "tries" millions of variations in parallel, while project managers, being a frugal bunch, typically allow only one development team/stream for each system being built. And although developers and organizations have some memory of past successes and failures, which provides some degree of virtual parallelism (for every path taken developers would have considered and rejected a few alternatives that either failed or delivered suboptimal results in the past), it provides orders of magnitude sparser coverage of the solution space. Furthermore, being human, we are naturally strongly biased towards remembering and replicating success, which leads us to favor local optima and to reuse past solutions and approaches outside the context where they originally emerged and were proven.

2. Forgiving environment – setting aside the clichés of the corporate jungle, most development occurs in a much more sheltered and forgiving environment than those found in nature. I often argue that it is one of the underlying root causes for the lack of respect towards architecture in corporate ecosystems: at its core, architecture is about doing things right, so why bother with it if there is little downside to messing up? In software development, almost working is often good enough and, in the absence of a ticking bomb of some kind, a working solution is almost never reengineered just because there is a possibility to make it better. In nature, however, not being up to scratch is punished very swiftly and even successful solutions are under constant pressure from the competition.

3. Shorter time horizon – nature had three billion years to evolve to the present state, which is quite a bit longer than most deadlines I have come across in the course of my career. Even if we compare the two processes in the units of their natural cycles (biological generations versus sprints or even code-build-deploy cycles), software development needs to arrive at the architecture emergence stage in no more than a few dozen generations.

If you accept the above differences, you would agree that emergence of meaningful architecture from a purely myopic (focused only on the current problem/iteration) agile development process is highly unlikely.

Monday, June 11, 2012

Service Refinement

The Philosopher’s Stone of SOA

Original publication date: Oct 30, 2007

This paper is dedicated to the problem of Service Refinement in the Enterprise SOA implementations and outlines my approach to solving this problem in the context of the comprehensive SOA Governance Solution. Let's start with a definition:

Service Refinement refers to the ability to slightly change the interface, vocabulary, granularity, semantics or behavior of enterprise services without changing the service implementation or affecting existing service consumers.

The Challenge

One of the core benefits of SOA is reuse, which means that ideally every service is implemented once and then used throughout the enterprise. In other words every business service, such as InvoiceCustomer, would have to be developed once and then could be used in hundreds of places across dozens of applications for a multitude of purposes. It is unlikely that the exact same interface, granularity, vocabulary and level of abstraction will be right for all possible service uses. Thus creating a single implementation that would out of the box satisfy all existing and future uses throughout the company is not realistic - it would have to be infinitely flexible to work on different levels of abstraction and granularity, with different sets of defaults and assumptions, etc. Infinite flexibility means infinite complexity: to design and build such a service would be very costly, require a very long time, and it would still likely miss some of the use cases.

Another approach is to define and create a reasonable 80/20 service and apply some glue as needed on the consumer side. This is also a very costly approach, potentially requiring service consumers to repeatedly re-write the missing 20% of the code in every place where the service is consumed. Furthermore, such glue code will be very tightly coupled with the service itself, so whenever the latter changes in any visible way, the owners of all composite applications that consume such service will have to go through every instance of that glue code and change it accordingly.

Neither of these approaches is acceptable in enterprise-level post-pilot SOA since both result in overcomplicated and fragile solutions, which often forces the adopters of SOA to take an even less desirable route of service replication thus ultimately defeating the purpose of SOA to save money and improve quality through reuse.

The Answer

The best way to implement service refinement is in the context of a comprehensive aspect-based SOA Governance solution. This architecture naturally supports service refinement through well-defined, reusable refinement aspects. The service producer defines and implements a reasonable service implementation, which is published and can later be combined with one or more refinement aspects into a number of consumable service offerings. Service consumers then select and bind to the offering which best matches their usage scenarios.

An even more powerful solution would combine the use of refinement aspects that implement common refinement concerns with support for decorating individual services with individual refinement methods or filters to address one-off requirements. When a new use case with some unique requirements is discovered, a new service offering can be created by combining the existing service implementation with a new refinement method, which can be developed by the service consumer, the producer or a third party.

The best part of the solution is that neither the service producer nor any of the existing consumers of basic or refined service offerings are affected by this change. Any future fixes and improvements to the service implementation would immediately take effect for all of the consumers.
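The idea of combining one service implementation with refinement aspects into several offerings can be sketched in a few lines of Python. This is a minimal illustration, not the governance solution itself: the offering helper, the granularity and vocabulary aspects and the sample customer record are all hypothetical.

```python
from typing import Any, Callable, Dict

Service = Callable[[str], Dict[str, Any]]
Aspect = Callable[[Service], Service]


def get_customer_info(customer_id: str) -> Dict[str, Any]:
    """Hypothetical coarse-grained service implementation, written once."""
    return {
        "id": customer_id,
        "name": "Example Customer",
        "address": "1 Example Street",
        "orders": ["o-1", "o-2"],
    }


def granularity(*fields: str) -> Aspect:
    """Refinement aspect: return only the listed fields."""
    def aspect(service: Service) -> Service:
        def refined(customer_id: str) -> Dict[str, Any]:
            full = service(customer_id)
            return {key: full[key] for key in fields}
        return refined
    return aspect


def vocabulary(renames: Dict[str, str]) -> Aspect:
    """Refinement aspect: translate field names into the consumer's vocabulary."""
    def aspect(service: Service) -> Service:
        def refined(customer_id: str) -> Dict[str, Any]:
            return {renames.get(k, k): v for k, v in service(customer_id).items()}
        return refined
    return aspect


def offering(implementation: Service, *aspects: Aspect) -> Service:
    """Combine one implementation with refinement aspects into a service offering."""
    refined = implementation
    for aspect in aspects:
        refined = aspect(refined)
    return refined


# The basic offering and a refined one share a single implementation.
basic_offering = offering(get_customer_info)
remote_offering = offering(
    get_customer_info,
    granularity("id", "name"),
    vocabulary({"name": "customerName"}),
)
```

Adding remote_offering changes neither get_customer_info nor basic_offering, and a later fix to get_customer_info reaches both offerings at once.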

The Uses

Below are some of the real-life SOA challenges that have been successfully solved through service refinement:

· Customer Information Service: following the SOA best practices, a company implemented a single coarse-grained service getCustomerInfo which provided consolidated customer information, instead of multiple fine-grained services like getCustomerName, getCustomerAddress, etc. This worked well for most consumers except a remote application which required only minimal customer information and was accessing the service over an unsecured, low-bandwidth, high-cost wireless connection. A simple granularity refinement method allowed the common lookup service to be reused for this remote application.

· Employee Lookup Service: a company developed a service to look up employee information based on a number of criteria. The service proved to be very useful and was immediately utilized in a number of applications run by different departments, including payroll, HR and training. However, it became apparent that although all of them had a concept of employee, each attached a different meaning to it: in the payroll system employees were everyone who gets paid, including permanent staff and contractors; the training system only considered permanent full-time employees who were eligible for training; and the HR software was required to also keep track of former employees. A number of straightforward semantic refinement methods made it possible to implement a common lookup service, which was not tied to any particular definition of employee, thus reducing coupling. When a new system was discovered in Legal, which had to track employees based on nationality, the same lookup service was reused without any changes.

· The same Employee Lookup Service: was easily made compliant with the new privacy regulations through the use of a vocabulary refinement method that transparently translated the employee social security numbers that were used as employee IDs into surrogate IDs for the consumers invoking the service from the overseas subsidiaries.
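The two Employee Lookup cases above can be sketched as follows. The records, the status values and the surrogate ID format are made up for illustration, and the surrogate table is a deliberately naive in-memory mapping rather than a real privacy mechanism.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class Employee:
    employee_id: str  # a social security number in the original service
    name: str
    status: str       # "permanent", "contractor" or "former"


Lookup = Callable[[], List[Employee]]


def lookup_employees() -> List[Employee]:
    """Hypothetical common lookup service returning made-up records."""
    return [
        Employee("000-00-0001", "Ann", "permanent"),
        Employee("000-00-0002", "Bob", "contractor"),
        Employee("000-00-0003", "Cy", "former"),
    ]


def semantic(service: Lookup, statuses: Set[str]) -> Lookup:
    """Semantic refinement: each consumer supplies its own meaning of 'employee'."""
    def refined() -> List[Employee]:
        return [e for e in service() if e.status in statuses]
    return refined


def surrogate_ids(service: Lookup) -> Lookup:
    """Vocabulary refinement: replace real IDs with opaque surrogate IDs."""
    table: Dict[str, str] = {}

    def refined() -> List[Employee]:
        result = []
        for e in service():
            # Reuse the surrogate if one was already assigned to this ID.
            surrogate = table.setdefault(e.employee_id, f"E{len(table) + 1:04d}")
            result.append(replace(e, employee_id=surrogate))
        return result
    return refined


payroll = semantic(lookup_employees, {"permanent", "contractor"})
training = semantic(lookup_employees, {"permanent"})
hr = semantic(lookup_employees, {"permanent", "contractor", "former"})
overseas_payroll = surrogate_ids(payroll)
```

Each department binds to its own offering, while lookup_employees itself carries no department-specific definition of an employee.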

Conclusion

Support for service refinement should be a crucial part of any SOA enablement platform. Similarly to service versioning, which allows services to evolve in time to meet changing consumer needs, refinement allows services to adapt to different usage scenarios without unnecessary duplication and service proliferation. A universal SOA governance solution makes it possible to manage service offerings based on refinement in the same way as those based on different security models or SLAs.

Defining the Internet Scale

I really like crisp definitions. Actually, that was an understatement – I obsess about them. Perfectly defining something helps me fully understand it, separate from related, but distinct concepts and, ultimately, conquer it. It’s like knowing the true name of your magic adversary in a fantasy novel.

I have been working on a paper about Monitoring on the Internet Scale and, naturally, I wanted to open it with the definitions of monitoring (I happen to have a fairly good one) and Internet Scale. After doing a lot of searching and quite a bit of thinking, I came up with the following:

Internet Scale is really big. While there is no common definition, systems that are rightfully referred to as being internet-scale typically consist of tens of thousands of nodes, serve tens of millions of users and perform tens of billions of actions per day. It is also important to point out that the size implied by the internet scale designation is constantly growing with its namesake.

And, being a visual person, I would usually accompany this definition with one of the colorful OPTE Project images depicting the entire internet:

However, I had a nagging feeling that this definition, like Plato’s plucked chicken, lacked something important. And today I came up with a better definition of an internet-scale system, which appears to capture the nature of the phenomenon while distinguishing it from other like concepts. It also does not need to be adjusted for inflation (of the internet). Without further ado:

An internet-scale system is a system where engineers stop worrying about scale, scalability and scaling and leave worrying about it to the accountants.

Thursday, March 8, 2012

Big Data + CEP = percolator?

The question of what is located at the intersection of the Complex Event Processing and Big Data technologies keeps coming up lately and I have not seen an adequate answer so far. Let me try to come up with one. Big Data is about solving really big problems (like back-rubbing the entire Web to determine the rank of every page and recalibrate ranking at the same time or assessing risk of the entire market). Most real life problems are well-posed or stable ones: small changes to the problem yield commensurably small changes to the solution. If the input data for the big problem we are trying to solve is continuously undergoing small changes, the typical approach is to freeze it at arbitrary intervals, solve and use this solution for the next period hoping that things are not changing too fast (overnight/weekly/monthly/quarterly batch).

Google built percolator to go from re-indexing the entire web every couple of days to being able to process each change detected by the crawler separately in a matter of minutes. CEP is about reacting to a vast stream of discrete small changes to the state of the universe or, to put it another way, about continuously varying input for a big data problem. Am I the only one who finds this problem both general and utterly fascinating? Is anyone aware of anything being done in this space outside Google and Microsoft (they were building their own percolator when I talked to them a year ago)? Is there a paying need for such a platform that would solve a continuously-changing big problem? I can think of at least two possible candidates:

· Market-in-a-box: a model of systemic risk of the entire financial market that can be used both to simulate strategies and their combinations and to conduct joint “maneuvers” or simulation exercises by major players.

· Omniscient (way beyond smart) Grid: giving utilities the ability to continuously solve the distribution problem based on real-time feeds from a myriad of smart meters.

A lot of people have pointed me to Storm as the answer. Based on a quick look at https://github.com/nathanmarz/storm/wiki/Rationale it does not appear to be drastically different from regular CEP/Stream processing platforms in the sense that I see nothing there to bootstrap or reset the percolation: the ability to solve a static Big Problem in the first place.
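To show what I mean by bootstrapping and then percolating, here is a toy sketch in Python. It is emphatically not how Google's percolator or Storm works: the "big problem" is reduced to counting inbound links per page (a crude stand-in for ranking), and the names solve_batch and IncrementalSolver are invented for this example.

```python
from collections import Counter
from typing import Dict, Set

Web = Dict[str, Set[str]]  # page -> set of pages it links to


def solve_batch(web: Web) -> Counter:
    """Solve the static problem from scratch: inbound link count per page."""
    counts: Counter = Counter()
    for links in web.values():
        counts.update(links)
    return counts


class IncrementalSolver:
    """Bootstraps with a batch solve, then absorbs one small change at a time."""

    def __init__(self, snapshot: Web) -> None:
        self.web: Web = {page: set(links) for page, links in snapshot.items()}
        self.counts = solve_batch(self.web)  # bootstrap (also usable as a reset)

    def on_change(self, page: str, new_links: Set[str]) -> None:
        """Apply a single crawler event: the page now links to new_links."""
        old_links = self.web.get(page, set())
        for target in old_links - new_links:
            self.counts[target] -= 1
            if self.counts[target] == 0:
                del self.counts[target]
        for target in new_links - old_links:
            self.counts[target] += 1
        self.web[page] = set(new_links)


solver = IncrementalSolver({"a": {"b", "c"}, "b": {"c"}})
solver.on_change("a", {"c", "d"})  # small change, small amount of work
```

The work done in on_change is proportional to the size of the change, not the size of the web, yet the result matches what a full batch solve over the updated snapshot would produce. The batch solver is still needed to get the process started, which is exactly the piece I did not see in Storm.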