Wednesday, August 19, 2009

Domain-Driven Design, Services, and Data Access

The term Domain-Driven Design, or DDD, has long since become a buzz phrase in the software community, ever since it was coined and popularized by Eric Evans in his book “Domain-Driven Design: Tackling Complexity in the Heart of Software”. The book does an excellent job of summarizing the essential principles of object-oriented software design based on domain modeling, stressing the importance of strategic design, refactoring, etc., as well as documenting various practical design patterns (some of which I like better than others, by the way). It is fair to say that long before the book first came out in 2004 and the DDD term was coined, the importance and necessity of properly modeling software systems around distinct, clearly defined functional domains was well understood among the better software engineers. However, the well-written and intelligently organized book on the subject was immediately embraced as a consolidated reference and an effective tool for evangelizing the concepts of good design.

To represent a system in a clear and effective way, a model should draw a clean distinction between its artifacts based on their roles and the functional domains they relate to, as well as accurately define the relationships between such artifacts and between the domains themselves. It is important and extremely useful not only to distinguish the model artifacts by their roles and purposes, but also to group them by functional domain.

Today, few people dispute the benefits of the DDD principles. As DDD gained wide acceptance and recognition, it inevitably became a buzz phrase. With that came some unwanted side effects.

Like any sophisticated and effective design methodology, DDD was not born out of nothing. It has its prerequisites. Understanding DDD requires familiarity with, and a deep understanding of, programming fundamentals, structured design, object-oriented principles, modularity, refactoring, etc. More than anything, to appreciate and benefit from DDD the programmer must truly understand the dangers of excessive complexity in software, and feel a sincere need to avoid such complexity by applying intelligent design and creative thinking. In other words, mechanically following the patterns listed in a book will lead you nowhere. Your heart and mind must be in it!

Lately, I have been reading or hearing comments similar to this: “Oh, we don't use services; we use DDD.” That usually comes with an implication that “services” are so last season...

Hmmm... Why? Is such a point of view based on only a superficial familiarity with DDD (perhaps limited to a second-hand familiarity with one or two of the design patterns from the Evans book)? Or is it, perhaps, based on a misinterpretation of Martin Fowler's article in which Fowler talks about the flaws of the anemic domain model? In the anemic model, domain objects are reduced to little more than data holders with getters and setters, while procedural services implement all the behavior applicable to the domain objects. Fowler, rightfully so, dismisses such an approach as an anti-pattern and a quite incompetent [mis]use of objects. I whole-heartedly agree with him, and with anyone who thinks that objects in a well-designed model should encapsulate both data and behavior. In the context of a Domain Model, this means that domain entities must implement the domain-entity-specific logic that belongs on those entities: the logic essential to the definition of the nature of the entity, the logic without which no instance of the entity would make complete sense. Such logic must include the ways the entity manages its own data as well as its relationships with the other domain entities that the entity instances directly rely on.

The logic inside an entity class should not, however, include anything that is completely foreign to the domain model in question, e.g. the business logic specific to a particular application. That includes any data access logic. Such logic, according to Eric Evans, should live in the Application Layer, or... services.

In other words, domain entities should contain as much logic as possible, but no more than that!
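To make the distinction concrete, here is a minimal sketch of an entity that carries its own essential behavior and nothing else. The Order and OrderLine classes, their fields, and the choice of whole cents for prices are my own assumptions for illustration; they do not come from the Evans book or from any particular system.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical entity used only for illustration.
final class OrderLine {
    private final String productCode;
    private final int quantity;
    private final long unitPriceCents;

    OrderLine(String productCode, int quantity, long unitPriceCents) {
        // Rules without which an order line would make no sense live on the entity.
        if (productCode == null || productCode.isEmpty()) {
            throw new IllegalArgumentException("product code is required");
        }
        if (quantity <= 0) {
            throw new IllegalArgumentException("quantity must be positive");
        }
        if (unitPriceCents < 0) {
            throw new IllegalArgumentException("unit price must not be negative");
        }
        this.productCode = productCode;
        this.quantity = quantity;
        this.unitPriceCents = unitPriceCents;
    }

    String productCode() { return productCode; }

    long subtotalCents() { return quantity * unitPriceCents; }
}

// Hypothetical entity: it manages its own data and its relationship to its lines.
final class Order {
    private final List<OrderLine> lines = new ArrayList<>();

    void addLine(OrderLine line) {
        if (line == null) {
            throw new IllegalArgumentException("line is required");
        }
        lines.add(line);
    }

    // Read-only view, so callers cannot bypass addLine().
    List<OrderLine> lines() { return Collections.unmodifiableList(lines); }

    long totalCents() {
        long total = 0;
        for (OrderLine line : lines) {
            total += line.subtotalCents();
        }
        return total;
    }

    // Deliberately absent: save(), load(), sendConfirmationEmail(), or any
    // reference to a DAO or repository. Those belong to application services.
}
```

The entity is not anemic (it validates itself and computes its own total), yet nothing in it ties it to one application or one data store.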

However, let's take a look at how some programmers (and architects!) interpret this seemingly simple and obvious concept. It is not uncommon to see application logic and data access logic wired directly into the domain entity classes. Needless to say, this makes the domain objects rigid, hard-wired for a specific application, and often all but useless when new business scenarios come up, even within the same enterprise. Have these folks, perhaps, skipped the “Introduction to Structured Programming” and “Object-Oriented Programming, Part I” classes on their way to becoming Sr. Software Engineers?

Embedding any notion of persistence in a domain entity (be it self-persisting methods or a reference to some object repository) is not my idea of a good design pattern, by any means. (This is where I might disagree with Eric Evans, and many others, but I am sticking to my guns on this subject, and please don't crucify me for this.)

The mere fact that an instance of A may, under certain circumstances, participate in scenario B does not mean that the logic for B must be forever burnt into A, making the two inseparable. The basic rules of software design and common sense suggest that we must always do our best to separate and de-couple things that do not need to be molded together. Not the other way around!

Naturally, one of the most essential steps in object modeling is to properly identify which data and which behavior actually belong on each given object/class. That is by no means always a trivial task. I may sound as if I am talking to my 3-year-old when I repeat the same thing over and over: not all functionality that may be applied to a class must live on that class. It seems such an obvious and easy-to-understand point. But why do programmers continue to stuff application logic into their domain entities? And, may I note, very proudly so. I have lost count of the self-proclaimed DDD nouveau experts whose idea of good OO design is consolidating all (thinkable and unthinkable) business logic inside a domain entity.

When asked, they always point at the Evans DDD book. As if the book really promotes such nonsense.

Have they really read and understood the book? Doesn't Eric Evans specifically state that any application logic should live in the Application Layer, or … er... application services? Amazingly, this simple idea is overlooked by the zealots who proudly state that they "use DDD, and not services."

All this only proves to me how dangerous any tool, methodology, or teaching can be in the hands of those who have no patience or desire to actually learn it and understand the principles behind it. Instead, many people prefer to skim the surface and stop as soon as they come across something that seems like a quick solution to their immediate problem. Unfortunately, such an attitude is very common in the software industry today. It is common for programmers to search for easy, ready-to-use solutions on the internet, grab the first one that looks like it does the job, then cut and paste it into their applications without understanding how it works. And it may not work at all. I have seen that too often...

De-coupling the application logic from the domain logic makes it possible to use the same domain models in various contexts and applications, without having to produce duplicates and without creating more work and hurdles. Any functionality that does not conceptually belong on the domain entity should live on a different type of object.

Generally speaking, use-case-specific operations must live inside the object that implements the use case. It does not matter what you call such objects. Personally, I see no problem with calling them “services.” If you don't like the word or think it has been stigmatized, feel free to call them something else. Just don't put your specific application's logic into a true domain entity class.

Use cases are normally specific to a particular application, not a generic domain.

Domain models, generally, should be designed to be useful for more than one particular application. At the very least, they should be re-usable by potentially multiple applications within the enterprise, if appropriate, or multiple portlets within a portal, etc. One generic thing that a Domain may helpfully provide for the applications to implement is the API for domain service objects, whenever appropriate, of course.


Functional Domains as Reusable Components

It is convenient and very effective to consolidate each distinctive functional domain inside a dedicated component. This means that all software artifacts for the given domain are physically grouped together. In terms of Java packages, this suggests that all classes for the given domain are packaged under the same root package. Applications may use these domain packages as JAR dependencies and, if necessary, provide their own application-specific implementations of the domain service APIs (if the domain model provides such APIs), including the application-specific data access details, in their “application layers.”

People like to say that they don't believe in re-use, that any new project requires writing brand new classes from scratch anyway, etc. I categorically disagree with such a philosophy. I can't tell you how many times I have benefited from re-using generic domain components I had written, by using them in more than one application. The key here is thoughtful design and getting things done well the first time around. There is no point in trying to re-use a poorly written, rigid, and unadaptable piece of code... That's why I believe that getting things done well really pays off in the long run. Of course, as time goes by, I find myself making adjustments and improvements to such components, but that is just normal refactoring.

A Domain Model normally consists of the following types of domain artifacts:

  • Domain entities: these are the subjects and actors in the given domain. Domain entity objects must have no knowledge of any application-specific functionality whatsoever. They should not define any persistence logic, nor should they store any direct references to any type of object that implements data access. For example, some “UsZipCode” class may implement such behavior as ZIP validation, parsing of the input data, splitting the ZIP into a 5-digit code and a 4-digit extension, or representing itself in several different ways: as a 5-, 9-, or 11-digit code, etc. All such functionality is essential to defining the very concept of a US ZIP Code. However, it has nothing to do with the world outside the class itself. Nothing in such a ZIP Code class depends on, or makes assumptions about, how the objects of the class will be used in applications. Entity definitions must be de-coupled from any external operations that may be performed on the entity instances. The need for such operations (use cases) may come and go, but the domain entity objects will remain what they are, regardless of how they are used. It is absolutely valid to expose domain entity objects to the presentation tier (e.g. controllers, form beans, action forms, etc.) and to DAOs. If necessary, instances of domain entities may and should be passed between the presentation tier classes and the application services in the middle tier.

  • Domain Services (APIs): these define the operations that do not conceptually belong on domain entities. They may be generic domain-specific use cases/scenarios that are applicable to the domain entities but cannot be considered an integral part of the essential behavior of any particular domain entity. These use cases define the context in which some instances of the domain entities may be used and any operations applicable to these entities. The actual implementations of these service APIs may be application specific. The clients of the services may be application presentation tiers, external applications (e.g. web applications, batch processes, web services, etc.), or other services. A “service” module exposes only the “use cases” the service implements, via its public API/Domain Model. Everything else, including any data access logic and technology, should be considered implementation details of the service that are not directly exposed to the clients. Use cases implemented by services must reflect logical operations and never expose a notion of any particular data store. For example, a client that submits a product order should only be aware of the mere placing of the order, with no indication of what the underlying system actually does with the order, whether the data is stored in a database, etc. As Eric Evans and Martin Fowler both point out, application services would normally be fairly lightweight, abstracting only the minimum of application-specific logic and, if necessary, persistence. It is absolutely fine for such services to do little more than forward to the underlying DAOs. That is a very small price to pay for flexibility, the ability to swap implementations, and easier future maintenance.

  • Domain Factories that produce instances of domain objects;

  • Domain Value Objects, if necessary; I can't say I often find much use for pure value objects, however.
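The “UsZipCode” example from the list above can be sketched roughly as follows. This is only an illustration under my own assumptions: the method names (isValid, parse, fiveDigit, extension, format, digitsOnly) are invented here, only the 5- and 9-digit forms are handled, and the 11-digit representation is left out for brevity.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a self-contained ZIP Code class.
final class UsZipCode {
    // Accepts "12345", "12345-6789", or "123456789".
    private static final Pattern FORMAT = Pattern.compile("(\\d{5})(?:-?(\\d{4}))?");

    private final String fiveDigit;
    private final String extension; // null when no ZIP+4 extension was given

    private UsZipCode(String fiveDigit, String extension) {
        this.fiveDigit = fiveDigit;
        this.extension = extension;
    }

    // Validation: part of what defines a ZIP Code.
    static boolean isValid(String input) {
        return input != null && FORMAT.matcher(input.trim()).matches();
    }

    // Parsing of the input data, splitting it into code and extension.
    static UsZipCode parse(String input) {
        if (input == null) {
            throw new IllegalArgumentException("ZIP code is required");
        }
        Matcher m = FORMAT.matcher(input.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a valid ZIP code: " + input);
        }
        return new UsZipCode(m.group(1), m.group(2));
    }

    String fiveDigit() { return fiveDigit; }

    boolean hasExtension() { return extension != null; }

    // Returns null when the ZIP has no extension.
    String extension() { return extension; }

    // "12345" or "12345-6789".
    String format() {
        return hasExtension() ? fiveDigit + "-" + extension : fiveDigit;
    }

    // "12345" or "123456789".
    String digitsOnly() {
        return hasExtension() ? fiveDigit + extension : fiveDigit;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof UsZipCode
                && digitsOnly().equals(((UsZipCode) other).digitsOnly());
    }

    @Override
    public int hashCode() { return digitsOnly().hashCode(); }

    @Override
    public String toString() { return format(); }
}
```

Nothing in the class knows who will use it: a controller, a service, and a DAO can all pass UsZipCode instances around without the class changing.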

A domain-specific component abstracts any implementation details of the functionality it provides while exposing the operations via its “public interfaces.” In some cases, it may not even have to provide the implementations at all, leaving that task up to its clients. In terms of a programming language such as Java, the Application Programming Interfaces (APIs) of such components are defined as methods that represent the business operations of the given functional domain. These methods may live on the entities themselves, if appropriate, or on domain services if such services are applicable to the given domain. The domain entity objects may be returned by and/or accepted as the arguments of such interface methods.

Domain components may use other components and 3rd party libraries as their dependencies. Each component is developed, maintained, and distributed independently of any client applications or other components that may rely on its functionality.


Data Access

Some use cases implemented by a service or component may require access to data resources such as databases, in-memory data, file systems, web services, or other types of remote or local data sources.

I whole-heartedly believe that a data access operation should not be considered a use case in its own right outside the context of the business operation that requires that data access in order to complete. It should always be abstracted by a service that implements the use cases.

For example, some Order service may expose the "Place Order" Use Case API that reveals nothing about how and where, or whether, the order should be saved. It only tells the client that the order will be processed and the client can expect the result promised by the API. Once the client calls this generic API, the actual service implementation may (or may not) need to ask another service (e.g. some Product Service) to check whether the given product is available, etc., and then invoke its data access module to save the order in the data store defined by the service configuration. The Order service is agnostic of the Product service’s data access implementation and is only aware of the Product service’s public interface, e.g. the API that implements the “Check Product Availability” use case. The client, in this particular case, is not aware of any communication between the Order service and the Product service. The Order service API, in this example, serves as a single-entry façade for the clients who need to place orders. Simple.
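A rough sketch of that “Place Order” scenario might look like the following. All of the names (OrderService, ProductService, OrderDao, DefaultOrderService, InMemoryOrderDao, ProductUnavailableException) and the string confirmation number are assumptions made for this example only; a real service would have its own contract.

```java
import java.util.ArrayList;
import java.util.List;

// Every name below is hypothetical and exists only to illustrate the idea.
final class Order {
    private final String productCode;
    private final int quantity;

    Order(String productCode, int quantity) {
        this.productCode = productCode;
        this.quantity = quantity;
    }

    String productCode() { return productCode; }

    int quantity() { return quantity; }
}

class ProductUnavailableException extends RuntimeException {
    ProductUnavailableException(String productCode) {
        super("Product is not available: " + productCode);
    }
}

// The use-case API the client sees: nothing here hints at storage.
interface OrderService {
    String placeOrder(Order order); // returns a confirmation number
}

// Public interface of another service; its data access is invisible here.
interface ProductService {
    boolean isAvailable(String productCode, int quantity);
}

// Implementation detail of the order service, never handed to clients.
interface OrderDao {
    String save(Order order);
}

// One possible data store; a database-backed DAO could replace it.
final class InMemoryOrderDao implements OrderDao {
    private final List<Order> saved = new ArrayList<>();

    @Override
    public String save(Order order) {
        saved.add(order);
        return "ORD-" + saved.size();
    }

    int count() { return saved.size(); }
}

// Single-entry facade: checks availability via the other service, then saves.
final class DefaultOrderService implements OrderService {
    private final ProductService products;
    private final OrderDao orders;

    DefaultOrderService(ProductService products, OrderDao orders) {
        this.products = products;
        this.orders = orders;
    }

    @Override
    public String placeOrder(Order order) {
        if (!products.isAvailable(order.productCode(), order.quantity())) {
            throw new ProductUnavailableException(order.productCode());
        }
        return orders.save(order);
    }
}
```

A client holds only an OrderService reference and calls placeOrder(...); whether a Product service was consulted, or where the order ended up, never shows in the API.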

A data access object (DAO) is a Java class (a POJO, of course) that abstracts any data access logic and data source for the given service. DAOs are not limited to implementing access to databases. They may abstract any kind of data source, including in-memory data, files, data accessed via web services, etc.

I do not share the preference of some architects that DAOs should be designed on a one-per-entity or one-per-table basis. Such an approach, in my view, has at least two major flaws. First, it produces extra complexity due to a large number of fine-grained DAO/repository classes. Second, it usually exposes the notion of persistence to the entities. A service, on the other hand, provides a single logical entry point per use case while abstracting any DAOs/repositories inside and normally grouping data access functionality by use-case relevance, not by entity type. This way, one may get by with only one or two DAOs per service vs. one per entity type. Since data sources and the data access details are usually specific to a particular application, I lean towards providing application-specific implementations of DAOs (one or several) for each domain/application service.

Generally, a service may abstract a single DAO or a set of dedicated DAOs. Any given DAO should be used by only one service and not exposed to multiple services. Since data access operations are nothing more than implementation details of the more generic use cases represented by the service, no other service or application should ever need to access the DAOs directly. Instead, they must talk to the APIs of the service that wraps the DAO(s).
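As a small illustration of a service wrapping its DAO, here is a sketch in which the same “Check Product Availability” use case runs over two interchangeable data sources. ProductDao, InMemoryProductDao, PropertiesProductDao, and ProductService are hypothetical names, and the java.util.Properties text merely stands in for a file-based source; it is an assumption of this example, not a recommendation.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical DAO: shaped by what the use case needs, not by a table.
interface ProductDao {
    int unitsInStock(String productCode);
}

// In-memory data source.
final class InMemoryProductDao implements ProductDao {
    private final Map<String, Integer> stock = new HashMap<>();

    InMemoryProductDao(Map<String, Integer> initialStock) {
        stock.putAll(initialStock);
    }

    @Override
    public int unitsInStock(String productCode) {
        return stock.getOrDefault(productCode, 0);
    }
}

// Stand-in for a file-based source: "code=units" lines in properties format.
final class PropertiesProductDao implements ProductDao {
    private final Properties stock = new Properties();

    PropertiesProductDao(String propertiesText) {
        try {
            stock.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public int unitsInStock(String productCode) {
        return Integer.parseInt(stock.getProperty(productCode, "0").trim());
    }
}

// The service owns the DAO privately and exposes only the use case.
final class ProductService {
    private final ProductDao dao;

    ProductService(ProductDao dao) {
        this.dao = dao;
    }

    boolean isAvailable(String productCode, int quantity) {
        return quantity > 0 && dao.unitsInStock(productCode) >= quantity;
    }
}
```

Swapping the data source means constructing the service with a different DAO; callers of isAvailable(...) are unaffected, and no other service ever touches the DAO itself.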


The above are my personal views and recommendations, which I hope the reader finds helpful and comprehensive. I don't have a goal of converting everyone to my point of view. These approaches to multi-tier architecture and domain modeling, however, are shared by many architects and have proven quite effective. There is no single good way to solve every problem. Needless to say, variations and alternatives to the approaches described here are relevant and often desirable. The general goal, however, should never change: keep things as simple and clean as possible.