Internal Data Transfer Objects

The Layers Pattern aims to minimize complexity by grouping objects with similar responsibility. One common question that arises when people apply this Pattern is how to integrate the Presentation and Business Layers, the two topmost Layers in the diagram below.

I find it curious that the relationship between these two Layers in particular sparks debate, whereas interactions between others, such as between Business and Persistence, seem to be more straightforward. I believe that we should use the same mindset when talking about the relationship between any of these Layers, namely: the bottom Layer defines an API and the top Layer uses it. This API takes in and returns regular objects, and these objects encapsulate implementation details about which Layer they come from. The API is usually implemented as a Façade, and it returns proper, non-anemic objects, with state and behavior.
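To make this concrete, here is a minimal sketch of what such an API could look like in Java. Every name in it (Account, AccountFacade, InMemoryAccountFacade) is hypothetical and invented for illustration, and the in-memory implementation only stands in for a real one that would talk to the Persistence Layer.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical domain object: it carries state *and* behavior, so it is not anemic.
class Account {
    private final String id;
    private long balanceInCents;

    Account(String id, long balanceInCents) {
        this.id = id;
        this.balanceInCents = balanceInCents;
    }

    String id() { return id; }

    long balanceInCents() { return balanceInCents; }

    // The rule lives next to the data it guards.
    boolean canWithdraw(long amountInCents) {
        return amountInCents > 0 && amountInCents <= balanceInCents;
    }

    void withdraw(long amountInCents) {
        if (!canWithdraw(amountInCents)) {
            throw new IllegalArgumentException("Cannot withdraw " + amountInCents);
        }
        balanceInCents -= amountInCents;
    }
}

// Hypothetical Façade: the API that the bottom Layer defines and the top Layer uses.
interface AccountFacade {
    Account findAccount(String id);
}

// Assumption: an in-memory stand-in for the real implementation.
class InMemoryAccountFacade implements AccountFacade {
    private final Map<String, Account> accounts = new HashMap<>();

    void register(Account account) {
        accounts.put(account.id(), account);
    }

    @Override
    public Account findAccount(String id) {
        Account account = accounts.get(id);
        if (account == null) {
            throw new IllegalArgumentException("Unknown account: " + id);
        }
        return account;
    }
}
```

The Presentation Layer asks the Façade for an Account and talks to it directly, for instance calling canWithdraw() to decide whether a button should be enabled. How the Account was loaded stays hidden behind the interface.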

But this isn’t necessarily a popular perspective. It is somewhat common to use Data Transfer Objects as the communication medium between Layers.

I think that the usage of DTOs as described above has given rise to some of the most confusing architectures I’ve ever been exposed to.

DTO Quick Intro

Let’s quickly review what Data Transfer Objects are all about. Imagine a system with two nodes (e.g. Virtual Machines, processes, servers, web services… “node” here means something that has its own address space). Let’s imagine that you want to share data structures between these two nodes.

One popular pattern is to send a proxy of the object to the client, which will then treat this proxy as if it were the actual object. One problem I have found with this approach is that, when following usual Object-Orientation guidelines, the object sent between nodes shouldn’t execute too much logic on its own but rather collaborate with other objects to accomplish its goals. Those collaborating objects might not have been copied to the new server, so operations performed by the local proxy/copy may incur expensive RPC or IPC calls as the proxy reaches out to its constellation of collaborating objects to perform even the most basic tasks, like calling a toString() method.

Martin Fowler has cataloged the Data Transfer Object Pattern in his seminal Patterns of Enterprise Application Architecture book. This is a Distribution Pattern in which you wouldn’t just send a proxy of the object to the client, but a data structure that packs pretty much everything that the object will need to perform its tasks into a single object, optimized for distribution over the network. I like to think of it as a .tar.gz containing the objects we need.

Although there are many good reasons to use this Pattern, there are obviously some undesirable effects. The most impactful one, to me, is that suddenly you have to maintain two different and heavily coupled hierarchies of objects, the DTOs and the business objects. You also need to write and maintain the mapping logic that converts between one and the other.
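A rough sketch of that duplication, with hypothetical names (Customer, CustomerDto, CustomerMapper) that I made up for this example:

```java
import java.io.Serializable;

// Hypothetical business object: data plus the behavior that belongs to it.
class Customer {
    private final String firstName;
    private final String lastName;
    private final int loyaltyPoints;

    Customer(String firstName, String lastName, int loyaltyPoints) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.loyaltyPoints = loyaltyPoints;
    }

    String firstName() { return firstName; }
    String lastName() { return lastName; }
    int loyaltyPoints() { return loyaltyPoints; }

    String fullName() { return firstName + " " + lastName; }

    // Assumption for the example: 1000 points make a gold member.
    boolean isGoldMember() { return loyaltyPoints >= 1000; }
}

// The parallel hierarchy: the same fields again, shaped for serialization, no behavior.
class CustomerDto implements Serializable {
    private static final long serialVersionUID = 1L;

    String firstName;
    String lastName;
    int loyaltyPoints;
}

// The mapping logic: it has to be revisited every time either class above changes.
final class CustomerMapper {
    private CustomerMapper() { }

    static CustomerDto toDto(Customer customer) {
        CustomerDto dto = new CustomerDto();
        dto.firstName = customer.firstName();
        dto.lastName = customer.lastName();
        dto.loyaltyPoints = customer.loyaltyPoints();
        return dto;
    }

    static Customer fromDto(CustomerDto dto) {
        return new Customer(dto.firstName, dto.lastName, dto.loyaltyPoints);
    }
}
```

Adding a single field to Customer now means touching three classes instead of one, and that is with only one business object in play.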

This complexity cost is exacerbated when you consider that the DTO objects represent accidental complexity: they aren’t part of your domain model.

DTOs only exist because they are pretty much the only way in which we can make distributed systems work. Without something like them, most distributed operations would be extremely slow and inefficient.

DTOs are generally useful when building distributed systems, but I have found that it is very unlikely* that they are needed for local communication, such as when two Layers interact.

In the next sections, let’s explore some of the reasons people give when asked why they still use internal DTOs.

“Because MVC Requires DTOs”

I would argue that there is some generalised confusion about the MVC Pattern in our industry, especially about what the Model in it actually means. Let’s revisit some of the foundational literature to see if we can find some clarity.

The original MVC paper describes Model as:

A Model is an active representation of an abstraction in the form of data in a computing system

When cataloging the MVC Pattern, Martin Fowler says:

In its most pure OO form the model is an object within a Domain Model. You might also think of a Transaction Script as the model providing that it contains no UI machinery.

And in Refactoring: Improving the Design of Existing Code, he also says:

The gold at the heart of MVC is the separation between the user interface code (the view, these days often called the presentation) and the domain logic (the model). The presentation classes contain only the logic needed to deal with the user interface. Domain objects contain no visual code but all the business logic. This separates two complicated parts of the program into pieces that are easier to modify. It also allows multiple presentations of the same business logic. Those experienced in working with objects use this separation instinctively, and it has proved its worth.

Exploring these authors’ work, we find no direct relation between Layers and MVC. You can use one without the other. Applying a layered architecture together with MVC can be useful, though. I like the way Craig Larman explains this interplay in his book:

This is a key principle in the Pattern Model-View-Controller (MVC). MVC was originally a small-scale Smalltalk-80 pattern, and related data objects (models), GUI widgets (views), and mouse and keyboard event handlers (controllers). More recently, the term “MVC” has been co-opted by the distributed design community to also apply on a large-scale architectural level. The Model is the Domain Layer, the View is the UI Layer, and the Controllers are the workflow objects in the Application layer.

Following Larman’s description, MVC is an interesting way to organize Layers. We can draw a picture like this:

Which leads us to the conclusion that, when applying both Layers and MVC, our Domain Model plus the whole infrastructure that supports it is our Model in MVC parlance.

Back to DTOs, let’s take a look at how the original MVC paper thought about the relationship between Model and View:

VIEW

DEFINITION

To any given Model there is attached one or more Views, each View being capable of showing one or more pictorial representations of the Model on the screen and on hardcopy. A View is also able to perform such operations upon the Model that is reasonably associated with that View.

[…]

VIEWS

A view is a (visual) representation of its model. It would ordinarily highlight certain attributes of the model and suppress others. It is thus acting as a presentation filter. A view is attached to its model (or model part) and gets the data necessary for the presentation from the model by asking questions. It may also update the model by sending appropriate messages. All these questions and messages have to be in the terminology of the model, the view will therefore have to know the semantics of the attributes of the model it represents. (It may, for example, ask for the model’s identifier and expect an instance of Text, it may not assume that the model is of class Text.)

So the View in MVC is not only tied to the Model but it will also filter its data and display only what is relevant. That is very interesting because creating different perspectives over the domain model is one of the main reasons for people to use DTOs internally:

But what we’ve just read about the MVC Pattern tells us that this is the View’s role. This makes the DTO in the previous diagram completely unnecessary.

Therefore there’s nothing in the MVC Pattern per se that would require you to use internal DTOs. The View is responsible for accessing the Model and extracting what should be displayed in it.
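Here is a small sketch of a View acting as that presentation filter. The Employee and EmployeeBadgeView classes are hypothetical, and a real View would render to a widget or a template rather than return a String.

```java
// Hypothetical Model object: it knows more than any single View needs to show.
class Employee {
    private final String name;
    private final String department;
    private final long salaryInCents;

    Employee(String name, String department, long salaryInCents) {
        this.name = name;
        this.department = department;
        this.salaryInCents = salaryInCents;
    }

    String name() { return name; }
    String department() { return department; }
    long salaryInCents() { return salaryInCents; }
}

// Hypothetical View: attached to its Model, it asks questions in the Model's own terms
// and highlights only the attributes relevant to this representation.
class EmployeeBadgeView {
    private final Employee model;

    EmployeeBadgeView(Employee model) {
        this.model = model;
    }

    // The salary is suppressed simply because this View never asks for it.
    String render() {
        return model.name() + " (" + model.department() + ")";
    }
}
```

A payroll screen would be a second View attached to the same Employee, asking different questions. Neither View needs a DTO standing between it and the Model.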

Using it to Prevent Calls to Dangerous Methods

Another common reason given to justify the internal DTO approach is to prevent UI code (i.e. MVC’s View and Controller) from calling “dangerous” business methods present in domain model objects. Over time, I figured out that by “dangerous” people usually mean operations with side-effects.

DTOs allow you to hide those methods and, theoretically, people developing the front-end won’t be able to call them.

At first this sounds reasonable. Unless you are building super-rich desktop applications, it is not very likely that these methods should be called from the Presentation Layer. But I’d like to offer an alternative solution to this issue: just don’t call the bloody methods! Your developers are not children; they should need no fancy training wheels.

But even if you can’t trust the development team for some reason, there’s always a way to call those methods. It does not matter how many layers of DTOs you use to hide your business objects: a developer can always find a way.

And if you really, really want to do such a thing, there are other means. The simplest solution that I can think of is to define Checkstyle (or equivalent) rules that forbid those calls and break the build. If you really want to go all fancy, just define an interface that doesn’t have the “dangerous” methods and use something like Macker to forbid calls to the implementation.
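The interface idea could be sketched like this. All names here (ReadOnlyOrder, Order, OrderSummaryScreen) are hypothetical, and the static-analysis rule itself is not shown.

```java
// Hypothetical narrow interface: only side-effect-free questions.
interface ReadOnlyOrder {
    String number();

    boolean isCancelled();
}

// Hypothetical domain object: implements the narrow interface and adds the "dangerous" method.
class Order implements ReadOnlyOrder {
    private final String number;
    private boolean cancelled;

    Order(String number) {
        this.number = number;
    }

    @Override
    public String number() { return number; }

    @Override
    public boolean isCancelled() { return cancelled; }

    // Side effect: not reachable through ReadOnlyOrder.
    void cancel() { cancelled = true; }
}

// Presentation code depends on the interface only, so cancel() is not visible here.
class OrderSummaryScreen {
    String describe(ReadOnlyOrder order) {
        return "Order " + order.number() + (order.isCancelled() ? " [cancelled]" : " [open]");
    }
}
```

Nothing is copied and there is no mapping code to maintain. A determined developer could still cast back to Order, which is exactly the kind of dependency a build-breaking rule is meant to catch.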

Loose Coupling

When the Presentation Layer is too coupled with the Business Layer, we might need to change the UI code whenever there’s a change in the Business Objects, even if the change itself shouldn’t impact the user experience. DTOs are one way that people have found to help minimize the pain here. I personally think that it makes matters worse.

Prior to introducing DTOs, we would have two components, Business and Presentation. Whenever we change Business it is possible that we need to change the Presentation.

When adding a DTO, we now have three components: the Presentation, the DTO, and the Business. When Business changes, it is very likely that DTO has to be changed. And when the DTO changes, it will potentially require a change in the Presentation anyway.

The DTO here doesn’t solve the coupling problem; it only adds another moving piece. Instead of keeping two components in sync, you now have to keep three in sync.

This reminds me of a classic quote by David Wheeler:

Any problem in computer science can be solved with another layer of indirection. But that usually will create another problem.

The key to reducing coupling between Presentation and Business is to define a good API between them. Make sure that you are not leaking too much detail about how the Business Layer is implemented to the Presentation Layer, or vice-versa. If your Presentation Layer has to be extremely decoupled from the Domain Model, think about adopting a proper Presentation Model.
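For readers who haven’t met the Presentation Model before, this is roughly the shape I have in mind. It is a sketch with hypothetical names (Invoice, InvoicePresentationModel), not a full implementation of the Pattern.

```java
import java.util.Locale;

// Hypothetical domain object.
class Invoice {
    private final long totalInCents;
    private boolean paid;

    Invoice(long totalInCents) {
        this.totalInCents = totalInCents;
    }

    long totalInCents() { return totalInCents; }

    boolean isPaid() { return paid; }

    void markPaid() { paid = true; }
}

// Hypothetical Presentation Model: holds the state and behavior of one screen,
// independent of any GUI toolkit, and delegates to the real domain object.
class InvoicePresentationModel {
    private final Invoice invoice;

    InvoicePresentationModel(Invoice invoice) {
        this.invoice = invoice;
    }

    // Screen-specific formatting lives here, not in the domain object.
    String totalLabel() {
        long cents = invoice.totalInCents();
        return String.format(Locale.ROOT, "Total: %d.%02d", cents / 100, cents % 100);
    }

    boolean payButtonEnabled() {
        return !invoice.isPaid();
    }

    // Called by the widget's event handler; ignored when the button is disabled.
    void payClicked() {
        if (payButtonEnabled()) {
            invoice.markPaid();
        }
    }
}
```

Unlike an internal DTO, this object is not a copy of the domain data. It wraps the live Invoice and adds behavior that belongs to the screen, so there is no mapping step and nothing to keep in sync.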

Concluding

I am not sure why Data Transfer Objects are such a popular Pattern when integrating Presentation and Business Layers. I think that there are two main drivers:

Data Transfer Objects are often misused as records. Even after decades of doing Object-Oriented programming and using OO tools and languages, people still unconsciously fall back on the Procedural style, where a problem is solved by dumb data structures plus smart procedures. There’s nothing wrong with going Procedural if you know what you are doing, but this is never the case here.
Sun evangelized the use of what they call Transfer Objects (previously called Value Objects) in its EJB 2.x architecture. Those are internal or remote DTOs used to solve some problems introduced by Entity Beans and J2EE technology in general. In newer versions of the EJB spec, and in applications that don’t use that technology (e.g. those using Hibernate instead), the Pattern is not only unnecessary but also introduces new problems.