Bad Code: Too Many Object Conversions Between Application Layers And How to Avoid Them

Have you ever worked with an application where you had to copy data from one object to another and another and so on before you could actually do something with it? Have you ever written code to convert data from XML to a DTO to a Business Object to a JDBC Statement? Again and again for each of the different data types being processed? Then you have encountered an all too common antipattern of many “enterprise” (read “overdesigned”) applications, which we could call The Endless Mapping Death March. Let’s look at an application suffering from this antipattern and how to rewrite it in a much nicer, leaner and easier-to-maintain form.

Note: This is more a design discussion than bad vs. good code but I still think it fits this blog. The code for the “good” part is available at GitHub.

The application, The World of Thrilling Fashion (or WTF for short), collects and stores information about newly designed dresses and makes it available via a REST API. Every poor dress has to go through the following conversions before reaching a devoted fashion fan:

  1. Parsing from XML into an XML-specific XDress object
  2. Processing and conversion to an application-specific Dress object
  3. Conversion to a MongoDB’s DBObject so that it can be stored in the DB (as JSON)
  4. Conversion from the DBObject back to the Dress object
  5. Conversion from Dress to a JSON string

Uff, that’s a lot of work! Each of the conversions is coded manually, and if we want to extend WTF to also provide information about trendy shoes, we will need to code all of them again. (Plus a couple of methods in our MongoDAO, such as getAllShoes and storeShoes.) But we can do much better than that!
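To see where the effort goes, here is a rough sketch of three of those five conversions written by hand. The post does not show the real XDress and Dress classes, so the fields below (title/colour, name/color) are made up, and a plain Map stands in for MongoDB’s DBObject.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical XML-layer object; the real XDress fields are not shown in the post.
class XDress {
    String title;
    String colour;
}

// Hypothetical application-layer object.
class Dress {
    String name;
    String color;
}

class HandCodedConverters {
    // Conversion 2: XML-specific object -> application object.
    static Dress toDress(XDress x) {
        Dress d = new Dress();
        d.name = x.title;
        d.color = x.colour;
        return d;
    }

    // Conversion 3: application object -> document for the DB
    // (a Map stands in for MongoDB's DBObject here).
    static Map<String, Object> toDocument(Dress d) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("name", d.name);
        doc.put("color", d.color);
        return doc;
    }

    // Conversion 4: document from the DB -> application object.
    static Dress fromDocument(Map<String, Object> doc) {
        Dress d = new Dress();
        d.name = (String) doc.get("name");
        d.color = (String) doc.get("color");
        return d;
    }

    public static void main(String[] args) {
        XDress x = new XDress();
        x.title = "Evening Gown";
        x.colour = "red";
        Dress roundTripped = fromDocument(toDocument(toDress(x)));
        System.out.println(roundTripped.name + " / " + roundTripped.color);
    }
}
```

Every new field has to be added in each of these methods, and every new data type needs its own copy of all of them.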

Eliminating the Manual Conversions

It’s time-consuming, error-prone and annoying to code all the conversions while you actually want to use your time to build business logic and not some boilerplate code. We can eliminate the manual work in two ways:

  1. Generalize the conversions so that they only need to be written once (likely leveraging existing conversion libraries)
  2. Eliminate them, e.g. use the same data format through the complete processing chain

To be fair, I have to admit that the manual approach also has some advantages: you have full (fool? :-)) power over the form of the objects and can fit them perfectly for the processing required, you don’t need to introduce huge and buggy libraries, and you earn more money, if paid by LOC.

However, the disadvantages named above nearly always outweigh the advantages, especially if you reuse suitable, mature, high-quality libraries that allow you to customize the processing in any detail on an as-needed basis.

One question remains: how do we represent the data? We have two possibilities:

  1. With generic data structures, i.e. maps. This is common in dynamic functional languages such as Clojure and it is extremely easy and comfortable.
    • Pros: Less work, very flexible, generic operations can be applied easily (map, filter, etc.)
  2. With objects specific for each data type, i.e. POJOs such as DressVariant, Shoes
    • Pros: Type safety, the compiler helps to ensure that your code is correct, it might be easier to understand
    • Cons: You have to write and maintain a class for each possible data element being processed
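A minimal sketch of the same dress represented both ways may make the trade-off more tangible. The field names (name, color) are invented for illustration; the post does not list the actual fields of DressVariant.

```java
import java.util.HashMap;
import java.util.Map;

// Option 2: a type-specific POJO. The fields are hypothetical.
class DressVariant {
    private final String name;
    private final String color;

    DressVariant(String name, String color) {
        this.name = name;
        this.color = color;
    }

    String getName() { return name; }
    String getColor() { return color; }
}

class RepresentationDemo {
    // Option 1: a generic map. Any key is accepted, so a misspelled key
    // is only noticed at runtime, not by the compiler.
    static Map<String, Object> dressAsMap(String name, String color) {
        Map<String, Object> dress = new HashMap<>();
        dress.put("name", name);
        dress.put("color", color);
        return dress;
    }

    public static void main(String[] args) {
        Map<String, Object> asMap = dressAsMap("Evening Gown", "red");
        DressVariant asPojo = new DressVariant("Evening Gown", "red");

        System.out.println(asMap.get("color"));   // value comes back typed as Object
        System.out.println(asMap.get("colour"));  // misspelled key still compiles
        System.out.println(asPojo.getColor());    // a misspelled getter would not compile
    }
}
```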

Sidenote: The Business Domain

You might skip this section and only come back later if you want to understand the reasoning behind the design.

WTF has to do some processing of the dress elements that it retrieves, mainly because multiple elements may represent the same dress with only slight variations such as color. WTF thus stores such a group of related elements as a list of DressVariant items inside a parent Dress object, generates a unique ID for the Dress and stores the IDs of the input elements in an attribute named “externalIds”. Therefore N input elements become M Dress elements with 1+ DressVariants each, M <= N.
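The grouping described above could be sketched roughly as follows. This is an in-memory illustration only: the post does not say which fields identify “the same dress”, so grouping by a modelName field is an assumption, and the class names are hypothetical stand-ins for DressVariant and Dress.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// One input element. Grouping by modelName is an assumption of this sketch.
class InputVariant {
    final String externalId;
    final String modelName;
    final String color;

    InputVariant(String externalId, String modelName, String color) {
        this.externalId = externalId;
        this.modelName = modelName;
        this.color = color;
    }
}

// The parent object holding all variants of one dress.
class GroupedDress {
    final String id = UUID.randomUUID().toString();      // generated unique ID
    final List<String> externalIds = new ArrayList<>();  // IDs of the input elements
    final List<InputVariant> variants = new ArrayList<>();
}

class DressGrouping {
    // Turns N input variants into M dresses (M <= N), keeping first-seen order.
    static List<GroupedDress> group(List<InputVariant> input) {
        Map<String, GroupedDress> byModel = new LinkedHashMap<>();
        for (InputVariant variant : input) {
            GroupedDress dress =
                    byModel.computeIfAbsent(variant.modelName, key -> new GroupedDress());
            dress.variants.add(variant);
            dress.externalIds.add(variant.externalId);
        }
        return new ArrayList<>(byModel.values());
    }

    public static void main(String[] args) {
        List<GroupedDress> dresses = group(Arrays.asList(
                new InputVariant("x1", "Evening Gown", "red"),
                new InputVariant("x2", "Evening Gown", "blue"),
                new InputVariant("x3", "Summer Dress", "white")));
        for (GroupedDress dress : dresses) {
            System.out.println(dress.externalIds + " -> " + dress.variants.size() + " variant(s)");
        }
    }
}
```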

WTF also has to do some other processing on its WTF XML input such as detecting which images are real and which are just fake placeholders but we won’t discuss that.

Implementing the Static-Typed Generic Processing

I’ve decided to keep a class per data type so as not to diverge too much from the current implementation. How do we now make the manual conversions generic and reusable?

Let’s first see how I would like to construct the processing pipeline:

fetchFrom("http://wtf.example.com/atom/dresses.xml")
   .parseNodesAt("/feed/dress")
   .transform(DressVariant.class, new DressDeduplicatingTransformer()) // Transformer
   .transform(new PojoToDboTransformer()); // Transformer
   .store(new MongoDAO());
// + we'll use DBObject.toMap() + PojoToJson mapper when serving the data via REST

So we fetch XML from a URL, send it to a parser to extract some nodes that are automatically converted to DressVariant objects, next we use a transformer that merges multiple DressVariants into a single, unified Dress object, and finally we convert the resulting POJO into a Mongo DBObject before storing it into the DB. What do we use for the conversions?

  1. XML -> DressVariant: Use JAXB to convert Nodes to our POJO annotated with @XmlRootElement. Notice that you can customize the conversion that JAXB performs extensively, if need be. Thus you only need to create a simple POJO and add one annotation.
  2. DressVariant -> Dress: We will check the MongoDB and pass on either an existing Dress or a new Dress object with this DressVariant added (this will result in multiple updates if the dress really has multiple formats in the input feed, but that isn’t a problem for us). This conversion is type-specific, i.e. for each data type we have to code its own transformation. That is good because, for example, Shoes don’t need any such deduplicating processing/converting.
  3. Dress -> DBObject: We will use the Jackson Mongo Mapper, an extension of the first-class JSON mapping library that adds support for MongoDB. It also performs some special data sanitization required by Mongo, such as replacing ‘.’ in map keys with ‘-’.
  4. DBObject -> MongoDB: We will have one generic method, storeDocument(String collectionName, DBObject doc), where the collection name is derived from the original object (e.g. DressVariant -> “dressVariants”). The doc’s attribute id is expected to be its unique identifier (and thus we will either update or insert based on its [missing] value).
  5. MongoDB -> DBObject: Again a generic method, list(String collectionName)
  6. DBObject -> Map: The DBObject does that itself
  7. Map -> JSON: We will use the PojoMapping feature of the Jersey REST library to automatically convert the Map produced by our methods to JSON when sending it to the clients.
  8. JSON -> clients: We will have one GenericCollectionResource with a list method mapped to the URL /list/{collectionName}. It will load the collection from Mongo as described and return a List, automatically converted to JSON by Jersey.
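Steps 4 and 5 in the list above boil down to two generic methods plus a naming rule. Here is a minimal in-memory sketch of that idea. It does not use the MongoDB driver: a plain Map stands in for DBObject, the class name InMemoryGenericDao is hypothetical, and the pluralization rule (append “s”) is a naive assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// In-memory stand-in for the generic DAO; a Map plays the role of DBObject.
class InMemoryGenericDao {
    // collection name -> (document id -> document)
    private final Map<String, Map<String, Map<String, Object>>> collections =
            new LinkedHashMap<>();

    // Derives the collection name from a class, e.g. DressVariant -> "dressVariants".
    static String collectionNameFor(Class<?> type) {
        return collectionNameFor(type.getSimpleName());
    }

    // Lower-cases the first letter and appends "s" (naive pluralization).
    static String collectionNameFor(String simpleClassName) {
        return Character.toLowerCase(simpleClassName.charAt(0))
                + simpleClassName.substring(1) + "s";
    }

    // Inserts when "id" is missing (assigning a new one), otherwise replaces
    // the stored document that has the same id.
    void storeDocument(String collectionName, Map<String, Object> doc) {
        Map<String, Map<String, Object>> collection =
                collections.computeIfAbsent(collectionName, key -> new LinkedHashMap<>());
        Object id = doc.get("id");
        if (id == null) {
            id = UUID.randomUUID().toString();
            doc.put("id", id);
        }
        collection.put(id.toString(), doc);
    }

    // Returns all documents of a collection, or an empty list if it does not exist.
    List<Map<String, Object>> list(String collectionName) {
        Map<String, Map<String, Object>> collection = collections.get(collectionName);
        return collection == null
                ? new ArrayList<>()
                : new ArrayList<>(collection.values());
    }
}
```

With this shape, supporting shoes needs no new DAO methods; only the collection name changes.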

Result: Aside from custom data-type-specific transformations, instead of 1 POJO, 4 hand-coded converters, and 2+2 methods for each data type we now need only 1 POJO per data type plus one generic converter, 4 generic methods and one or two libraries. Less coding, less code, fewer defects, more productivity, more fun.
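To illustrate what “one generic converter” means, here is a deliberately simplified reflection-based POJO-to-Map converter. It is not how Jackson or the Jackson Mongo Mapper are implemented or used; it only shows the principle that one piece of code can serve every data type. The ShoeExample class and its fields are hypothetical, and nested objects and inherited fields are ignored.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

class ReflectivePojoMapper {
    // Copies every non-static field declared directly on the POJO's class
    // into a map keyed by field name.
    static Map<String, Object> toMap(Object pojo) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Field field : pojo.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue;
            }
            field.setAccessible(true); // allow reading private fields
            try {
                result.put(field.getName(), field.get(pojo));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + field.getName(), e);
            }
        }
        return result;
    }
}

// Hypothetical second data type: no converter has to be written for it.
class ShoeExample {
    private final String brand;
    private final int size;

    ShoeExample(String brand, int size) {
        this.brand = brand;
        this.size = size;
    }
}

class ReflectivePojoMapperDemo {
    public static void main(String[] args) {
        System.out.println(ReflectivePojoMapper.toMap(new ShoeExample("Acme", 38)));
    }
}
```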

Notice that thanks to our choice of libraries, if the default conversion schemas turn out to be insufficient, we can tweak them as much as we want, though we most certainly don’t want to go that way. It’s better to sacrifice some flexibility and accept less perfectly fitting data formats than to make too many tweaks and end up struggling with the mapping libraries instead of leveraging them. A wise man chooses his battles.

Sample Code

Sample code demonstrating automatic, generic mappings XML -> Java -> Mongo -> REST with JSON is available at GitHub - generic-pojo-mappers.

Summary and Conclusion

Many applications force developers to convert data between a number of objects, which is very unproductive and error-prone. A better approach is to avoid the conversions and use the same object throughout the whole processing as much as possible, doing conversions only when really necessary. These conversions are better written in a generic and reusable way than hand-coded for each data type and it often pays off to use an existing, mature mapping library for that (though you must make sure your intended use is aligned with its philosophy and design).

Using the same object throughout the processing causes it to be less fitted for the individual processing stages but it makes them much easier and faster to write and maintain. We lose some performance due to using reflection but that is negligible with respect to the I/O (retrieving a file over HTTP, sending data to a DB) and XML parsing.

In the example of the World of Thrilling Fashion, we have cut the amount of manual coding and the number of methods considerably, and the result is smaller, cleaner, and more flexible code (w.r.t. adding new data types).

Criticism

But I really need to use objects fine-tuned for each layer of processing!

Your choice, if you really need it, do it – but be aware of how much you pay for it.

Libraries are evil!

Well, yes. Sometimes it’s better to hand-code things but not always. Make sure that you don’t use a library in a way different than intended because then you might lose more time fighting it than being productive.

You are an idiot!

Yes, many people think so. Thank you for reading.

Do you say that I’m an idiot if I wrote code like that?

Not at all, you might have good reasons to do so. Or you might not know the alternatives. Or you just don’t have such a strong dislike of writing mindless code as I do. That’s OK.

Related

The rich Persistent Domain Object + slim Gateway patterns by Adam Bien also make it possible to use the same object (a JPA entity) throughout all the application (web UI – DB) in the name of increased productivity.

M. Fowler’s EmbeddedDocument is a pattern for working with JSON flowing in and out of our services (REST <-> JSON-friendly DB) without unnecessary conversions but with good encapsulation. The naive approach is JSON -> object graph -> (processing) -> JSON. Fowler writes: “In many of these situations a better way to proceed is to keep the data in a JSONish form, but still wrap it with objects to coordinate manipulation.” The idea is to use a library to parse the JSON into a generic structure (e.g. a structure of lists and maps/dicts) and store it in a field of an object whose methods encapsulate it. For example, for an Order we could have one method returning the customer and another computing the cost, both accessing the underlying generic structure. The user of the wrapper object doesn’t need to know or care about the underlying structure.

The sweet spot for an embedded document is when you’re providing the document in the same form that you get it from the data store, but still want to do some manipulation of that data. [..] The order object needs only a constructor and a method to return its JSON representation. On the other hand as you do more work on the data – more server side logic, transforming into different representations – then it’s worth considering whether it’s easier to turn the data into an object graph.
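A rough Java sketch of such a wrapper, following the Order example mentioned above, might look like this. The document layout (the keys customer, lines, price and quantity) is hypothetical, and the generic structure is built by hand here instead of being parsed from JSON by a library.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Wraps the generic structure (maps and lists) a JSON parser would produce.
class OrderDocument {
    private final Map<String, Object> data;

    OrderDocument(Map<String, Object> data) {
        this.data = data;
    }

    String customer() {
        return (String) data.get("customer");
    }

    // Sums price * quantity over all order lines.
    @SuppressWarnings("unchecked")
    double cost() {
        List<Map<String, Object>> lines = (List<Map<String, Object>>) data.get("lines");
        double total = 0;
        for (Map<String, Object> line : lines) {
            total += ((Number) line.get("price")).doubleValue()
                    * ((Number) line.get("quantity")).intValue();
        }
        return total;
    }

    // The underlying structure is handed back unchanged for serialization.
    Map<String, Object> toDocument() {
        return data;
    }
}

class OrderDocumentDemo {
    static Map<String, Object> line(double price, int quantity) {
        Map<String, Object> line = new HashMap<>();
        line.put("price", price);
        line.put("quantity", quantity);
        return line;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> lines = new ArrayList<>();
        lines.add(line(10.0, 2));
        lines.add(line(5.5, 1));

        Map<String, Object> raw = new HashMap<>();
        raw.put("customer", "Alice");
        raw.put("lines", lines);

        OrderDocument order = new OrderDocument(raw);
        System.out.println(order.customer() + " owes " + order.cost());
    }
}
```

Callers only see customer() and cost(); the key names stay an internal detail of the wrapper.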

Republished from The Holy Java.

2 thoughts on “Bad Code: Too Many Object Conversions Between Application Layers And How to Avoid Them”

  1. Johannes Brodwall (@jhannes)

    I think excessive layering is the most common malady of applications today. I’m not sure that I agree on your recipe to solve it before you post the classes for Dress and DressVariation ;-)

    I think the most important issue is this: We don’t want to see more than one Java object that contains the list of fields defining “Dress”, even if one or more of these objects are auto-generated.

    Personally, I don’t mind a translator that maps “jsonObject.set(“name”, dress.getName())” and “dress.setName(resultSet.getString(“dress_name”))”. Especially if the latter is stuff like “dr_dress_name_field11”. I prefer the explicitness over the magic behind JAXB or Hibernate. If you don’t understand SQL and XML, get out of software development!

  2. Jakub Holý Post author

    Hi Johannes, it is a pleasure to have you commenting here! :)

    Unfortunately, I don’t have the classes. What is it in them that would make you (dis)agree?

    I believe that the fields of Dress were partly duplicated (but differently structured) in XDress but that is all. And I agree that it is unfortunate.

    “Personally, I don’t mind a translator that maps ..” – most people I know agree with you. However, I feel deeply offended whenever I have to type such mindless code. I am a programmer to make the computer do mindless tasks for me, not vice versa. When I type code I don’t have to think about, I feel something is wrong. I want to focus on solving business problems, not on copying fields between objects.

    “I prefer the explicitness over the magic behind JAXB or Hibernate.” – I perfectly understand that. Magic is bad because it isn’t transparent, and frameworks are bad because as soon as you want to do something a little unusual, you are in big trouble [1]. However, I do not see it as such a big problem if 1) the magic is very straightforward (such as mapping fields of the same name, without handling related entities etc.), and 2) I can easily, on demand, drop out of the magic and do the piece I need manually, without needing to invoke even more black magic and without having to switch completely to do-it-yourself mode. Admittedly, libraries satisfying that are rare. For example, regarding JAXB: when it works, I would use it. When it gets more complicated, I’d switch to the boring manual conversion (instead of trying to force JAXB to succumb to my will with its black magic annotations).

    Thank you!

    [1] http://nealford.com/memeagora/2013/01/22/why_everyone_eventually_hates_maven.html
