It is not always clear which code is better or worse as it might depend on the needs and the team in question. Let’s have a look at two different implementations of a database-backed Set, one that is straightforward and easy to understand and another one that has more structure and less duplication at the expense of understandability. Which one would you choose?
Have you ever worked with an application where you had to copy data from one object to another and another and so on before you actually could do something with it? Have you ever written code to convert data from XML to a DTO to a Business Object to a JDBC Statement? Again and again for each of the different data types being processed? Then you have encountered an all too common antipattern of many “enterprise” (read “overdesigned”) applications, which we could call The Endless Mapping Death March. Let’s look at an application suffering from this antipattern and how to rewrite it in a much nicer, leaner and easier to maintain form.
In our batch jobs for data import we had many similar classes for holding the data being imported. Technically they are all different, with different fields, yet conceptually they are all the same. I find this conceptual duplication uncomfortable and have written a single, more generic class to replace them all.
The refactoring was inspired by Clojure and its preference for a few generic data structures, such as maps, with many functions operating on them over the OO way of many case-specific data structures (i.e. classes), as explained for example in this interview with Rich Hickey, starting with “OO can seriously thwart reuse”.
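To make the idea concrete, here is a minimal sketch in Python rather than the language of the original batch jobs. It is not the actual class from the post: `ImportedRecord`, `describe`, and the sample field names are hypothetical and chosen only for illustration. It shows one generic, map-backed holder plus a function that works for every kind of record.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ImportedRecord:
    """One generic holder for any imported row (hypothetical name)."""

    kind: str  # illustrative label such as "customer" or "order"
    fields: Dict[str, Any] = field(default_factory=dict)

    def get(self, name: str, default: Any = None) -> Any:
        """Look up a field by name, returning `default` if it is absent."""
        return self.fields.get(name, default)


def describe(record: ImportedRecord) -> str:
    """A generic function that works for every kind of record."""
    parts = ", ".join(f"{k}={v}" for k, v in sorted(record.fields.items()))
    return f"{record.kind}({parts})"


# Two conceptually identical holders that would otherwise need two classes.
customer = ImportedRecord("customer", {"id": 1, "name": "Ada"})
order = ImportedRecord("order", {"id": 7, "total": 99.5})
```

The trade-off is that field names are no longer checked by the class definition, so typos surface at runtime; what you gain is that functions such as `describe` are written once instead of once per class.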
We have here a Python API for fetching data from Mongo and returning either the raw JSON or a formatted, “parsed” version of it. There are certainly a number of things that could be improved (it was written by someone who is neither a programmer nor a Python expert, so it is actually a real achievement for him), but what I want to focus on is the API exposed to the clients.
It troubles me because the API exposes too many details about its inner workings and forces the clients to know them.
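The API in question is not reproduced here, so the following is only a generic sketch of the alternative, assuming the goal is a single entry point that hides how the data is stored and parsed. `ReportApi`, `fetch`, `fake_find`, and the sample documents are all hypothetical. The database is replaced by an in-memory stand-in so the example stays self-contained.

```python
from typing import Any, Callable, Dict, Iterable, List

Document = Dict[str, Any]


class ReportApi:
    """Hypothetical facade: clients see only fetch(), not the storage details."""

    def __init__(self, find: Callable[[Dict[str, Any]], Iterable[Document]]):
        # `find` stands in for whatever actually queries the database.
        self._find = find

    def fetch(self, query: Dict[str, Any], parsed: bool = False) -> List[Document]:
        """Return matching documents, raw by default or parsed on request."""
        docs = list(self._find(query))
        return [self._parse(d) for d in docs] if parsed else docs

    @staticmethod
    def _parse(doc: Document) -> Document:
        # Illustrative "parsing": drop the internal id and lower-case the keys.
        return {k.lower(): v for k, v in doc.items() if k != "_id"}


def fake_find(query: Dict[str, Any]) -> List[Document]:
    """In-memory stand-in for a database query (assumption for this sketch)."""
    data = [
        {"_id": 1, "Name": "a", "Year": 2012},
        {"_id": 2, "Name": "b", "Year": 2013},
    ]
    return [d for d in data if all(d.get(k) == v for k, v in query.items())]


api = ReportApi(fake_find)
raw = api.fetch({"Year": 2012})
parsed = api.fetch({"Year": 2013}, parsed=True)
```

With this shape, a client needs to know only the query and whether it wants parsed output; the collection layout and the parsing steps can change without touching client code.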
Welcome to my new blog, dear reader! Its purpose is to explore beautiful and, well, less beautiful pieces of code to better understand what makes code good. My hope is that it will help me and other developers, especially junior ones, understand and appreciate the qualities of code, thus making us better programmers.
The blog is open to all – contributions are welcome!
I became interested in code quality when a reputable fellow student of mine at the Czech Technical University, Martin Vejmelka, told me that the piece of Assembly code I had sent him for review was an unstructured piece of crap that was really hard to read and understand. Later I worked on a couple of maintenance projects, evolving code created long ago by inexperienced developers, and I experienced first-hand the pains of bad (a.k.a. legacy) code. It was not only the subjective suffering that made me really appreciate good code, but also the objective loss of productivity: wasting days trying to understand and modify ill-structured, duplicated, unreadable code while regularly and inadvertently introducing bugs.
Some of the main sources of my inspiration regarding code quality are Uncle Bob’s Clean Code, Kent Beck’s Implementation Patterns (my review), Michael Feathers’s Working Effectively with Legacy Code (a nice study of code “badness” and how to deal with it), and Rich Hickey’s talk Simple Made Easy (where he argues that “easy to understand” and “simple,” i.e. less complex, are two rather different qualities, the former sometimes weakening the latter).
So what is good code? It is code that is easy to read, free of accidental complexity, and easy to evolve. That implies that it is testable, well-structured, communicates the programmer’s intention well, and follows the principles of good design such as the Single Responsibility Principle, Don’t Repeat Yourself, encapsulation, the four simple design rules from Clean Code, etc.
As Jeff Atwood has written, when you ask a competent programmer “What’s the worst code you’ve seen recently?”, the answer is “My own.” It is humbling to realize that the code we write is never perfect and that code that initially seemed good looks much worse in hindsight. Therefore we should not criticize people for writing bad code but rather use the code to learn how to write better code.
By the way, if you want to see some bad code of mine, you will certainly find something to criticize in my GitHub projects. Enjoy 🙂
Let’s close with a quote from Jeff Atwood’s Nobody Hates Software More Than Software Developers:
In short, I hate software – most of all and especially my own – because I know how hard it is to get it right. It may sound strange, but it’s a natural and healthy attitude for a software developer.