Method Promiscuity Or The Case For Encapsulation

We have here a Python API that fetches data from Mongo and returns it either as raw JSON or in a formatted, “parsed” form. There are certainly a number of things that could be improved (it was written by someone who is neither a programmer nor a Python expert, so it is actually a real achievement for him), but what I want to focus on is the API exposed to the clients.

It troubles me because the API exposes too many details about its inner workings and forces the clients to know them.

Here is the code:

Disadvantages

It will be hard to change the internal organization of Cleanjson once it is widely used. The clients need to know and do too much, while the only thing they really care about is getting data, either raw or nicely formatted. There is also an invisible temporal coupling: the three methods/fields must be used in this particular order.
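To make the temporal coupling concrete, here is a minimal sketch of that kind of API. It is not the original listing: the class name `LeakyCleanjson`, the `clean_json` method, the `raw` and `clean` fields, and the `fake_fetch` stand-in for the Mongo query are all assumptions made for illustration.

```python
# Hypothetical sketch: names and structure are assumptions, not the original code.

def fake_fetch(collection, query):
    """Stand-in for a real Mongo query so the example runs without a server."""
    return [{"_id": 1, "name": "Ann"}, {"_id": 2, "name": "Bob"}]


class LeakyCleanjson:
    def __init__(self, fetch):
        self._fetch = fetch
        self.raw = None    # set by get_mongojson()
        self.clean = None  # set by clean_json()

    def get_mongojson(self, collection, query):
        # Step 1: run the query and store the raw documents.
        self.raw = self._fetch(collection, query)

    def clean_json(self):
        # Step 2: silently depends on step 1 having been called.
        self.clean = [
            {key: value for key, value in doc.items() if key != "_id"}
            for doc in self.raw
        ]


# Correct order: query, then clean, then read the field.
api = LeakyCleanjson(fake_fetch)
api.get_mongojson("users", {})
api.clean_json()
formatted = api.clean

# Wrong order: nothing in the API prevents it, and it only fails at run time.
broken = LeakyCleanjson(fake_fetch)
try:
    broken.clean_json()
    wrong_order_error = None
except TypeError as error:
    wrong_order_error = error
```

The client has to remember three steps and their order, and the only feedback for getting it wrong is an unrelated-looking `TypeError`.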

This is what I would have preferred (though, generally speaking, still far from being perfect):

Improvements

The code now corresponds to what the client actually wants (getting data, with some variations such as formatting). It places much less cognitive load on the client, since there is only one method to call, which is hard to get wrong. The API is simpler to understand, simpler to use, and less bug-prone. And we are free to evolve and rearrange the internals of Cleanjson however we see fit, as long as we keep the signature of the single method.
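A single-method API of this shape could look roughly like the sketch below. Only the names `Cleanjson` and `get_mongojson` come from this post; the parameter names `collection`, `query` and `clean`, the injected `fetch` callable, and the cleaning rule (dropping `_id`) are assumptions for illustration.

```python
# Hypothetical sketch: parameter names and the cleaning rule are assumptions.

def fake_fetch(collection, query):
    """Stand-in for a real Mongo query so the example runs without a server."""
    return [{"_id": 1, "name": "Ann"}, {"_id": 2, "name": "Bob"}]


class Cleanjson:
    def __init__(self, fetch):
        self._fetch = fetch

    def get_mongojson(self, collection, query, clean=True):
        """Return matching documents, cleaned by default or raw on request."""
        raw = self._fetch(collection, query)
        return self._clean(raw) if clean else raw

    @staticmethod
    def _clean(docs):
        # Internal detail: free to change without touching the clients.
        return [
            {key: value for key, value in doc.items() if key != "_id"}
            for doc in docs
        ]


api = Cleanjson(fake_fetch)
cleaned = api.get_mongojson("users", {})               # default: formatted data
raw = api.get_mongojson("users", {}, clean=False)      # named argument: raw data
```

There is no intermediate state for the client to manage, so there is no call order to get wrong.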

Objections

Somebody might dislike the increased complexity of get_mongojson, which now has three parameters (Clean Code recommends as few parameters as possible, three being the maximum). I think it is a small price for the overall improvement, especially given that there is a reasonable default value and that we can use named arguments.


5 thoughts on “Method Promiscuity Or The Case For Encapsulation”

  1. James Tikalsky

    The first version is object-oriented. The class Cleanjson takes the data returned by the query and hides it from the client programmer, allowing them to only access the “clean” version of the data. It caches the clean version which means in theory the query and cleaning can be performed only once.

    The second version adds a boolean flag that allows the client programmer to ask for a non-clean version.

    The main design issue is that Cleanjson embeds mongo querying. The modified code attempts to undo this by adding a boolean parameter that just returns the result of the query.

    Try this:

    1. Create a function (not class) that wraps the basic mongo query semantics.

    This will give the client programmer who wants “raw” results something to work with.

    2. Create a second function with the same arguments that wraps the raw results in an instance of Cleanjson. (This function can obviously reuse the first function to do the query.) Modify Cleanjson to wrap its data in the constructor, so that there’s no awkward uninitialized state. (That is, the state of the instance between initialization and the first call of get_mongojson.)

    This will give the second client programmer who wants “clean” results something to work with.
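For readers who want to see the two-function suggestion from this comment in code, here is one possible reading of it. The function names `find_raw` and `find_clean`, the `docs` property, the `fake_fetch` stand-in and the cleaning rule are hypothetical; the comment itself does not name them.

```python
# Hypothetical sketch of the commenter's proposal; all names are assumptions.

def fake_fetch(collection, query):
    """Stand-in for a real Mongo query so the example runs without a server."""
    return [{"_id": 1, "name": "Ann"}, {"_id": 2, "name": "Bob"}]


def find_raw(fetch, collection, query):
    """Step 1: a plain function wrapping the basic query semantics."""
    return fetch(collection, query)


class Cleanjson:
    def __init__(self, docs):
        # Data is wrapped in the constructor, so an instance is never
        # in a half-initialized state.
        self._docs = [
            {key: value for key, value in doc.items() if key != "_id"}
            for doc in docs
        ]

    @property
    def docs(self):
        return self._docs


def find_clean(fetch, collection, query):
    """Step 2: same arguments, reuses find_raw and wraps the result."""
    return Cleanjson(find_raw(fetch, collection, query))


raw_docs = find_raw(fake_fetch, "users", {})
clean_docs = find_clean(fake_fetch, "users", {}).docs
```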

    1. Jakub Holý Post author

      I do not feel we are in the same boat here.
      The purpose of Cleanjson (the name is admittedly bad) is to get data from wherever it comes, without its user needing to know about the source, and to return it either as is or specially formatted. This requirement is satisfied by both solutions. The former solution exposes more details about how it works, while the latter hides them.
      So “The main design issue is that Cleanjson embeds mongo querying” is not correct; this is the main functionality of the class.
      Thank you for reading.

  2. James Tikalsky

    As you say, the responsibility of Cleanjson is to get data from wherever it comes AND to format it. That’s two responsibilities for one class. The third responsibility is to present a one-method interface to the client programmer. This strongly suggests that there should be at least two classes, if not three.

    I believe you own a copy of Clean Code by Robert C. Martin. I don’t, but looking at the table of contents, I suspect that he deals with this idea in the section titled “The Single Responsibility Principle” in Chapter 10.

    1. Jakub Holý Post author

      That is the nice thing about the 2nd implementation: if the mixing of concerns becomes a problem, I can split it into 2 or 3 classes *without impact on the client*. Encapsulation wins!

      You are right that it isn’t 100% pure, but let’s be pragmatic: do what works best and refactor when needed. YMMV.
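Such a later split might look like the sketch below, assuming a single-method API whose signature is `get_mongojson(collection, query, clean=True)`. The helper classes `MongoSource` and `JsonCleaner`, the parameter names and the `fake_fetch` stand-in are hypothetical; the point is only that the client-facing call stays the same while the internals are divided.

```python
# Hypothetical sketch: helper classes and parameter names are assumptions.

def fake_fetch(collection, query):
    """Stand-in for a real Mongo query so the example runs without a server."""
    return [{"_id": 1, "name": "Ann"}, {"_id": 2, "name": "Bob"}]


class MongoSource:
    """Responsibility 1: getting the data."""

    def __init__(self, fetch):
        self._fetch = fetch

    def find(self, collection, query):
        return self._fetch(collection, query)


class JsonCleaner:
    """Responsibility 2: formatting the data."""

    def clean(self, docs):
        return [
            {key: value for key, value in doc.items() if key != "_id"}
            for doc in docs
        ]


class Cleanjson:
    """Responsibility 3: the unchanged one-method interface for clients."""

    def __init__(self, fetch):
        self._source = MongoSource(fetch)
        self._cleaner = JsonCleaner()

    def get_mongojson(self, collection, query, clean=True):
        raw = self._source.find(collection, query)
        return self._cleaner.clean(raw) if clean else raw


# Client code is identical before and after the split.
api = Cleanjson(fake_fetch)
cleaned = api.get_mongojson("users", {})
raw = api.get_mongojson("users", {}, clean=False)
```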

    2. Jakub Holý Post author

      Hi James, thank you for the discussion! We might not agree completely, but it is valuable to discuss. Funnily enough, the same topic has surfaced in the latest post [1], where I have been on the other side of the barricade, advocating more abstracted and structured code at the expense of readability (because the logic is distributed instead of being all in one place). It turns out that the right balance between the two depends on many factors and will vary from person to person.

