Why the “private” keyword is the modern day “goto”

Warning: this post hasn't been updated in over three years and so may contain out-of-date information.

Most developers these days will tell you that one aspect of object-orientated (OO) best practice is to make member variables private. I contend that they are wrong: not only is it bad practice, it can force others to implement nasty hacks to get around the use of the private keyword.

If you are old enough to remember programming before the 1990s, you’ll likely have used the keyword “goto” in your code at some point, and the longer you have been programming, the more you will have used it. If you came to programming after that, you’ll (hopefully) have been told from day one not to use it. You may even use a language that doesn’t support it.

The “goto” statement took a long time to die, but it has gone from being a fundamental programming concept to a programming pariah used by only the most misguided of developers (and possibly a few folk working on extreme edge-case applications). This highlights the fact that software engineering/craftsmanship is an evolving field, where what constitutes best practice can change over time. Other practices once considered best have suffered a similar fate to the goto keyword, and I’ll cover some of those later. Others, though, seem either to resist challenge or just never get properly challenged. I wish to address one of those here: the “private” keyword.

Why do folk use private? There are two aspects of OO that at first glance seem to encourage it: encapsulation and inheritance:

“Data hiding”, or encapsulation, is key to OO. If I have a list class, I allow other classes to interact with the list, but I hide the list’s implementation away. Any developer using my List class should not need to worry about how the list is stored; instead, I should provide a set of List-related methods, e.g. Split and AddToHead, for manipulating it.
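As a minimal sketch of that idea in TypeScript: the class name `List` and the methods `addToHead` and `split` follow the names mentioned above, while the array storage, `toArray` and the usage lines are assumptions made purely for illustration.

```typescript
// Hypothetical list class: callers see list operations, never the storage.
class List<T> {
  // Storage detail. It could become a linked list without callers noticing.
  protected items: T[] = [];

  // Adds a value at the front of the list.
  addToHead(value: T): void {
    this.items.unshift(value);
  }

  // Removes everything from `index` onwards and returns it as a new list.
  split(index: number): List<T> {
    const tail = new List<T>();
    tail.items = this.items.splice(index);
    return tail;
  }

  // Returns a copy, so callers cannot reach the internal array.
  toArray(): T[] {
    return this.items.slice();
  }
}

const numbers = new List<number>();
numbers.addToHead(3);
numbers.addToHead(2);
numbers.addToHead(1);
const numbersTail = numbers.split(1);
```

Code using `numbers` never touches `items` directly, so the storage can change without breaking it.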

As “Uncle Bob” puts it: “The Open-Closed Principle … says that objects should be open for extension but closed for modification. In other words, you should be able to change what a module does without changing the module”. This principle was regarded as best practice in the early days of OO.

The combined requirements of data hiding and preventing modification through inheritance inevitably gave rise to the private keyword for hiding member variables (fields) away. The thinking was that fields themselves should be private and that public methods should be provided for accessing those fields.

Many younger developers may not be aware of this, but this “open for extension but closed for modification” mantra led to some pretty daft ideas. One such was that if I had some arbitrary code:

when it was noticed that Div(n, 0) results in a divide-by-zero error, method Div would not be fixed, as that would be a modification of the API and so could impact existing code. Instead one would extend, eg:
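A hypothetical sketch of the pattern in TypeScript, assuming a class `Calculator` with a `div` method; `SafeCalculator` and the choice to throw a `RangeError` are illustrative assumptions. TypeScript number division by zero yields `Infinity` or `NaN` rather than raising an error, so the sketch treats that unchecked result as the defect.

```typescript
// Hypothetical original class, treated as "closed for modification".
class Calculator {
  // Defect: a zero divisor is never checked, so div(n, 0) gives Infinity or NaN.
  div(a: number, b: number): number {
    return a / b;
  }
}

// The "extend, do not modify" workaround: the defect stays in Calculator and
// every caller is expected to switch to the subclass instead.
class SafeCalculator extends Calculator {
  div(a: number, b: number): number {
    if (b === 0) {
      throw new RangeError("Division by zero");
    }
    return super.div(a, b);
  }
}

const unfixedResult = new Calculator().div(1, 0); // still the unchecked result
```

The defective method lives on for anyone who keeps using the base class, which is what makes the practice daft.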

In the early 90s, I actually tried to argue for the adoption of this practice in the company I worked for at the time. With hindsight I am very pleased that I was voted down by my more pragmatic colleagues. These days, with the great ideas of unit tests and refactoring gaining in popularity, the mantra of “open for extension but closed for modification” has gone from a mere dubious idea to a downright debunked one. Modifying through refactoring is now positively encouraged.

There is another twist in the tale of why “private” is so popular, and that is due to properties (getters and setters if you prefer, though to my mind these should refer only to the weird accessor methods that Java has instead of proper properties). Because one can wrap a private field in a public read/write property, there seems to have developed a consensus that one should always do so, without exception. This is a bad state of affairs as it results in ridiculous stuff like:
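A hypothetical illustration in TypeScript, assuming a two-field class named `Point2D`; the class and field names are invented for this sketch.

```typescript
// Hypothetical two-field class where every private field is wrapped in a
// public read/write property that adds no behaviour at all.
class Point2D {
  private _x: number = 0;
  private _y: number = 0;

  get x(): number {
    return this._x;
  }
  set x(value: number) {
    this._x = value;
  }

  get y(): number {
    return this._y;
  }
  set y(value: number) {
    this._y = value;
  }
}

// Callers can read and write both values freely, exactly as with public fields.
const mutablePoint = new Point2D();
mutablePoint.x = 3;
mutablePoint.y = 4;
```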

Why do I claim this is a ridiculous class? Two reasons. First, there is only pseudo-encapsulation occurring with the class, as the two properties fully expose the underlying data. They serve no purpose whatsoever other than to add complexity to the class (thus increasing the chances of bugs) whilst fulfilling some grossly misguided belief that best practice is being followed. The other is that it is a classic attempt at writing a value object (VO) by someone who doesn’t know what a VO is. A key feature of a VO is that it should be immutable: it’s a value, not a reference. Shame on you if you didn’t spot that! 😉

Before we go any further, let me make one thing clear: the well thought out use of public properties and hidden fields is good practice, as it hides the implementation. What I’m suggesting is that blindly using them in a simple, fully mutable record-style (or tuple-style if you prefer) class is bad practice. We can rewrite the above class to make it a proper VO:
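A hypothetical sketch of such a value object in TypeScript, assuming two numeric fields; the name `ImmutablePoint` and the helper `withX` are illustrative assumptions. Immutability here holds for outside callers only, since subclasses can still reach the protected fields.

```typescript
// Hypothetical value object: set once in the constructor, read-only afterwards.
class ImmutablePoint {
  protected _x: number;
  protected _y: number;

  constructor(x: number, y: number) {
    this._x = x;
    this._y = y;
  }

  // Read-only properties: there are no setters, so outside code cannot mutate.
  get x(): number {
    return this._x;
  }
  get y(): number {
    return this._y;
  }

  // "Changing" a value object means creating a new one.
  withX(x: number): ImmutablePoint {
    return new ImmutablePoint(x, this._y);
  }
}

const origin = new ImmutablePoint(0, 0);
const shifted = origin.withX(5);
```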

You may notice that I’ve not only made the class a true VO, I’ve switched the fields from private to protected. This brings us to the main topic: why using protected rather than private for fields should be regarded as good practice and the use of the private keyword should be avoided at all costs.

As previously mentioned, one reason people use private rather than protected stems from the outdated idea of “extend, do not modify”, but there is more to it than that. There is also a belief that developers somehow need protecting from side effects within the base class when extending it and from needing to know about the base class’ inner workings. Many such folk seem to view public and protected as somehow equivalent. They are not. Protected respects encapsulation; public doesn’t. Finally, people just use private because that is what other folk, and many automatic tools, do.

Regarding the second point, I personally find this attitude both patronising and symptomatic of a lazy attitude to writing classes. It’s patronising in that the base class’ developer doesn’t trust me. It’s lazy because documenting a class properly for the benefit of anyone wanting to extend it takes effort. It’s much easier to just mark everything private and hide it away. This attitude is especially annoying when used in frameworks or libraries for which the source is available. I can see those useful fields and support methods sitting there tempting me, but I can’t access them.

I’m sure I’m not the only one who has had to resort to cutting and pasting the code for a set of private members out of a base class into a sub class in order to modify some subtle aspect of the base class’ behaviour simply because the original developer used the private keyword. If they’d used protected, then the issue would never have arisen. If you have ever done more than simple work with the Flex SDK, then you’ll likely know exactly what I mean.
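To make the difference concrete, here is a small TypeScript sketch under invented names (`Greeter`, `ShoutingGreeter`, `format`); it is not taken from the Flex SDK or any real library.

```typescript
// Hypothetical library class whose helper members are protected, not private.
class Greeter {
  protected greeting = "Hello";

  // Support method that a subclass may want to adjust.
  protected format(name: string): string {
    return `${this.greeting}, ${name}!`;
  }

  greet(name: string): string {
    return this.format(name.trim());
  }
}

// Changing one subtle aspect takes a few lines, because the subclass can reuse
// format() and greeting. Had they been private, the subclass would need its
// own pasted copies of both, plus a rewritten greet().
class ShoutingGreeter extends Greeter {
  protected format(name: string): string {
    return super.format(name).toUpperCase();
  }
}

const shouted = new ShoutingGreeter().greet("  Ada ");
```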

Even when the source isn’t available, debuggers taunt the developer with details of private members beyond our reach. This, coupled with the over-zealous use of the private keyword, has even spawned an entire micro-industry in hacks to the Flash event classes to work around its perceived shortcomings.

I found an article by the previously mentioned software engineer, “Uncle Bob”, which argues the opposite of this article. It is worth addressing the points he raises in order to explain why those points are misguided in my view, which should reinforce why I’m so anti the “private” keyword.

“How do you protect a module from the forces that would try to modify it?”
As previously discussed, inheritance is all about modification (what’s the override keyword for, if not for providing alternate behaviour for a method for example?) If you want to protect a class from modification, make it immutable and final. If you don’t make it immutable and final, accept that someone developing a sub class is doing so because they want to modify your class’ behaviour! So please make it simple for them to do so.

“If a variable is not private, then it is open to be used in a way that the module that owns that variable does not intend … For example, given a variable v used by a module m, such that v should never be negative. If you make v … protected someone could set it to a negative number breaking the code in m…”
Write your class. Document how it works. Document that v should not be negative and explain why. Then step away and leave sub class developers to make up their own minds. If I choose to make v negative, then I have to deal with the consequences. Don’t treat me as a child and try to prevent me doing so. Of course, if v is to be accessible outside of m’s inheritance tree, put a public property in place to prevent outside agencies making it negative. Again: public and protected are not equivalent. Protected respects encapsulation; public doesn’t.
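A minimal TypeScript sketch of that arrangement, using the field name `v` from the quote; the class names `Meter` and `SignedMeter`, the property `value` and the method `forceValue` are assumptions for illustration.

```typescript
// Hypothetical module "m" owning a variable "v".
class Meter {
  // Documented invariant: v should never be negative, because the rest of
  // this class assumes it. Subclasses that break this own the consequences.
  protected v = 0;

  // Public property: outside code cannot make v negative.
  get value(): number {
    return this.v;
  }
  set value(next: number) {
    if (next < 0) {
      throw new RangeError("value must not be negative");
    }
    this.v = next;
  }
}

// A subclass developer who has read the documentation and decided otherwise.
class SignedMeter extends Meter {
  forceValue(next: number): void {
    this.v = next; // deliberately bypasses the public property's check
  }
}

const signedMeter = new SignedMeter();
signedMeter.forceValue(-1);
```

Outside callers still go through `value` and its check; only code inside the inheritance tree can write to `v` directly.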

“Privacy does not preclude extensibility. You can create public or protected accessor methods that: 1) provide extenders access to certain variables, and 2) ensure that the extenders don’t use the variable in an unintended way.”
Let’s rewrite that to reflect modern software development techniques and what ought to be best practice. Encapsulation does not preclude access to an object’s protected values. You can create public properties that provide external functionality access to those values whilst ensuring that the external functionality cannot modify those values in an unintended way.

To wrap up, here is a list of what I’d regard as best practice with regard to what’s been discussed here:

  1. Avoid using the “private” keyword. If possible, never use it. Pretend it doesn’t exist.
  2. Follow some sensible rules of encapsulation:
    1. Is the class a simple record/tuple-style class (a DTO) with no methods for causing side effects to the fields? If so, make the fields public.
    2. If not, does the field contain data that needs to be exposed to external functionality? If yes then make it protected and wrap it in a public property only if changing its value has side effects, it needs validating or it is a read-only value. If it doesn’t need validating, has no side effects and doesn’t need to be read-only, leave it public.
    3. If the field should not be exposed to external functionality then make it protected.
  3. Properly document your class so that developers can easily extend it and can understand the consequences of modifying the base class’ behaviour.

Remember, unless you are unlucky enough to be using Java or another archaic language with no proper property support, you can always refactor your class to replace a public field with a protected one with a public accessor property if requirements change. Refactoring is the key: do not design your class with future expectations in mind. Design it to meet only your current requirements and write unit tests to test that functionality. Then in future you can refactor to meet new requirements (which probably won’t be what you were expecting) confident in the knowledge that your unit tests will pick up any breakdown in existing functionality. If enough developers adopted these modern development best practices, we just might be able to consign the private keyword to software engineering history, which is where it belongs.
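That refactoring step can be sketched in TypeScript as follows; the names `AccountBefore`, `AccountAfter`, `HasBalance` and `deposit` are hypothetical, and the non-negative rule stands in for whatever new requirement turns up.

```typescript
// What calling code relies on: something with a readable, writable balance.
interface HasBalance {
  balance: number;
}

// Before: a plain public field meets the current requirements.
class AccountBefore {
  public balance = 0;
}

// After: a new requirement (no negative balances) arrives, so the field
// becomes protected and is wrapped in a public property of the same name.
class AccountAfter {
  protected _balance = 0;

  get balance(): number {
    return this._balance;
  }
  set balance(next: number) {
    if (next < 0) {
      throw new RangeError("balance must not be negative");
    }
    this._balance = next;
  }
}

// Existing calling code compiles and runs unchanged against either version.
function deposit(account: HasBalance, amount: number): void {
  account.balance = account.balance + amount;
}

const accountBefore = new AccountBefore();
const accountAfter = new AccountAfter();
deposit(accountBefore, 10);
deposit(accountAfter, 10);
```

Because property access uses the same syntax as field access, the change stays invisible to callers.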

I have written a new – shorter! – article that demonstrates the unit-testing benefits of using protected rather than private. See ‘Real world use-case of “use protected, not private”’.

62 thoughts on “Why the “private” keyword is the modern day “goto””

  1. Again on point 3: to make something extensible does not mean making all variables accessible to the extending class by using protected. This is “sometimes” the only way to do so in an intelligible manner, but by no means is it, or should it be, the only way to design something with extension in mind.

  2. The last few years I have also come to the conclusion that private survives little use… and is more cumbersome… fields might as well be protected and offer the possibility of more flexibility in derived classes.

  3. David,

    Excellent article! Just one comment: shouldn’t everything you said about ‘private’ also stand for ‘final’?

    I mean: if you are really really absolutely positively sure that your class should be ‘final’, why not just document it as this and let the developers ‘deal with the consequences’ if they wish to inherit from it?

  4. I think two worlds are clashing here: the library developers and the library users. The first want ‘private’ to have freedom to modify their classes without breaking user’s code; the other want freedom to modify the library without touching the library itself.

    I feel the first is the lesser reason. Modifications are extensions in most cases, and often you can put the modification into a derived class, where you are free to define new private members.

    After I was forced to use ‘#define private protected’ a couple of times, I adopted two simple policies:

    1. I put data members into ‘protected’ instead of ‘private’ if I have no good reason to do otherwise. I use ‘private’ only if I am absolutely sure the connected functionality *must* be handled in this class. The typical guidelines say that data members are best in ‘private’; I do not think so.

    2. never put functions in private. If a function in ‘protected’ is using private members, it is going to be ‘semantically private’ anyway (meaning you cannot put the same code in a derived class), but I let people who are deriving from my class to replace the function or use it if they want. It is their mess.

  5. When I saw this post title, I said to myself, ‘Hey, this man is throwing the one rock to make the avalanche!’ And the avalanche has kept rolling until now.
    A long time ago, before AS3, I believed that any inaccessible part of a class should be labelled as a private member. After diving into big projects using AS3, I kept cursing library creators, as I had to find workarounds to solve what should have been simple problems. Sometimes it’s just because I can’t give my new method the same name as an inaccessible built-in method. Sometimes I just need to make a little change to the existing ones.
    All in all, +1 for modifiable private methods: keep private things private, but let me modify them if I really need to. In other words: use protected for useful methods! (That’s pretty much the same thing, but I hope you understand what I meant to say.)

  6. Hey there David,
    I’d like to ask you a question, if you don’t mind. In one of your comments, you mentioned “modern managed languages” in opposition to C++. Now, I’ve been getting the impression that you’re not too fond of Java (which I would regard as at least more recent than C++), either, so could you give some examples what languages you’re thinking of or which ones you’d recommend? Note that I have very little experience with programming, so I don’t really know what the more subtle benefits/detriments of C++ or Java are.

  7. @Socob,

    There is no simple answer to your question. What do you want to do with the language? Which platforms do you want to target? Which languages do you enjoy using? Is speed or maintainability more important to you? Do any patterns particularly appeal to you? Do you like to develop using the command line or do you prefer to stay within an IDE?

  8. I was afraid something like that would be the answer. Generally, for my purposes, I’m not unhappy with what I’m using right now, which is Delphi – of course, if you have some kind of opinion on that, I’d be glad to hear about it.

    However, I was intrigued when you talked about said unnamed “modern” languages, because it sounded like you had something specific in mind. Additionally, at least to my knowledge, there haven’t really been any ground-breaking newcomers to the group of programming languages since Java (the only more recent thing worth mentioning that I can think of is C#, which I know nothing about), so I was surprised when you complained about Java’s age/lack of modernity.

    Another thing is that with Delphi, sometimes I feel like things are more complicated to do than, say, in PHP. Now, PHP probably isn’t the best example for a “good” language, but its arrays with string indices (array[“a”]) is something that springs to my mind. It’s not the only example, but more like a general feeling I have.

    So, really, I don’t know exactly what I’m looking for. I just have this feeling that there’s something I don’t know about and that I’m missing out on. I’m sorry that you’ve had to put up with my ramblings, but if you can make some sense of it, I’d appreciate it if you could comment on it.

  9. Don’t use “private”, don’t use protected either. Document props and methods as private.

    Don’t type props. Let everything be objects, only document variables as what their interfaces are.

    And namespaces – who needs them if You can just put info in docs…

    Sorry for sarcasm, but I just found it as the shortest way to describe my opinion about this idea. What You encourage, David, I think we call a philosophy of liberal programming, where user is not hampered in any way, he’s free and can do what he wants with the libraries. I agree with that.

    Sometimes, however, a developer doesn’t want to let users use his library beyond his own functionality. Sometimes users want the same – mainly because when there are no private properties, then everything is private – I don’t know which parts I shouldn’t change on my own, so for future compatibility I don’t change anything. Similarly, if I read in the instructions for an electric kettle that I can do whatever I want with it, then I’ll be scared to touch it with wet hands, as opposed to instructions that clearly state what’s dangerous.

  10. @Broady,

    You offer a reductio ad absurdum fallacy as your argument and thus it is unfortunately a non-argument regardless of whether you are being sarcastic or not.

    If a developer doesn’t want a user to use his library beyond his own functionality, then the correct thing to do is to mark the classes as final or sealed. If a class is not marked that way, the developer is implicitly allowing the user to extend and override the built-in functionality. In this case, avoiding private and using protected instead encourages the developer to document how the class works from the perspective of a user wishing to extend and override the built-in functionality.

    In the nine months since I wrote this article I have been examining both my own and other people’s code with regard to the use of the private keyword. What has struck me the most is that it is harder as a library developer to use protected as one must document better. Private is the lazy developer’s tool of choice…
