This post is part of the Powerful Python series where I talk about features of the Python language that make the programmer’s job easier. The Powerful Python page contains links to more articles as well as a list of future articles.
What exactly comprises object-oriented development is still an open question, but one of the generally accepted parts of the answer is encapsulation. Simply put, encapsulation means that the data structures used by your program should present a simple external interface while hiding their internal structure. This is generally considered a good thing because users of your structures only need to be concerned with what they're supposed to do, not how they do it. They don't need to bother with what goes on under the hood and, equally importantly, they can't just reach in and tinker with your component's internal structure if you don't want them to. This in turn gives the developer freedom to change the internal implementation as long as the external interface is maintained.
Java and Python are both considered to be object-oriented languages, at least in the sense that they support creating classes and instantiating objects. They also both support some generally accepted features of object orientation like inheritance, polymorphism and encapsulation. They differ in how these features are supported, though. In this article I'll be focusing on the different approaches they take towards encapsulation. Let's start with Java.
Java allows you to apply encapsulation at the level of individual class members. The programmer decides which external code can see each field or method by using one of the following access modifiers (also called access specifiers):
- no modifier: by default all members in a class are visible within the rest of the class and to other classes in the same package
- private: only the rest of the class can access a private member. It can't be used by anything outside.
- protected: these are visible within the same class, within the same package and to all subclasses
- public: these are visible to everyone
Classes themselves can be either public (visible to everyone) or have no modifier, meaning that they are only visible in the same package. This is a straightforward way to implement encapsulation and it works quite well too. However, there are some drawbacks. Java's brand of encapsulation (which is shared to some extent by C++) recommends that all fields be set to private. Any field that needs to be read or modified by external software needs a getter or a setter. Here's an example with a simple class that stores two integers and has a method that returns their sum:
    class Simple {
        // All fields are private; outside code goes through the methods below
        private int x, y, sum;

        public void setX(int tmpx) {
            x = tmpx;
        }

        public void setY(int tmpy) {
            y = tmpy;
        }

        public int getSum() {
            sum = x + y;
            return sum;
        }
    }
Java’s recommended strategy makes perfect sense. If you decide that you want to limit the integers to positive only, you just change the set methods and you’re done. Exposing the fields directly as public fields would have meant that any client code would have to then change when you rewrote your class. However, the pre-emptive encapsulation strategy that Java suggests does lead to verbose code and you can quickly become overwhelmed with getters and setters. And if you don’t do input validation on most of them, you’re left with a lot of code that’s sitting around doing practically nothing.
Python's approach to encapsulation is slightly different. There are only public or private attributes. Inside a class, anything that starts with (but doesn't end with) two underscores is treated as private to that class. Everything else is public. (At module level the convention is a single leading underscore instead; the mechanism described here only applies to names used inside a class definition.) However, Python doesn't enforce encapsulation strongly like Java does. Rather, it does something called name mangling. Essentially, the names of all private attributes are hidden by internally adding an underscore and the class name in front of them. For example, attribute __attrib in class cls would internally become _cls__attrib. When someone tries to access __attrib from outside the class, an AttributeError is raised saying that __attrib doesn't exist. The mangled name is still reachable by anyone who spells it out, so programmers are expected to be polite and not barge into other people's classes if they're not supposed to.
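A minimal sketch of name mangling in action may help here. The class name Account and its __balance attribute are made up for this illustration; the behaviour shown is that of a standard Python 3 interpreter.

```python
class Account:  # hypothetical class, for illustration only
    def __init__(self):
        # Stored on the instance under the mangled name _Account__balance
        self.__balance = 100

    def balance(self):
        # Inside the class body, __balance is rewritten to _Account__balance
        return self.__balance


acct = Account()

try:
    acct.__balance  # no mangling happens outside the class body
    direct_access_failed = False
except AttributeError:
    direct_access_failed = True

# The mangled name is still reachable: this is a convention, not a lock
mangled_value = acct._Account__balance
```

Here direct_access_failed ends up True, while mangled_value holds the stored balance. That is the "politeness" part: nothing stops a determined caller, but they have to go out of their way to do it.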
With this knowledge, it would be entirely possible to write the above Java code in Python in almost exactly the same way, but that would miss one of Python’s great encapsulation features: properties. Properties allow you to have class attributes that are accessed like simple attributes (fields or instance variables in other object-oriented languages) but are actually implemented by the class’s methods. Let’s rewrite the above code in Python using properties:
    class Simple(object):
        def __init__(self):
            self.__x = 0
            self.__y = 0
            self.__sum = 0

        def setX(self, tmpx):
            self.__x = tmpx

        def setY(self, tmpy):
            self.__y = tmpy

        def getSum(self):
            self.__sum = self.__x + self.__y
            return self.__sum

        # x and y have a setter but no getter, so they are write-only
        x = property(None, setX)
        y = property(None, setY)
        # sum has a getter but no setter, so it is read-only
        sum = property(getSum)
It may be a bit hard to see why Python properties are so nice with this example, but a few things are still pretty clear. Firstly, the user of your class just needs to know the name of a variable to set or get. They don't need to deal with your naming conventions for setters or getters. This in turn lets you steer development of your class with finer accuracy.
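To show what this looks like from the client's side, here is a small sketch using the @property decorator, which is another way of spelling a property() call. The class name SimpleSum and the attribute name total are assumptions made for this example, and x and y are left as plain public attributes to keep it short.

```python
class SimpleSum:  # hypothetical name, not from the article's code
    def __init__(self):
        self.x = 0  # plain stored attributes
        self.y = 0

    @property
    def total(self):
        # Computed on every read; no setter is defined, so it is read-only
        return self.x + self.y


s = SimpleSum()
s.x = 3
s.y = 4
result = s.total  # reads like a stored attribute, but runs the method above

try:
    s.total = 10  # a property without a setter rejects assignment
    total_is_read_only = False
except AttributeError:
    total_is_read_only = True
```

The caller uses the same dotted-name syntax for x, y and total and cannot tell from the call site which ones are stored and which are computed.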
Unlike Java, you don’t need to lock down your fields by declaring them private and then writing getters and setters which really do nothing. You can start by having everything be wide open and public and save a lot of boilerplate. Then when the need arises to make something closed, you can implement that as a method and leave everything else unchanged. Through it all users are blissfully unaware of what’s going on. The class becomes a smart store-house of data which they can get just by reaching for it. To see a larger example of how setters are useful, see this really informative post.
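The "start open, close it later" workflow can be sketched as follows. PointV1 and PointV2 are hypothetical names standing for two successive versions of the same class, and the non-negative rule is just an example of a constraint you might add later.

```python
class PointV1:  # hypothetical first version: a plain public attribute
    def __init__(self):
        self.x = 0


class PointV2:  # hypothetical later version: same public name, now validated
    def __init__(self):
        self._x = 0

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        # Assumed rule for this sketch: negative values are rejected
        if value < 0:
            raise ValueError("x must not be negative")
        self._x = value


def client(point):
    # Client code written against V1; it works unchanged against V2
    point.x = 5
    return point.x


v1_result = client(PointV1())
v2_result = client(PointV2())

try:
    PointV2().x = -1
    rejected = False
except ValueError:
    rejected = True
```

The client function never changes between the two versions, which is the point: validation was added behind the attribute name rather than through a new setter method that every caller would have to adopt.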
Over the past years I’ve learned to be careful when comparing languages, but in this case, I can say definitely that Python wins. To recap, here’s why:
- No boilerplate necessary. Compact code is mostly a good thing. Programs can easily become huge and complex; adding tons of getters and setters just in case you might someday need them doesn't help.
- Better encapsulation (IMO). The way I see it, encapsulation is all about communicating with the user on a need-to-know basis. Why should the user need to know implementation details like whether you have a getter and setter, or whether your class attribute is public or private? Give them a single name by which they can access and alter a specific piece of data. Your API documentation should say whether something is available to read/write or not.
- Properties let you put off the private/public decision. The fewer decisions you make, the less chance of having to change them later.
One thing to remember is that in Python 2, classes that use properties have to inherit from object. In Python 3 every class does so implicitly, so the explicit base class is optional there.
I hope this article has helped you understand one of Python's really useful but not-as-often used features. It can be hard to get used to a snappy, fluid language like Python coming from bulky Java or C++, but using language features like properties is a great way to start.
