I have been thinking about what an object actually is since reading Allen Holub’s apparently quite well-known article. I only found out about this article a week or two ago, even though it was written in 1999.
One of the common definitions of an object is that it is a construct which contains both data and methods, or something similar. I don’t think that this is a very useful definition. In fact, I don’t think it is a definition at all. Rather, it is a description: it says what objects contain, not what they are.
Holub, in his article, suggests defining an object in this way: “First and foremost, an object is a collection of capabilities.” This was the first of his statements that I disagreed with.
Steve McConnell, in his book Code Complete 2, defines an object as an abstract data type. Here is a definition that I can agree with. As an aside, much (or is that most, or all?) of what McConnell writes makes good sense. I would recommend his books, and I will discuss some of them in a later article.
But for now, I will mention a few other issues raised by Allen Holub which grated on me, and then discuss what an object is and does. Holub writes:
I’ll explain the whys and wherefores in a moment, but here are some rules of thumb that you can apply to see if you’re really looking at an object-oriented system:
- All data is private. Period. (This rule applies to all implementation details, not just the data.)
- get and set functions are evil. (They’re just elaborate ways to make the data public.)
- Never ask an object for the information you need to do something; rather, ask the object that has the information to do the work for you.
- It must be possible to make any change to the way an object is implemented, no matter how significant that change may be, by modifying the single class that defines that object.
- All objects must provide their own UI.
If the system doesn’t follow these rules, it isn’t object-oriented. It’s that simple.
OK, so how true are these statements?
The first is probably all right. Data should be private. There are a lot of reasons for this, one of which I wrote about some time ago.
The second, in my opinion, is plain rubbish. Get and set functions provide perfect encapsulation, because they completely hide the implementation of the getting and setting of data.
Suppose you have a class called customer and it has a read-only property called surname. This property is accessed through a get function, and it is most likely a string type, unless you live on a planet with very strange surnames. How is the surname held in the database? And what does the class do when you create an instance of customer and access the surname property? Well, you can probably guess that it reads a string from the database. But that is nothing more than a guess. The implementation is hidden from you.
OK, deep down you know that surname is held in the database. But what about a property called name? The getter may concatenate the customer’s first name and surname and return the result. Or the database may have an attribute called name, as well as first name and surname, and hence violate normalisation principles. The point is that the implementation is hidden, and so you just don’t know.
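To make the point concrete, here is a minimal sketch in Python. The class, its field names, and the choice to hold the values in memory rather than in a database are all my own assumptions for illustration; the calling code would look the same whichever storage was used.

```python
class Customer:
    """Hypothetical customer class; the in-memory storage is an assumption."""

    def __init__(self, first_name: str, surname: str) -> None:
        # Private by convention: callers are expected to go through the properties.
        self._first_name = first_name
        self._surname = surname

    @property
    def surname(self) -> str:
        # Returns a stored string here; it could just as well read a database row.
        return self._surname

    @property
    def name(self) -> str:
        # Computed on demand. The caller cannot tell whether it is stored or derived.
        return f"{self._first_name} {self._surname}"


customer = Customer("Ada", "Lovelace")
print(customer.surname)
print(customer.name)
```

If name later became a stored column instead of a concatenation, only the body of the getter would change, and the two lines that use the customer would not.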
The third point is well taken. It is certainly preferable to tell the object to do something rather than ask it for data. But “never” is a very strong word. In the real world of delivering software it is dangerous to make this sort of rule. Your client doesn’t care that you have achieved a certain level of object-oriented purity. That said, I certainly don’t advocate writing unmaintainable code.
On the fourth point, I agree that changes to the implementation should not affect the interface of the class.
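As a rough illustration of the difference between telling and asking, consider the sketch below. The Account class and its withdrawal rule are hypothetical and are not taken from Holub’s article. The rule lives inside the object (“tell”), but a read-only getter is still used where all we want is a value to display (“ask”).

```python
class Account:
    """Hypothetical account used only to contrast 'tell' with 'ask'."""

    def __init__(self, balance: int) -> None:
        self._balance = balance

    @property
    def balance(self) -> int:
        # "Ask": a read-only query, handy for reports and screens.
        return self._balance

    def withdraw(self, amount: int) -> bool:
        # "Tell": the object that owns the data applies the rule itself.
        if amount <= 0 or amount > self._balance:
            return False
        self._balance -= amount
        return True


account = Account(100)
ok = account.withdraw(30)        # succeeds, balance is reduced
refused = account.withdraw(500)  # refused, balance is untouched
report_line = f"Balance: {account.balance}"
print(ok, refused, report_line)
```

The withdrawal check is written once, inside the class, which is the part of the rule I agree with. The getter remains because something, somewhere, still has to show the balance.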
The last rule I find unbelievable. Holub suggests that if you want to display a customer’s name on a screen then you need a Name class which knows how and where to display itself. This means that every field in the database table for customer would have its own class. He gives some reasons for doing so:
Consider a system designed to get names from users. You might be tempted to use a TextField from which you extract a String, but that just won’t work in a robust application. What if the system needs to run in China? (Unicode comes nowhere near representing all the idiographs that comprise written Chinese.) What if a user wants to enter a name using a pen (or speech recognition) rather than a keyboard? What if the database you’re using to store the names can’t store Unicode?
If the system needs to run in China then this is an issue which would have been resolved long before any designing of individual classes arose. Similarly, the database to be used is an architectural issue that should already have been resolved. And in spite of all the hype about decoupling of classes and subsystems, the fact remains that databases are different, and a change of database will almost invariably result in some changes to, at least, the data access layer of the application.
But by considering objects as abstract data types we avoid these problems. A customer is probably a person. It may be a corporation, but we will assume that it is a person. The point of abstraction is that we have created an abstract representation of that person which contains the attributes relevant to our application. One of those attributes may be the person’s name, and we get that name through a getter method.
The final thing I want to say about Holub’s article concerns his assertion that an object “is a collection of capabilities.” Capabilities are defined by an interface. An object, or rather its defining class, will implement the interfaces which define the capabilities. The instantiated object will then contain those capabilities.
Object-oriented programming is really a matter of writing procedural code in classes, then adopting a declarative style to tie the objects defined by those classes together. Doing this requires abstraction, to determine the nature of the objects, and good interfaces, so that you can implement the capabilities correctly.
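A small sketch may help show what I mean by a capability being defined by an interface rather than being the object itself. Python has no interface keyword, so I am using an abstract base class as a stand-in. The names Describable, describe and describe_all are invented for this example.

```python
from abc import ABC, abstractmethod
from typing import List


class Describable(ABC):
    """Hypothetical interface: the capability of producing a one-line description."""

    @abstractmethod
    def describe(self) -> str:
        ...


class Customer(Describable):
    """The abstract data type; it implements the capability, it is not defined by it."""

    def __init__(self, first_name: str, surname: str) -> None:
        self._first_name = first_name
        self._surname = surname

    @property
    def name(self) -> str:
        return f"{self._first_name} {self._surname}"

    def describe(self) -> str:
        return f"Customer: {self.name}"


def describe_all(items: List[Describable]) -> List[str]:
    # Depends only on the interface, not on any concrete class.
    return [item.describe() for item in items]


lines = describe_all([Customer("Ada", "Lovelace")])
print(lines)
```

The interface says what the object can do. What the object is, a customer with a name, comes from the abstraction.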