Tuesday, October 24, 2006


Object-oriented programming

Friends who know something of computers, beyond surfing the web and reading email, but who don't do computer programming have occasionally asked me to explain something about object-oriented programming, and what makes it different from any other kind. Let's take a look at that now. Those of my readers who are computer programmers, please bear with me... and maybe keep me honest in the comments section if I make any mistakes or say things that could do with clarification (though realize that this will of necessity be just a cursory introduction).

A computer program, of any kind, is a list of instructions that tell the computer how to perform a task. The "task" may be as simple as computing a mortgage amortization or as complex as controlling the play of an MMORPG.

At the basic level, it's all ones and zeroes, binary codes that give the computer very basic instructions. Load the number from this location in memory. Add these two numbers. If this number is less than that one, jump to a different point in the program. Programming in ones and zeroes is essentially infeasible, but we have assembly language that gives us a more human-readable way to deal with low-level bit twiddling.

Realistically, though, nearly all programming is done in high-level languages, programming languages that look more like what we want to tell the computer to do. One line of a high-level language will compile into quite a series of binary instruction codes, involving many low-level commands, with jumps and loops, reading and writing the hard drive, and communicating over the Internet.

Since we usually think of tasks in a procedural way, as a list of smaller tasks and eventually as a list of basic steps, that's also often how we program. Programming languages such as COBOL (COmmon Business-Oriented Language), FORTRAN (FORmula TRANslation), BASIC (Beginner's All-purpose Symbolic Instruction Code), Pascal (named for the mathematician), and C (which came after B, which...) began as procedural languages.

If we designed a warehouse inventory management system procedurally, we might think of the things we'd need it to do: add things to the inventory, remove and edit them, place orders to suppliers, process orders from stores, ship the goods to fulfill the orders, and so on. Then we'd break each of those down further. To add an item to the inventory we'd have to look the item number up in our inventory database first to see if we already have some in stock. If we do, we take the quantity we're adding and add it to the quantity that's already there and store the record back in the database. And so on. As the procedure brought us to the data that we had to manipulate, we'd design the data structures around the procedure.
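To make the procedural view concrete, here is a minimal sketch in Java of the "add an item" step described above. The names (ProceduralInventory, addToInventory) and the in-memory map standing in for the inventory database are my own assumptions for illustration, not part of any real system.

```java
import java.util.HashMap;
import java.util.Map;

// Procedural style: the data is a plain map, and a free-standing routine
// walks through the steps in order.
class ProceduralInventory {
    // Hypothetical stand-in for the inventory database: item number -> quantity.
    static final Map<String, Integer> database = new HashMap<>();

    // Look the item up, add to the quantity on hand, store the record back.
    static void addToInventory(String itemNumber, int quantityToAdd) {
        Integer inStock = database.get(itemNumber);    // look up the item number
        if (inStock == null) {
            database.put(itemNumber, quantityToAdd);   // not stocked yet: new record
        } else {
            database.put(itemNumber, inStock + quantityToAdd);  // update the record
        }
    }

    public static void main(String[] args) {
        addToInventory("A-100", 5);
        addToInventory("A-100", 3);
        System.out.println("A-100 in stock: " + database.get("A-100"));
    }
}
```

Notice that the shape of the data (a bare map of quantities) fell out of what the procedure happened to need, which is the point of the paragraph above.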

But some programmers realized that sometimes it's better to think first about the data, and to design the procedures from there. Object-oriented programming was born. While O-O features were added to all the procedural languages listed above — morphing C into C++ (which is C notation for "C+1") — other languages, such as Smalltalk and Java, were designed specifically as object-oriented programming languages.

The basic concepts are that a piece of data is encapsulated in an object, and an object is an instance of a class. Related classes form a hierarchy, where subclasses have characteristics that they inherit from their parent classes (or super-classes). Methods are things that objects can do, and the idea is that from the outside, a program that has an object sees it as an abstraction, and manipulates it only with the methods that are defined and documented for that purpose.

For example, we might have a class called MotorVehicle, with subclasses LandVehicle, SeaVehicle, and AirVehicle. LandVehicle might have subclasses Train and RoadVehicle, and the latter could have the subclasses Truck, Car, and Motorcycle. There's our class hierarchy.

For methods, I suppose that any MotorVehicle can turnOn and turnOff, and can goForward, goBackward, and, of course, stop. AirVehicle can also goUp and goDown. And so on. We can define a number of things that vehicles at each level in the hierarchy can do, and vehicles in subclasses inherit the methods from the more general classes.
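For readers who want to see what this looks like in a real language, here is a rough sketch in Java of one slice of the hierarchy. The fields and the accessor methods (isRunning, getAltitude) are assumptions I've added so the example has something to show; the method bodies are placeholders, not a serious vehicle model.

```java
// The most general class: every motor vehicle can be turned on and off.
class MotorVehicle {
    private boolean running = false;   // encapsulated: only reachable via methods

    void turnOn()  { running = true; }
    void turnOff() { running = false; }
    boolean isRunning() { return running; }   // hypothetical accessor
}

// Air vehicles add the ability to climb and descend.
class AirVehicle extends MotorVehicle {
    private int altitude = 0;

    void goUp()   { altitude++; }
    void goDown() { if (altitude > 0) altitude--; }
    int getAltitude() { return altitude; }    // hypothetical accessor
}

class LandVehicle extends MotorVehicle { }
class RoadVehicle extends LandVehicle { }
class Car extends RoadVehicle { }

class HierarchyDemo {
    public static void main(String[] args) {
        Car car = new Car();      // "car" is an instance of the class Car
        car.turnOn();             // inherited all the way from MotorVehicle
        System.out.println("Car running: " + car.isRunning());

        AirVehicle craft = new AirVehicle();
        craft.turnOn();           // inherited from MotorVehicle
        craft.goUp();             // defined only for air vehicles
        System.out.println("Altitude: " + craft.getAltitude());
    }
}
```

Car declares nothing of its own here, yet it can turnOn and turnOff because it inherits them. It cannot goUp, because that method lives on a different branch of the hierarchy.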

The declaration of the methods, which we made above, tells us what things (objects) that belong to these classes can do, but not how they do them. And that's a good thing, because it leaves the details to the implementation of each lower-level class, while allowing a program that has a MotorVehicle to just tell it to goBackward, without having to know that a car has to change gears, a motorboat has to reverse its propellers, and a motorcycle has to have its driver push it backward.
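Here is one way that separation might look in Java, as a sketch only. I've flattened the hierarchy so each vehicle extends MotorVehicle directly, and I've had goBackward return a description string purely so the example can print something; both are simplifications of my own. Motorboat and the backUp helper are likewise names I made up for the example.

```java
abstract class MotorVehicle {
    // Declared here (the "what"); each concrete subclass supplies the "how".
    abstract String goBackward();
}

class Car extends MotorVehicle {
    @Override
    String goBackward() { return "shift into reverse gear"; }
}

class Motorboat extends MotorVehicle {
    @Override
    String goBackward() { return "reverse the propellers"; }
}

class Motorcycle extends MotorVehicle {
    @Override
    String goBackward() { return "driver pushes it backward"; }
}

class BackwardDemo {
    // This code knows only that it has some MotorVehicle.
    static String backUp(MotorVehicle vehicle) {
        return vehicle.goBackward();
    }

    public static void main(String[] args) {
        MotorVehicle[] vehicles = { new Car(), new Motorboat(), new Motorcycle() };
        for (MotorVehicle vehicle : vehicles) {
            System.out.println(backUp(vehicle));
        }
    }
}
```

The backUp method never asks what kind of vehicle it was handed. The language picks the right implementation at run time, based on the object's actual class.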

This model of programming, object-oriented programming, lets us look at how we manipulate objects — and, thus, the procedure we're trying to make the computer perform — in general terms, which is good for a "top-down" approach to programming. Eventually we have to get into the fine details of just how this class of object implements that directive, but we needn't worry about it at the higher level.

We have a slight flaw here, though, because airplanes that are flying can't goBackward, but AirVehicle inherits the goBackward method from MotorVehicle, and passes it on to Airplane. We might have declared our methods differently, pushing the goBackward method down to only certain subclasses of vehicles. And, indeed, that's what we should generally do if we have a characteristic that's shared by some subclasses and not others. In this case, though, the vast majority of MotorVehicle subclasses are able to do it, and this particular subclass is an exception... so we've chosen to treat it that way. When we implement the goBackward method in the Airplane class, we will have it throw an exception to tell the program that's using it that it can't do that. This is an example of the decisions we have to make when we're designing a class hierarchy.
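A sketch of that decision in Java might look like the following. UnsupportedOperationException is a standard Java exception that suits this case, though choosing it is my assumption; the empty default body in MotorVehicle and the tryGoBackward helper are also just for illustration.

```java
class MotorVehicle {
    // Most vehicles can do this; the body is a placeholder.
    void goBackward() { }
}

class AirVehicle extends MotorVehicle { }

class Airplane extends AirVehicle {
    // The exceptional subclass: it inherits the method but refuses to perform it.
    @Override
    void goBackward() {
        throw new UnsupportedOperationException("An airplane cannot go backward");
    }
}

class AirplaneDemo {
    // Returns true if the vehicle was able to go backward, false if it refused.
    static boolean tryGoBackward(MotorVehicle vehicle) {
        try {
            vehicle.goBackward();
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Generic vehicle: " + tryGoBackward(new MotorVehicle()));
        System.out.println("Airplane: " + tryGoBackward(new Airplane()));
    }
}
```

The cost of this choice is visible in tryGoBackward: any program that might be handed an Airplane now has to be prepared for the refusal.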

Sometimes object-oriented programming seems like an unnatural way to look at things, and, indeed, for some programs it's not the best choice. But for most projects of any complexity it seems to be a good way to organize and design the programs, and it helps in the development of the project. It also provides a great deal of flexibility in low-level implementation, without affecting the higher-level components, which are just using the interface that's well defined and that doesn't need to change. When someone designs a new sort of MotorVehicle, most of the programs that use MotorVehicles can just use it without even knowing about it, because for their purposes it just behaves like a MotorVehicle. Only programs that need to use the new vehicle's special features need to know about them.

Going back to the warehouse inventory system, let's look at it from an O-O point of view. The basic class is an Item. We can see that the inventory itself is a ListOfItems. An order is also a list of items, so we have two subclasses of ListOfItems: Order and Inventory. I think we can also make Shipment another subclass. Some methods we'll want to declare for all ListOfItems are add(Item), remove(Item), and countItems(). Order may specifically have methods to deal with payment, while we might declare methods in Shipment that relate to the shipping company we used to send it. But you can see that lots of things that manipulate ListOfItems will only need to use the general methods, and will work for us regardless of whether we're dealing with the warehouse inventory, an order to our supplier, a customer order, or a shipment.
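Sketched in Java, the inventory classes might start out something like this. The payment and shipping-company details are invented placeholders, as is the moveItem helper; only Item, ListOfItems, Order, Inventory, Shipment, and the three general methods come from the description above.

```java
import java.util.ArrayList;
import java.util.List;

class Item {
    private final String itemNumber;
    Item(String itemNumber) { this.itemNumber = itemNumber; }
    String getItemNumber() { return itemNumber; }
}

// The general class: everything that is "a list of items" can do these things.
class ListOfItems {
    private final List<Item> items = new ArrayList<>();

    void add(Item item) { items.add(item); }
    boolean remove(Item item) { return items.remove(item); }
    int countItems() { return items.size(); }
}

class Inventory extends ListOfItems { }

class Order extends ListOfItems {
    private boolean paid = false;              // hypothetical payment detail
    void recordPayment() { paid = true; }
    boolean isPaid() { return paid; }
}

class Shipment extends ListOfItems {
    private final String shippingCompany;      // hypothetical shipping detail
    Shipment(String shippingCompany) { this.shippingCompany = shippingCompany; }
    String getShippingCompany() { return shippingCompany; }
}

class WarehouseDemo {
    // Uses only the general methods, so it works for any kind of ListOfItems.
    static void moveItem(Item item, ListOfItems from, ListOfItems to) {
        if (from.remove(item)) {
            to.add(item);
        }
    }

    public static void main(String[] args) {
        Inventory inventory = new Inventory();
        Item widget = new Item("A-100");
        inventory.add(widget);
        inventory.add(new Item("B-200"));

        Shipment shipment = new Shipment("Example Freight");
        moveItem(widget, inventory, shipment);
        System.out.println("In inventory: " + inventory.countItems());
        System.out.println("In shipment: " + shipment.countItems());
    }
}
```

The moveItem method is written once against ListOfItems, and it would work just as well moving an item from an Order into the Inventory.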

Note that the use of an object-oriented language, such as Java, doesn't necessarily mean that you have a proper object-oriented program. You still have to use good design techniques, and become experienced in object-oriented design. Similarly, we can write object-oriented programs using procedural programming languages... it's just that the O-O languages make it much easier because of the features of the language itself.

I'll close with two general opinions:

  1. No programming language or programming model is right for everything. While it's true that you can probably use one language to write everything, certain languages, models, and techniques are best for certain tasks. You can bang in a nail with a pair of pliers if you have to, but sometimes it's better to go find a hammer.
  2. Object-oriented programming helps avoid certain program design flaws. It also exposes others. There's no substitute for good design and good programming practice, and it's easy to write bad programs in any language. There are lots of examples out there.
