Python is an object-oriented programming language, which means that it provides features that support object-oriented programming (OOP).
Object-oriented programming has its roots in the 1960s, but it wasn’t until the mid 1980s that it became the main programming paradigm used in the creation of new software. It was developed as a way to handle the rapidly increasing size and complexity of software systems and to make it easier to modify these large and complex systems over time.
Up to now, most of the programs we have been writing use a procedural programming paradigm. In procedural programming the focus is on writing functions or procedures which operate on data. In object-oriented programming the focus is on the creation of objects which contain both data and functionality together. (We have seen turtle objects, string objects, and random number generators, to name a few places where we’ve already worked with objects.)
Usually, each object definition corresponds to some object or concept in the real world, and the functions that operate on that object correspond to the ways real-world objects interact.
We’ve already seen classes like str, int, float and Turtle. We are now ready to create our own user-defined class: the Point.
Consider the concept of a mathematical point. In two dimensions, a point is two numbers (coordinates) that are treated collectively as a single object. Points are often written in parentheses with a comma separating the coordinates. For example, (0, 0) represents the origin, and (x, y) represents the point x units to the right and y units up from the origin.
Some of the typical operations that one associates with points might be calculating the distance of a point from the origin, or from another point, or finding a midpoint of two points, or asking if a point falls within a given rectangle or circle. We’ll shortly see how we can organize these together with the data.
A natural way to represent a point in Python is with two numeric values. The question, then, is how to group these two values into a compound object. The quick and dirty solution is to use a tuple, and for some applications that might be a good choice.
An alternative is to define a new class. This approach involves a bit more effort, but it has advantages that will be apparent soon. We’ll want our points to each have an x and a y attribute, so our first class definition looks like this:
Class definitions can appear anywhere in a program, but they are usually near the beginning (after the import statements). Some programmers and languages prefer to put every class in a module of its own — we won’t do that here. The syntax rules for a class definition are the same as for other compound statements. There is a header which begins with the keyword, class, followed by the name of the class, and ending with a colon. Indentation levels tell us where the class ends.
If the first line after the class header is a string, it becomes the docstring of the class and will be recognized by various tools. (This is also the way docstrings work in functions.)
Every class should have a method with the special name __init__. This initializer method is automatically called whenever a new instance of Point is created. It gives the programmer the opportunity to set up the attributes required within the new instance by giving them their initial state/values. The self parameter (we could choose any other name, but self is the convention) is automatically set to reference the newly created object that needs to be initialized.
So let’s use our new Point class now:
This program prints:
0 0 0 0
because during the initialization of the objects, we created two attributes called x and y for each and gave them both the value 0.
This should look familiar — we’ve used classes before to create more than one object:
The variables p and q are assigned references to two new Point objects. A function like Turtle or Point that creates a new object instance is called a constructor, and every class automatically provides a constructor function which is named the same as the class.
It may be helpful to think of a class as a factory for making objects. The class itself isn’t an instance of a point, but it contains the machinery to make point instances. Every time we call the constructor, we’re asking the factory to make us a new object. As the object comes off the production line, its initialization method is executed to get the object properly set up with its factory default settings.
The combined process of “make me a new object” and “get its settings initialized to the factory default settings” is called instantiation.
Like real world objects, object instances have both attributes and methods.
We can modify the attributes in an instance using dot notation:
Both modules and instances create their own namespaces, and the syntax for accessing names contained in each — called attributes — is the same. In this case the attribute we are selecting is a data item from an instance.
The following state diagram shows the result of these assignments:
The variable p refers to a Point object, which contains two attributes. Each attribute refers to a number.
We can access the value of an attribute using the same syntax:
The expression p.x means, “Go to the object p refers to and get the value of x.” In this case, we assign that value to a variable named x. There is no conflict between the variable x (in the global namespace here) and the attribute x (in the namespace belonging to the instance). The purpose of dot notation is to fully qualify which variable we are referring to unambiguously.
We can use dot notation as part of any expression, so the following statements are legal:
To create a point at position (7, 6) currently needs three lines of code:
We can make our class constructor more general by placing extra parameters into the __init__ method, as shown in this example:
The x and y parameters here are both optional. If the caller does not supply arguments, they’ll get the default values of 0. Here is our improved class in action:
The key advantage of using a class like Point rather than a simple tuple (6, 7) now becomes apparent. We can add methods to the Point class that are sensible operations for points but may not be appropriate for other tuples like (25, 12), which might represent, say, a day and a month, e.g. Christmas day. So being able to calculate the distance from the origin is sensible for points, but not for (day, month) data. For (day, month) data, we’d like different operations, perhaps to find what day of the week it will fall on in 2020.
Creating a class like Point brings an exceptional amount of “organizational power” to our programs and to our thinking. We can group together the sensible operations and the kinds of data they apply to, and each instance of the class can have its own state.
A method behaves like a function, but it is invoked on a specific instance, e.g. tess.right(90). Like a data attribute, methods are accessed using dot notation.
Let’s add another method, distance_from_origin, to see better how methods work:
When defining a method, the first parameter refers to the instance being manipulated. As already noted, it is customary to name this parameter self.
Notice that the caller of distance_from_origin does not explicitly supply an argument to match the self parameter — this is done for us, behind our back.
We can pass an object as an argument in the usual way. We’ve already seen this in some of the turtle examples, where we passed the turtle to some function — like draw_bar in the chapter titled Conditionals — so that the function could control and use whatever turtle instance we passed to it.
Be aware that our variable only holds a reference to an object, so passing tess into a function creates an alias: both the caller and the called function now have a reference, but there is only one turtle!
Here is a simple function involving our new Point objects:
Most object-oriented programmers probably would not do what we’ve just done in print_point. When we’re working with classes and objects, a preferred alternative is to add a new method to the class. And we don’t like chatterbox methods that call print. A better approach is to have a method so that every instance can produce a string representation of itself. Let’s initially call it to_string:
But don’t we already have a str type converter that can turn our object into a string? Yes! And doesn’t print automatically use this when printing things? Yes again! But these automatic mechanisms do not yet do exactly what we want:
Python has a clever trick up its sleeve to fix this. If we call our new method __str__ instead of to_string, the Python interpreter will use our code whenever it needs to convert a Point to a string. Let’s re-do this again, now:
Functions and methods can return instances. For example, given two Point objects, find their midpoint. First we’ll write this as a regular function. The function creates and returns a new Point object:
Now let us do this as a method instead. Suppose we have a point object and wish to find the midpoint halfway between it and some other target point. This method is identical to the function, aside from some renaming.
While this example assigns each point to a variable, this need not be done. Just as function calls are composable, method calls and object instantiation are also composable, leading to this alternative that uses no variables:
The original syntax for a function call, print_time(current_time), suggests that the function is the active agent. It says something like, “Hey, print_time! Here’s an object for you to print.”
In object-oriented programming, the objects are considered the active agents. An invocation like current_time.print_time() says “Hey current_time! Please print yourself!”
In our early introduction to turtles, we used an object-oriented style, so that we said tess.forward(100), which asks the turtle to move itself forward by the given number of steps.
This change in perspective might be more polite, but it may not initially be obvious that it is useful. But sometimes shifting responsibility from the functions onto the objects makes it possible to write more versatile functions and makes it easier to maintain and reuse code.
The most important advantage of the object-oriented style is that it fits our mental chunking and real-life experience more accurately. In real life, our cook method is part of our microwave oven — we don’t have a cook function sitting in the corner of the kitchen, into which we pass the microwave! Similarly, we use the cellphone’s own methods to send an sms, or to change its state to silent. The functionality of real-world objects tends to be tightly bound up inside the objects themselves. OOP allows us to accurately mirror this when we organize our programs.
Objects are most useful when we also need to keep some state that is updated from time to time. Consider a turtle object. Its state consists of things like its position, its heading, its color, and its shape. A method like left(90) updates the turtle’s heading, forward changes its position, and so on.
For a bank account object, a main component of the state would be the current balance and perhaps a log of all transactions. The methods would allow us to query the current balance, deposit new funds, or make a payment. Making a payment would include an amount and a description so that this could be added to the transaction log. We’d also want a method to show the transaction log.