A Brief Guide to CLOS

Jeff Dalton <J.Dalton@ed.ac.uk>

Contents:

  1. Introduction
  2. Defining classes
  3. Instances
  4. Inheritance of slot options
  5. Multiple inheritance
  6. Generic functions and methods
  7. Method combination

Introduction

This is a brief introduction to CLOS that says enough for someone who already knows something about Common Lisp to start using CLOS.

CLOS stands for Common Lisp Object System and is pronounced "see loss" or "kloss". (Opinions differ as to which is right.) CLOS is the part of Common Lisp that is directly concerned with the usual features of object-oriented programming, such as classes and methods. CLOS contains a number of complex and powerful features, but it is not necessary to understand all of CLOS in order to use it effectively. The basic parts of CLOS, which will be covered here, include pretty much everything you'd find in a typical object-oriented language, plus a few other things, and are sufficient for a wide range of applications.

When using CLOS, we will want to do three things:

  1. Define classes.
  2. Construct objects that are instances of classes.
  3. Define methods and generic functions.

We can do these things using only three new constructs: defclass for defining classes, make-instance for constructing instances, and defmethod for defining methods. defstruct can also be used to define classes, but it is more limited than defclass.

It's important to know how CLOS fits into the rest of Common Lisp. It turns out that all Common Lisp data objects are instances of classes. Consequently, you can define methods for all kinds of existing object types (such as numbers, hash-tables, and vectors); you don't have to use defclass at all before defining methods. Moreover, the objects created using defclass and make-instance can be used with ordinary Lisp functions, not only with methods.
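
As a small illustration of defining methods on existing object types, here is a sketch that never calls defclass. The generic function name describe-kind is made up for this example and is not part of CLOS.

```lisp
;; DESCRIBE-KIND is a hypothetical name chosen for this sketch.
(defgeneric describe-kind (object)
  (:documentation "Return a short string naming the kind of OBJECT."))

;; Methods specialized on classes that Common Lisp already provides.
(defmethod describe-kind ((object integer)) "an integer")
(defmethod describe-kind ((object string)) "a string")
(defmethod describe-kind ((object hash-table)) "a hash table")

;; Fallback: T is a superclass of every class, so this applies to anything.
(defmethod describe-kind ((object t)) "something else")

(describe-kind 12)                 ; => "an integer"
(describe-kind "twelve")           ; => "a string"
(describe-kind (make-hash-table))  ; => "a hash table"
(describe-kind 'twelve)            ; => "something else"
```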

To find the class of an object, use the function class-of. For example:
   Expression              Value
   (class-of 'a)           #<Built-In-Class SYMBOL>
   (class-of "a")          #<Built-In-Class STRING>
   (class-of 12)           #<Built-In-Class INTEGER>
   (class-of '(a b))       #<Built-In-Class CONS>
   (class-of '#(a b c))    #<Built-In-Class VECTOR>
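
The printed form of a class differs between implementations, and some implementations report a more specific class than the one shown here (for instance FIXNUM for 12 or SIMPLE-VECTOR for a vector). The following sketch checks the class in a way that should not depend on those details; it relies only on the fact that a class object can be used as a type specifier.

```lisp
;; Every object is an instance of the class that CLASS-OF reports for it.
(typep 12 (class-of 12))                  ; => T

;; Whatever class is reported, it is the named class or one of its subclasses.
(subtypep (class-of 'a) 'symbol)          ; first value => T
(subtypep (class-of "a") 'string)         ; first value => T
(subtypep (class-of 12) 'integer)         ; first value => T
(subtypep (class-of '(a b)) 'cons)        ; first value => T
(subtypep (class-of '#(a b c)) 'vector)   ; first value => T

;; CLASS-NAME returns the name of a class as a symbol.
(class-name (find-class 'integer))        ; => INTEGER
```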

In the rest of this document, we will often specify the syntax of something in Common Lisp. Here are the conventions used in these descriptions. Upper case names such as DEFCLASS are literals, but upper case is just to distinguish them from syntactic variables: you do not have to type them in upper case. Lower case is used for syntactic variables. "Thing*" means zero or more occurrences of thing; "thing+" means one or more. Curly brackets { and } are used for grouping, as in {a b}+, which means a b, a b a b, a b a b a b, etc.

Defining classes

You define a class with defclass:
(DEFCLASS class-name (superclass-name*)
  (slot-description*)
  class-option*)
For simple things, forget about class options. We will say nothing more about them here.

A slot-description has the form (slot-name slot-option*), where each option is a keyword followed by a name, expression, or whatever. The most useful slot options are

   :ACCESSOR function-name
   :INITFORM expression
   :INITARG symbol
Initargs are usually keywords that correspond to slot names. For instance, the initarg for the slot name would be :name.

Defclass is similar to defstruct, but the syntax is a bit different, and you have more control over what things are called. For instance, consider this definition:

(defstruct person
  (name 'bill)
  (age 10))
Defstruct would automatically define slots with expressions to compute default initial values, access-functions like person-name and person-age to get and set slot values, and a make-person that took keyword initialization arguments (initargs) as in
(make-person :name 'george :age 12)
A defclass that provided similar access functions, etc, would be:
(defclass person ()
  ((name :accessor person-name
         :initform 'bill
         :initarg :name)
   (age :accessor person-age
        :initform 10
        :initarg :age)))
Note that defclass lets you control what things are called. For instance, you don't have to call the accessor person-name. You can call it name, or object-name, or pretty much whatever you want.

In general, you should pick names that make sense for a group of related classes rather than rigidly following the defstruct conventions.

You do not have to provide all options for every slot. Maybe you don't want it to be possible to initialize a slot when calling make-instance (for which see below). In that case, don't provide an :initarg. Or maybe there isn't a meaningful default value. (Perhaps the meaningful values will always be specified by a subclass.) In that case, no :initform.
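
Here is a sketch of a class that leaves out some options. The class ticket and its slot names are hypothetical. A slot with no :initform and no value supplied at creation time starts out unbound; slot-boundp tests for this, and reading an unbound slot signals an error.

```lisp
;; TICKET is a hypothetical class used only for this sketch.
(defclass ticket ()
  ((status :accessor ticket-status
           :initform 'open)        ; default value, but no :initarg
   (owner :accessor ticket-owner
          :initarg :owner)))       ; initarg, but no :initform

(defvar *ticket* (make-instance 'ticket))

(ticket-status *ticket*)           ; => OPEN
(slot-boundp *ticket* 'owner)      ; => NIL, the slot has no value yet

;; Once a value is stored, the slot is bound.
(setf (ticket-owner *ticket*) 'jill)
(slot-boundp *ticket* 'owner)      ; => T
```

Because status has no :initarg, its initial value cannot be chosen in the call to make-instance; it can still be changed afterwards through the accessor.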

Note that classes are objects. To get the class object from its name, use

(FIND-CLASS name)
Ordinarily, you won't need to do this.
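
For the occasions when you do, here is a short sketch. The person class is repeated so that the example is self-contained.

```lisp
(defclass person ()
  ((name :accessor person-name :initform 'bill :initarg :name)
   (age  :accessor person-age  :initform 10    :initarg :age)))

;; FIND-CLASS maps a name to the class object; CLASS-NAME goes the other way.
(class-name (find-class 'person))                ; => PERSON

;; The class object is the same one that CLASS-OF returns for an instance.
(eq (find-class 'person)
    (class-of (make-instance 'person)))          ; => T

;; By default FIND-CLASS signals an error for an unknown name.
;; Passing NIL as the second argument makes it return NIL instead.
(find-class 'no-such-class nil)                  ; => NIL
```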

Instances

You can make an instance of a class with make-instance. It's similar to the make-x functions defined by defstruct but lets you pass the class to instantiate as an argument:
(MAKE-INSTANCE class {initarg value}*)
Instead of the class object itself, you can use its name. For example:
(make-instance 'person :age 100)
This person object would have age 100 and name bill, the default.

It's often a good idea to define your own constructor functions, rather than call make-instance directly, because you can hide implementation details and don't have to use keyword parameters for everything. For instance, if you wanted the name and age to be required, positional parameters, rather than keyword parameters, you could define

(defun make-person (name age)
  (make-instance 'person :name name :age age))
The accessor functions can be used to get and set slot values:
cl> (setq p1 (make-instance 'person :name 'jill :age 100))
#<person @ #x7bf826> 

cl> (person-name p1)
jill 

cl> (person-age p1)
100 

cl> (setf (person-age p1) 101)
101 

cl> (person-age p1)
101 
Note that when you use defclass, the instances are printed using the #<...> notation, rather than as #s(person :name jill :age 100). But you can change the way instances are printed by defining methods on the generic function print-object.
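
A minimal sketch of such a print-object method follows. It uses the standard macro print-unreadable-object to produce the #<...> wrapper; the particular text placed inside the wrapper is just one possible choice.

```lisp
(defclass person ()
  ((name :accessor person-name :initform 'bill :initarg :name)
   (age  :accessor person-age  :initform 10    :initarg :age)))

;; PRINT-OBJECT takes the object and the stream to print on.
(defmethod print-object ((p person) stream)
  ;; :TYPE T makes the wrapper start with the name of the object's class.
  (print-unreadable-object (p stream :type t)
    (format stream "~a, age ~a" (person-name p) (person-age p))))

;; With default printer settings this prints something like
;;   #<PERSON JILL, age 100>
(print (make-instance 'person :name 'jill :age 100))
```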

Slots can also be accessed by name using

(SLOT-VALUE instance slot-name)
Note that using slot-value reveals that the value is stored in a slot. Accessors are more abstract since they reveal less and don't have to access slots. Since accessors are functions (indeed, generic functions as described below), they can be (re)defined to obtain their value some other way. If you decide that a value should no longer be stored in a slot but should, for example, be computed as needed instead, you can change the access function without having to change all the code that calls it.
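
To illustrate, here is a sketch of a variant of person that no longer has an age slot. The birth-year slot and the variable *current-year* are assumptions made for this example; code that calls person-age does not need to change.

```lisp
;; A hypothetical variant of PERSON that stores a birth year, not an age.
(defclass person ()
  ((name :accessor person-name :initarg :name)
   (birth-year :accessor person-birth-year :initarg :birth-year)))

;; Fixed here so that the example is repeatable.
(defvar *current-year* 2000)

;; PERSON-AGE is now an ordinary method that computes its value.
(defgeneric person-age (person))

(defmethod person-age ((p person))
  (- *current-year* (person-birth-year p)))

(person-age (make-instance 'person :name 'jill :birth-year 1900))  ; => 100
```

Code that had used (slot-value p 'age) would break under this change, which is the sense in which accessors are more abstract.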

Here's an example of slot-value:

cl> (slot-value p1 'name)
jill 
 
cl> (setf (slot-value p1 'name) 'jillian)
jillian 

cl> (person-name p1)
jillian 
You can find out various things about an instance by calling describe:
cl> (describe p1)
#<person @ #x7bf826> is an instance of class 
     #<clos:standard-class person @ #x7ad8ae>:
The following slots have :INSTANCE allocation:
age     101
name    jillian
Different implementations of Common Lisp will show such information in different ways and may also differ in the details of what appears in the #<...> notation.

Inheritance of slot options

The class above listed no superclasses. That's why there was a "()" after "defclass person". Actually, a class defined this way still has one direct superclass: standard-object.

When there are superclasses, a subclass can specify a slot that has already been specified for a superclass. When this happens, the information in slot options has to be combined. For the slot options listed above, either the option in the subclass overrides the one in the superclass or there is a union:

   :ACCESSOR  -  union
   :INITARG   -  union
   :INITFORM  -  overrides
This is what you should expect. The subclass can change the default initial value by overriding the :initform, and can add to the initargs and accessors.

However, the union for :accessor is just a consequence of how generic functions work. If they can apply to instances of a class C, they can also apply to instances of subclasses of C. (Accessor functions are generic.)
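
The three rules can be seen together in a small sketch. The classes shape and circle are hypothetical; circle respecifies the label slot with a new accessor, a new initarg, and a new initform.

```lisp
(defclass shape ()
  ((label :accessor shape-label
          :initarg :label
          :initform "shape")))

(defclass circle (shape)
  ((label :accessor circle-label    ; added to SHAPE-LABEL
          :initarg :name            ; added to :LABEL
          :initform "circle")))     ; replaces "shape"

;; :INITFORM overrides: the default now comes from CIRCLE.
(shape-label (make-instance 'circle))                ; => "circle"

;; :INITARG is a union: both :LABEL and :NAME are accepted.
(shape-label (make-instance 'circle :label "a"))     ; => "a"
(shape-label (make-instance 'circle :name "b"))      ; => "b"

;; :ACCESSOR is a union: both accessors work on a CIRCLE.
(circle-label (make-instance 'circle))               ; => "circle"
```

Note that circle-label does not apply to a plain shape, since its methods are specialized on circle.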

Here are some subclasses of person. Teacher is a direct subclass of person; maths-teacher is a direct subclass of teacher and an indirect subclass of person (via teacher).

cl> (defclass teacher (person)
      ((subject :accessor teacher-subject
                :initarg :subject)))
#<clos:standard-class teacher @ #x7cf796> 

cl> (defclass maths-teacher (teacher)
      ((subject :initform "Mathematics")))
#<clos:standard-class maths-teacher @ #x7d94be> 

cl> (setq p2 (make-instance 'maths-teacher
                            :name 'john
                            :age 34))
#<maths-teacher @ #x7dcc66> 

cl> (describe p2)
#<maths-teacher @ #x7dcc66> is an instance of
     class #<clos:standard-class maths-teacher @ #x7d94be>:
 The following slots have :INSTANCE allocation:
 age        34
 name       john
 subject    "Mathematics"
Note that classes print like #<clos:standard-class maths-teacher @ #x7d94be>. The #<...> notation usually has the form
   #<class-of-the-object ... more information ...>
So an instance of maths-teacher prints as #<maths-teacher ...>. The notation for the classes above indicates that the classes are instances of the class standard-class. Defclass defines standard classes; defstruct defines structure classes.

Since classes are objects, they too are instances of classes. A class, such as standard-class, that has classes as its instances is called a metaclass.
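
You can check the metaclass of a class by applying class-of to the class object. The names in this sketch are hypothetical.

```lisp
;; Hypothetical definitions, one with DEFCLASS and one with DEFSTRUCT.
(defclass example-point () ())
(defstruct example-pair left right)

;; A class made by DEFCLASS is an instance of STANDARD-CLASS.
(class-name (class-of (find-class 'example-point)))     ; => STANDARD-CLASS
(typep (find-class 'example-point) 'standard-class)     ; => T

;; A class made by DEFSTRUCT is an instance of STRUCTURE-CLASS.
(typep (find-class 'example-pair) 'structure-class)     ; => T

;; Classes for existing types such as INTEGER are typically built-in classes.
(typep (find-class 'integer) 'built-in-class)           ; => T
```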

Multiple inheritance

First, some terminology. When defining a class C using defclass, the superclasses listed in the definition of C are the direct superclasses of C. We can also consider the superclasses of the direct superclasses, the superclasses of those superclasses, and so on. These are the indirect superclasses of C. So in the examples above teacher is a direct superclass of maths-teacher, and person is an indirect superclass.

Note that a superclass is more general and hence less specific than its subclasses. CLOS defines a notion of being more or less specific that covers all of the superclasses of a given class.

A class can inherit from all of its superclasses, both direct and indirect. When something is specified by more than one of these superclasses, there has to be a rule (such as the ones for slot options above) that says how this information is combined. In Common Lisp, rules of this sort can refer to a total ordering of all the superclasses of a class. This ordering is called the class precedence list of the class. Classes earlier in the list are considered more specific than later ones, and they may override information provided by less specific classes. (Of course, it depends on the rule. Some rules say "override", others don't.)

When all the classes being considered have only one direct superclass - single inheritance - the total order is easy to find. The following CLOS rule is enough, on its own, for that case:

Each class is more specific than its superclasses.
CLOS, however, allows a class to have more than one direct superclass. So CLOS provides multiple inheritance. With multiple inheritance, ordering superclasses becomes harder, and the rule above is no longer enough. Suppose we have
(defclass a (b c) ...)
Class A is more specific than B or C, by the rule we've already seen; but what if something (for instance, an :initform, or a method) is specified by both B and C? Which overrides the other? The rule in CLOS is that the superclasses listed earlier in the class definition are considered more specific (relative to the class being defined) than those listed later. This gives us a second rule:
For a given class, superclasses listed earlier are more specific than those listed later.

The two rules are still not always enough to determine a unique order, however, so CLOS has an algorithm for breaking ties. This ensures that all implementations always produce the same order, but it's usually considered a bad idea for programmers to rely on exactly what the order is. If the order for some superclasses is important, it can be expressed directly in the class definition.

[A perhaps not obvious property of the CLOS class precedence rules is that different classes, C1 and C2, can list the same direct superclasses in different orders, provided that C1 and C2 are not related to each other in a way that leads to a violation of the two rules given above.]
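
The second rule can be seen in a short sketch. The class names are hypothetical. Both superclasses supply an :initform for the same slot, and the one listed earlier in each definition wins. The two subclasses list the same superclasses in opposite orders, which is allowed because they are unrelated to each other.

```lisp
(defclass swimmer ()
  ((habitat :initform 'water :reader habitat)))

(defclass walker ()
  ((habitat :initform 'land :reader habitat)))

;; SWIMMER is listed first, so it is more specific for DUCK.
(defclass duck (swimmer walker) ())

;; WALKER is listed first, so it is more specific for FROG.
(defclass frog (walker swimmer) ())

(habitat (make-instance 'duck))   ; => WATER
(habitat (make-instance 'frog))   ; => LAND
```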

Generic functions and methods

Generic functions in CLOS are the closest thing to "messages". Instead of writing
(SEND instance operation-name arg*)     ;not CLOS
you write
(operation-name instance arg*)          ;CLOS
The operations / messages are generic functions - functions whose behavior can be defined for instances of particular classes by defining methods.
(DEFGENERIC function-name lambda-list)
can be used to define a generic function. You don't have to call defgeneric, however, because defmethod automatically defines the generic function if it has not been defined already. On the other hand, it's often a good idea to use defgeneric as a declaration that an operation exists and has certain parameters.
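
A sketch of defgeneric used as a declaration follows. The names area and square are made up for the example. The (:documentation ...) option shown is a standard defgeneric option, although the syntax summary above leaves options out.

```lisp
;; Declare that the operation exists and takes one required parameter.
(defgeneric area (shape)
  (:documentation "Return the area of SHAPE."))

(defclass square ()
  ((side :accessor square-side :initarg :side)))

;; The method's parameters must be compatible with the declaration:
;; here, exactly one required parameter.
(defmethod area ((s square))
  (* (square-side s) (square-side s)))

(area (make-instance 'square :side 3))   ; => 9
```

A later defmethod for area with, say, two required parameters would be rejected, which is one benefit of declaring the generic function first.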

Anyway, all of the interesting things happen in methods. A method is defined by:

(DEFMETHOD generic-function-name specialized-lambda-list
  form*)
This may look fairly cryptic, but compare it to defun described in a similar way:
(DEFUN function-name lambda-list form*)
A "lambda list" is just a list of formal parameters, plus things like &optional or &rest. It's because of such complications that we say "lambda-list" instead of "(parameter*)" when describing the syntax. [I won't say anything about &optional, &rest, or &key in methods. The rules are in CLtL (Steele's "Common Lisp: the Language"), if you want to know them.]

So a normal function has a lambda list like (var1 var2 ...). A method has one in which each parameter can be "specialized" to a particular class. So it looks like:

  ((var1 class1) (var2 class2) ...)
The specializer is optional. Omitting it means that the method can apply to instances of any class, including classes that were not defined by defclass. For example:
(defmethod change-subject ((teach teacher) new-subject)
  (setf (teacher-subject teach) new-subject))
Here the new-subject could be any object. If you want to restrict it, you might do something like:
(defmethod change-subject ((teach teacher) (new-subject string))
  (setf (teacher-subject teach) new-subject))
Or you could define classes of subjects.

Methods in "classical" object-oriented programming specialize only one parameter. In CLOS, you can specialize more than one. If you do, the method is sometimes called a multi-method.

A method defined for a class C overrides any method defined for a superclass of C. The method for C is "more specific" than the method for the superclass, because C is more specific than the classes it inherits from (e.g., dog is more specific than animal).
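
The dog and animal case can be sketched as follows. The generic function speak and the strings it returns are made up for the example.

```lisp
(defclass animal () ())
(defclass dog (animal) ())
(defclass cat (animal) ())

(defgeneric speak (creature))

;; Applies to every ANIMAL, including instances of subclasses.
(defmethod speak ((a animal))
  "some animal noise")

;; More specific than the ANIMAL method, so it is chosen for dogs.
(defmethod speak ((d dog))
  "Woof")

(speak (make-instance 'dog))      ; => "Woof"
(speak (make-instance 'cat))      ; => "some animal noise" (inherited)
(speak (make-instance 'animal))   ; => "some animal noise"
```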

For multi-methods, the determination of which method is more specific involves more than one parameter. The parameters are considered lexicographically from left to right. (Hence earlier parameters are more important than later ones.)

(defmethod test ((x number) (y number))
  '(num num))

(defmethod test ((i integer) (y number))
  '(int num))

(defmethod test ((x number) (j integer))
  '(num int))

(test 1 1)      =>  (int num), not (num int)
(test 1 1/2)    =>  (int num) 
(test 1/2 1)    =>  (num int) 
(test 1/2 1/2)  =>  (num num) 

Method combination

When more than one class defines a method for a generic function, and more than one method is applicable to a given set of arguments, the applicable methods are combined into a single effective method. Each individual method definition is then only part of the definition of the effective method.

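One kind of method combination is always supported by CLOS. It is called standard method combination. It is also possible to define new kinds of method combination. Standard method combination involves four kinds of methods:

  1. Primary methods form the main body of the effective method. Only the most specific primary method is called, but it can call the next most specific primary method by calling call-next-method.
  2. :Before methods are all called before the primary method, with the most specific :before method called first.
  3. :After methods are all called after the primary method, with the most specific :after method called last.
  4. :Around methods run before the other methods. As with primary methods, only the most specific is called, and call-next-method calls the next most specific. When the least specific :around method calls call-next-method, what it calls is the combination of :before, primary, and :after methods.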

:Before, :after, and :around methods are indicated by putting the corresponding keyword as a qualifier in the method definition. Here's where the qualifier goes:

(DEFMETHOD gf-name qualifier specialized-lambda-list
  form*)

:Before and :after methods are the easiest to use, and a simple example will show how they work:

(defclass food () ())

(defmethod cook :before ((f food))
  (print "A food is about to be cooked."))

(defmethod cook :after ((f food))
  (print "A food has been cooked."))

(defclass pie (food)
  ((filling :accessor pie-filling
            :initarg :filling
            :initform 'apple)))

(defmethod cook ((p pie))
  (print "Cooking a pie.")
  (setf (pie-filling p) (list 'cooked (pie-filling p))))

(defmethod cook :before ((p pie))
  (print "A pie is about to be cooked."))

(defmethod cook :after ((p pie))
  (print "A pie has been cooked."))

(setq pie-1 (make-instance 'pie :filling 'apple))
And now:
cl> (cook pie-1)

"A pie is about to be cooked." 
"A food is about to be cooked." 
"Cooking a pie." 
"A food has been cooked." 
"A pie has been cooked." 
(cooked apple)
The final line - (cooked apple) - is printed by the Lisp system. It's the value returned by the primary method: the one that prints "Cooking a pie."

:Around methods are more complex, so we'll consider them separately. Let's see what happens if we add an :around method for the class food and then cook the pie again. Note that we have to save the result of calling call-next-method in order to return it at the end.

(defmethod cook :around ((f food))
  (print "Begin around food.")
  (let ((result (call-next-method)))
    (print "End around food.")
    result))
If we do (cook pie-1) again, we will see:
"Begin around food." 
"A pie is about to be cooked." 
"A food is about to be cooked." 
"Cooking a pie." 
"A food has been cooked." 
"A pie has been cooked." 
"End around food." 
(cooked (cooked apple))
Note how the :around method gets to run around the other methods. The final value is (cooked (cooked apple)) because we've now cooked the same pie twice.

It is possible to specify arguments when calling call-next-method, but you must not change which method would be next. If no arguments are specified, the objects given to the current method when it was called are passed to the next method. That's what we did in the :around method above. Assignment to the method's formal parameters would not cause different objects to be passed. The only way to pass different objects to the next method is to give them as arguments to call-next-method. But you must be very careful about the classes of those arguments, because the list of methods that are considered by call-next-method must remain the same.
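
Here is a sketch of passing different arguments to call-next-method. The radio class and the set-volume function are hypothetical. The :around method passes on a clamped level; since the clamped value is still an integer, the same methods remain applicable.

```lisp
(defclass radio ()
  ((volume :accessor radio-volume :initform 0)))

(defgeneric set-volume (device level))

;; Primary method: store the level it is given.
(defmethod set-volume ((r radio) (level integer))
  (setf (radio-volume r) level))

;; :AROUND method: hand the primary method a level between 0 and 10.
;; The new argument is passed explicitly to CALL-NEXT-METHOD;
;; assigning to LEVEL would not have this effect.
(defmethod set-volume :around ((r radio) (level integer))
  (call-next-method r (max 0 (min 10 level))))

(defvar *radio* (make-instance 'radio))

(set-volume *radio* 25)    ; => 10
(radio-volume *radio*)     ; => 10
(set-volume *radio* -3)    ; => 0
```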

In addition to call-next-method, there is a zero-argument function called next-method-p that returns true if there is a next method and false if there is not. The call-next-method and next-method-p functions are allowed only within the body of a method.
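
A sketch of next-method-p in primary methods follows. The classes and the parts function are made up for the example. Each method adds its own contribution and calls the next method only if there is one.

```lisp
(defclass base-widget () ())
(defclass fancy-widget (base-widget) ())

(defgeneric parts (widget))

;; For a BASE-WIDGET there is no less specific primary method,
;; so NEXT-METHOD-P is false here and CALL-NEXT-METHOD is skipped.
(defmethod parts ((w base-widget))
  (cons 'base
        (if (next-method-p) (call-next-method) '())))

;; For a FANCY-WIDGET the next method is the BASE-WIDGET one.
(defmethod parts ((w fancy-widget))
  (cons 'fancy
        (if (next-method-p) (call-next-method) '())))

(parts (make-instance 'base-widget))    ; => (BASE)
(parts (make-instance 'fancy-widget))   ; => (FANCY BASE)
```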

