An object is defined by an internal collection of values representing its state and an associated collection of operations which can be used to sense and manipulate that state, and possibly to combine the object's internal state values with similar values drawn from other objects. Objects can be used to represent anything not directly or easily represented by one of SETL's built-in set-theoretic types. In particular, all of SETL's advanced facilities for user interaction, as presented in Chapter 10, are built using the object facilities explained in this section. These object facilities also ease the use of other external-language facilities available to SETL by enclosing them in helpful 'syntactic wrappers'.
The operations defined on an object are called its methods, and are conceptually similar to procedures. But an important difference between a method and a procedure is that a method's name specifies an operation to be performed, but leaves the specific procedure to be used in performing the operation to be chosen dynamically, in a manner dependent on the type of the object to which the operation is to be applied. To illustrate the difference between these ideas, suppose we have two objects, one built around a bitmap and the other around an encapsulated Postscript graphic. Each of these objects might naturally have a method called draw, which draws it onto the screen (or elsewhere). We might then find ourselves working with a variable x that could be either a bitmap or a Postscript graphic. Then the expression
x.draw()
will call either the draw method defined on bitmaps or the method defined on Postscript graphics, depending on the value of x at the time the expression is evaluated. A method is therefore something like an overloaded procedure.
A method is invoked by passing it an object of a kind for which it was defined, along with any other arguments required by the method. This is much like a procedure call, except that the object both determines the procedure called and becomes an implicit argument to the call. The standard syntax for this is
x.method_name(parameter_1,..,parameter_k)
However, SETL also allows method calls to be written in infix notation, and, indeed, allows very aggressive overloading of all its built-in operations and much of its built-in syntax. The importance of this capability is explained below.
A class or object class describes the data elements and methods common to a family of similar objects. Each of the (dynamically created, and possibly very numerous) objects described by a class is called an instance of the class. All these instances share a common set of methods (coded procedures), but each instance ordinarily contains its own private copy of some (typically almost all) of the variables manipulated by these procedures. These variables, which exist in separate copies for each object of a class, are called the instance variables of the class; those shared by all the objects of the class are called the class variables of the class. Any of the instance variables, class variables, or methods defined for a class CL can either be made public (visible to other programs, packages, or classes which use CL) or kept private (invisible to these other programs, packages, or classes).
Classes can be related to underlying classes which aid in their definition by declaring a relationship called inheritance between them. This relationship integrates classes into an inheritance hierarchy. The inheriting class CL in such a relationship is called a subclass of any class from which it inherits, which conversely is called a superclass of CL. Subclasses are able to use data elements or methods in their superclasses as if they were declared internally. However, it is also possible for a subclass to override method and variable definitions made in one of its superclasses, simply by defining a method or variable with the same name again. Details of the conventions which apply are explained below.
The rules governing the syntax of a class definition closely resemble those which apply to package definitions. A class is treated as a compilation unit, at the same level as a package or program. It is not possible to embed a class within a package, program or other class. As in the case of packages, the definition of a class consists of two parts, a class specification and a class body. (The specification and body need not occur in the same source file, but the specification must always be compiled before the body.) Each class specification declares the list of superclasses (if any) from which the class inherits, along with names of class-specific methods and data elements visible to units which use the class. The complete syntax of a class specification is:
class class_name;
    inherit names_of_superclasses;  -- there can be several of these
    data_declarations;              -- const and var declarations, as in packages and programs
    method_declarations;            -- like the procedure declarations in a package header
end class_name;                     -- here, 'class_name' is optional

Each of these components of a class specification is described in a following section.
A class body defines some of the methods declared in the class specification, along with other methods and data elements visible only within the class. This can include a list of (class-private) instance and class variables. The syntax of a class body is:
class body class_name;
    use names_of_other_classes_and_packages;  -- there can be several of these
    data_declarations;                        -- const and var declarations, as in packages and programs
    method_definitions;                       -- like the procedure declarations in a package body
end class_name;                               -- here, 'class_name' is optional
Note that (for consistency) this syntax is similar to that used for package specifications and package bodies, but there are a few differences:
Data elements declared in connection with classes fall into one of several categories, based upon where they are stored and where they are visible. A data element declared in an object class or in one of the object classes from which it inherits can have a separate copy stored in each instance of the class, or it can be global to all instances of the class. This is the distinction between instance variables and class variables.
An instance variable is declared with the same syntax as is used for global variable declarations within packages or programs:
var name_1,name_2 := initializing_expression,...;
Note that all instance variables must be explicitly declared. Instance variables in the current instance (see 8.4) of any class C may be referenced by name only; instance variables in other instances of the same or ancestor classes (these are the classes from which C inherits directly or indirectly) can also be referenced using the more elaborate syntax illustrated by instance.name.
If initialization clauses are attached to the declarations of instance variables they are executed when an instance is created, but before any create method (see 8.5) is called.
Class variables are declared in a manner similar to that used for instance variables, but with the extra keyword class, as shown in
class var name_1,name_2 := initializing_expression,...;
If initialization clauses are attached to the declarations of class variables they are executed when a class is loaded by the SETL interpreter. This can happen at one of two times:
Methods are similar to procedures, except that many different classes can contain identically named methods. The standard syntax of a method definition is identical to that of a procedure definition, except that read-write and write-only parameters are not allowed.
procedure name(parameters);
    data_declarations;           -- const and var declarations, as in packages and programs
    ... statements ...
    subprocedure_definitions;    -- like other imbedded procedure definitions
end name;                        -- here, the name is optional
Methods declared using this standard syntax are invoked using the syntax
x.method_name(parameter_1,..,parameter_k).
Here x must be an instance of a class for which method_name is a method applicable to objects of that class. The method is invoked and the object x passed to it as an implicit parameter, referenced within the method using the reserved keyword self. Within any of an object's methods, any 'direct' reference to instance variables (i.e. any reference that is not preceded by an explicit object reference using the syntax instance.instance-variable) refers to the corresponding instance variable in the object x. Thus a direct reference
The more general syntax
instance.instance-variable
can be used to refer to the instance variables of other objects of the same class (or any of its superclasses), provided that these are visible at the point of call.
Similarly, the methods of a class may be called within a class body without the instance prefix, i.e. method_name(parameters) is synonymous with self.method_name(parameters) within such a class.
The self object passed to a method call as an implicit parameter is analogous to a read-write parameter of a procedure, in that the method can modify the object and pass the modified result back to the point of call; all other parameters of a method are always read-only. Internally, the SETL interpreter stores objects as tuples, where the first element of the tuple is a key indicating the class of the object and the remainder of the tuple stores the values of instance variables. When a method is called, the values stored in the tuple are copied into the instance variables. When the method returns, those values are copied back into the tuple. Note that this protocol passes parameters by copying, not by reference.
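The dynamic choice of method described at the start of this section can be sketched as follows. This is a minimal illustration, not code from the SETL library: the class names bitmap_graphic and postscript_graphic, their instance variables, and the printed messages are all hypothetical, and the create methods used to build the two objects follow the conventions described in the material on object creation.

```setl
class bitmap_graphic;                  -- hypothetical class, for illustration only
    procedure create(w,h);             -- creates a bitmap of the given size
    procedure draw();                  -- 'draws' the bitmap by printing a message
end bitmap_graphic;

class body bitmap_graphic;
    var width, height;                 -- instance variables
    procedure create(w,h); width := w; height := h; end create;
    procedure draw(); print("bitmap, ",width," by ",height); end draw;
end bitmap_graphic;

class postscript_graphic;              -- hypothetical class, for illustration only
    procedure create(source_text);     -- wraps a piece of Postscript source
    procedure draw();                  -- 'draws' the graphic by printing a message
end postscript_graphic;

class body postscript_graphic;
    var source;                        -- instance variable
    procedure create(source_text); source := source_text; end create;
    procedure draw(); print("Postscript graphic of ",#source," characters"); end draw;
end postscript_graphic;

program test;
    use bitmap_graphic;
    use postscript_graphic;
    for x in [bitmap_graphic(640,480), postscript_graphic("newpath 0 0 moveto")] loop
        x.draw();                      -- the class of x selects the draw method used
    end loop;
end test;
```

The loop body contains a single call, x.draw(), yet a different procedure runs on each pass, since the class of the value currently held by x determines which draw method is applied.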
An object of a given class is created with a call to a create method that must be declared and defined within the class and its class body if any instances of the class are ever to be created. Such a method must appear in both the class specification and class body, since it must be visible outside the class body. (But if the class is used only as a superclass of other classes, no create method need be declared or defined for it.) Since create calls must specify the class to which the newly created object is to belong, they have a special syntax, which uses the name of the class as a surrogate creation routine name for that class. For example, if we have an object class rational_number (introduced to make rational numbers, which are not built into SETL, available), an object of that class might be created by writing
three_fourths := rational_number(3,4); -- create the fraction 3/4
As this example shows, a create method can accept parameters and use them to initialize the created instance. If this is done, the specified number of actual parameters must be attached to the creation call and will then be passed to create.
The number of actual parameters of the creation call must agree with the formal parameters in create. As part of the creation process, all of the initialization clauses attached to its instance variable declarations are executed.
Note that 'create' procedures need not return anything. This is because when create receives control, the object (self) to which it implicitly refers has been created by the interpreter and the instance variables of this object will already have been initialized (at least partially) by the initialization clauses on the instance variable declarations. The create procedure is invoked after this preliminary initialization, and has the responsibility of completing initialization (in complex cases this may require creation of various ancillary objects). When create terminates, self is automatically returned; any other value returned by create will be ignored.
'Create' is not a reserved word. It is only special in that if a method named create is present in a class specification it will be called implicitly at the time an object is created by using the class name of the desired object.
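The order of events during object creation can be sketched with a small hypothetical class. The class name account, its variables, and the method opened are invented for this illustration; the sketch assumes only the declaration forms and the create convention described above.

```setl
class account;                         -- hypothetical class, for illustration only
    procedure create(owner_name);      -- called implicitly by account(...)
    procedure opened();                -- returns the number of accounts created so far
    procedure selfstr();               -- string conversion routine
end account;

class body account;
    class var accounts_opened := 0;    -- one copy, shared by all instances
    var owner;                         -- one copy per instance, set by create
    var balance := 0;                  -- initialized before create is called

    procedure create(owner_name);
        owner := owner_name;           -- balance already holds 0 at this point
        accounts_opened := accounts_opened + 1;
    end create;                        -- nothing is returned explicitly

    procedure opened(); return accounts_opened; end opened;

    procedure selfstr(); return owner + ": " + str(balance); end selfstr;
end account;

program test;
    use account;
    a := account("Ann");               -- the class name acts as the creation routine
    b := account("Bob");
    print(a);
    print(b);
    print(b.opened());                 -- reads the shared class variable
end test;
```

Each call account(...) supplies exactly one actual parameter, matching the single formal parameter of create, and create itself contains no return statement because the new instance is returned automatically.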
SETL's built-in operators can be overloaded, allowing them to invoke methods defined on classes. For example, we can define '+' and '*' operators for our hypothetical rational_number class, allowing us to write code like
three_fourths := rational_number(3,4);
two_thirds := rational_number(2,3);
print(three_fourths * three_fourths + two_thirds * two_thirds);

Clearly the SETL interpreter will not know how to add or multiply two objects of some newly defined class unless an addition or multiplication method is explicitly defined for the class. To define a multiplication method for the class rational_number, something like the following could be placed in the class body:
procedure self * x;    -- multiplication operator for rational numbers
    return rational_number(numerator * x.numerator,denominator * x.denominator);
end;
We now detail the conventions that apply to such operator overloading.
To overload one of SETL's standard binary operators, for example '+' or max, we use the operator itself in a procedure header. Two overload methods can be defined for each of SETL's built-in binary operators. These have headers of the following forms:
procedure self binary-op second_arg;
or
procedure second_arg binary-op self;

The reason for allowing both of these forms is that an object can appear as either the left or the right hand operand of a binary operator. SETL operations often expect both operands to be of the same type, but there are exceptions. For example, the built-in * operator can operate on integers and strings or integers and tuples, in which case the operands can appear in either order. We allow the two forms of binary operators to enable the same sort of thing for general objects. The first form above is used if the left operand determines the method used, in which case the left operand will become the current instance. Otherwise the second method will be used and the right operand will become the current instance.
In deciding how to process a binary operation the SETL interpreter gives precedence to the left operand. That is, it goes through the following steps before invoking a method of this kind:
Each of these methods should return a value, although that is not enforced by the system. If no other value is returned, then OM will be returned. If an object of some class is to be returned the method will need to create, initialize, and then return a new instance.
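The two header forms can be sketched with a hypothetical class money that may be added to an integer in either order. The class and its names are invented for illustration. The sketch makes two assumptions that the text above does not settle: that operator methods need only appear in the class body, as in the rational_number multiplication example, and that the built-in type test is_integer is available.

```setl
class money;                           -- hypothetical class, for illustration only
    procedure create(c);               -- creates an amount of c cents
    procedure selfstr();               -- string conversion routine
end money;

class body money;
    var cents;                         -- instance variable

    procedure create(c); cents := c; end create;

    procedure self + x;                -- left form: the left operand is a money object
        if is_integer(x) then return money(cents + x); end if;
        return money(cents + x.cents); -- otherwise x is assumed to be a money object
    end;

    procedure x + self;                -- right form: only the right operand is a money object
        return money(x + cents);
    end;

    procedure selfstr(); return str(cents) + " cents"; end selfstr;
end money;

program test;
    use money;
    a := money(250);
    print(a + 100);                    -- left form, integer on the right
    print(100 + a);                    -- right form, integer on the left
    print(a + a);                      -- left form, both operands are money objects
end test;
```

Both methods build and return a fresh instance rather than modifying the current one, which is the pattern recommended above for operators whose result is itself an object.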
Here is a complete list of SETL's standard binary operators.
+ - * / ** mod min max with less lessf npow
All of SETL's standard unary operators also allow associated methods to be defined. The headers for these methods have the following form
procedure unary-op self;

In this case, the method used will always be determined by the class of the value of the operand, which will become the current instance. Each of these methods should return a value, otherwise OM will be used. If an object of some class is to be returned the method will need to create, initialize, and then return a new instance.
Here is a complete list of SETL's standard unary operators.
- # arb domain range pow
Many, but not all of SETL's relational operators can be overloaded. We do not allow overloading of the equality and inequality operators, since these primitive operations play a fundamental role in the definition of set and map membership. We also restrict the overloading of comparison operators, since the SETL code generator assumes that a < b if b > a. Therefore, we allow a < method but do not allow a > method. For each operation we invoke the < method, but in the expression a < b, a will become the current instance, while in a > b, b will become the current instance.
Because of these restrictions, only < and 'in' can have associated methods. The < method allows two forms (left and right) just like other binary operators and follows the same rules. The 'in' operator also has two forms, but we give precedence to the right operand when determining the method to be invoked for an 'in' expression.
Any method associated with < or 'in' in this way must return either true or false.
The < method will be called when any of the expressions: a < b, a <= b, a > b, or a >= b is encountered. The expression a <= b is interpreted as a < b or a = b.
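A minimal sketch of a < method follows, using a hypothetical class version that orders two-part version numbers. The class and its names are invented for illustration, and the sketch again assumes that an operator method need only appear in the class body.

```setl
class version;                         -- hypothetical class, for illustration only
    procedure create(a,b);             -- creates version a.b
    procedure selfstr();               -- string conversion routine
end version;

class body version;
    var major, minor;                  -- instance variables

    procedure create(a,b); major := a; minor := b; end create;

    procedure self < x;                -- must return true or false
        return major < x.major or (major = x.major and minor < x.minor);
    end;

    procedure selfstr(); return str(major) + "." + str(minor); end selfstr;
end version;

program test;
    use version;
    v1 := version(1,4);
    v2 := version(2,0);
    print(v1 < v2);                    -- v1 is the current instance
    print(v1 > v2);                    -- same method, with v2 as the current instance
    print(v1 <= v2);                   -- interpreted as v1 < v2 or v1 = v2
end test;
```

Only one comparison method is written, yet all four comparison expressions become usable on version objects.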
SETL provides four expressions for referring to portions of maps, tuples or strings, namely f(x), f{x}, f(i..j) and f(i..). Each of these expressions can appear in both left and right hand side contexts. The ability to overload these syntactic constructs is particularly valuable, since it enables us to define our own new aggregates organized any way we like, while retaining the ability to access components of those aggregates elegantly.
Each of these expression forms allows two associated methods, one for appearances of the form in left hand contexts and the other for right hand appearances. The syntax of the headers for these methods is as follows:
procedure self(id1);
procedure self(id1) := id2;
procedure self{id1};
procedure self{id1} := id2;
procedure self(id1..id2);
procedure self(id1..id2) := id3;
procedure self(id1..);
procedure self(id1..) := id2;
Each of the methods used in right hand side contexts (those whose headers do not include an := symbol) should return a value. Otherwise OM will be returned. The methods for left hand contexts should not return anything, but should use their rightmost argument to modify the current instance.
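As an illustration of the f(x) pair of methods, the following sketch defines a hypothetical map-like class default_map that reports a fixed default value for keys that have no entry. The class and all of its names are invented for this example, which assumes that the indexing methods need only appear in the class body.

```setl
class default_map;                     -- hypothetical class, for illustration only
    procedure create(d);               -- d is the value reported for missing keys
end default_map;

class body default_map;
    var default_value;                 -- set by create
    var entries := {};                 -- ordinary SETL map holding the explicit entries

    procedure create(d); default_value := d; end create;

    procedure self(key);               -- right hand side use: m(key)
        if entries(key) = OM then return default_value; end if;
        return entries(key);
    end;

    procedure self(key) := new_value;  -- left hand side use: m(key) := new_value
        entries(key) := new_value;     -- modifies the current instance, returns nothing
    end;
end default_map;

program test;
    use default_map;
    m := default_map(0);
    m("apples") := 3;
    print(m("apples"));
    print(m("pears"));                 -- no entry, so the default value is reported
end test;
```

The test program reads and writes m with ordinary map syntax, although m is an object rather than a built-in SETL map.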
SETL provides three extraction/deletion operations:
from, fromb, and frome, and object methods can be defined for any of these. The syntax of the corresponding method headers is:
procedure from self;
procedure fromb self;
procedure frome self;
Notice that there is no left operand in these headers, even though each is a binary operator. The left operand in a deletion operation is written but not read. Whatever value these methods return will be assigned to the left operand as well as the target operand. Each must return a value, or OM will be used.
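A sketch of a 'from' method for a hypothetical queue class follows. The class work_queue and its names are invented for illustration. The sketch assumes that the method need only appear in the class body and that changes it makes to its instance variables are carried back to the object, in the way described earlier for self.

```setl
class work_queue;                      -- hypothetical class, for illustration only
    procedure create(initial_items);   -- creates a queue holding the given tuple of items
end work_queue;

class body work_queue;
    var items;                         -- tuple of waiting items, oldest first

    procedure create(initial_items); items := initial_items; end create;

    procedure from self;               -- invoked by: x from q
        var oldest;
        oldest fromb items;            -- built-in extraction from the front of the tuple
        return oldest;                 -- this value is assigned to the left operand
    end;
end work_queue;

program test;
    use work_queue;
    q := work_queue(["first job","second job"]);
    x from q;
    print(x);
end test;
```

When the queue is empty the built-in fromb yields OM, so the method then returns OM as well.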
SETL provides several syntactic constructs which call for iteration over an aggregate. For example, on encountering the expression
{x in S | x < 5}
the interpreter will iterate over S, screening each element by applying the condition x < 5 and collecting all values which satisfy that condition. Iterators are used in set and tuple forming expressions, 'for' loops, and quantified expressions. Three general forms of iterators are provided:
expression1 in expression2
expression1 = expression2 (expression3)
expression1 = expression2 { expression3}
Note that the expression y = f(x) is equivalent to [x,y] in f, and so does not enter into the following discussion.
SETL allows two pairs of built-in methods, corresponding to the first and third of these syntactic constructs. These are
procedure iterator_start;
procedure iterator_next;
procedure set_iterator_start;
procedure set_iterator_next;
When the interpreter encounters code requiring an iteration over an object, it calls the iterator_start or set_iterator_start method, depending on whether the iterator was of the first or third form above. Then it repeatedly calls iterator_next or set_iterator_next to get successive elements of the object.
The iterator_start and set_iterator_start methods need not return a value. They are only used to let an object initialize an iteration. The iterator_next and set_iterator_next methods should return the next element x in the object, including it in a tuple [x] of one component if the iteration is to continue, or should return OM if iteration is to terminate.
If the iterator form is y = f(x), then the first pair of iterator methods will be used, but each value returned must be a pair, so each return statement within the method will look something like this:
return [[x,y]];
Notice the double brackets. The outer tuple indicates that a value was found, and the inner tuple is the pair of values required in this form of iteration.
If the iterator expression is y = f{x} then the second pair of iterator methods will be used. The return values must obey the same rules as for y = f(x) iterators.
None of the method names appearing in this subsection are reserved words. If not used as iterator methods, they can have any number of parameters and return anything you like. But if they are to be used for iteration, they must conform to the rules above, or the program will abort.
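The first pair of iterator methods can be sketched with a hypothetical class int_range that yields the integers from one bound to another. The class and its names are invented for illustration; the method headers follow the forms listed above.

```setl
class int_range;                       -- hypothetical class, for illustration only
    procedure create(a,b);             -- range of the integers a, a+1, ..., b
    procedure iterator_start;          -- prepares an iteration
    procedure iterator_next;           -- yields [x] for the next element, or OM when done
end int_range;

class body int_range;
    var low, high;                     -- bounds of the range
    var cursor;                        -- next value to be yielded

    procedure create(a,b); low := a; high := b; end create;

    procedure iterator_start;
        cursor := low;                 -- no value need be returned
    end iterator_start;

    procedure iterator_next;
        if cursor > high then return OM; end if;    -- terminates the iteration
        cursor := cursor + 1;
        return [cursor - 1];           -- element wrapped in a tuple of one component
    end iterator_next;
end int_range;

program test;
    use int_range;
    r := int_range(3,6);
    print([x * x : x in r]);           -- tuple former iterating over the object
    for x in r loop
        print(x);
    end loop;
end test;
```

Wrapping each element in a one-component tuple is what lets an iteration deliver OM itself as an element, since [OM] and OM are distinguishable return values.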
Objects are printed by first calling the built-in procedure str, then printing the string that results. A default string is always produced when str is applied to an object of a user-defined class, but this is mainly useful for debugging. (It lists all the instance variables, but in an ugly format.) This default string conversion can be overridden with something more elegant by furnishing a class with a method having the name 'selfstr', declared with the following header
procedure selfstr();
If this method is provided for a class it will be called when str is applied to objects of that class. 'selfstr' can return any value, but ideally should return a printable string version of the object.
The type of an object can be determined using SETL's built-in type procedure. The value returned will be the name of the object's class as an upper case character string. SETL is not case sensitive, but always keeps names as upper case.
Suppose that we introduce the following simple object class and then run the test program shown:
class simple;    -- simple demonstration object class
    procedure create(n);    -- creates object with given internal value
    procedure setval(n);    -- sets internal value
    procedure selfstr();    -- string conversion routine
end simple;

class body simple;    -- simple demonstration object class
    var val;    -- internal value
    procedure create(n); val := n; end create;    -- creates object with given internal value
    procedure setval(n); val := n; end setval;    -- sets internal value
    procedure selfstr(); return str(val); end selfstr;    -- string conversion routine
end simple;

program test;
    use simple;
    x := y := simple(1); x.setval(2); print(x," ",y);
end test;
The value printed by the test code is
2 1
This is because SETL objects introduced by class definitions, like built-in SETL objects, have strict value semantics: the operation x.setval(2), which changes the object x, must by definition have no effect on the object y, any more than it would have if we write
For this reason the SETL system creates a fresh copy of x before executing x.setval(2), preventing this operation or others like it from affecting y.
The same effect is seen in the test code
program test;
    use simple;
    x := {y := simple(1)}; y.setval(2); print(x," ",y);
end test;
which for the same reason produces the output
This is not always the effect wanted (and one wants this effect less often in object settings than in dealing with ordinary SETL values). Suppose, for example, that our objects represent the employees of a multi-department firm, whose departments are then represented by sets of employees, and that an operation like x.setval(n) is used to change an employee's salary when they get a raise. SETL's value-semantics rule would then imply that this operation would have no effect on the summed value of salaries over the set of all employees in a department, since the set of such employees would always contain objects with unchanged salary values. It is as if the operation x.setval(n) created an entirely new employee, with the new salary, while leaving the former employee unchanged in all contexts which the name 'x' does not directly identify. This strict value-semantics convention makes 'salary' part of an employee's very identity, clearly not what is wanted. These are situations in which pointer semantics, not value semantics, should be employed.
To create SETL objects which display pointer rather than value semantics in regard to certain modifiable attributes, one can set the attribute items stored internally in the objects to unchanging atoms p, created by the creation routines of the objects themselves, and then attach the attribute values to these attribute atoms (which accordingly function as 'pointers') using the special global SETL map '^'. To modify our example class in this way we would write
class simple;    -- simple demonstration object class
    procedure create(n);    -- creates object with given internal value
    procedure setval(n);    -- sets internal value
    procedure selfstr();    -- string conversion routine
end simple;

class body simple;    -- simple demonstration object class
    var val;    -- internal value
    procedure create(n); val := newat(); ^val := n; end create;    -- creates object with given internal value
    procedure setval(n); ^val := n; end setval;    -- sets internal value
    procedure selfstr(); return str(^val); end selfstr;    -- string conversion routine
end simple;
Now the output produced is
showing that our modified objects have pointer semantics.
A method can be used as a first class object just as a procedure can, but only when an implicit instance variable is first attached to it. The procedure-like value of a method which would be called as
can be captured by writing
Such a value can then be called in the same way as any other procedure, e.g. by writing
This expression will invoke 'method' with the value of x as the current instance.
This semantic rule is consistent with that applied when SETL forms procedure closures. Recall from Section XXX that whenever a procedure value is returned that otherwise would reference some variable defined outside its own body, SETL saves the environment of that procedure, which is to say saves as much as necessary of the current activation of all enclosing procedures. Since, inside a class body, the current instance is part of a method's environment, we bind the current instance to the method and save the combination as a closure-like procedure value. It follows that object methods cannot be used as first-class objects until an associated object instance is bound with them.
Suppose that we need to do an extensive calculation using rational numbers, which SETL does not provide as built-in objects. SETL's object class facility allows us not only to introduce objects of this kind, but to specify that the standard infix operators +, -, *, /, and also comparison operators like < and <= will be used in standard fashion to combine and compare rational numbers. Once this is done we can write arithmetic expressions for rational numbers in just the same way as we do for integers.
The following class definition accomplishes this. Since fractions are represented mathematically simply as pairs of integers, the way to set up this class is rather obvious. Each fraction (i.e. object of 'rational' type) is defined by two internal instance variables num and den (its numerator and denominator, from which we always eliminate any common factor, so that they are always in 'lowest terms'). The code for the various arithmetic operators and comparisons seen in the class body simply reflects the standard mathematical rules for operating with fractions. All these operators are written so that they can combine either fractions with fractions, or fractions with integers, in either order. Whenever a new fraction is formed, we use the internal 'simplify' routine to reduce it to lowest terms. The 'str' procedure is redefined so as to give rational numbers their standard printed form 'num/den'. Since division of rationals by integers is available, the create routine simply takes a single integer numerator as parameter, and we build fractions like 2/3 by writing 'rational(2)/3'.