Chapter Contents.
12.1 General Introduction to the Functionality and Behavior of SETL Databases. SETL includes a powerful facility for database design and implementation. Associated with this is an extended facility for prototyping and implementing complex commercial applications, based on the notion of 'hibernating transactions' which can be resident for long periods of time in the database itself. These facilities are fully transactional, in the sense explained below, and also facilitate prototyping of and implementation of large distributed databases.
SETL databases provide an alternative to standard SQL databases. Each SETL database object is a potentially very large (up to hundreds of gigabytes) abstract object stored on disk. Each such database object resembles a map from a set of system generated 4- or 5-byte record identifiers to the records which the database stores, each of which is a fully general object of the SETL language, but which is itself not excessively large (e.g. no more than a few dozen megabytes.) The database records are systematically indexed for efficient access, using the abstract technique described below. Once a record of a SETL database has been accessed, all the operations of the SETL language can be applied to it. Conversely, any standard SETL object can be stored as a database record.
The operations specific to SETL databases (other than standard SETL operations applicable to individual records) are as follows:
x := db(id); -- find a database record given its identifier db(id) := x; -- store x as a database record, associating it with the specified identifier db with x -- create a new record and put x into it. -- a pair of the form [modified_db,id_of_new_record] is returned. db.to_string; -- database-associated mapping of records to -- reduced string variants ('summary strings') -- used for indexing; see comment below db.contains(wd) -- creates and returns a database iterator object, which iterates -- over all records whose summary string contains the given word
Iterations 'x in iter_obj' over these iterator objects return the ordered sequence of keys of all records containing the specified word (in a standard key-collating order). These iteration objects support the operations:
#iter_obj (number of keys in an iteration group) arb(iter_obj) (first item in the iteration group, or OM if none) iter_obj + iter_obj (union iteration) iter_obj * iter_obj (iterate over intersection), iter_obj - iter_obj (iterate over difference)Internally, the database 'records' are represented by long string sections, separated by special marks representing each record's id. The detailed layout of the 'id-and-record' file used to store this information is described below.
The SQL view of database semantics has become pervasive. However, the alternative approach described in this chapter may have significant advantages. To begin to see how these to approaches compare, note that the SQL database records ordinarily depicted as
Name | Age | Depart. | Salary | Employed | Posn. |
John Peter Doe | 43 | 063 | 21000 | 6/7/79 | Clerk |
Mary M. Smith | 25 | 063 | 45000 | 9/1/91 | Mgr. |
can be converted to the strings (used only for searching in the SETL version of the same database)
by a function f sophisticated enough to recognize middle names and treat them in a special way. Then the operations listed above would allow the crucial core of database searches, e.g. a search for the manager of the employee John Doe, to be written as follows:
{name(y): x in contains("Doe"), y in contains(department(x)) | posn(y) = "Mgr." & "John" in words_of(string_of(x))}.More properly, this returns all the managers in any department containing an employee called John Doe, along with whatever 'accidental' data might occasionally result from accidental combinations of John and Doe, as in a conceivable department that manufactured Doe Skin gloves and had an employee named Wallace John, or John Johnson. But this sort of thing should be quite rare, and can in any case be detected by forming and filtering or displaying the set of pairs
{[x,y]: x in contains("Doe"), y in contains(department(x)) | posn(y) = "Mgr." & "John" in words_of(string_of(x))}.in an appropriate format.
This example shows the felt advantage of the approach proposed: (a) it separates the optimization of database queries rather cleanly from their abstract definition; (b) no special statement of the fields to be indexed is required; (c) it allows arbitrary strings to appear in fields, and is prepared to search using concordances of the words in these fields, which is often an effective search technique. (E.g. we can just as well find the manager of "Peter Doe", or "J. Peter Doe", a task which might well vex an SQL database.)
Indexing of SETL databases. Each database has an administrator-defined 'to_string(record)' method which can be applied to each of its records and which yields a 'record summary string' defining those aspects of the record which are to be indexed to improve the efficiency of searches over the database. By cutting these summary strings into words (simply by cutting them at all included whitespaces) we generate an associated internal word_list(record). These word lists are automatically used to build a 'word index' associated with the database. Iterations over the db.contains(wd) iterator objects described above exploit this index.
Transaction handling in SETL databases - Basics. SETL databases, like all other SETL objects, have value semantics. For example, if db is a database object, then the sequence of statements
changes db by adding a new record, but leaves db2 unchanged since no new assignment has been made to it. The SETL database implementation creates the 'logical copies' that this requires in an efficient way, whose cost is only weakly dependent of the size of the database objects involved.
This gives an easy way of obtaining transaction-like effects. To begin a transaction which may have to be aborted, simply write
Then edit db as desired. To 'commit' the transaction, simply erase the copy by writing db_copy := OM. To back out of the transaction instead, simply write db := db_copy; this will restore db to its original state.
Commercial 'transactional' database systems provide several important transaction-related capabilities in addition to the basic transaction-backout ability noted above. These are: (i) guaranteed recovery after system failure; (ii) distributed execution, with guaranteed coordination of multiple distributed files, and guaranteed recovery after failure of any of the components of a distributed system; (iii) parallel execution of multiple independent transactions, with guaranteed isolation of each transaction from the others (serializability.) These fundamental capabilities are also supported by the SETL database system.
We model the system failures to be guarded against by assuming that:
If any of these assumptions are violated in an actual physical system, a lower supporting layer of physical and software redundancy will be needed. This must implement a 'virtual system' satisfying assumptions (a-c).
Recovery after system failure is guaranteed if we can always restart the system after crash by reading the disk and recovering to a process state equivalent to one reached previously to, and not too distant from, the moment of crash. We call system states to which we could recover 'checkpoints', and can think of them as being marked by a statement
checkpoint();
Such checkpoint operations must reliably create something logically equivalent to a complete, self-standing copy of the database, but, since they must be reasonably frequent they must not be too expensive. Recovery is then possible if (i) we can always accept a crash event which destroys RAM and leaves a half-written record on the disk, as explained above; (ii) we can write a top-level 'handler' for this event which reads the state of the disk and brings us back to a system state fully equivalent to that reached immediately prior to execution of the last previous checkpoint() call.
SETL databases support this functionality. The technique used to accomplish this is described in a later section.
A First Example of SETL Database Use. The following example illustrates the fundamental database operations described above, and can be executed to test them. In it we create a very tiny 'database' containing just a few records and apply the main database operations to it. Note that the elements of the 'database' are simply SETL pairs [name, age], but that the database-associated mapping 'pair_to_stg' sends the pairs into summary string of the form 'N:name A:age'.
program test; -- a first test program for the SETL database class use SETL_database; -- use the class of SETL database objects db := SETL_database(pair_to_stg); -- create an empty database, to store test pairs [db,jack_id] := db with ["Jack",71]; -- add a first pair print("Jack record: ",db(jack_id)); [db,diana_id] := db with ["Diana",43]; -- add a second pair [db,rachel_id] := db with ["Rachel",43]; -- add a third pair print("\n*** Initial Iteration test - records containing 'A:43' ***"); for idx in db.contains("A:43") loop print(db(idx)); end loop; print("\n*** Testing empty arb - records containing 'Walter' ***"); print(arb(db.contains("N:Walter"))); print("\n*** Initial Union test - records containing 'Jack' or 'Diana' ***"); for key in (jd_recs := (db.contains("N:Jack") + db.contains("N:Diana"))) loop print(db(key)); end loop; print("\n*** Iterate again ***"); for key in jd_recs loop print(db(key)); end loop; print("testing arb for union: list a record containing 'Jack' or 'Diana': ", db(arb jd_recs)); print("\n*** Intersection test - records containing '43' and 'Diana' ***"); for key in (d43_recs := (db.contains("A:43") * db.contains("N:Diana"))) loop print(db(key)); end loop; print("\n*** Intersection test - records containing '43' and 'Rachel' ***"); for key in (r43_recs := (db.contains("A:43") * db.contains("N:Rachel"))) loop print(db(key)); end loop; print("\n*** Difference test - records containing '43' and not 'Diana' ***"); for key in (nd43_recs := (db.contains("A:43") - db.contains("N:Diana"))) loop print(db(key)); end loop; print("\n*** Difference test - records containing '43' and not 'Rachel' ***"); for key in (nr43_recs := (db.contains("A:43") - db.contains("N:Rachel"))) loop print(db(key)); end loop; print("testing arb for union: list a record containing 'Jack' or 'Diana': " ,db(arb jd_recs)); -- **** print("testing arb for Intersection: list a record containing '43' and 'Rachel': " ,db(arb r43_recs)); print("testing arb for Intersection: list a record containing '43' and 'Diana': " ,db(arb d43_recs)); print("testing arb for Intersection: list a record containing 'Rachel' and 'Diana': " ,arb(db.contains("N:Rachel") * db.contains("N:Diana"))); -- try various iterator-based retrievals print("testing arb: list a record containing 'Jack': " ,db(jack_key := arb(db.contains("N:Jack")))); print("testing arb: list a record containing 'Diana': " ,db(arb(db.contains("N:Diana")))); print("testing arb: list a record containing 'A:71': " ,db(arb(db.contains("A:71")))); print("testing arb: list a record containing 'A:43': " ,db(arb(db.contains("A:43")))); print("testing count: count records containing 'A:43': " ,#db.contains("A:43")); -- now try some elementary database edits print("testing change of record, set ['Jack',71] to ['Jack',72]"); db(jack_key) := ["Jack",72]; -- another year gone by print("testing arb after change: list a record containing 'N:Jack': ", db(arb(db.contains("N:Jack")))); print("testing arb after change: list a record containing 'A:71': " ,arb(db.contains("A:71"))); print("testing arb after change: list a record containing 'A:71': " ,if (a_rec := arb(db.contains("A:71"))) /= OM then db(a_rec) else "Non-existent ***" end if); print("testing arb after change: list a record containing 'A:72': " ,if (a_rec := arb(db.contains("A:72"))) /= OM then db(a_rec) else "Non-existent ***" end if); print("testing arb after change: list a record containing 'N:Diana': " ,db(arb(db.contains("N:Diana")))); print("testing arb after change: list a record containing 'A:43': " ,if (a_rec := arb(db.contains("A:43"))) /= OM then db(a_rec) else "Non-existent ***" end if); print("\nadding ['Jack', 6]: "); [db,younger_id] := db with ["Jack", 6]; -- add a younger jack print("testing the basic iterators again: list records containing '6'"); for key in (jack_recs := db.contains("A:6")) loop print(db(key)); end loop; print("*** Print all records containing 'Jack' ***"); for key in (jack_recs := db.contains("N:Jack")) loop print(db(key)); end loop; print("\n*** Deleting a record -- the 'Jack' with A:72 ***"); db(jack_id) := OM; print("*** Print all records containing 'Jack' after deletion ***"); for key in (jack_recs := db.contains("N:Jack")) loop print(db(key)); end loop; print("\n*** Restoration Test - Restoring ['Jack',71]"); [db,jack_id2] := db with ["Jack",71]; -- the other Jack again print("*** Print all records containing 'Jack' after restoration ***"); for key in (jack_recs := db.contains("N:Jack")) loop print(db(key)); end loop; print("*** Add large group of new records ***"); for j in [1..20] loop -- add sixty new records [db,-] := db with ["Jack",j]; [db,-] := db with ["Diana",j]; [db,-] := db with ["Rachel",j]; end loop; print("records with 'A:3'"); print(str([db(x): x in (jnd_recs := db.contains("A:3"))])); print("records with 'N:Diana'"); print(str([db(x): x in (jnd_recs := db.contains("N:Diana"))])); print("\n*** Union test - records containing 'Jack' or 'Diana' ***"); print([db(key): key in (jd_recs := (db.contains("N:Jack") + db.contains("N:Diana")))]); print("\n*** Union test - records containing 'Rachel' or 'Diana' ***"); print([db(key): key in (jd_recs := (db.contains("N:Rachel") + db.contains("N:Diana")))]); print("\n*** Intersection test - records containing '43' and 'Diana' ***"); for key in (jd_recs := (db.contains("A:43") * db.contains("N:Diana"))) loop print(db(key)); end loop; print("\n*** Intersection test - records containing '43' and 'Rachel' ***"); for key in (jd_recs := (db.contains("A:43") * db.contains("N:Rachel"))) loop print(db(key)); end loop; print("\n*** Difference test - records containing '43' and not 'Diana' ***"); for key in (jd_recs := (db.contains("A:43") - db.contains("N:Diana"))) loop print(db(key)); end loop; print("\n*** Difference test - records containing '43' and not 'Rachel' ***"); for key in (jd_recs := (db.contains("A:43") - db.contains("N:Rachel"))) loop print(db(key)); end loop; print("recs with Jack and '6': "); -- this adds a second ["Jack,6] record, with a different id if (jack_rec6 := arb(db.contains("N:Jack") * db.contains("A:6"))) /= OM then print(db(jack_rec6)); end if; print("\n*** Difference test ***"); print("recs with Jack but not '6'"); print([db(key): key in (jn6_recs := db.contains("N:Jack") - db.contains("A:6"))]); print("\n*** Intersection test ***"); print("recs with Jack and '6'"); for x in (jnd_recs := db.contains("N:Jack") * db.contains("A:6")) loop print(db(x)); end loop; print("recs with Jack and '72': "); print(if (jack_rec72 := arb(db.contains("N:Jack") * db.contains("A:72"))) /= OM then db(jack_rec72) else "Nonexistent" end if); print("\n*** Deleting a record -- the 'Jack' with A:10 ***"); jack_rec10 := arb(db.contains("N:Jack") * db.contains("A:10")); db(jack_rec10) := OM; print("remaining recs with Jack"); print([db(key): key in (jack_recs := db.contains("N:Jack"))]); print("\n*** Restoration Test - Restoring ['Jack',10]"); [db,jack10_id] := db with ["Jack",10]; -- the other Jack again print("Records with 'Jack' but not 'Diana'"); print([db(key): key in (jnd_recs := db.contains("N:Jack") - db.contains("N:Diana"))]); print("\n*** Deleting another record -- one of the two 'Jack's with A:6 ***"); print("Deleting ",db(lit_jack_rec := arb(db.contains("N:Jack") * db.contains("A:6")))); db(lit_jack_rec) := OM; -- remove this younger 'Jack' print("All remaining records with 'Jack'"); print([db(key): key in (jack_recs := db.contains("N:Jack"))]); for j in [1..20] loop -- add twenty new records [db,-] := db with ["Jack",100 + j]; [db,-] := db with ["Diana",100 + j]; [db,-] := db with ["Rachel",100 + j]; end loop; print("records with 'A:103'"); print(str([db(x): x in (all3_recs := db.contains("A:103"))])); print("records with 'N:Diana'"); print(str([db(x): x in (jnd_recs := db.contains("N:Diana"))])); print("records with 'A:3' but not 'N:Diana'"); print(str([db(x): x in (jnd_recs := db.contains("A:3") - db.contains("N:Diana"))])); print("records with 'N:Diana' but not 'A:3'"); print(str([db(x): x in (jnd_recs := db.contains("N:Diana") - db.contains("A:3"))])); print("records with 'N:Diana' and 'A:3'"); print(str([db(x): x in (jnd_recs := db.contains("N:Diana") * db.contains("A:3"))])); print("records with 'N:Diana' or 'A:3'"); print(str([db(x): x in (jnd_recs := db.contains("N:Diana") + db.contains("A:3"))])); print("\n ***END OF TESTS ***"); end test;The output produced by this program is as follows:
Jack record: ["Jack", 71] test elimination of duplicates in union iteration [["Jack", 71]] *** Initial Iteration test - records containing 'A:43' *** ["Diana", 43] ["Rachel", 43] *** Testing empty arb - records containing 'Walter' *** <om> *** Initial Union test - records containing 'Jack' or 'Diana' *** ["Jack", 71] ["Diana", 43] *** Iterate again *** ["Jack", 71] ["Diana", 43] testing arb for union: list a record containing 'Jack' or 'Diana': ["Jack", 71] *** Initial Intersection test - records containing '43' and 'Diana' *** ["Diana", 43] *** Initial Intersection test - records containing '43' and 'Rachel' *** ["Rachel", 43] *** Initial Difference test - records containing '43' and not 'Diana' *** ["Rachel", 43] *** Initial Difference test - records containing '43' and not 'Rachel' *** ["Diana", 43] testing arb for union: list a record containing 'Jack' or 'Diana': ["Jack", 71] testing arb for Intersection: list a record containing '43' and 'Rachel': ["Rachel", 43] testing arb for Intersection: list a record containing '43' and 'Diana': ["Diana", 43] testing arb for Intersection: list a record containing 'Rachel' and 'Diana': <om> testing arb: list a record containing 'Jack': ["Jack", 71] testing arb: list a record containing 'Diana': ["Diana", 43] testing arb: list a record containing 'A:71': ["Jack", 71] testing arb: list a record containing 'A:43': ["Diana", 43] testing count: count records containing 'A:43': 2 testing change of record, set ['Jack',71] to ['Jack',72] testing arb after change: list a record containing 'N:Jack': ["Jack", 72] testing arb after change: list a record containing 'A:71': <om> testing arb after change: list a record containing 'A:71': Non-existent *** testing arb after change: list a record containing 'A:72': ["Jack", 72] testing arb after change: list a record containing 'N:Diana': ["Diana", 43] testing arb after change: list a record containing 'A:43': ["Diana", 43] adding ['Jack', 6]: testing the basic iterators again: list records containing '6' ["Jack", 6] *** Print all records containing 'Jack' *** ["Jack", 72] ["Jack", 6] *** Deleting a record -- the 'Jack' with A:72 *** *** Print all records containing 'Jack' after deletion *** ["Jack", 6] *** Restoration Test - Restoring ['Jack',71] *** *** Print all records containing 'Jack' after restoration *** ["Jack", 6] ["Jack", 71] *** Add large group of new records *** records with 'A:3' [["Jack", 3], ["Diana", 3], ["Rachel", 3]] records with 'N:Diana' [["Diana", 43], ["Diana", 1], ["Diana", 2], ["Diana", 3], ["Diana", 4], ["Diana", 5], ["Diana", 6], ["Diana", 7], ["Diana", 8], ["Diana", 9], ["Diana", 10], ["Diana", 11], ["Diana", 12], ["Diana", 13], ["Diana", 14], ["Diana", 15], ["Diana", 16], ["Diana", 17], ["Diana", 18], ["Diana", 19], ["Diana", 20]] *** Union test - records containing 'Jack' or 'Diana' *** [["Diana", 43], ["Jack", 6], ["Jack", 71], ["Jack", 1], ["Diana", 1], ["Jack", 2], ["Diana", 2], ["Jack", 3], ["Diana", 3], ["Jack", 4], ["Diana", 4], ["Jack", 5], ["Diana", 5], ["Jack", 6], ["Diana", 6], ["Jack", 7], ["Diana", 7], ["Jack", 8], ["Diana", 8], ["Jack", 9], ["Diana", 9], ["Jack", 10], ["Diana", 10], ["Jack", 11], ["Diana", 11], ["Jack", 12], ["Diana", 12], ["Jack", 13], ["Diana", 13], ["Jack", 14], ["Diana", 14], ["Jack", 15], ["Diana", 15], ["Jack", 16], ["Diana", 16], ["Jack", 17], ["Diana", 17], ["Jack", 18], ["Diana", 18], ["Jack", 19], ["Diana", 19], ["Jack", 20], ["Diana", 20]] *** Union test - records containing 'Rachel' or 'Diana' *** [["Diana", 43], ["Rachel", 43], ["Diana", 1], ["Rachel", 1], ["Diana", 2], ["Rachel", 2], ["Diana", 3], ["Rachel", 3], ["Diana", 4], ["Rachel", 4], ["Diana", 5], ["Rachel", 5], ["Diana", 6], ["Rachel", 6], ["Diana", 7], ["Rachel", 7], ["Diana", 8], ["Rachel", 8], ["Diana", 9], ["Rachel", 9], ["Diana", 10], ["Rachel", 10], ["Diana", 11], ["Rachel", 11], ["Diana", 12], ["Rachel", 12], ["Diana", 13], ["Rachel", 13], ["Diana", 14], ["Rachel", 14], ["Diana", 15], ["Rachel", 15], ["Diana", 16], ["Rachel", 16], ["Diana", 17], ["Rachel", 17], ["Diana", 18], ["Rachel", 18], ["Diana", 19], ["Rachel", 19], ["Diana", 20], ["Rachel", 20]] *** Intersection test - records containing '43' and 'Diana' *** ["Diana", 43] *** Intersection test - records containing '43' and 'Rachel' *** ["Rachel", 43] *** Difference test - records containing '43' and not 'Diana' *** ["Rachel", 43] *** Difference test - records containing '43' and not 'Rachel' *** ["Diana", 43] recs with Jack and '6': ["Jack", 6] *** Difference test *** recs with Jack but not '6' [["Jack", 71], ["Jack", 1], ["Jack", 2], ["Jack", 3], ["Jack", 4], ["Jack", 5], ["Jack", 7], ["Jack", 8], ["Jack", 9], ["Jack", 10], ["Jack", 11], ["Jack", 12], ["Jack", 13], ["Jack", 14], ["Jack", 15], ["Jack", 16], ["Jack", 17], ["Jack", 18], ["Jack", 19], ["Jack", 20]] *** Intersection test *** recs with Jack and '6' ["Jack", 6] ["Jack", 6] recs with Jack and '72': Nonexistent *** Deleting a record -- the 'Jack' with A:10 *** remaining recs with Jack [["Jack", 6], ["Jack", 71], ["Jack", 1], ["Jack", 2], ["Jack", 3], ["Jack", 4], ["Jack", 5], ["Jack", 6], ["Jack", 7], ["Jack", 8], ["Jack", 9], ["Jack", 11], ["Jack", 12], ["Jack", 13], ["Jack", 14], ["Jack", 15], ["Jack", 16], ["Jack", 17], ["Jack", 18], ["Jack", 19], ["Jack", 20]] *** Restoration Test - Restoring ['Jack',10] *** [["Jack", 6], ["Jack", 71], ["Jack", 1], ["Jack", 2], ["Jack", 3], ["Jack", 4], ["Jack", 5], ["Jack", 6], ["Jack", 7], ["Jack", 8], ["Jack", 9], ["Jack", 11], ["Jack", 12], ["Jack", 13], ["Jack", 14], ["Jack", 15], ["Jack", 16], ["Jack", 17], ["Jack", 18], ["Jack", 19], ["Jack", 20], ["Jack", 10]] *** Deleting another record -- one of the two 'Jack's with A:6 *** Deleting ["Jack", 6] All remaining records with 'Jack' [["Jack", 71], ["Jack", 1], ["Jack", 2], ["Jack", 3], ["Jack", 4], ["Jack", 5], ["Jack", 6], ["Jack", 7], ["Jack", 8], ["Jack", 9], ["Jack", 11], ["Jack", 12], ["Jack", 13], ["Jack", 14], ["Jack", 15], ["Jack", 16], ["Jack", 17], ["Jack", 18], ["Jack", 19], ["Jack", 20], ["Jack", 10]] records with 'A:103' [["Jack", 103], ["Diana", 103], ["Rachel", 103]] records with 'N:Diana' [["Diana", 43], ["Diana", 1], ["Diana", 2], ["Diana", 3], ["Diana", 4], ["Diana", 5], ["Diana", 6], ["Diana", 7], ["Diana", 8], ["Diana", 9], ["Diana", 10], ["Diana", 11], ["Diana", 12], ["Diana", 13], ["Diana", 14], ["Diana", 15], ["Diana", 16], ["Diana", 17], ["Diana", 18], ["Diana", 19], ["Diana", 20], ["Diana", 101], ["Diana", 102], ["Diana", 103], ["Diana", 104], ["Diana", 105], ["Diana", 106], ["Diana", 107], ["Diana", 108], ["Diana", 109], ["Diana", 110], ["Diana", 111], ["Diana", 112], ["Diana", 113], ["Diana", 114], ["Diana", 115], ["Diana", 116], ["Diana", 117], ["Diana", 118], ["Diana", 119], ["Diana", 120]] records with 'A:3' but not 'N:Diana' [["Jack", 3], ["Rachel", 3]] records with 'N:Diana' but not 'A:3' [["Diana", 43], ["Diana", 1], ["Diana", 2], ["Diana", 4], ["Diana", 5], ["Diana", 6], ["Diana", 7], ["Diana", 8], ["Diana", 9], ["Diana", 10], ["Diana", 11], ["Diana", 12], ["Diana", 13], ["Diana", 14], ["Diana", 15], ["Diana", 16], ["Diana", 17], ["Diana", 18], ["Diana", 19], ["Diana", 20], ["Diana", 101], ["Diana", 102], ["Diana", 103], ["Diana", 104], ["Diana", 105], ["Diana", 106], ["Diana", 107], ["Diana", 108], ["Diana", 109], ["Diana", 110], ["Diana", 111], ["Diana", 112], ["Diana", 113], ["Diana", 114], ["Diana", 115], ["Diana", 116], ["Diana", 117], ["Diana", 118], ["Diana", 119], ["Diana", 120]] records with 'N:Diana' and 'A:3' [["Diana", 3]] records with 'N:Diana' or 'A:3' [["Diana", 43], ["Diana", 1], ["Diana", 2], ["Jack", 3], ["Diana", 3], ["Rachel", 3], ["Diana", 4], ["Diana", 5], ["Diana", 6], ["Diana", 7], ["Diana", 8], ["Diana", 9], ["Diana", 10], ["Diana", 11], ["Diana", 12], ["Diana", 13], ["Diana", 14], ["Diana", 15], ["Diana", 16], ["Diana", 17], ["Diana", 18], ["Diana", 19], ["Diana", 20], ["Diana", 101], ["Diana", 102], ["Diana", 103], ["Diana", 104], ["Diana", 105], ["Diana", 106], ["Diana", 107], ["Diana", 108], ["Diana", 109], ["Diana", 110], ["Diana", 111], ["Diana", 112], ["Diana", 113], ["Diana", 114], ["Diana", 115], ["Diana", 116], ["Diana", 117], ["Diana", 118], ["Diana", 119], ["Diana", 120]] ***END OF TESTS ***The following comments will aid in understanding this output. We begin by creating a database, into which a first record '["Jack",71]' is inserted and then retrieved. (The summarization routine for the database simply maps such records into strings like "N:Jack A:71", ignoring all but the first two components of each record.) Next two additional records, '["Diana",43]' and '["Rachel",43]' are inserted, and then retrieved using the iterator 'db.contains("A:43")' (which iterates over both of these records, since both contain '43' as their second component.) The empty iterator 'db.contains("N:Walter")' is then used in combination with the 'arb' operation, which returns OM since no record contains the name 'Walter'.
Next we iterate over all records containing either of the names 'Jack' or 'Diana', The union iterator 'db.contains("N:Jack") + db.contains("N:Diana")' is saved in a variable and then used a second time to show that this is possible, and a third time in combination with the 'arb' operation.
Next the corresponding intersection and difference iterators are invoked, either to support iterations or in connection with 'arb' or the count operator '#' and seen to behave as expected. After this, we use the database record-change operation db(x) := y to change a record, which is then retrieved in changed form, using an iterator, while the previous record is seen to be absent, and other records in the database are seen to be unchanged. Then a second record '["Jack", 6]' containing the name "Jack" is added to the databases, and retrieved, first using 'db.contains("A:6")' (which retrieves only '["Jack", 6]'), and 'db.contains("N:Jack")' (which retrieves both '["Jack", 6]' and '["Jack", 72]'). The record '["Jack", 72]' is then deleted by using the record-change operation in the form db(x) := OM, its absence verified, and the original '["Jack",71]' record added back to the database. Following this, we add 60 records, containing ["Jack",j], ["Diana",j], and ["Rachel",j] for j from 1 to 20 to the database, verify via various retrievals and iterations that this has been done correctly, and delete a record, subsequently verifying the correctness of this deletion, and then re-inserting the deleted record and re-verifying. Finally we insert another 60 records, now containing ["Jack",j], ["Diana",j], and ["Rachel",j] for j from 101 to 120, and re-verify.
The following short program illustrates several other significant properties of SETL databases, specifically that
The program is:
program test; -- test of SETL database class use SETL_database; db := SETL_database(pair_to_stg); -- create an empty database, to store test record [db,jack_id] := db with ["Jack",71]; -- add a first record [db,diana_id] := db with ["Diana",43]; -- add a second record [db,rachel_id] := db with ["Rachel",43]; -- add a third record print("Database records"); for key in (db.contains("A:43") + db.contains("A:71")) loop print(db(key)); end loop; db_save := db; -- save the original database db(jack_id) := db(jack_id) with "a swell guy"; db(diana_id) := ["Diana",44]; db(diana_id) := db(diana_id) with "a swell gal"; db(rachel_id) := db(rachel_id) with "a folksinger"; db(rachel_id) := db(rachel_id) with "remember upcoming club date"; print("\nEdited database"); for key in (db.contains("A:43") + db.contains("A:71")) loop print(db(key)); end loop; print("\nSaved database"); for key in (db_save.contains("A:43") + db_save.contains("A:71")) loop print(db_save(key)); end loop; print("\nComparison of records in the two databases"); print("db_save(jack_id): ",db_save(jack_id)); print("db(jack_id): ",db(jack_id)); print("db_save(rachel_id): ",db_save(rachel_id)); print("db(rachel_id): ",db(rachel_id)); print("db_save(diana_id): ",db_save(diana_id)); print("db(diana_id): ",db(diana_id)); database_list := [db,db_save]; print("\nFirst database"); for key in (database_list(1).contains("A:43") + database_list(1).contains("A:71")) loop print(database_list(1)(key)); end loop; print("\nSecond database"); for key in (database_list(2).contains("A:43") + database_list(2).contains("A:71")) loop print(database_list(2)(key)); end loop; procedure pair_to_stg(p); -- convert name, age pair to string [nm,age,-] := p; return "N:" + nm + " A:" + str(age); end pair_to_stg; end test;The preceding program produces the following output.
Database records ["Jack", 71] ["Diana", 43] ["Rachel", 43] Edited database ["Jack", 71, "a swell guy"] ["Rachel", 43, "a folksinger", "remember upcoming club date"] Saved database ["Jack", 71] ["Diana", 43] ["Rachel", 43] Comparison of records in the two databases db_save(jack_id): ["Jack", 71] db(jack_id): ["Jack", 71, "a swell guy"] db_save(rachel_id): ["Rachel", 43] db(rachel_id): ["Rachel", 43, "a folksinger", "remember upcoming club date"] db_save(diana_id): ["Diana", 43] db(diana_id): ["Diana", 44, "a swell gal"] First database ["Jack", 71, "a swell guy"] ["Rachel", 43, "a folksinger", "remember upcoming club date"] Second database ["Jack", 71] ["Diana", 43] ["Rachel", 43]
The program seen above creates a database and inserts three records, each a simple pair, into it. The database is then copied to a second variable 'db_save', and the original database is modified by editing each of its three records, to which additional string components are freely added. (The simplicity of this operation, which is a bit clumsy in a standard SQL database since it changes the number of SQL 'columns', illustrates one of the advantages of SETL databases: since records in them are entirely free-form, additional elements of any kind can be added to records in a completely dynamic way.) The edited records are retrieved from the edited database using an iterator, but then also from the saved database, which is seen to contain the original records, unchanged. This illustrates a second advantage of SETL databases, their value semantics, on which a 'transactional' capability can be built in the manner noted above. Finally we include the databases in a tuple, from which the databases are then recovered and used. This illustrates the fact that databases behave like ordinary SETL objects, and so can be put into tuples and/or sets, passed as parameters to subroutines, etc.
More Examples: Databases which store Free-form Text. Next we give a few examples having somewhat more realistic application flavors. The first of these, seen below, reads a file of text divided into paragraphs by blank lines, and inserts these paragraphs into a database. Paragraphs are summarized by collecting their words, reducing these to lowercase, suppressing any words which belong to a file of 'excluded_wds', and depluralizing them by dropping any terminal 's'. After building the database, the code seen below executes a few sample queries, some of which illustrate the use of Boolean combinations of search keys.