Introduction
Bugs occur when ideas are not clearly understood - so good software depends on
defining terms correctly and exactly. All projects need a collection of
definitions. Call a set of definitions a dictionary . A BNF grammar is
such a dictionary. BNF is a syntactically sugared context free grammar. EBNF
has the same expressive power as BNF, but is easier to write because it
permits regular expressions on the right hand side of a definition:
BNF <number>::=<digit>|<number><digit> [Naur et al 64]
EBNF number::= digit {digit} [ANSI 83 ADA]
The '{...}' indicate 'zero or more of. In EBNF a series of non_terminal
are defined in terms of a regular expression that represents a set of strings
(a language). However, EBNF-like dictionaries turn up in all parts of
software engineering theory and practice, see Table 1 below.
Table 1. Applications
Name Elements Defined_terms Some Sources
Syntax,
Grammar Tokens|Lexemes Syntactic units Naur et al 64,...Ada 93
Lexical
.AS_is Dictionary Strings Tokens|lexemes Bell Labs 83(lex)
Protocol Messages Communications protocols
>As_is Zave 85, Naik & Sarikaya 92
Messages
Methods Objects Botting 86a
RFCs Lines Email Messages Internet RFCs 80..93
Data
Dictiona Data types Data files De Marco 80, JSP
Data
StructureData types I/O Patterns Warnier/Orr, JSP,
Path expressions Subprogram calls Paths, S/SL
Entries names Ada task structureHemendinger 90
Dynamic Dictionary Events Significant patternsKleene 1940s,DAD
Commands System BehaviorsBelli & Grosspietsch 91
ELH Events Life HistoriesRobinson 79, JSD,SSADM
Logic Grammar Simple Terms PredicatesAbramson & Dahl 89
Program Structure C Function Calls C Function DefinitionsBaecker & Marcus 90
Language Theory a..z A..ZEilenberg 74,...THEORY
The theory of EBNF has been the same for 30 years and is based on defining
"string" and "sets of strings." Strings come from an empty string by
concatenating symbols. I will show concatenation of strings by an exclamation
mark (!). It is assumed to be an associative operation with a unit - the empty
string "". Each string, by definition, has a unique sequence of atomic
symbols, which when concatenated generate the original string. Temporarily we
symbolize strings by lower case variables and sets of strings by upper case
variables. Example strings will shown in the C notation (see later if
necessary). We next define operations of union(|), and concatenation () on
sets of strings by:
- A B::={c || c=a ! b and a in A and b in B}, -- the set of all c where
c=a!b and a is in a and b is in B.
- A | B::={c || c in A or c in B},
A string is treated as a singleton set when it is in a context that needs a
set of strings:
- For string c, c::sets_of_strings={c}.
In this section I stands for the set of the empty string:
- I::sets_of_strings={""}.
The definition of (#) takes more machinery. We could write,
- #A = I | A | A A | A A A | ... .
but this formula is not formal enough to be used by a proof checked since
... is not defined. We could define the powers of a set (A^n) and then write
an infinite union - but first we would have to define these ideas themselves.
So we start from the property
- #A = I | A #A
and define #A as the smallest solution to the equation X = I | A X:
- #A = the smallest { X || X = I | A X }.`
Further Kleene showed that by repeatedly applying the operation that replaces
X by ( I | A X) we get closer and closer to #A in the obvious sense. So
(1) #A is the smallest value of X solving:
- X::=I | A X,
(2) Repeated substitution generates successive approximations to #A.
These give a paradigm for defining the meaning of any grammar. For instance,
the grammar { L::= a L b | "". } has one definitionL::= a L b | ""., one
defined_term "L" and two elements: "a", "b". We define the meaning of
"L" to be M as the smallest set of strings such that the following
equation is true:
- M = a M b | I.
So
- M= {c || for some x in M (c =a ! x ! b)} | I.
The above gives a recursive process testing whether a string lies in M. It
does not tell us how to find elements in M. By Kleene's work M is also
the smallest fixed-point of the mapping g, where
- g(X) = a X b | I.
So to find M we use an iterative process to generate the Kleene Sequence:
- S(0) = {},
- for all natural_numbers n, S(n+1) = g( S(n) ).
The series gives successive approximations to M:
- S(0)={},
- S(1)= I | a{}b = I ,
- S(2)= I | a I b = {"","ab"},
- S(3)= I | a{"", "ab"}b = {"", "ab", "aabb"},
- S(4)= I | a{"", "ab", "aabb"}b = {"", "ab", "aabb", "aaabbb"},
- S(5)= I | a{"", "ab", "aabb", "aaabbb"}b = {"", "ab", "aabb", "aaabbb",
"aaaabbbb"},
- ...
It is intuitively obvious that these sets are tending toward a limit of
- M = { a^n ! b^n || n=0,1,2,... }
Indeed M is a fixed point of g since M = g(M). Any proper subset of M
is not a fixed point. The reason that the Ss seem to get closer to M is
that as i tends to infinity we have to look at longer and longer strings
to find a string in M that is not in S(i). Using a limit to define the
meaning of grammar has been around in the literature for a long time (a
bibliography is in section 6.7 of chapter 3 of my unpublished monograph). The
fixed point definition came later and is similar to Scott's denotational
semantics [ACM 86]. Fixed points, iteration, and topological spaces appear
to be a vital part of the theory of software.
XBNF is EBNF with a notation which can be used with other mathematical
concepts and so more power. XBNF uses white space, bars ("|") and the symbol
- "::=" as in BNF. The "<" and ">" of the original BNF are used as relations
and as parts of arrows and crosses. Parentheses "()" are used as they are in
algebra. It adds "#" to mean any number of, including zero thus letting
the brace symbols of EBNF be used for mathematical expressions of sets. The
Ada EBNF symbols "[]" are used to show subscripts and so are not used in XBNF.
XBNF also has a standard dictionary that defines some simple functions to be
used: Ada EBNF { x } is #( x ) in XBNF and x #(x) is written N( x ) . [x
] in Ada EBNF is O(x ) (optional x ) . Here is a comparison of BNF, Ada
EBNF and MATHS:
BNF <number>::=<digit>|<number><digit> [Naur et al 64]
EBNF number::= digit {digit} [ANSI 83 ADA]
XBNF number::= N(digit), where N(x)=x #(x ). [Botting 9x]
We take over the notation from section 2.3 above and add intersection (&) and
complementation (~) by:
- A & B::={c || cA and c B},
- A ~ B::={c || c A and not ( c B) }.
This section is an XBNF dictionary describing XBNF:
- XBNF_dictionary::= #( XBNF_definition | comment ).
A dictionary has any number of definitions and comments.
- XBNF_definition::= defined_term "::=" XBNF_expression.
- A definition has a "::=" between the term being defined and the expression
defining it.
- XBNF_expression::=set_expression.
In XBNF we are interested in sets of sequences:
- set_expression::= item | union | intersection | sequence | phase |
element.
A set_expression is either an item, or a union, or, ...
- item::= element | defined_term | "(" set_expression ")".
An item is either an element (defined below), or a defined term, or any set
expression in parentheses,
- union::= alternative #( "|" alternative ).
A union has at least one alternative. Alternatives in a union are separated
by vertical bars.
- alternative::= complementary_form | sequence.
Each alternative is either a complementary_form or a sequence.
- sequence::= #phase.
A sequence is a number of phases. The "number sign" (ASCII code hex(23), hash,
sharp) shows the occurrence of any number of the next item - including none.
A sequence can therefore be empty, one phase, or two phases, etc..
- phase::= ("#" | ) item.
A phase is part of a sequence where an item occurs once, or the same item can
occur any number of times including no times at all.
- intersection::= item #("&" item ),
The ampersand character("&") shows that all the descriptions (items) must hold
at once.
- complementary_form::= item "~" item,
A complementary_form - A~B say - describes the set of As that are not
Bs. Here is an example:
- comment::= (#character )~definition.
Any number characters (in a dictionary ) that are not a definition form a
comment. This paragraph is a comment in this XBNF dictionary.
In a lexical dictionary the following definition of element is used:
- element::=character_string | name_of_non_printing_character | function |
natural_language_string,
- character_string::=quote #( char ~ (quote | backslash) | backslash
char) quote,
- quote::="\"",
- backslash::="\\",
- name_of_unprintable_character::=capital_letter #( capital_letter | digit
).
The names of unprintable characters of ASCII are well known (NUL,..., SP, CR,
LF, ...DEL). In MATHS the character names used in Ada are preferred
[silicon:/u/faculty/dick/cs320 /ada/lrm/* and
silicon:/u/faculty/dick/cs320/math/comp.ascii].
- function::=identifier "(" List( arguments) ")",
Simple functions defined by non-recursive definitions do not change the
expressive power but makes a practical difference. The following are a
standard functions and can used in any lexical or syntactic dictionary:
- For all sets_of_strings x,
- List(x)= x #( comma x),
- non(x)= #char ~ x,
- N(x)= x #(x),
- O(x)= (x|).
Introducing the ability to define and use syntactic functions allows context
sensitive features to be handled elegantly and efficiently... See below.
Another standard kind of element is the identifier. An identifier does not
start or end with an underscore - it is made of letters, digits and break
characters, and must always start and end with either a letter or digit:
- identifier::= ( (letter | digit ) #id_character )
- & ( #id_character (letter | digit) ),
- id_character::=(letter | digit | underscore).
Some identifiers are defined_terms:
- defined_term::= natural_identifier & defined |... .
When the defined_terms are identifiers and the expressions have no
intersections or complements then the dictionary is context free. The three
dots (...) indicate that there a defined term can take other forma. A natural
identifier is made up of words :
- natural_identifier::= word #(underline word)
- & (declared_identifier | defined_identifier |
unknown_identifier).
Words are either numbers or are made of letters. When made of letters they
must be in a dictionary - either of a natural language or a special dictionary
for the project,
- word::=number | ( (letter #letter) & in a dictionary ),
- number::=digit #digit,
- digit::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9".
Rules for creating and using identifiers in general documentation are
discussed in my monograph. Meanwhile note that it is common to have several
names with one meaning - typically there is a mathematical variable or symbol
(like "a", "+", "\λ", ...) which is used in formulae plus a longer
natural_identifier .
Often, a language is defined by two dictionaries - a lexicon that
gives names to strings of symbols, and a syntactic dictionary or grammar. The
grammar describes the structure of the language in terms of the tokens
identified in the lexicon. In this presentation I combined both
lexical and syntactic "shells" into a single presentation. Normally they
would be split up into separate pieces of documentation - to do this I have
experimented with a formalism that treats a dictionary as a module. This is
described later. Meanwhile we will assume that the terms defined in the
standard lexicon for the ASCII code can be used in future examples:
thus terms like comma, r_paren, etc will be used in later syntax
definitions. I would suggest that we extend EBNF to also be a glossary. A
glossary definition is a definition in the form:
- term::=informal description.
Syntactically define a glossary as a collection of these:
- glossary::=#( glossary_definition | comment )
- glossary_definition::= defined_term "::=" natural_language_string
punctuation.
So in a general piece of documentation a term may have two different
definitions:
- natural_language_string::= back_quote #(non(back_quote)) back_quote,
- natural_language_string::=`A description in a natural language of the
meaning and/or purpose of something`,
- back_quote::="`".
A similar three level approach is useful in data processing applications - the
analyst preparing a data dictionary that is equivalent to a lexical
description of the data, the program designer working out the structures
caused by logical and physical constraints, and a team thrashing out the
glossary definition. It is possible and useful however to keep all
definitions of a single term together rather than in separate files of
sections of a document.
Such a three way approach has been tried out in class. Any user with
~dick/bin in their PATH can query a data base of half-a-dozen language
specifications written in XBNF:
- define language term -- print out relevant definitions of term
- lookup language lexeme -- print out definitions in which lexeme is
used.
I have invented a notation that allows dictionaries, grammars and glossaries
to refer to each other. My define/lookup commands take account of this
feature. The effect is that the C++ syntax file inherits definitions from the
C syntax plus a glossary of object oriented terms. The syntax inherits the
definitions of the ASCII characters and the functions and notation in a meta-
definition of EBNF. define always searches a general glossary of computer
science. This is all transparent to the user:
Example: define c++ class
[../c++/syntax] class::=....
- ...
[./c/syntax] storage_class_specifier::=... .
[../glossary.objects] class::=`A set of objects classified by some property or
other. See classification`.
Notice that the program extracts likely definitions (including mis-spelt
terms) and relies on human intelligence to select the best match. I believe
is humans being intelligent and machines doing the grunt work!
Because union is defined as made of alternatives which, in turn, contain
sequences we have:
- |-( a b | c d) = ( (a b) | (c d) ),
The number sign ("#") always applies to the immediately following item and
means "a number of (including none)". Similarly, (a | #b) indicates
either an a or any number of bs, as does (#b|a). This, plus the
precedence of spaces over vertical bars means that a #b c | #b describes
a string with a single a followed by many bs and one c or else many
bs. Parentheses (()) are put around unions within sequences, and
iterations.
- (a | b) (c | d) = (a c | a d | b c | b d)
reads "a sequence of a or b, end of sequence and then a sequence of c or
d, end of sequence,"
Or informally
"the first item is either an a or a b followed by either a c or a
d."
Similarly
#(a | b) =any number of a s and b s in some order.
#(a b) =any number of pairs of one a followed by one b.
In #(a b) each a must be followed by a b. Thus #(a b) is proper
subset of #(a | b):
- |-#(a b) =>> #(a | b).
The theory of defining a language as the intersection of a finite number of
context free languages was developed by Liu and Weiner in 1973. For practical
examples see: the Turing project [Holt & Cordy 88]. , S/SL, Jackson's work,
etc.[Hehner and Silverberg 83]. Warnier's notation permits intersections but
they are only used to derive conditions in programs. Intersections let us
define languages that are not context free:
- L::=P&Q,
- P::= #a B, B::= O(b B c),
- Q::=A #c, A::=O(a A b).
Therefore
- |-B={b^m c^m || n=0..},
- |-P={a^n b^m c^m || n, m=0..},
- |-A={a^n b^n || n=0..},
- |-Q={a^n b^n c^m || n, m=0..},
- |-L={a^n b^n c^n || n=0..}.
The last result shows that a dictionary with an intersection can describe a
non-context free language. Since a complement of a union is the intersection
of the complements the same effect can be got by permitting complementation
and NO intersection. However this seems to be a useless restriction.
Defining a language as the intersection of several grammatical constraints is
neither new nor inefficient - Constrained Expressions are expressive and based
on simple regular expressions [Dillon et al 86]. Hehner and Silverberg[83] use
the term Communicating Grammars for a set of grammars with some terminals
and non-terminals occurring in both. The model here is simpler - I propose
the terms concurrent context free grammars (CCFG) for the dictionary model
proposed here, and concurrent context free languages(CCFLs) for the family
of languages defined by CCFGs. Hemendinger notes that Ada tasks have
communication patterns that are CCFLs[Hemendinger 90]. Dictionaries with
intersections define languages that are recognized by networks of parallel CFG
acceptors. Psychologists use a network of "daemons" for modelling perceptual
processes [ Human Information Processing, Lindsay & Norman, Pub Academic
Press, 1970]. Finally: the hardest part of Jackson Structured Programming
[JSP] is the use of semi-co-routines (in COBOL!) and to handle non-context-
free problems by quasi-concurrency[Jackson 1975...].
Using functions in a grammar has the practical effect of letting us name and
use short hand formulae for common constructions such as lists. However they
permit the definition of context dependencies in a natural way. First we
need a way to define a function and a way to express its image. The natural
way to define simple functions is too allow the use a symbol like (_) as the
place where the argument of the function must be written:
- List::= (_) #( comma (_) ),
where List maps a set of strings into another set of strings..
The image of a function F can be written post(F) - the set produced after F
is applied to all the elements in it domain
- post(F)::= {F(x) || for some x }.
For example, if F is a map from the Positive_Integers into sets of strings
defined by:
- F::=a^(_)b^(_)c^(_),
then
- post(F) ={a^n b^n c^n || n=0..}.
This technique leads to a way to express the context dependencies found
programming languages. For example take the labelled loop in Ada. Suppose
that loop_with_label is a function the maps identifiers into valid Ada
loops:
- loop_with_label::= ( (_) ":" "loop" #( statement | "exit" (_) "when"
condition ";" ) "end loop" (_) ";".
(for example
- "SUM: loop x:=x+1/i; exit SUM when x>100; i:=i+1; end loop SUM"
- is a string in loop_with_label(SUM).
)
To be made precise the statement in the Ada grammar would also need to an
argument representing the set of valid labels in internal loops.
Now, this is a context dependent set of strings and yet the definition shows
us a simple and efficient way to parse (and check the syntax) loops: First,
use CFG a parser for
- loop::= ( label ":" "loop" #( statement | "exit" label "when"
condition ";" ) "end loop" label ";".
and check for the reoccurrence of label . Clearly functions get us to the
level of attribute grammars.
In XBNF '@' means 'subsets of' so
we can define a complicated feature of a language in stages, first
declare that it is a subset of a context free language and later provide the
specific rules.
- Ada_loop::@ (label ":" LOOP"
- #(statement|"EXIT"label"WHEN"condition";)
- "END LOOP" label ";"
- ).
- Here the "::" is split of from the "=" because it declares a new identifier
but does not define a particular value. This let's us take complex specifications and requirements one simple assumption or condition at a time.
Combining intersection with functions is interesting. Suppose that we have
already defined declaration(V), usage(V), and no_ref(V) as the strings
in a language which (1) declare or introduce V, (2) make use of V, and
(3) don't refer to V respectively then:
- valid_wrt::=#no_ref(_) declaration(_) #(usage(_) | no_ref(_) ),
describes all sequences of components where the argument is declared before it
is used - and is NOT re-declared. If we now define
- valid::= &valid_wrt,
where
- &F = { y || for all y (y in F(x) },
- then
- block::="begin" ( (statement # (";" statement )) & valid ) "end",
defines a block with all its variables declared exactly once before they are
used. This is a common context sensitivity in programming languages. Finally
the '&'s in the definition imply that the parser would be structured as a
number of parallel parsers, all with the same structure. This form
concurrency is very easy to handle in both theory [Hoare 79 & 85] and in
practice [Jackson 75, Andre Banatre & Routeau 81, Cameron 89].
Definitions that include parameters are useful for defining and specifying
maps and relations. Sets of definitions that include parameters are called
Logical Grammars [ Abramson & Dahl 89 ]. A more general form of dictionary is
the W-Grammar, or Two-level Grammar or Van-Wijngarten Grammar that contributed
to the demise of Algol 68 yet is alive as a tool for documenting data
structures[Gonnet & Tompa, Gonnet & Baeza-Yates 91]. The effect is that of a
logic programming language [Ballance et al 90].
Dictionaries in Abstract
Abstract models are easier to re-use than concrete ones. Any system that fits
the theory of strings plus regular sets(A semiring with Kleene closure ) can
use a similar kind of dictionary to those described above. This part of my
presentation is a quick tour of such areas.
Abstract models are easier to re-use than concrete ones. Any system that fits
the theory of strings (a monoid) plus regular sets(A boolean algebra) with
kleene closure can use dictionaries. The grammar can be put to work to
describe files in terms of data in stead defining languages in terms of
characters. The equivalence of a lexical dictionary is the analysts "data
dictionary". The counterpart of the syntactic grammar is a Jackson or Warnier
"data structure". All we need is an empty object and a concatenation
operator. For example if T stood for a set of records and #T for files,
then the empty string models the empty file (UNIX /dev/null) and
concatenation models writing one file after another (UNIX: cat a b >c).
Consider
- magnetic_tape::=label #file end_of_tape_marker,
- file::=header #data_block sentinel.
These define the magnetic tape's physical structure. If we were designing a
program to process a file of student records we might describe a file sorted
by sections as follows:
- file::=#section,
- section::=class_header #student,
- student::=A_student | student~A_student.
Strictly speaking a definition of form A::= B | A~B is a theorem `|- A = B |
A~B`.
If the domain has the words "file", "section" and "student" we use a similar
structure, even if the class header is absent and a student record identifies
the student's class instead. The result is called a logical structure.
- file::=#section,
- section::=#student,
- student::=A_student | student~A_student.
If the problem is to summarize the data on each section then we might show the
output as:
- output::= header_page #section_summary college_summary,
and note that each section corresponds to exactly one section_summary and
the input corresponds to the output. It is easy to combine these into one
abstract structure:
- consume_file_and_produce_output::= produce_header_page # consume_section_and_produce_section_summary produce_college_summary.
- consume_section_and_produce_section_summary::= # consume_student
- consume_student::=consume_A_student | consume_non_A_student.
The structure above becomes a program when operations and conditions are added
(see my monograph or any good JSP book).
A data dictionary is free of semantic and pragmatic structures because a file
can be used for many purposes and hence is parsed into different structures.
Typically:
- free_file::=# record,
- record::=type1 | type2 | type3 | ...,
Now if a,b,c,... are elements then
- F::=#(a | b | c | ... ),
is the free semigroup or unstructured structure. In categorical
terminology this is a free, initial, or universal repelling object in the
category of semigroups with units. Free dictionaries are the weakest useful
dictionaries. In consequence a design that has two or processes that
communicate by consuming and producing strings(files) can interpret or
structure the files as they wish. This is the powerful form of information
hiding that is implicit in Jackson's work.
Data dictionaries can show a non-programmer what a program is expected to do
in their own terminology not computer jargon. The canonical
grammar/dictionary /glossary shows the user or client that we understand all
their terminology. Canonical diagrams and dictionaries show how the user's
data is handled in terms that the users and clients understand. The test of a
canonical dictionary is that clients and users certify it as complete and
accurate. Canonical dictionaries are the strongest useful dictionaries.
- Functions (in the pure mathematical sense) are a part of most software
systems. Because they are there we need some simple conventions that let them
be defined very easily. The result is close to the MIRANDA language developed
in the United Kingdom[ ??] . The key idea is a simplification of the λ
notation. In the λ notation a variable is introduced to indicate the
value to which the function is to be applied. For example if we use the key-
word 'map' to indicate such an expression we might write:
- double::= map[x] (x+x).
- square::= map[x] (x*x).
- cube::=map[x] (x*square(x)).
- ....
These imply
- cube(double(square(2)))=cube(double(4))=cube(8)=8*square(8)=8*(8*8)=8*6
4=512.
However, simple maps can be written with a constant blank space symbol (_)
instead of the variable x:
- double::= (_)+(_).
- square::= (_)*(_)
- cube::=(_)*square((_)).
As an immediate application are the STANDARD definitions of the syntax
functions mentioned in the previous section:
- #::=( (_) #((_)) |).
- L::= (_) #( comma (_)), L::=List of (_).
- P::= l_paren L((_)) r_paren, P::=Paranthesized list of (_).
- non::= char ~ (_).
- N::= (_) #((_)), N::=Many (_).
- O::= ((_)|), O::=Optional (_).
So if we wrote
- some_map::=(_),
then some_map(x) = x. So, (_) must represent the identity map. Extend this
line of thought further. Putting any function in an expression where a
value is expected and we get an expression for a function instead of an
expression of a value. So if g maps into the domain of f then
- f(g) = map[x](f(g(x)).
If 1st, 2nd, 3rd, 4th, ... n.th ... are maps from n-tpls into their
components then
- power::= (1st)^(2nd),
implies that
- power(3,2) = (1st(3,2))^(2nd(3,2)) = 3^2 = 9.
The result is a notation like the command line arguments in a C program, a
UNIX shell script, ICL George 3 Macros, ...
The same simple approach, starting with simple operations like successor
(succ), predecessor(pred), and a constant zero(0) lets us develop the Peano
arithmetic in EBNF_like definitions:
- +::= (2nd)=0; (1st) |(2nd)<>0; succ((1st))+pred((2nd)).
- *::= (2nd)=0; 0 | (2nd)=1; (1st) |(2nd)>1; (1st)+((1st)*pred(2nd)).
- ^::= (2nd)=0; 1 | (2nd)=1; (1st) |(2nd)>1; (1st)*((1st)^pred(2nd)).
Here the semicolon acts like "if_then_": (;)=if (1st) then (2nd) but is
actually the relational product described next.
I have shown that BNF definitions can define functions in a natural way.
Further if 1st and rest map a string into two parts so that x=1st(x)!rest(x),
then
- reverse::=""+>"" | (strings~""); ( reverse((rest)) ! (1st) ).
A relation R in @(T1,T2) can be defined in terms of a well formed formula
W(x,y) by
- R::@(T1,T2)= W((1st), (2nd)).
or any of the following forms
- For x:T1, y:T2, x R y::@= W(x,y).
Given two binary relations R and S then
For R: @(T1,T2), S:@(T2,T3),
- R; S::= for some z:T2 ((1st) R z and z S (2nd)),
- R; S::= The relation between x and y such that for some z, x R z S y,
- R | S::= (1st)R (2nd) or (1st) S (2nd),
- R | S::= `The relation between x and y such that either x R y or x S
y`,
- R & S::= (1st)R (2nd) and (1st) S (2nd),
- R & S::= The relation between x and y such that both x R y and x S y.
- R ~ S::= (1st)R (2nd) and not (1st) S (2nd).
- R ~ S::= `The relation between x and y such that both x R y and not x S
y`.
/R is the converse relation to R. For example (_^2) is the 'square'
function and /(_^2) is the square root relation, Similarly sin is the sin
function and /sin includes the arcsin function. Further we want
/parent_of=child_of.
Therefore define:
- For R:@(T1,T2), /R::@(T2,T1)= (2nd) R (1st).
In context, a set can also be used as a relation:
- For A:@T, A::@(T,T)=(1st) in A and (1st)=(2nd).
Complex situations can be defined quickly using these notations in and EBNF-
like way:
Suppose that we have sets male and female, and a relation parent:
- parent::=(1st)is a parent of(2nd).
- ...
- child_of::=/parent.
- father::=male;parent.
- mother::=female;parent.
- son::=male;child_of.
- daughter::=female;child_of.
- grandfather::=father;parent.
- ...
- granddaughter::=daughter;child_of.
- grandson::=son;child_of.
- ...
- sibling::= (/parent;parent) ~ (=). -- I am not my own sibling!
- sister::=female;sibling.
- ...
- uncle::=brother;parent.
- ...
- niece::=daughter;sibling.
- first_cousin::=/parent;sibling;parent.
- second_cousin::=/parent;/parent;sibling;parent;parent.
- lineal_descendant::= child_of | lineal_descendent; child_of,
- ancestor::=parent | parent; ancestor
- cousin::=/parent;sibling;parent | /parent;cousin;parent,
- first_cousin_once_removed::=/parent;first_cousin | first_cousin;parent.
- [From Logic.Natural.KINSHIP]
- ...
The last four definitions above are like productions in context free grammar.
The definitions of ancestor and lineal_descendant illustrate the
transitive closure or ancestral relation. It is so common that a special
notation is defined:
- do(R)::= `The relation between x and y such that x=y, or x R y or x R;R
y, or...`.
Example:|-ancestor=parent; do(parent) and
- lineal_descendant=child_of; do(child_of).
The defining property of do is that if something is invariant (or fixed, or
closed) under R then it will not be changed by any number of Rs.
This was originally in Whitehead and Russell's Principia the same form is
in Floyd's loop invariant technique and reminiscent of Kleene closure in
grammar theory. This definition relies on using higher order quantifiers[cf
Guerevich 87].
- inv(R)::= the invariants of R
- inv::=the sets X such that if x in X and x (_) y then y in X.
Example: families::=inv(parent).
- For R:@(X,X), F:@X, if F.R in @F then F in inv(R).
- For R:@(X,Y), F:inv(R), F.do(R) in @F.
- Example: families.do(parent) in @families.
Is a subset of">
Normally X in @Y is written X==>Y and means that X is a subset of Y
- or is Y itself.
- For X,Y:Sets, X==>Y iff for all x:X ( x in Y).
[ logic_30_Sets.html ]
EBNF is a usable context free grammar that uses notation equivalent to
Kleene's Regular Expressions. Kleene developed Regular expressions as a way
for documenting Events in Neural Networks. They were put to good use in both
automata theory and grammar theory in 1950's and 1960's. What is not well
known is that they have also been in use (from about 1975 onward) to document
that possible patterns of events that effect an entity in a system
[Kleene,DAD, Robinson 79, JSD,SSADM]. Since 1980 the behavior of
communicating sequential processes has been a hot topic for research [ Hoare,
Milner,....] . The formulations always seem to involve an XBNF like component
to describe the behavior of objects and entities.
Here is simple example: A button for an elevator can be pushed many times
before an elevator finally services it:
- button::= # request. request::= pushed #pushed serviced.
Another example: A stack accepts items that are pushed onto it and requests
for items to be popped off it. When unempty it makes sense to request a copy
of the top item one or more times.
- stack::= empty | push non_empty_stack pop.
- non_empty_stack::= #top | push non_empty_stack pop.
Jackson has shown how such formula lead directly to the designs of software
that behaves as specified[JSD, SSADM,...].
Relations behave like the operations and conditions in a programming language
because each operation in a program relates the current state to the set of
possible future states. If there was always a single next state then the
relation is deterministic, and by composing deterministic steps carefully we
get an algorithm. In turn an algorithm can be coded for input to a suitable
translator giving an executable program. Thus a relation describes a set of
acceptable behaviors and the implementor selects a particular algorithm
[compare with Kowalski, Hoare 87].
MATHS relations can be written in a form that makes them look like statements
in a programming language. Some syntactic sugar is defined for those who want
to write while/for/loops/. . . . Here is an introduction. Formal treatment
comes later when we have the formulae to handle it [Chapters 5].
The key definition is that of a dynamic predicate. A predicate is a
proposition that has free variables. A dynamic predicate has free variables
with a prime (') attached to them. Here is an example:
- x'=x+y.
After the statement, the value of x is equal to the sum of the old values of x
and y. y is unchanged. It could be coded as x:=x+y in Ada/Pascal/... The
rule is that "a prime means the new and no prime means the old" [Ross Ashby
56, Part II of Lam & Shankar 90, compare Hehner 84a & 84b, Guttag & Horning
91]. Formally x'=x+y defines a relation on a structured set with
variables x and y.
- x'=1
guarantees that x is 1 after the operation but no other variables can change
during the operation.
- x'=1-x
means that the value of x will be complemented - its new value will be the
result of subtracting its old value from one while all other variables stay
the same.
- x=1
has no primed variable. It means that x has value one, and does not change -
all variables are left untouched. It is a relation that holds between
identical states when and only when x has value 1. It filters out all states
when x is not 1. It allows computation under a condition. We call it a
"Condition" therefore. It can be shown that it is a condition in the sense
of Arbib and Manes [87].
Here is a more general statement of the MATHS interpretation of a predicate
that contains variables with primes. Consider a simple state space S which
declares m variables x1, x2,...x[m] of types X1, X2, ...X[M]:
- S::=${x1::X1,x2::X2,...,x[m]::X[m]}.
Let P(y1',y2',y3',...y[n]',x1,x2,x3,...x[m]) be a predicate with n distinct
primed variables(y1, y2,...y[n]) which also may also appear without primes as
xs. For example in x'=x+y, x1=x, x2=y, and y1=x.
The relation defined by P(y1', y2', y3', ...y[n]', x1, x2, x3, ...x[m]) is
found by:
- (1) Replacing primed variables (y1,...) by values in the new state(s2),
and
- (2) Replacing non_primed variables(x1,...) by values in the old
state(s1), and
- (3) Adding the requirement that the static (unprimed) variables do not
change.
So:
- P(y1',y2',y3',...y[n]',x1,x2,x3,...x[m])::=
- rel [s1:S, s2:S](
- P(s2.y1, s2.y2, ..., s2.y[n], s1.x1, s1.x2, ...,s1.x[m]
- and for all i:1..m where x[i] not in {y1, y2, ..., y[n]},
- s2.x[i]=s1.x[i]
- ).
A more rigorous definition will be worked out in my monograph meanwhile here
are some examples. To specify a process that solves two simultaneous linear
equations we write
- a*x'+b*y'=c and d*x'+e*y'=f.
To specify a process that solves a quadratic equation we write
- a*x'^2+b*x'+c=0.
On the other hand
- a'*x^2+b'*x+c'=0,
specifies that x is left alone and a, b, and c are adjusted to fit the
equation.
Consider an expression that uses (do), (|), (;) from the calculus of relations
to structure dynamic predicates, then it specifies, at a high level a program.
For example:
- SUM::=(n'=1 and s'=0; do(n<>101; s'=s+n and n'=n+1); n=101).
The definition for SUM looks like a program but is an expression in the
calculus of relations and so has provable properties. First since the (+) is
defined to be an operator on numbers, so the state space is ${n,s::Numbers}.
Next define names for the parts of SUM:
- S::=(n'=1 and s'=0).
- L::=(s'=s+n and n'=n+1).
- E::=(n=101). W::=not E.
- |-SUM= (S; do(W; L); E).
Now after E we know that n=101 (and s has its previous value), since
- post(E)=img(E)={ x :X || x.n=101}.
Similarly after S we know that n=1 and s=0. Further not E by definition
will not change the values of s and n. Next we look for something that
does not change in L. Both s and n change - in parallel:
- |-L = ( (s',n')=(s+n,n+1) ).
With a little creativity or experience we discover the following invariant of
L:
- F::= ( s=1+2+3+4+...+(n-1)).
- |-F; L ==> F,
- |-F in inv(L).
Similarly|-F in inv(W) and inv(E),
so|-F in inv(W;L).
Thus|-(F; do(W; L)) ==> F.
Now|-S ==> F and F;E==>F.
- so
- |-SUM = (S; do(W; L); E) ==> F and n'=101
Hence, after SUM, s= 1+2+3+4+...+(n-1) where n=101 = 5050.
Notice that MATHS is not a programming language. So
- x (do(W; L)) y
is not the same as
- x (do(W; L); E) y.
in the first case y.n can have many values. In the second y.n will be 101.
Similarly, when
- x ( i=1; i'=2 | i'=3 ) y
permits y.i=2 even if i<>1, but if x.i=1 then y.i MUST be 2. Whatever
happens y.i is either 2 or 3. Such non-determinism seems to be essential
for a simple model of processes[Dijkstra 76, Hoare 78, Hoare 85, Gabrielian &
Franklin 91].
MATHS includes structured programs because
For R:@(S,S),
- no(R)::= ((1st)=(2nd) and for no y((1st) R y)
::=The relation between x and x such that for no y does x R y
- For A:@S, R:@(S,S), while(A)(R)::=(do(A; R); no(A)),
- For I,J,K:@(S,S),R:@(S,S), for(I, J, K)(R)::=(I; while(J) ( (R); K ) ),
- For A:@S, R,T:@(S,S), (if A then R else T fi)::=(A; R | no A; T).
We can, for example, when n<=m define:
- SUM(f, i, n, m, s)::=for(i'=n, i<=m, i'=i+1)(s'=s+f(i))
- and show
- SUM(f, i, n, m+1, s) = SUM(f, i, n, m, t) and s=t+f(m+1).
Using Chapter 4.Section 8,
- SUM(f, i, n, m, s)=for(i'=n, i<=m, i:+1 )( s:+f(i) )
MATHS even has an extension of Dijkstra's Guarded Command. Any group of MATHS
users can document and then use its own set of abbreviations. For example:
- ADA_PDL::=
Net{
...
- For A,B,C:@(S,S), loop A; exit when B; C; end loop::= (do ( A; no(B);
C); (A; B) ).
- ...
}=::
ADA_PDL.
MATHS can express mixtures of conditions and actions in one relation. A well
known enigmatic program is often written:
- Enigma::=
( while(n>1) (
- if n in even then n'=n/2
- else n'=3n+1
- fi
- )
).
In MATHS it can be transformed to read:
- |-Enigma=while(n>1)( 2*n'=n | 2*n'=3*n+1 ).
MATHS makes the derivation of programs more logical: (1) state the properties
of the changed variables as a set of equations and then (2) solve for the
primed variables. Similarly, (1) give non-deterministic relationships and
reduces them to a form that is compatible with a declarative programming
system such as Prolog [Compare Lamport 87].
Finally, MATHS relational expressions follow the calculus of
relations[Whitehead & Russell]which can be used to simplify sequences into
disjunctive normal form, like the following:
- R = A1 and A2 and A3 and... or B1 and B2 and... or...
Typical rules are
- |-R;(S|T) = (R;S) | (R;T)
- |-(P(x, y); Q(x, y, y')) = (P(x, y) and Q(x, y, y'))
- |-(Q(x, y); x'=e(x, y) ) = (x'=e(x, y) and Q(x , y))
- |-(x'=e(x, y); Q(x, y, y')) = (x'=e(x, y) and Q(e(x, y), y, y'))
There is no advantage in using mathematics only as a program design language.
The big pay-offs will occur by introducing formal methods where the errors are
made - early in the process. Most errors are made in the analysis of
requirements - misunderstanding, ignorance, ambiguity, and jumping to
conclusions. Mathematics is best used recording ideas and defining terms
throughout a project. For example it is not useful to a client to just quote
specification and algorithm:
- euclid is a program that calculates the greatest common denominator(g):
- euclid(x,y,g)::=(x=y=g' | x<y; euclid(y,x,g') | x>y; euclid(x-y,y,g') ).
- [transcribed from a Prolog text - Schnupp & Bernhard 87, p36]
Shorter forms are worse [Knuth et al 89]:
- gcd::=map[x,y] (x=y+>x | x<y +> gcd(x,y) | x>y +> gcd(x-y, y)).
The better way is to define why and then derive what to do from the Why.
Z lets you gather a set of declarations and axioms into a schema. I have
experimented with using an EBNF like definition to name a set of declarations,
definitions, comments, and assertions. The documentation is put between
matching "Net{" and "}" strings. The result can be named and so reused:
- GCD::=
Net{
Theory of the greatest common denominator of a set of
positive integers.
- Uses Number_Theory.
- |-Nat::=Natural. |-Nat0::=Positive. |-Nat::=Nat|{0}. --
Definitions in Number_Theory
|-For x,y :Nat0, x divides y ::= for some f:Nat0(x*f=y),
- factors(y)::= {f:Nat0 || f divides y},
- multiples(x)::={y:Nat0 || x divides y}. -- Definitions in Number_Theory
- ...
- |-(1) for x,y,z,a,b:Nat,
- if x and y divides z then a*x+b*y divides z.
- ...
(def gcd) gcd::nAT><Nat->Nat=greatest(factors(1st) & factors(2nd)),
- ...
- |-for x,y, gcd(x,y)=gcd(y,x) { ...(&) in commutative(@Nat0) ...}.
- |-for x,y, if x>y then gcd(x,y)=gcd(x-y,y).
- ...
- euclid::= (do( x<y; x'=y and y'=x | x>y; x'=x-y ); x=y=g').
- |-for x,y,g:Nat, (x=>x,y=>y,g=>g) euclid (x=>x', y=>y', g=>gcd(x,y)).
- ...
}=::GCD
GCD can then be referred to by Uses GCD in other pieces of documentation.
An abstract specification permits innovative but correct implementations to be
invented. For example, if one has to evaluate a many GCD's then Euclid's
algorithm is clearly inefficient, and remains so when coded iteratively. For
example, when gathering statistics on the GCD's of all pairs of numbers in the
range 1..2^p as p+>oo I found that the iterative version of Euclid's
algorithm recalculates the same values many times. Backtracking to the
definition in above (GCD(def gcd) ) we have:
(GCD(def gcd) )|- For x,y, gcd(x,y)::=greatest(factors(x) & factors(y)).
Therefore: gcd in (Nat >< Nat)->Nat. Now ((Nat >< Nat)->Nat) is
monomorphic to @(Nat, Nat, Nat). So gcd can be a table, a data base, a
data structure, or even a hybrid of data and algorithm:
- With T::@(Nat, Nat, Nat) define
- euclid_plus::=
( x0'=x; y0'=y;
- do( no (x, y, g') in T;
- ( x<y; x'=y and y'=x
- | x>y; x'=x-y
- )
- );
- ( x=y; g'=x; T:|{(x0, y0, g)} --(put gcd in T)
- | (x,y,g') in T --(find gcd in T)
- )
)
The form of euclid_plus is
- LOOP::=do ( no A ; ( x<y; b | x>y; c) ); ( A | x=y; d )
where A=((x,y,g') in T), and b, c, d are all total functions. So
- |-LOOP=do ( no A ; (x<>y); ( x<y; b | x>y; c) ); ( A | x=y; d )
- =do ( (x<>y) and no A ; ( x<y; b | x>y; c) ); ( A | x=y; d ).
So LOOP is a Zahn loop [Zahn 69]. So if`T ==>{ (x, y, gcd(x,y) ) ||
x,y::Nat}` is true before each call then it is also true after the call. It is
an invariant of Euclid_plus. Consider T={} before euclid_plus:
- T={}; euclid_plus =( x0'=x; y0'=y; do( x<y; x'=y and y'=x | x>y; x'=x-
y ); x=y; g'=x; T:|{(x0, y0, g)} ))
and so ends up with (x0,y0,gcd(x0,y0)) being placed in T. By induction,
euclid_plus works when T is empty initially and encapsulated. When
Euclid_plus is coded, a representation of T will be chosen and a limit
will be put on the size of T. However Euclid_plus works if `T:|{(x0,
y0, g)} is changed to T'==>T|{(x0, y0, g)}`. Code is often a finite
approximation to a design. Other ADTs, classes, and objects have defined.
XBNF can be extended or embedded into an ASCII based language that I have
called MATHS. MATHS lets terms of any type to be defined - bits, numbers,
functions, specifications of programs, ... . To make this work we need an
extended theory of the meaning of a set of definitions. MATHS assumes that
each definition adds a variable and also adds an axiom:
- t::= e
means
- t::type(e). t=e.
MATHS use the Pascal/Ada syntax:
- term "::" Type.
Therefore we will need develop a meaning for such declarations.
In general I have been studying packages of declarations and
well_formed_formula. Syntactically:
- package::="Net{" #( #(label) (declaration | definition | well_formed_formula| ...|comment ) ) "}"
For example:
Net{
0::S. +::S><S->S. For x,y,z:S(x+(y+z)=(x+y)+z). For all x:S(
x+0=0+x=x). }
Nets are collections of assumptions not programs. Associated with a set of
declarations is an abstract space. A declaration adds a dimension to the
space and an equality removes a dimension. In consequence a definition that
does not involve recursion leaves the system unchanged. Some recursive forms
invalidate their universe but others(for example LISP) has a BCNF which is not
recursive form.
MATHS borrows an idea from the Ada package and the Z schema: networks can be
named.. The original BNF notation is appropriate:
- NAME::=package.
For example:
- MONOID::=
Net{
- 0::S.
- +::S><S->S.
- (assoc): For x,y,z:S(x+(y+z)=(x+y)+z).
- (zero): For all x:S( x+0=0+x=x).
- }.
Nets are not interpreted as logic program. In many ways they can thought to
define algebraic systems rather than a program,... indeed in most of these
system the Unification Algorithm is not a valid deduction. MATHS provides the
ability to show that the uncertain statements must follow from less doubtful
ones. A special sign (|-) shows that the formula can be derived from
previous formulae. A complete inference engine or automatic theorem prover
will be an uneconomic investment since the engineer can work interactively
with a simpler proof checker. A proof typically has a Natural Deduction
form like this:
- MONOID.
- |-For all z:S, if for all x:S(x+z=x) then z=0
- Let{ (1) z:S. (2) for all x:S(x+z=x).
- (2, x=>0)|-(3) 0+z=0,
- (zero)|-4) 0+z=z
- (3, 4, Euclid)|-z=0.
- }.
I have developed a calculus for constructing packages of documentation. They
can be used in many ways - sets of object, proofs, data description, logistic
systems, notes, ...
Thus EBNF has become a way to document mathematical and logical systems:
- PEANO::=
Net{
- Natural_numbers::Sets,
- succ::Natural_numbers->Natural_Numbers,
- 0::Natural_numbers.
- pred::(Natural_numbers~{0})->Natural_Numbers,
- (Axiom 1): pred(succ)=(_).
- (Axiom 2): For all n:Natural_numbers~{0} ( succ(pred(n)) = n ).
- (Axiom 3): For no n:Natural_numbers ( succ(n)=0 ).
- ...
- 1::= succ(0).
- <::= ((1st)=0) & ~( (2nd)= 0) | pred(1st)<pred(2nd).
- |-ORDER( Set=> Natural_numbers, Relation=>(<) ).
- >::= (2nd)<(1st).
- <=::= <|=.
- >=::= >|=.
- ...
- (Axiom 4): For all S:@Natural_numbers, one n:S, all m:S ( n <= m).
- ...
- +::= (2nd)=0; (1st) | ~((2nd)=0); succ((1st))+pred((2nd)).
- *::= (2nd)=0; 0 | (2nd)=1; (1st) | (2nd)>1; (1st)+((1st)*pred(2nd)).
- ^::= (2nd)=0; 1 | (2nd)=1; (1st) | (2nd)>1; (1st)*((1st)^pred(2nd)).
- ...
}
Once named and filed these systems can be referred to in other documents. For
example, a group inherits the structure of a semigroup or monoid:
- GROUP::=
Net{
- (monoid): MONOID.
- m::=Inversion operator::S->S.
- (inverse): For all x:S( x+m(x)=0 ).
}=::
GROUP.
The effect is to formalize what mathematicians have always done and
simultaneously introduce inheritance into computer documentation. I have
converted most of my notes on algebra and topology into this form on a 7K
floppy disk.
Here is an example of a practical application based on a recent publication:
[Snyder 93, Alan Snyder<alan.snyder@eng.sun.com>, The Essence of
Objects: Concepts and Terms, IEEE Software V10n1(Jan 93)pp31-42]
- OBJECTS_0::=
Net{
All objects embody an abstraction. Objects provide services.
Clients issue requests. Objects are encapsulated. Requests can identify
objects. New objects can be created. Operations can be generic. Objects can
be classified in terms of their services. Objects can have common
implementations. Objects can share partial implementations.
}=::
OBJECTS_0.
We proceed by abstracting noun phrases from these sentences and replacing them
by sets and relations:
- OBJECTS_1::=
Net{
Objects, Abstractions, Services:: Sets.
- embody::Objects>->Abstractions.
- provides::@(Objects, Services),
provides=(1st) provides (2nd). ...
}=::
OBJECTS_1.
. . . . . . . . . ( end of section Applications) <<Contents | End>>
By extending EBNF to XBNF and then to MATHS we get a powerful way to document
the domain in which software is or will be operating.
"An application domain system model of the equipment and its intended
environment of use is defined. This is a dynamic system model, and defines
the overall states of the system as functions of time. Requirements are
constraints on this model" [Hansen Ravn & Rischel 91 ].
Simple mathematics - sets, mappings, relations, definitions, assertions, etc.-
- can define important ideas. Some details are in the Annex.
MATHS is more abstract than any other computer language. It is designed to
be too powerful for any conceivable computer to use as a programming language
because the language used at the sharp end of a project must be able to define
problems well enough so that we can recognize that they are (1) hard, or (2)
not computable. Problems of efficiency, computability, data vs algorithms,
etc. are handled as part of the implementation. The MATHS description of the
problem and the specification of a requirement should be independent of all
implementations and technology. Ignoring efficiency etc. is difficult at
first. It is essential to describe the purpose and qualities of something
before choosing or testing techniques. Hence:
- Formulae are descriptive, clear, and correct. Efficiency comes later.
- Formulae specify results and properties but do not over determine an
algorithm or data structure.
- Some formula look like programs but they often take intelligence to
interpret and/or code them.
- Some formula can be implemented as data or program.
MATHS & XBNF help solve several outstanding problems in the software process.
First, we can show how the user's data is handled in terms that they
understand. We can guarantee that this understanding is accurately modelled
in the software by using rigorous design techniques. Second, Strachey pointed
out the need to discover data structures before attempting functional design
[Strachey 66]. The sets and maps found by formal analysis connect facts to
conceptual data structures. Standard data base and data structures techniques
can then implement the conceptual solution to the problem.
Projects that use MATHS can "zero-in" on the best software. First: Define
things that appear in the description of the problem and (1) are outside the
software, (2) interact with the software, and (3) are individually
identifiable by the software. Second: Define events, entities, identifiers,
relationships, attributes, facts, and patterns. These steps are an exercise
in stating the obvious. Facts are recorded and verified by experts and by
observation. Simple facts lead to validation of algorithms, data structures,
and data bases.
If a system interacts with another system then it will have an internal model
of the external system. An abstract model of the external system is a model
for the internal system and so a structure for a design. Computer Science
already teaches the techniques to implement the designs plus the calculus of
sets, maps, and relations. So software design can be based on science. There
is also a natural graphic representation (TEMPO). This is a basis for Software
Engineering.
Annex3 Appendix
Introduction
Place in the Software Process
This summarizes the notation (MATHS) I have developed for encoding discrete
mathematics in ASCII so that software engineers don't have to make an
investment in new hardware and software to do formal methods. It was
developed to fit with notations that are already in use and to take advantage
of programming language knowhow whenever possible. A formal definition has
been developed and has been submitted to a series of publishers. I keep an
online summary, dictionary etc in a directory called "math" in my home
directory (~dick/math or ~rbotting/math).
Formal Analysis proceeds by stating the obvious and so certain truths. Simple
facts often delimit the set of logically correct structures. Grammars and
dictionaries are used to document the data and dynamics of a domain.
Relations can specify processes in a structured dictionary. Sets and maps
define entity types, abstract data types, and classes of objects of the
problem domain and are used to document requirements and specifications. When
implemented, maps become paths through the data structures and data bases. So
they define the structure shared by solution and problem.
Project documentation also describes the physical constraints caused by the
technology used. Again there are two extremes: free and inconsistent. Some
where between them is the best level. This records every assumption that can
be made about the implementation technology. It is the canonical
documentation of the technology being used when the software engineers are its
clients and users.
Some documentation says too little, and some says too much. Intuitively we
have a category of logical systems. In this category of documentation we can
distinguish two extremes. There is a free, initial, or universal repelling
object in the category of documented objects. Free or initial documentation
is the weakest useful documentation. At the other extreme there are final,
inconsistent, or universally attractive documentation - this overdefines the
situation to the point where it can not exist. In between there is the level
of documentation that reflects exactly what the client understands the
situation to be. I call this the canonical documentation. It forms the
cannon or Bible for the project. It shows the user or client that we
understand all their terminology. Canonical documentation shows the user's
world in terms that the users and clients understand. The test of canonical
documentation is that clients and users certify it as complete and accurate.
Canonical documents are the strongest useful conceptual documentation.
This annex summarizes the notation I have developed to express discrete
mathematics in ASCII.
. . . . . . . . . ( end of section Introduction) <<Contents | End>>
A proposition can have two possible meanings - true and false. In MATHS the
symbol @ represents the Boolean type with two standard values {true, false}.
The following table defines the operators.
P true false Syntax Type
not P false true prefix @->@
P true true false false
Q true false true false Syntax Type
P and Q true false false false infix(serial) @><@->@
P or Q true true true false infix(serial) @><@->@
P iff Q true false false true infix(parallel) @><@->@
if P then Q true false true true natural @><@->@
Unlike most programming languages, mathematics has two kinds of infix
operator:
- Serial: P1 and P2 and P3 = (P1 and P2) and P3
- Parallel: P1 iff P2 iff P3 = (P1 iff P2) and (P2 iff P3).
A predicate is a proposition that can have variables and quantifiers. MATHS
spells quantifiers out and includes numerical quantifiers like "one", "no",
"0..1" as well as "all" and "some". The syntax attempts to make the writing of
formal statements as natural as possible. In the following X is a set or a
type, xis a variable and W(x) a proposition containing x in a number of
places.
for all x:X(W(x))::=For all x of type X, W(x) is true,
for x:X(W(x))::= for all x:X(W(x)),
for no x:X(W(x))::=for all x:X(not W(x)),
for some x:X(W(x))::=not for no x:X(W(x)).
for 0 x:X(W(x))::= for no x:X(W(x) ),
for 1 x:X(W(x))::= for some x:X( W(x) and for 0 y:X(x<>y and W(x)) ),
for 2 x:X(W(x))::= for some x:X( W(x) and for 1 y:X(x<>y and W(y)) ).
for 3 x:X(W(x))::= for some x:X( W(x) and for 2 y:X(x<>y and W(y)) ).
- . . .
for 0..1 x:X(W(x))::=for all y,z:X( if W(y) and W(z) then y=z),
- . . .
Equality is a special predicate that is defined for all universes of
discourse. It is symbolized by "=". "<>" is (as in Pascal and BASIC) used
for inequality. Equality is a serial operator on objects in each universe. A
type is defined to a maximal collection of objects for which the lower
predicate calculus with equality can be applied.
For any type T there is an another type symbolized by @T with objects that
are sets of objects of type T. In MATHS [unlike Pascal but as in Ada] a
subset of a type is not a type, but a set. Each set is made of elements of
one type and there type is determined by the type that the set does belong
to. If a set A is of type @T then As elements must be of type T. It
is therefore unambiguous to use sets to declare the type of a new variable [
see Brinch Hansen's critique of Pascal in the Edison Papers]. In MATHS the
symbol for the type is also a symbol for the universal set of that type. It
is also used in expressions to indicate the type of a value - if e can have
values of different types then T(e)=(e).T is the value in T.
- Expression Meaning Type Example.
Sets {} null or empty set Ambiguous Numbers < 1 and >2
- T universal set @T Numbers
- {e1} singleton @T (e is of type T) {1}
- {e1,e2,e3,...} extension @T (e1,e2,...are type T) {1,2,3}
- {x:T || W(x) } intention @T {n:Nat||n*n=4}={2}
${D} Set satisfying documentation D ${x,y::Real. x^2+y^2=1.0}
- N Set satisfying documentation Named N MONOID
- S1 & S2 intersection @T><@T -> @T ..n&m..=n..m
- S1 | S2 union @T><@T -> @T ..n-1 | m+1..
- S1 ~ S2 complement @T><@T -> @T = Nat ~(n..m)
Predicates e1 in S1 membership T><@T -> @ 2 in 1..
- S1<> S2 set inequality @T><@T -> @ 1..<>0..
- S1 = S2 set equality @T><@T -> @ 1..2={1,2}
- S1==>S2 subset or equal @T><@T -> @ 1..2==>1..3
- S1=>>S2 proper subset @T><@T -> @ 1..2=>>1..3
- S1 >==S2 S1 partitions into S2 @T><@T -> @ {1,2,3}>==
- {{1,2},{3}}
- S1 non-empty set @ (overloaded)
Short hand S1 are S2::=S1==>S2, x is in S2::=x in S2.
We can use a set anywhere a type could be used, for example (when A:@T):
for all x:A(W)::=for all x:T (if x in A then (W))
In English and mathematics the common idiom puts a type name before the
variable - for example:
- For all real x>0 there are two real y's such that y^2=x,
or in MATHS
- for all Real x where x>0, 2 Real y where y^2=x,
meaning
- for all x:Real (if x>0 then for 2 y:Real (y^2=x) ).
Another useful form is like There are two elevators and five floors. MATHS
form is 2 Elevators and 5 Floors. The general form is when Q is any
quantifier(all,some,none, 1, 2...) or number and A:@T,
- Q A::= for Q x:T (x in A).
Thus|-for all Sets A, no A iff A={}.
- |-1 A iff for some a:T(A={a}). . . .
MATHS has n-tupls and lists. They are modeled as maps from subsets of the
integers.
- n..m::@Integer={i:Integer || n<=i<=m}.
- For T:Types, %T::= { x:I->T || for some n,m:Integer, I=n..m},
For each Type T MATHS defines maps 1st, 2nd, 3rd, 4.th, ..., n.th,
... last from %T into T that have the obvious meanings; plus a
'prefix/postfix' operation(?) that generates all lists/n-tupls:
- For t:T, l:%T, t?l and l?t in %T.
- For t:T, l:%T, t=1st(t?l) and l= rest(t?l) and t=last(l?t) and
l=front(l?t).
- |-(1st)?(rest) = (_) = (front) ? (last).
- head::=1st, tail::=2nd.
%%T, are lists of lists not lists of elements:
- ((1), (1,2), (1,2,3)) is a different type of object to (1,2,3).
Strings are also part of the MATHS - #T is the type of object generated by
concatenating objects of type T together. They inherit the operations,
properties, and predicates acting on lists. In addition
- for any x,y:#T, x ! y in #T and (!) is the free associative operation
with a unit () on #T.
So strings=#char and files=#record, etc. If T is the type of elementary
objects then #T is the "free" semigroup. Unlike lists, ##T=#T. Here is a
sample of the kind of definitions are possible using MATHS:
- For t:T, x:#T, A:@T, a:A, b:T~A,
filter(A, ())::=(), filter(A, (a!x))::=a!filter(A,x), filter(A,
(b!x))::=filter(A,x),
q(())::=(), q((t))::=(t), q(t!x)::=q(filter(/<(t), x) ) !
filter({t}, x) ! q(filter(/>(t), x) ).
The function q is from Wainright's Miranda examples[page 148 in Wainwright
92].
All modern formal notations from Ada to Z provide means for creating a new
type of object by (1) listing components and (2)specifying constraints and
operations on these components. In MATHS a collection of declarations defines
a set of labeled tupls. If D is a collection of declarations, definitions,
axioms, theorems and comments then Net{D} is called a set of documentation
and Net{D}=${D} is the set or class of objects that satisfy D.
The complete definition of MATHS structures will be delayed until chapter 5.
Here are some examples of tupls, sets of tupls and sets of sets of tuples. We
can think of these as records, files and data bases respectively. Or we can
think of them as objects, classes, and architectures respectively. In C we
might implement these as structs, arrays and files of code.
Tupls look like this: (a=>xa, b=>xb, ...). A tuple (a=>xa, b=>xb, ...) has
a type defined by one of the following forms:
- S::=${ a::A, b::B, c::C, ... },
or S::=Net{ a::A, b::B, c::C, ... },
or N::=Net{ a::A, b::B, c::C, ... }, S::=N.
or
==============================
______________________________
- S::=N.
- N::=
Net{
a::A,
- b::B,
- c::C,
- ...
}=::N
==============================
Example
- phones::=${name::Strings, number::Strings}.
- |-(name=>"John Smith", number=>"714-123-5678") in phones.
Each label in a MATHS structure can be used to extract components from tupls:
- (name=>"John Smith", number=>"714-123-5678").name="John Smith"
- (name=>"John Smith", number=>"714-123-5678").number="714-123-5678"
The dot fits with Ada, C, Pascal,... and mathematics since '.' projects a
vector into a scalar. MATHS formalizes projections as functions mapping
structured objects to their components. It uses both f(x) and x.f notations
[Knuth 74]. So if x:X and X=${a::A,b::B,c::C,... } then
a(x)=x.a= the a of x and a::X->A,
b(x)=x.b= the b of x and b::X->B,
c(x)=x.c= the c of x and c::X->C.
Because projections are maps, they are relations and have a converse so that
y./a= { x || x.a=y }. MATHS also defines projection operations by lists of
element names:
- x.(a,b) = (x.a, x.b) in A><B,
- x.(a,b,c) = (x.a, x.b, x.c) in A><B><C.
One way to imagine a MATHS structure is to think of it as a record or object
in a high level language or a labeled tupl in a RDBMS. The formal definition
implies a collection of maps. Thus
phones=${name,number::Strings}
implies name in phones->String and number in phones->String.
Formally, phones is the universal (free, initial, or universally repelling)
object in the category of types with names and numbers. In other words: all
types with two functions called name and number mapping objects into the
Strings, is modeled by phones. We can map phones into any other type without
loosing the information about names and numbers. Therefore phones encodes all
sets of objects with names and numbers(details in Chapter 5).
Given the idea of a tupl and a structured type it is easy to allow a notation
for collections of tupls that satisfy a given constraint - for example the set
of phones in area code "714" might be written as
${name,number::String, number in "714" String. }
In MATHS
Net{a::A, b::B, ... z::Z, W(a,b,...z),...}=
{(a=>u,b=>v,...):${a::A, b::B, ... z::Z) || for some u:A, v:B,... (W(u,
v,...) ) }
- The following are not the same:
- {x:A || W(x) } Set of objects in A satisfying W [Chapter 4].
- {x in A, W(x) } Set with Boolean elements, either {true},{false}, or
{true,false}.
Net{x::A, W(x) } Documentation with variable x in set A and
assertion W(x)[Chapter 5].
${x::A, W(x) } Set of objects (x=>a) where a is in A and W(a) is
true.
Given a set or class X:@T then we can talk about its subsets (@X). Finite
sets of similarly structured elements are familiar to computer people - files,
tables, and lists. For example:
Suppose phone_data::=@PHONE
where PHONE::=Net{name::Names, number::Numbers, place::{home,work} },
- Names::@Strings,
- Numbers::@Strings.
So if
P={ (name=>"John Doe", number=>"714-123-5678", place=>home),
- (name=>"John Doe", number=>"3678", place=>work),
- (name=>"Joan Roe", number=>"714-123-5878", place=>home),
- (name=>"Joan Roe", number=>"3878", place=>work)
}
then P in phone_data.
Further name(P) = P.name =
the names in P ={"John Doe","Joan Roe"},
and ("John Doe",work)./(name,work).P.number="3678".
Codd has shown that any set of data can be expressed as simple structured
tables like the above. Thus we can model all data as tuples. A collection of
sets of tuples is a relational data base and forms a category. The objects
in an object oriented design also forms a similar category [Freedman 83].
Structures form a conceptual model of the area being studied that is readily
implemented using modern techniques such as data bases, logic programming, and
object oriented programming[Chapter 5].
Notice that P above also satisfies the pragmatic definition of a telephone
book - we can look up numbers:- Given the items (name,place) they uniquely
determine the remaining items (number):
- P in @PHONE (name,place)->(number).
- the P(name="John Doe", place=work)=(name=>"John Doe", number=>"3678",
place=>work)
We can define the phone books as just those sets in @PHONE that relate a
name and place to an unique number:
- Phone_books::=@PHONE(name,place)->(number).
Identifiers or keys like (name, place) appear in all computer applications.
A special kind of relationship is one with two components - equivalent to a
set of 2-tuples. They are turn up in all formal and natural settings:
- x above y
- x > y
- x = y
- x owns y
- x is_the_mother_of y
The set of all relations between sets A and B is written @(A,B). A relation is
typically defined by a formula like the following:
- For x:A, y:B, x R y::= W(x,y),
where W(x,y) is a well formed formula that returns {true, false}. There is
also a formula for R it self:
- |-rel[x:A,y:B](x R y) = R.
We can also treat relations as sets of pairs or as mappings in context:
- {(x, y):A><B || x R y } , map[x:A,y:B](x R y).
Sections 6.3 and 6.4 below follow the thread of binary relations to derive a
model of program specification and design of surprising power and generality.
MATHS has ways of making relationships with more than two components readable.
First, we can declare n-ary relationships to be a binary relations between
structures.
- pays::@(person, ${it::money, to::person}).
- gives::@(person, ${it::object, to::person}).
- sells::@(person, ${it::object, to::person, at::money}).
- buys::@(person, ${it::object, from::person, at::money}).
- For a,b:person, p:object, m:money,
- a buys(p, from=>b, at=>m)::=b sells (p, to=>a, at=>m),
- b sells(p, to=>a, at=>m)::= a pays (m, to=>b) and b gives (p,
to=>a).
The expressions then simulate normal English: a buys (p, from=>b, at=>m).
There is a mild abuse of notation in that the label 'it=>' is omitted. The
notation is similar to Smalltalk key-word messages: a buys: p from:b at: c
or Grady Booch's Ada style. Further natural predicates make complex logical
functions readable. As an example:
- For a,b:person, m:money, p:object,
(a buys p from b price m)::=a buys(it=>p, from=>b, at=>m).
- |-buys =(1st) buys (2nd;it) from (2nd;from) price (2nd;at).
Further work on a "natural" or "distfix" syntax is speculative.
Standard technology and techniques can be used to implement objects, classes,
sets and mappings - using a DBMS[DATA] or data structures plus OOPs[DATA &
OOP]. An alternative is to treat each entity and relationship as a functor
or predicate in a logic programming system[Keller 87, PROLOG].
A.6 Subclasses and Subtypes
Commonly an entity type (= class = Ada type) has different
forms(=subclasses=Ada Subtypes). For example there are several species of
mammal: MAMMAL(cat), MAMMAL(dog), ... . We get this effect by defining
- MAMMAL::=Net{ ANIMATE, species::{feline, canine, equine, ...},
- cat::@=species=feline, dog::@=species=canine,
- horse::@=species=equine,...
}=::MAMMAL.
A more sophisticated syntax is :
- MAMMAL::=Net{ ANIMATE. Either cat, dog, horse,..., or ape. }.
The syntax "Either...,..,..,or..." is short for a net of Boolean declarations
and formulae. Sometimes there are normal and abnormal forms. In MATHS it is
possible to develop two pieces of documentation for each case and keep what is
common to both in the original [for example see the theory of ORDERS in
Appendix 1]. Another technique is to declare a Boolean variable and use it as
a condition. For example, normal dogs have four legs:
- MAMMAL::=Net{ ... legs::@LIMB, normal::@. if normal then |legs| =4. ...}.
Serial Data
Programs can access data from a source, one datum at a time. They need to put
items into a stream of data one at a time. Commonly the same syntax also
sends streams of messages to other programs and files. MATHS's string and
list operators perfectly model reading and writing serial streams of data.
If we are interested in strings of objects all in a set A. #A is the set of
finite sequences of things in A. MATHS defines the following (See Chapter
4(section 8), chapter 5, and sample at the end of book)
- s ! t::=the concatenation of strings s and t.
- s:!x = ( s'=s !x ) -- put a copy of x 's value at the end of s,
- x!:s = (x ! s = s') -- put a copy of x 's value in front of s [compare
Lam & Shankar 90].
The attach element to string operation is symbolized "?" and the following
are defined:
- e?:s = ( e ? s' = s ) -- e was the first item of S and is removed.
- x'?:s = ( x' ? s' = s )-- the first item is popped off s into x
- s:?x' = ( s' ? x' = s ) -- get the last x from s
There is a small technicality that arises{2} - there must be an explicit "end
of sequence" signal after the end of the data. MATHS uses the word 'end' as a
generic "There ain't gonna be no more" message. By convention 'end' is never
used as data in a string.
Here is an example Data Directed solution to the problem of replacing double
asterisks by a caret symbol[Hoare 78].
- star::="*", caret::="^".
The sets of possible inputs and outputs are I and O respectively:
- I::= #( star star | char) end,
- O::=#( caret | char) end.
Where star star corresponds to caret and I to O:
The variables i and 'o' are used for the standard input and output of a
program so:
- program::=(
- while( i.1st<>end )(
- if "**"!:i then o:!"^"
- else x'?:i; o:!x
- fi
- );
- o:!end; end?:i;
).
The above looks deceptively like pseudocode, but may demand unusual coding
techniques.
. . . . . . . . . ( end of section Appendices) <<Contents | End>>
When one process generates the input to another parallel process it is
not clear whether an empty input is merely as yet ungenerated data, or if the
generator has terminated.