Designing Functional Program in LISP
Functional programming works with a limitted set of constructs
so there is less guess work needed to design a program.
There is no room for improvisation when handling a given data
structure. Essentially the data (arguments) is repeatedly classified
and decomposed until it can be processed by functions that are (1)
predefined, (2) defined later, or (3) a reuse of the function
currently being defined.
The key point in LISP programming is to discard habits that waste your
time. Firstly: if you need a high speed program you wouldn't choose
LISP to implement it - so delay all worrying about efficiency until
there is a clear if slow solution. Secondly: Don't try looking for
assignments, loops and sequences. Third: Don't waste time
planning input and output - a LISP function is given some lists as
data by the interpretter and outputs a list as a result to the
standard output. As a result the functions can be used (and reused)
in any other function definiton.
In LISP, most functions can be designed by using the fact that the
arguments are always lists plus the fact that a list is in one of
three forms:
(nil, atom, pair)
Thus a typical structure that appears in most programs is
(COND
((NULL L) null_case)
((ATOM L) atomic_case(L))
( T pair_case( (CAR L), (CDR L),...) )
).
For example:
(DEFINE (length L)
(COND
((NULL L) 0)
((ATOM L) 1)
( T (1+ (LENGTH (CDR L)) ))
)
)
Thus we can prove that (length '(a b c)) = 3 as follows:
(length '(a b c)) = (LENGTH '(A B C))
= (1+ (LENGTH '(B C)))
= (1+ (1+ (LENGTH '(C))))
= (1+ (1+ (1+ (LENGTH '()))))
= (1+ (1+ (1+ 0 )))
= (1+ (1+ 1))
Note - you should always try a special case by hand like this before
running a LISP function.
Using the IF function:
(DEFINE ( length L)
(IF (NULL L) 0
( IF (ATOM L) 1
(+ 1 (LENGTH (CDR L)) )
)
)
)
In general the expression is
(COND
((NULL L) null_case)
((ATOM L) atomic_case)
( T pair_case )
).
Here null_case, atomic_case and pair_case are LISP expressions.
It has been shown that as long as only pair_case contains recursive
calls then the above will always halt and will define a unique result.
Similarly atoms are either numbers, strings, or words and so this
structure is common
(COND
((NUMBERP L) number_case(L) )
((STRINGP L) string_case(L) )
( T other_case(L) )
).
For example to add up all the numbers found in a structure:
(DEFINE (sum_numbers L)
(COND
((NULL L) 0 )
((NUMBERP L) L )
((ATOM L) 0 )
( T (+ (sum_numbers (CAR L)) (sum_numbers (CDR L)) ))
)
)
Again we can prove that (sum_numbers '( 1 ( A 2) 3) )= 6 by
tracing the evaluation of the function:
(sum_numbers '( 1 ( A 2) 3) )=(+ (sum_numbers 1) (sum_numbers '( A 2) 3) )
=(+ 1 (sum_numbers '(( A 2) 3) )
=(+ 1 (+ sum_numbers '( A 2))
sum_numbers '( 3) ) ))
=(+ 1 (+ sum_numbers '( A 2)) 3 ))
=(+ 1 (+ (+ (sum_numbers '(A))
(sum_numbers '(2)) 3 ))
=(+ 1 (+ (+ 0 2) 3 ))
=(+ 1 (+ 2 3 ))
=(+ 1 5)
=6.
A function to double all the numbers has a similar structure but
different operations:
(DEFINE (double_numbers L)
(COND
((NULL L) 0 )
((NUMBERP L) (+ L L) )
((ATOM L) L)
( T (cons (double_numbers (CAR L)) (double_numbers (CDR L)) )
)
)
)
Exercise. Prove that
(double_numbers '( 1 ( A 2) 3) ) = ( 2 ( A 4) 6 ).
Exercise. Prove that
(double_numbers ( sum_numbers L)) = (duble_numbers(double_numbers L)).
Sometimes the conditions depend on numeric parameters. Each pair in
the COND handles a special number. The method is to look for simple
cases plus invariants for the general cases. As a rule, when we have
integer arguments then we try to divide the problem until we either
reduce one towards a zero, or increase one up to a special value. One
template is:
(DEFINE (foo i)
(cond
( (<= i 0) zero-case )
( T (general_case ))
)
)
where the general_case is an expression that has i and (foo (1- i))
as elements. The theory of integer functions that can be defined by recursion
was one of the earliest pieces of computer science theory to be worked
out. It shows that they can do anything that can be done by any
other type of computer.
Sometimes we need to define operations on integers and lists together.
For example, to get the ith through jth items out of a list
(sublis i j x) we can handle out a lot of special cases like this
(DEFINE (sublist i j x)
(COND
((NOT (AND (NUMBERP i) (NUMBERP j)))
(print "Sublist: Nonnumeric subscripts!")
NIL
)
; else i and j must be numeric
((< i 1)
(print "Sublist: First subscript was < 1")
(sublist 1 j x)
)
; else 1 <= i
((> i j)
NIL
)
; else 1 <= i <= j
((> j (length x))
(print "Sublist: Second subscript was too big")
(sublist i (length x) x)
)
; else 1 <= i <= j <= length(x)
;(1)...
; else 1 < i <= j <=length(x)
;(2)...
)
)
We are left with two general cases and use two facts:
(1) When i=1,
sublist(1, j, (a b c ...)) = (a, sublist(1, j-1, (b,c,...))),
(2) When i>1,
sublist(i, j, (a b c ...)) = sublist( i-1, j-1, (b,c,...))
This gives:
(DEFINE (sublist i j x)
(COND
;...see above
; 1 <= i <= j <= length(x)
((= i 1) (cons (car x) (sublist 1 (1- j) (cdr x))) )
; 1 < i <= j <=length(x)
( T (sublis (1- i) (1- j) (cdr x)) )
)
)
Sometimes we introduce an auxilary function. For example, the sublist function above repeatedly calls itself and rechecks the data for being numeric each time. Compare it to:
(DEFINE (sublist i j x)
(COND
((NOT (AND (NUMBERP i) (NUMBERP j)))
(print "Sublist: Nonnumeric subscripts!")
NIL
)
; else i and j must be numeric
((< i 1))
(print "Sublist: First subscript was < 1")
(sublistok 1 j x)
)
; else 1 <= i
((> j (length x))
(print "Sublist: Second subscript was too big")
(sublistok i (length x) x)
)
( T (sublistok ( i j x))
)
)
(DEFINE (sublistok i j x)
; i, j numeric, 1 <= i, and j<=(length x)
(COND
((> i j)
NIL
)
; 1 <= i <= j <= length(x)
((= i 1) (cons (car x) (sublistok 1 (1- j) (cdr x))) )
; 1 < i <= j <=length(x)
( T (sublisok (1- i) (1- j) (cdr x)) )
)
)
Exercise. Prove that
(sublist 2 3 '(A B C)) = '(B C).
When each part of a structure has something different done to it then
an extra function be defined for each subproblem.
There is another case when we create an extra function. Because a
function has arguments that operate like local variables, we can often
avoid the need to store a value by calling an auxilary function with
the stored value as the actual parameter. This is often means that the
given problem is replaced by a more general one. Oddly, a general
program is often easier to solve than the original special case.
Beginners don't know this so the have to learn to look for it.
An example is what might happen if someone asked us to write a
function in LISP that used Eratosthene's sieve to generate all primes
less than or equal to a given number n. In Pseudo LISP we might
sketch this solution:
(define (primes n)
(cond
( (<= n 1) NIL )
( (= n 2) '(2) )
( (divisible_by_any n (primes (1- n)) )
(primes (1- n))
)
( T (cons n (primes (1- n)) ))
)
)
Now (primes (1- n)) appears in 3 places and would be re-evaluated each
time... Perhaps we should think in terms of a general function
(add_if_prime n ps)
that can be called with (primes (1- n)):
(add_if_prime n (primes (1- n))
and returns (primes n).
If we had `add_if_prime` then we could write
(define (primes n)
(cond
( (<= n 1) NIL )
( (= n 2) '(2) )
( T (add_if_prime n (primes (1- n)) ))
)
)
Now we can define `(add_if_prime n ps)`
(define (add_if_prime n ps)
; if no element of ps divides n then return list with n added
(cond
( (divisible_by_any n ps) ps )
( T (cons n ps) )
)
)
So we can show that (primes 4)=(primes 3)=(3 2) by these steps:
(primes 4) = (add_if_prime 4 (primes 3) )
Now follow (primes 3) for the moment:
(primes 3) = (add_if_prime 3 (primes 2) )
= (add_if_prime 3 (2) )
= (cons 3 (2) )
= (3 2).
So, returning to (primes 4):
(primes 4) = (add_if_prime 4 (primes 3) )
= (add_if_prime 4 (3 2) )
= (3 2)
The design of the predicate
(divisible_by_any n p)
is simple given (divides n m):
(define (divisible_by_any n p)
(cond
( (null p) NIL )
( (numberp p) (divides p n) )
( (divides (car p) n) T)
( T (divisible_by_any n (cdr p)) )
)
)
If (divides n m) doesn't exist already we can write:
(define (divides n m) (= 0 (rem m n)))
and even (if necesary)
(define (rem n m) (- (* (/ m n) n) m )
)
In summary, LISP programming is almost entirely top-down and data
directed - the program logic follows from the logical decomposition of
the data.