Introduction to Data Abstraction

We already know about procedural abstraction, which suppresses implementation details by treating procedures a black boxes.
Data abstractions allows us to isolate how a compound data object is used from the detail from the details of its actual representation.

That is our programs should use data in such a way as to make no assumptions about the data that are not strictly necessary for performing the task at hand.

There is a concrete data representation behind the abstraction.
The interface is made up of procedures called constructors and selectors.

Example: Arithmetic Operations for Rational Numbers

We want to add, subtract, multiply, divide and test equality with our rational numbes.
Assume we have (make-rat n d), (number x), and (denom x) available as the constructor and selectors. This is wishful thinking, and it is a good technique.
Here is the implementation for addition:

(define (add-rat x y)
  (make-rat (+ (* (numer x) (denom y))
               (* (numer y) (denom x)))
            (* (denom x) (denom y))))

Pairs

A pair is a concrete structure that we create with cons.
We extract parts of the pair with car and cdr.

(define x (cons 1 2))

(car x)
-> 1

(cdr x)
-> 2

This is all glue we need to implement all sorts of complex data structures.
Data objects constructed from pairs are called list-structured data.

Representing rational numbers

Now we can represent rational numbers:

(define (make-rat n d) (cons n d)) 
(define (numer x) (car x)) 
(define (denom x) (cdr x))

To ensure that our rational numbers are always in lowest terms, we need make-rat to divide the numerator and the denominator by their greatest common divisor (GCD).

(define (make-rat n d) 
  (let ((g (gcd n d))) 
    (cons (/ n g) (/ d g))))

Exercises: [[Exercise 2.01|2.1]]

Abstraction Barriers

In general, the underlying idea of data abstraction is to identify for each type of data object a basic set of operation terms of which all manipulations of data objects of that type will be expressed, and then to use only those operations in manipulating the data.

Details on the other side of an abstraction barrier are irrelevant to the code on this side.
This makes programs easier to maintain and modify.

Constraining the dependence on the representation to a few interface procedures helps us design programs as well as modify them, because it allows us to maintain the flexibility to consider alternate implementations.

Exercises: [[Exercise 2.02|2.2]], [[Exercise 2.03|2.3]]

What is Meant by Data?

Data is defined by some collection of selectors and constructors, together with specified conditions that these procedures must fulfill in order to be a valid representation.
For rationals, we have the following definition:
1. We can construct a rational x with (make-rat n d).
2. We can select the numerator with (numer x).
3. We can select the denominator with (denom x).
4. For all values of x, (/ (numer x) (denom x)) must equal \(n/d\).
For pairs, it is even simpler: we need three operations, which will call cons, car, and cdr, such that if z is (cons x y), then (car z) is x and (cdr z) is y.
Any triple of procedures satisfying this definition can be used to implement pairs. In fact, we can do it with procedures themselves:

(define (cons x y) 
  (lambda (m) 
    (if (= m 0) x y))) 
(define (car z) (z 0)) 
(define (cdr z) (z 1))

This doesn't look like data, but it works.
This is how you implement pairs in \(\lambda\)-calculus.
In real Lisp implementations, pairs are implemented directly, for reasons of efficiency. But they could be implemented this way and you wouldn't be able to tell the difference.

The ability to manipulate procedures as objects automatically provides the ability to represent compound data.

This style of programming is often called message passing.

Exercises: [[Exercise 2.04|2.4]], [[Exercise 2.05|2.5]], [[Exercise 2.06|2.6]]

Extended Exercise: Interval Arithmetic

We want to design a system that allows us to manipulate inexact quantities with known precision (uncertainty).
We need arithmetic operations for combining intervals (ranges of possible values).

(define (add-interval x y)
  (make-interval (+ (lower-bound x) (lower-bound y))
                 (+ (upper-bound x) (upper-bound y))))

(define (mul-interval x y)
  (let ((p1 (* (lower-bound x) (lower-bound y)))
        (p2 (* (lower-bound x) (upper-bound y)))
        (p3 (* (upper-bound x) (lower-bound y)))
        (p4 (* (upper-bound x) (upper-bound y))))
    (make-interval (min p1 p2 p3 p4)
                   (max p1 p2 p3 p4))))

(define (div-interval x y)
  (mul-interval
   x
   (make-interval (/ 1.0 (upper-bound y))
                  (/ 1.0 (lower-bound y)))))