Chapter 1. Introduction

Scheme supports many types of data values, or objects, including characters, strings, symbols, lists or vectors of objects, and a full set of numeric data types, including complex, real, and arbitrary-precision rational numbers.

The storage required to hold the contents of an object is dynamically allocated as necessary and retained until no longer needed, then automatically deallocated, typically by a garbage collector that periodically recovers the storage used by inaccessible objects. Simple atomic values, such as small integers, characters, booleans, and the empty list, are typically represented as immediate values and thus incur no allocation or deallocation overhead.

Regardless of representation, all objects are first-class data values; because they are retained indefinitely, they may be passed freely as arguments to procedures, returned as values from procedures, and combined to form new objects. This is in contrast with many other languages where composite data values such as arrays are either statically allocated and never deallocated, allocated on entry to a block of code and unconditionally deallocated on exit from the block, or explicitly allocated and deallocated by the programmer.

Scheme is a call-by-value language, but for at least mutable objects (objects that can be modified), the values are pointers to the actual storage. These pointers remain behind the scenes, however, and programmers need not be conscious of them except to understand that the storage for an object is not copied when an object is passed to or returned from a procedure.
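
The point about storage not being copied can be illustrated with a short sketch. The procedure name fill-first! is hypothetical, chosen only for this example; vector and vector-set! are standard procedures.

```scheme
;; fill-first! is a hypothetical helper, not a standard procedure.
;; It receives a pointer to the caller's vector, not a copy of it.
(define (fill-first! vec x)
  (vector-set! vec 0 x))          ; modifies the shared storage in place

(define v (vector 1 2 3))         ; vector allocates a fresh, mutable vector
(fill-first! v 'changed)
v                                 ; => #(changed 2 3)
```

Because the procedure and its caller share the same storage, the change made inside fill-first! is visible through v afterward.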

Scheme programs share a common printed representation with Scheme data structures. As a result, any Scheme program has a natural and obvious internal representation as a Scheme object. For example, variables and syntactic keywords correspond to symbols, while structured syntactic forms correspond to lists. This representation is the basis for the syntactic extension facilities provided by Scheme for the definition of new syntactic forms in terms of existing syntactic forms and procedures.
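
As a minimal sketch of this correspondence, a quoted expression can be taken apart with ordinary list operations. The variable name expr is an assumption made for the example.

```scheme
;; A quoted program fragment is an ordinary list of symbols and sublists.
(define expr '(* (- x 2) y))

(list? expr)              ; => #t
(length expr)             ; => 3
(car expr)                ; => *
(symbol? (car expr))      ; => #t
(cadr expr)               ; => (- x 2)
```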

Scheme variables and keywords are lexically scoped, and Scheme programs are block-structured. Identifiers may be imported into a program or library or bound locally within a given block of code such as a library, program, or procedure body. A local binding is visible only lexically, i.e., within the program text that makes up the particular block of code. An occurrence of an identifier of the same name outside this block refers to a different binding; if no binding for the identifier exists outside the block, then the reference is invalid. Blocks may be nested, and a binding in one block may shadow a binding for an identifier of the same name in a surrounding block. The scope of a binding is the block in which the bound identifier is visible minus any portions of the block in which the identifier is shadowed. Block structure and lexical scoping help create programs that are modular, easy to read, easy to maintain, and reliable.
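
The following sketch shows shadowing under lexical scoping. The names x and show-x are assumptions made for the example.

```scheme
(define x 'outer)

;; show-x refers to the binding of x visible where show-x is written,
;; namely the outer one, no matter where show-x is later called.
(define (show-x) x)

(let ([x 'inner])            ; shadows the outer x within this block only
  (list x (show-x)))         ; => (inner outer)

x                            ; => outer
```

The call to show-x inside the let still sees the outer binding, since scope is determined by the program text and not by the order of execution.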

In most languages, a procedure definition is simply the association of a name with a block of code. Certain variables local to the block are the parameters of the procedure. In some languages, a procedure definition may appear within another block or procedure so long as the procedure is invoked only during execution of the enclosing block. In others, procedures can be defined only at top level. In Scheme, a procedure definition may appear within another block or procedure, and the procedure may be invoked at any time thereafter, even if the enclosing block has completed its execution. To support lexical scoping, a procedure carries the lexical context (environment) along with its code.
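
A small sketch of a procedure that outlives the block that created it follows. The name make-counter is hypothetical; each procedure it returns carries its own binding for count.

```scheme
;; make-counter returns a procedure that retains access to count
;; even after the call to make-counter has completed.
(define (make-counter)
  (let ([count 0])
    (lambda ()
      (set! count (+ count 1))
      count)))

(define c1 (make-counter))
(define c2 (make-counter))

(c1)    ; => 1
(c1)    ; => 2
(c2)    ; => 1   (c2 has its own, separate count)
```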

As with procedures in most other languages, Scheme procedures may be recursive. Scheme implementations are required to implement tail calls as jumps (gotos), so the storage overhead normally associated with recursion is avoided. As a result, Scheme programmers need master only simple procedure calls and recursion and need not be burdened with the usual assortment of looping constructs.
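
As an illustration, the sketch below expresses a loop as a recursive procedure whose recursive call is a tail call. The names sum-to and loop are assumptions made for the example.

```scheme
;; Sums the integers from 1 through n using an accumulator.
(define (sum-to n)
  (define (loop i acc)
    (if (= i 0)
        acc
        (loop (- i 1) (+ acc i))))   ; tail call: nothing remains to do after it
  (loop n 0))

(sum-to 10)         ; => 55
(sum-to 1000000)    ; => 500000500000
```

Since the call to loop is the last thing loop does, it can be treated as a jump, and the second example runs in constant control-stack space despite a million iterations.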

Scheme supports the definition of arbitrary control structures with continuations. A continuation is a procedure that embodies the remainder of a program at a given point in the program. A continuation may be obtained at any time during the execution of a program. As with other procedures, a continuation is a first-class object and may be invoked at any time after its creation. Whenever it is invoked, the program immediately continues from the point where the continuation was obtained. Continuations allow the implementation of complex control mechanisms including explicit backtracking, multithreading, and coroutines.
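
One of the simplest uses of a continuation is a nonlocal exit. The sketch below assumes the standard procedure call/cc (also available as call-with-current-continuation); the names first-negative and return are hypothetical.

```scheme
;; Returns the first negative element of ls, or #f if there is none.
;; Invoking return abandons the rest of the for-each traversal.
(define (first-negative ls)
  (call/cc
    (lambda (return)
      (for-each
        (lambda (x)
          (if (< x 0) (return x)))
        ls)
      #f)))

(first-negative '(3 1 -4 1 -5))   ; => -4
(first-negative '(1 2 3))         ; => #f
```

This is only the most basic application; the backtracking, multithreading, and coroutine mechanisms mentioned above rely on invoking continuations after the call/cc expression that created them has already returned.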

Scheme also allows programmers to define new syntactic forms, or syntactic extensions, by writing transformation procedures that determine how each new syntactic form maps to existing syntactic forms. These transformation procedures are themselves expressed in Scheme with the help of a convenient high-level pattern language that automates syntax checking, input deconstruction, and output reconstruction.
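
A minimal sketch of a syntactic extension, assuming the syntax-rules pattern language, appears below. The keyword my-or2 is hypothetical and handles exactly two subexpressions.

```scheme
;; (my-or2 e1 e2) evaluates e1 and returns its value if true;
;; otherwise it evaluates and returns e2.
(define-syntax my-or2
  (syntax-rules ()
    [(_ e1 e2)
     (let ([t e1])
       (if t t e2))]))

(my-or2 #f 'fallback)          ; => fallback
(my-or2 1 (car '()))           ; => 1   (the second subexpression is never evaluated)
(let ([t 5]) (my-or2 #f t))    ; => 5   (the user's t is not captured by the t in the expansion)
```

A procedure could not provide this behavior, because a procedure's arguments are all evaluated before the procedure is called.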

Section 1.1. Scheme Syntax

Scheme programs are made up of keywords, variables, structured forms, constant data (numbers, characters, strings, quoted vectors, quoted lists, quoted symbols, etc.), whitespace, and comments.

Keywords, variables, and symbols are collectively called identifiers. Identifiers may be formed from letters, digits, and certain special characters, including ?, !, ., +, -, *, /, <, =, >, :, $, %, ^, &, _, ~, and @, as well as a set of additional Unicode characters. Identifiers cannot start with an at sign ( @ ) and normally cannot start with any character that can start a number, i.e., a digit, plus sign ( + ), minus sign ( - ), or decimal point ( . ). Exceptions are +, -, and ..., which are valid identifiers, and any identifier starting with ->.

There is no inherent limit on the length of a Scheme identifier; programmers may use as many characters as necessary. Long identifiers are no substitute for comments, however, and frequent use of long identifiers can make a program difficult to format and consequently difficult to read. A good rule is to use short identifiers when the scope of the identifier is small and longer identifiers when the scope is larger.

Identifiers may be written in any mix of upper- and lower-case letters, and case is significant, i.e., two identifiers are different even if they differ only in case.
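
The identifier rules above can be checked with a brief sketch. The variable names total and Total are assumptions made for the example, and the results assume an implementation that follows the case-sensitive rule just described.

```scheme
;; Two identifiers that differ only in case name different variables.
(define total 1)
(define Total 2)
(list total Total)         ; => (1 2)

(eq? 'abc 'ABC)            ; => #f

;; The exceptional identifiers are ordinary symbols when quoted.
(symbol? '+)               ; => #t
(symbol? '...)             ; => #t
(symbol? '->int)           ; => #t
(symbol? 'vector-set!)     ; => #t
```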

Structured forms and list constants are enclosed within parentheses, e.g., (a b c) or (* (- x 2) y). The empty list is written (). Matched sets of brackets ( [ ] ) may be used in place of parentheses and are often used to set off the subexpressions of certain standard syntactic forms for readability, as shown in examples throughout this book. Vectors are written similarly to lists, except that they are preceded by #( and terminated by ), e.g., #(this is a vector of symbols). Bytevectors are written as sequences of unsigned byte values (exact integers in the range 0 through 255) bracketed by #vu8( and ), e.g., #vu8(3 250 45 73).
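
The sketch below exercises the list, vector, and bytevector notations using standard procedures. It assumes an implementation that supports the #vu8 syntax and bytevector-u8-ref.

```scheme
(length '(a b c))                                  ; => 3
(null? '())                                        ; => #t
(equal? '[a b c] '(a b c))                         ; => #t   (brackets and parentheses read the same)

(vector-ref '#(this is a vector of symbols) 3)     ; => vector
(vector-length '#(this is a vector of symbols))    ; => 6

(bytevector-u8-ref #vu8(3 250 45 73) 1)            ; => 250
```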

Strings are enclosed in double quotation marks, e.g., "I am a string". Characters are preceded by #\, e.g., #\a. Case is important within character and string constants, as within identifiers. Numbers may be written as integers, e.g., -123, as ratios, e.g., 1/2, in floating-point or scientific notation, e.g., 1.3 or 1e23, or as complex numbers in rectangular or polar notation, e.g., 1.3-2.7i or -1.2@73. Case is not important in the syntax of a number. The boolean values representing true and false are written #t and #f. Scheme conditional expressions actually treat #f as false and all other objects as true, so 3, 0, (), "false", and nil all count as true.
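
A short sketch of these constants in use follows. The procedure truthy? is hypothetical, and nil is written as a quoted symbol, since it has no special meaning in Scheme.

```scheme
;; truthy? reports how a conditional treats its argument.
(define (truthy? x) (if x #t #f))

(map truthy? (list 3 0 '() "false" 'nil #f))   ; => (#t #t #t #t #t #f)

(string-length "I am a string")                ; => 13
(char=? #\a #\A)                               ; => #f   (case matters in characters)
(= 1e3 1E3)                                    ; => #t   (case does not matter in numbers)
(= 1/2 0.5)                                    ; => #t
(exact? 1/2)                                   ; => #t
```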

Scheme expressions may span several lines, and no explicit terminator is required. Since the number of whitespace characters (spaces and newlines) between expressions is not significant, Scheme programs should be indented to show the structure of the code in a way that makes the code as readable as possible. Comments may appear on any line of a Scheme program, between a semicolon ( ; ) and the end of the line. Comments explaining a particular Scheme expression are normally placed at the same indentation level as the expression, on the line before the expression. Comments explaining a procedure or group of procedures are normally placed before the procedures, without indentation. Multiple comment characters are often used to set off the latter kind of comment, e.g., ;;; The following procedures ....

Two other forms of comments are supported: block comments and datum comments. Block comments are delimited by #| and |# pairs, and may be nested. A datum comment consists of a #; prefix and the datum (printed data value) that follows it. Datum comments are typically used to comment out individual definitions or expressions. For example, (three #;(not four) element list) is just what it says. Datum comments may also be nested, though #;#;(a)(b) has the somewhat non-obvious effect of commenting out both (a) and (b).
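
The three comment forms can be seen together in the following sketch, which assumes an implementation that supports block and datum comments. The variable name ls is an assumption made for the example.

```scheme
;;; Comment forms

; A line comment runs to the end of the line.

#| A block comment may span several lines
   #| and may be nested |#
   before it is closed. |#

;; The datum comment removes only the datum that follows the #; prefix.
(define ls '(three #;(not four) element list))
(length ls)            ; => 3

;; Two prefixes in a row comment out the next two datums.
'(#;#;(a) (b) c)       ; => (c)
```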

Section 1.2. Scheme Naming Conventions

Scheme's naming conventions are designed to provide a high degree of regularity. The following is a list of these naming conventions:

  • Predicate names end in a question mark ( ? ). Predicates are procedures that return a true or false answer, such as eq?, zero?, and string=?. The common numeric comparators =, <, >, <=, and >= are exceptions to this naming convention.
  • Type predicates, such as pair?, are created from the name of the type, in this case pair, and the question mark.
  • The names of most character, string, and vector procedures start with the prefixes char-, string-, and vector-, e.g., string-append. (The names of some list procedures start with list-, but most do not.)
  • The names of procedures that convert an object of one type into an object of another type are written as type1->type2, e.g., vector->list.
  • The names of procedures and syntactic forms that cause side effects end with an exclamation point ( ! ). These include set! and vector-set!. Procedures that perform input or output technically cause side effects, but their names are exceptions to this rule.
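
The conventions listed above can be seen in a handful of standard procedures. The variable name v is an assumption made for the example.

```scheme
;; Predicates end in a question mark.
(pair? '(a . b))                  ; => #t
(zero? 0)                         ; => #t
(string=? "abc" "abc")            ; => #t

;; Type-specific procedures carry the type name as a prefix.
(string-append "foo" "bar")       ; => "foobar"

;; Conversions are named type1->type2.
(vector->list '#(1 2 3))          ; => (1 2 3)

;; Side-effecting procedures and forms end in an exclamation point.
(define v (vector 1 2 3))
(vector-set! v 0 'x)
v                                 ; => #(x 2 3)
```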

Section 1.3. Typographical and Notational Conventions

A standard procedure or syntactic form whose sole purpose is to perform some side effect is said to return unspecified. This means that an implementation is free to return any number of values, each of which can be any Scheme object, as the value of the procedure or syntactic form.

The phrase "syntax violation" is used to describe a situation in which a program is malformed. Syntax violations are detected prior to program execution. When a syntax violation is detected, an exception of type &syntax is raised and the program is not executed.