Concepts and Semantics of Programming Languages 1. Therese Hardin
Preface
This two-volume work relates to the field of programming. First and foremost, it is intended to give readers a solid grounding in the bases of functional or imperative programming, along with a thorough knowledge of the module and class mechanisms involved. In our view, the semantics approach is most appropriate when studying programming, as the impact of interlanguage syntax differences is limited. Practical considerations, determined by the material characteristics of computers and/or “smart” devices, will also be addressed. The same approach will be taken in both volumes, using both mathematical formulas and memory state diagrams. With this book, we hope to help readers understand the meaning of the constructs described in the reference manuals of programming languages and to establish solid foundations for reasoning and assessing the correctness of their own programs through critical review. In short, our aim is to facilitate the development of safe and reliable programs.
Volume 1 begins with a presentation of the computer, in Chapter 1, first at the material level – as an assemblage of components – then as a tool for executing programs. Chapter 2 is an intuitive, step-by-step introduction to language semantics, intended to familiarize readers with this approach to programming. In Chapter 3, we provide a detailed discussion on the subject, with a formal presentation of the execution semantics of functional features. Chapter 4 continues with the same topic, looking at the execution semantics of imperative features. In these two chapters, a clear mathematical framework is used to support our presentation. Also, all of the notions which we introduce in these chapters are implemented in both Python and OCaml to assist readers learning about the semantic concepts in question for the first time. Multiple exercises, with detailed solutions, are provided in both cases. Chapter 5, on the subject of typing, begins by addressing typing rules, which are used to check programs; we then present the algorithm used to infer polymorphic types, along with the associated mathematical notions, all implemented in both languages. Finally, the extension of typing to imperative features is addressed. In Chapter 6, we present the main data types and methods of pattern matching, using a range of examples expressed in different programming languages. Chapter 7 focuses on low-level programming features: endianness, pointers and memory management; these notions are mostly presented using C and C++. Volume 1 ends with a discussion of error processing using exceptions, their semantics is presented in OCaml, and the exception management mechanisms used in Python, Java and C++ are also described (see Chapter 8).
Thus, Volume 1 is intended to give a broad overview of the functional and imperative features of programming, from notions that can be modeled mathematically to notions that are linked to the hardware configuration of computers themselves. Volume 2 focuses on modular and object programming, building on the foundations laid down in Volume 1 since modules, classes and objects are, in essence, the means of organizing functional or imperative constructs. Volume 2 first analyzes the needs of developers in terms of tools for software architecture. Based on this study, an original semantic model, called a kit, is drawn up, jointly presenting all the features of the modules and objects that can meet these needs. The semantics of these kits are defined in a rather informal way, as research in this field has not yet led to a mathematical model of this set of features, while remaining relatively simple. From this model, we consider a set of emerging questions, the objective of which is to guide the acquisition of a language. This approach is then exemplified by the study of the module systems of Ada, OCaml and C. Finally, the same approach will be used to deduce a semantic model of class and object features, which will serve to present classes in Java, C++, OCaml and Python from a unified perspective.
This work is aimed at a relatively wide audience, from experienced developers – who will find valuable additional information on language semantics – to beginners who have only written short programs. For beginners, we recommend working on the semantic concepts described in Volume 1 using the implementations in OCaml or Python to ease assimilation. All readers may benefit from studying the reference manual of a programming language, while comparing the presentations of constructs given in the manual with those given here, guided by the questions mentioned in Volume 2.
Note that we do not discuss the algorithmic aspect of data processing here. However, choosing the algorithm and the data representation that fit the requirements of the specification is an essential step in program development. Many excellent works have been published on this subject, and we encourage readers to explore the subject further. We also recommend using the standard libraries provided by the chosen programming language. These libraries include tried and tested implementations for many different algorithms, which may generally be assumed to be correct.
1
From Hardware to Software
This first chapter provides a brief overview of the components found in all computers, from mainframes to the processing chips in tablets, smartphones and smart objects via desktop or laptop computers. Building on this hardware-centric presentation, we shall then give a more abstract description of the actions carried out by computers, leading to a uniform definition of the terms “program” and “execution”, above and beyond the various characteristics of so-called electronic devices.
1.1. Computers: a low-level view
Computer science is the science of rational processing of information by computers. Computers have the capacity to carry out a variety of processes, depending on the instructions given to them. Each item of information is an element of knowledge that may be transmitted using a signal and encoded using a sequence of symbols in conjunction with a set of rules used to decode them, i.e. to reconstruct the signal from the sequence of symbols. Computers use binary encoding, involving two symbols; these may be referred to as “true”/”false”, “0”/”1” or “high”/”low”; these terms are interchangeable, and all represent the two stable states of the electrical potential of digital electronic circuits.
1.1.1. Information processing
Schematically, a computer is made up of three families of components as follows:
– memories: store data (information) and executable code (the so-called von Neumann architecture);
– one or more microprocessors, known as CPUs (central processing units), which process information by applying elementary operations;
– peripherals: these enable information to be exchanged between the CPU/memory couple and the outside.
Information processing by a computer – in other terms, the execution of a program – can be summarized as a sequence of three steps: fetching data, computing the results and returning them. Each elementary processing operation corresponds to a configuration of the logical circuits of the CPU, known as a logic function. If the result of this function is solely dependent on input, and if no notion of “time” is involved in the computations, then the function is said to be combinatorial; otherwise, it is said to be sequential.
For example, a binary half-adder, as shown in Figure 1.1, is a circuit that computes the sum of two binary digits (input), along with the possible carry value. It thus implements a combinatorial logic function.