Nonlinear Filters. Simon Haykin
Statistical Modeling
Statistical modeling aims at extracting information about the underlying data mechanism that allows for making predictions. Then, such predictions can be used to make decisions. There are two cultures in deploying statistical models for data analysis [5]:
Data modeling culture is based on the idea that a given stochastic model generates the data.
Algorithmic modeling culture uses algorithmic models to deal with an unknown data mechanism.
An algorithmic approach has the advantage of being able to handle large complex datasets. Moreover, it can avoid irrelevant theories or questionable conclusions.
Figure 1.1 The encoder of an asymmetric autoencoder plays the role of a nonlinear filter.
Taking an algorithmic approach, in machine learning, statistical models can be classified as [6]:
(i) Generative models predict visible effects from hidden causes, p(visible | hidden).
(ii) Discriminative models infer hidden causes from visible effects, p(hidden | visible).
While the former is associated with the measurement process in a state‐space model, the latter is associated with the state estimation or filtering problem. Machine learning makes it possible to develop a wide range of filtering algorithms that learn the corresponding state‐space models. For instance, an asymmetric autoencoder can be designed by combining a generative model and a discriminative model, as shown in Figure 1.1 [7]. Deep neural networks can be used to implement both the encoder and the decoder, and the resulting autoencoder can then be trained in an unsupervised manner. After training, the encoder can be used as a filter that estimates the latent state variables.
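The relationship between the two model types can be illustrated with a small discrete example. The sketch below is not taken from the book: it assumes a hypothetical two-valued hidden cause (a machine that is either healthy or faulty) and a binary visible effect (a sensor that is either quiet or raises an alarm), with made-up probabilities. The generative direction is the table of conditional probabilities of the visible effect given the hidden cause, which plays the role of a measurement model. The discriminative direction, the probability of the hidden cause given the visible effect, is obtained here from Bayes' rule, which is one simple way of going from one to the other when the prior is known.

```python
# Hypothetical example: all names and numbers are illustrative assumptions.

# Prior belief over the hidden cause, p(hidden).
prior = {"healthy": 0.9, "faulty": 0.1}

# Generative (measurement) model: p(visible | hidden).
likelihood = {
    "healthy": {"alarm": 0.05, "quiet": 0.95},
    "faulty": {"alarm": 0.80, "quiet": 0.20},
}


def posterior(visible, prior, likelihood):
    """Return p(hidden | visible) computed with Bayes' rule."""
    # Joint probability p(visible, hidden) for each hidden value.
    joint = {h: likelihood[h][visible] * prior[h] for h in prior}
    # Evidence p(visible), obtained by summing out the hidden cause.
    evidence = sum(joint.values())
    return {h: joint[h] / evidence for h in joint}


post = posterior("alarm", prior, likelihood)
for hidden, prob in post.items():
    print(f"p({hidden} | alarm) = {prob:.2f}")
```

With these assumed numbers, a single alarm raises the probability of a fault from 0.10 to 0.64. A filter repeats this kind of update over time, with the prior supplied by a model of how the hidden state evolves.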
1.5 Vision for the Book
This book provides an algorithmic perspective on the nonlinear state/parameter estimation problem for discrete‐time systems, where measurements are available at discrete sampling times and estimators are implemented using digital processors. In Chapter 2, guidelines are provided for discretizing continuous‐time linear and nonlinear state‐space models. The rest of the book is organized as follows:
Chapter 2 presents the notion of observability for deterministic and stochastic systems.
Chapters 3–7 cover classic estimation algorithms:
Chapter 3 is dedicated to observers as state estimators for deterministic systems.
Chapter 4 presents the general formulation of the optimal Bayesian filtering for stochastic systems.
Chapter 5 covers the Kalman filter as the optimal Bayesian filter in the sense of minimizing the mean‐square estimation error for linear systems with Gaussian noise. Moreover, Kalman filter variants are presented that extend its applicability to nonlinear or non‐Gaussian cases.
Chapter 6 covers the particle filter, which handles severe nonlinearity and non‐Gaussianity by approximating the corresponding distributions using a set of particles (random samples).
Chapter 7 covers the smooth variable‐structure filter, which provides robustness against bounded uncertainties and noise. In addition to the innovation vector, this filter benefits from a secondary set of performance indicators.
Chapters 8–11 cover learning‐based estimation algorithms:
Chapter 8 covers the basics of deep learning.
Chapter 9 covers deep‐learning‐based filtering algorithms using supervised and unsupervised learning.
Chapter 10 presents the expectation maximization algorithm and its variants, which are used for joint state and parameter estimation.
Chapter 11 presents the reinforcement learning‐based filter, which is built on viewing variational inference and reinforcement learning as instances of a generic expectation maximization problem.
The last chapter is dedicated to nonparametric Bayesian models:
Chapter 12 covers measure‐theoretic probability concepts as well as the notions of exchangeability, posterior computability, and algorithmic sufficiency. Furthermore, it provides guidelines for constructing nonparametric Bayesian models from finite parametric Bayesian models.
Each chapter reviews selected applications of the presented filtering algorithms, which together cover a wide range of problems. Moreover, the last section of each chapter usually points to a few topics for further study.
2 Observability
2.1 Introduction
In many branches of science and engineering, it is common to deal with sequential data generated by dynamic systems. In many applications, it is desirable to predict future observations based on the data collected up to a certain time instant. Since the future is always uncertain, it is preferable to have a measure of confidence in the predictions. A probability distribution over possible future outcomes can provide this information [8]. A great deal of what we know about a system cannot be expressed in terms of quantities that can be directly measured. In such cases, we try to build a model of the system that helps to explain the cause behind what we observe via the measurement process. This leads to the notions of state and state‐space model of a dynamic system. Chapters 3–7 and 9–11 are dedicated to different methods for reconstructing (estimating) the state of dynamic systems from inputs and measurements. Each estimation algorithm has its own advantages and limitations, which should be taken into account when choosing an estimator for a specific application. However, before choosing an estimation algorithm among different candidates, we need to know whether, for a given model of the dynamic system under study, it is possible to estimate the state of the system from inputs and measurements [9]. This critical question leads to the concept of observability, which is the focus of this chapter.
2.2 State‐Space Model
The behavioral