Nonlinear Filters. Simon Haykin
$\left(\mathbf{x}_i, p(\mathbf{x}_i)\right)$ may change the shape of the PDF curve significantly, but it does not affect the value of the summation or integral in (2.95) or (2.96), because summation and integration can be calculated in any order. Since
Definition 2.4 Joint entropy is defined for a pair of random vectors based on their joint distribution as:
(2.98)
Definition 2.5 Conditional entropy is defined as the entropy of a random variable (state vector) conditional on the knowledge of another random variable (measurement vector):
(2.99)
It can also be expressed as:
(2.100)
Definition 2.6 Mutual information between two random variables is a measure of the amount of information that one contains about the other. It can also be interpreted as the reduction in the uncertainty about one random variable due to knowledge about the other one. Mathematically it is defined as:
(2.101)
Substituting the conditional entropy from (2.99) into (2.101), we obtain:
(2.102)
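Assuming the usual textbook forms for discrete random vectors, Definitions 2.4 to 2.6 and the substitution above can be sketched as follows. The symbol $\mathbf{y}$ for the measurement vector, the use of sums rather than integrals, and the unspecified logarithm base are assumptions made here for illustration; the exact forms of (2.98) to (2.102) may differ.

```latex
\begin{align*}
H(\mathbf{x}, \mathbf{y}) &= -\sum_{\mathbf{x}} \sum_{\mathbf{y}} p(\mathbf{x}, \mathbf{y}) \log p(\mathbf{x}, \mathbf{y}), \\
H(\mathbf{x} \mid \mathbf{y}) &= -\sum_{\mathbf{x}} \sum_{\mathbf{y}} p(\mathbf{x}, \mathbf{y}) \log p(\mathbf{x} \mid \mathbf{y})
  = H(\mathbf{x}, \mathbf{y}) - H(\mathbf{y}), \\
I(\mathbf{x}; \mathbf{y}) &= H(\mathbf{x}) - H(\mathbf{x} \mid \mathbf{y})
  = H(\mathbf{x}) + H(\mathbf{y}) - H(\mathbf{x}, \mathbf{y}) \\
  &= \sum_{\mathbf{x}} \sum_{\mathbf{y}} p(\mathbf{x}, \mathbf{y})
     \log \frac{p(\mathbf{x}, \mathbf{y})}{p(\mathbf{x})\, p(\mathbf{y})}.
\end{align*}
```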
Therefore, mutual information is symmetric with respect to the two random vectors.
Definition 2.7 (Stochastic observability) The state (a random vector) is unobservable from the measurement (another random vector) if they are independent or, equivalently, if their mutual information is zero. Otherwise, the state is observable from the measurement.
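As a numerical illustration of Definitions 2.5 to 2.7, the following minimal sketch computes entropy, conditional entropy, and mutual information for discrete joint PMFs. It assumes the standard identities $H(\mathbf{x} \mid \mathbf{y}) = H(\mathbf{x}, \mathbf{y}) - H(\mathbf{y})$ and $I(\mathbf{x}; \mathbf{y}) = H(\mathbf{x}) - H(\mathbf{x} \mid \mathbf{y})$, with $\mathbf{x}$ the state and $\mathbf{y}$ (an assumed symbol) the measurement, and base-2 logarithms. The function names and the two 2x2 joint PMFs are hypothetical and not taken from the text.

```python
import math


def entropy(pmf):
    """Shannon entropy in bits of a PMF given as an iterable of probabilities."""
    # Zero-probability outcomes are skipped (convention: 0 * log 0 = 0).
    return -sum(p * math.log2(p) for p in pmf if p > 0)


def marginals(joint):
    """For joint[i][j] = p(x_i, y_j), return (p(x), p(y)) as lists."""
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]
    return p_x, p_y


def joint_entropy(joint):
    """H(x, y): entropy of the flattened joint PMF."""
    return entropy(p for row in joint for p in row)


def conditional_entropy(joint):
    """H(x | y) = H(x, y) - H(y)."""
    _, p_y = marginals(joint)
    return joint_entropy(joint) - entropy(p_y)


def mutual_information(joint):
    """I(x; y) = H(x) - H(x | y)."""
    p_x, _ = marginals(joint)
    return entropy(p_x) - conditional_entropy(joint)


def transpose(joint):
    """Swap the roles of x and y in the joint PMF."""
    return [list(col) for col in zip(*joint)]


# Hypothetical joint PMFs (illustrative numbers only).
dependent = [[0.5, 0.2],
             [0.05, 0.25]]
# Product of marginals p(x) = (0.3, 0.7) and p(y) = (0.6, 0.4): independent.
independent = [[px * py for py in (0.6, 0.4)] for px in (0.3, 0.7)]

mi_dependent = mutual_information(dependent)
mi_swapped = mutual_information(transpose(dependent))
mi_independent = mutual_information(independent)
```

Under these assumptions, the independent pair yields zero mutual information up to rounding, so the state would be classed as unobservable in the sense of Definition 2.7, while the dependent pair yields a positive value that is the same in both argument orders.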
Since mutual information is nonnegative, (2.101) leads to the following conclusion: if either
2.8 Degree of Observability
Instead of treating observability as a yes/no question, it is helpful in practice to ask how observable a system is [29]. Knowing the answer, we can select, from the variables that can be directly measured, the set of outputs that best improves observability [30]. With this in mind and building on Section 2.7, mutual information can be used as a measure for the degree of observability [31].
An alternative approach to gaining insight into the observability of a system in filtering applications uses the eigenvalues of the estimation error covariance matrix. The largest eigenvalue of the covariance matrix is the variance of the state, or function of states, that is most poorly observable; its corresponding eigenvector gives the direction of poor observability. Conversely, states or functions of states that are highly observable are associated with smaller eigenvalues, whose eigenvectors give the directions of good observability [30].
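The eigenvalue-based reading of the covariance matrix can be illustrated with a small sketch. It assumes a hypothetical 2x2 estimation error covariance matrix and uses the closed-form eigendecomposition of a symmetric 2x2 matrix so that only the standard library is needed; for larger matrices a linear-algebra routine would normally be used instead. The function name and the numbers are illustrative only.

```python
import math


def eig_sym_2x2(P):
    """Eigen-pairs of a symmetric 2x2 matrix [[a, b], [b, c]].

    Returns ((lam_small, v_small), (lam_large, v_large)) with unit eigenvectors.
    """
    (a, b), (_, c) = P
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    lam_small, lam_large = mean - radius, mean + radius

    if abs(b) < 1e-15:
        # Already diagonal: the eigenvectors are the coordinate axes.
        if a <= c:
            v_small, v_large = (1.0, 0.0), (0.0, 1.0)
        else:
            v_small, v_large = (0.0, 1.0), (1.0, 0.0)
    else:
        def unit(v):
            n = math.hypot(v[0], v[1])
            return (v[0] / n, v[1] / n)

        # (b, lam - a) satisfies (P - lam * I) v = 0 whenever b != 0.
        v_small = unit((b, lam_small - a))
        v_large = unit((b, lam_large - a))

    return (lam_small, v_small), (lam_large, v_large)


# Hypothetical estimation error covariance for a two-state system.
P = [[2.0, 1.5],
     [1.5, 2.0]]

(var_good, dir_good), (var_poor, dir_poor) = eig_sym_2x2(P)
# var_poor is the largest error variance; dir_poor is the direction of poor observability.
# var_good is the smallest error variance; dir_good is the direction of good observability.
```

For this hypothetical matrix, the direction of poor observability lies along the sum of the two states and the direction of good observability along their difference, which suggests that a measurement sensitive to the sum would help most.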
A deterministic system is either observable or unobservable, but for stochastic systems, the degree of observability can be defined as [32]:
(2.103)
which is a time‐dependent non‐decreasing function that varies between 0 and 1. Before starting the measurement process,
2.9 Invertibility
Observability can be