Data Science in Theory and Practice. Maria Cristina Mariani

Toss a coin 10 times. $\mathbf{X}$ = number of tails in the 10 tosses.

\[
f_{\mathbf{X}}(\mathbf{x}) = \mathbf{P}(\mathbf{X} = \mathbf{x}) \quad \text{for all } \mathbf{x}.
\]
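As an illustration of the mass function for the coin example, the sketch below tabulates $\mathbf{P}(\mathbf{X} = k)$ for $k = 0, \dots, 10$. It assumes a fair coin and independent tosses, neither of which is stated in the text, so that $\mathbf{X}$ follows a binomial distribution; the name `binomial_pmf` is ours.

```python
from math import comb

def binomial_pmf(k: int, n: int = 10, p: float = 0.5) -> float:
    """P(X = k), where X counts tails in n independent tosses.

    Assumption: each toss lands tails with probability p (fair coin by default).
    """
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Tabulate the mass function over all possible values of X.
pmf = [binomial_pmf(k) for k in range(11)]

for k, prob in enumerate(pmf):
    print(f"P(X = {k:2d}) = {prob:.6f}")

# A mass function must sum to one over the support.
print("total mass:", sum(pmf))
```

Under these assumptions the most likely outcome is five tails, and the probabilities are symmetric around it.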

      Definition 2.21 (Probability density function) The pdf, $f_{\mathbf{X}}(\mathbf{x})$, of a continuous random variable $\mathbf{X}$ is the function that satisfies

\[
F(\mathbf{x}) = F(x_1, \dots, x_n) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_n} f_{\mathbf{X}}(t_1, \dots, t_n) \, dt_n \cdots dt_1 .
\]
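To make the relation between the density and the distribution function concrete, the following sketch approximates $F(x_1, x_2)$ by a midpoint rule and compares it with a closed form. The density used here, two independent exponential components with rate 1, is our own choice for illustration and does not come from the text.

```python
from math import exp

def joint_density(t1: float, t2: float) -> float:
    """Assumed example density: two independent Exp(1) components."""
    if t1 < 0 or t2 < 0:
        return 0.0
    return exp(-t1) * exp(-t2)

def joint_cdf(x1: float, x2: float, steps: int = 400) -> float:
    """Midpoint-rule approximation of F(x1, x2).

    The assumed density vanishes for negative arguments, so the lower
    limits of integration (-infinity) can be replaced by 0.
    """
    if x1 <= 0 or x2 <= 0:
        return 0.0
    h1, h2 = x1 / steps, x2 / steps
    total = 0.0
    for i in range(steps):
        t1 = (i + 0.5) * h1
        for j in range(steps):
            t2 = (j + 0.5) * h2
            total += joint_density(t1, t2)
    return total * h1 * h2

approx = joint_cdf(1.0, 2.0)
# Closed form for this particular density: product of the two marginal cdfs.
exact = (1 - exp(-1.0)) * (1 - exp(-2.0))
print("numerical:", approx)
print("closed form:", exact)
```

The two printed values should agree to several decimal places; increasing `steps` tightens the approximation at the cost of more density evaluations.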

      We will discuss this notation in detail in Chapter 20.

      Using these concepts, we can define the moments of the distribution. Suppose that $g : \mathbb{R}^{n} \to \mathbb{R}$ is any function. When the joint density exists, the expected value of the random variable $g(X_1, \dots, X_n)$ is:

\[
E[g(X_1, \dots, X_n)] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} g(x_1, \dots, x_n) \, f_{\mathbf{X}}(x_1, \dots, x_n) \, dx_1 \cdots dx_n .
\]
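The expectation integral can be approximated numerically when the density is known. The sketch below does this for $n = 2$ with a midpoint rule. The uniform density on the unit square and the two choices of $g$ are assumptions made for illustration, and `expected_value` is a hypothetical helper, not a library function.

```python
def expected_value(g, density, lower, upper, steps=200):
    """Midpoint-rule approximation of E[g(X1, X2)] over a rectangle.

    `lower` and `upper` are the corners (a1, a2) and (b1, b2). The density
    is assumed to vanish outside this rectangle, which is what allows the
    infinite limits of integration to be replaced by finite ones.
    """
    h1 = (upper[0] - lower[0]) / steps
    h2 = (upper[1] - lower[1]) / steps
    total = 0.0
    for i in range(steps):
        x1 = lower[0] + (i + 0.5) * h1
        for j in range(steps):
            x2 = lower[1] + (j + 0.5) * h2
            total += g(x1, x2) * density(x1, x2)
    return total * h1 * h2

def uniform_density(x1, x2):
    """Assumed example: (X1, X2) uniform on the unit square."""
    return 1.0 if 0.0 <= x1 <= 1.0 and 0.0 <= x2 <= 1.0 else 0.0

corners = ((0.0, 0.0), (1.0, 1.0))
mean_product = expected_value(lambda x1, x2: x1 * x2, uniform_density, *corners)
second_moment = expected_value(lambda x1, x2: x1**2, uniform_density, *corners)
print("E[X1 * X2] ~", mean_product)   # theory: 1/4 for this density
print("E[X1^2]    ~", second_moment)  # theory: 1/3 for this density
```

Choosing $g(x_1, x_2) = x_1$ recovers a first moment and $g(x_1, x_2) = x_1^2$ a second moment, which is how the moments of the next paragraphs arise from this general formula.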

      Now we can define the moments of the random vector. The first moment is a vector

\[
E[\mathbf{X}] = \mu_{\mathbf{X}} =
\begin{pmatrix}
E[X_1] \\
\vdots \\
E[X_n]
\end{pmatrix} .
\]

      The expectation applies to each component of the random vector. Expectations of functions of random vectors are computed just as with univariate random variables. Recall that the expectation of a random variable is its average value.
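Because the expectation acts componentwise, a natural estimate of $\mu_{\mathbf{X}}$ from observed data is the vector of componentwise sample averages. The sketch below shows this on a small made-up sample of three-dimensional observations; the data and the name `mean_vector` are hypothetical.

```python
# Hypothetical sample: four observations of a 3-dimensional random vector.
samples = [
    (1.0, 2.0, 0.5),
    (2.0, 1.0, 1.5),
    (3.0, 4.0, 2.5),
    (4.0, 3.0, 3.5),
]

def mean_vector(rows):
    """Average each component separately, mirroring E[X] = (E[X_1], ..., E[X_n])."""
    n = len(rows)
    dim = len(rows[0])
    return [sum(row[k] for row in rows) / n for k in range(dim)]

mu_hat = mean_vector(samples)
print("estimated mean vector:", mu_hat)
```

Each entry of `mu_hat` depends only on the corresponding column of the sample, which is the sample analogue of the statement that the expectation applies to each component.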

      The second moment requires calculating all pairwise combinations of the components, so the result can be presented in matrix form. The second central moment is the covariance matrix.

\[
\begin{aligned}
\operatorname{Cov}(\mathbf{X}) &= E\big[(\mathbf{X} - \mu_{\mathbf{X}})(\mathbf{X} - \mu_{\mathbf{X}})^{T}\big] \\
&=
\begin{pmatrix}
\operatorname{Var}(X_1) & \operatorname{Cov}(X_1, X_2) & \cdots & \operatorname{Cov}(X_1, X_n) \\
\operatorname{Cov}(X_2, X_1) & \operatorname{Var}(X_2) & \cdots & \operatorname{Cov}(X_2, X_n) \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{Cov}(X_n, X_1) & \operatorname{Cov}(X_n, X_2) & \cdots & \operatorname{Var}(X_n)
\end{pmatrix},
\end{aligned}
\tag{2.1}
\]

      where the superscript denotes the matrix transpose. Since $\operatorname{Cov}(X_i, X_j) = \operatorname{Cov}(X_j, X_i)$, the matrix is symmetric.
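The covariance matrix can be estimated from data by replacing the expectation with a sample average of the outer products of the centered observations. The sketch below does this for a small made-up two-dimensional sample. It divides by $n$ to mirror the expectation (many libraries divide by $n - 1$ by default), and the function names are ours.

```python
# Hypothetical sample: four observations of a 2-dimensional random vector.
samples = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]

def mean_vector(rows):
    """Componentwise sample average."""
    n = len(rows)
    return [sum(row[k] for row in rows) / n for k in range(len(rows[0]))]

def covariance_matrix(rows):
    """Average of (x - mu)(x - mu)^T over the sample.

    Assumption: divides by n (plug-in estimate), not by n - 1.
    """
    n = len(rows)
    dim = len(rows[0])
    mu = mean_vector(rows)
    cov = [[0.0] * dim for _ in range(dim)]
    for row in rows:
        centered = [row[k] - mu[k] for k in range(dim)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += centered[i] * centered[j] / n
    return cov

cov = covariance_matrix(samples)
for line in cov:
    print(line)

# Symmetry: entry (i, j) equals entry (j, i).
is_symmetric = all(
    cov[i][j] == cov[j][i] for i in range(len(cov)) for j in range(len(cov))
)
print("symmetric:", is_symmetric)
```

The diagonal entries are the variances of the individual components and the off-diagonal entries are the pairwise covariances, matching the layout of the matrix above.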

      We note that the covariance matrix is positive semidefinite (nonnegative definite), i.e. for any vector $u \in \mathbb{R}^{n}$, we have $u^{T} \operatorname{Cov}(\mathbf{X}) \, u \geq 0$.
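A quick numerical check of this property is sketched below. For a few directions $u$, it compares the quadratic form $u^{T} \operatorname{Cov}(\mathbf{X}) \, u$ with the variance of the projected data $u^{T}\mathbf{x}$. The sample, the directions, and the helper names are hypothetical, and the covariance is again the plug-in estimate that divides by $n$.

```python
# Hypothetical sample: four observations of a 2-dimensional random vector.
samples = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]

def covariance_matrix(rows):
    """Plug-in covariance estimate (divides by n, an assumption of this sketch)."""
    n, dim = len(rows), len(rows[0])
    mu = [sum(row[k] for row in rows) / n for k in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for row in rows:
        d = [row[k] - mu[k] for k in range(dim)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += d[i] * d[j] / n
    return cov

def quadratic_form(u, matrix):
    """Compute u^T M u."""
    size = len(u)
    return sum(u[i] * matrix[i][j] * u[j] for i in range(size) for j in range(size))

def variance(values):
    """Variance of a list of scalars, dividing by n."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

cov = covariance_matrix(samples)
directions = [(1.0, -1.0), (1.0, 1.0), (2.0, -3.0), (0.0, 1.0)]

results = []
for u in directions:
    projected = [sum(ui * xi for ui, xi in zip(u, x)) for x in samples]
    q = quadratic_form(u, cov)
    v = variance(projected)
    results.append((q, v))
    print(f"u = {u}: u^T Cov u = {q:.4f}, Var(u^T x) = {v:.4f}")
```

In every direction the two quantities coincide and are nonnegative, which is the sample version of the semidefiniteness statement.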

      Now we explain why the covariance matrix has to be positive semidefinite. Take any vector $u \in \mathbb{R}^{n}$. Then the product $u^{T}\mathbf{X}$ is a one-dimensional random variable, so its variance is nonnegative, and

\[
\begin{aligned}
\operatorname{Var}(u^{T}\mathbf{X}) &= E\big[(u^{T}\mathbf{X} - u^{T}\mu_{\mathbf{X}})^{2}\big] \\
&= E\big[(u^{T}\mathbf{X} - u^{T}\mu_{\mathbf{X}})(u^{T}\mathbf{X} - u^{T}\mu_{\mathbf{X}})^{T}\big] \\
&= E\big[u^{T}(\mathbf{X} - \mu_{\mathbf{X}} \cdots
\end{aligned}
\]