# Computer Vision: Models, Learning, and Inference

Language: English

Pages: 598

ISBN: 1107011795

Format: PDF / Kindle (mobi) / ePub

This modern treatment of computer vision focuses on learning and inference in probabilistic models as a unifying theme. It shows how to use training data to learn the relationships between the observed image data and the aspects of the world that we wish to estimate, such as the 3D structure or the object class, and how to exploit these relationships to make new inferences about the world from new image data. With minimal prerequisites, the book starts from the basics of probability and model fitting and works up to real examples that the reader can implement and modify to build useful vision systems. Primarily meant for advanced undergraduate and graduate students, the detailed methodological presentation will also be useful for practitioners of computer vision.

- Covers cutting-edge techniques, including graph cuts, machine learning, and multiple view geometry.
- A unified approach shows the common basis for solutions of important computer vision problems, such as camera calibration, face recognition, and object tracking.
- More than 70 algorithms are described in sufficient detail to implement.
- More than 350 full-color illustrations amplify the text.
- The treatment is self-contained, including all of the background mathematics.
- Additional resources at www.computervisionmodels.com.

state $\Pr(w) = \text{Bern}_w[\lambda]$ and apply Bayes' rule:

$$\Pr(w = 1 \mid x) = \frac{\Pr(x \mid w = 1)\Pr(w = 1)}{\sum_{k=0}^{1}\Pr(x \mid w = k)\Pr(w = k)}. \tag{7.4}$$

All of these terms are simple to compute, and so inference is very easy and will not be discussed further in this chapter.

Figure 7.2 Class conditional density functions for the normal model with diagonal covariance. Maximum likelihood fits based on 1000 training examples per class. a) Mean for background data µ0 (reshaped from 10800 × 1
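As a hedged illustration of equation 7.4 (not the book's implementation), the two-class posterior can be computed directly. The 1D normal class-conditional densities and all parameter values below are invented purely for the sketch:

```python
# Sketch of inference via Bayes' rule (equation 7.4) for a two-class model
# with 1D normal class-conditional densities. All parameter values are
# hypothetical, chosen only to illustrate the computation.
import math

lam = 0.5                                      # prior Pr(w = 1) = Bern[lambda]
class_params = {0: (0.0, 1.0), 1: (3.0, 1.5)}  # (mean, std) per class, made up

def normal_pdf(x, mu, sigma):
    """Univariate normal density Pr(x | w = k)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def posterior_w1(x):
    """Pr(w=1|x) = Pr(x|w=1)Pr(w=1) / sum_k Pr(x|w=k)Pr(w=k)."""
    priors = {0: 1.0 - lam, 1: lam}
    weighted = {k: priors[k] * normal_pdf(x, mu, sd)
                for k, (mu, sd) in class_params.items()}
    return weighted[1] / (weighted[0] + weighted[1])
```

The denominator is the same sum over both classes as in equation 7.4, so the two posteriors automatically sum to one.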

constituent Gaussian, we update the parameters {λ_k, µ_k, Σ_k}. The ith data point x_i contributes to these updates according to the responsibility r_ik (indicated by the size of the point) assigned in the E-step; data points that are more associated with the kth component have more effect on the parameters. Dashed and solid lines represent the fit before and after the update, respectively.

Figure 7.10 Fitting a mixture of two Gaussians to 2D data. a) Initial model. b) E-step.
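The E-step responsibilities and the responsibility-weighted M-step updates described above can be sketched as follows; this is a minimal 1D version (the book's examples use 2D data and full covariances), with illustrative rather than book-derived parameters:

```python
# Minimal 1D sketch of one EM iteration for a mixture of Gaussians.
# E-step: r_ik = lambda_k N(x_i; mu_k, sigma_k) / sum_j lambda_j N(x_i; mu_j, sigma_j)
# M-step: each parameter update is weighted by the responsibilities r_ik.
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def e_step(data, lambdas, mus, sigmas):
    """Return responsibilities R[i][k]; each row sums to one."""
    R = []
    for x in data:
        weighted = [l * normal_pdf(x, m, s) for l, m, s in zip(lambdas, mus, sigmas)]
        total = sum(weighted)
        R.append([w / total for w in weighted])
    return R

def m_step(data, R, K):
    """Update lambda_k, mu_k, sigma_k from the responsibilities (1D case)."""
    N = len(data)
    lambdas, mus, sigmas = [], [], []
    for k in range(K):
        rk = [R[i][k] for i in range(N)]
        Nk = sum(rk)                              # effective count for component k
        mu = sum(r * x for r, x in zip(rk, data)) / Nk
        var = sum(r * (x - mu) ** 2 for r, x in zip(rk, data)) / Nk
        lambdas.append(Nk / N)
        mus.append(mu)
        sigmas.append(math.sqrt(var))
    return lambdas, mus, sigmas
```

Alternating `e_step` and `m_step` until the parameters stop changing is exactly the fitting procedure illustrated in Figure 7.10.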

but this is unnecessarily large). The results are compared to classification based on a single normal distribution. The subsequent columns of the table show results for systems trained and tested with grayscale 24 × 24 pixel regions and grayscale 24 × 24 pixel regions that have been histogram equalized (Section 13.1.2). There are two insights to be gleaned from these classification results. First, the choice of model does make a difference; the mixture of Gaussians density always results in better
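The histogram equalization preprocessing mentioned above (Section 13.1.2) remaps intensities through the normalized cumulative histogram. A minimal pure-Python sketch, assuming 8-bit grayscale values and a flat list of pixels (the book's patches would be 24 × 24 regions flattened the same way):

```python
# Hedged sketch of histogram equalization for an 8-bit grayscale patch.
# Each intensity is mapped through the normalized cumulative histogram,
# spreading the intensity distribution across the full range.
def histogram_equalize(pixels, levels=256):
    """Return equalized pixel values in [0, levels - 1]."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf, running = [], 0
    for h in hist:                    # cumulative histogram
        running += h
        cdf.append(running)
    n = len(pixels)
    return [round((levels - 1) * cdf[p] / n) for p in pixels]
```

This kind of normalization removes overall brightness and contrast variation between patches, which is why it improves the classification results reported in the table.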

Directed graphical model with three nodes. There is only one conditional independence relation implied by this model: the node x_3 is the Markov blanket of node x_2 (shaded area) and so x_2 ⊥⊥ x_1 | x_3, where the notation ⊥⊥ can be read as "is independent of". b) This undirected graphical model implies the same conditional independence relation. c) Second directed graphical model. The relation x_2 ⊥⊥ x_1 | x_3 is no longer true, but x_1 and x_2 are independent if we don't condition on x_3, so we can write x_2
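The conditional independence relation x_2 ⊥⊥ x_1 | x_3 can be verified numerically for a chain-structured model factorizing as Pr(x_1)Pr(x_3 | x_1)Pr(x_2 | x_3). The binary variables and conditional probability tables below are made up solely for the check:

```python
# Numerical check of x2 _||_ x1 | x3 for the chain factorization
# Pr(x1, x2, x3) = Pr(x1) Pr(x3 | x1) Pr(x2 | x3).
# All probability tables are hypothetical, chosen only for illustration.
p_x1 = {0: 0.6, 1: 0.4}
p_x3_given_x1 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}   # p[x1][x3]
p_x2_given_x3 = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.25, 1: 0.75}} # p[x3][x2]

def joint(x1, x2, x3):
    return p_x1[x1] * p_x3_given_x1[x1][x3] * p_x2_given_x3[x3][x2]

def cond_x2(x1, x3, x2):
    """Pr(x2 | x1, x3); for this factorization it must not depend on x1."""
    num = joint(x1, x2, x3)
    den = sum(joint(x1, v, x3) for v in (0, 1))
    return num / den
```

Because the factor containing x_2 depends only on x_3, conditioning on x_3 renders x_1 irrelevant, which is exactly what the check confirms.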

δ modifies the average value of the mean vectors. Here Γ_D[•] is the multivariate gamma function and Tr[Ψ] returns the trace of the matrix Ψ (see Appendix C.2.4). For short we will write

$$\Pr(\mu, \Sigma) = \text{NorIWis}_{\mu,\Sigma}[\alpha, \Psi, \gamma, \delta]. \tag{3.18}$$

The mathematical form of the normal inverse Wishart distribution is rather opaque. However, it is just a function that produces a positive value for any valid mean vector µ and covariance matrix Σ, such that when we integrate over all possible values of µ and Σ,
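One concrete way to build intuition for the distribution in equation 3.18 is to sample from it. Assuming the standard normal inverse Wishart parameterization (α as degrees of freedom, Ψ as the scale matrix, γ scaling the mean's covariance, δ as the mean of the mean), which I am taking as a match for the book's NorIWis notation, ancestral sampling can be sketched with SciPy:

```python
# Hedged sketch: ancestral sampling from a normal inverse Wishart prior,
# assuming the standard parameterization matches the book's NorIWis[alpha,
# Psi, gamma, delta]. Parameter values in the test are illustrative.
import numpy as np
from scipy.stats import invwishart, multivariate_normal

def sample_niw(alpha, Psi, gamma, delta, rng):
    """Draw Sigma ~ InvWishart(alpha, Psi), then mu ~ Norm(delta, Sigma/gamma)."""
    Sigma = invwishart(df=alpha, scale=Psi).rvs(random_state=rng)
    mu = multivariate_normal(mean=delta, cov=Sigma / gamma).rvs(random_state=rng)
    return mu, Sigma
```

Every draw is a plausible (µ, Σ) pair for a normal distribution, which is the sense in which the normal inverse Wishart acts as a prior over normal parameters.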