UNIVERSITY COLLEGE LONDON

EXAMINATION FOR INTERNAL STUDENTS

MODULE CODE: COMPM054
ASSESSMENT PATTERN: COMPM054B
MODULE NAME: Machine Vision (Masters Level)
DATE: 20-May-15
TIME: 10:00
TIME ALLOWED: 2 Hours 30 Minutes

2014/15-COMPM054B-001-EXAM-5
©2014 University College London

Machine Vision, COMP M054, 2015

Answer THREE of FOUR questions.
Marks for each part of each question are indicated in square brackets.
Calculators are NOT permitted.

1. We sometimes use graphical models to illustrate the relationships between variables for a given problem.

   a. Factorize the directed graphical model in Figure 1. [4 marks]

   b. What is the Markov blanket of X2 in Figure 1? [3 marks]

   Figure 1: Directed graphical model for Questions 1.a and 1.b.

   Figure 2: Undirected graphical model for Questions 1.c and 1.d.

   c. Factorize the undirected graphical model in Figure 2. [4 marks]

   d. Illustrate a factor graph representing the undirected model in Figure 2. [3 marks]

   e. For the graphical model depicted in Figure 3, show algebraically that X4 is conditionally independent of X2 given X3. [3 marks]

   Figure 3: Directed graphical model for Question 1.e.

   f. If possible, draw the undirected graphical model that depicts the same independence between X1 and X2 as shown in the directed model in Figure 4. [3 marks]

   Figure 4: Directed graphical model for Question 1.f.

   g. Imagine that your job is to create an algorithm that analyzes handwriting samples. Specifically, for each sample, your algorithm will get an ASCII string and an image of some person's handwritten version of that string. Assume the person made no spelling mistakes when writing out the string, and that the written letters do not overlap (for now). The goal is to associate each pixel in the image with one of the given characters from the string. An example image is shown in Figure 5. For building your system, you can use the output of another algorithm that computes a probabilistic distribution over the set of letters for each vertical slice in the image, P(letter | im_x). Illustrate a graphical model and describe the setup and steps of an inference technique to perform this process efficiently. [8 marks]

   Figure 5: Representative string ("John Canny") and image of handwritten version of that string (horizontal axis from 0 to 140), for Question 1.g.

   h. Next, you must deal with more complicated joined-up handwriting, where the written letters may overlap; see Figure 6. In this case, instead of being given a distribution for each vertical slice, you are given a distribution per pixel, P(letter | im_{x,y}). Explain why the per-slice model is not suitable, and describe an alternative model, along with an appropriate inference technique. Explicitly state any assumptions you are making. [5 marks]

   Figure 6: Representative string ("David Lowe") and image of handwritten version of that string (horizontal axis from 0 to 90), for Question 1.h.

   [Total 33 marks]

2. This question explores generative models of images, where the measured world data is pre-processed to become discrete.

   a. You have a collection of unlabeled test images, and an internet connection. Your algorithm will need to determine the continent on which each test image was taken. Describe the procedure to build a dictionary of visual words. Your answer does not need to be in pseudocode, but should contain enough instructions for a programmer to follow. Explicitly state any assumptions or parameter choices you are making. [10 marks]

   b. Describe the procedure for using the just-created dictionary to encode a test image as a bag of visual words x. Be specific about the steps. [9 marks]

   c. If the dictionary size is extremely large, how would you modify the procedure to encode test images in sub-linear time? [3 marks]

   d. Explain how it is possible for the dictionary to be too large, or how it can be too small. Your answers should describe the impact both problems can have on subsequent classification accuracy. [6 marks]

   e. For this task of classifying the continent for each test image, which would be more appropriate and why, given a choice between the latent Dirichlet allocation model vs. the single-author topic model? [5 marks]

   [Total 33 marks]

3. a. We wish to infer the world state w from some data vector x. Without delving into the specific distributions used, write the equation needed for inference to compute the posterior probability when using a generative model. Specify the name used for each of the terms. [5 marks]

   b. Though we are using a generative model here, explain why a discriminative model could be better in some situations. Also, state which approach is better if some elements of a data vector x_i are missing. [5 marks]

   c. In the generative model of Question 3.a, if we choose to use a Gaussian for each class conditional density function, we will have to learn some parameters. What are the parameters? Give the Maximum Likelihood objective function that the parameters are meant to optimize, assuming each x_i is drawn independently from the set S_n of data with label w = n, i.e. where n is the class index, and w is the random variable for world state. [6 marks]

   d. What are the advantages and disadvantages of using a Gaussian or other parametric density function for modeling the relationship between data and the world state? [4 marks]

   e. How is a Mixture-of-Gaussians (MoG) distribution better than a Student t-distribution? [2 marks]

   f. How is a Student t-distribution better than a Mixture-of-Gaussians distribution? [2 marks]

   g. In the Student t-distribution, what role does the hidden variable play, and how is it modeled? [3 marks]

   h. Explain how the Expectation-Maximization (EM) algorithm is similar to the k-means algorithm, spelling out which part is analogous to the Expectation-step (E-step) and which is the Maximization-step (M-step). Also, when does the EM algorithm stop, and how does one know if the answer is correct? [6 marks]

   [Total 33 marks]

4. Imagine you are building a vision system where one camera observes the car driver. This camera is mounted in the passenger's headrest, and so, it sees most of the driver's body, including hands and tops of legs, as well as the steering wheel and other dashboard-controls. This camera has reasonably good spatial and temporal resolution, but not enough to read numbers on the digital speedometer. The car manufacturer installs a second camera, that is facing outward onto the road ahead of the car, but this outward-facing camera is only for your use on prototypes, so it will not be available on cars sold to the public.

   a. The two cameras are both running at 25 frames per second. Assume they are running without any hardware link for synchronization.

      i. What is the maximum temporal difference between the two cameras? Explain your answer. [3 marks]

      ii. Describe an experimental method for determining the temporal difference between the two cameras. Note that the two cameras' views do not overlap at all. [4 marks]

   b. As mentioned, while all cars will have an inward-facing camera aimed at the driver, only the prototypes also have an outward-facing second camera that sees the road ahead. Design a regression system that will infer the car's speed when only the inward-facing camera is available. To train your system, you can use many hours of footage from the two-camera prototype cars. Your answer should include explanations of how you would preprocess the driver's silhouette into a feature representation, and how you would obtain and organize your training labels. Your answer should also indicate any steps that are needed to learn absolute speed, instead of just relative speed. [16 marks]

   c. How is Bayesian non-linear regression better for this task than a random forest? [5 marks]

   d. How is a random forest better for this task than Bayesian non-linear regression? [5 marks]

   [Total 33 marks]

END OF PAPER


UNIVERSITY COLLEGE LONDON

EXAMINATION FOR INTERNAL STUDENTS

MODULE CODE: COMPM054
ASSESSMENT PATTERN: COMPM054C
MODULE NAME: Machine Vision (Masters Level)
DATE: 18 May 2016
TIME: 10:00 am
TIME ALLOWED: 2 hours 30 mins

This paper is suitable for candidates who attended classes for this module in the following academic year(s): 2015/16

2015/16-COMPM054C-001-EXAM-8
© 2015 University College London

Machine Vision, COMP M054, 2016

Answer THREE of FOUR questions.
Marks for each part of each question are indicated in square brackets.
Calculators are NOT permitted.

1. You are developing a system to help improve people's touch typing. Users will install a webcam above their computer keyboard, looking down on their fingers. We wish to track, at each frame in the video, the (x,y) location of each fingertip. This will allow the software to give advice to the user for improving their touch typing. Below, we show an example image from the webcam, with the (x,y) axes labeled and an identifier number applied to each of the ten fingertips in the image:

   [Image: example webcam frame with the x and y axes labeled and the ten fingertips numbered.]

   Assume that we are only concerned with tracking the locations of the fingertips. We do not need to worry about finding the location of the keyboard or mouse. To perform tracking, we decide to use the Kalman Filter. We have 10 objects to track, so for now, we use 10 Kalman filters, one to track each fingertip.

   a. In our first version, we use a Brownian motion model. Each Kalman Filter therefore has a 2-dimensional state space w = [x, y]^T, consisting of
Apr 18, 2021