The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.

The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.
Abstract

The Computational Perception of Scene Dynamics
Richard Mann
Doctor of Philosophy
Graduate Department of Computer Science
University of Toronto
1998

Understanding observations of image sequences requires one to reason about qualitative scene dynamics. For example, on observing a hand lifting a cup, we may infer that an 'active' hand is applying an upwards force (by grasping) on a 'passive' cup. In order to perform such reasoning, we require an ontology that describes object properties and the generation and transfer of forces in the scene. Such an ontology should include, for example: the presence of gravity, the presence of a ground plane, whether objects are 'active' or 'passive', whether objects are contacting and/or attached to other objects, and so on. In this work we make these ideas precise by presenting an implemented computational system that derives symbolic force-dynamic descriptions from video sequences.

Our approach to scene dynamics is based on an analysis of the Newtonian mechanics of a simplified scene model. The critical requirement is that, given image sequences, we can obtain estimates for the shape and motion of the objects in the scene. To do this, we assume that the objects can be approximated by a two-dimensional 'layered' scene model. The input to our system consists of a set of polygonal outlines along with estimates for their velocities and accelerations, obtained from a view-based tracker. Given such input, we present a system that extracts force-dynamic descriptions for the image sequence. We provide computational examples to demonstrate that our ontology is sufficiently rich to describe a wide variety of image sequences.
This work makes three central contributions. First, we provide an ontology suitable for describing object properties and the generation and transfer of forces in the scene. Second, we provide a computational procedure to test the feasibility of such interpretations by reducing the problem to a feasibility test in linear programming. Finally, we provide a theory of preference ordering between multiple interpretations along with an efficient computational procedure to determine maximal elements in such orderings.
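As an illustration only, the interplay between a feasibility test and a preference ordering can be sketched in a few lines of Python. Everything named here is an assumption made for the sketch and is not taken from the thesis: interpretations are modelled as sets of assumed properties, the ordering is taken to be plain subset inclusion (fewer assumptions preferred), and `toy_is_feasible` is a hand-written stand-in for the linear-programming feasibility test, loosely modelled on the hand-lifting-a-cup example.

```python
from itertools import combinations

# Hypothetical property names for the hand-and-cup example.
ASSUMPTIONS = {"cup_active", "hand_active", "hand_attached_to_cup"}


def toy_is_feasible(interpretation):
    """Stand-in for the LP feasibility test (an assumption of this sketch).

    The rising cup is 'explained' if the cup generates its own force,
    or if an active hand is attached to it.
    """
    return ("cup_active" in interpretation
            or {"hand_active", "hand_attached_to_cup"} <= interpretation)


def minimal_feasible(assumptions, is_feasible):
    """Return the feasible assumption sets that have no feasible proper subset."""
    minimal = []
    # Enumerate by increasing size so every proper subset is visited first.
    for size in range(len(assumptions) + 1):
        for combo in combinations(sorted(assumptions), size):
            candidate = frozenset(combo)
            # Skip candidates that strictly contain an already accepted set.
            if any(found < candidate for found in minimal):
                continue
            if is_feasible(candidate):
                minimal.append(candidate)
    return minimal


if __name__ == "__main__":
    for interpretation in minimal_feasible(ASSUMPTIONS, toy_is_feasible):
        print(sorted(interpretation))
```

This brute-force enumeration is exponential in the number of properties and is meant only to make the idea concrete; it should not be read as the efficient procedure referred to in the abstract.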
Acknowledgements

Studying in the Compute...