The geometry of visual space-time


Yiannis Aloimonos

Computer Vision Lab

University of Maryland





The process of recovering 3D models of the world from images (views, cameras) is central to many of the problems relevant to this Workshop. The state-of-the-art theory for solving this problem proceeds by first finding features (points and lines) in the images and then matching them; the matches are used to obtain the camera geometry, and once the cameras are "placed", a model of the scene emerges. The current theory appears to have reached its limits. This workshop is indirect evidence for that. Direct evidence comes from the quality of the reconstructions (too many distortions and holes, and the need for manual intervention). Although we have a well-developed theory for matched points and lines in different views, we lack insight into how to generate the input to that process, i.e. the correspondences themselves.


Looking at the state of the art, something doesn't seem quite right. Geometry is totally separated from Statistics (Signal Processing). Using points and lines, a 3D mesh is developed and then the image texture is mapped onto the mesh. This is pretty artificial. Texture (the image signal) contains a great deal of information and holds the key to the correspondence problem. In this talk I will explain in mathematical terms why points and lines cannot be found very accurately in images. In so doing, I will explain a large number of visual illusions. Then, I will introduce Harmonic Computational Geometry, a new framework for 3D recovery from multiple views using the outputs of filters applied to image patches. I will show new geometric constraints, such as the harmonic epipolar and harmonic trilinear constraints. I will also briefly discuss the relevance of the new framework and results to the problem of building action descriptions.


This is joint work with Cornelia Fermuller and Patrick Baker.



Brief Biography:


Yiannis Aloimonos studied Mathematics in Greece and Computer Science at the University of Rochester, NY (Ph.D. 1987). He is Professor of Computational Vision at the Computer Science Dept. of the University of Maryland, where he directs the Computer Vision Laboratory of the Institute for Advanced Computer Studies. He is credited with contributions to Active Vision and with the discovery of geometric constraints in multiple view vision, such as the trilinear constraints. His work with Fermuller on the relationship between the field of view and the uncertainty in 3D reconstruction has given rise to new eyes (cameras), some of which are on the path to commercialization. His major interest is the understanding of action.