Geometry
The 2D image in a human eye or in a camera is a perspective projection of the 3D scene. This projection is often analyzed by using tools of projective geometry. Projective transformation treats the human eye as an uncalibrated camera, i.e. a camera whose intrinsic parameters, such as the focal length, are unknown. Existing empirical evidence suggests, however, that the human eye is a calibrated camera. It follows that an adequate model of the human eye must involve the rules of perspective, not projective transformation. This fact has implications for models of shape and space perception. In particular, human vision uses invariants of perspective, not projective transformation.
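As a minimal sketch of the calibrated case, the Python snippet below projects 3D points through a pinhole camera whose focal length is known. The function name `project_perspective`, the coordinate convention (center of projection at the origin, viewing along +Z) and the sample points are illustrative assumptions, not taken from the work summarized above.

```python
def project_perspective(points, focal_length=1.0):
    """Perspective projection of 3D points (X, Y, Z) onto the image plane.

    Assumes a pinhole camera with its center of projection at the origin,
    looking along +Z, with a known focal length (a calibrated camera).
    """
    projected = []
    for x, y, z in points:
        if z <= 0:
            raise ValueError("point must lie in front of the camera (Z > 0)")
        projected.append((focal_length * x / z, focal_length * y / z))
    return projected


# Two segments of equal 3D length, one twice as far from the camera.
near = project_perspective([(-1.0, 0.0, 2.0), (1.0, 0.0, 2.0)])
far = project_perspective([(-1.0, 0.0, 4.0), (1.0, 0.0, 4.0)])

near_width = near[1][0] - near[0][0]
far_width = far[1][0] - far[0][0]
print(near_width, far_width)
```

Doubling the distance halves the projected width. With an unknown focal length the image size would be fixed only up to an unknown scale factor.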
See publications >>
Problem Solving
Human problem solving is one of the most fundamental cognitive abilities, and it is at least as important as other aspects of cognition, such as perception, memory, or learning. In our work we adopt an information-processing methodology to study human problem solving. In particular, we are interested in problems to find. We use computationally difficult (intractable) problems, such as the Travelling Salesman Problem, which can be presented visually to the subject. Such problems are easy to define, and subjects consider them natural and easy. Our hierarchical (pyramid) models have shown good results when compared to human subjects. Studying computationally difficult problems that humans solve well is likely to shed light on some fundamental aspects of human problem solving. Supported by the Air Force Office of Scientific Research.
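To make the task concrete, the following Python sketch scores a tour for a small Travelling Salesman instance by its relative error over the optimal tour, which is one common way to compare a solver with the optimum. The nearest-neighbor heuristic is only a simple stand-in solver; it is not the hierarchical (pyramid) model mentioned above, and the city coordinates and function names are hypothetical.

```python
import itertools
import math


def tour_length(cities, order):
    """Length of the closed tour that visits cities in the given order."""
    return sum(
        math.dist(cities[order[i]], cities[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )


def optimal_tour(cities):
    """Exhaustive search; feasible only for a handful of cities."""
    rest = range(1, len(cities))
    candidates = ((0,) + perm for perm in itertools.permutations(rest))
    return list(min(candidates, key=lambda order: tour_length(cities, order)))


def nearest_neighbor_tour(cities, start=0):
    """Stand-in heuristic: always move to the closest unvisited city."""
    unvisited = [i for i in range(len(cities)) if i != start]
    tour = [start]
    while unvisited:
        last = cities[tour[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(last, cities[i]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


# Hypothetical six-city instance.
cities = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (2.0, 1.0), (1.0, 2.5)]
best = tour_length(cities, optimal_tour(cities))
heuristic = tour_length(cities, nearest_neighbor_tour(cities))
relative_error = heuristic / best - 1.0
print(f"optimal {best:.3f}, heuristic {heuristic:.3f}, error {relative_error:.1%}")
```

The exhaustive search grows factorially with the number of cities, which is why the problem is considered intractable and why near-optimal human performance on it is of interest.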
See publications >>
Shape Perception
Shape is arguably the most important characteristic of objects because it has sufficient complexity to allow unambiguous identification of objects. Whereas shape is characterized by very many parameters (theoretically, infinitely many), other visual properties such as color, lightness, size or speed are characterized by at most three. As a result of this complexity, the shapes of objects can be recognized and reconstructed without using context. This makes shape unique. It follows that the perceptual mechanisms underlying shape perception are completely different from the mechanisms underlying perception of depth, color, size or lightness. This difference is seen in shape constancy: the percept of the shape of an object “out there” is constant despite changes in the shape of the retinal image, which occur whenever the viewing orientation changes. Shape constancy is achieved by using a priori simplicity constraints, such as symmetry and 3D compactness, rather than by using context. Supported by the National Science Foundation and the US Department of Energy.
See publications >>
Symmetry and Skewed Symmetry
Many objects (natural and man-made) are symmetric. Symmetry is very common, probably because symmetric objects are more stable in the presence of gravity. Because symmetry is so universal, it is reasonable to expect that vision systems (human and computer) use it as a constraint (bias, assumption) in producing perceptual interpretations of 3D shapes. The use of symmetry is complicated by the fact that the 2D retinal image of a symmetric object is itself asymmetric. The symmetry of the object is distorted in the 2D image. However, it is not destroyed. The human visual system is able to detect the distorted (skewed) symmetry and then use it in recovering shape. Supported by the National Science Foundation and the US Department of Energy.
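One way to see that symmetry is distorted but not destroyed is sketched below in Python. In 3D, the lines joining mirror-symmetric point pairs are all parallel, so under perspective projection their images pass through a single vanishing point. The object, its pose, the unit focal length and all function names are hypothetical choices for illustration; this is not a model of how the visual system detects skewed symmetry.

```python
import math


def rotate_y(p, angle):
    x, y, z = p
    c, s = math.cos(angle), math.sin(angle)
    return (x * c + z * s, y, -x * s + z * c)


def rotate_x(p, angle):
    x, y, z = p
    c, s = math.cos(angle), math.sin(angle)
    return (x, y * c - z * s, y * s + z * c)


def place(p, a, b, depth):
    """Rotate the object, then push it in front of the camera along +Z."""
    x, y, z = rotate_x(rotate_y(p, a), b)
    return (x, y, z + depth)


def project(p):
    """Perspective projection with focal length 1 (assumed)."""
    x, y, z = p
    return (x / z, y / z)


# Hypothetical object: three point pairs mirror-symmetric about the plane x = 0.
half = [(1.0, 0.5, 0.2), (0.6, -0.8, 1.0), (1.4, 1.0, -0.5)]
pairs = [(p, (-p[0], p[1], p[2])) for p in half]
a, b, depth = math.radians(35), math.radians(20), 6.0

# Every symmetry line has the 3D direction of the rotated x-axis,
# so its image passes through the vanishing point of that direction.
dx, dy, dz = rotate_x(rotate_y((1.0, 0.0, 0.0), a), b)
vanishing_point = (dx / dz, dy / dz)


def collinearity_residual(p, q, v):
    """2D cross product; zero when p, q and v lie on one line."""
    return (q[0] - p[0]) * (v[1] - p[1]) - (q[1] - p[1]) * (v[0] - p[0])


residuals = [
    collinearity_residual(
        project(place(p, a, b, depth)),
        project(place(q, a, b, depth)),
        vanishing_point,
    )
    for p, q in pairs
]
print(residuals)
```

The residuals are zero up to floating-point rounding: the image is not mirror-symmetric, yet the symmetry of the object leaves a regularity in the image that can be detected.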
See publications >>
Figure-Ground Organization
The primary goal of figure-ground organization is to determine which regions and contours in the image represent a 3D object 'out there', and to describe the 2D shape of the object’s image. The 2D retinal shape is then used to recover the 3D shape of the object (achieve shape constancy). Figure-ground organization is a difficult computational problem, whose solution requires a priori simplicity constraints, such as proximity, good continuation and symmetry.
See publications >>
Speed-Accuracy Tradeoff in Visual Perception and Motor Control
Humans can naturally trade the speed of their response for accuracy. The speed-accuracy tradeoff is observed in both perception and behavior. It has been proposed that the visual speed-accuracy tradeoff can be explained by the multiresolution (pyramid) architecture of the visual system. The mechanism of the speed-accuracy tradeoff in motor control is less clear. One possible explanation is that human movements, like visual percepts, have a multiresolution representation.
See publications >>
Binocular Vision
Viewing the environment with two eyes provides the observer with more information than viewing it with only one eye. Binocular disparity is a powerful depth cue. This cue can lead to a 3D percept even in the absence of any other cue. However, binocular disparity tends to be unreliable. As a result, the visual system may choose not to use it, if other cues or a priori constraints are more reliable. This seems to be true especially in the case of viewing shapes of 3D objects.
See publications >>
Phi Phenomenon
The phi phenomenon, discovered by Max Wertheimer in 1912, is pure, objectless movement: the observer perceives movement without seeing the shape of the moving figure. Phi is different from beta (optimal) movement, which is experienced in everyday life whenever movement is simulated (rendered) on a TV, on a computer monitor or in a movie theater. Whereas beta is a moving figure, phi is a moving background. Recently, a very vivid phi was demonstrated. This new, more vivid phi is called magniphi.
See publications >>
Image Quality
Image processing refers to a process in which an image serves as the input and a processed image is the output. This is different from image understanding, where an image is the input and knowledge is the output. It follows that in the case of image processing there is an observer who will actually look at the images. Clearly, efficient image processing must incorporate a model of the human visual system. Only then will the quality of the processed image be high. Supported by the Hewlett-Packard Company.
See publications >>
Hierarchical Models: Pyramids
Time is one of the most valuable and critical resources in human (visual) processing, and a highly parallel architecture is the biological answer to this constraint. Hierarchical representation and hierarchical processing are a credible approach, in both psychology and computer vision, to the space and performance constraints observed in human and animal visual systems. The hierarchical representation, called a pyramid, employs both coarse-to-fine and fine-to-coarse processing strategies. The main advantage of pyramids is the rapid, recursive computation of global information, which is possible because the height of the pyramid is logarithmic in the size of the input.
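A minimal Python sketch of such a pyramid is given below. It assumes a square image whose side is a power of two and uses non-overlapping 2x2 averaging as the reduction step; both choices, and the function names, are illustrative assumptions rather than a description of any specific published model.

```python
def reduce_level(image):
    """Halve each dimension by averaging non-overlapping 2x2 blocks."""
    height, width = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, width, 2)
        ]
        for r in range(0, height, 2)
    ]


def build_pyramid(image):
    """Fine-to-coarse pass: stack levels until a single cell remains.

    Assumes a square image whose side length is a power of two.
    """
    levels = [image]
    while len(levels[-1]) > 1:
        levels.append(reduce_level(levels[-1]))
    return levels


# Hypothetical 8x8 test image with intensities 0..63.
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
pyramid = build_pyramid(image)

print([len(level) for level in pyramid])  # side length of each level
print(pyramid[-1][0][0])                  # apex: mean intensity of the whole image
```

For a side length of n the pyramid has log2(n) + 1 levels, and the single cell at the top holds a global property of the image (here, the mean). If all cells of a level are computed in parallel, that global value is available after a logarithmic number of steps.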
See publications >>