List of Citations from Science Citation Index for
T. F. Cootes, C. J. Taylor, D. H. Cooper, et al., "
Active Shape Models - Their Training and Application,"
Computer Vision and Image Understanding, 61(1): 38-59,
January 1995.
1995: 1 1996: 10 1997: 32 1998: 19 1999: 23 2000: 34 2001: 21 2002: 1
Total citations: 141
As of 28 Jan 2002
1995
- Sozou, PD, Cootes, TF, Taylor, CJ, and DiMauro, EC, "Nonlinear generalization of point distribution models using polynomial regression," IMAGE AND VISION COMPUTING, vol. 13, pp. 451-457, 1995.
Abstract:
We have previously described how to model shape variability by
means of point distribution models (PDMs) in which there is a
linear relationship between a set of shape parameters and the
positions of points on the shape. This linear formulation can
fail for shapes which articulate or bend. We show examples of
such failure for both real and synthetic classes of shape. A
new, more general formulation for PDMs, based on polynomial
regression, is presented. The resulting polynomial regression
PDMs (PRPDM) perform well on the data for which the linear
method failed.
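As background for this and several later entries, the linear PDM that
the paper generalizes can be written as x = xbar + P*b, where xbar is
the mean shape, the columns of P are the leading eigenvectors of the
training covariance, and b holds the shape parameters. The numpy
sketch below (not the authors' code; all names are ours) builds such a
linear model from pre-aligned shapes; the PRPDM of the paper replaces
the linear map from b to x with polynomial regression functions.

    import numpy as np

    # Minimal linear point distribution model (PDM) -- the model this
    # paper generalizes. Shapes are assumed pre-aligned; one row per
    # example, columns interleave landmark coordinates (x1, y1, x2, ...).
    def build_linear_pdm(shapes, n_modes=2):
        mean = shapes.mean(axis=0)
        X = shapes - mean
        evals, evecs = np.linalg.eigh(np.cov(X, rowvar=False))
        order = np.argsort(evals)[::-1][:n_modes]
        P = evecs[:, order]          # (2N, n_modes) matrix of modes
        b = X @ P                    # per-example shape parameters
        return mean, P, b

    def reconstruct(mean, P, b):
        return mean + b @ P.T        # linear generative model: x = mean + P b

    # Toy data: noisy copies of a square contour.
    rng = np.random.default_rng(0)
    square = np.array([0, 0, 1, 0, 1, 1, 0, 1], dtype=float)
    shapes = square + 0.05 * rng.standard_normal((20, 8))
    mean, P, b = build_linear_pdm(shapes)
    print(np.abs(reconstruct(mean, P, b) - shapes).max())  # residual of discarded modes

A PRPDM in this spirit would fit, for example, np.polyfit of each
residual coordinate against the dominant parameter b[:, 0] rather than
relying on the linear term alone.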
1996
- Denzler, J, and Niemann, H, "3D data driven prediction for active contour models based on geometric bounding volumes," PATTERN RECOGNITION LETTERS, vol. 17, pp. 1171-1178, 1996.
Abstract:
Active contour models have proven to be a promising approach
for data driven object tracking without knowledge about the
problem domain and the object. Problems arise in the case of
fast-moving objects and in natural scenes with a heterogeneous
background. In these cases, a prediction step is an essential
part of the tracking mechanism. In this paper we describe a new
approach for modelling the contour of moving objects in the 3D
world. The key point is the description of moving objects by a
simplified geometric model, the so-called bounding volume. The
contour of moving objects is predicted by estimating the
movement and the shape of the bounding volume in the 3D world
and by projecting its contour to the image plane. Stochastic
optimization algorithms are used to estimate shape parameters
of the bounding volume. The 2D contour of the bounding volume
is used to initialize the active contour, which then extracts
the contour of the moving object. Thus, an update of the motion
and model parameters of the bounding volume is possible. No
task-specific knowledge and no a priori knowledge about the
object are necessary. We will show that in the case of convex
polyhedral bounding volumes, this method can be applied to
real-time closed-loop object tracking on standard Unix
workstations. Furthermore, we present experiments which prove
that the robustness for tracking moving objects in front of a
heterogeneous background can be improved.
- Girard, S, Dinten, JM, and Chalmond, B, "Building and training radiographic models for flexible object identification from incomplete data," IEE PROCEEDINGS-VISION IMAGE AND SIGNAL PROCESSING, vol. 143, pp. 257-264, 1996.
Abstract:
The authors address the problem of identifying the projection
of an object from incomplete data extracted from its
radiographic image. They assume that the unknown object is a
particular sample of a flexible object. Their approach consists
first in designing a deformation model able to represent and to
simulate a great variety of samples of the flexible object's
radiographic projection. This modelling is achieved using a
training set of complete data. Then, given the incomplete data,
the identification task consists in estimating the observed
object using the deformation model. The proposed modelling
extracts from the training set not only the deformation modes
but also other relevant information (such as probability
distributions on the deformations and relations between them),
which is used to regularise the identification step.
- Abrantes, AJ, and Marques, JS, "Class of constrained clustering algorithms for object boundary extraction," IEEE TRANSACTIONS ON IMAGE PROCESSING, vol. 5, pp. 1507-1521, 1996.
Abstract:
Boundary extraction is a key task in many image analysis
operations. This paper describes a class of constrained
clustering algorithms for object boundary extraction that
includes several well-known algorithms proposed in different
fields (deformable models, constrained clustering, data
ordering, and traveling salesman problems). The algorithms
belonging to this class are obtained by the minimization of a
cost function with two terms: a quadratic regularization term
and an image-dependent term defined by a set of weighting
functions. The minimization of the cost function is achieved by
lowpass filtering the previous model shape and by attracting
the model units toward the centroids of their attraction
regions. To define a new algorithm belonging to this class, the
user has to specify a regularization matrix and a set of
weighting functions that control the attraction of the model
units toward the data. The usefulness of this approach is
twofold: it provides a unified framework for many existing
algorithms in pattern recognition and deformable models, and
allows the design of new recursive schemes.
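The two-step minimization described above is easy to prototype. The
sketch below is our reading of it, not the authors' algorithm: each
iteration lowpass filters the closed contour, then pulls every model
unit toward the centroid of its attraction region, taken here (one
simple choice of weighting functions) as the data points whose
nearest unit it is.

    import numpy as np

    def update_boundary(units, data, alpha=0.5):
        # Step 1: circular three-point moving average smooths the contour.
        smooth = (np.roll(units, 1, axis=0) + units
                  + np.roll(units, -1, axis=0)) / 3.0
        # Step 2: assign every data point to its nearest unit ...
        d = np.linalg.norm(data[:, None, :] - smooth[None, :, :], axis=2)
        nearest = d.argmin(axis=1)
        # ... and pull each unit toward the centroid of its assigned points.
        out = smooth.copy()
        for k in range(len(units)):
            pts = data[nearest == k]
            if len(pts):
                out[k] = (1 - alpha) * smooth[k] + alpha * pts.mean(axis=0)
        return out

    # Toy example: units on a small circle, data on a larger noisy circle.
    rng = np.random.default_rng(1)
    t = np.linspace(0, 2 * np.pi, 30, endpoint=False)
    units = 0.5 * np.c_[np.cos(t), np.sin(t)]
    data = np.c_[np.cos(t), np.sin(t)] + 0.02 * rng.standard_normal((30, 2))
    for _ in range(50):
        units = update_boundary(units, data)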
- Nastar, C, and Ayache, N, "Frequency-based nonrigid motion analysis: Application to four dimensional medical images," IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 18, pp. 1067-1079, 1996.
Abstract:
We present a method for nonrigid motion analysis in time
sequences of volume images (4D data). In this method, nonrigid
motion of the deforming object contour is dynamically
approximated by a physically-based deformable surface. In order
to reduce the number of parameters describing the deformation,
we make use of a modal analysis which provides a spatial
smoothing of the surface. The deformation spectrum, which
outlines the main excited modes, can be efficiently used for
deformation comparison. Fourier analysis on time signals of the
main deformation spectrum components provides a temporal
smoothing of the data. Thus a complex nonrigid deformation is
described by only a few parameters: the main excited modes and
the main Fourier harmonics. Therefore, 4D data can be analyzed
in a very concise manner. The power and robustness of the
approach are illustrated by various results on medical data. We
believe that our method has important applications in automatic
diagnosis of heart diseases and in motion compression.
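The temporal-smoothing step lends itself to a one-function
illustration. Below is a hedged numpy sketch, with 'signal' standing
in for the time course of one deformation-spectrum component over the
4D sequence: keeping only the first few Fourier harmonics reproduces
the paper's idea of describing the deformation with a handful of
parameters.

    import numpy as np

    def keep_main_harmonics(signal, n_harmonics=2):
        F = np.fft.rfft(signal)
        F[n_harmonics + 1:] = 0.0     # drop everything past the main harmonics
        return np.fft.irfft(F, n=len(signal))

    t = np.linspace(0, 1, 64, endpoint=False)
    # Toy cycle: a dominant first harmonic plus a fast spurious component.
    signal = np.sin(2 * np.pi * t) + 0.3 * np.sin(2 * np.pi * 5 * t)
    smoothed = keep_main_harmonics(signal, n_harmonics=2)
    print(np.round(np.abs(smoothed - np.sin(2 * np.pi * t)).max(), 3))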
- Chakraborty, A, Staib, LH, and Duncan, JS, "Deformable boundary finding in medical images by integrating gradient and region information," IEEE TRANSACTIONS ON MEDICAL IMAGING, vol. 15, pp. 859-870, 1996.
Abstract:
Accurately segmenting and quantifying structures is a key issue
in biomedical image analysis. The two conventional methods of
image segmentation, region-based segmentation, and boundary
finding, often suffer from a variety of limitations. Here we
propose a method which endeavors to integrate the two
approaches in an effort to form a unified approach that is
robust to noise and poor initialization. Our approach uses
Green's theorem to derive the boundary of a homogeneous region-
classified area in the image and integrates this with a gray
level gradient-based boundary finder. This combines the
perceptual notions of edge/shape information with gray level
homogeneity. A number of experiments were performed on both
synthetic and real medical images of the brain and heart to
evaluate the new approach, and it is shown that the integrated
method typically performs better when compared to conventional
gradient-based deformable boundary finding. Further, this
method yields these improvements with little increase in
computational overhead, an advantage derived from the
application of Green's theorem.
- Mitiche, A, and Bouthemy, P, "Computation and analysis of image motion: A synopsis of current problems and methods," INTERNATIONAL JOURNAL OF COMPUTER VISION, vol. 19, pp. 29-55, 1996.
Abstract:
The goal of this paper is to offer a structured synopsis of the
problems in image motion computation and analysis, and of the
methods proposed, exposing the underlying models and supporting
assumptions. A sufficient number of pointers to the literature
will be given, concentrating mostly on recent contributions.
Emphasis will be on the detection, measurement and segmentation
of image motion. Tracking and deformable motion issues will
also be addressed. Finally, a number of related questions which
could require further investigation will be presented.
- Subsol, G, Thirion, JP, and Ayache, N, "Application of an automatically built 3D morphometric brain atlas: Study of cerebral ventricle shape," VISUALIZATION IN BIOMEDICAL COMPUTING, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1131, pp. 373-382, 1996.
Abstract:
In this paper we present new results on the automatic building
of a 3D morphometric brain atlas from volumetric MRI images and
its application to the study of the shape of cerebral
structures. In particular, we show how it is possible to define
''abnormal'' deformations of the cerebral ventricles with a
small set of parameters.
- Hill, A, Cootes, TF, and Taylor, CJ, "Active shape models and the shape approximation problem," IMAGE AND VISION COMPUTING, vol. 14, pp. 601-607, 1996.
Abstract:
Active Shape Models (ASM) use an iterative algorithm to match
statistically defined models of known but variable objects to
instances in images. Each iteration of ASM search involves two
steps: image data interrogation and shape approximation. Here
we consider the shape approximation step in detail. We present
a new method of shape approximation which uses directional
constraints. We show how the error term for the shape
approximation problem can be extended to cope with directional
constraints, and present iterative solutions to the 2D and 3D
problems. We also present an efficient algorithm for the 2D
problem in which a modification of the error term permits a
closed-form approximate solution which can be used to produce
starting estimates for the iterative solution.
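For readers who want the baseline this paper extends, here is a
hedged sketch of the standard ASM shape-approximation loop (a
similarity pose fit alternated with clamped projection onto the shape
modes). The directional-constraint error term of the paper is not
reproduced, and all names are ours.

    import numpy as np

    def fit_similarity(src, dst):
        # Least-squares 2D similarity (scale/rotation + translation)
        # mapping src onto dst, via the complex-number closed form.
        s = src[:, 0] + 1j * src[:, 1]
        d = dst[:, 0] + 1j * dst[:, 1]
        s0, d0 = s - s.mean(), d - d.mean()
        a = (np.conj(s0) @ d0) / (np.conj(s0) @ s0)   # scale + rotation
        t = d.mean() - a * s.mean()                   # translation
        return a, t

    def approximate_shape(y, mean, P, evals, n_iter=10):
        # Alternate pose alignment and clamped projection onto the modes.
        b = np.zeros(P.shape[1])
        for _ in range(n_iter):
            x = (mean + P @ b).reshape(-1, 2)         # current model instance
            a, t = fit_similarity(x, y)               # pose: model -> image
            # Map the target into the model frame, update shape parameters.
            yc = (y[:, 0] + 1j * y[:, 1] - t) / a
            y_model = np.c_[yc.real, yc.imag].ravel()
            b = P.T @ (y_model - mean)
            b = np.clip(b, -3 * np.sqrt(evals), 3 * np.sqrt(evals))
        return b, (a, t)

    # Toy demo: unit square with one synthetic widening mode; the target
    # is a scaled, shifted square, so the pose absorbs all the change.
    mean = np.array([0, 0, 1, 0, 1, 1, 0, 1], dtype=float)
    P = np.array([[1, 0, -1, 0, -1, 0, 1, 0]], dtype=float).T / 2
    evals = np.array([0.1])
    y = 2.0 * mean.reshape(-1, 2) + 1.0
    b, pose = approximate_shape(y, mean, P, evals)
    print(b)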
- Cootes, TF, DiMauro, EC, Taylor, CJ, and Lanitis, A, "Flexible 3D models from uncalibrated cameras," IMAGE AND VISION COMPUTING, vol. 14, pp. 581-587, 1996.
Abstract:
We describe how to build statistically-based flexible models of
the 3D structure of variable objects, given a training set of
uncalibrated images. We assume that for each example object
there are two labelled images taken from different viewpoints.
From each image pair a 3D structure can be reconstructed, up to
either an affine or projective transformation, depending on
which camera model is used. The reconstructions are aligned by
choosing the transformations which minimise the distances
between matched points across the training set. A statistical
analysis results in an estimate of the mean structure of the
training examples and a compact parameterised model of the
variability in shape across the training set. Experiments have
been performed using pinhole and affine camera models. Results
are presented for both synthetic data and real images.
- Christensen, GE, Rabbitt, RD, and Miller, MI, "Deformable templates using large deformation kinematics," IEEE TRANSACTIONS ON IMAGE PROCESSING, vol. 5, pp. 1435-1447, 1996.
Abstract:
A general automatic approach is presented for accommodating
local shape variation when mapping a two-dimensional (2-D) or
three-dimensional (3-D) template image into alignment with a
topologically similar target image. Local shape variability is
accommodated by applying a vector-field transformation to the
underlying material coordinate system of the template while
constraining the transformation to be smooth (globally positive
definite Jacobian). Smoothness is guaranteed without
specifically penalizing large-magnitude deformations of small
subvolumes by constraining the transformation on the basis of a
Stokesian limit of the fluid-dynamical Navier-Stokes equations.
This differs fundamentally from quadratic penalty methods, such
as those based on linearized elasticity or thin-plate splines,
in that stress restraining the motion relaxes over time,
allowing large-magnitude deformations. Kinematic nonlinearities
are inherently necessary to maintain continuity of structures
during large-magnitude deformations, and are included in all
results. After initial global registration, final mappings are
obtained by numerically solving a set of nonlinear partial
differential equations associated with the constrained
optimization problem. Automatic regridding is performed by
propagating templates as the nonlinear transformations
evaluated on a finite lattice become singular. Application of
the method to intersubject registration of neuroanatomical
structures illustrates the ability to account for local
anatomical variability.
1997
- Luettin, J, and Thacker, NA, "Speechreading using probabilistic models," COMPUTER VISION AND IMAGE UNDERSTANDING, vol. 65, pp. 163-178, 1997.
Abstract:
We describe a robust method for locating and tracking lips in
gray-level image sequences. Our approach learns patterns of
shape variability from a training set which constrains the
model during image search to only deform in ways similar to the
training examples. Image search is guided by a learned gray-
level model which is used to describe the large appearance
variability of lips. Such variability might be due to different
individuals, illumination, mouth opening, specularity, or
visibility of teeth and tongue. Visual speech features are
recovered from the tracking results and represent both shape
and intensity information. We describe a speechreading (lip-
reading) system, where the extracted features are modeled by
Gaussian distributions and their temporal dependencies by
hidden Markov models. Experimental results are presented for
locating lips, tracking lips, and speechreading. The database
used consists of a broad variety of speakers and was recorded
in a natural environment with no special lighting or lip
markers used. For a speaker-independent digit recognition task
using visual information only, the system achieved an accuracy
about equivalent to that of untrained humans. (C) 1997 Academic
Press.
- Solloway, S, Hutchinson, CE, Waterton, JC, and Taylor, CJ, "The use of active shape models for making thickness measurements of articular cartilage from MR images," MAGNETIC RESONANCE IN MEDICINE, vol. 37, pp. 943-952, 1997.
Abstract:
Previously reported studies to quantify articular cartilage
have used labor-intensive manual or semi-automatic data-driven
techniques, demonstrating high accuracy and precision. However,
none has been able to automate the segmentation process. This
paper describes a fast, automatic, model-based approach to
segmentation and thickness measurement of the femoral cartilage
in 3D T1-weighted images using active shape models (ASMs).
Systematic experiments were performed to assess the accuracy
and precision of the technique with in vivo images of both
normal and abnormal knees. Segmentation accuracy was determined
by comparing the results of the segmentation with the
boundaries delineated by a radiologist. The mean error in
locating the boundary was 0.57 pixels. To assess the precision
of the measurement technique, the mean thickness of the femoral
cartilage was calculated for repeated scans of five healthy
volunteers. A mean coefficient of variation (CV) of 2.8% was
obtained for the thickness measurements.
- Redhead, AL, Kotcheff, ACW, Taylor, CJ, Porter, ML, and Hukins, DWL, "An automated method for assessing routine radiographs of patients with total hip replacements," PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART H- JOURNAL OF ENGINEERING IN MEDICINE, vol. 211, pp. 145-154, 1997.
Abstract:
This paper describes a new, fully automated method of locating
objects on radiographs of patients with total joint
replacements (TJRs). A statistical computer model, known as an
active shape model, was trained to identify the position of the
femur, pelvis, stem and cup marker wire on radiographs of
patients with Charnley total hip prostheses. Once trained, the
model was able to locate these objects through a process of
automatic image searching, despite their appearance depending
on the orientation and anatomy of the patient. Experiments were
carried out to test the accuracy with which the model was able
to fit to previously unseen data and with which reference
points could be calculated from the model points. The model was
able to locate the femur and stem with a mean error of
approximately 0.8 mm and a 95 per cent confidence limit of 1.7
mm. Once the model had successfully located these objects, the
midpoint of the stem head could be calculated with a mean error
of approximately 0.2 mm. Although the model has been trained on
Charnley total hip replacements, the method is generic and so
can be applied to radiographs of patients with any TJR. This
paper shows that computer models can form the basis of a quick,
automatic method of taking measurements from standard clinical
radiographs.
- Delibasis, K, Undrill, PE, and Cameron, GG, "Designing Fourier descriptor-based geometric models for object interpretation in medical images using genetic algorithms," COMPUTER VISION AND IMAGE UNDERSTANDING, vol. 66, pp. 286-300, 1997.
Abstract:
In previous work we have modeled simple 3D anatomical objects
using deformed superquadrics and established their optimal
position with the aid of genetic algorithms (GAs). Here we
extend the complexity of the search object using 3D Fourier
descriptor (FD) representations and allow GAs once again to
optimize the object's shape and position. Using magnetic
resonance image data, we perform an approximate segmentation on
one lateral ventricle in the brain and use the FDs from this as
seeding values for the GAs to search for the left and right
lateral ventricles in seven 3D data sets. We show that the
method is capable of coping with normal biological variation.
Finally, we compare FD/GA-guided segmentation with a manually
guided interactive region growing method and find an agreement
of 78 +/- 10% in voxel classification with a corresponding
average edge placement error of 2.2 +/- 0.4 mm. (C) 1997
Academic Press.
- Smyth, PP, Taylor, CJ, and Adams, JE, "Automatic measurement of vertebral shape using Active Shape Models," INFORMATION PROCESSING IN MEDICAL IMAGING, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1230, pp. 441-446, 1997.
Abstract:
In this paper, we describe how Active Shape Models (ASMs) have
been used to accurately and robustly locate vertebrae in noisy
lateral Dual Energy X-ray Absorptiometry (DXA) images of the
spine. Vertebrae were located using either separate models for
each vertebra, or a combined model of the whole spine. The
combined model was found to be more robust. We show that ASMs
allow normal vertebrae to be located as accurately as by human
operators. We measure the performance of ASMs in locating
fractured vertebrae of osteoporotic patients. We also describe
how model parameters may be used to estimate how accurately a
vertebra had been located, in order to detect vertebrae for
which search had failed.
- Joshi, SC, Banerjee, A, Christensen, GE, Csernansky, JG, Haller, JW, Miller, MI, and Wang, L, "Gaussian random fields on sub-manifolds for characterizing brain surfaces," INFORMATION PROCESSING IN MEDICAL IMAGING, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1230, pp. 381-386, 1997.
Abstract:
This paper provides analytical methods for characterizing the
variation of the shape of neuroanatomically significant
substructures of the brain in an ensemble of brain images. The
focus of this paper is on the neuro-anatomical variation of the
"shape" of 2-dimensional surfaces in the brain. Brain surfaces
are studied by building templates that are smooth sub-manifolds
of the underlying coordinate system of the brain. Variation of
the shape in populations is quantified via defining Gaussian
random vector fields on these sub-manifolds. Methods for the
empirical construction of Gaussian random vector fields for
representing the variations of the substructures are presented.
As an example, using these methods we characterize the shape of
the hippocampus in a population of normal controls and
schizophrenic brains. Results from a recently completed study
comparing shapes of the hippocampus in a group of matched
schizophrenic and normal control subjects are presented.
A Bayesian hypothesis test is formulated to cluster the normal
and schizophrenic hippocampi in the population of 20
individuals.
- Duta, N, and Sonka, M, "Segmentation and interpretation of MR brain images using an improved knowledge-based active shape model," INFORMATION PROCESSING IN MEDICAL IMAGING, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1230, pp. 375-380, 1997.
Abstract:
An improvement of the Active Shape procedure introduced by
Cootes and Taylor is presented. The new automated brain
segmentation and interpretation approach incorporates a priori
knowledge about neuroanatomic structures and their specific
structural relationships to provide robust segmentation and
labeling. The method was trained in 8 MR brain images and
tested in 19 brain images by comparison to observer-defined
independent standards. Neuroanatomic structures in all images
from the test set were successfully identified. The presented
method is applicable to virtually any task involving deformable
shape analysis.
- Ahmad, T, Taylor, CJ, Lanitis, A, and Cootes, TF, "Tracking and recognising hand gestures, using statistical shape models," IMAGE AND VISION COMPUTING, vol. 15, pp. 345-352, 1997.
Abstract:
Hand gesture recognition from video images is of considerable
interest as a means of providing simple and intuitive man-
machine interfaces. Possible applications range from replacing
the mouse as a pointing device to virtual reality and
communication with the deaf. We describe an approach to
tracking a hand in an image sequence and recognising, in each
video frame, which of five gestures it has adopted. A
statistically based Point Distribution Model (PDM) is used to
provide a compact parametrised description of the shape of the
hand for any of the gestures or the transitions between them.
The values of the resulting shape parameters are used in a
statistical classifier to identify gestures. The model can be
used as a deformable template to track a hand through a video
sequence but this proves unreliable. We describe how a set of
models, one for each of the five gestures, can be used for
tracking with the appropriate model selected automatically. We
show that this results in reliable tracking and gesture
recognition for two 'unseen' video sequences in which all the
gestures are used.
- Sozou, PD, Cootes, TF, Taylor, CJ, DiMauro, EC, and Lanitis, A, "Non-linear point distribution modelling using a multi-layer perceptron," IMAGE AND VISION COMPUTING, vol. 15, pp. 457-463, 1997.
Abstract:
Objects of the same class sometimes exhibit variation in shape.
This shape variation has previously been modelled by means of
point distribution models (PDMs) in which there is a linear
relationship between a set of shape parameters and the
positions of points on the shape. A polynomial regression
generalization of PDMs, which succeeds in capturing certain
forms of non-linear shape variability, has also been described.
Here we present a new form of PDM, which uses a multi-layer
perceptron to carry out non-linear principal component
analysis. We compare the performance of the new model with that
of the existing models on two classes of variable shape: one
exhibits bending, and the other exhibits complete rotation. The
linear PDM fails on both classes of shape; the polynomial
regression model succeeds for the first class of shapes but
fails for the second; the new multi-layer perceptron model
performs well for both classes of shape. The new model is the
most general formulation for PDMs which has been proposed to
date. (C) 1997 Elsevier Science B.V.
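The non-linear principal component analysis the paper performs with a
multi-layer perceptron is, in modern terms, an autoencoder with a
bottleneck layer. The sketch below uses scikit-learn's MLPRegressor
as a stand-in (the paper builds its own network); the toy data show
the kind of bending that defeats a one-mode linear PDM.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    # Bottleneck MLP trained to reproduce its input: a stand-in for the
    # paper's non-linear PCA network, not a reproduction of it.
    rng = np.random.default_rng(0)
    t = rng.uniform(-1.0, 1.0, 500)
    # Toy "shapes" lying on a parabola: one non-linear degree of freedom.
    X = np.c_[t, t ** 2] + 0.01 * rng.standard_normal((500, 2))

    ae = MLPRegressor(hidden_layer_sizes=(16, 1, 16), activation='tanh',
                      max_iter=5000, random_state=0)
    ae.fit(X, X)                     # autoencoder: target equals input
    print("reconstruction MSE:", np.mean((ae.predict(X) - X) ** 2))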
- Smyth, PP, Taylor, CJ, and Adams, JE, "Automatic measurement of vertebral shape using active shape models," IMAGE AND VISION COMPUTING, vol. 15, pp. 575-581, 1997.
Abstract:
In this paper, we describe how Active Shape Models (ASMs) have
been used to accurately and robustly locate vertebrae in
lateral Dual Energy X-ray Absorptiometry (DXA) images of the
spine. DXA images are of low spatial resolution, and contain
significant random and structural noise, providing a difficult
challenge for object location methods. All vertebrae in the
image were searched for simultaneously, improving robustness in
location of individual vertebrae by making use of constraints
on shape provided by the position of other vertebrae. We show
that the use of ASMs with minimal user interaction allows
accuracy to be obtained which is as good as that achievable by
human operators, as well as high precision. Having located each
vertebra, it is desirable to evaluate whether it has been
located sufficiently accurately for shape measurements to be
useful. We determined this on the basis of grey-level model
fit, which was shown to usefully detect poorly located
vertebrae, which should enable accuracy to be improved by
rejecting proposed search solutions whose grey-level fit was
poorer than a threshold. (C) 1997 Elsevier Science B.V.
- Kotcheff, ACW, and Taylor, CJ, "Automatic construction of eigenshape models by genetic algorithm," INFORMATION PROCESSING IN MEDICAL IMAGING, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1230, pp. 1-14, 1997.
Abstract:
A new approach to the problem of automatic construction of
eigenshape models is presented. Eigenshape models have proved
to be successful in a variety of medical image analysis
problems. However, automatic model construction is a difficult
problem, and in many applications the models are built by hand
- a painstaking process. Previous attempts to produce models
automatically have been applicable only in specific cases or
under certain assumptions. We show that the problem can be
understood very simply in terms of shape symmetries. The pose
and parameterisation of each shape must be chosen so as to
produce a model that is compact and specific. We define an
objective function that measures these properties. The problem
of automatic model construction is thus reduced to an
optimisation problem. We show that the objective function we
define can be optimised by a Genetic Algorithm, and produces
models that are better than hand built ones.
- Rueckert, D, and Burger, P, "Geometrically deformable templates for shape-based segmentation and tracking in cardiac MR images," ENERGY MINIMIZATION METHODS IN COMPUTER VISION AND PATTERN RECOGNITION, PROCEEDINGS, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1223, pp. 83-98, 1997.
Abstract:
We present a new approach to shape-based segmentation and
tracking of multiple, deformable anatomical structures in
cardiac MR images. We propose to use an energy-minimizing
geometrically deformable template (GDT) which can deform into
similar shapes under the influence of image forces. The degree
of deformation of the template from its equilibrium shape is
measured by a penalty function associated with mapping between
the two shapes. In 2D, this term corresponds to the bending
energy of an idealized thin-plate of metal. By minimizing this
term along with the image energy terms of the classic
deformable model, the deformable template is attracted towards
objects in the image whose shape is similar to its equilibrium
shape. This framework allows for the simultaneous segmentation
of multiple deformable objects using intra- as well as inter-
shape information. The energy minimization problem of the
deformable template is formulated in a Bayesian framework and
solved using relaxation techniques: Simulated Annealing (SA), a
stochastic relaxation technique is used for segmentation while
Iterated Conditional Modes (ICM), a deterministic relaxation
technique is used for tracking. We present results of the
algorithm applied to the reconstruction of the left and right
ventricle of the human heart in 4D MR images.
- Liu, TL, and Geiger, D, "Visual deconstruction: Recognizing articulated objects," ENERGY MINIMIZATION METHODS IN COMPUTER VISION AND PATTERN RECOGNITION, PROCEEDINGS, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1223, pp. 295-309, 1997.
Abstract:
We propose a deconstruction framework to recognize and find
articulated objects. In particular we are interested in human
arm and leg articulations. The deconstruction view of
recognition naturally decomposes the problem of finding an
object in an image, into the one of (i) extracting key features
in an image, (ii) detecting key points in the models, (iii)
segmenting an image, and (iv) comparing shapes. All of these
subproblems cannot be resolved independently. Together, they
reconstruct the object in the image. We briefly address (i) and
(ii) to focus on solving together shape similarity and
segmentation, combining top-down & bottom-up algorithms. We
show that the visual deconstruction approach is derived as an
optimization for a Bayesian-Information theory, and that the
whole process is naturally generated by the guaranteed Dijkstra
optimization algorithm.
- Lanitis, A, Taylor, CJ, and Cootes, TF, "Automatic interpretation and coding of face images using flexible models," IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 19, pp. 743-756, 1997.
Abstract:
Face images are difficult to interpret because they are highly
variable. Sources of variability include individual appearance,
3D pose, facial expression, and lighting. We describe a compact
parametrized model of facial appearance which takes into
account all these sources of variability. The model represents
both shape and gray-level appearance, and is created by
performing a statistical analysis over a training set of face
images. A robust multiresolution search algorithm is used to
fit the model to faces in new images. This allows the main
facial features to be located, and a set of shape, and gray-
level appearance parameters to be recovered. A good
approximation to a given face can be reconstructed using less
than 100 of these parameters. This representation can be used
for tasks such as image coding, person identification, 3D pose
recovery, gender recognition, and expression recognition.
Experimental results are presented for a database of 690 face
images obtained under widely varying conditions of 3D pose,
lighting, and facial expression. The system performs well on
all the tasks listed above.
- Vetter, T, and Poggio, T, "Linear object classes and image synthesis from a single example image," IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 19, pp. 733-742, 1997.
Abstract:
The need to generate new views of a 3D object from a single
real image arises in several fields, including graphics and
object recognition. While the traditional approach relies on
the use of 3D models, we have recently introduced [1], [2], [3]
simpler techniques that are applicable under restricted
conditions. The approach exploits image transformations that
are specific to the relevant object class, and learnable from
example views of other ''prototypical'' objects of the same
class. In this paper, we introduce such a technique by
extending the notion of linear class proposed by Poggio and
Vetter. For linear object classes, it is shown that linear
transformations can be learned exactly from a basis set of 2D
prototypical views. We demonstrate the approach on artificial
objects and then show preliminary evidence that the technique
can effectively ''rotate'' high-resolution face images from a
single 2D view.
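The linear-class construction can be illustrated with plain least
squares. In the sketch below (our toy setup, not the paper's
experiments), prototype "views" are point-coordinate vectors observed
in two poses; the linear map between the poses is estimated from the
prototypes and applied to a single novel view.

    import numpy as np

    rng = np.random.default_rng(2)
    n_proto, n_pts = 12, 6
    protos_A = rng.standard_normal((n_proto, 2 * n_pts))  # views in pose A

    R = np.array([[np.cos(0.4), -np.sin(0.4)],
                  [np.sin(0.4),  np.cos(0.4)]])           # pose change: rotation
    def to_pose_B(view_a):
        return (view_a.reshape(-1, 2) @ R.T).ravel()

    protos_B = np.array([to_pose_B(v) for v in protos_A])

    # Least-squares estimate of the map L with protos_B ~= protos_A @ L.
    L, *_ = np.linalg.lstsq(protos_A, protos_B, rcond=None)

    novel_A = rng.standard_normal(2 * n_pts)              # one view, new object
    predicted_B = novel_A @ L
    print(np.abs(predicted_B - to_pose_B(novel_A)).max()) # ~0 for a linear class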
- Pavlovic, VI, Sharma, R, and Huang, TS, "Visual interpretation of hand gestures for human-computer interaction: A review," IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 19, pp. 677-695, 1997.
Abstract:
The use of hand gestures provides an attractive alternative to
cumbersome interface devices for human-computer interaction
(HCI). In particular, visual interpretation of hand gestures
can help in achieving the ease and naturalness desired for HCI.
This has motivated a very active research area concerned with
computer vision-based analysis and interpretation of hand
gestures. We survey the literature on visual interpretation of
hand gestures in the context of its role in HCI. This
discussion is organized on the basis of the method used for
modeling, analyzing, and recognizing gestures. Important
differences in the gesture interpretation approaches arise
depending on whether a 3D model of the human hand or an image
appearance model of the human hand is used. 3D hand models
offer a way of more elaborate modeling of hand gestures but
lead to computational hurdles that have not been overcome given
the real-time requirements of HCI. Appearance-based models lead
to computationally efficient ''purposive'' approaches that work
well under constrained situations but seem to lack the
generality desirable for HCI. We also discuss implemented
gestural systems as well as other potential applications of
vision-based gesture recognition. Although the current progress
is encouraging, further theoretical as well as computational
advances are needed before gestures can be widely used for HCI.
We discuss directions of future research in gesture
recognition, including its integration with other natural modes
of human-computer interaction.
- Yow, KC, and Cipolla, R, "Feature-based human face detection," IMAGE AND VISION COMPUTING, vol. 15, pp. 713-735, 1997.
Abstract:
Human face detection has always been an important problem for
face, expression and gesture recognition. Though numerous
attempts have been made to detect and localize faces, these
approaches have made assumptions that restrict their extension
to more general cases. We identify that the key factor in a
generic and robust system is that of using a large amount of
image evidence, related and reinforced by model knowledge
through a probabilistic framework. In this paper, we propose a
feature-based algorithm for detecting faces that is
sufficiently generic and is also easily extensible to cope with
more demanding variations of the imaging conditions. The
algorithm detects feature points from the image using spatial
filters and groups them into face candidates using geometric
and gray level constraints. A probabilistic framework is then
used to reinforce probabilities and to evaluate the likelihood
of the candidate as a face. We provide results to support the
validity of the approach and demonstrate its capability to
detect faces under different scale, orientation and viewpoint.
(C) 1997 Elsevier Science B.V.
- Vetter, T, and Troje, NF, "Separation of texture and shape in images of faces for image coding and synthesis," JOURNAL OF THE OPTICAL SOCIETY OF AMERICA A-OPTICS IMAGE SCIENCE AND VISION, vol. 14, pp. 2152-2161, 1997.
Abstract:
Human faces differ in shape and texture. Image representations
based on this separation of shape and texture information have
been reported by several authors [for a review, see Science
272, 1905 (1996)]. We investigate such a representation of
human faces based on a separation of texture and two-
dimensional shape information. Texture and shape were separated
by use of pixel-by-pixel correspondence among the various
images, which was established through algorithms known from
optical flow computation. We demonstrate the improvement of the
proposed representation over well-established pixel-based
techniques in terms of coding efficiency and in terms of the
ability to generalize to new images of faces. The evaluation is
performed by calculating different distance measures between
the original image and its reconstruction and by measuring the
time that human subjects need to discriminate them. (C) 1997
Optical Society of America.
- Taylor, CJ, Cootes, TF, Lanitis, A, Edwards, G, Smyth, P, and Kotcheff, ACW, "Model-based interpretation of complex and variable images," PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY OF LONDON SERIES B-BIOLOGICAL SCIENCES, vol. 352, pp. 1267-1274, 1997.
Abstract:
The ultimate goal of machine vision is image understanding-the
ability not only to recover image structure but also to know
what it represents. By definition, this involves the use of
models which describe and label the expected structure of the
world. Over the past decade, model-based vision has been
applied successfully to images of man-made objects. It has
proved much more difficult to develop model-based approaches to
the interpretation of images of complex and variable structures
such as faces or the internal organs of the human body (as
visualized in medical images). In such cases it has been
problematic even to recover image structure reliably without a
model to organize the often noisy and incomplete image
evidence. The key problem is that of variability. To be useful,
a model needs to be specific-that is, to be capable of
representing only 'legal' examples of the modelled object(s).
It has proved difficult to achieve this whilst allowing for
natural variability. Recent developments have overcome this
problem; it has been shown that specific patterns of
variability in shape and grey-level appearance can be captured
by statistical models that can be used directly in image
interpretation. The details of the approach are outlined and
practical examples from medical image interpretation and face
recognition are used to illustrate how previously intractable
problems can now be tackled successfully. It is also
interesting to ask whether these results provide any possible
insights into natural vision; for example, we show that the
apparent changes in shape which result from viewing three-
dimensional objects from different viewpoints can be modelled
quite well in two dimensions; this may lend some support to the
'characteristic views' model of natural vision.
- Subsol, G, Roberts, N, Doran, M, Thirion, JP, and Whitehouse, GH, "Automatic analysis of cerebral atrophy," MAGNETIC RESONANCE IMAGING, vol. 15, pp. 917-927, 1997.
Abstract:
3D MR data obtained for 10 healthy control subjects have been
used to build a brain atlas. The atlas is built in four stages.
First, a set of features that are unambiguously definable and
anatomically relevant need to be computed for each item in the
database. The chosen features are crest lines, along which the
maximal principal curvature of the surface of the brain is
maximal in its associated principal direction. Second, a
nonrigid registration algorithm is used to determine the common
crest lines among the subjects in the database. These crest
lines form the structure of the atlas. Third, a set of crest
lines is taken as a reference set and a modal analysis is
performed to determine the fundamental deformations that are
necessary to bring the individual data in line with the
reference set. The deformations are averaged and the set of
mean crest lines becomes the atlas. Finally, the standard
deviation of the deformations between the atlas and the items
in the database defines the normal variation in the relative
positions of the crest lines in a healthy population. In a
fully automatic procedure, the crest lines on the surface of
the brain adjacent to the cerebral ventricles in a patient with
primary progressive aphasia were compared to the atlas;
confirmation that the brain of this patient demonstrates
atrophy was provided by stereological analysis, which showed
that the volume of the left cerebral hemisphere is 48.8 ml (CE
2.8%) less than the volume of the right cerebral hemisphere in
the region of the temporal and frontal lobes. When the
amplitude of the deformations necessary to register the crest
lines obtained for the patient with the atlas was greater than
three standard deviations beyond the variability inherent in
the atlas, the deformation was considered significant. Four of
the five main deformation modes of the longest crest line of
the surface of the brain adjacent to the cerebral ventricles
were significantly different in the patient with primary
progressive aphasia compared to the atlas. The ventricles are
preferentially enlarged in the left cerebral hemisphere.
Furthermore, they are closer together posteriorly and further
apart anteriorly than in the atlas. These observations may be
indicative of the atrophy of the temporal and frontal lobes of
the left cerebral hemisphere noted in the patient. Ultimately,
the approach may provide a useful screening technique for
identifying brain diseases involving cerebral atrophy. Serial
studies of individual patients may provide insights into the
processes controlling or affected by particular diseases. (C)
1997 Elsevier Science Inc.
- Rueckert, D, and Burger, P, "Shape-based segmentation and tracking in 4D cardiac MR images," CVRMED-MRCAS'97, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1205, pp. 43-52, 1997.
Abstract:
We present a new approach to shape-based segmentation and
tracking of multiple, deformable anatomical structures in
cardiac MR images. We propose to use an energy-minimizing
geometrically deformable template (GDT) which can deform into
similar shapes under the influence of image forces. The degree
of deformation of the template from its equilibrium shape is
measured by a penalty function associated with mapping between
the two shapes. In 2D, this term corresponds to the bending
energy of an idealized thin-plate of metal. By minimizing this
term along with the image energy terms of the classic
deformable model, the deformable template is attracted towards
objects in the image whose shape is similar to its equilibrium
shape. This framework allows the simultaneous segmentation of
multiple deformable objects using intra- as well as inter-shape
information. The energy minimization problem of the deformable
template is formulated in a Bayesian framework and solved using
relaxation techniques: Simulated Annealing (SA), a stochastic
relaxation technique is used for segmentation while Iterated
Conditional Modes (ICM), a deterministic relaxation technique
is used for tracking. We present results of the algorithm
applied to the reconstruction of the left and right ventricle
of the human heart in 4D MR images.
- Montagnat, J, and Delingette, H, "Volumetric medical images segmentation using shape constrained deformable models," CVRMED-MRCAS'97, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1205, pp. 13-22, 1997.
Abstract:
In this paper we address the problem of extracting geometric
models from low contrast volumetric images, given a template or
reference shape of that model. We proceed by deforming a
reference model in a volumetric image. This reference
deformable model is represented as a simplex mesh submitted to
a regularizing shape constraint. Furthermore, we introduce an
original approach that combines the deformable model framework
with the elastic registration method (based on the iterative
closest point algorithm). This new method increases the
robustness of segmentation while allowing very complex
deformations of the original template. Examples of segmentation
of the liver and brain ventricles are provided.
- Grzeszczuk, RP, and Levin, DN, "''Brownian strings'': Segmenting images with stochastically deformable contours," IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 19, pp. 1100-1114, 1997.
Abstract:
This paper describes an image segmentation technique in which
an arbitrarily shaped contour was deformed stochastically until
it fitted around an object of interest. The evolution of the
contour was controlled by a simulated annealing process which
caused the contour to settle into the global minimum of an
image-derived ''energy'' function. The nonparametric energy
function was derived from the statistical properties of
previously segmented images, thereby incorporating prior
experience. Since the method was based on a state space search
for the contour with the best global properties, it was stable
in the presence of image errors which confound segmentation
techniques based on local criteria, such as connectivity.
Unlike ''snakes'' and other active contour approaches, the new
method could handle arbitrarily irregular contours in which
each interpixel crack represented an independent degree of
freedom. Furthermore, since the contour evolved toward the
global minimum of the energy, the method was more suitable for
fully automatic applications than the snake algorithm, which
frequently has to be reinitialized when the contour becomes
trapped in local energy minima. High computational complexity
was avoided by efficiently introducing a random local
perturbation in a time independent of contour length, providing
control over the size of the perturbation, and assuring that
resulting shape changes were unbiased. The method was
illustrated by using it to find the brain surface in magnetic
resonance head images and to track blood vessels in angiograms.
- O'Toole, AJ, Vetter, T, Volz, H, and Salter, EM, "Three-dimensional caricatures of human heads: distinctiveness and the perception of facial age," PERCEPTION, vol. 26, pp. 719-732, 1997.
Abstract:
A standard facial-caricaturing algorithm was applied to a
three-dimensional representation of human heads. This algorithm
sometimes produced heads that appeared 'caricatured'. More
commonly, however, exaggerating the distinctive three-
dimensional information in a face seemed to produce an increase
in the apparent age of the face-both at a local level, by
exaggerating small facial creases into wrinkles, and at a more
global level via changes that seemed to make the underlying
structure of the skull more evident. Concomitantly, de-emphasis
of the distinctive three-dimensional information in a face made
it appear relatively younger than the veridical and caricatured
faces. More formally, face-age judgments made by human
observers were ordered according to the level of caricature,
with anticaricatures judged younger than veridical faces, and
veridical faces judged younger than caricatured faces. These
results are discussed in terms of the importance of the nature
of the features made more distinct by a caricaturing algorithm
and the nature of human representation(s) of faces.
- Hozumi, T, Yoshida, K, Yoshioka, H, Yagi, T, Akasaka, T, Takagi, T, Nishiura, M, Watanabe, M, and Yoshikawa, J, "Echocardiographic estimation of left ventricular cavity area with a newly developed automated contour tracking method," JOURNAL OF THE AMERICAN SOCIETY OF ECHOCARDIOGRAPHY, vol. 10, pp. 822-829, 1997.
Abstract:
Development of an automated contour tracking method provides
detection and tracking of the endocardial boundary using the
energy minimization method without tracing a region of
interest. The purpose of this study was to compare the
automated contour tracking method and manually drawn methods
for the measurement of left ventricular cavity areas and
fractional area change. Apical four-chamber view was visualized
and recorded for off-line analysis in 11 patients by means of
two-dimensional echocardiography. The automated contour
tracking method automatically traces the endocardial border
from the recorded images and calculates left ventricular cavity
areas (end-diastole and end-systole) and fractional area
change. In the same images selected as end-diastole and end-
systole in the automated contour tracking method, left
ventricular endocardial border was manually traced to calculate
left ventricular cavity areas and fractional area change. Both
methods were compared by linear regression analysis for the
measurement of cavity areas and fractional area change. Left
ventricular areas measured by the automated contour tracking
method showed an excellent correlation with those by the manual
method (end-diastole: r = 0.99, y = 0.83x + 2.6, standard error
of the estimate = 1.5 cm^2; end-systole: r = 0.99, y = 0.96x -
0.8, standard error of the estimate = 1.2 cm^2). The mean
differences between the automated contour tracking and manual
methods were -3.1 +/- 5.1 cm^2 and -1.6 +/- 2.4 cm^2 at end-
diastole and end-systole, respectively. Fractional area change
determined by the automated contour tracking method correlated
well with that by the manual method (r = 0.95, y = 1.17x - 6.5,
standard error of the estimate = 3.4%). The mean difference
between the automated contour tracking and manual methods was -
0.8% +/- 7.1%. In conclusion, a newly developed automated
contour tracking method correlates highly with the manual
method for the estimation of left ventricular cavity areas and
fractional area change in high-quality images. This suggests
that this new technique may be useful in the automated
quantitation of left ventricular function in patients with
high-quality images with no dropout and no intercavity artifact
or structure.
- Huang, CL, Chang, WT, Wu, LC, and Wang, JK, "Three-dimensional PET emission scan registration and transmission scan synthesis," IEEE TRANSACTIONS ON MEDICAL IMAGING, vol. 16, pp. 542-561, 1997.
Abstract:
The duration of a positron emission tomography (PET) imaging
scan can be reduced if the transmission scan of one patient,
which is used for emission correction, can be synthesized from
the reference transmission scan of another patient. In this
paper, we propose a new intersubject PET emission scan
registration method and PET transmission synthesis method using
the boundary information of the body or brain scan of the PET
emission scans. The PET emission scans have poor image quality
and different intensity statistics, so we preprocess the
emission scans to have similar histograms and then apply the
point distribution model (PDM) [15] to extract the contours of
the emission scan. The extracted boundary contour of every
slice is used to reconstruct the three-dimensional (3-D)
surface of the reference set and the target set. Our
registration is 3-D surface-based and uses the normal flow
method [17] to find the correspondence vector field between two
3-D reconstructed surfaces. Since it is difficult to analyze
internal organs using the PET emission scan imaging without
correction, we assume that the deformation of internal organs
is homogeneous. With the corresponding vector field between the
two emission scans and the transmission scan of the reference
set, we can synthesize the transmission scan of the target set.
- Huang, CL, and Huang, YM, "Facial expression recognition using model-based feature extraction and action parameters classification," JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, vol. 8, pp. 278-290, 1997.
Abstract:
This paper introduces an automatic facial expression
recognition system which consists of two parts: facial feature
extraction and facial expression recognition. The system
applies the point distribution model and the gray-level model
to find the facial features. Then the position variations of
certain designated points on the facial feature are described
by 10 action parameters (APs). There are two phases in the
recognition process: the training phase and the recognition
phase. In the training phase, given 90 different expressions,
the system classifies the principal components of the APs of
all training expressions into six different clusters. In the
recognition phase, given a facial image sequence, it identifies
the facial expressions by extracting the 10 APs, analyzes the
principal components, and finally calculates the AP profile
correlation for a higher recognition rate. In the experiments,
our system has demonstrated that it can recognize the facial
expression effectively. (C) 1997 Academic Press.
- Choi, KN, Cross, ADJ, and Hancock, ER, "Localising facial features with matched filters," AUDIO- AND VIDEO-BASED BIOMETRIC PERSON AUTHENTICATION, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1206, pp. 11-20, 1997.
Abstract:
This paper describes a study of facial feature recognition
using matched filter techniques. The basic aim is to develop a
set of filters that can be used to characterise each of eight
different facial features. These are left and right eyes, left
and right-eyebrows, hairline, nose, mouth and chin. The matched
filters are extracted from training images using inverse
Fourier analysis. We provide an experimental evaluation of the
method on the University of Berne face database. Here we
explore the most effective choice of training data so that the
filters can be effectively applied when the facial pose varies.
We also evaluate the effectiveness of the method when facial
occlusion due to spectacles is present.
- Bowden, R, Mitchell, TA, and Sarhadi, M, "Cluster based nonlinear principle component analysis," ELECTRONICS LETTERS, vol. 33, pp. 1858-1859, 1997.
Abstract:
In the field of computer vision, principal component analysis
(PCA) is often used to provide statistical models of shape,
deformation or appearance. This simple statistical model
provides a constrained, compact approach to model-based vision.
However, as larger problems are considered, high dimensionality
and nonlinearity make linear PCA an unsuitable and unreliable
approach. A nonlinear PCA (NLPCA) technique is proposed which
uses cluster analysis and dimensional reduction to provide a
fast, robust solution. Simulation results on both 2D contour
models and greyscale images are presented.
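One reading of the proposed technique, sketched below with
scikit-learn (our choice of tools, not the authors'): partition the
data by cluster analysis, then fit an ordinary linear PCA within each
cluster, so the union of local linear models follows a curved
distribution that a single global PCA cannot.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.decomposition import PCA

    # Toy data: a noisy circular arc, i.e. a curved 1-D structure in 2D.
    rng = np.random.default_rng(3)
    t = rng.uniform(0, np.pi, 600)
    X = np.c_[np.cos(t), np.sin(t)] + 0.02 * rng.standard_normal((600, 2))

    labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
    local_models = [PCA(n_components=1).fit(X[labels == k]) for k in range(4)]

    # Compare the piecewise-linear model against one global PCA.
    def recon_error(model, pts):
        return np.mean((model.inverse_transform(model.transform(pts)) - pts) ** 2)

    local_err = np.mean([recon_error(m, X[labels == k])
                         for k, m in enumerate(local_models)])  # per-cluster mean
    global_err = recon_error(PCA(n_components=1).fit(X), X)
    print(f"local {local_err:.5f} vs global {global_err:.5f}")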
- Joshi, SC, Miller, MI, and Grenander, U, "On the geometry and shape of brain sub-manifolds," INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, vol. 11, pp. 1317-1343, 1997.
Abstract:
This paper develops mathematical representations for neuro-
anatomically significant substructures of the brain and their
variability in a population. The focus of the paper is on the
neuro-anatomical variation of the geometry and the "shape" of
two-dimensional surfaces in the brain. As examples, we focus on
the cortical and hippocampal surfaces in an ensemble of Macaque
monkeys and human MRI brains. The "shapes" of the substructures
are quantified via the construction of templates; the
variations are represented by defining probabilistic
deformations of the template. Methods for empirically
estimating probability measures on these deformations are
developed by representing the deformations as Gaussian random
vector fields on the embedded sub-manifolds. The Gaussian
random vector fields are constructed as quadratic mean limits
using complete orthonormal bases on the sub-manifolds. The
complete orthonormal bases are generated using modes of
vibrations of the geometries of the brain sub-manifolds. The
covariances are empirically estimated from an ensemble of brain
data. Principal component analysis is presented for
characterizing the "eigen-shape" of the hippocampus in an
ensemble of MRI-MPRAGE whole brain images. Clustering based on
eigen-shape is presented for two sub-populations of normal and
schizophrenic.
- Christensen, GE, Joshi, SC, and Miller, MI, "Volumetric transformation of brain anatomy," IEEE TRANSACTIONS ON MEDICAL IMAGING, vol. 16, pp. 864-877, 1997.
Abstract:
This paper presents diffeomorphic transformations of three-
dimensional (3-D) anatomical image data of the macaque
occipital lobe and whole brain cryosection imagery and of deep
brain structures in human brains as imaged via magnetic
resonance imagery. These transformations are generated in a
hierarchical manner, accommodating both global and local
anatomical detail. The initial low-dimensional registration is
accomplished by constraining the transformation to be in a low-
dimensional basis. The basis is defined by the Green's function
of the elasticity operator placed at predefined locations in
the anatomy and the eigenfunctions of the elasticity operator.
The high-dimensional large deformations are vector fields
generated via the mismatch between the template and target-
image volumes, constrained to be the solution of a Navier-
Stokes fluid model. As part of this procedure, the Jacobian of
the transformation is tracked, ensuring the generation of
diffeomorphisms. It is shown that transformations constrained
by quadratic regularization methods, such as the Laplacian,
biharmonic, and linear elasticity models, do not ensure that
the transformation maintains topology and, therefore, must only
be used for coarse global registration.
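The Jacobian tracking mentioned above can be illustrated with finite
differences. The sketch below (our construction, not the paper's
code) computes the Jacobian determinant of a 2D coordinate map on a
grid; a positive determinant everywhere means the map has not folded,
while a large deformation drives it negative.

    import numpy as np

    def jacobian_determinant(phi_x, phi_y, h=1.0):
        # Finite-difference Jacobian determinant of the map
        # (x, y) -> (phi_x, phi_y) on a regular grid with spacing h.
        dxx = np.gradient(phi_x, h, axis=1)   # d(phi_x)/dx
        dxy = np.gradient(phi_x, h, axis=0)   # d(phi_x)/dy
        dyx = np.gradient(phi_y, h, axis=1)
        dyy = np.gradient(phi_y, h, axis=0)
        return dxx * dyy - dxy * dyx

    # Identity plus a smooth sinusoidal displacement: a small amplitude
    # keeps det J positive; a large one produces folding (det J < 0).
    y, x = np.mgrid[0:64, 0:64].astype(float)
    for amp in (1.0, 12.0):
        phi_x = x + amp * np.sin(2 * np.pi * y / 64)
        phi_y = y + amp * np.sin(2 * np.pi * x / 64)
        detJ = jacobian_determinant(phi_x, phi_y)
        print(f"amp={amp}: min det J = {detJ.min():.2f}")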
- Hill, A, Brett, AD, and Taylor, CJ, "Automatic landmark identification using a new method of non-rigid correspondence," INFORMATION PROCESSING IN MEDICAL IMAGING, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1230, pp. 483-488, 1997.
Abstract:
A method for corresponding the boundaries of two shapes is
presented. The algorithm locates a matching pair of sparse
polygonal approximations, one for each of a pair of boundaries,
by minimising a cost function using a greedy algorithm. The
cost function expresses the dissimilarity in both the shape and
representation error (with respect to the defining boundary) of
the sparse polygons. Results are presented for three classes of
shape which exhibit various types of non-rigid deformation. The
algorithm is also applied to an automatic landmark
identification task for the construction of statistical shape
models.
|
|
1998 |
- Schuhmann, D, Seemann, M, Schoepf, UJ, Haubner, M, Krapichler, C, Gebicke, K, Reiser, M, and Englmeier, KH, "Computerized diagnostic data analysis and 3-D visualization," RADIOLOGE, vol. 38, pp. 799-809, 1998.
Abstract:
Purpose: To survey methods for 3D data visualization and image
analysis which can be used for computer-based diagnostics.
Material and methods: The methods available are explained
briefly and links to the literature are presented. Methods
which allow basic manipulation of 3D data are windowing,
rotation and clipping. More complex methods for visualization
of 3D data are multiplanar reformation, volume projections
(MIP, semi-transparent projections) and surface projections.
Methods for image analysis comprise local data transformation
(e.g. filtering) and definition and application of complex
models (e.g. deformable models). Results: Volume projections
produce an impression of the 3D data set without reducing the
data amount. This supports the interpretation of the 3D data
set and saves time in comparison to any investigation which
requires examination of all slice images. More advanced
techniques for visualization, e.g. surface projections and
hybrid rendering, visualize anatomical information to a very
detailed extent, but both techniques require the segmentation
of the structures of interest. Image analysis methods can be
used to extract these structures (e.g. an organ) from the image
data. Discussion: At the present time volume projections are
robust and fast enough to be used routinely. Surface
projections can be used to visualize complex and presegmented
anatomical features.
- Hagenlocker, M, and Fujimura, K, "CFFD: a tool for designing flexible shapes," VISUAL COMPUTER, vol. 14, pp. 271-287, 1998.
Abstract:
This paper describes a solid deformation method, composed free-
form deformation (CFFD), which applies a sequence of uniform
periodic B-spline FFDs over 3D Euclidean space. The
construction of the individual FFDs, which are defined by
unbounded control lattices, is described, concentrating on a
method by which feature point trajectories are employed to
control lattice point displacements. Problems due to mutual
influence of feature point trajectories are discussed, and
frozen points, which inhibit deformation in their proximity,
are introduced. Also, methods for constructing the FFD lattices
are proposed which control mutual influence of feature point
trajectories. The paper addresses computational issues of
constructing and applying CFFDs, and discusses the application
of CFFD to 3D design and animation.
- Montagnat, J, and Delingette, H, "Globally constrained deformable models for 3D object reconstruction," SIGNAL PROCESSING, vol. 71, pp. 173-186, 1998.
Abstract:
To achieve geometric reconstruction from 3D datasets two
complementary approaches have been widely used. On one hand,
the deformable model framework locally applies forces to fit
the data. On the other hand, the non-rigid registration
framework computes a global transformation minimizing the
distance between a template and the data. We first show that
applying a global transformation on a surface template is
equivalent to applying certain global forces on a deformable
model. Second, we propose a scheme which combines the
registration and free-form deformation. This globally
constrained deformation scheme allows us to control the amount
of deformation from the reference shape with a single
parameter. Finally, we propose a general algorithm for
performing model-based reconstruction in a robust and accurate
manner. Examples on both range data and medical images are used
to illustrate and validate the globally constrained deformation
framework. (C) 1998 Elsevier Science B.V. All rights reserved.
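
The single-parameter control of deformation described above can be pictured as a convex combination of a globally transformed template and the locally fitted surface. This is a schematic sketch under that assumption, not the authors' actual equations:

    import numpy as np

    def constrained_update(reference, local_target, global_tf, lam):
        # reference:    (n, 3) template vertices
        # local_target: (n, 3) positions the free deformation reaches
        # global_tf:    callable applying the best global (e.g. rigid
        #               or affine) fit to a point array
        # lam: 0 -> purely global (registration-like),
        #      1 -> purely local (free-form deformation)
        return (1.0 - lam) * global_tf(reference) + lam * local_target

    ref = np.random.default_rng(1).normal(size=(100, 3))
    tgt = 1.1 * ref + 0.05                      # pretend local fit
    blended = constrained_update(ref, tgt, lambda p: 1.1 * p, 0.3)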
- Jain, AK, Zhong, Y, and Dubuisson-Jolly, MP, "Deformable template models: A review," SIGNAL PROCESSING, vol. 71, pp. 109-129, 1998.
Abstract:
In this paper, we review the recently published work on
deformable models. We have chosen to concentrate on 2D
deformable models and relate the energy minimization approaches
to the Bayesian formulations. We categorize the various active
contour systems according to the definition of the deformable
model. We also present in detail one particular formulation for
deformable templates which combines edge, texture, color and
region information for the external energy and model
deformations using wavelets, splines or Fourier descriptors. We
explain how these models can be used for segmentation, image
retrieval in a large database and object tracking in a video
sequence. (C) 1998 Elsevier Science B.V. All rights reserved.
- Fujimura, K, and Makarov, M, "Foldover-free image warping," GRAPHICAL MODELS AND IMAGE PROCESSING, vol. 60, pp. 100-111, 1998.
Abstract:
An image warping method is presented that deforms an image
continuously without foldover, while observing a given set of
trajectories of feature elements. Any intermediate image during
the morph is homeomorphic to the initial image and the morphing
process is a homotopy. The method permits points, line-
segments, and polygons to be included as features in the image.
Our method is based on time-varying triangulation, that is,
triangulation changes as features move. Accordingly, the
deformation mapping is updated locally for the part for which
the triangulation changes. Experimental results are included to
demonstrate the feasibility of our approach and the complexity
of the algorithm is analyzed. (C) 1998 Academic Press.
- Ip, HHS, and Shen, DG, "An affine-invariant active contour model (AI-snake) for model-based segmentation," IMAGE AND VISION COMPUTING, vol. 16, pp. 135-146, 1998.
Abstract:
In this paper, we show that existing shape-based active
contour models are not affine-invariant, and we address the
problem by presenting an affine-invariant snake model (AI-
snake) whose energy function is defined in terms of local
and global affine-invariant features. The main characteristic
of the AI-snake is that, during the process of object
extraction, the pose of the model contour is dynamically
adjusted such that it is in alignment with the current snake
contour by solving the snake-prototype correspondence problem
and determining the required affine transformation. In
addition, we formulate the correspondence matching between the
snake and the object prototype as an error minimization process
between two feature vectors which capture both local and global
deformation information. We show that the technique is robust
against object deformations and complex scenes. (C) 1998
Elsevier Science B.V.
- Edwards, GJ, Lanitis, A, Taylor, CJ, and Cootes, TF, "Statistical models of face images - Improving specificity," IMAGE AND VISION COMPUTING, vol. 16, pp. 203-211, 1998.
Abstract:
Model-based approaches to the interpretation of face images
have proved very successful. We have previously described
statistically based models of face shape and grey-level
appearance and shown how they can be used to perform various
coding and interpretation tasks. In the paper we describe
improved methods of modelling which couple shape and grey-level
information more directly than our existing methods, isolate
the changes in appearance due to different sources of
variability (person, expression, pose, lighting) and deal with
non-linear shape variation. We show that the new methods are
better suited to interpretation and tracking tasks. (C) 1998
Elsevier Science B.V.
- Glasbey, CA, and Mardia, KV, "A review of image-warping methods," JOURNAL OF APPLIED STATISTICS, vol. 25, pp. 155-171, 1998.
Abstract:
Image warping is a transformation which maps all positions in
one image plane to positions in a second plane. It arises in
many image analysis problems, whether in order to remove
optical distortions introduced by a camera or a particular
viewing perspective, to register an image with a map or
template, or to align two or more images. The choice of warp is
a compromise between a smooth distortion and one which achieves
a good match. Smoothness can be ensured by assuming a
parametric form for the warp or by constraining it using
differential equations. Matching can be specified by points to
be brought into alignment, by local measures of correlation
between images, or by the coincidence of edges. Parametric and
non-parametric approaches to warping, and matching criteria,
are reviewed.
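
In implementation terms, a warp is usually applied backwards: for every output pixel, the mapping gives a position in the source image, which is then interpolated. An illustrative affine example using SciPy; the particular matrix and offset are arbitrary:

    import numpy as np
    from scipy.ndimage import map_coordinates

    def affine_warp(image, A, t):
        # Backward mapping: x_src = A @ x_dst + t for each pixel.
        rows, cols = np.mgrid[0:image.shape[0], 0:image.shape[1]]
        dst = np.stack([rows.ravel(), cols.ravel()])   # (2, n)
        src = A @ dst + t[:, None]                     # source coords
        out = map_coordinates(image, src, order=1, mode="nearest")
        return out.reshape(image.shape)

    img = np.zeros((64, 64))
    img[20:40, 20:40] = 1.0
    warped = affine_warp(img, np.array([[1.05, 0.1], [0.0, 0.95]]),
                         np.array([-2.0, 3.0]))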
- Garrido, A, and De la Blanca, NP, "Physically-based active shape models: Initialization and optimization," PATTERN RECOGNITION, vol. 31, pp. 1003-1017, 1998.
Abstract:
In this paper we describe a new approach for 2-D object
segmentation using an automatic method applied to images with
problems such as partial information, overlapping objects, many
objects in a single scene, severe noise conditions, and
objects with a very high degree of deformation. We use a
physically-based shape model to obtain a deformable template,
which is defined on a canonical orthogonal coordinate system.
The proposed methodology works starting from the output of an
edge detector, which is processed to automatically obtain an
approximation of the shape. The final estimation of the shapes
is obtained fitting a deformable template model, which is
defined on a learned surface of deformation. Results from
biological images are presented. (C) 1998 Pattern Recognition
Society. Published by Elsevier Science Ltd. All rights
reserved.
- Ivins, JP, and Porrill, J, "A deformable model of the human iris for measuring small three-dimensional eye movements," MACHINE VISION AND APPLICATIONS, vol. 11, pp. 42-51, 1998.
Abstract:
This paper describes a deformable model of the human iris which
forms part of a system for accurate offline measurement of
binocular three-dimensional eye movements, particularly
cyclotorsion (torsion), from video image sequences. At least
two existing systems measure torsion from infrared video images
by pupil tracking followed by cross correlation using arcs of
bandpass-filtered iris texture. Unfortunately, pupil expansion
and contraction reduces the accuracy of this method unless
drugs are used to constrict the pupil, which causes temporary
blurred vision. A five-parameter deformable model of the iris
is therefore developed for analysing images obtained without
the use of drugs. This model can translate (horizontal and
vertical eye motion), rotate (torsion) and scale both uniformly
and radially (pupil changes). Torsion measurements obtained
with the model are repeatable and accurate to within 0.1
degrees; this performance is illustrated by analysing binocular
torsion during fixation on a stationary target.
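
As a rough picture of such a five-parameter model, the sketch below samples pupil and limbus contours under translation, torsion, uniform scale, and radial (pupil) scale. The parameterization and nominal radii are guesses for illustration, not the authors' model:

    import numpy as np

    def iris_model(tx, ty, torsion, scale, radial, n=180):
        # tx, ty:  horizontal/vertical translation of the eye
        # torsion: rotation about the line of sight (radians); only
        #          visible through iris texture, so it is applied to
        #          the sampling angles used to index that texture
        # scale:   uniform scaling (viewing distance / zoom)
        # radial:  pupil dilation relative to the limbus
        theta = np.linspace(0, 2 * np.pi, n, endpoint=False) + torsion
        unit = np.stack([np.cos(theta), np.sin(theta)], axis=1)
        centre = np.array([tx, ty])
        limbus_r = 6.0 * scale            # nominal limbus radius, mm
        pupil_r = 2.0 * scale * radial    # nominal pupil radius, mm
        return centre + pupil_r * unit, centre + limbus_r * unit

    pupil, limbus = iris_model(0.1, -0.2, np.deg2rad(0.1), 1.0, 1.1)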
- Vetter, T, "Synthesis of novel views from a single face image," INTERNATIONAL JOURNAL OF COMPUTER VISION, vol. 28, pp. 103-116, 1998.
Abstract:
Images formed by a human face change with viewpoint. A new
technique is described for synthesizing images of faces from
new viewpoints, when only a single 2D image is available. A
novel 2D image of a face can be computed without explicitly
computing the 3D structure of the head. The technique draws on
a single generic 3D model of a human head and on prior
knowledge of faces based on example images of other faces seen
in different poses. The example images are used to "learn" a
pose-invariant shape and texture description of a new face. The
3D model is used to solve the correspondence problem between
images showing faces in different poses. The proposed method is
interesting for view independent face recognition tasks as well
as for image synthesis problems in areas like teleconferencing
and virtualized reality.
- Gu, C, and Lee, MC, "Semiautomatic segmentation and tracking of semantic video objects," IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, vol. 8, pp. 572-584, 1998.
Abstract:
This paper introduces a novel semantic video object extraction
system using mathematical morphology and a perspective motion
model. Inspired by the results from the study of the human
visual system, we intend to solve the semantic video object
extraction problem in two separate steps: supervised I-frame
segmentation, and unsupervised P-frame tracking. First, the
precise semantic video object boundary can be found using a
combination of human assistance and a morphological
segmentation tool. Second, the semantic video objects in the
remaining frames are obtained using global perspective motion
estimation and compensation of the previous semantic video
object plus boundary refinement as used for I frames.
- Lepsoy, S, and Curinga, S, "Conversion of articulatory parameters into active shape model coefficients for lip motion representation and synthesis," SIGNAL PROCESSING-IMAGE COMMUNICATION, vol. 13, pp. 209-225, 1998.
Abstract:
Speech-driven facial animation combines techniques from
different disciplines such as image analysis, computer
graphics, and speech analysis. Active shape models (ASM) used
in image analysis are excellent tools for characterizing lip
contour shapes and approximating their motion in image
sequences. By controlling the coefficients for an ASM, such a
model can also be used for animation. We design a mapping of
the articulatory parameters used in phonetics into ASM
coefficients that control nonrigid lip motion. The mapping is
designed to minimize the approximation error when articulatory
parameters measured on training lip contours are taken as input
to synthesize the training lip movements. Since articulatory
parameters can also be estimated from speech, the proposed
technique can form an important component of a speech-driven
facial animation system. (C) 1998 Elsevier Science B.V. All
rights reserved.
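
A linear version of the mapping described above, from articulatory parameters to ASM coefficients with minimum approximation error over training data, is an ordinary least-squares regression. A sketch with synthetic stand-in data (the linear form is an assumption; the paper designs the mapping more carefully):

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.normal(size=(500, 4))     # articulatory params per frame
    B = A @ rng.normal(size=(4, 8)) + 0.01 * rng.normal(size=(500, 8))
                                      # ASM coefficients per frame

    # Append a bias column and solve min_W ||[A 1] W - B||^2.
    A1 = np.hstack([A, np.ones((len(A), 1))])
    W, *_ = np.linalg.lstsq(A1, B, rcond=None)

    def articulatory_to_asm(a):
        # Map one articulatory vector to ASM coefficients.
        return np.append(a, 1.0) @ W

    coeffs = articulatory_to_asm(A[0])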
- Ivins, J, and Porrill, J, "Constrained active region models for fast tracking in color image sequences," COMPUTER VISION AND IMAGE UNDERSTANDING, vol. 72, pp. 54-71, 1998.
Abstract:
Image segmentation is a fundamental problem in computer vision,
for which deformable models offer a partial solution. Most
deformable models work by performing some kind of edge
detection; complementary region growing methods have not often
been used. As a result, deformable models that track regions
rather than edges have yet to be developed to a great extent.
Active region models are a relatively new type of deformable
model driven by a region energy that is a function of the
statistical characteristics of an image. This paper describes
the use of constrained active region models for frame-rate
tracking in color video images on widely available computer
hardware. Two of the many color representations now in use are
reviewed for this purpose: the intensity-based RGB space and
the more intuitive HSV space. Normalized RGB, which is
essentially a measure of hue and saturation, emerges as the
preferred representation because it is invariant to
illumination changes and can be obtained from many frame-
grabbers via a simple fast software transformation. Three types
of motion are examined for constraining deformable models:
rigid models can only translate and rotate to fit image
features; conformal models can also change size; affine models
exhibit two kinds of shearing in addition to the other
components. Two methods are described for producing affine
motion, given the desired unconstrained motion calculated by
searching for local energy minima lying perpendicular to the
model boundary. An existing method, based on iterative gradient
descent, computes translating, rotating, scaling, and shearing
forces which can be combined to produce affine and other types
of motion. A faster, more accurate method uses least-squares
minimization to approximate the desired motion; with this
method it is also possible to derive specific equations for
rigid and conformal motion and to correct for the aperture
problem associated with the perpendicular search method. The
advantages of the new least-squares method are illustrated by
using it to drive an active region model via an affine
transformation which tracks the movements of a robot arm at
frame rate in color video images. (C) 1998 Academic Press.
- Cross, ADJ, and Hancock, ER, "Graph matching with a dual-step EM algorithm," IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 20, pp. 1236-1253, 1998.
Abstract:
This paper describes a new approach to matching geometric
structure in 2D point-sets. The novel feature is to unify the
tasks of estimating transformation geometry and identifying
point-correspondence matches. Unification is realized by
constructing a mixture model over the bipartite graph
representing the correspondence match and by effecting
optimization using the EM algorithm. According to our EM
framework, the probabilities of structural correspondence gate
contributions to the expected likelihood function used to
estimate maximum likelihood transformation parameters. These
gating probabilities measure the consistency of the matched
neighborhoods in the graphs. The recovery of transformational
geometry and hard correspondence matches are interleaved and
are realized by applying coupled update operations to the
expected log-likelihood function. In this way, the two
processes bootstrap one another. This provides a means of
rejecting structural outliers. We evaluate the technique on two
real-world problems. The first involves the matching of
different perspective views of 3.5-inch floppy discs. The
second example is furnished by the matching of a digital map
against aerial images that are subject to severe barrel
distortion due to a line-scan sampling process. We complement
these experiments with a sensitivity study based on synthetic
data.
- Wang, YM, and Staib, LH, "Elastic model based non-rigid registration incorporating statistical shape information," MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION - MICCAI'98, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1496, pp. 1162-1173, 1998.
Abstract:
This paper describes a new method of non-rigid registration
using the combined power of elastic and statistical shape
models. The transformations are constrained to be consistent
with a physical model of elasticity to maintain smoothness and
continuity. A Bayesian formulation, based on this model, on an
intensity similarity measure, and on statistical shape
information embedded in corresponding boundary points, is
employed to find a more accurate and robust non-rigid
registration. A dense set of forces arises from the intensity
similarity measure to accommodate complex anatomical details. A
sparse set of forces constrains consistency with statistical
shape models derived from a training set. A number of
experiments were performed on both synthetic and real medical
images of the brain and heart to evaluate the approach. It is
shown that statistical boundary shape information significantly
augments and improves elastic model based non-rigid
registration.
- Fleute, M, and Lavallee, S, "Building a complete surface model from sparse data using statistical shape models: Application to computer assisted knee surgery," MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION - MICCAI'98, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1496, pp. 879-887, 1998.
Abstract:
This paper addresses the problem of extrapolating very few
range data to obtain a complete surface representation of an
anatomical structure. A new method that uses statistical shape
models is proposed and its application to modeling a few points
manually digitized on the femoral surface is detailed, in order
to improve visualization of a system developed by TIMC
laboratory for computer assisted anterior cruciate ligament
(ACL) reconstruction. The model is built from a population of
11 femur specimens digitized manually. Data sets are registered
together using an elastic registration method of Szeliski and
Lavallee based on octree-splines. Principal Components Analysis
(PCA) is performed on a field of surface deformation vectors.
Fitting this statistical model to a few points is performed by
non-linear optimisation. Results are presented for both
simulated and real data. The method is very flexible and can be
applied to any structures for which the shape is stable.
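
Fitting a PCA shape model to a handful of digitized points reduces, in the linear case, to solving for the mode weights that best explain the observed coordinates. The cited work uses non-linear optimisation; the sketch below shows only that linear core, with the observed-coordinate selection as an assumption:

    import numpy as np

    def fit_modes_to_sparse(mean, modes, idx, vals, reg=1e-3):
        # mean:  (d,) mean shape vector
        # modes: (d, m) principal deformation modes
        # idx:   indices of the d coordinates that were digitized
        # vals:  measured values at those coordinates
        # reg:   Tikhonov term keeping weights small for sparse data
        P = modes[idx, :]
        r = vals - mean[idx]
        b = np.linalg.solve(P.T @ P + reg * np.eye(modes.shape[1]),
                            P.T @ r)
        return mean + modes @ b          # full extrapolated shape

    rng = np.random.default_rng(3)
    mean = rng.normal(size=30)
    modes = np.linalg.qr(rng.normal(size=(30, 3)))[0]
    idx = np.array([0, 5, 9, 14, 20, 28])
    full = fit_modes_to_sparse(mean, modes, idx,
                               mean[idx] + 0.1 * rng.normal(size=6))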
- Duta, N, and Sonka, M, "Segmentation and interpretation of MR brain images: An improved active shape model," IEEE TRANSACTIONS ON MEDICAL IMAGING, vol. 17, pp. 1049-1062, 1998.
Abstract:
This paper reports a novel method for fully automated
segmentation that is based on description of shape and its
variation using point distribution models (PDM's). An
improvement of the active shape procedure introduced by Cootes
and Taylor to find new examples of previously learned shapes
using PDM's is presented. The new method for segmentation and
interpretation of deep neuroanatomic structures such as
thalamus, putamen, ventricular system, etc. incorporates a
priori knowledge about shapes of the neuroanatomic structures
to provide their robust segmentation and labeling in magnetic
resonance (MR) brain images. The method was trained in eight MR
brain images and tested in 19 brain images by comparison to
observer-defined independent standards. Neuroanatomic
structures in all testing images were successfully identified.
Computer-identified and observer-defined neuroanatomic
structures agreed well. The average labeling error was 7% +/-
3%. Border positioning errors were quite small, with the
average border positioning error of 0.8 +/- 0.1 pixels in 256 x
256 MR images. The presented method was specifically developed
for segmentation of neuroanatomic structures in MR brain
images. However, it is generally applicable to virtually any
task involving deformable shape analysis.
- Grenander, U, and Miller, MI, "Computational anatomy: An emerging discipline," QUARTERLY OF APPLIED MATHEMATICS, vol. 56, pp. 617-694, 1998.
Abstract:
This paper studies mathematical methods in the emerging new
discipline of Computational Anatomy. Herein we formalize the
Brown/Washington University model of anatomy following the
global pattern theory introduced in [1, 2], in which anatomies
are represented as deformable templates, collections of 0, 1,
2, 3-dimensional manifolds. Typical structure is carried by the
template with the variabilities accommodated via the
application of random transformations to the background
manifolds. The anatomical model is a quadruple $(\Omega, H,
\mathcal{I}, \mathcal{P})$: the background space $\Omega =
\bigcup_\alpha M_\alpha$ of 0, 1, 2, 3-dimensional manifolds,
the set of diffeomorphic transformations on the background
space $H : \Omega \leftrightarrow \Omega$, the space of
idealized medical imagery $\mathcal{I}$, and $\mathcal{P}$ the
family of probability measures on $H$. The group of
diffeomorphic transformations $H$ is chosen to be rich enough
so that a large family of shapes may be generated with the
topologies of the template maintained. For normal anatomy one
deformable template is studied, with $(\Omega, H, \mathcal{I})$
corresponding to a homogeneous space [3], in that it can be
completely generated from one of its elements, $\mathcal{I} = H
I_{temp}$, $I_{temp} \in \mathcal{I}$. For disease, a family of
templates $\bigcup_\alpha I_{temp}^\alpha$ is introduced, of
perhaps varying dimensional transformation classes. The
complete anatomy is a collection of homogeneous spaces
$\mathcal{I}_{total} = \bigcup_\alpha (\mathcal{I}^\alpha,
H^\alpha)$. There are three principal components to
computational anatomy studied herein. (1) Computation of large
deformation maps: Given any two elements $I, I' \in
\mathcal{I}$ in the same homogeneous anatomy $(\Omega, H,
\mathcal{I})$, compute diffeomorphisms $h$ from one anatomy to
the other, $I \underset{h^{-1}}{\overset{h}{\rightleftarrows}}
I'$. This is the principal method by which anatomical
structures are understood, transferring the emphasis from the
images $I \in \mathcal{I}$ to the structural transformations $h
\in H$ that generate them. (2) Computation of empirical
probability laws: Given populations of anatomical imagery and
diffeomorphisms between them $I
\underset{h_n^{-1}}{\overset{h_n}{\rightleftarrows}} I_n$, $n =
1, \ldots, N$, generate probability laws $P \in \mathcal{P}$ on
$H$ that represent the anatomical variation reflected by the
observed population of diffeomorphisms $h_n$, $n = 1, \ldots,
N$. (3) Inference and disease testing: Within the anatomy
$(\Omega, H, \mathcal{I}, \mathcal{P})$, perform Bayesian
classification and testing for disease and anomaly.
|
|
1999 |
- Luo, B, and Hancock, ER, "Procrustes alignment with the EM algorithm," COMPUTER ANALYSIS OF IMAGES AND PATTERNS, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1689, pp. 623-631, 1999.
Abstract:
This paper casts the problem of point-set alignment via
Procrustes analysis into a maximum likelihood framework using
the EM algorithm. The aim is to improve the robustness of the
Procrustes alignment to noise and clutter. By constructing a
Gaussian mixture model over the missing correspondences between
individual points, we show how alignment can be realised by
applying singular value decomposition to a weighted point
correlation matrix. Moreover, by gauging the relational
consistency of the assigned correspondence matches, we can edit
the point sets to remove clutter. We illustrate the
effectiveness of the method matching stereogram. We also
provide a sensitivity analysis to demonstrate the operational
advantages of the method.
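
The core computation, recovering the rotation from a weighted point correlation matrix by singular value decomposition, fits in a few lines. In this sketch the weights stand in for the EM correspondence responsibilities; it is illustrative, not the paper's algorithm:

    import numpy as np

    def weighted_procrustes_rotation(X, Y, w):
        # X, Y: (n, d) corresponding point sets; w: (n,) weights.
        w = w / w.sum()
        Xc = X - (w[:, None] * X).sum(axis=0)   # weighted centering
        Yc = Y - (w[:, None] * Y).sum(axis=0)
        C = Xc.T @ (w[:, None] * Yc)            # weighted correlation
        U, _, Vt = np.linalg.svd(C)
        D = np.eye(C.shape[0])
        D[-1, -1] = np.sign(np.linalg.det(U @ Vt))  # no reflections
        return U @ D @ Vt                       # align X via X @ R

    rng = np.random.default_rng(4)
    X = rng.normal(size=(50, 2))
    a = 0.3
    R_true = np.array([[np.cos(a), -np.sin(a)],
                       [np.sin(a),  np.cos(a)]])
    R = weighted_procrustes_rotation(X, X @ R_true, np.ones(50))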
- Chui, H, Rambo, J, Duncan, J, Schultz, R, and Rangarajan, A, "Registration of cortical anatomical structures via robust 3D point matching," INFORMATION PROCESSING IN MEDICAL IMAGING, PROCEEDINGS, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1613, pp. 168-181, 1999.
Abstract:
Inter-subject non-rigid registration of cortical anatomical
structures as seen in MR is a challenging problem. The
variability of the sulcal and gyral patterns across patients
makes the task of registration especially difficult regardless
of whether voxel- or feature-based techniques are used. In this
paper, we present an approach to matching sulcal point features
interactively extracted by neuroanatomical experts. The robust
point matching (RPM) algorithm is used to find the optimal
affine transformations for matching sulcal points. A 3D
linearly interpolated non-rigid warping is then generated for
the original image volume. We present quantitative and visual
comparisons between Talairach, mutual information-based
volumetric matching and RPM on five subjects' MR images.
- Cootes, TF, Beeston, C, Edwards, GJ, and Taylor, CJ, "A unified framework for atlas matching using Active Appearance Models," INFORMATION PROCESSING IN MEDICAL IMAGING, PROCEEDINGS, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1613, pp. 322-333, 1999.
Abstract:
We propose to use statistical models of shape and texture as
deformable anatomical atlases. By training on sets of labelled
examples these can represent both the mean structure and
appearance of anatomy in medical images, and the allowable
modes of deformation. Given enough training examples such a
model should be able to synthesise any image of normal anatomy. By
finding the parameters which minimise the difference between
the synthesised model image and the target image we can locate
all the modelled structure. This potentially time consuming
step can be solved rapidly using the Active Appearance Model
(AAM). In this paper we describe the models and the AAM
algorithm and demonstrate the approach on structures in MR
brain cross-sections.
- Brett, AD, and Taylor, CJ, "A framework for automated landmark generation for automated 3D statistical model construction," INFORMATION PROCESSING IN MEDICAL IMAGING, PROCEEDINGS, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1613, pp. 376-381, 1999.
Abstract:
We describe a method of pairwise 3D surface correspondence for
the automated generation of landmarks on a set of examples from
a class of shape. We show how the pairwise corresponder can be
used in an extension of an existing framework for establishing
dense correspondences between a set of training examples to
build a 3D statistical model. The framework relies upon
additional algorithms for the production of surface paths
between vertices on a polyhedral mesh, and these are described.
An example statistical model is shown for the left lateral
ventricle of the brain.
- Velasco, HMG, Aligue, FJL, Orellana, CJG, Macias, MM, and Sotoca, MIA, "Application of ANN techniques to automated identification of bovine livestock," ENGINEERING APPLICATIONS OF BIO-INSPIRED ARTIFICIAL NEURAL NETWORKS, VOL II, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1607, pp. 422-431, 1999.
Abstract:
In this work a classification system is presented that, taking
lateral images of cattle as inputs, is able to identify the
animals and classify them by breed into previously learnt
classes. The system consists of two fundamental parts. In the
first one, a deformable-model-based preprocessing of the image
is made, in which the contour of the animal in the photograph
is sought, extracted, and normalized. Next, a neural classifier
is presented that, supplemented with a decision-maker at its
output, makes the distribution into classes. In the last part,
the results obtained in a real application of this methodology
are presented.
- Germond, L, Dojat, M, Taylor, C, and Garbay, C, "A multi-agent system for MRI brain segmentation," ARTIFICIAL INTELLIGENCE IN MEDICINE, LECTURE NOTES IN ARTIFICIAL INTELLIGENCE, vol. 1620, pp. 423-432, 1999.
Abstract:
In this paper we present an original approach for the
segmentation of MRI brain images which is based on a
cooperation between low-level and high-level approaches. MRI
brain images are very difficult to segment mainly due to the
presence of inhomogeneities within tissues and also due to the
high anatomical variability of the brain topology between
individuals. In order to tackle these difficulties, we have
developed a method whose characteristics are: (i) the use of
a priori knowledge essentially anatomical and model-based; (ii)
a multi-agent system (MAS) for low-level region segmentation;
(iii) a cooperation between a priori knowledge and low-level
segmentation to guide and constrain the segmentation processes.
These characteristics allow automatic detection
of the main tissues of the brain. The method is validated with
phantoms and real images through comparisons with another
widely used approach (SPM).
- Hutton, TJ, Hammond, P, and Davenport, JC, "Active shape models for customised prosthesis design," ARTIFICIAL INTELLIGENCE IN MEDICINE, LECTURE NOTES IN ARTIFICIAL INTELLIGENCE, vol. 1620, pp. 448-452, 1999.
Abstract:
Images and computer graphics play an increasingly important
role in the design and manufacture of medical prostheses and
implants. Images provide guidance on optimal design in terms of
location, preparation and the overall shape and configuration
of subcomponents. Direct manipulation of a graphical
representation provides a natural design environment. RaPiD is
a CAD-like knowledge-based assistant for designing a dental
prosthesis known as a removable partial denture (RPD). The
expertise embedded in RaPiD encourages optimal subcomponent
configuration, but currently supports only minor customisation.
This paper describes how oral images and Active Shape Models
(ASMs) are being used to address this limitation.
- Kelemen, A, Szekely, G, and Gerig, G, "Elastic model-based segmentation of 3-D neuroradiological data sets," IEEE TRANSACTIONS ON MEDICAL IMAGING, vol. 18, pp. 828-839, 1999.
Abstract:
This paper presents a new technique for the automatic model-
based segmentation of three-dimensional (3-D) objects from
volumetric image data. The development closely follows the
seminal work of Taylor and Cootes on active shape models, but
is based on a hierarchical parametric object description rather
than a point distribution model. The segmentation system
includes both the building of statistical models and the
automatic segmentation of new image data sets via a restricted
elastic deformation of shape models. Geometric models are
derived from a sample set of image data which have been
segmented by experts. The surfaces of these binary objects are
converted into parametric surface representations, which are
normalized to get an invariant object-centered coordinate
system. Surface representations are expanded into series of
spherical harmonics which provide parametric descriptions of
object shapes. It is shown that invariant object surface
parametrization provides a good approximation to automatically
determine object homology in terms of sets of corresponding
sets of surface points. Gray-level information near object
boundaries is represented by 1-D intensity profiles normal to
the surface. Considering automatic segmentation of brain
structures as our driving application, our choice of
coordinates for object alignment was the well-accepted
stereotactic coordinate system. Major variations of object
shapes around the mean shape, also referred to as shape
eigenmodes, are calculated in shape parameter space rather than
the feature space of point coordinates. Segmentation makes use
of the object shape statistics by restricting possible elastic
deformations to the range of the training shapes. The mean
shapes are initialized in a new data set by specifying the
landmarks of the stereotactic coordinate system. The model
elastically deforms, driven by the displacement forces across
the object's surface, which are generated by matching local
intensity profiles. Elastic deformations are limited by
setting bounds for the maximum variations in eigenmode space.
The technique has been applied to automatically segment left
and right hippocampus, thalamus, putamen, and globus pallidus
from volumetric magnetic resonance scans taken from
schizophrenia studies. The results have been validated by
comparison of automatic segmentation with the results obtained
by interactive expert segmentation.
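
Restricting elastic deformation "into the range of the training shapes" is commonly implemented by bounding each shape parameter by a multiple of the standard deviation along its eigenmode. A schematic version; the three-sigma bound is a conventional choice, not necessarily the one used here:

    import numpy as np

    def clamp_shape_parameters(b, eigenvalues, n_sigma=3.0):
        # b: (m,) current mode weights; eigenvalues: (m,) training
        # variances along each eigenmode.
        bound = n_sigma * np.sqrt(eigenvalues)
        return np.clip(b, -bound, bound)

    b_ok = clamp_shape_parameters(np.array([0.5, -4.0, 1.2]),
                                  np.array([1.0, 1.0, 0.25]))
    # -> [0.5, -3.0, 1.2]; the second mode was pulled back in range.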
- Lotjonen, J, Magnin, IE, Nenonen, J, and Katila, T, "Reconstruction of 3-D geometry using 2-D profiles and a geometric prior model," IEEE TRANSACTIONS ON MEDICAL IMAGING, vol. 18, pp. 992-1002, 1999.
Abstract:
A method has been developed to reconstruct three-dimensional
(3-D) surfaces from two-dimensional (2-D) projection data. It
is used to produce individualized boundary element models,
consisting of thorax and lung surfaces, for electro- and
magnetocardiographic inverse problems. Two orthogonal
projections are utilized. A geometrical prior model, built
using segmented magnetic resonance images, is deformed
according to profiles segmented from projection images. In our
method, virtual X-ray images of the prior model are first
constructed by simulating real X-ray imaging. The 2-D profiles
of the model are segmented from the projections and elastically
matched with the profiles segmented from patient data. The
displacement vectors produced by the elastic 2-D matching are
back projected onto the 3-D surface of the prior model.
Finally, the model is deformed using the back-projected
vectors. Two different deformation methods are proposed. The
accuracy of the method is validated by a simulation. The
average reconstruction error of a thorax and lungs was 1.22
voxels, corresponding to about 5 mm.
- Chesnaud, C, Refregier, P, and Boulet, V, "Statistical region snake-based segmentation adapted to different physical noise models," IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 21, pp. 1145-1157, 1999.
Abstract:
Algorithms for object segmentation are crucial in many image
processing applications. During past years, active contour
models (snakes) have been widely used for finding the contours
of objects. This segmentation strategy is classically edge-
based in the sense that the snake is driven to fit the maximum
of an edge map of the scene. In this paper, we propose a region
snake approach and we determine fast algorithms for the
segmentation of an object in an image. The algorithms developed
in a Maximum Likelihood approach are based on the calculation
of the statistics of the inner and the outer regions (defined
by the snake). It has thus been possible to develop optimal
algorithms adapted to the random fields which describe the gray
levels in the input image if we assume that their probability
density function family is known. We demonstrate that this
approach is still efficient when no boundary edge exists in
the image. We also show that one can obtain fast algorithms by
transforming the summations over a region, for the calculation
of the statistics, into summations along the boundary of the
region. Finally, we will provide numerical simulation results
for different physical situations in order to illustrate the
efficiency of this approach.
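
For the Gaussian case, the region energy being optimized reduces to the negative log-likelihood of the inner and outer pixel populations under their own sample statistics. A minimal sketch of that one case (the paper treats several noise families and the fast boundary-only summations):

    import numpy as np

    def gaussian_region_energy(image, mask):
        # mask: boolean array, True inside the snake. Up to constants
        # independent of the partition, -log L = n/2 * log(var) summed
        # over the two regions; lower is better.
        energy = 0.0
        for region in (image[mask], image[~mask]):
            var = region.var() + 1e-12      # ML variance estimate
            energy += 0.5 * region.size * np.log(var)
        return energy

    rng = np.random.default_rng(5)
    img = rng.normal(size=(64, 64))
    img[16:48, 16:48] += 3.0                # bright object
    true_mask = np.zeros(img.shape, dtype=bool)
    true_mask[16:48, 16:48] = True
    wrong_mask = np.roll(true_mask, 10, axis=1)
    assert (gaussian_region_energy(img, true_mask)
            < gaussian_region_energy(img, wrong_mask))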
- Stammberger, T, Eckstein, F, Michaelis, M, Englmeier, KH, and Reiser, M, "Interobserver reproducibility of quantitative cartilage measurements: Comparison of B-spline snakes and manual segmentation," MAGNETIC RESONANCE IMAGING, vol. 17, pp. 1033-1042, 1999.
Abstract:
The objective of this work was to develop a segmentation
technique for thickness measurements of the articular cartilage
in MR images and to assess the interobserver reproducibility of
the method in comparison with manual segmentation. The
algorithm is based on a B-spline snakes approach and is able to
delineate the cartilage boundaries in real time and with
minimal user interaction. The interobserver reproducibility of
the method, ranging from 3.3 to 13.6% for various section
orientations and joint surfaces, proved to be significantly
superior to manual segmentation. (C) 1999 Elsevier Science Inc.
- Craw, I, Costen, N, Kato, T, and Akamatsu, S, "How should we represent faces for automatic recognition?," IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 21, pp. 725-736, 1999.
Abstract:
We describe results obtained from a testbed used to investigate
different codings for automatic face recognition. An eigenface
coding of shape-free faces using manually located landmarks was
more effective than the corresponding coding of correctly
shaped faces. Configuration also proved an effective method of
recognition, with rankings given to incorrect matches
relatively uncorrelated with those from shape-free faces. Both
sets of information combine to improve significantly the
performance of either system. The addition of a system, which
directly correlated the intensity values of shape-free images,
also significantly increased recognition, suggesting extra
information was still available. The recognition advantage for
shape-free faces reflected and depended upon high-quality
representation of the natural facial variation via a disjoint
ensemble of shape-free faces; if the ensemble was composed of
nonfaces, a shape-free disadvantage was induced. Manipulation
within the shape-free coding to emphasize distinctive features
of the faces, by caricaturing, allowed further increases in
performance; this effect was only noticeable when the
independent shape-free and configuration coding was used. Taken
together, these results strongly support the suggestion that
faces should be considered as lying in a high-dimensional
manifold, which is locally linearly approximated by these
shapes and textures, possibly with a separate system for local
features. Principal Components Analysis is then seen as a
convenient tool in this local approximation.
- Egmont-Petersen, M, and Arts, T, "Recognition of radiopaque markers in X-ray images using a neural network as nonlinear filter," PATTERN RECOGNITION LETTERS, vol. 20, pp. 521-533, 1999.
Abstract:
Neural networks are developed to recognise radiopaque markers
in biplane cineangiographic video-images with a background
composed of different objects. Our connectionist approach is
compared theoretically as well as experimentally with linear
template matching. Theoretically, neural networks are likely to
give better recognition results as they can implement nonlinear
discriminants. Experiments confirm that the networks result in
better marker recognitions than template matching. (C) 1999
Elsevier Science B.V. All rights reserved.
- Cootes, TF, and Taylor, CJ, "A mixture model for representing shape variation," IMAGE AND VISION COMPUTING, vol. 17, pp. 567-573, 1999.
Abstract:
The shape variation displayed by a class of objects can be
represented as a probability density function, allowing us to
determine plausible and implausible examples of the class.
Given a training set of example shapes we can align them into a
common co-ordinate frame and use kernel-based density
estimation techniques to represent this distribution. Such an
estimate is complex and expensive, so we generate a simpler
approximation using a mixture of Gaussians. We show how to
calculate the distribution, and how it can be used in image
search to locate examples of the modelled object in new images.
(C) 1999 Elsevier Science B.V. All rights reserved.
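
The flavour of this approach can be reproduced by fitting a Gaussian mixture to aligned shape vectors and scoring new candidates by their log-density. The sketch below uses scikit-learn for brevity; the synthetic data and the choice of two components are assumptions:

    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(6)
    # Stand-in training set: two clusters of aligned shape vectors.
    shapes = np.vstack([rng.normal(0.0, 0.5, size=(100, 10)),
                        rng.normal(3.0, 0.5, size=(100, 10))])
    gmm = GaussianMixture(n_components=2, covariance_type="full",
                          random_state=0).fit(shapes)

    # Plausibility of a candidate shape: log-density under the mixture.
    plausible = rng.normal(0.0, 0.5, size=(1, 10))
    implausible = rng.normal(10.0, 0.5, size=(1, 10))
    assert (gmm.score_samples(plausible)[0]
            > gmm.score_samples(implausible)[0])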
- Brett, AD, Hill, A, and Taylor, CJ, "A method of 3D surface correspondence and interpolation for merging shape examples," IMAGE AND VISION COMPUTING, vol. 17, pp. 635-642, 1999.
Abstract:
A method for corresponding the triangulated mesh surface
representations of two shapes is presented. It comprises a
method of polyhedral mesh decimation and a symmetric version of
the iterative Closest Point (ICP) algorithm. The method
produces a matching pair of sparse polyhedral approximations,
one for each shape surface, using a global Euclidean measure of
similarity. A method of surface patch parameterisation is
presented which uses minimal paths constructed across the
surface of a polyhedron. We describe the use of this patch
parameterisation in the interpolation of surfaces for the
construction of a merged mean shape with a densely triangulated
surface. Results are presented for the production of a binary
tree of merged biological shapes which may be used as a basis
for the automated landmarking of a set of examples. (C) 1999
Elsevier Science B.V. All rights reserved.
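
A bare-bones version of the ICP component, nearest-neighbour correspondence alternated with a least-squares rigid update, can be written with SciPy's KD-tree. This sketch is rigid-only and omits the decimation, symmetry, and patch-parameterisation machinery the paper adds:

    import numpy as np
    from scipy.spatial import cKDTree

    def icp(source, target, n_iters=20):
        tree = cKDTree(target)
        src = source.copy()
        for _ in range(n_iters):
            _, idx = tree.query(src)        # closest target points
            matched = target[idx]
            mu_s, mu_t = src.mean(0), matched.mean(0)
            C = (src - mu_s).T @ (matched - mu_t)
            U, _, Vt = np.linalg.svd(C)
            D = np.eye(3)
            D[-1, -1] = np.sign(np.linalg.det(U @ Vt))
            src = (src - mu_s) @ (U @ D @ Vt) + mu_t  # rigid update
        return src

    rng = np.random.default_rng(7)
    target = rng.normal(size=(200, 3))
    a = 0.4
    R = np.array([[np.cos(a), -np.sin(a), 0.0],
                  [np.sin(a),  np.cos(a), 0.0],
                  [0.0, 0.0, 1.0]])
    aligned = icp(target @ R + 0.5, target)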
- Lelieveldt, BPF, van der Geest, RJ, Rezaee, MR, Bosch, JG, and Reiber, JHC, "Anatomical model matching with fuzzy implicit surfaces for segmentation of thoracic volume scans," IEEE TRANSACTIONS ON MEDICAL IMAGING, vol. 18, pp. 218-230, 1999.
Abstract:
Many segmentation methods for thoracic volume data require
manual input in the form of a seed point, initial contour,
volume of interest etc. The aim of the work presented here is
to further automate this segmentation initialization step. In
this paper an anatomical modeling and matching method is
proposed to coarsely segment thoracic volume data into
anatomically labeled regions. An anatomical model of the thorax
is constructed in two steps: 1) individual organs are modeled
with blended fuzzy implicit surfaces and 2) the single organ
models are grouped into a tree structure with a solid modeling
technique named constructive solid geometry (CSG). The
combination of CSG with fuzzy implicit surfaces allows a
hierarchical scene description by means of a boundary model,
which characterizes the scene volume as a boundary potential
function. From this boundary potential, an energy function is
defined which is minimal when the model is registered to the
tissue-air transitions in thoracic magnetic resonance imaging
(MRI) data. This allows automatic registration in three steps:
feature detection, initial positioning and energy minimization.
The model matching has been validated in phantom simulations
and on 15 clinical thoracic volume scans from different
subjects. In 13 of these sets the matching method accurately
partitioned the image volumes into a set of volumes of interest
for the heart, lungs, cardiac ventricles, and thorax outlines.
The method is applicable to segmentation of various types of
thoracic MR-images, provided that a large part of the thorax is
contained in the image volume.
- Chalmond, B, and Girard, SC, "Nonlinear modeling of scattered multivariate data and its application to shape change," IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 21, pp. 422-432, 1999.
Abstract:
We are given a set of points in a space of high dimension. For
instance, this set may represent many visual appearances of an
object, a face, or a hand. We address the problem of
approximating this set by a manifold in order to have a compact
representation of the object appearance. When the scattering of
this set is approximately an ellipsoid, then the problem has a
well-known solution given by Principal Components Analysis
(PCA). However, in some situations like object displacement
learning or face learning, this linear technique may be ill-
adapted and nonlinear approximation has to be introduced. The
method we propose can be seen as a Non Linear PCA (NLPCA), the
main difficulty being that the data are not ordered. We propose
an index which favors the choice of axes preserving the closest
point neighborhoods. These axes determine an order for visiting
all the points when smoothing. Finally, a new criterion, called
"generalization error," is introduced to determine the
smoothing rate, that is, the knot number for the spline
fitting. Experimental results conclude this paper: The method
is tested on artificial data and on two data bases used in
visual learning.
- Cosio, FA, and Davies, BL, "Automated prostate recognition: a key process for clinically effective robotic prostatectomy," MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING, vol. 37, pp. 236-243, 1999.
Abstract:
Clinical trials of PROBOT, a robotic system for prostate
surgery, have shown that robotic surgery of soft tissue can be
successful. Monitoring of the progress of the resection has
been shown to be a necessary feature of an effective robotic system
for prostate surgery. It should provide the surgeon with a
reliable method of assessing the cavity during resection. An
automatic system for intraoperative monitoring of the progress
of the resection during robotic prostatectomy consists of two
subsystems: real-time intraoperative imaging of the prostate
and automatic identification of the contour of the gland on
each image. The development of a fully automatic scheme for
prostate recognition on transurethral ultrasound scans is
reported. A genetic algorithm has been developed to
automatically adjust a model of the prostate boundary until an
optimum fit to the prostate in a given image is obtained. An
analysis of its performance on 22 different ultrasound images
showed an average error of 6.21 mm. Use of a genetic algorithm
and a constrained prostate model has been shown to be a robust way
to automatically identify the prostate in ultrasound images.
The scheme is able to produce approximate prostate boundaries,
without any human intervention, on ultrasound scans of varying
quality. In addition to soft tissue robotic surgery, the
genetic algorithm technique is also applicable to a wide range
of computer assisted surgical techniques.
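
Schematically, adjusting a boundary-model parameter vector with a genetic algorithm looks like the loop below; the fitness function here is a stand-in for the image-based fit measure the authors use, and all settings are illustrative:

    import numpy as np

    def genetic_fit(fitness, n_params, pop_size=60, n_gens=100,
                    mut_sigma=0.1, seed=0):
        # Minimal real-coded GA: tournament selection, uniform
        # crossover, Gaussian mutation. Maximizes fitness.
        rng = np.random.default_rng(seed)
        pop = rng.normal(size=(pop_size, n_params))
        for _ in range(n_gens):
            scores = np.array([fitness(p) for p in pop])
            i, j = rng.integers(0, pop_size, (2, pop_size))
            parents = np.where((scores[i] > scores[j])[:, None],
                               pop[i], pop[j])        # tournaments
            mask = rng.random(parents.shape) < 0.5    # crossover
            children = np.where(mask, parents,
                                np.roll(parents, 1, axis=0))
            pop = children + rng.normal(0, mut_sigma, children.shape)
        scores = np.array([fitness(p) for p in pop])
        return pop[scores.argmax()]

    # Stand-in fitness: closeness to a known optimum.
    target = np.array([1.0, -2.0, 0.5])
    best = genetic_fit(lambda p: -np.sum((p - target) ** 2), 3)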
- Gavrila, DM, "The visual analysis of human movement: A survey," COMPUTER VISION AND IMAGE UNDERSTANDING, vol. 73, pp. 82-98, 1999.
Abstract:
The ability to recognize humans and their activities by vision
is key for a machine to interact intelligently and effortlessly
with a human-inhabited environment. Because of many potentially
important applications, "looking at people" is currently one of
the most active application domains in computer vision. This
survey identifies a number of promising applications and
provides an overview of recent developments in this domain. The
scope of this survey is limited to work on whole-body or hand
motion; it does not include work on human faces. The emphasis
is on discussing the various methodologies; they are grouped in
2-D approaches with or without explicit shape models and 3-D
approaches. Where appropriate, systems are reviewed. We
conclude with some thoughts about future directions. (C) 1999
Academic Press.
- Behiels, G, Vandermeulen, D, Maes, F, Suetens, P, and Dewaele, P, "Active Shape Model-based segmentation of digital X-ray images," MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION, MICCAI'99, PROCEEDINGS, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1679, pp. 128-137, 1999.
Abstract:
We propose an improved search procedure for Active Shape Model
(ASM) based delineation of anatomical structures in digital X-
ray images. Whereas the original ASM search method (1)
iteratively improves the current estimate of the location of
boundary points by a limited least squares adjustment of the
pose and shape parameters, our method additionally requires the
subsequent changes in shape during the search to be smooth,
which is achieved by using a minimum cost path search
algorithm. We compare the two methods on a database of more
than 400 manual segmentations of digital X-ray images of the
femur, humerus and calcaneus. We evaluate the accuracy and
robustness of both methods using a cross-validation procedure.
- Fleute, M, and Lavallee, S, "Nonrigid 3-D/2-D registration of images using statistical models," MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION, MICCAI'99, PROCEEDINGS, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1679, pp. 138-147, 1999.
Abstract:
This paper presents a new algorithm for reconstruction of 3D
shapes using a few x-ray views and a statistical model. In many
applications of surgery such as orthopedics, it is desirable to
define a surgical planning on 3-D images and then to execute
the plan using standard registration techniques and image-
guided surgery systems. But the cost, time and x-ray dose
associated with standard pre-operative Computed Tomography
makes it difficult to use this methodology for rather standard
interventions. Instead, we propose to use a few x-ray images
generated from a C-Arm and to build the 3-D shape of the
patient bones or organs intra-operatively, by deforming a
statistical 3-D model to the contours segmented on the x-ray
views. In this paper, we concentrate on the application of our
method to bone reconstruction. The algorithm starts from
segmented contours of the bone on the x-ray images and an
initial estimate of the pose of the 3-D model in the common
coordinate system of the set of x-ray projections. The
statistical model is made of a few principal modes that are
sufficient to represent the normal anatomy. Those modes are
built by using a generalization of the Cootes and Taylor method
to 3-D surface models, previously published in MICCAI'98 by the
authors. Fitting the model to the contours is achieved by using
a generalization of the Iterative Closest Point Algorithm to
nonrigid 3D/2D registration. For pathological shapes, the
statistical model is not valid and subsequent local refinement
is necessary. First results are presented for a 3-D statistical
model of the distal part of the femur.
- Lotjonen, J, Magnin, IE, Reinhardt, L, Nenonen, J, and Katila, T, "Automatic reconstruction of 3D geometry using projections and a geometric prior model," MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION, MICCAI'99, PROCEEDINGS, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1679, pp. 192-201, 1999.
Abstract:
A method has been developed to reconstruct 3D surfaces from two
orthogonal X-ray projections. A 3D geometrical prior model,
composed of triangulated surfaces, is deformed according to
contours segmented from projection images. The contours are
segmented by a new method based on free-form deformation.
First, virtual X-ray images of the prior model are constructed
by simulating real X-ray imaging. Thereafter, the contours
segmented from the virtual projections are elastically matched
with patient data. Next, the produced 2D vectors are back-
projected onto the surface of the prior model and the prior
model is deformed using the back-projected vectors with shape-
based interpolation. The accuracy of the method is validated by
a data set containing 20 cases. The method is applied to
reconstruct thorax and lung surfaces. The average matching
error is about 1.2 voxels, corresponding to 5 mm.
- Suri, JS, Haralick, RM, and Sheehan, FH, "Linear vs. quadratic optimization algorithms for bias correction of left ventricle chamber boundaries in low contrast projection ventriculograms produced from xray cardiac catheterization procedure," COMPUTER ANALYSIS OF IMAGES AND PATTERNS, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1689, pp. 108-117, 1999.
Abstract:
Cardiac catheterization procedures produce ventriculograms which
have very low contrast in the apical, anterior and inferior
zones of the left ventricle (LV). Pixel-based classifiers
operating on these images produce boundaries which have
systematic positional and orientation bias and have a mean
error of about 10.5 mm. Using the LV convex information,
comprising the apex and the aortic valve plane, this paper
presents a comparison of the linear and quadratic optimization
algorithms to remove these biases. These algorithms are named
after the way the coefficients are computed: the identical
coefficient and the independent coefficient. Using the polyline
metric, we show that the quadratic optimization is better than
the linear optimization. We also show that the independent
coefficient method performs better than the identical
coefficient when the training data is large. The overall mean
system error was 2.49 mm while the goal set by the cardiologist
was 2.5 mm.
|
|
2000 |
- Hill, A, Taylor, CJ, and Brett, AD, "A framework for automatic landmark identification using a new method of nonrigid correspondence," IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 22, pp. 241-251, 2000.
Abstract:
A framework for automatic landmark identification is presented
based on an algorithm for corresponding the boundaries of two
shapes. The auto-landmarking framework employs a binary tree of
corresponded pairs of shapes to generate landmarks
automatically on each of a set of example shapes. The landmarks
are used to train statistical shape models known as Point
Distribution Models. The correspondence algorithm locates a
matching pair of sparse polygonal approximations, one for each
of a pair of boundaries by minimizing a cost function, using a
greedy algorithm. The cost function expresses the dissimilarity
in both the shape and representation error (with respect to the
defining boundary) of the sparse polygons. Results are
presented for three classes of shape which exhibit various
types of nonrigid deformation.
- Bronkorsta, PJH, Reinders, MJT, Hendriks, EA, Grimbergen, J, Heethaar, RM, and Brankenhoff, GJ, "On-line detection of red blood cell shape using deformable templates," PATTERN RECOGNITION LETTERS, vol. 21, pp. 413-424, 2000.
Abstract:
For the purpose of automating a clinical diagnostic apparatus
to quantify the deformability of human red blood cells, we
present an automated image analysis procedure for on-line
detection of the cell shape based upon the method of parametric
deformable templates. (C) 2000 Elsevier Science B.V. All rights
reserved.
- Brett, AD, and Taylor, CJ, "A method of automated landmark generation for automated 3D PDM construction," IMAGE AND VISION COMPUTING, vol. 18, pp. 739-748, 2000.
Abstract:
A previous publication has described a method of pairwise
three-dimensional (3D) surface correspondence for the automated
generation of landmarks on a set of examples from a class of
shape (A.D. Brett, A. Hill, C.J. Taylor, A method of 3D surface
correspondence for automated landmark generation, in: 8th
British Machine Vision Conference, Essex, England, September
1997, pp 709-718). In this paper we describe a set of improved
algorithms which give more accurate and more robust results. We
show how the pairwise corresponder can be used in an extension
of an existing framework for establishing dense correspondences
between a set of training examples (A. Hill, A.D. Brett, C.J.
Taylor, Automatic landmark identification using a new method of
non-rigid correspondence, in: J. Duncan, G. Gindi, (Eds.), 15th
Conference on Information Processing in Medical Imaging,
Poulteney, VT, Springer, Berlin, 1997, pp. 483-488) to build a
3D Point Distribution Model. The framework relies upon
additional algorithms for the production of surface paths
between vertices on a polyhedral mesh, and these are described.
Example statistical models are shown for both smooth synthetic
data and the left lateral ventricle of the brain, a complex
biological shape which demonstrates considerable variation
between individuals. (C) 2000 Elsevier Science B.V. All rights
reserved.
- Bowden, R, Mitchell, TA, and Sarhadi, M, "Non-linear statistical models for the 3D reconstruction of human pose and motion from monocular image sequences," IMAGE AND VISION COMPUTING, vol. 18, pp. 729-737, 2000.
Abstract:
This paper presents a model based approach to human body
tracking in which the 2D silhouette of a moving human and the
corresponding 3D skeletal structure are encapsulated within a
non-linear point distribution model. This statistical model
allows a direct mapping to be achieved between the external
boundary of a human and the anatomical position. It is shown
how this information, along with the position of landmark
features such as the hands and head can be used to reconstruct
information about the pose and structure of the human body from
a monocular view of a scene. (C) 2000 Elsevier Science B.V. All
rights reserved.
- Egmont-Petersen, M, Schreiner, U, Tromp, SC, Lehmann, TM, Slaaf, DW, and Arts, T, "Detection of leukocytes in contact with the vessel wall from in vivo microscope recordings using a neural network," IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, vol. 47, pp. 941-951, 2000.
Abstract:
Leukocytes play an important role in the host defense as they
may travel from the blood stream into the tissue in reacting to
inflammatory stimuli. The leukocyte-vessel wall interactions
are studied in postcapillary vessels by intravital video
microscopy during in vivo animal experiments. Sequences of
video images are obtained and digitized with a frame grabber. A
method for automatic detection and characterization of
leukocytes in the video images is developed. Individual
leukocytes are detected using a neural network that is trained
with synthetic leukocyte images generated using a novel
stochastic model. This model makes it feasible to generate
images of leukocytes with different shapes and sizes under
various lighting conditions. Experiments indicate that neural
networks trained with the synthetic leukocyte images perform
better than networks trained with images of manually detected
leukocytes. The best performing neural network trained with
synthetic leukocyte images resulted in an 18% larger area under
the ROC curve than the best performing neural network trained
with manually detected leukocytes.
- Zhong, Y, Jain, AK, and Dubuisson-Jolly, MP, "Object tracking using deformable templates," IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 22, pp. 544-549, 2000.
Abstract:
We propose a novel method for object tracking using prototype-
based deformable template models. To track an object in an
image sequence, we use a criterion which combines two terms:
the frame-to-frame deviations of the object shape and the
fidelity of the modeled shape to the input image. The
deformable template model utilizes the prior shape information
which is extracted from the previous frames along with a
systematic shape deformation scheme to model the object shape
in a new frame. The following image information is used in the
tracking process: 1) edge and gradient information: the object
boundary consists of pixels with large image gradient, 2)
region consistency: the same object region possesses consistent
color and texture throughout the sequence, and 3) interframe
motion: the boundary of a moving object is characterized by
large interframe motion. The tracking proceeds by optimizing an
objective function which combines both the shape deformation
and the fidelity of the modeled shape to the current image (in
terms of gradient, texture, and interframe motion). The
inherent structure in the deformable template, together with
region, motion, and image gradient cues, makes the proposed
algorithm relatively insensitive to the adverse effects of weak
image features and moderate amounts of occlusion.
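A minimal sketch of the kind of two-term tracking criterion this
abstract describes, with one term penalising frame-to-frame shape
deviation and one rewarding image support (here just gradient
magnitude along the contour); the weights and the simple fidelity
term are illustrative assumptions.

    import numpy as np

    def tracking_energy(shape, prev_shape, grad_mag,
                        alpha=1.0, beta=1.0):
        """shape, prev_shape: (N, 2) contour points as (row, col);
        grad_mag: 2-D gradient-magnitude image.  Lower is better."""
        deviation = np.sum((shape - prev_shape) ** 2)
        ij = np.clip(shape.round().astype(int), 0,
                     np.array(grad_mag.shape) - 1)
        fidelity = grad_mag[ij[:, 0], ij[:, 1]].sum()
        return alpha * deviation - beta * fidelity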
- Huang, CL, Wu, MS, and Jeng, SH, "Gesture recognition using the multi-PDM method and Hidden Markov Model," IMAGE AND VISION COMPUTING, vol. 18, pp. 865-879, 2000.
Abstract:
This paper introduces a multi-Principal-Distribution-Model
(PDM) method and Hidden Markov Model (HMM) for gesture
recognition. To track the hand-shape, it uses the PDM model
which is built by learning patterns of variability from a
training set of correctly annotated images. However, it can
only fit the hand examples that are similar to shapes of the
corresponding training set. For gesture recognition, we need to
deal with a large variety of hand-shapes. Therefore, we divide
all the training hand shapes into a number of similar groups,
with each group trained for an individual PDM shape model.
Finally, we use the HMM to determine model transition among
these PDM shape models. From the model transition sequence, the
system can identify the continuous gestures representing one-
digit or two-digit numbers. (C) 2000 Elsevier Science B.V. All
rights reserved.
- Marques, JS, and Jorge, PM, "Visual inspection of a combustion process in a thermoelectric plant," SIGNAL PROCESSING, vol. 80, pp. 1577-1589, 2000.
Abstract:
Infrared images provide useful information to inspect the
status of combustion processes. The flame geometry and
intensity depend on the combustion status and can be used for
control and monitoring purposes. Flame segmentation is
difficult since the background intensity is sometimes higher
than the flame intensity, therefore requiring the use of
sophisticated image analysis algorithms. This paper describes
methods to analyze infrared images of industrial flames and to
characterize the flame geometry. A segmentation algorithm is
proposed to separate the flame region from the background using
an image formation model, a background model and the available
shape information. Segmentation algorithms (e.g., active
contours) usually assume solid objects with sharp boundaries.
This is not true in the case of flame images. The flame is
nonhomogeneous and it has a fuzzy boundary. To circumvent this
difficulty multiple contours are used to characterize the flame
geometry. The flame shape is then obtained by robust estimation
methods, using a model of the image formation process inside
the combustion chamber. The proposed algorithm is evaluated and
used to monitor the flame characteristics in a boiler of a
thermoelectric plant. (C) 2000 Elsevier Science B.V. All rights
reserved.
- Wang, YM, and Staib, LH, "Boundary finding with prior shape and smoothness models," IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 22, pp. 738-743, 2000.
Abstract:
We propose a unified framework for boundary finding, where a
Bayesian formulation, based on prior knowledge and the edge
information of the input image (likelihood), is employed. The
prior knowledge in our framework is based on principal
component analysis of four different covariance matrices
corresponding to independence, smoothness, statistical shape,
and combined models, respectively. Indeed, snakes, modal
analysis, Fourier descriptors, and point distribution models
can be derived from or linked to our approaches of different
prior models. When the true training set does not contain
enough variability to express the full range of deformations, a
mixed covariance matrix uses a combined prior of the smoothness
and statistical variation modes. It adapts gradually to use
more statistical modes of variation as larger data sets are
available.
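A minimal sketch of the mixed-covariance idea: blend a smoothness
covariance with the training-set (statistical) covariance and
take principal components of the blend; the blend weight and the
particular smoothness model are assumptions for illustration.

    import numpy as np

    def mixed_prior_modes(shapes, w=0.5, n_modes=10):
        """shapes: (m, 2N) matrix of concatenated point coordinates.
        Returns the leading eigenvalues/eigenvectors of the blend."""
        X = shapes - shapes.mean(axis=0)
        C_stat = X.T @ X / max(len(X) - 1, 1)
        n = C_stat.shape[0]
        # Smoothness prior: covariance of a circular first-difference
        # (membrane-like) model, regularised to be invertible.
        D = np.eye(n) - np.roll(np.eye(n), 1, axis=1)
        C_smooth = np.linalg.inv(D.T @ D + 1e-6 * np.eye(n))
        C = (1 - w) * C_smooth + w * C_stat
        vals, vecs = np.linalg.eigh(C)
        order = np.argsort(vals)[::-1][:n_modes]
        return vals[order], vecs[:, order]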
- Rios, HV, Solis, AL, Aguirre, E, Guerrero, L, Pena, J, and Santamaria, A, "Facial expression recognition and modeling for virtual intelligent tutoring systems," MICAI 2000: ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, LECTURE NOTES IN ARTIFICIAL INTELLIGENCE, vol. 1793, pp. 115-126, 2000.
Abstract:
This paper describes ongoing work for developing a new
interface for intelligent tutoring systems based on recognition
and synthesis of facial expressions. This interface senses the
emotional state of the user, or his/her degree of attention,
and communicates more naturally through face animation.
- Shen, DG, and Davatzikos, C, "An adaptive-focus deformable model using statistical and geometric information," IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 22, pp. 906-913, 2000.
Abstract:
An active contour (snake) model is presented, with emphasis on
medical imaging applications. There are three main novelties in
the proposed model. First, an attribute vector is used to
characterize the geometric structure around each point of the
snake model: the deformable model then deforms in a way that
seeks regions with similar attribute vectors. This is in
contrast to most deformable models, which deform to nearby
edges without considering geometric structure, and it was
motivated by the need to establish point-correspondences that
have anatomical meaning. Second, an adaptive-focus statistical
model has been suggested which allows the deformation of the
active contour in each stage to be influenced primarily by the
most reliable matches. Third, a deformation mechanism that is
robust to local minima is proposed by evaluating the snake
energy function on segments of the snake at a time, instead of
individual points. Various experimental results show the
effectiveness of the proposed model.
- Lelieveldt, BPF, Sonka, M, Bolinger, L, Scholz, TD, Kayser, H, van der Geest, R, and Reiber, JHC, "Anatomical modeling with fuzzy implicit surface templates: Application to automated localization of the heart and lungs in thoracic MR volumes," COMPUTER VISION AND IMAGE UNDERSTANDING, vol. 80, pp. 1-20, 2000.
Abstract:
In this paper, a novel model-driven segmentation approach for
thoracic MR-images is presented. The goal of this work is to
coarsely, but fully automatically localize the boundary
surfaces of the heart and lungs in thoracic MR sets. The major
organs in the thorax are described in a three-dimensional
analytical model template by combining a set of fuzzy implicit
surfaces by means of constructive solid geometry and
formulating model registration as an energy minimization. The
method has been validated on 20 thoracic MR volumes from two
centers (patients and normal subjects). On average 90% of the
contour length of the heart and lung contours was localized
with sufficient accuracy (average positional error 6 mm) to
automatically provide the initial conditions for a subsequently
applied locally accurate segmentation method. (C) 2000 Academic
Press.
- Howing, F, Dooley, LS, and Wermser, D, "Fuzzy active contour model," IEE PROCEEDINGS-VISION IMAGE AND SIGNAL PROCESSING, vol. 147, pp. 323-330, 2000.
Abstract:
A new method for representing and tracking of object boundaries
is presented, which allows for the integration of uncertain a
priori knowledge into an active contour model. The novel
concept of fuzzy snakes is developed to allow for an intuitive
specification of the properties of an object's boundary. This
is achieved by introducing fuzzy energy functions and
establishing a linguistic rule base, which describes each of
the fuzzy snake's segments. Furthermore the approximate length
of each contour segment may be specified to both improve the
segmentation process and to reduce computational complexity.
Experimental results demonstrate the validity of the
theoretical properties of the fuzzy snake approach, and
examples have been included illustrating the application of the
technique to complex scenes, such as medical imaging sequences.
- Suri, JS, "Computer vision, pattern recognition and image processing in left ventricle segmentation: The last 50 years," PATTERN ANALYSIS AND APPLICATIONS, vol. 3, pp. 209-242, 2000.
Abstract:
In the last decade, computer vision, pattern recognition, image
processing and cardiac researchers have given immense attention
to cardiac image analysis and modelling. This paper surveys
state-of-the-art computer vision and pattern recognition
techniques for Left Ventricle (LV) segmentation and modelling
during the second half of the twentieth century. The paper
presents the key characteristics of successful model-based
segmentation techniques for LV modelling. This survey paper
concludes the following: (1) any one pattern recognition or
computer vision technique is not sufficient for accurate 2D, 3D
or 4D modelling of LV; (2) fitting mathematical models for LV
modelling has dominated in the last 15 years; (3) knowledge
extracted from the ground truth has led to very successful
attempts at LV modelling; (4) spatial and temporal behaviour of
LV through different imaging modalities has yielded information
which has led to accurate LV modelling; and (5) not much
attention has been paid to LV modelling validation.
- Chella, A, Frixione, M, and Gaglio, S, "Understanding dynamic scenes," ARTIFICIAL INTELLIGENCE, vol. 123, pp. 89-132, 2000.
Abstract:
We propose a framework for the representation of visual
knowledge in a robotic agent, with special attention to the
understanding of dynamic scenes. According to our approach,
understanding involves the generation of a high level,
declarative description of the perceived world. Developing such
a description requires both bottom-up, data driven processes
that associate symbolic knowledge representation structures
with the data coming out of a vision system, and top-down
processes in which high level, symbolic information is in its
turn employed to drive and further refine the interpretation of
a scene. On the one hand, the computer vision community
approached this problem in terms of 2D/3D shape reconstruction
and of estimation of motion parameters. On the other, the AI
community developed rich and expressive systems for the
description of processes, events, actions and, in general, of
dynamic situations. Nevertheless, these two approaches evolved
separately and concentrated on different kinds of problems. We
propose an architecture that integrates these two traditions in
a principled way. Our assumption is that a link is missing
between the two classes of representations mentioned above. In
order to fill this gap, we adopt the notion of conceptual space
(CS; Gardenfors, 2000), a representation where information is
characterized in terms of a metric space. A CS acts as an
intermediate representation between subconceptual (i.e., not
yet conceptually categorized) information, and symbolically
organized knowledge. The concepts of process and action have
immediate characterizations in terms of structures in the
conceptual space. The architecture is illustrated with
reference to an experimental setup based on a vision system
operating in a scenario with moving and interacting people. (C)
2000 Elsevier Science B.V. All rights reserved.
- Davies, ER, "Low-level vision requirements," ELECTRONICS & COMMUNICATION ENGINEERING JOURNAL, vol. 12, pp. 197-210, 2000.
Abstract:
This paper aims to help those with some experience of vision to
obtain a more in-depth understanding of the problems of low-
level vision. As it is not possible to cover everything in a
paper of this length, a carefully chosen series of cases and
case studies is presented. Relevant principles are brought out
and a set of important ground rules is presented by way of
summary.
- Hutton, TJ, Cunningham, S, and Hammond, P, "An evaluation of active shape models for the automatic identification of cephalometric landmarks," EUROPEAN JOURNAL OF ORTHODONTICS, vol. 22, pp. 499-508, 2000.
Abstract:
This paper describes an evaluation of the application of active
shape models to cephalometric landmarking. Permissible
deformations of a template were established from a training set
of hand-annotated images and the resulting model was used to
fit to unseen images. An evaluation of this technique in
comparison to the accuracy achieved by previous methods is
presented. Sixty-three randomly selected cephalograms were
tested using a drop-one-out method. On average, 13 per cent of
16 landmarks were within 1 mm, 35 per cent within 2 mm, and 74
per cent within 5 mm. It was concluded that the current
implementation does not give sufficient accuracy for completely
automated landmarking, but could be used as a timesaving tool
to provide a first-estimate location of the landmarks. The
method is also of interest because it provides a framework for
a range of future improvements.
- Brejl, M, and Sonka, M, "Object localization and border detection criteria design in edge-based image segmentation: Automated learning from examples," IEEE TRANSACTIONS ON MEDICAL IMAGING, vol. 19, pp. 973-985, 2000.
Abstract:
This paper provides methodology for fully automated model-based
image segmentation. All information necessary to perform image
segmentation is automatically derived from a training set that
is presented in a form of segmentation examples, The training
set is used to construct two models representing the objects-
shape model and border appearance model. A two-step approach to
image segmentation is reported. In the first step, an
approximate location of the object of interest is determined.
In the second step, accurate border segmentation is performed.
The shape-variant Hough transform method was developed that
provides robust object localization automatically. It finds
objects of arbitrary shape, rotation, or scaling and can handle
object variability. The border appearance model was developed
to automatically design cost functions that can be used in the
segmentation criteria of edge based segmentation methods. Our
method was tested in five different segmentation tasks that
included 489 objects to be segmented. The final segmentation
was compared to manually defined borders with good results [rms
errors in pixels: 1.2 (cerebellum), 1.1 (corpus callosum), 1.5
(vertebrae), 1.4 (epicardial), and 1.6 (endocardial) borders].
Two major problems of the state-of-the-art edge based image
segmentation algorithms were addressed: strong dependency on a
close-to-target initialization, and necessity for manual
redesign of segmentation criteria whenever a new segmentation
problem is encountered.
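The shape-variant Hough transform itself is not reproduced here,
but the following sketch shows the underlying generalised-Hough
idea of model-based object localisation: each detected edge point
casts votes for candidate object centres through a table of model
point offsets, and the accumulator peak gives the location.

    import numpy as np

    def hough_localise(edge_points, model_offsets, image_shape):
        """edge_points: (k, 2) edge coordinates; model_offsets: (m, 2)
        offsets from model boundary points to the model centre."""
        acc = np.zeros(image_shape, dtype=int)
        for p in edge_points:
            for c in (p + model_offsets).round().astype(int):
                if (0 <= c[0] < image_shape[0]
                        and 0 <= c[1] < image_shape[1]):
                    acc[c[0], c[1]] += 1      # vote for this centre
        return np.unravel_index(acc.argmax(), acc.shape)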
- Pantic, M, and Rothkrantz, LJM, "Automatic analysis of facial expressions: The state of the art," IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 22, pp. 1424-1445, 2000.
Abstract:
Humans detect and interpret faces and facial expressions in a
scene with little or no effort. Still, development of an
automated system that accomplishes this task is rather
difficult. There are several related problems: detection of an
image segment as a face, extraction of the facial expression
information, and classification of the expression (e.g., in
emotion categories). A system that performs these operations
accurately and in real time would form a big step in achieving
a human-like interaction between man and machine. This paper
surveys the past work in solving these problems. The capability
of the human visual system with respect to these problems is
discussed, too. It is meant to serve as an ultimate goal and a
guide for determining recommendations for development of an
automatic facial expression analyzer.
- Wang, YM, and Staib, LH, "Physical model-based non-rigid registration incorporating statistical shape information," MEDICAL IMAGE ANALYSIS, vol. 4, pp. 7-20, 2000.
Abstract:
This paper describes two new atlas-based methods of 2D single
modality non-rigid registration using the combined power of
physical and statistical shape models. The transformations are
constrained to be consistent with the physical properties of
deformable elastic solids in the first method and those of
viscous fluids in the second, to maintain smoothness and
continuity. A Bayesian formulation, based on each physical
model, an intensity similarity measure, and statistical shape
information embedded in corresponding boundary points, is
employed to derive more accurate and robust approaches to non-
rigid registration. A dense set of forces arises from the
intensity similarity measure to accommodate complex anatomical
details. A sparse set of forces constrains consistency with
statistical shape models derived from a training set. A number
of experiments were performed on both synthetic and real
medical images of the brain and heart to evaluate the
approaches. It is shown that statistical boundary shape
information significantly augments and improves physical model-
based non-rigid registration and the two methods we present
each have advantages under different conditions. (C) 2000
Elsevier Science B.V. All rights reserved.
- Ullman, S, and Sali, E, "Object classification using a fragment-based representation," BIOLOGICALLY MOTIVATED COMPUTER VISION, PROCEEDING, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1811, pp. 73-87, 2000.
Abstract:
The tasks of visual object recognition and classification are
natural and effortless for biological visual systems, but
exceedingly difficult to replicate in computer vision systems.
This difficulty arises from the large variability in images of
different objects within a class, and variability in viewing
conditions. In this paper we describe a fragment-based method
for object classification. In this approach objects within a
class are represented in terms of common image fragments that
are used as building blocks for representing a large variety of
different objects that belong to a common class, such as a face
or a car. Optimal fragments are selected from a training set of
images based on a criterion of maximizing the mutual
information of the fragments and the class they represent. For
the purpose of classification the fragments are also organized
into types, where each type is a collection of alternative
fragments, such as different hairline or eye regions for face
classification. During classification, the algorithm detects
fragments of the different types, and then combines the
evidence for the detected fragments to reach a final decision.
The algorithm verifies the proper arrangement of the fragments
and the consistency of the viewing conditions primarily by the
conjunction of overlapping fragments. The method is different
from previous part-based methods in using class-specific
overlapping object fragments of varying complexity, and in
verifying the consistent arrangement of the fragments primarily
by the conjunction of overlapping detected fragments.
Experimental results on the detection of face and car views
show that the fragment-based approach can generalize well to
completely novel image views within a class while maintaining
low mis-classification error rates. We briefly discuss
relationships between the proposed method and properties of
parts of the primate visual system involved in object
perception.
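A minimal sketch of the selection criterion: estimate the mutual
information between a binary "fragment detected" variable and the
class label from training counts, and keep the top-scoring
fragments. The detection step itself (e.g. thresholded normalised
correlation) is abstracted away, and the data layout is assumed.

    import numpy as np

    def mutual_information(detected, labels):
        """detected, labels: equal-length binary arrays (one entry
        per training image)."""
        mi = 0.0
        for d in (0, 1):
            for c in (0, 1):
                p_dc = np.mean((detected == d) & (labels == c))
                p_d = np.mean(detected == d)
                p_c = np.mean(labels == c)
                if p_dc > 0:
                    mi += p_dc * np.log2(p_dc / (p_d * p_c))
        return mi

    def select_fragments(detections, labels, k=20):
        """detections: dict mapping fragment id -> binary detection
        vector over the training images."""
        ranked = sorted(detections,
                        key=lambda f: mutual_information(
                            detections[f], labels),
                        reverse=True)
        return ranked[:k]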
- Wu, RY, Ling, KV, and Ng, WS, "Automatic prostate boundary recognition in sonographic images using feature model and genetic algorithm," JOURNAL OF ULTRASOUND IN MEDICINE, vol. 19, pp. 771-782, 2000.
Abstract:
This paper describes the development of a model based boundary
recognition system for transrectal prostate ultrasonographic
images. It consists of two techniques: boundary modeling and
boundary searching with model constraints. To achieve higher
specificity of the model, a method called feature modeling is
derived from the existing point distribution modeling method.
To improve the robustness of the searching technique, the
genetic algorithm is used. An incremental genetic algorithm with
crowding replacement and a binary string chromosome type was
found experimentally to give good search results. It was shown
that the system could recognize the boundary with considerable
accuracy and consistency within a few minutes in transrectal
ultrasonographic images taken from the approximate middle position
of the prostate.
- Marais, P, and Brady, JM, "Detecting the brain surface in sparse MRI using boundary models," MEDICAL IMAGE ANALYSIS, vol. 4, pp. 283-302, 2000.
Abstract:
We introduce a framework for the detection of the brain
boundary (arachnoid) within sparse MRI. We use the term sparse
to describe volumetric images in which the sampling resolution
within the imaging plane is far higher than that of the
perpendicular direction. Generic boundary detection schemes do
not provide good results for such data. In the scheme we
propose, the boundary is extracted using a constrained mesh
surface which iteratively approximates a 3D point set
consisting of detected boundary points. Boundary detection is
based on a database of piecewise constant models, which
represent the idealised MR intensity profile of the underlying
boundary anatomy. A non-linear matching scheme is introduced to
estimate the location of the boundary points using only the
intensity data within each image plane. Results are shown for a
number of images and are discussed in detail. (C) 2000 Elsevier
Science B.V. All rights reserved.
- Lohmann, G, and von Cramon, DY, "Automatic labelling of the human cortical surface using sulcal basins," MEDICAL IMAGE ANALYSIS, vol. 4, pp. 179-188, 2000.
Abstract:
Human brain mapping aims at establishing correspondences
between brain function and brain anatomy. One of the most
intriguing problems in this field is the high interpersonal
variability of human neuroanatomy which makes studies across
many subjects very difficult. The cortical folds ('sulci')
often serve as landmarks that help to establish correspondences
between subjects. In this paper, we will present a method that
automatically detects and attributes neuroanatomical names to
the cortical folds using image analysis methods applied to
magnetic resonance data of human brains. We claim that the
cortical folds can be subdivided into a number of substructures
which we call sulcal basins. The concept of sulcal basins
allows us to establish a complete parcellation of the cortical
surface into separate regions. These regions are
neuroanatomically meaningful and can be identified from MR data
sets across many subjects. Sulcal basins are segmented using a
region growing approach. The automatic labelling is achieved by
a model matching technique. (C) 2000 Elsevier Science B.V. All
rights reserved.
- Kim, JS, Koh, KC, and Cho, HS, "An active contour model with shape regulation scheme," ADVANCED ROBOTICS, vol. 14, pp. 495-514, 2000.
Abstract:
This paper presents an active method for locating target
objects in images, which is aimed at improving the performance
of detecting object boundaries by enhancing the behavioral
characteristics of an active contour. The proposed active
contour model simulates a mechanical system consisting of two
main parts: the first is a rigid fixture, called the 'core',
specifying the expected shape of target boundaries, while the
second is an elastic rod attached to the rigid fixture. The
elastic rod deforms or moves relative to the rigid core
according to the classical laws of the mechanical system. When
the initial contour is applied to image data, it is
attracted near the dominant image features, but tries to keep
its home shape and simultaneously make the deformation smooth
if a deformation is more natural for force equilibrium. This
mechanism significantly improves the performance of detecting
object boundaries in the presence of some disturbing image
features. The active contour is scale invariant, thereby
significantly relieving the difficulty in selecting proper
values for the model parameters. The values for the model
parameters can be selected to make the contour have the desired
behaviors around the equilibrium position through the analysis
of the vibration mode of the mechanical system. The performance
of the proposed method is validated through a series of
experiments, which include detection of heavily degraded
objects, tracking of objects under non-rigid motion and
comparisons with the original snake models.
- Luo, B, and Hancock, ER, "Alignment and correspondence using singular value decomposition," ADVANCES IN PATTERN RECOGNITION, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1876, pp. 226-235, 2000.
Abstract:
This paper casts the problem of point-set alignment and
correspondence into a unified framework. The utility measure
underpinning the work is the cross-entropy between probability
distributions for alignment and assignment errors. We show how
Procrustes alignment parameters and correspondence
probabilities can be located using dual singular value
decompositions. Experimental results using both synthetic and
real images are given.
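For reference, the core Procrustes computation via a single SVD,
which the abstract's alignment step builds on; the paper's
dual-SVD treatment of correspondence probabilities is not
reproduced here.

    import numpy as np

    def procrustes_align(X, Y):
        """Rotation R and translation t minimising ||R X + t - Y||
        for matched 2-D point sets X, Y of shape (N, 2)."""
        mx, my = X.mean(axis=0), Y.mean(axis=0)
        U, _, Vt = np.linalg.svd((X - mx).T @ (Y - my))
        R = (U @ Vt).T
        if np.linalg.det(R) < 0:      # guard against reflections
            Vt[-1] *= -1
            R = (U @ Vt).T
        t = my - R @ mx
        return R, t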
- Bishop, CM, and Winn, JM, "Non-linear Bayesian image modelling," COMPUTER VISION - ECCV 2000, PT I, PROCEEDINGS, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1842, pp. 3-17, 2000.
Abstract:
In recent years several techniques have been proposed for
modelling the low-dimensional manifolds, or 'subspaces', of
natural images. Examples include principal component analysis
(as used for instance in 'eigen-faces'), independent component
analysis, and auto-encoder neural networks. Such methods suffer
from a number of restrictions such as the limitation to linear
manifolds or the absence of a probabilistic representation. In
this paper we exploit recent developments in the fields of
variational inference and latent variable models to develop a
novel and tractable probabilistic approach to modelling
manifolds which can handle complex non-linearities. Our
framework comprises a mixture of sub-space components in which
both the number of components and the effective dimensionality
of the sub-spaces are determined automatically as part of the
Bayesian inference procedure. We illustrate our approach using
two classical problems: modelling the manifold of face images
and modelling the manifolds of hand-written digits.
- Rogers, M, Graham, J, and Malik, RA, "Exploiting weak shape constraints to segment capillary images in microangiopathy," MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION - MICCAI 2000, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1935, pp. 717-726, 2000.
Abstract:
Microangiopathy is one form of pathology associated with
peripheral neuropathy in diabetes. Capillaries imaged by
electron microscopy show a complex textured appearance, which
makes segmentation difficult. Considerable variation occurs
among boundaries manually positioned by human experts.
Detection of region boundaries using Active Contour Models has
proved impractical due to the existence of confusing image
evidence in the vicinity of these boundaries. Despite the fact
that the shapes have no identifying landmarks, the weak
constraints imposed by statistical shape modelling combined
with genetic search can provide accurate segmentations.
- Shen, DG, and Davatzikos, C, "Adaptive-focus statistical shape model for segmentation of 3D MR structures," MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION - MICCAI 2000, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1935, pp. 206-215, 2000.
Abstract:
This paper presents a deformable model for automatically
segmenting objects from volumetric MR images and obtaining
point correspondences, using geometric and statistical
information in a hierarchical scheme. Geometric information is
embedded into the model via an affine-invariant attribute
vector, which characterizes the geometric structure around each
model point from a local to a global level. Accordingly, the
model deforms seeking boundary points with similar attribute
vectors. This is in contrast to most deformable surface models,
which adapt to nearby edges without considering the geometric
structure. The proposed model is adaptive in that it initially
focuses on the most reliable structures of interest, and
subsequently switches focus to other structures as those become
closer to their respective targets and therefore more reliable.
The proposed techniques have been used to segment boundaries of
the ventricles, the caudate nucleus, and the lenticular nucleus
from volumetric MR images.
- Montagnat, J, and Delingette, H, "Space and time shape constrained deformable surfaces for 4D medical image segmentation," MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION - MICCAI 2000, LECTURE NOTES IN COMPUTER SCIENCE, vol. 1935, pp. 196-205, 2000.
Abstract:
The aim of this work is to automatically extract quantitative
parameters from time sequences of 3D images (4D images) suited
to heart pathology diagnosis. In this paper, we propose a
framework for the reconstruction of the left ventricle motion
from 4D images based on 4D deformable surface models. These 4D
models are represented as a time sequence of 3D meshes whose
deformations are correlated during the cardiac cycle. Both
temporal and spatial constraints based on prior knowledge of
heart shape and motion are combined to improve the segmentation
accuracy. In contrast to many earlier approaches, our framework
includes the notion of trajectory constraint. We have
demonstrated the ability of this segmentation tool to deal with
noisy or low contrast images on 4D MR, SPECT, and US images.
- Duncan, JS, and Ayache, N, "Medical image analysis: Progress over two decades and the challenges ahead," IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 22, pp. 85-106, 2000.
Abstract:
The analysis of medical images has been woven into the fabric
of the Pattern Analysis and Machine Intelligence (PAMI)
community since the earliest days of these Transactions.
Initially, the efforts in this area were seen as applying
pattern analysis and computer vision techniques to another
interesting dataset. However, over the last two to three
decades, the unique nature of the problems presented within
this area of study has led to the development of a new
discipline in its own right. Examples of these include: the
types of image information that are acquired, the fully three-
dimensional image data, the nonrigid nature of object motion
and deformation, and the statistical variation of both the
underlying normal and abnormal ground truth. In this paper, we
look at progress in the field over the last 20 years and
suggest some of the challenges that remain for the years to
come.
- Garrido, A, and de la Blanca, NP, "Applying deformable templates for cell image segmentation," PATTERN RECOGNITION, vol. 33, pp. 821-832, 2000.
Abstract:
This paper presents an automatic method, based on the
deformable template approach, for cell image segmentation under
severe noise conditions. We define a new methodology, dividing
the process into three parts: (1) obtain evidence from the
image about the location of the cells, (2) use this evidence to
calculate an elliptical approximation of these locations; (3)
refine cell boundaries using locally deforming models. We have
designed a new algorithm to locate cells and propose an energy
function to be used together with a stochastic deformable
template model. Experimental results show that this approach
for segmenting cell images is both fast and robust, and that
this methodology may be used for automatic classification as
part of a computer-aided medical decision making technique. (C)
2000 Pattern Recognition Society. Published by Elsevier Science
Ltd. All rights reserved.
- Suri, JS, Haralick, RM, and Sheehan, FH, "Greedy algorithm for error correction in automatically produced boundaries from low contrast ventriculograms," PATTERN ANALYSIS AND APPLICATIONS, vol. 3, pp. 39-60, 2000.
Abstract:
Non-homogeneous mixing of the dye with the blood in the left
ventricle chamber of the heart causes poor contrast in the
ventriculograms. The pixel-based classifiers [1] operating on
these ventriculograms yield boundaries which are not close to
ground truth boundaries as delineated by the cardiologist. They
have a mean boundary error of 6.4 mm and an error of 12.5 mm in
the apex zone. These errors have a systematic positional and
orientational bias, the boundary being under-estimated in the
apex zone. This paper discusses two calibration methods: the
identical coefficient and the independent coefficient to remove
these systematic biases. From these methods, we constitute a
fused algorithm which reduces the boundary error compared to
either of the calibration methods. The algorithm, in a greedy
way, computes which and how many vertices of the left ventricle
boundary can be taken from the computed boundary of each method
in order to best improve the performance. The corrected
boundaries have a mean error of less than 3.5 mm with a
standard deviation of 3.4 mm over the approximately 6 x 10^4
vertices in the data set of 291 studies. Our method reduces the
mean boundary error by 2.9 mm over the boundary produced by the
classifier. We also show that the calibration algorithm
performs better in the apex zone where the dye is unable to
propagate. For end diastole, the algorithm reduces the error
in the apex zone by 8.5 mm over the pixel-based classifier
boundaries.
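A hypothetical sketch of the training-time greedy selection: for
each vertex index, measure on a training set which calibration
method (identical or independent coefficient) gives the smaller
mean error against ground truth, and take that vertex from the
winning method when fusing. The error measure and data layout
are assumptions, not the authors' exact procedure.

    import numpy as np

    def greedy_vertex_assignment(ident, indep, truth):
        """ident, indep, truth: (n_studies, n_vertices, 2) boundary
        arrays.  Returns per-vertex choices:
        0 = identical coefficient, 1 = independent coefficient."""
        e_ident = np.linalg.norm(ident - truth, axis=2).mean(axis=0)
        e_indep = np.linalg.norm(indep - truth, axis=2).mean(axis=0)
        return (e_indep < e_ident).astype(int)

    def fuse(b_ident, b_indep, choice):
        """Assemble a fused boundary from the per-vertex choices."""
        fused = b_ident.copy()
        fused[choice == 1] = b_indep[choice == 1]
        return fused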
- Pentland, A, "Looking at people: Sensing for ubiquitous and wearable computing," IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 22, pp. 107-119, 2000.
Abstract:
The research topic of looking at people, that is, giving
machines the ability to detect, track, and identify people and
more generally, to interpret human behavior, has become a
central topic in machine vision research. Initially thought to
be the research problem that would be hardest to solve, it has
proven remarkably tractable and has even spawned several
thriving commercial enterprises. The principal driving
application for this technology is "fourth generation" embedded
computing: "smart" environments and portable or wearable
devices. The key technical goals are to determine the
computer's context with respect to nearby humans (e.g., who,
what, when, where, and why) so that the computer can act or
respond appropriately without detailed instructions. This paper
will examine the mathematical tools that have proven
successful, provide a taxonomy of the problem domain, and then
examine the state-of-the-art. Four areas will receive
particular attention: person identification,
surveillance/monitoring, 3D methods, and smart rooms/perceptual
user interfaces. Finally, the paper will discuss some of the
research challenges and opportunities.
|
|
2001 |
- Sui, L, Haralick, RM, and Sheehan, FH, "A knowledge-based boundary delineation system for contrast ventriculograms," IEEE TRANSACTIONS ON INFORMATION TECHNOLOGY IN BIOMEDICINE, vol. 5, pp. 116-132, 2001.
Abstract:
Automated left-ventricle (LV) boundary delineation from
contrast ventriculograms has been studied for decades.
Unfortunately, no accurate methods have ever been reported. A
new knowledge based multistage method to automatically
delineate the LV boundary at end diastole (ED) and end systole
(ES) is discussed in this paper. It has a mean absolute
boundary error of about 2 mm and an associated ejection
fraction error of about 6%. The method makes extensive use of
knowledge about LV shape and movement. The processing includes
a multiimage pixel region classification, shape regression, and
rejection classification. The method was trained and cross-
validated on a database of 375 studies whose ED and ES
boundary had been manually traced as the ground truth. The
cross-validated results presented in this paper show that the
accuracy is close to and slightly above the interobserver
variability.
- Hermann, NV, Jensen, BL, Dahl, E, Darvann, TA, and Kreiborg, S, "A method for three-projection infant cephalometry," CLEFT PALATE-CRANIOFACIAL JOURNAL, vol. 38, pp. 299-316, 2001.
Abstract:
Objective: To assess morphology and growth in infants and
children with craniofacial anomalies based on comprehensive
digitization of radiographic films in three, mutually
orthogonal projections. Method: The method consists of (1)
acquisition of radiographic films in a highly standardized
three-projection (lateral, frontal, and axial) cephalometer,
(2) marking and digitization of a total of 279 anatomical
landmarks in the three projections, and (3) computation and
presentation (tabular and graphical) of 356 linear and angular
variables describing the craniofacial morphology, including
soft tissue. Computation of statistical entities describing a
patient, a group of patients, the differences between patients
or groups of patients was carried out. Error assessment of the
method involved investigation of error distribution among a
number of error sources. Duplicate digitization of radiographic
films from 30 randomly selected patients, and from dry skulls,
was carried out to determine the errors contributed by
the procedure of landmark digitization and the distribution of
error among landmarks and variables, as well as between
projections. Results: The average error due to landmark
digitization, s(i), determined by duplicate digitization and
calculated by use of Dahlberg's formula was 0.8 mm for linear
variables and 1.6 degrees for angular variables. Conclusion:
This method of infant cephalometry has been shown to be highly
accurate and reproducible, and it adds significant new
potential for, e.g., asymmetry detection, population
comparison, and growth measurements compared to other
cephalometric techniques due to its standardized acquisition
and digitization protocol, inclusion of an axial projection,
and the large number of well-defined landmarks and variables
involved.
- Varekamp, C, and Hoekman, DH, "Segmentation of high-resolution InSAR data of a tropical forest using Fourier parameterized deformable models," INTERNATIONAL JOURNAL OF REMOTE SENSING, vol. 22, pp. 2339-2350, 2001.
Abstract:
Currently, tree maps are produced from field measurements that
are time consuming and expensive. Application of existing
techniques based on aerial photography is often hindered by
cloud cover. This has initiated research into the segmentation
of high resolution airborne interferometric Synthetic Aperture
Radar (SAR) data for deriving tree maps. A robust algorithm is
constructed to optimally position closed boundaries. The
boundary of a tree crown will be best approximated when at all
points on the boundary, the z-coordinate image gradient is
maximum, and directed inwards orthogonal to the boundary. This
property can be expressed as the result of a line integral
along the boundary. Boundaries with a large value for the line
integral are likely to be tree crowns. This paper focuses on
the search procedure and on illustrating how smoothing can be
used to prevent the search from becoming trapped in a local
optimum. The final crown detection stage is not described in
this paper but could be based on the gradient and implemented
using the above described value for the line integral. Results
of this paper indicate that a Fourier parametrization with only
three harmonics (nine parameters) can describe the shape
variation in the 2D crown projection in sufficient detail.
Current ground datasets are not suitable for obtaining
detection statistics such as the percentage of tree crowns
detected and the number of false alarms. Better ground datasets
will be needed to evaluate algorithm performance for real tree
mapping situations.
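One way to obtain nine parameters from three harmonics is a
radial Fourier series around a movable centre; the sketch below
uses that layout (an assumption about the parameterization) and
scores a candidate boundary by the line integral of the inward
gradient component, as the abstract describes.

    import numpy as np

    def fourier_boundary(params, n_points=200):
        """params = (cx, cy, a0, a1, b1, a2, b2, a3, b3): centre plus
        a radial series with three harmonics (nine parameters)."""
        cx, cy, a0 = params[0], params[1], params[2]
        t = np.linspace(0.0, 2 * np.pi, n_points, endpoint=False)
        r = np.full_like(t, a0)
        for k in (1, 2, 3):
            r += (params[2 * k + 1] * np.cos(k * t)
                  + params[2 * k + 2] * np.sin(k * t))
        return np.stack([cx + r * np.cos(t),
                         cy + r * np.sin(t)], axis=1)

    def boundary_score(boundary, grad_x, grad_y):
        """Line integral of the gradient along inward normals
        (boundary points are (x, y) on a counter-clockwise curve)."""
        d = np.roll(boundary, -1, axis=0) - boundary     # tangents
        normals = np.stack([-d[:, 1], d[:, 0]], axis=1)  # inward
        ij = np.clip(boundary.round().astype(int), 0,
                     [grad_x.shape[1] - 1, grad_x.shape[0] - 1])
        g = np.stack([grad_x[ij[:, 1], ij[:, 0]],
                      grad_y[ij[:, 1], ij[:, 0]]], axis=1)
        return float(np.sum(g * normals))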
- Chang, IC, and Huang, CL, "Skeleton-based walking motion analysis using hidden Markov model and active shape models," JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, vol. 17, pp. 371-403, 2001.
Abstract:
This paper proposes a skeleton-based human walking motion
analysis system which consists of three major phases. In the
first phase, it extracts the human body skeleton from the
background and then obtains the body signatures. In the second
phase, it analyzes the training sequences to generate
statistical models. In the third phase, it uses the trained
models to recognize the input human motion sequence and
calculate the motion parameters. The experimental results
demonstrate how our system can recognize the motion type and
describe the motion characteristics of the image sequence.
Finally, the synthesized motion sequences are illustrated. The
major contributions of this paper are: (1) development of a
skeleton-based method and use of Hidden Markov Models (HMM) to
recognize the motion type; (2) incorporation of the Active
Shape Models (ASMs) and the body structure characteristics to
generate the motion parameter curves of the human motion.
- Wohler, C, and Anlauf, JK, "Real-time object recognition on image sequences with the adaptable time delay neural network algorithm - applications for autonomous vehicles," IMAGE AND VISION COMPUTING, vol. 19, pp. 593-618, 2001.
Abstract:
Within the framework of the vision-based "Intelligent Stop&Go"
driver assistance system for both the motorway and the inner
city environment, we present a system for segmentation-free
detection of overtaking vehicles and estimation of ego-position
on motorways as well as a system for the recognition of
pedestrians in the inner city traffic scenario. Both systems
are running in real-time in the test vehicle UTA of the
DaimlerChrysler computer vision lab, relying on the adaptable
time delay neural network (ATDNN) algorithm. For object
recognition, this neural network processes complete image
sequences at a time instead of single images, as is the case
in most conventional neural algorithms. The results are
promising in that using the ATDNN algorithm, we are able to
perform the described recognition tasks in a large variety of
real-world scenarios in a computationally highly efficient and
rather robust and reliable manner. (C) 2001 Elsevier Science
Ltd. All rights reserved.
- Pycock, D, Pammu, S, Goode, AJ, and Harman, SA, "Robust model-based signal analysis and identification," PATTERN RECOGNITION, vol. 34, pp. 2181-2199, 2001.
Abstract:
We describe and evaluate a model-based scheme for feature
extraction and model-based signal identification which uses
likelihood criteria for "edge" detection. Likelihood measures
from the feature identification process are shown to provide a
well behaved measure of signal interpretation confidence. We
demonstrate that complex, transient signals, from one of 6
classes, can reliably be identified at signal to noise ratios
of 2 and that identification does not fail until the signal to
noise ratio has reached 1. Results show that the loss in
identification performance resulting from the use of a
heuristic, rather than an exhaustive, search strategy is
minimal. Crown Copyright (C) 2001 Published by Elsevier Science
Ltd on behalf of the Pattern Recognition Society. All rights
reserved.
- Dubuisson-Jolly, MP, and Gupta, A, "Tracking deformable templates using a shortest path algorithm," COMPUTER VISION AND IMAGE UNDERSTANDING, vol. 81, pp. 26-45, 2001.
Abstract:
This paper proposes a new technique to track deformable
templates. We extend the typical graph algorithms that have
been used for active contour recovery to incorporate shape
information. The advantage of graph algorithms is that they are
guaranteed to find the global minimum of the energy function.
The difficulty with their traditional use for active contours
is that they consider only two pixels at a time when recovering
the contour, making it impossible to enforce shape constraints.
We define the deformable template as a polygonal contour,
demonstrate the proper mapping between the image, the contour,
and a graph, and show how to apply Dijkstra's algorithm to
track contours in image sequences. Examples are shown for
deforming contours, articulated objects, and smooth contours
being tracked in simple and complicated backgrounds. We also
provide an analysis of the computational requirements. (C) 2001
Academic Press.
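A minimal sketch of the shortest-path core: per-pixel step costs
are low on strong edges, and Dijkstra's algorithm recovers the
globally optimal path between two endpoints. The shape term that
the paper adds to the graph is omitted here for brevity.

    import heapq
    import numpy as np

    def dijkstra_contour(cost, start, goal):
        """cost: 2-D array of per-pixel costs (low on the boundary);
        start, goal: (row, col) tuples.  Returns the optimal path."""
        dist, prev = {start: 0.0}, {}
        heap = [(0.0, start)]
        while heap:
            d, u = heapq.heappop(heap)
            if u == goal:
                break
            if d > dist.get(u, np.inf):
                continue                      # stale heap entry
            for du in (-1, 0, 1):
                for dv in (-1, 0, 1):
                    v = (u[0] + du, u[1] + dv)
                    if v == u or not (0 <= v[0] < cost.shape[0]
                                      and 0 <= v[1] < cost.shape[1]):
                        continue
                    nd = d + cost[v]
                    if nd < dist.get(v, np.inf):
                        dist[v], prev[v] = nd, u
                        heapq.heappush(heap, (nd, v))
        path, node = [], goal
        while node in prev:                   # walk back to the start
            path.append(node)
            node = prev[node]
        return [start] + path[::-1]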
- Huang, CL, and Jeng, SH, "A model-based hand gesture recognition system," MACHINE VISION AND APPLICATIONS, vol. 12, pp. 243-258, 2001.
Abstract:
This paper introduces a model-based hand gesture recognition
system, which consists of three phases: feature extraction,
training, and recognition. In the feature extraction phase, a
hybrid technique combines the spatial (edge) and the temporal
(motion) information of each frame to extract the feature
images. Then, in the training phase, we use the principal
component analysis (PCA) to characterize spatial shape
variations and the hidden Markov models (HMM) to describe the
temporal shape variations. A modified Hausdorff distance
measurement is also applied to measure the similarity between
the feature images and the pre-stored PCA models. The
similarity measures are referred to as the possible
observations for each frame. Finally, in the recognition phase,
with the pre-trained PCA models and HMM, we can generate the
observation patterns from the input sequences and then apply
the Viterbi algorithm to identify the gesture. In the
experiments, we prove that our method can recognize 18
different continuous gestures effectively.
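One common "modified Hausdorff distance", due to Dubuisson and
Jain, replaces the max in the directed distance with a mean,
making the match less sensitive to outliers; a compact version,
assuming this is the variant meant:

    import numpy as np

    def directed_mhd(A, B):
        """Mean over points of A of the distance to the nearest
        point of B; A, B are (n, 2) and (m, 2) point arrays."""
        d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
        return d.min(axis=1).mean()

    def modified_hausdorff(A, B):
        return max(directed_mhd(A, B), directed_mhd(B, A))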
- Shearer, K, Wong, KD, and Venkatesh, S, "Combining multiple tracking algorithms for improved general performance," PATTERN RECOGNITION, vol. 34, pp. 1257-1269, 2001.
Abstract:
Automated tracking of objects through a sequence of images has
remained one of the difficult problems in computer vision.
Numerous algorithms and techniques have been proposed for this
task. Some algorithms perform well in restricted environments,
such as tracking using stationary cameras, but a general
solution is not currently available. A frequent problem is that
when an algorithm is refined for one application, it becomes
unsuitable for other applications. This paper proposes a
general tracking system based on a different approach. Rather
than refine one algorithm for a specific tracking task, two
tracking algorithms are employed, and used to correct each
other during the tracking task. By choosing the two algorithms
such that they have complementary failure modes, a robust
algorithm is created without increased specialisation. (C) 2001
Pattern Recognition Society. Published by Elsevier Science Ltd.
All rights reserved.
- Shen, DG, Herskovits, EH, and Davatzikos, C, "An adaptive-focus statistical shape model for segmentation and shape modeling of 3-D brain structures," IEEE TRANSACTIONS ON MEDICAL IMAGING, vol. 20, pp. 257-270, 2001.
Abstract:
This paper presents a deformable model for automatically
segmenting brain structures from volumetric magnetic resonance
(MR) images and obtaining point correspondences, using
geometric and statistical information in a hierarchical scheme.
Geometric information is embedded into the model via a set of
affine-invariant attribute vectors, each of which characterizes
the geometric structure around a point of the model from a
local to a global scale. The attribute vectors, in conjunction
with the deformation mechanism of the model, warrant that the
model not only deforms to nearby edges, as is customary in most
deformable surface models, but also that it determines point
correspondences based on geometric similarity at different
scales. The proposed model is adaptive in that it initially
focuses on the most reliable structures of interest, and
gradually shifts focus to other structures as those become
closer to their respective targets and, therefore, more
reliable. The proposed techniques have been used to segment
boundaries of the ventricles, the caudate nucleus, and the
lenticular nucleus from volumetric MR images.
- Chang, CC, and Tsai, WH, "Vision-based tracking and interpretation of human leg movement for virtual reality applications," IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, vol. 11, pp. 9-24, 2001.
Abstract:
A vision-based system for tracking and interpreting leg motion
in image sequences using a single camera is developed for a
user to control his movement in the virtual world by his legs.
Twelve control commands are defined. The trajectories of the
color marks placed on the shoes of the user are used to
determine the types of leg movement by a first-order Markov
process. Then, the types of leg movement are encoded
symbolically as input to Mealy machines to recognize the
control command associated with a sequence of leg movements.
The proposed system is implemented on a commercial PC without
any special hardware. Because the transition functions of Mealy
machines are deterministic, the implementation of the proposed
system is simple and the response time of the system is short.
Experimental results with a 14-Hz frame rate on 320 x 240 image
resolution are included to prove the feasibility of the
proposed approach.
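A toy Mealy machine of the kind described: states carry tracking
context, inputs are symbolic leg-movement codes, and transitions
deterministically emit commands. The states, symbols, and
commands below are invented for illustration only.

    # (state, input symbol) -> (next state, output command or None)
    TRANSITIONS = {
        ("idle", "lift_left"):  ("left_up", None),
        ("left_up", "step"):    ("idle", "WALK_FORWARD"),
        ("idle", "lift_right"): ("right_up", None),
        ("right_up", "step"):   ("idle", "TURN_RIGHT"),
    }

    def run_mealy(symbols, state="idle"):
        commands = []
        for s in symbols:
            # Unknown pairs reset to "idle" and emit nothing.
            state, out = TRANSITIONS.get((state, s), ("idle", None))
            if out is not None:
                commands.append(out)
        return commands

    # e.g. run_mealy(["lift_left", "step"]) == ["WALK_FORWARD"]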
- Sullivan, J, Blake, A, Isard, M, and MacCormick, J, "Bayesian object localisation in images," INTERNATIONAL JOURNAL OF COMPUTER VISION, vol. 44, pp. 111-135, 2001.
Abstract:
A Bayesian approach to intensity-based object localisation is
presented that employs a learned probabilistic model of image
filter-bank output, applied via Monte Carlo methods, to escape
the inefficiency of exhaustive search. An adequate
probabilistic account of image data requires intensities both
in the foreground (i.e., over the object), and in the
background, to be modelled. Some previous approaches to object
localisation by Monte Carlo methods have used models which, we
claim, do not fully address the issue of the statistical
independence of image intensities. It is addressed here by
applying to each image a bank of filters whose outputs are
approximately statistically independent. Distributions of the
responses of individual filters, over foreground and
background, are learned from training data. These distributions
are then used to define a joint distribution for the output of
the filter bank, conditioned on object configuration, and this
serves as an observation likelihood for use in probabilistic
inference about localisation. The effectiveness of
probabilistic object localisation in image clutter, using
Bayesian Localisation, is illustrated. Because it is a Monte
Carlo method, it produces not simply a single estimate of
object configuration, but an entire sample from the posterior
distribution for the configuration. This makes sequential
inference of configuration possible. Two examples are
illustrated here: coarse to fine scale inference, and
propagation of configuration estimates over time, in image
sequences.
- Rolfe, BF, Cardew-Hall, M, Abdallah, SM, and West, GAW, "Geometric shape errors in forging: developing a metric and an inverse model," PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART B- JOURNAL OF ENGINEERING MANUFACTURE, vol. 215, pp. 1229-1240, 2001.
Abstract:
The complexity of the forging process ensures that there is
inherent variability in the geometric shape of a forged part.
While knowledge of shape error, comparing the desired versus
the measured shape, is significant in measuring part quality,
the question of more interest is what can this error suggest
about the forging process set-up? The first contribution of
this paper is to develop a shape error metric which identifies
geometric shape differences that occur from a desired forged
part. This metric is based on the point distribution deformable
model developed in pattern recognition research. The second
contribution of this paper is to propose an inverse model that
identifies changes in process set-up parameter values by
analysing the proposed shape error metric. The metric and
inverse models are developed using two sets of simulated hot-
forged parts created using two different die pairs (simple and
'M'-shaped die pairs). A neural network is used to classify the
shape data into three arbitrarily chosen levels for each
parameter and it is accurate to at least 77 per cent in the
worst case for the simple die pair data and has an average
accuracy of approximately 80 per cent when classifying the more
complex 'M'-shaped die pair data.
- Gavrila, DM, "Sensor-based pedestrian protection," IEEE INTELLIGENT SYSTEMS, vol. 16, pp. 77-81, 2001.
- Hammond, P, Hutton, TJ, Nelson-Moon, ZL, Hunt, NP, and Madgwick, AJA, "Classifying vertical facial deformity using supervised and unsupervised learning," METHODS OF INFORMATION IN MEDICINE, vol. 40, pp. 365-372, 2001.
Abstract:
Objectives: To evaluate the potential for machine learning
techniques to identify objective criteria for classifying
vertical facial deformity. Methods: 19 parameters were
determined from 131 lateral skull radiographs. Classifications
were induced from raw data with simple visualisation, C5.0 and
Kohonen feature maps, and using a Point Distribution Model
(PDM) of shape templates comprising points taken from digitised
radiographs. Results: The induced decision trees enable a
direct comparison of clinicians' idiosyncrasies in
classification. Unsupervised algorithms induce models that are
potentially more objective, but their black-box nature makes
them unsuitable for clinical application. The PDM methodology
gives dramatic visualisations of two modes separating
horizontal and vertical facial form. Kohonen feature maps
favour one clinician and PDM the other. Clinical response
suggests that while Clinician 1 places greater weight on 5 of 6
parameters, Clinician 2 relies on more parameters that capture
facial shape. Conclusions: While machine learning and
statistical analyses classify subjects for vertical facial
height, they have limited application in their present form.
The supervised learning algorithm C5.0 is effective for
generating rules for individual clinicians, but its inherent
bias invalidates its use for objective classification of facial
form for research purposes. On the other hand, promising
results from unsupervised strategies (especially the PDM)
suggest a potential use for objective classification and
further identification and analysis of ambiguous cases. At
present, such methodologies may be unsuitable for clinical
application because of the invisibility of their underlying
processes. Further study is required with additional patient
data and a wider group of clinicians.
- Montagnat, J, Delingette, H, and Ayache, N, "A review of deformable surfaces: topology, geometry and deformation," IMAGE AND VISION COMPUTING, vol. 19, pp. 1023-1040, 2001.
Abstract:
Deformable models have raised much interest and found various
applications in the fields of computer vision and medical
imaging. They provide an extensible framework to reconstruct
shapes. Deformable surfaces, in particular, are used to
represent 3D objects. They have been used for pattern
recognition [Computer Vision and Image Understanding 69(2)
(1998) 201; IEEE Transactions on Pattern Analysis and Machine
Intelligence 19(10) (1997) 1115], computer animation [ACM
Computer Graphics (SIGGRAPH'87) 21(4) (1987) 205], geometric
modelling [Computer Aided Design (CAD) 24(4) (1992) 178],
simulation [Visual Computer 16(8) (2000) 437], boundary
tracking [ACM Computer Graphics (SIGGRAPH'94) (1994) 185],
image segmentation [Computer Integrated Surgery, Technology and
Clinical Applications (1996) 59; IEEE Transactions on Medical
Imaging 14 (1995) 442; Joint Conference on Computer Vision,
Virtual Reality and Robotics in Medicine (CVRMed-MRCAS'97) 1205
(1997) 13; Medical Image Computing and Computer-Assisted
Intervention (MICCAI'99) 1679 (1999) 176; Medical Image
Analysis 1(1) (1996) 19], etc. In this paper we present a
survey of deformable surfaces. Many surface representations
have been proposed to meet different 3D reconstruction problem
requirements. We classify the main representations proposed in
the literature and we study the influence of the representation
on the model evolution behavior, revealing some similarities
between different approaches. (C) 2001 Elsevier Science B.V.
All rights reserved.
- Inoue, A, Drummond, T, and Cipolla, R, "Real time feature-based facial tracking using Lie algebras," IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, vol. E84D, pp. 1733-1738, 2001.
Abstract:
We have developed a novel human facial tracking system that
operates in real time at a video frame rate without needing any
special hardware. The approach is based on the use of Lie
algebra, and uses three-dimensional feature points on the
targeted human face. It is assumed that the roughly estimated
facial model (relative coordinates of the three-dimensional
feature points) is known. First, the initial feature positions
of the face are determined using a model fitting technique.
Then, tracking proceeds by the following sequence: (1)
capture the new video frame and render feature points to the
image plane; (2) search for new positions of the feature points
on the image plane; (3) get the Euclidean matrix from the
moving vector and the three-dimensional information for the
points; and (4) rotate and translate the feature points by
using the Euclidean matrix, and render the new points on the
image plane. The key algorithm of this tracker is to estimate
the Euclidean matrix by using a least-squares technique based
on Lie algebra. The resulting tracker performed very well on
the task of tracking a human face.
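The least-squares step admits a short sketch (illustrative
Python, not the authors' implementation; it assumes small
inter-frame motion and known 3D feature points): solve for six
twist coefficients on the se(3) generators, then exponentiate
to obtain the Euclidean matrix.

    import numpy as np
    from scipy.linalg import expm

    # Generators of se(3) acting on homogeneous points:
    # three translations, then rotations about x, y, z.
    G = [np.zeros((4, 4)) for _ in range(6)]
    G[0][0, 3] = G[1][1, 3] = G[2][2, 3] = 1.0
    G[3][1, 2], G[3][2, 1] = -1.0, 1.0
    G[4][0, 2], G[4][2, 0] = 1.0, -1.0
    G[5][0, 1], G[5][1, 0] = -1.0, 1.0

    def estimate_motion(X, dX):
        """X: (n, 4) homogeneous points; dX: (n, 4) displacements."""
        # Column j stacks the velocities G_j X of generator j.
        A = np.stack([(X @ Gj.T).ravel() for Gj in G], axis=1)
        alpha, *_ = np.linalg.lstsq(A, dX.ravel(), rcond=None)
        return expm(sum(a * Gj for a, Gj in zip(alpha, G)))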
- Chung, FL, and Ip, WWS, "Complex character decomposition using deformable model," IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C- APPLICATIONS AND REVIEWS, vol. 31, pp. 126-132, 2001.
Abstract:
Despite the fact that Chinese characters are composed of
radicals and Chinese usually formulate their knowledge of
Chinese characters as a combination of radicals, very few
studies have focused on a character decomposition approach to
recognition, i.e., recognizing a character by first extracting
and recognizing its radicals. In this paper, such an approach
is adopted and the problem of how to extract radical subimages
from character images is addressed by proposing an algorithm
based on a deformable model (DM). The application of a DM to
complex character decomposition (and recognition) is a novel
one and concepts like goodness of character decomposition have
been exploited to formulate appropriate energy terms and to
devise cost effective minimization schemes for the problem. The
advantage of the character decomposition approach is
demonstrated by feeding the extracted radical images to an
existing structure-based Chinese character recognizer, the
outputs of which are then combined to classify the input.
Simulation results show that the performance of the existing
system can be improved significantly when character
decomposition is used.
- Mitchell, SC, Lelieveldt, BPF, van der Geest, RJ, Bosch, HG, Reiber, JHC, and Sonka, M, "Multistage hybrid active appearance model matching: Segmentation of left and right ventricles in cardiac MR images," IEEE TRANSACTIONS ON MEDICAL IMAGING, vol. 20, pp. 415-423, 2001.
Abstract:
A fully automated approach to segmentation of the left and
right cardiac ventricles from magnetic resonance (MR) images is
reported. A novel multistage hybrid appearance model
methodology is presented in which a hybrid active shape
model/active appearance model (AAM) stage helps avoid local
minima of the matching function. This yields an overall more
favorable matching result. An automated initialization method
is introduced, making the approach fully automated. Our method
was trained on a set of 102 MR images and tested on a separate
set of 60 images. In all testing cases, the matching resulted
in a visually plausible and accurate mapping of the model to
the image data. Average signed border positioning errors did
not exceed 0.3 mm for any of the three determined contours:
left-ventricular (LV) epicardium, LV endocardium, and
right-ventricular (RV)
endocardium. The area measurements derived from the three
contours correlated well with the independent standard (r =
0.96, 0.96, 0.90), with slopes and intercepts of the regression
lines close to one and zero, respectively. Testing the
reproducibility of the method demonstrated an unbiased
performance with a small range of error, as assessed via the
Bland-Altman statistic. In a direct border positioning error
comparison, the multistage method significantly outperformed
the conventional AAM (p < 0.001). The developed method promises
to facilitate fully automated quantitative analysis of LV and
RV morphology and function in clinical setting.
- Chang, JY, and Chen, JL, "Automated facial expression recognition system using neural networks," JOURNAL OF THE CHINESE INSTITUTE OF ENGINEERS, vol. 24, pp. 345-356, 2001.
Abstract:
This paper proposes an automated facial expression recognition
system using neural network classifiers. First, we use the
rough contour estimation routine, mathematical morphology, and
point contour detection method to extract the precise contours
of the eyebrows, eyes, and mouth of a face image. Then we
define 30 facial characteristic points to describe the position
and shape of these three facial features. Facial expressions
can be described by combining different action units, which are
specified by the basic muscle movements of a human face. We
choose six main action units, composed of facial characteristic
point movements, as the input vectors of two different neural
network-based expression classifiers including a radial basis
function network and a multilayer perceptron network. Using
these two networks, we have obtained recognition rates as high
as 92.1% in categorizing the facial expressions neutral,
anger, and happiness. Simulation results demonstrate that
computers are capable of extracting high-level, abstract
information in a human-like way.
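The classifier stage lends itself to a compact toy sketch in
Python (hypothetical data; the action-unit features extracted
from the facial characteristic points are assumed to arrive
from upstream):

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(0)
    X = rng.random((300, 6))            # stand-in action-unit vectors
    y = rng.integers(0, 3, 300)         # 0=neutral, 1=anger, 2=happiness

    clf = MLPClassifier(hidden_layer_sizes=(12,), max_iter=2000,
                        random_state=0)
    clf.fit(X, y)
    print(clf.predict(X[:5]))           # predicted expression labels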
- Cootes, TF, Edwards, GJ, and Taylor, CJ, "Active appearance models," IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 23, pp. 681-685, 2001.
Abstract:
We describe a new method of matching statistical models of
appearance to images. A set of model parameters control modes
of shape and gray-level variation learned from a training set.
We construct an efficient iterative matching algorithm by
learning the relationship between perturbations in the model
parameters and the induced image errors.
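The learned parameter/residual relationship can be sketched as
follows (illustrative Python, not the published formulation): a
linear regression, fitted offline from known parameter
perturbations and the texture residuals they induce, drives the
matching iterations.

    import numpy as np

    def learn_update_matrix(dC, dI):
        """dC: (m, p) parameter perturbations; dI: (m, q) residuals.
        Fits dC ~ dI @ X by least squares; returns R = X.T."""
        X, *_ = np.linalg.lstsq(dI, dC, rcond=None)
        return X.T                      # maps a residual to an update

    def match(c, residual_fn, R, n_iter=30):
        """Iteratively correct the appearance parameters c."""
        for _ in range(n_iter):
            r = residual_fn(c)          # sampled-minus-model texture
            c = c - R @ r               # predicted parameter correction
        return c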
|
|
2002 |
- Sanchez, G, Llados, J, and Tombre, K, "A mean string algorithm to compute the average among a set of 2D shapes," PATTERN RECOGNITION LETTERS, vol. 23, pp. 203-213, 2002.
Abstract:
An algorithm to compute the mean shape, when the shape is
represented by a string, is presented as a modification of the
well-known string edit algorithm. Given N strings of symbols, a
string edit sequence defines a mapping between their
corresponding symbols. We transform these sets of mapped
symbols (edges) into piecewise linear functions and we compute
their mean. To transform them into functions, we use the
equation of the line defining their edges, and the percentage
of their length, in order to have a common parameterization.
The algorithm has been experimentally tested by computing a
representative for a class of shapes within a clustering
procedure in a graphics recognition application.
(C) 2002 Elsevier Science B.V. All rights reserved.
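The averaging step can be made concrete with a short Python
sketch (illustrative only; the edit-distance mapping that puts
the shapes into correspondence is assumed to have been computed
upstream): each polyline is re-parameterized by fractional arc
length, then averaged pointwise.

    import numpy as np

    def resample(poly, n=100):
        """Resample a (k, 2) polyline at n equal arc-length fractions."""
        seg = np.linalg.norm(np.diff(poly, axis=0), axis=1)
        t = np.concatenate([[0.0], np.cumsum(seg)]) / seg.sum()
        u = np.linspace(0.0, 1.0, n)
        return np.column_stack([np.interp(u, t, poly[:, i])
                                for i in range(2)])

    def mean_shape(polylines, n=100):
        """Pointwise mean of arc-length-parameterized polylines."""
        return np.mean([resample(p, n) for p in polylines], axis=0)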
|