Terminology in Computer Vision

Adaptive Thresholding
The process of thresholding with the automatic local selection of a threshold value. Generally this value is estimated from local image content.
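
As an illustration only, the sketch below takes the threshold at each pixel to be the mean grey level of a small surrounding window. The function name `adaptive_threshold` and the `radius` and `offset` parameters are assumptions made for this example; many other local estimates are possible.

```python
def adaptive_threshold(image, radius=1, offset=0):
    """Binarise a 2D list of grey levels using a local mean threshold."""
    rows, cols = len(image), len(image[0])
    output = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Gather the neighbourhood, clipped at the image border.
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            # The threshold is estimated from local image content.
            local_threshold = sum(window) / len(window) - offset
            output[r][c] = 1 if image[r][c] > local_threshold else 0
    return output


spot = [[10, 10, 10],
        [10, 50, 10],
        [10, 10, 10]]
print(adaptive_threshold(spot))
```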

Additive and Multiplicative Noise
The name given to the dominant dependency of the random noise process on the observed signal.

Aperture Problem
The problem encountered in "motion" analysis that extended linear "features" cannot provide local information regarding motion in any direction other than perpendicular to the line.

Area
Total number of pixel locations in a segmented region.

Artificial Neural Networks
A name given to the computational models implemented on computers (generally in software) which mimic certain aspects of biological brain function. They are generally distinguished according to their "training" paradigm as either "supervised" or "unsupervised" which are used to adjust free parameters within the system in order to achieve a particular task. Under certain circumstances ANNs can be shown to be "Bayes Optimal".

Bayes Decision Rule
A classification rule which assigns a classification category on the basis of a set of evidence according to the class model whose conditional probability (given the evidence) is highest. The resulting "Bayes Error" rate is the theoretical optimal performance for a forced classification task. See "Bayes Theory".

Bayes Theory
An equation derived from "probability theory" which allows the probability to be computed that a particular process generated a particular set of data. It involves the use of both "conditional" and "prior" probabilities. The former can generally be computed from a model or sample data. The latter is the expected relative frequency that a particular process would have occurred.
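
A minimal sketch of how the prior and conditional probabilities combine, followed by the Bayes decision rule, is given below. The class names and probability values are invented for illustration, and the function names `posteriors` and `bayes_decision` are assumptions of this example.

```python
def posteriors(priors, likelihoods):
    """Return P(class | evidence) for each class using Bayes theorem.

    priors[k]      : P(class k), the expected relative frequency of class k
    likelihoods[k] : P(evidence | class k), from a model or sample data
    """
    joint = {k: priors[k] * likelihoods[k] for k in priors}
    evidence = sum(joint.values())  # P(evidence), the normalising term
    return {k: value / evidence for k, value in joint.items()}


def bayes_decision(priors, likelihoods):
    """Pick the class whose posterior probability is highest."""
    post = posteriors(priors, likelihoods)
    return max(post, key=post.get)


# Hypothetical two-class example.
priors = {"object": 0.2, "background": 0.8}
likelihoods = {"object": 0.9, "background": 0.1}
print(posteriors(priors, likelihoods))
print(bayes_decision(priors, likelihoods))
```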

Biological Vision
The process of modelling the underlying computational aspects behind human (or other) visual perception. The existence proof that "image understanding" must be possible and one approach to guiding the design of "computer vision" systems.

Blurring
The process in image formation which removes predominantly high frequency components from the image. The physical analog of "Image Smoothing".

Calibrated Stereo
The name given to stereo algorithms which return accurate metric information regarding the real 3D structure of a scene. This should be contrasted with uncalibrated stereo algorithms which solve the correspondence problem but return geometrical measurements only up to an unknown transformation.

Classification
The process of assigning one of a limited set of alternative interpretations to (the generator of) a set of data. Often requires the steps of the computation of relative probabilities (or a quantity related to them) followed by the application of a decision rule. All classification processes can be evaluated in terms of "detection" and "mis-classification" rates. See "receiver operator characteristics".

Colour Images
Colour in images is generally represented as a triplet of values at each image location ("pixel") in one of the "chromaticity", "hue saturation" or "YIQ" representations. These representations can be inter-converted but "hue saturation" is the one which is more directly useful in image interpretation as it separates intensity (which relates closely to surface shape) from chromatic properties (which relate to scene content).

Complex Images
Complex images are represented as a pair of values at each image location ("pixel") corresponding to the real and imaginary parts of a complex number. Such images are generated by "image processing" algorithms such as the 2D "Fourier Transform" though may also provide a useful representation for image data in a wide variety of algorithms.

Computer Vision
Computer Vision is the subject area which deals with the automatic analysis of images for the purposes of quantification or system control (often mimicking tasks which humans find trivial). It is to be distinguished from "Image Processing" which deals only with the computational processes applied to images, including enhancement and compression, but does not deal with abstract representation for the purposes of reasoning and interpretation. Computer Vision can be seen as the inverse of Computer Graphics, though generally the representations and methods of this area are not of use in Computer Vision due to the incomplete and therefore ambiguous nature of images. This requires prior knowledge to be used in order to obtain robust scene interpretation.

Connected Region
A set of "features" within a scene where the property of "connectivity" is defined between all features.

Connectivity
This is defined on a binary image of labelled (or classified) pixels as the property of being able to find a path between two similar labels by moving only along adjacent similarly labelled pixels. Adjacency can be defined in several ways according to the image lattice, in particular rectangular lattices can have four or eight-way adjacency depending on how diagonal pixels are treated. Hexagonal lattices have six-way adjacency.
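
The effect of the adjacency choice on a rectangular lattice can be sketched with a simple flood-fill labelling. The function name `label_regions` and its `connectivity` parameter are assumptions of this example, and a breadth-first fill is only one of several ways to label connected pixels.

```python
from collections import deque


def label_regions(binary, connectivity=4):
    """Label connected regions of non-zero pixels; returns (labels, count)."""
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if connectivity == 8:
        # Eight-way adjacency also treats diagonal pixels as neighbours.
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # grow the region along adjacent set pixels
                y, x = queue.popleft()
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


diagonal = [[1, 0],
            [0, 1]]
print(label_regions(diagonal, connectivity=4)[1])  # two separate regions
print(label_regions(diagonal, connectivity=8)[1])  # one region
```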

Contrast
A generic term relating to the degree of visual difference between a particular "object" in an image and the rest of the scene or background. In order to make this concept quantitative, using for example grey levels, account must be taken of the variation in the "features" or measurement accuracy. The contrast within an image can often be changed for the purposes of visualisation, though this will have no effect on the true information content of the image from the point of view of processing. Contrast is often used as a property of a feature for the purposes of matching or correspondence.

Contrast Manipulation
The adjustment of image grey levels to enhance the visual appearance of certain "objects" by increasing the difference between their grey level values and those of their surroundings. Achieved by "mapping" the initial grey scale onto a different set of values. The most common form of this is "histogram equalisation".

Convolution
Convolution is a linear operator. It involves the construction of an output image by computing each pixel as a weighted sum of a local region of the input image with a fixed array or "convolution mask". The process is mathematically equivalent to many physical processes which occur during image formation, particularly optical blurring, though it is also useful for a variety of tasks including noise filtering. The computational requirements for large masks have led to the development of several approaches for efficient implementation. In particular, Fourier techniques, based on the "Convolution Theorem", which also makes possible the process of "deconvolution" which can be used for image enhancement (particularly de-blurring).
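
The weighted-sum definition can be sketched directly in the spatial domain. The function name `convolve2d` is an assumption of this example, which computes only the "valid" output region where the mask lies fully inside the image; it is the slow direct method, not the Fourier approach.

```python
def convolve2d(image, mask):
    """Direct 2D convolution of a list-of-lists image with a small mask."""
    mrows, mcols = len(mask), len(mask[0])
    # Convolution (as opposed to correlation) flips the mask in both axes.
    flipped = [row[::-1] for row in mask[::-1]]
    output = []
    for r in range(len(image) - mrows + 1):
        out_row = []
        for c in range(len(image[0]) - mcols + 1):
            total = 0
            for i in range(mrows):
                for j in range(mcols):
                    total += image[r + i][c + j] * flipped[i][j]
            out_row.append(total)
        output.append(out_row)
    return output


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
box = [[1, 1, 1]] * 3  # unnormalised 3x3 smoothing mask
print(convolve2d(image, box))
```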

Corners
Edge features can only be localised in one dimension on the image plane due to the "aperture problem". Corners can effectively be defined as any image feature which can be located reliably in 2D. This includes categories of feature detector algorithms referred to as "interest operators". Corner detectors are often based on discrete approximations to the product of image curvature and gradient. As a consequence "corners" can also be defined along edge strings as points of high curvature.

Correspondence
The process of associating "features" from two images of the same scene as belonging to the same scene feature. This is a fundamental process in both feature based stereo and motion analysis. The problem is often complicated by incomplete feature detection (unreliability or spatial occlusion) and potential matching ambiguities.

Covariance
A representation of multi-parameter measurement accuracy in the form of a matrix which defines the size and orientation of a quadratic log likelihood function.

Curve Fitting
The process of determining a set of parameters forming the representation of a continuous curve which "best" describes a discrete set of data, generally in a "least squares" sense. The most common form of curves are lines, circles, ellipses, general conic sections and splines. The process is useful for data reduction, interpolation and estimation of continuous properties such as derivatives and has also been used as a starting point for "object" location.

Decision Theory
See "Bayes Theory".

Deconvolution
See "convolution".

Difference of Gaussian
The image formed by subtracting two "convolutions" of the same image with 2D "Gaussian" kernels at different scales. The process when considered in the "Fourier Domain" can be seen to be equivalent to selecting a particular band of frequency components from the input image. In the spatial domain the observed result can exhibit properties such as edge enhancement.

Disparity
This is the name given to the physical separation along an "epi-polar" line between equivalent features in a pair of stereo images. The value is simply related to distance from the camera via a reciprocal function.
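
For the common special case of two identical, parallel ("rectified") pin-hole cameras, the reciprocal relationship takes the form depth = focal length x baseline / disparity. The sketch below assumes that geometry; the function and parameter names are chosen for this example only.

```python
def depth_from_disparity(disparity, focal_length, baseline):
    """Depth along the optical axis for a rectified, parallel stereo pair.

    disparity    : separation of the matched features, in pixels
    focal_length : focal length expressed in pixels
    baseline     : distance between the optical centres, in scene units
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length * baseline / disparity


# Halving the disparity doubles the estimated distance.
print(depth_from_disparity(10, focal_length=500, baseline=0.1))
print(depth_from_disparity(5, focal_length=500, baseline=0.1))
```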

Digitized Images
A digitized "image" is an array of "grey level", "complex" or "colour" values which has been formed by sampling an image at predefined regular (often uniformly spaced) locations.

Dilating and Eroding
These non-linear operators are the basic processes of all morphological image processing operations. If the 2D image is considered as a 3D surface, the process of erosion and dilation can be visualised as etching or coating this surface (respectively) so that the surface contracts or expands. Mathematically the process is represented using a structuring element (analogous to a kernel in convolution) which determines how much to erode or dilate. A shaped structuring element will have an effect which varies depending upon the local surface orientation and can be used to extract or remove particular scales of shaped "features".
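
For binary images the two operators reduce to "any" and "all" tests over the structuring element. The sketch below assumes a symmetric element given as a list of (row, column) offsets, and treats pixels outside the image as zero; the names `CROSS`, `dilate` and `erode` are chosen for this example.

```python
# A symmetric cross-shaped structuring element, as (dy, dx) offsets.
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]


def _pixel(binary, r, c):
    """Pixel value, with everything outside the image treated as 0."""
    inside = 0 <= r < len(binary) and 0 <= c < len(binary[0])
    return binary[r][c] if inside else 0


def dilate(binary, offsets=CROSS):
    """Set a pixel if ANY pixel under the structuring element is set."""
    return [[int(any(_pixel(binary, r + dy, c + dx) for dy, dx in offsets))
             for c in range(len(binary[0]))]
            for r in range(len(binary))]


def erode(binary, offsets=CROSS):
    """Set a pixel only if ALL pixels under the structuring element are set."""
    return [[int(all(_pixel(binary, r + dy, c + dx) for dy, dx in offsets))
             for c in range(len(binary[0]))]
            for r in range(len(binary))]


dot = [[0, 0, 0],
       [0, 1, 0],
       [0, 0, 0]]
grown = dilate(dot)   # the single pixel is "coated" into a cross
print(grown)
print(erode(grown))   # "etching" the cross recovers the single pixel
```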

Edges
One of the major tasks of "computer vision" is the extraction of scene structure from the 2D image. The only "features" which are quantitatively preserved from the scene in the projected scene are differential discontinuities, ie: places at which there are large isolated gradients in the image data. These are edges and this property is referred to as diffeomorphic equivalence. There are several ways of defining and extracting differential discontinuities and these result in four characteristic approaches to extracting edges: gradient, Laplacian, zero crossings and morphological operations. The detection process also implicitly defines a "scale" for the detection process, as some edges at one scale of the image may not be detected or may bifurcate at another.

Edge Convolution Enhancement
The first stage in many "edge" detectors is a process of enhancement which generates an image in which ridges correspond to statistical evidence for an edge. Such a process is often referred to as edge enhancement and is achieved using linear convolution operators (often named after their inventors) including "Roberts", "Prewitt", and the early stages of "Canny".

Epi-Polar Line
This is the line in one of a pair of stereo images along which a physical 3D feature selected from the other image must lie. It is defined by the plane which passes through the physical 3D point and the optical centres of the cameras at the line of intersection with the image planes. Computing this line thus requires knowledge of the relative transformation in world coordinates which relates the left and right image planes.

Features
Any automatically "localisable" component of an image. The process of localisation implicitly requires a definition of a set of measurable characteristics.

Feature Matching
The process of assigning correspondence between two sets of "features" extracted from two images of the same (or an equivalent) scene.

Frequency Domain
The "digital image" can be converted from the "spatial domain" into the Fourier Domain by applying a 2D Discrete Fourier Transform. In this form the new frequency domain image is represented with complex (two-valued) pixels, each of which corresponds to the amplitude or strength of a particular scale and phase of sine curve. Images of repeating patterns or textures will naturally produce compact peaks in this image and because of this the Frequency Domain is often a convenient way of representing scene content.

Fourier Transformation
The Fourier Transform is a linear operation which computes the complex coefficients of the "Frequency Domain" representation. Each term is computed as the correlation of the original signal with a sine or cosine curve of particular phase and frequency. The inverse Fourier Transform allows the process to be reversed, and for discrete data this process is exact. The most commonly used algorithm for computing the Fourier Transform is the FFT (Fast Fourier Transform) which breaks the calculation down into a set of serial stages which minimises the computational effort. This transform is useful for a wide variety of image processing applications, including compression, interpolation and "convolution".
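
The correlation with sine and cosine curves, and the exactness of the inverse for discrete data, can be sketched in one dimension. This is the direct O(n^2) definition rather than the FFT, and the function names `dft` and `inverse_dft` are assumptions of this example.

```python
import cmath


def dft(signal):
    """Direct discrete Fourier transform of a 1D sequence."""
    n = len(signal)
    # Each coefficient correlates the signal with a complex exponential,
    # i.e. a cosine (real part) and a sine (imaginary part) of frequency u.
    return [sum(signal[k] * cmath.exp(-2j * cmath.pi * u * k / n)
                for k in range(n))
            for u in range(n)]


def inverse_dft(coefficients):
    """Inverse transform; recovers the samples up to rounding error."""
    n = len(coefficients)
    return [sum(coefficients[u] * cmath.exp(2j * cmath.pi * u * k / n)
                for u in range(n)) / n
            for k in range(n)]


samples = [1.0, 2.0, 0.0, -1.0]
recovered = inverse_dft(dft(samples))
print([round(value.real, 6) for value in recovered])
```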

Gaussian Filtering
Gaussian Filtering is the process of image "convolution" with a 2D Gaussian "kernel", generally in order to remove noise or fine scale (high frequency) structure from an image.

Geometric Distortion
Any process (though generally optical) which results in the geometric formation of the image differing from a pin-hole model. The most common form is known as "radial distortion" as the deviation from the standard model has a radial dependency often referred to as "pin-cushion" or "barrel" distortion.

Graph Searching
Though this has a generic meaning outside the field of "Computer Vision" it is generally encountered here as a method for incorporating additional knowledge (start and end points and also smoothness and curvature) into the process of edge location. Algorithms generally operate on a discrete lattice (eg pixels) and have a set of rules which define how optimality criteria should be computed according to local constraints. The graph search then identifies a solution that in some sense maximises these criteria.

Grey and Binary Scale Levels
A grey level is a single scalar value associated with a particular location in an image. For optical or photographic sensors this value is proportional to (or at least monotonically related to) the measured signal. Some sensors or image processing algorithms require multi-valued data such as complex (two valued) or "colour" (three valued) images. Grey level images generally have integer values in the range 0-63, 0-255 or 0-1023, corresponding to 6-bit, 8-bit, or 10-bit digitisation. The binary image requires less storage, one bit per pixel, and is generally used to represent the presence or absence of a particular "feature" at each point in the image (eg: "edges").

Grey Level Processing
Grey level processing is the general term given to image processing of "grey level" image data, as distinct from "binary processing", which refers to processing of binary (single bit per pixel) images.

High-pass Filtering
The process of selectively attenuating low frequency components of a signal.

Histogram
A histogram is an array of non negative integer counts from a set of data, which represents the frequency of occurrence of values within a set of non-overlapping regions. For example, the image histogram is an array of the frequency of occurrence of grey levels within a particular set of grey level ranges.

Histogram Modification
The creation of a new image by the systematic replacement of grey levels with an alternative set of values. The process of histogram equalisation is a particular case in which the resulting image has a flat image histogram.
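
A simple form of histogram equalisation can be sketched by mapping each grey level through the cumulative histogram. The function names and the `levels` parameter are assumptions of this example, and the scaling used (integer division of the cumulative count) is one of several common variants; with discrete grey levels the result is only approximately flat.

```python
def histogram(image, levels=256):
    """Count how often each grey level occurs in a 2D list image."""
    counts = [0] * levels
    for row in image:
        for value in row:
            counts[value] += 1
    return counts


def equalise(image, levels=256):
    """Replace grey levels so that the output histogram is roughly flat."""
    counts = histogram(image, levels)
    total = sum(counts)
    mapping, running = [], 0
    for count in counts:
        running += count  # cumulative histogram
        mapping.append((levels - 1) * running // total)
    return [[mapping[value] for value in row] for row in image]


dark = [[0, 0],
        [1, 1]]
print(histogram(dark, levels=4))
print(equalise(dark, levels=4))  # grey levels are spread over the range
```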

Hough Transformation
The Hough Transform is a particular method for selecting a set of parametric values from a specific functional model, which "best" describe a set of data. It works by effectively taking the peak in a "histogram" defined across the parameter space, where the entries are made based on which ranges of parameters can be supported by sub-sets of the data. The technique can be related to probability theory and robust statistics (particularly robust fitting) and works well for low parameter (2-3) models though has problems of computational complexity with higher orders.
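
For the two-parameter straight line model the idea can be sketched as below. The example assumes the common normal parameterisation rho = x cos(theta) + y sin(theta), which the text above does not specify; the function name `hough_line` and the bin sizes are also choices made for this illustration.

```python
import math
from collections import Counter


def hough_line(points, n_theta=180, rho_step=1.0):
    """Return (theta, rho, votes) for the best supported straight line."""
    votes = Counter()  # sparse histogram over the (theta, rho) space
    for x, y in points:
        # Each point votes for every line that could pass through it.
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(t, round(rho / rho_step))] += 1
    (t, rho_bin), count = votes.most_common(1)[0]  # peak of the histogram
    return math.pi * t / n_theta, rho_bin * rho_step, count


# Ten points on the vertical line x = 5, plus one outlier.
points = [(5, y) for y in range(0, 100, 10)] + [(40, 3)]
print(hough_line(points))
```

The outlier contributes votes elsewhere in the parameter space but does not move the peak, which is the sense in which the method is robust.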

Image
An image is a two dimensional spatial representation of a group of "objects" (or "scene") which exists in two or more dimensions. It is an intuitive way of presenting data for computer interfaces in the area of graphics, but in machine vision it may be defined as a continuous function of two variables defined within a bounded (generally rectangular) region.

Image Arithmetic
Simple grey level processing which computes an output image from a set of input images from basic arithmetic manipulations of the grey levels at equivalent pixel locations. The process is therefore a "point operator".

Image Processing
The computation of new images from a set of input images for the purpose of making some visual aspects of the image more explicit. It should be distinguished from "Computer Vision" which is a broader area and generally requires working with a far broader range of concepts, data structures and algorithms.

Image Pyramid
A set of copies of an image in which both sample density and resolution are decreased in regular steps. The bottom level is the original image and each successive level is obtained by a filtering operation followed by a sampling operator. The most common forms of pyramid are Gaussian, morphological and Laplacian.

Image Reconstruction
The process of reconstructing a noise free uncorrupted image from a corrupted one. Generally achieved using an iterative (often "relaxation") algorithm that makes use of knowledge about local image behaviour in order to obtain the best approximation to the data, which would have been most likely to give rise to the observed data set while simultaneously having the expected local characteristics.

Image Segmentation
The process of assigning classification groups to an image (often on a pixel by pixel basis) in order to isolate (segment) particular regions of scene structure. Often done as a precursor to higher level interpretation such as recognition or measurement (achieved via the processes of "representation" and "classification").

Image Smoothing
The process of removing some quantity of the high frequency components from the "frequency domain" representation of an image. Could equally well be referred to as "low-pass filtering".

Image Transformation
The process of generating a new image via the process of transforming the pixel coordinate system. There are several simple forms of this including "bi-linear" and "affine" warping, and "image rotation". All such methods require a process of "interpolation" in order to estimate the grey level values at "non-lattice" sites from the values in the input image.

Image Understanding
The process of extracting a real understanding of a scene from the contents of an image. Perhaps in terms of scene content ("objects") but also relational aspects (on top of, in front, etc.) and in terms of characteristic behaviours (doing, and consequences).

Image Interpolation
The process of estimating an intermediate value from a discrete set of data points. For images this is generally done by approximating a local region of the image with some low parameter function and using this function to estimate a value at the intermediate location. The simplest form is known as "bi-linear" interpolation and the commonly accepted standard is known as "sinc" interpolation. This latter technique effectively uses a "Fourier domain" model, though it is implemented as a "spatial domain" linear filter.
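
Bi-linear interpolation, the simplest case mentioned above, can be sketched as follows. The function name `bilinear` is an assumption of this example, which expects coordinates inside the image (0 <= x <= columns - 1, 0 <= y <= rows - 1) and does no bounds checking beyond clamping the upper neighbour.

```python
def bilinear(image, x, y):
    """Estimate the grey level at a non-lattice position (x, y).

    x is the column coordinate and y the row coordinate.
    """
    x0, y0 = int(x), int(y)               # lattice site above and to the left
    x1 = min(x0 + 1, len(image[0]) - 1)   # clamp at the image border
    y1 = min(y0 + 1, len(image) - 1)
    fx, fy = x - x0, y - y0               # fractional offsets in [0, 1)
    # Interpolate along x on the two rows, then along y between them.
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


image = [[0, 10],
         [20, 30]]
print(bilinear(image, 0.5, 0.5))  # centre of the four pixels
```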

Kalman Filter
A statistical process which estimates an optimal set of parameters based on the log-likelihood function. It is effectively an approach to least-squares which allows the data to be combined into the solution estimate in a sequential fashion. At each stage the optimal solution is estimated from the data accumulated up to that point. Update equations are used to include successive points based on the "covariance" of the parameters.
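
The sequential update can be sketched for the simplest possible case: a single constant quantity measured repeatedly with independent Gaussian errors. This scalar example is an assumption made for illustration (a full Kalman filter works with parameter vectors, covariance matrices and usually a dynamic model), and the function name `sequential_update` is hypothetical.

```python
def sequential_update(estimate, variance, measurement, measurement_variance):
    """Fold one new measurement into the current estimate.

    Returns the updated (estimate, variance) pair.
    """
    # The gain weights the new data by the relative accuracy of the
    # current estimate and the incoming measurement.
    gain = variance / (variance + measurement_variance)
    new_estimate = estimate + gain * (measurement - estimate)
    new_variance = (1 - gain) * variance
    return new_estimate, new_variance


# Start from the first measurement, then add the others one at a time.
estimate, variance = 2.0, 1.0
for measurement in [4.0, 6.0]:
    estimate, variance = sequential_update(estimate, variance, measurement, 1.0)
print(estimate, variance)
```

With equal measurement variances the final estimate is the mean of the three values, as a batch least-squares solution would give, and the variance falls to one third of that of a single measurement.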

Knowledge Based Computer Vision
A particular approach to solving the "computer vision" problem by representing prior information in the form of conceptual knowledge (in the sense of Artificial Intelligence) rather than low level correlations between observed variables.

Laplacian Operator
A simple linear convolution operator which uses as its kernel values which compute a discrete approximation to the second derivative (summed in two orthogonal directions) at each point in the image.
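
One widely used discrete approximation is the five-point mask with -4 at the centre and 1 at the four nearest neighbours. The sketch below assumes that particular mask and evaluates it only at interior pixels; the function name `laplacian` is chosen for this example.

```python
def laplacian(image):
    """Apply the five-point Laplacian to the interior of a 2D list image."""
    output = []
    for r in range(1, len(image) - 1):
        out_row = []
        for c in range(1, len(image[0]) - 1):
            # Second difference along x plus second difference along y.
            out_row.append(image[r][c - 1] + image[r][c + 1]
                           + image[r - 1][c] + image[r + 1][c]
                           - 4 * image[r][c])
        output.append(out_row)
    return output


# Grey levels follow x squared, whose second derivative is 2 everywhere.
parabola = [[0, 1, 4, 9]] * 3
print(laplacian(parabola))
```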

Least Squares
The method of least squares is a "statistical method" based on "maximum likelihood" for estimating the parameters of a model from a set of data. The method is computationally simple and often leads to closed form (ie: non-iterative) solutions. Estimation of the optimal parameters is achieved by minimising a least squares sum of residuals between the observed data and the model prediction. This functional form is justified by "probability" theory on the basis that each piece of observed data has independent errors drawn from a "Gaussian" distribution. The method is generally not suitable for data which may be contaminated by other sources of error, such as "outliers". Solutions to this type of problem are generally referred to as "robust" and include the "hough transform". These problems can never be solved in closed form.
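
A closed form solution can be sketched for the straight line model y = slope * x + intercept, assuming errors in y only and equal weights for all points. The function name `fit_line` is an assumption of this example.

```python
def fit_line(points):
    """Least squares fit of y = slope * x + intercept to (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Setting the derivatives of the summed squared residuals to zero
    # gives these two expressions directly, with no iteration.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


print(fit_line([(0, 1), (1, 3), (2, 5)]))
```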

Length and Orientation
For extended "features" or "objects" the properties of length and orientation can be defined. Orientation is measured directly as an angle in the image co-ordinate system (generally with vertical defined as zero). Length is defined as the maximum extension of the feature (eg: connected region) in any direction. Like "contrast" these features are often useful for the processes of matching and "correspondence".

Linear Discriminant Analysis
A method for assigning one of two classifications to a particular vector of measurements by projecting that vector onto the line of best separation between the two classes. In doing so it effectively defines a high dimensional plane as the decision boundary. This plane is determined from sample data as that which gives the best classification performance for that data.

Localisation
Any "feature" within an image has a location. Localisation is the process of defining for a particular (perhaps even extended) feature a single co-ordinate and perhaps also spatial extent for use in later processing. Small features such as "corners" and "edges" can generally be located to fractions of a pixel. Larger features, such as entire complex objects, may be more difficult to locate accurately or even find using software.

Low-pass Filtering
The process of selectively attenuating the high frequency components of a signal.

Linear Structures
Though this term obviously applies to lines it can also be used to describe any group of "features" in an image which show a distinct linear (straight line) trend associated with their "localisation".

Machine Vision
Like "computer vision" but generally more closely associated with its use in robotics.

Mathematical Method
A group of mathematical steps which describe the solution for a particular group of desired quantities. To be distinguished from "numerical methods" which also have to take into account the practical problems of computational stability and "statistical methods" which take into account the accuracy of the source data. Generally, good algorithms are based on statistical methods backed up by sound numerical implementations, not mathematical solutions. This is one of the reasons why machine vision cannot be reliably solved as a problem of inverse optics, even when there is enough information in the images to solve all of the relevant equations.

Maximum Likelihood
A likelihood is a particularly useful result from "probability theory" which allows the relative merits of particular model parameter hypotheses to be assessed. For a set of independent measurements and a model hypothesis, it is the log probability that the data could have been generated by the model with a particular set of parameters. This likelihood is often optimised to find the maximum likelihood combination of parameters. This can be done either by "fitting" or related techniques such as the "hough transform". The simplest form of maximum likelihood estimation is known as "least squares" fitting, which forms the basis of many statistical methods and algorithms.

Median Filtering
Median filtering is a non-linear image processing technique which is often used to reduce image noise (particularly "drop-out") in a way which preserves edge discontinuities. It involves replacing each pixel in an image with the median (middle value) grey level value from a local neighbourhood of pixels.
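
A minimal sketch with a square neighbourhood is shown below. The function name `median_filter` and the `radius` parameter are assumptions of this example; windows are clipped at the image border, and `statistics.median_low` is used so that the output is always one of the original grey levels.

```python
from statistics import median_low


def median_filter(image, radius=1):
    """Replace each pixel with the median of its local neighbourhood."""
    rows, cols = len(image), len(image[0])
    output = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out_row.append(median_low(window))
        output.append(out_row)
    return output


# A uniform patch with a single drop-out pixel in the middle.
noisy = [[5, 5, 5],
         [5, 0, 5],
         [5, 5, 5]]
print(median_filter(noisy))
```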

Model Based Computer Vision
Model based computer vision is the approach to computer vision which represents knowledge about the world in the form of parametric models such as wireframes or surfaces. It is particularly appropriate for man-made objects which can be defined with relatively simple (CAD like) geometric "features". It is obviously less suitable for unstructured natural environments.

Morphological Sizing
See "sieving".

Motion
Motion is the area of "computer vision" associated with extracting quantitative measurements from sequences of images, relating to physical movement in the underlying scene. This can take the form of a complete (full image) "flow field" of velocity vectors at each pixel, or a more sparse set of movements between corresponding "features" or "objects". Obtaining an accurate and robust "flow field" from images with a single generic algorithm is an unsolved problem, but much can be achieved with sparse data. From an algorithmic point of view, motion can be considered as the temporal analog of stereo "correspondence".

Non-maximal Suppression
After the process of "enhancement" edge features are generally "localised" using this technique. The simplest method compares local evidence supporting an edge with its surrounding pixels and labels it as an edge if it is above a "threshold" and if it is larger than at least n neighbours. More complicated versions also use "hysteresis" thresholding to extract extended "connected" features.
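
The simplest method described above can be sketched as follows, using the eight surrounding pixels as the neighbours. The function name, the default `n=6` and the decision to skip border pixels are assumptions of this example; hysteresis thresholding is not shown.

```python
def non_maximal_suppression(strength, threshold, n=6):
    """Mark interior pixels that exceed the threshold and at least
    n of their eight neighbours in the edge strength image."""
    rows, cols = len(strength), len(strength[0])
    edges = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            value = strength[r][c]
            if value <= threshold:
                continue
            beaten = sum(
                1
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
                if (dy or dx) and value > strength[r + dy][c + dx]
            )
            if beaten >= n:
                edges[r][c] = 1
    return edges


# A vertical ridge of edge strength: only its crest should survive.
ridge = [[0, 1, 5, 1, 0]] * 3
print(non_maximal_suppression(ridge, threshold=2))
```

Choosing n = 6 rather than 8 lets a pixel on a ridge be accepted even though its two neighbours along the ridge are equally strong.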

Numerical Methods
The area of computing associated with mathematical programming which deals with problems such as numerical stability, accuracy and computational efficiency.

Neighbourhood Operators
A neighbourhood operation is one in which the output pixel values depend only upon input image values from a neighbourhood surrounding the output pixel position.

Objects
A conceptual entity in the scene which may be "recognised" and "localised" in an image as a set of spatially organised "features".

Opening and Closing
Opening and closing are morphological operations composed of successive "erosion" and "dilation" operations. They can be defined either for grey level or binary images. Opening an image with a disk shaped structuring element smooths contours, breaks narrow isthmuses and eliminates "features" smaller than the structuring element. Closing with the same element smooths contours, fuses narrow breaks, and fills holes smaller than the structuring element. These operators can thus be used to construct a "pyramid" based representation of image content, with "features" of particular ranges of scale represented at each level.

Optical Flow
Optical flow is the name given to the body of research and algorithms aimed at extracting full flow fields describing the "motion" in an image. Unambiguous estimation of motion can only be done at limited locations (generally "corners") in an image. Even "edges" can only give one component of the flow vector due to the "aperture problem". Extracting a full flow field therefore requires some degree of high level knowledge which is application dependent.

Pattern Recognition
Pattern recognition is the process of assigning a pattern classification to a particular set of measurements, normally represented as a high dimensional vector. This is normally done within the context of "probability theory", whereby a particular set of assumptions regarding the expected statistical distribution of measurements is used to compute classification probabilities which can be used as the basis for a decision such as the "Bayes decision rule". There are several popular forms of classifier including "k-nearest neighbour", "parzen windows", "mixture methods" and more recently "artificial neural networks".

Pin Hole Camera Model
The pinhole camera is the name given to the mathematical model which describes the geometric aspects of image formation in terms of a perspective projection.

Pixels and Voxels
A pixel is a small region of an image defined by a unique row column location within a 2D image. A voxel is the analog for 3D volume data sets.

Point Operators
A point operator is an image operator in which the output image value at each location depends only upon the input image value at the same location ("pixel"). See for example "image arithmetic" and "neighbourhood operator".

Principal Component Analysis
Principal component analysis is the technique used to represent the expected dimensions of variation of a high dimensional vector data set. This is done by finding a limited set of orthogonal axes which describe the major directions of variation (statistical variance). The method is mathematically equivalent to an eigen vector decomposition and is often performed using the technique known as Singular Value Decomposition (SVD). As well as having applications in "object" location and for direct solutions to "least squares" problems, the technique can be used as the basis of noise filtering algorithms.
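
For two dimensional data the eigen vector decomposition of the covariance matrix has a closed form, which makes a compact illustration possible. The sketch below assumes 2D points only (higher dimensional data would normally use an SVD routine from a numerical library), and the function name `principal_axis_2d` is hypothetical.

```python
import math


def principal_axis_2d(points):
    """Return (angle, major_variance, minor_variance) for 2D points.

    angle is the direction of greatest variance, in radians from the x axis.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the 2x2 covariance matrix.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Closed form eigen decomposition of a symmetric 2x2 matrix.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    centre = (sxx + syy) / 2
    spread = math.hypot((sxx - syy) / 2, sxy)
    return angle, centre + spread, centre - spread


# Points lying along the diagonal y = x.
print(principal_axis_2d([(0, 0), (1, 1), (2, 2)]))
```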

Probability Theory
The area of mathematics which deals with the definition and manipulation of probabilities for the purposes of statistical reasoning. Probabilities have been formally defined in mathematics by a series of axioms (eg: a number between 0 and 1) but in any practical application they must be a direct (empirical) measure of how often a particular event will be observed under a specified set of conditions. Probability theory is the only self consistent theory for data analysis, and therefore forms the basis of all statistical methods.

Quantisation
The process during image formation involving losing information due to converting a continuous value to a discrete digital one. See "Digitisation".

Rank Order Filtering
A non-linear image filtering operation which creates an output image from another by replacing grey level values with rank values from a defined region around the input pixel.

Receiver Operator Characteristic
All classification algorithms (including feature detectors which effectively classify regions of the image as "feature" or non-feature) require a threshold which controls the task of statistical decision making based on the evidence for the interpretation. Different thresholds yield different results. The ROC curve provides a way of comparing the performance of classification across the whole range of detection thresholds. It is a function of detection rate against false-alarm rate for a detection task. An ROC curve which is entirely below the curve from another detection algorithm is worse under all circumstances and is therefore evidence of an inferior technique for that set of data.
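
The construction of the curve can be sketched by sweeping the threshold over the observed scores. The function name `roc_points`, the convention that a score at or above the threshold counts as a detection, and the example data are all assumptions made for illustration.

```python
def roc_points(scores, labels):
    """Return (false_alarm_rate, detection_rate) pairs, one per threshold.

    scores : evidence for the "feature" interpretation, one per sample
    labels : 1 where a feature is truly present, 0 otherwise
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    # Lowering the threshold accepts more samples, so both rates rise.
    for threshold in sorted(set(scores), reverse=True):
        detections = sum(1 for s, l in zip(scores, labels)
                         if s >= threshold and l)
        false_alarms = sum(1 for s, l in zip(scores, labels)
                           if s >= threshold and not l)
        points.append((false_alarms / negatives, detections / positives))
    return points


scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
print(roc_points(scores, labels))
```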

Recognition
The process of recognising the presence of "objects" in an image. See "representation". Recognition should not be confused with "localisation", which generally starts by assuming that a particular "object" is present in an image.

Rectification
The process of applying a projective warp to a pair of stereo images in order to align their epi-polar lines so that they are parallel to the x axis.

Region Growing
An intermediate step in the process of many "image segmentation" algorithms which aims to merge adjacent regions with similar characteristics in order to obtain a simpler and hopefully more correct interpretation of the data.

Region of Interest
A restricted area within an image to which processing effort is directed in order to speed up analysis.

Region Representation
Techniques for representing "segmented" regions of images, for example in terms of area, boundary or chord representations. This includes simple measures such as "area", "length" and orientation as well as more complete methods such as "moments" and "Fourier descriptors".

A computational mechanism which involves the repeated application of a "neighbourhood operator" to a set of image values in order to obtain a mutually consistent (convergent) interpretation.

Replacement Noise
The process which results in image "grey values" being replaced with values unrelated to scene content. The simplest example is where the data value is replaced by zero and is known as "drop-out".

Representation
The process of describing a set of "feature" data as a set of parametric measurements for the purposes of making explicit those characteristics relevant to a recognition task and those that are not. Generally it is the first step in the process of "recognition" and it is followed by "classification". The representation must therefore be suited to the statistical assumptions used in the classification step. Unwanted characteristics are often referred to as invariances, as the classification process is required to be invariant to these categories of data variation (eg: rotation, translation, scale and illumination). A representation is "complete" if the original data can be reconstructed from the representation parameters up to the ambiguity of the invariances. The recognition process will subsequently be "optimal" if the statistical assumptions used in the classification step are correct and use is made of all of the available data in the representation. Much of the process of "Image Understanding" is involved with defining adequate representations for image characteristics such as shape and texture. The "scope" of a representation refers to the range of objects that can be represented.

Resolution
A generic term that describes how well a system can measure an isolated "object" consisting of separate, closely spaced "features" or lines. Resolution may be a function of contrast and spatial position as well as object shape.

RGB and HS Colour Space
See "Colour Images".

Run Length Coding
The process of representing a sparse binary image as a sequence of distances between successive 0's or 1's for the purposes of compression.
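
A minimal sketch of the idea for a single image row is given below. The function names and the choice to store the value of the first pixel alongside the run lengths are assumptions made for this illustration, and other encodings are equally valid.

```python
def run_length_encode(bits):
    """Encode a row of 0/1 pixels as (first value, list of run lengths)."""
    if not bits:
        return 0, []
    runs = []
    current, length = bits[0], 0
    for bit in bits:
        if bit == current:
            length += 1
        else:
            # The value changed: close the current run and start a new one.
            runs.append(length)
            current, length = bit, 1
    runs.append(length)
    return bits[0], runs


def run_length_decode(first, runs):
    """Rebuild the row of pixels from its run length code."""
    bits, value = [], first
    for length in runs:
        bits.extend([value] * length)
        value = 1 - value  # runs alternate between 0 and 1
    return bits


row = [0, 0, 0, 1, 1, 0, 0, 0, 0, 1]
first, runs = run_length_encode(row)
```

Here the ten pixels of `row` are stored as the starting value 0 and the four run lengths 3, 2, 4 and 1. The saving grows with the sparseness of the image.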

Sampling Theorem
The theorem which relates the spacing of a sampling mesh in the "spatial domain" to the highest frequency component of the data that can be reliably computed in the "Fourier domain". Sampling a signal which contains frequency components above the limit defined by the sampling theorem results in "aliasing".
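
Aliasing can be demonstrated with a one-dimensional sine wave. In this sketch the function names and the example frequencies are assumptions chosen for illustration: a 7 Hz sine sampled at 10 Hz (limit 5 Hz) produces exactly the same samples as a sign-inverted 3 Hz sine, so the two cannot be told apart after sampling.

```python
import math


def sample_sine(freq_hz, rate_hz, count):
    """Sample sin(2*pi*f*t) at `count` instants spaced 1/rate_hz apart."""
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]


def alias_frequency(freq_hz, rate_hz):
    """Apparent frequency after sampling, folded into [0, rate_hz / 2]."""
    folded = freq_hz % rate_hz
    return min(folded, rate_hz - folded)


rate = 10                              # samples per second, so the limit is 5 Hz
high = sample_sine(7, rate, 20)        # above the limit
low = sample_sine(3, rate, 20)         # below the limit
apparent = alias_frequency(7, rate)    # the 7 Hz component appears at 3 Hz
```

The 3 Hz signal, which lies below the limit, is unaffected by the folding.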

Scale Space
An abstract concept of image content which makes explicit the occurrence of "features" at particular ranges of scales.

Shape Modelling
The process of producing a limited parameter description of the allowable ranges of shapes for a particular "object". This group may include deformation or shape change due to projection transformation.

Application of a combination of morphological filters to remove structures of less than a given size. See "opening and closing" and "image pyramids".

Spatial Domain
The representation of the image in which distances between pixels are measures of linear length, as distinct from the "frequency domain" representation in which distances between pixels are indicators of frequency and phase differences.

Split and Merge
A particular pixel-based approach to "image segmentation" whereby the best image segmentation is arrived at by iteratively combining the processes of "region growing" and sub-division.

Statistical Methods
The body of computational techniques which take into account expected or observed distributions in data sets in order to determine parametric values or solutions. In doing so these techniques are generally more stable than direct "mathematical methods".

Stereo Matching
The process of establishing correspondence between either pixel locations or extracted image "features" in a pair of stereo images. Algorithms can be broadly classified as area based or feature based. The former often attempt to extract dense stereo data (which is often inaccurate anywhere but at "edges") and the latter extract sparse but more reliable data.

Structuring Element
A structuring element has a role in morphology which is exactly analogous to the kernel in a convolution operation. It is a function defined as a spatial pattern used by the operator at each location in the image. For grey level dilation the operation is the maximum of all possible sums of pixel values with the structuring element. For grey level erosion it is the minimum of all possible differences. For binary images the sum is replaced by the product.
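
Grey level dilation and erosion with a structuring element can be sketched as below. The function name `grey_morph`, the representation of the element as a dictionary mapping (row, column) offsets to values, and the rule that offsets falling outside the image are ignored are all assumptions made for this illustration. The element is assumed to contain the origin (0, 0).

```python
def grey_morph(image, element, dilate=True):
    """Grey level dilation (max of sums) or erosion (min of differences).

    image   -- list of rows of grey level values
    element -- dict mapping (d_row, d_col) offsets to element values;
               assumed to include the origin (0, 0)
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            values = []
            for (dr, dc), weight in element.items():
                # Dilation uses the reflected element, erosion the element itself.
                rr, cc = (r - dr, c - dc) if dilate else (r + dr, c + dc)
                if 0 <= rr < rows and 0 <= cc < cols:
                    pixel = image[rr][cc]
                    values.append(pixel + weight if dilate else pixel - weight)
            out[r][c] = max(values) if dilate else min(values)
    return out


# A flat (all-zero) cross-shaped structuring element.
cross = {(0, 0): 0, (-1, 0): 0, (1, 0): 0, (0, -1): 0, (0, 1): 0}
spot = [[0, 0, 0],
        [0, 5, 0],
        [0, 0, 0]]
dilated = grey_morph(spot, cross, dilate=True)
eroded = grey_morph(spot, cross, dilate=False)
```

With a flat element the operations reduce to a local maximum and a local minimum: the single bright pixel grows into the shape of the cross under dilation and disappears under erosion.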

Texture Analysis
Texture is a term used with reference to the spatial distribution of image intensities. It can be described in terms of dimensions of uniformity (symmetry), density, coarseness, roughness, regularity, intensity and directionality. This number of possible degrees of variation, and the difficulty in defining a quantitative method for extracting them, make texture representation and classification problematic, particularly when compounded with scale, illumination and surface shape.

Thinning and Thickening
Thinning is a symbolic image neighbourhood operation that deletes, in some symmetric way, all interior border pixels of a region that do not disconnect the region. Successive applications of a thinning process reduce a region to a set of arcs that constitute a skeleton. Thickening aggregates all background pixels near enough to a region into the region.

Thresholding
Thresholding converts a grey level image into a binary image by setting all pixel values above a threshold to 1 and all those below to 0.
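
In code this is a single comparison per pixel. The sketch below assumes an image stored as a list of rows. The function name `threshold` and the decision to map values exactly equal to the threshold to 0 are choices made here, since the definition leaves that case open.

```python
def threshold(image, level):
    """Return a binary image: 1 where the grey level exceeds `level`, else 0."""
    return [[1 if value > level else 0 for value in row] for row in image]


grey = [[12, 200, 90],
        [128, 35, 129]]
binary = threshold(grey, 128)
```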

Zero Crossing
A zero point in the second derivative of a function. Zero crossings are generally used to define edge features, as they correspond to locations of maximum gradient.
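
For a sampled one-dimensional signal the idea can be sketched with a discrete second difference. The function name `zero_crossings` and the convention of reporting the sample to the left of each sign change are assumptions made for this illustration. Samples where the second difference is exactly zero are not treated specially in this sketch.

```python
def zero_crossings(signal):
    """Return indices i such that the second difference of `signal`
    changes sign between samples i and i + 1."""
    # second[k] approximates the second derivative at signal index k + 1.
    second = [signal[i - 1] - 2 * signal[i] + signal[i + 1]
              for i in range(1, len(signal) - 1)]
    crossings = []
    for k in range(len(second) - 1):
        if second[k] * second[k + 1] < 0:
            crossings.append(k + 1)
    return crossings


# A smoothed step edge: the steepest rise lies between samples 3 and 4.
edge = [0, 0, 1, 4, 8, 9, 9]
found = zero_crossings(edge)
```

The reported position coincides with the largest first difference in `edge` (from 4 to 8), which is where the gradient is greatest.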

(c) Imaging Science and Biomedical Engineering 2000 [paul.bromiley@man.ac.uk]
