A


Adaptive Thresholding


The process of thresholding with the automatic local selection of
a threshold value. Generally this value is estimated from local
image content.
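
As a minimal sketch of the idea, assuming the local estimate is the mean of a square window (the function name, the `radius` and `offset` parameters and the test image are illustrative choices, not part of the definition):

```python
import numpy as np

def adaptive_threshold(image, radius=1, offset=0.0):
    """Label a pixel 1 when it exceeds the mean of its local window minus offset."""
    image = np.asarray(image, dtype=float)
    padded = np.pad(image, radius, mode="edge")  # replicate border pixels
    out = np.zeros(image.shape, dtype=np.uint8)
    size = 2 * radius + 1
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            window = padded[r:r + size, c:c + size]
            out[r, c] = image[r, c] > window.mean() - offset
    return out

# A bright spot on a background whose brightness ramps from left to right.
ramp = np.tile(np.arange(0, 50, 10, dtype=float), (5, 1))
ramp[2, 1] += 30
# A negative offset raises the threshold above the local mean.
mask = adaptive_threshold(ramp, radius=1, offset=-5.0)
```

Here only the spot is labelled, whereas a single global threshold low enough to catch the spot would also label the bright right hand side of the ramp.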


Additive and Multiplicative Noise 

The name given to the dominant dependency of the random noise
process on the observed signal.


Aperture Problem

The problem encountered in "motion" analysis that extended
linear "features" cannot provide local information regarding motion
in any direction other than perpendicular to the line.


Area 

Total number of pixel locations in a segmented region.


Artificial Neural Networks 

A name given to the computational models implemented on
computers (generally in software) which mimic certain
aspects of biological brain function. They are generally
distinguished according to their "training" paradigm as either
"supervised" or "unsupervised" which are used to adjust free
parameters within the system in order to achieve a particular task.
Under certain circumstances ANNs can be shown to be "Bayes Optimal".

B


Bayes Decision Rule 

A classification rule which assigns a classification category on the
basis of a set of evidence according to the class model whose
conditional probability (given the evidence) is highest.
The resulting "Bayes Error" rate is the theoretical optimal performance
for a forced classification task. See "Bayes Theory".


Bayes Theory 

An equation derived from "probability theory" which allows the
probability to be computed that a particular process
generated a particular set of data. It involves the use of both
"conditional" and "prior" probabilities. The former can generally
be computed from a model or sample data. The latter is the
expected relative frequency that a particular process would have
occurred.
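
The calculation can be sketched as follows. The prior and conditional values are hypothetical numbers chosen for illustration, and the final line applies the "Bayes Decision Rule" by picking the class with the highest resulting probability:

```python
def posterior(priors, likelihoods):
    """Return P(class | data) for each class via Bayes' theorem."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # P(data), the normalising constant
    return [j / evidence for j in joint]

priors = [0.9, 0.1]        # hypothetical prior probabilities of two classes
likelihoods = [0.2, 0.8]   # hypothetical P(data | class) for the same classes
post = posterior(priors, likelihoods)
# Bayes decision rule: choose the class with the highest posterior.
best = max(range(len(post)), key=post.__getitem__)
```

Note that the first class is still preferred here even though its conditional probability is lower, because its prior is much larger.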


Biological Vision 

The process of modelling the underlying computational
aspects behind human (or other) visual
perception. The existence proof that "image understanding"
must be possible and one approach to guiding the design
of "computer vision" systems.


Blurring 

The process in image formation which removes predominantly
high frequency components from the image. The physical analog
of "Image Smoothing".

C


Calibrated Stereo 

The name given to stereo algorithms which return accurate metric
information regarding the real 3D structure of a scene. This
should be contrasted with uncalibrated stereo algorithms which solve
the correspondence problem but return geometrical measurements
only up to an unknown transformation.


Classification 

The process of assigning one of a limited set of alternative
interpretations to (the generator of) a set of data.
Often requires the computation
of relative probabilities (or a quantity related to them)
followed by the application of a
decision rule. All classification processes can be evaluated
in terms of "detection" and "misclassification" rates. See
"receiver operator characteristic".


Colour Images 

Colour in images is generally represented as a triplet of values
at each image location ("pixel") in one of either "chromaticity", "hue saturation"
or "YIQ" representations. These representations can be interconverted
but "hue saturation" is the one which is more directly useful in image
interpretation as it separates intensity (which relates closely to
surface shape) from chromatic properties (which relate to scene
content).


Complex Images 

Complex images are represented as a pair of values at each image
location ("pixel") corresponding to the real and imaginary parts of
a complex number. Such images are generated by
"image processing" algorithms such as the 2D "Fourier Transform"
though may also provide a useful representation for image data
in a wide variety of algorithms.


Computer Vision 

Computer Vision is the subject area which deals with the automatic
analysis of images for the purposes of quantification or system
control (often mimicking tasks which humans find trivial).
It is to be distinguished from "Image Processing" which
deals only with the computational processes applied to images,
including enhancement and compression, but does not deal with
abstract representation for the purposes of reasoning and interpretation.
Computer Vision can be seen as the inverse of Computer Graphics,
though generally the representations and methods of this area
are not of use in Computer Vision due to the incomplete and
therefore ambiguous nature of images. This requires
prior knowledge to be used in order to obtain robust
scene interpretation.


Connected Region 

A set of "features" within a scene where the property of
"connectivity" is defined between all features.


Connectivity 

This is defined on a binary image of labelled (or classified) pixels
as the property of being able to find a path between two similar
labels by moving only along adjacent similarly labelled pixels.
Adjacency can be defined in several ways according to the
image lattice, in particular rectangular lattices can have
four-way or eight-way adjacency depending on how diagonal pixels are
treated. Hexagonal lattices have six-way adjacency.
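
The effect of the adjacency choice on a rectangular lattice can be sketched by counting connected regions with a breadth first search. The function name and test image are illustrative assumptions:

```python
from collections import deque

def count_regions(binary, eight_way=False):
    """Count connected regions of 1-pixels under four-way or eight-way adjacency."""
    rows, cols = len(binary), len(binary[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if eight_way:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]  # diagonal neighbours
    seen = [[False] * cols for _ in range(rows)]
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            regions += 1  # unvisited labelled pixel starts a new region
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return regions

diagonal = [[1, 0, 0],
            [0, 1, 0],
            [0, 0, 1]]
four = count_regions(diagonal)                    # three isolated pixels
eight = count_regions(diagonal, eight_way=True)   # one diagonal line
```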


Contrast 

A generic term relating to the degree of visual difference
between
a particular "object" in an image and the rest of the scene
or background. In order to make this concept quantitative,
using for example grey levels, account must be taken of
the variation in the "features" or measurement accuracy.
The contrast within an image can often be changed for the
purposes of visualisation, though this will have no effect
on the true information content of the image from the
point of view of processing. Contrast is often used as
a property of a feature for the purposes of matching or
correspondence.


Contrast Manipulation 

The adjustment of image grey levels to enhance the visual
appearance of certain "objects" by increasing the difference
between their grey level values and those of their surroundings.
Achieved by "mapping" the initial grey scale onto a different
set of values. The most common form of this is "histogram equalisation".


Convolution 

Convolution is a linear operator. It involves the construction of
an output image by computing each pixel as a weighted sum
of a local region of the input image with a fixed array or
"convolution mask". The process is mathematically equivalent
to many physical processes which occur during image formation,
particularly optical blurring, though it is also useful for a
variety of tasks including noise filtering.
The computational requirements for large
masks have led to the development of several approaches for
efficient implementation, in particular Fourier techniques
based on the "Convolution Theorem". This theorem also makes possible the process
of "deconvolution", which can be used for image enhancement (particularly
deblurring).
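
A direct (spatial domain) implementation of the weighted sum can be sketched as below. It assumes an odd sized mask and zero values outside the image, both of which are choices made for this illustration:

```python
import numpy as np

def convolve2d(image, mask):
    """Convolve image with an odd-sized mask, treating outside pixels as zero."""
    image = np.asarray(image, dtype=float)
    mask = np.asarray(mask, dtype=float)
    flipped = mask[::-1, ::-1]  # convolution flips the mask; correlation does not
    mh, mw = mask.shape
    padded = np.pad(image, ((mh // 2, mh // 2), (mw // 2, mw // 2)), mode="constant")
    out = np.zeros_like(image)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            # Weighted sum of the local region under the mask.
            out[r, c] = np.sum(padded[r:r + mh, c:c + mw] * flipped)
    return out

impulse = np.zeros((5, 5))
impulse[2, 2] = 1.0
mask = np.arange(9, dtype=float).reshape(3, 3)
response = convolve2d(impulse, mask)
```

Convolving a single bright pixel reproduces the mask around that pixel, which is a convenient check that the mask orientation is handled correctly.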


Corners 

Edge features can only be localised in one dimension on the image plane
due to the "aperture problem". Corners can effectively be defined
as any image feature which can be located reliably in 2D. This includes
categories of feature detector algorithms referred to as
"interest operators". Corner detectors are often based on discrete
approximations to the product of image curvature and gradient. As
a consequence "corners" can also be defined along edge strings
as points of high curvature.


Correspondence 

The process of associating "features" from two
images of the same scene as belonging to the
same scene feature. This is a fundamental process in both
feature based stereo and motion analysis. The problem is
often complicated by incomplete feature detection (unreliability
or spatial occlusion)
and potential matching ambiguities.


Covariance 

A representation of multiparameter measurement accuracy in the form
of a matrix which defines the size and orientation of a quadratic
log likelihood function.


Curve Fitting 

The process of determining a set of parameters forming the
representation of a continuous curve which "best" describes a discrete
set of data, generally in a "least squares" sense.
The most common form of curves are lines, circles, ellipses,
general conic sections and splines. The process is useful for
data reduction, interpolation and estimation of continuous properties
such as derivatives and has also been used as a starting point for
"object" location.

D


Decision Theory 

See "Bayes Theory".


Deconvolution 

See "convolution".


Difference of Gaussian 

The image formed by subtracting two "convolutions" of the same
image with 2D "Gaussian" kernels at different scales. The process
when considered in the "Fourier
Domain" can be seen to be equivalent to selecting a particular band of
frequency components from the input image. In the spatial domain
the observed result can exhibit properties such as edge enhancement.


Disparity 

This is the name given to the physical separation along an "epipolar"
line between equivalent features in a pair of stereo images. The
value is simply related to distance from the camera via a reciprocal function.
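
For the common special case of two identical, parallel (rectified) cameras the reciprocal relation is depth = focal length x baseline / disparity. The sketch below assumes that geometry; the parameter names and units are illustrative:

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres for a rectified, parallel camera pair (assumed geometry)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

near = depth_from_disparity(10.0, focal_px=500.0, baseline_m=0.1)
far = depth_from_disparity(5.0, focal_px=500.0, baseline_m=0.1)
```

Halving the disparity doubles the estimated distance, so depth resolution degrades quickly for distant features.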


Digitized Images 

A digitized "image" is an array of "grey level", "complex" or "colour"
values which has been formed by
sampling an image at predefined regular (often uniformly spaced)
locations.


Dilating and Eroding 

These nonlinear operators are the basic processes of all morphological
image processing operations.
If the 2D image is considered as a 3D surface, the process of erosion
and dilation can be visualised as etching or coating this surface
(respectively) so that the surface contracts or expands. Mathematically
the process is represented using a structuring element (analogous to
a kernel in convolution) which determines how much to erode or dilate.
A shaped structuring element will have an effect which varies depending
upon the local surface orientation and can be used to extract or remove
particular scales of shaped "features".
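
For binary images the two operators reduce to a local maximum (dilation) and a local minimum (erosion) over the pixels selected by the structuring element. The sketch below assumes a symmetric, odd sized element and zero values outside the image; the helper names are illustrative:

```python
import numpy as np

def _windows(binary, element):
    """Yield (row, col, pixel values under the structuring element)."""
    eh, ew = element.shape
    padded = np.pad(binary, ((eh // 2, eh // 2), (ew // 2, ew // 2)), mode="constant")
    for r in range(binary.shape[0]):
        for c in range(binary.shape[1]):
            yield r, c, padded[r:r + eh, c:c + ew][element == 1]

def dilate(binary, element):
    out = np.zeros_like(binary)
    for r, c, values in _windows(binary, element):
        out[r, c] = values.max()  # set if any covered pixel is set
    return out

def erode(binary, element):
    out = np.zeros_like(binary)
    for r, c, values in _windows(binary, element):
        out[r, c] = values.min()  # set only if all covered pixels are set
    return out

block = np.zeros((5, 5), dtype=int)
block[1:4, 1:4] = 1
square = np.ones((3, 3), dtype=int)
shrunk = erode(block, square)   # only the centre pixel survives
grown = dilate(block, square)   # the block grows by one pixel on every side
```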

E


Edges 

One of the major tasks of "computer vision" is the extraction of
scene structure from the 2D image. The only "features" which are
quantitatively preserved from the scene in the projected scene
are differential discontinuities, ie: places at which there are
large isolated gradients in the image data. These are edges and
this property is referred to as diffeomorphic equivalence.
There are several ways of defining and extracting differential
discontinuities and these result in four characteristic approaches
to extracting edges; gradient, Laplacian, zero crossings and
morphological operations. The detection process also implicitly
defines a "scale" for the detection process, as some edges at one scale
of the image may not be detected or may bifurcate at another.


Edge Convolution Enhancement 

The first stage in many "edge" detectors is a process of enhancement
which generates an image in which ridges correspond to statistical
evidence for an edge. Such a process is often referred to as
edge enhancement and is achieved using linear convolution
operators (often named after their inventors)
including, "Roberts", "Prewitt", and the early stages of "Canny".


Epipolar Line

This is the line in one of a pair of stereo images along which
a physical 3D feature selected from the other image must lie.
It is defined by the plane which passes through the physical 3D
point and the optical centres of the cameras at the line of intersection
with the image planes.
Computing this line thus requires knowledge of the relative transformation
in world coordinates which relates the left and right image planes.

F


Feature 

Any automatically "localisable" component of an image.
The process of localisation implicitly requires a definition
of a set of measurable characteristics.


Feature Matching 

The process of assigning correspondence between two sets of
"features" extracted from two images of the same (or an equivalent)
scene.


Frequency Domain 

The "digital image" can be converted from the "spatial domain" into the Fourier
Domain by applying a 2D Discrete Fourier Transform. In this form
the new frequency domain image is represented with complex (two valued)
pixels, each of which corresponds to the amplitude or strength of
a particular scale and phase of sine curve. Images of repeating patterns
or textures will naturally produce compact peaks in this image and
because of this the Frequency Domain is often a convenient way of
representing scene content.


Fourier Transformation 

The Fourier Transform is a linear operation which computes the
complex coefficients of the "Frequency Domain" representation.
Each term is computed as the correlation of the original signal with
a sine or cosine curve of particular phase and frequency.
The inverse Fourier Transform allows the process to be reversed, and
for discrete data this process is exact. The most commonly used
algorithm for computing the Fourier Transform is the FFT (Fast
Fourier Transform) which breaks the calculation down into a set
of serial stages which minimises the computational effort. This
transform is useful for a wide variety of image processing applications,
including compression, interpolation and "convolution".
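
Both the invertibility of the discrete transform and its use for "convolution" can be checked numerically. The sketch below uses NumPy's FFT routines on random test data (an assumption made for illustration) and compares a Fourier domain product with a directly computed circular convolution:

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((8, 8))
mask = rng.random((8, 8))

# Forward and inverse transforms recover the input up to rounding error.
spectrum = np.fft.fft2(image)
restored = np.fft.ifft2(spectrum).real

# Convolution Theorem: multiplication of spectra equals circular convolution.
via_fft = np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(mask)).real

# Direct circular convolution: sum of shifted copies of the mask.
direct = np.zeros((8, 8))
for u in range(8):
    for v in range(8):
        direct += image[u, v] * np.roll(np.roll(mask, u, axis=0), v, axis=1)
```

Note that the product of discrete spectra corresponds to circular (wrap around) convolution, so practical implementations usually pad the image first.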

G


Gaussian Filtering 

Gaussian Filtering is the process of image "convolution" with a
2D Gaussian "kernel", generally in order to remove noise
or fine scale (high frequency) structure from an image.


Geometric Distortion 

Any process (though generally optical) which results in the
geometric formation of the image differing from a pinhole
model. The most common form is known as "radial distortion"
as the deviation from the standard model has a radial dependency
often referred to as "pincushion" or "barrel" distortion.


Graph Searching 

Though this has a generic meaning outside the field of "Computer Vision"
it is generally encountered here as
a method for incorporating additional knowledge (start and end points
and also smoothness and curvature)
into the process of edge location. Algorithms generally operate
on a discrete lattice (eg pixels) and have a set of rules which define how
optimality criteria should be computed according to local constraints.
The graph search then identifies a solution that in some sense
maximises these criteria.


Grey and Binary Scale Levels 

A grey level is a single scalar value associated with a particular
location in an image. For optical or photographic sensors this value
is proportional to (or at least monotonically related to) the measured signal.
Some sensors or image processing algorithms require multivalued
data such as complex (two valued) or "colour" (three valued) images.
Grey level images generally have integer values in the range 0-63, 0-255
or 0-1023, corresponding to 6-bit, 8-bit, or 10-bit digitisation.
The binary image requires less storage, one bit per pixel, and is generally
used to represent the presence or absence of a particular "feature" at
each point in the image (eg: "edges").


Grey Level Processing 

Grey level processing is the general term given to image processing of
"grey level" image data, as distinct from "binary processing" which
refers to processing of binary (single bit per pixel) images.

H


Highpass Filtering 

The process of selectively attenuating low frequency components
of a signal.


Histograms 

A histogram is an array of non-negative integer
counts from a set of data, which represents the frequency of occurrence
of values within a set of non-overlapping regions. For example,
the image histogram is an array of the frequency of occurrence
of grey levels within a particular set of grey level ranges.


Histogram Modification 

The creation of a new image by the systematic replacement of grey levels with
an alternative set of values.
The process of histogram equalisation is a particular case in
which the resulting image has a flat image histogram.
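
Histogram equalisation can be sketched as a grey level mapping built from the cumulative histogram. The function below assumes a non-negative integer image with values below `levels`; for discrete data the output histogram is only approximately flat:

```python
import numpy as np

def equalise(image, levels=256):
    """Map grey levels through the normalised cumulative histogram."""
    image = np.asarray(image)
    hist = np.bincount(image.ravel(), minlength=levels)
    cdf = np.cumsum(hist) / image.size          # fraction of pixels <= each level
    mapping = np.round(cdf * (levels - 1)).astype(image.dtype)
    return mapping[image]                        # replace every grey level

# A low contrast image using only the four grey levels 100..103.
image = np.repeat(np.arange(100, 104, dtype=np.uint8), 4).reshape(4, 4)
flat = equalise(image)
```

The four closely spaced input levels are spread across the available range while their ordering is preserved.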


Hough Transformation 

The Hough Transform is a particular method for selecting a set of
parametric values from a specific functional model, which
"best" describe a set of data. It works by effectively taking
the peak in a "histogram" defined across the parameter space, where
the entries are made based on which ranges of parameters can be supported
by subsets of the data. The technique can be related to probability
theory and robust statistics (particularly robust fitting)
and works well for low parameter (2-3) models
though has problems of computational complexity with higher orders.
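
As a sketch for the two parameter case of straight lines, written as rho = x cos(theta) + y sin(theta), each point votes for every (rho, theta) cell it could lie on and the peak of the accumulator is taken as the fit. The bin sizes, `rho_max` and the test points are illustrative assumptions:

```python
import numpy as np

def hough_lines(points, n_theta=180, rho_max=150):
    """Accumulate votes over (rho, theta); assumes |rho| <= rho_max for all points."""
    thetas = np.deg2rad(np.arange(n_theta) * (180.0 / n_theta))
    accumulator = np.zeros((2 * rho_max + 1, n_theta), dtype=int)
    for x, y in points:
        # One vote per theta bin, at the rho that puts the point on that line.
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        accumulator[rhos + rho_max, np.arange(n_theta)] += 1
    return accumulator, thetas

# Ten points on the line y = 5 plus two outliers.
points = [(x, 5) for x in range(0, 100, 10)] + [(30, 40), (70, 80)]
acc, thetas = hough_lines(points, rho_max=150)
rho_idx, theta_idx = np.unravel_index(np.argmax(acc), acc.shape)
best_rho = rho_idx - 150
best_theta_deg = np.rad2deg(thetas[theta_idx])
```

The outliers cast votes elsewhere in the parameter space and do not move the peak, which illustrates the robustness mentioned above.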

I


Images 

An image is a two dimensional spatial representation of a
group of "objects" (or "scene") which
exists in two or more dimensions. It is an intuitive way of presenting
data for computer interfaces in the area of graphics, but
in machine vision it may be defined as a continuous function of two
variables defined within a bounded (generally rectangular) region.


Image Arithmetic 

Simple grey level processing which computes an output
image from a set of input images from basic
arithmetic manipulations of the grey levels at equivalent pixel locations.
The process is therefore a "point operator".


Image Processing 

The computation of new images from a set of input images for the
purpose of making some visual aspects of the image more explicit.
It should be distinguished from "Computer Vision" which is a
broader area and generally requires working with a far broader range
of concepts, data structures and algorithms.


Image Pyramid 

A set of copies of an image in which both sample density and resolution
are decreased in regular steps. The bottom level is the original image
and each successive level is obtained by a filtering operation followed by
a sampling operator. The most common forms of pyramid are Gaussian,
morphological and Laplacian.


Image Reconstruction 

The process of reconstructing a noise free uncorrupted image
from a corrupted one. Generally achieved using an iterative (often
"relaxation") algorithm that makes
use of knowledge about local image behaviour in order to obtain
the best approximation to the data, which would have been most likely
to give rise to the observed data set while simultaneously having the expected
local characteristics.


Image Segmentation 

The process of assigning classification groups to an image (often on
a pixel by pixel basis) in order
to isolate (segment) particular regions of scene structure.
Often done as a precursor to higher level interpretation such as
recognition or measurement (achieved via the processes of
"representation" and "classification").


Image Smoothing 

The process of removing some quantity of the
high frequency components from the "frequency domain" representation
of an image. Could equally well be referred to as "lowpass
filtering".


Image Transformation 

The process of generating a new image via the process of transforming
the pixel coordinate system. There are several simple forms
of this including "bilinear" and "affine" warping, and "image rotation".
All such methods require a process of "interpolation" in order
to estimate the grey level values at "nonlattice" sites from the
values in the input image.


Image Understanding 

The process of extracting a real understanding of a scene from the
contents of an image. Perhaps in terms of scene content ("objects") but
also relational aspects (on top of, in front of, etc.) and in terms of
characteristic behaviours (doing, and consequences).


Image Interpolation 

The process of estimating an intermediate value from a
discrete set of data points. For images this is generally
done by approximating a local region of the image with some
low parameter function and using this function to estimate a value
at the intermediate location. The simplest form is known as
"bilinear" interpolation and the commonly accepted standard is
known as "sinc" interpolation. This latter technique effectively uses
a "Fourier domain" model, though it is implemented as a "spatial domain"
linear filter.
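
Bilinear interpolation can be sketched as two linear interpolations along one axis followed by one along the other. The function assumes the requested position lies inside an image of at least 2 x 2 pixels; names are illustrative:

```python
import numpy as np

def bilinear(image, y, x):
    """Estimate the grey level at a non-lattice position (y, x) inside the image."""
    image = np.asarray(image, dtype=float)
    # Top-left pixel of the surrounding 2 x 2 cell, clamped at the far border.
    y0 = min(int(np.floor(y)), image.shape[0] - 2)
    x0 = min(int(np.floor(x)), image.shape[1] - 2)
    dy, dx = y - y0, x - x0
    top = (1 - dx) * image[y0, x0] + dx * image[y0, x0 + 1]
    bottom = (1 - dx) * image[y0 + 1, x0] + dx * image[y0 + 1, x0 + 1]
    return (1 - dy) * top + dy * bottom

grid = [[0.0, 10.0],
        [20.0, 30.0]]
centre = bilinear(grid, 0.5, 0.5)   # average of the four neighbours
corner = bilinear(grid, 1.0, 1.0)   # lattice sites are reproduced exactly
```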

J

K


Kalman Filter 

A statistical process which estimates an optimal set of parameters
based on the log-likelihood function. It is effectively an
approach to least squares which allows the data to be combined
into the solution estimate in a sequential fashion. At each
stage the optimal solution is estimated from the data accumulated
up to that point. Update equations are used to include successive points
based on the "covariance" of the parameters.
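
The sequential character can be illustrated with the simplest possible case: a single constant parameter measured repeatedly with known noise variance and no process model (all of which are simplifying assumptions for this sketch). The update then reproduces the running least squares estimate, which for this case is the mean:

```python
def kalman_update(estimate, variance, measurement, noise_variance):
    """Fold one scalar measurement into the current estimate and its variance."""
    gain = variance / (variance + noise_variance)   # weight given to the new data
    estimate = estimate + gain * (measurement - estimate)
    variance = (1.0 - gain) * variance
    return estimate, variance

measurements = [4.8, 5.3, 5.1, 4.9, 4.9]   # hypothetical readings
noise_variance = 1.0
# The first reading initialises the state with the measurement variance.
estimate, variance = measurements[0], noise_variance
for z in measurements[1:]:
    estimate, variance = kalman_update(estimate, variance, z, noise_variance)
```

After all five readings the estimate equals their mean and the variance has fallen to one fifth of the measurement variance, as a batch least squares fit would give.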


Knowledge Based Computer Vision 

A particular approach to solving the "computer vision" problem
by representing prior information in the form of conceptual knowledge
(in the sense of Artificial Intelligence) rather than low level
correlations between observed variables.

L


Laplacian Operator 

A simple linear convolution operator which uses as its kernel
values which compute a discrete approximation to the second derivative
(summed in two orthogonal directions)
at each point in the image.


Least Squares 

The method of least squares is a "statistical method" based
on "maximum likelihood" for estimating
the parameters of a model from a set of data. The method is computationally
simple and often leads to closed form (ie: noniterative)
solutions. Estimation of the
optimal parameters is achieved by minimising a least squares sum of residuals
between the observed data and the model prediction. This functional
form is justified by "probability" theory on the basis that each piece of
observed data has independent errors drawn from a "Gaussian" distribution.
The method is generally not suitable for data which may be contaminated
by other sources of error, such as "outliers". Solutions to this
type of problem are generally referred to as "robust" and include
the "Hough transform". These problems can never be solved in closed form.
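
A closed form straight line fit, and its sensitivity to a single outlier, can be sketched with NumPy's `np.linalg.lstsq`. The data values are made up for illustration:

```python
import numpy as np

x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0                                  # noise-free line y = 2x + 1
design = np.column_stack([x, np.ones_like(x)])     # columns: slope, intercept

# Minimise the sum of squared residuals between data and model prediction.
slope, intercept = np.linalg.lstsq(design, y, rcond=None)[0]

# Corrupt one point: the fit is pulled strongly towards the outlier.
y_bad = y.copy()
y_bad[9] += 50.0
slope_bad, intercept_bad = np.linalg.lstsq(design, y_bad, rcond=None)[0]
```

One bad point out of ten more than doubles the estimated slope, which is why contaminated data calls for a robust method.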


Length and Orientation 

For extended "features" or "objects" the properties of length
and orientation can be defined. Orientation is measured directly
as an angle in the image coordinate system
(generally with vertical defined as zero).
Length is defined as the maximum extension of the feature
(eg: connected region) in any direction. Like "contrast" these
features are often useful for the processes of matching and
"correspondence".


Linear Discriminant Analysis 

A method for assigning one of two classifications to a particular
vector of measurements by projecting that vector onto the line
of best separation between the two classes. In doing so it effectively
defines a high dimensional plane as the decision boundary. This plane
is determined from sample data as that which gives the best
classification performance for that data.


Localisation 

Any "feature" within an image has a location. Localisation is the process
of defining for a particular (perhaps even extended) feature a single
coordinate and perhaps also spatial extent for use in later processing.
Small features such as "corners" and "edges" can generally be located to fractions
of a pixel. Larger features, such as entire complex objects, may be more difficult
to locate accurately or even find using software.


Lowpass Filtering 

The process of selectively attenuating the high frequency components
of a signal.


Linear Structures 

Though this term obviously applies to lines it can also be used
to describe any
group of "features" in an image which show a distinct linear (straight line)
trend associated with their "localisation".

M


Machine Vision 

Like "computer vision" but generally more closely associated with
its use in robotics.


Mathematical Method 

A group of mathematical steps which describe the solution
for a particular group of desired quantities.
To be distinguished from "numerical methods" which also
have to take into account the practical problems of
computational stability and "statistical methods" which
take into account the accuracy of the source data.
Generally, good algorithms are based on statistical
methods backed up by sound numerical implementations
not mathematical solutions. This is one of the reasons why
machine vision cannot be reliably solved as a problem of inverse optics,
even when there is enough information in the images to solve
all of the relevant equations.


Maximum Likelihood 

A likelihood is a particularly useful result from "probability
theory" which allows the
relative merits of particular model parameter hypotheses to be
assessed. For a set of independent measurements and
a model hypothesis, it is the log probability that the data could have
been generated by the model with a particular set of parameters.
This likelihood is often optimised to find the maximum likelihood
combination of parameters. This can be done either by
"fitting" or related techniques such as the "Hough transform".
The simplest form of maximum likelihood estimation is known as
"least squares" fitting, which forms the basis of many statistical
methods and algorithms.


Median Filtering 

Median filtering is a nonlinear image processing technique
which is often used to reduce image noise (particularly "dropout")
in a way which preserves edge discontinuities.
It involves replacing each pixel in an
image with the median (middle value) grey level value from
a local neighbourhood of pixels.
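
A minimal sketch, assuming a square neighbourhood and replicated border pixels (both illustrative choices), shows a dropout pixel being removed while a step edge is left intact:

```python
import numpy as np

def median_filter(image, radius=1):
    """Replace each pixel with the median of its (2*radius+1)^2 neighbourhood."""
    image = np.asarray(image)
    padded = np.pad(image, radius, mode="edge")   # replicate border pixels
    out = np.empty_like(image)
    size = 2 * radius + 1
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = np.median(padded[r:r + size, c:c + size])
    return out

step = np.zeros((5, 6), dtype=int)
step[:, 3:] = 100            # vertical step edge
noisy = step.copy()
noisy[2, 4] = 0              # a single dropout pixel
cleaned = median_filter(noisy)
```

A linear smoothing filter of the same size would blur the edge and only dilute the dropout rather than remove it.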


Model Based Computer Vision 

Model based computer vision is the approach to computer vision which
represents knowledge about the world in the form of parametric
models such as wireframes or surfaces. It is particularly appropriate
for manmade objects which can be defined with relatively simple
(CAD like) geometric "features". It is obviously less suitable
for unstructured natural environments.


Morphological Sizing 

See "sieving".


Motion 

Motion is the area of "computer vision" associated with extracting
quantitative measurements from sequences of images, relating to
physical movement in the underlying scene. This can take the form
of a complete (full image) "flow field" of velocity vectors at each pixel,
or a more sparse set of movements between corresponding "features"
or "objects".
Obtaining an accurate and robust "flow field" from images with a
single generic algorithm is an
unsolved problem, but much can be achieved with sparse data.
From an algorithmic point of view, motion can be considered as the temporal
analog of stereo "correspondence".

N


Nonmaximal Suppression 

After the process of "enhancement" edge features are generally
"localised" using this technique. The simplest method compares local
evidence supporting an edge with its surrounding pixels
and labels it as an edge if it is above a "threshold" and if
it is larger than at least n neighbours. More complicated versions
also use "hysteresis" thresholding to extract extended "connected"
features.


Numerical Methods 

The area of computing associated with mathematical programming which
deals with problems such as numerical stability, accuracy and
computational efficiency.


Neighbourhood Operators 

A neighbourhood operation is one in which the output pixel values
depend only upon input image values from a neighbourhood surrounding
the output pixel position.

O


Object 

A conceptual entity in the scene which may be "recognised"
and "localised" in an image as a set of spatially
organised "features".


Opening and Closing 

Opening and closing are morphological operations composed of
successive "erosion" and "dilation" operations. They can
be defined either for grey level or binary images. Opening an image
with a disk shaped structuring element smooths contours, breaks narrow
isthmuses and eliminates "features" smaller than the structuring element.
Closing with the same element, smooths contours, fuses narrow breaks,
and fills holes smaller than the structuring element. These operators can
thus be used to construct a "pyramid" based representation of image
content, with "features" of particular ranges of scale represented at each
level.


Optical Flow 

Optical flow is the name given to the body of research and algorithms
aimed at extracting full flow fields describing the "motion" in an image.
Unambiguous estimation of motion can only be done at limited locations
(generally "corners") in an image. Even "edges" can only give one component
of the flow vector due to the "aperture problem". Extracting a full flow
field therefore requires some degree of high level knowledge which is
application dependent.

P


Pattern Recognition 

Pattern recognition is the process of assigning a pattern classification
to a particular set of measurements, normally represented as
a high dimensional vector. This is normally done within the context of
"probability theory", whereby a particular set of assumptions regarding the
expected statistical distribution of measurements is used to compute
classification probabilities which can be used as the basis for a decision
such as the "Bayes decision rule". There are several popular forms
of classifier including "k-nearest neighbour", "Parzen windows",
"mixture methods" and more recently "artificial neural networks".


Pin Hole Camera Model 

The pinhole camera is the name given to the mathematical model
which describes the geometric aspects of image formation in terms of
a perspective projection.


Pixels and Voxels 

A pixel is a small region of an image defined by a
unique row column location
within a 2D image. A voxel is the analog for 3D volume data sets.


Point Operators 

A point operator is an image operator in which the output image value
at each location depends only upon the input image value at the same
location ("pixel"). See for example "image arithmetic" and
"neighbourhood operator".


Principal Component Analysis 

Principal component analysis is the technique used to represent the
expected dimensions of variation of a high dimensional vector data set.
This is done
by finding a limited set of orthogonal axes which describe the major
directions of variation (statistical variance). The method is
mathematically equivalent to an
eigen vector decomposition and is often performed using the technique
known as Singular Value Decomposition (SVD).
As well as having applications in "object" location and for direct solutions
to "least squares" problems, the technique can be
used as the basis of noise filtering algorithms.
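
The computation via SVD can be sketched as follows, using `np.linalg.svd` on mean centred data. The synthetic data set, which lies exactly along the direction (3, 4), is an assumption chosen so that the result is easy to check:

```python
import numpy as np

def principal_components(data):
    """Return orthogonal axes (rows) and the variance along each, largest first."""
    data = np.asarray(data, dtype=float)
    centred = data - data.mean(axis=0)
    # Rows of `axes` are the eigenvectors of the sample covariance matrix.
    _, singular, axes = np.linalg.svd(centred, full_matrices=False)
    variances = singular ** 2 / (len(data) - 1)
    return axes, variances

t = np.linspace(-1.0, 1.0, 21)
data = np.column_stack([3.0 * t, 4.0 * t])   # points on a line along (3, 4)
axes, variances = principal_components(data)
```

The first axis comes out as the unit vector (0.6, 0.8) up to sign, and the variance along the second axis is zero to within rounding, so one component describes this data set completely.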


Probability Theory 

The area of mathematics which deals with the definition and manipulation
of probabilities for the purposes of statistical reasoning. Probabilities
have been formally defined in mathematics by a series of axioms (eg:
a number between 0 and 1) but in
any practical application they must be a direct (empirical) measure
of how often a particular event will be observed under a specified set of
conditions. Probability theory is the only self consistent theory
for data analysis, and therefore forms the basis of all statistical
methods.

Q


Quantisation 

The process during image formation involving losing information
due to converting a continuous value to a discrete digital one.
See "Digitisation".

R


Rank Order Filtering 

A nonlinear image filtering operation.
Creating an output image from another by replacing grey level values
with rank values from a defined region around the input pixel.


Receiver Operator Characteristic 

All classification algorithms (including feature detectors
which effectively classify regions of the image as "feature"
or non-feature) require a threshold which controls the
task of statistical decision making based on the evidence for the
interpretation. Different thresholds yield different results.
The ROC curve provides a way of comparing the performance
of classification across the whole range of detection thresholds.
It is a function of detection rate against false-alarm
rate for a detection task. An ROC curve which is entirely
below the curve from another detection algorithm is worse under
all circumstances and therefore evidence of an inferior technique
for that set of data.
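
The curve can be traced by sweeping the threshold over the observed evidence values and recording the two rates at each setting. The scores and labels below are hypothetical, and the sketch assumes 0/1 labels with both classes present:

```python
def roc_points(scores, labels):
    """Return (false-alarm rate, detection rate) pairs, strictest threshold first."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        detected = [s >= threshold for s in scores]
        hits = sum(1 for d, l in zip(detected, labels) if d and l)
        false_alarms = sum(1 for d, l in zip(detected, labels) if d and not l)
        points.append((false_alarms / negatives, hits / positives))
    return points

scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]   # hypothetical evidence values
labels = [1, 1, 0, 1, 0, 0]               # 1 = true feature, 0 = non-feature
curve = roc_points(scores, labels)
```

Lowering the threshold moves along the curve from few detections and few false alarms towards the point where everything is accepted.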


Recognition 

The process of recognising the presence of "objects" in an image.
See "representation".
Recognition should not be confused with "localisation", which generally
starts by assuming that a particular "object" is present in an image.


Rectification 

The process of applying a projective warp to a pair of stereo images in
order to align their epipolar lines so that they are parallel to the
x axis.


Region Growing 

An intermediate step in the process of many "image segmentation" algorithms
which aims to merge adjacent regions with similar characteristics
in order to obtain a simpler and hopefully more correct interpretation
of the data.


Region of Interest 

A restricted area within an image to which processing effort is directed
in order to speed up analysis.


Region Representation 

Techniques for representing "segmented" regions of images, for example
in terms of area, boundary or chord representations. This includes
simple measures such as "area", "length" and orientation as well
as the more complete methods such as "moments" and "fourier descriptors".
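
As an example of the simple measures, the sketch below computes area,
centroid and orientation from the low order moments of a binary
region. The mask layout (rows of 0 and 1) and the function name are
assumptions, and the region is assumed to be non-empty.

```python
import math


def region_moments(mask):
    """Area, centroid and principal axis orientation of a binary region.

    mask : list of rows, with 1 marking pixels inside the region
    """
    pixels = [(x, y) for y, row in enumerate(mask)
              for x, v in enumerate(row) if v]
    area = len(pixels)                       # zeroth moment
    cx = sum(x for x, _ in pixels) / area    # first moments give the centroid
    cy = sum(y for _, y in pixels) / area
    # Second central moments describe the spread about the centroid.
    mu20 = sum((x - cx) ** 2 for x, _ in pixels) / area
    mu02 = sum((y - cy) ** 2 for _, y in pixels) / area
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels) / area
    # Orientation of the principal axis, in radians from the x axis.
    theta = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    return area, (cx, cy), theta


mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
area, centroid, theta = region_moments(mask)
```

For the horizontal bar in the example the orientation is zero and the
centroid lies on the middle pixel.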


Relaxation 

A computational mechanism which involves the repeated application
of a "neighbourhood operator" to a set of image values in order to
obtain a mutually consistent (convergent) interpretation.


Replacement Noise 

The process which results in image "grey values" being replaced with
values unrelated to scene content. The simplest example is where
the data value is replaced by zero and is known as "dropout".


Representation 

The process of describing a set of "feature" data as a set of
parametric measurements for the purposes of making
explicit those characteristics relevant to a recognition task
and those that are not. Generally it is the first step
in the process of "recognition" and it is followed by "classification".
The representation must therefore be suited to the statistical assumptions
used in the classification step.
Unwanted characteristics are often referred to as invariances as
the classification process is required to be invariant to these
categories of data variation (eg: rotation, translation, scale and illumination).
A representation is "complete" if
the original data can be reconstructed from the representation
parameters up to the ambiguity of the invariances.
The recognition process will subsequently be "optimal" if the
statistical assumptions used in the classification step are
correct and use is made of all of the available data in the
representation. Much of the process of "Image Understanding"
is involved with defining adequate representations for image
characteristics such as shape and texture.
The "scope" of a representation refers to the range of objects that can
be represented.


Resolution 

This is a generic term that describes how well a system can measure
an isolated "object" consisting of separate, closely spaced, "features" or
lines. Resolution may be a function of contrast and spatial position
as well as object shape.


RGB and HS Colour Space 

See "Colour Images".


Run Length Coding 

The process of representing a sparse binary image as a sequence
of distances
between successive 0's or 1's for the purposes of compression.
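
A minimal sketch for a single image row is given below. It stores the
first pixel value together with the lengths of the alternating runs,
which is one of several possible conventions. The function names are
illustrative.

```python
def run_length_encode(bits):
    """Encode a binary sequence as (first_value, list of run lengths)."""
    if not bits:
        return 0, []
    runs = []
    current = bits[0]
    length = 0
    for b in bits:
        if b == current:
            length += 1
        else:
            # The value changed: close the current run and start a new one.
            runs.append(length)
            current = b
            length = 1
    runs.append(length)
    return bits[0], runs


def run_length_decode(first_value, runs):
    """Rebuild the binary sequence from its run lengths."""
    bits = []
    value = first_value
    for length in runs:
        bits.extend([value] * length)
        value = 1 - value  # runs alternate between 0 and 1
    return bits


row = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1]
first, runs = run_length_encode(row)
restored = run_length_decode(first, runs)
```

The coding is lossless, so decoding the runs restores the original
row. The saving is greatest when the runs are long, as in a sparse
image.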

S


Sampling Theorem

The theorem which relates the spacing of the sampling mesh
in the "spatial domain"
to the highest frequency component of the data that can be
reliably computed in the "fourier domain".
Sampling a signal which contains frequency components above
the limit defined by the sampling theorem (half the sampling
frequency) will result in an effect known as "aliasing".
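
Aliasing can be demonstrated numerically with a one dimensional
signal. The sketch below assumes a sampling rate of 10 samples per
unit length, so the limit is 5 cycles per unit length. The specific
frequencies and function names are chosen for illustration only.

```python
import math


def sample_cosine(frequency, sample_rate, count):
    """Sample cos(2*pi*frequency*t) at t = n / sample_rate."""
    return [math.cos(2.0 * math.pi * frequency * n / sample_rate)
            for n in range(count)]


def nyquist_limit(sample_rate):
    """Highest frequency which can be represented without aliasing."""
    return sample_rate / 2.0


sample_rate = 10.0
low = sample_cosine(1.0, sample_rate, 20)    # below the limit of 5.0
high = sample_cosine(9.0, sample_rate, 20)   # above the limit of 5.0

# The two sets of samples agree to within rounding error, so after
# sampling the component at 9.0 is indistinguishable from one at 1.0.
max_difference = max(abs(a - b) for a, b in zip(low, high))
```

The component above the limit reappears at the sampling rate minus
its own frequency, which is why such components are usually removed
by smoothing before sampling.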


Scale Space 

An abstract concept of image content which
makes explicit the occurrence
of "features" at particular ranges of scales.


Shape Modelling 

The process of producing a limited parameter description
of the allowable ranges of shapes for a particular
"object". This group may include deformation
or shape change due to projection transformation.


Sieving 

Application of a combination of morphological filters
to remove structures of less than a given size. See "opening and
closing" and "image pyramids".


Spatial Domain 

The representation of the image
in which distances between pixels are measures of linear length,
as distinct from the
"frequency domain" representation in which distances between
pixels are indicators of frequency and phase differences.


Split and Merge 

A particular pixel-based approach to "image segmentation" whereby
the best image segmentation is arrived at by iteratively combining
the processes of "region growing" and subdivision.


Statistical Methods 

The body of computational techniques which take into account
expected or observed distributions in data sets, in order
to determine parametric values or solutions. In doing so
these techniques are generally more stable than direct "mathematical
methods".


Stereo Matching 

The process of corresponding either pixel locations or extracted
image "features" between a pair of stereo images. Algorithms can be
broadly classified as area based or feature based algorithms.
The former often attempt to extract dense stereo data (which
is often inaccurate anywhere but at "edges") and the latter extract
sparse but more reliable data.


Structuring Element 

A structuring element has a role in morphology which is exactly
analogous to the kernel in a convolution operation.
It is a function defined as a spatial pattern
used by the operator at each location in the image.
For grey level dilation the operation is the maximum of all possible
sums of pixel
values with the structuring element. For grey level erosion it is the minimum
of all possible differences.
For binary images the sum is replaced by the product.
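
The grey level case can be sketched as follows. The structuring
element is held as a dictionary of offsets and values, which is an
assumption of this example, and it is assumed to include the origin
so that every pixel has at least one candidate. Offsets falling
outside the image are ignored.

```python
def _in_bounds(image, r, c):
    return 0 <= r < len(image) and 0 <= c < len(image[0])


def grey_dilate(image, element):
    """Grey level dilation: maximum of image value plus element value.

    element : dict mapping (dr, dc) offsets to structuring element values
    """
    return [
        [
            max(image[r - dr][c - dc] + v
                for (dr, dc), v in element.items()
                if _in_bounds(image, r - dr, c - dc))
            for c in range(len(image[0]))
        ]
        for r in range(len(image))
    ]


def grey_erode(image, element):
    """Grey level erosion: minimum of image value minus element value."""
    return [
        [
            min(image[r + dr][c + dc] - v
                for (dr, dc), v in element.items()
                if _in_bounds(image, r + dr, c + dc))
            for c in range(len(image[0]))
        ]
        for r in range(len(image))
    ]


# A flat (all zero) 3x3 element reduces to local maximum and minimum.
flat = {(dr, dc): 0 for dr in (-1, 0, 1) for dc in (-1, 0, 1)}
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
dilated = grey_dilate(image, flat)
eroded = grey_erode(image, flat)
```

With a flat element the two operations reduce to local maximum and
minimum filters. Non-zero element values bias the result towards
particular offsets.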

T


Texture Analysis

Texture is a term used with reference to the spatial distribution
of image intensities.
It can be described in terms of dimensions of uniformity (symmetry),
density, coarseness, roughness, regularity, intensity
and directionality. This number of possible degrees of variation,
and the difficulty in defining a quantitative method for extracting them,
make texture representation and
classification problematic, particularly when compounded with scale,
illumination and surface shape.


Thinning and Thickening 

Thinning is a symbolic image neighbourhood operation that deletes, in some
symmetric way, all interior border pixels of a region that do not disconnect
the region. Successive applications of a thinning process reduce a region
to a set of arcs that constitute a skeleton. Thickening
aggregates all background pixels near enough to a region into the region.


Thresholding 

Thresholding converts a grey level image into a binary image
by setting all pixel values above a threshold to 1 and all those below to
zero.
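
A minimal sketch is given below. The treatment of values exactly
equal to the threshold is not specified above, so mapping them to
zero is an assumption of this example.

```python
def threshold(image, level):
    """Convert a grey level image to binary: 1 above `level`, else 0.

    Values equal to `level` are mapped to 0 (an arbitrary choice).
    """
    return [[1 if value > level else 0 for value in row] for row in image]


grey = [
    [10, 200],
    [128, 129],
]
binary = threshold(grey, 128)
```

A single global level is used here. Local selection of the level is
described under "Adaptive Thresholding".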

U

V

W

X

Y

Z


Zero Crossing 

A point at which the second derivative of a function passes through
zero (changes sign). Generally used to define edge features as this
corresponds to locations of maximum gradient.
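
For a one dimensional signal this can be sketched with finite
differences. The function names and the example profile are
illustrative, and only strict sign changes are reported, so samples
which are exactly zero are not handled.

```python
def second_derivative(signal):
    """Central second difference; entry i corresponds to signal[i + 1]."""
    return [signal[i - 1] - 2 * signal[i] + signal[i + 1]
            for i in range(1, len(signal) - 1)]


def zero_crossings(values):
    """Indices i where values[i] and values[i + 1] have opposite signs."""
    return [i for i in range(len(values) - 1)
            if values[i] * values[i + 1] < 0]


# A smooth step edge: the steepest rise is between samples 3 and 4.
edge = [0, 0, 1, 4, 9, 12, 13, 13]
d2 = second_derivative(edge)
crossings = zero_crossings(d2)
```

The single sign change found in d2 lies between the two samples
where the first difference of the profile is largest.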
