Segmentation

Segmentation refers to the process of extracting the desired object (or objects) of interest from the background in an image or data volume. A variety of techniques are used to do this, ranging from the simple (such as thresholding and masking) to the complex (such as edge/boundary detection, region growing and clustering algorithms). Segmentation can be aided through manual intervention or handled automatically through software algorithms. It can be performed before building the 3-D reconstruction, by processing the images in the image stack, or after the 3-D model has been formed.
Examples of simple forms of segmentation that can be used with confocal data include thresholding and masking.
Thresholding involves limiting the intensity values within an individual image or the entire image stack to a certain bounded range (or ranges). For example, since each pixel in an 8-bit greyscale confocal image (with values 0 [black] to 255 [white]) corresponds to fluorescence intensity at a point within the specimen, the pixels with lower values represent areas with lower fluorescence while the pixels with higher values represent brighter regions. It may be decided that all pixels below a certain value do not contribute significantly to the object(s) of interest and hence can be eliminated. This can be done by scanning the image(s) one pixel at a time, and keeping that pixel if it is above the selected intensity value, or setting it to 0 (black) if it is below that value. In a similar manner, thresholding can also be used to eliminate non-consecutive ranges of intensities while preserving the regions containing the intensities of interest.
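The pixel-by-pixel scan described above can be sketched in a few lines of Python. This is a minimal illustration, not part of any confocal software: the function names and the tiny 3x3 test image are made up for the example, and a pixel exactly equal to the cutoff is assumed to be kept.

```python
def threshold(image, low, high=255):
    """Keep pixels whose value lies in [low, high]; set all others to 0 (black)."""
    return [[p if low <= p <= high else 0 for p in row] for row in image]


def threshold_ranges(image, ranges):
    """Keep pixels that fall inside any of several (low, high) intensity ranges."""
    return [[p if any(lo <= p <= hi for lo, hi in ranges) else 0 for p in row]
            for row in image]


# Illustrative 8-bit greyscale image (values 0 to 255).
image = [
    [12, 40, 200],
    [90, 255, 30],
    [60, 61, 59],
]

single = threshold(image, 60)                             # simple lower cutoff
banded = threshold_ranges(image, [(30, 60), (200, 255)])  # two non-consecutive ranges
print(single)
print(banded)
```

The second call shows the non-consecutive case: intensities between 61 and 199 are removed while both the dim and the bright ranges survive.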
Masking is a procedure whereby one or more enclosed regions of an image (or of the image stack) are defined for processing. This can be done either by manually tracing around the regions of interest (e.g. with a mouse in a graphics application) or by an automated routine. An easy (and useful) application of this is to use a 2-D stacked projection of the image stack to define the image mask. The stacked projection is a single image that represents the sum of all of the images in the image stack (these images can usually be produced automatically by the software supplied with the LSCM). If the object of interest has a closed, continuous surface (such as that of a neuron), the stacked projection defines the absolute boundaries of the object in 2-D. A mask can be formed either by manually tracing around the boundaries of the object(s) of interest in the stacked projection or by absolute thresholding (making all intensities above a certain value white and all below this value black). The mask can then be applied to the entire image stack, such that regions falling within the mask selection area are preserved, whereas areas outside this region are eliminated (e.g. set to 0 [black]). After the mask has been applied, thresholding and image filtering methods can be used to aid in removing the remaining undesired regions.
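As a rough sketch of the automated route, the following Python builds a summed projection, turns it into a mask by absolute thresholding, and applies the mask to every slice. The function names, the cutoff of 20 and the two-slice stack are assumptions chosen for illustration only.

```python
def sum_projection(stack):
    """Sum all slices of the stack, pixel by pixel, into one 2-D image."""
    rows, cols = len(stack[0]), len(stack[0][0])
    return [[sum(img[r][c] for img in stack) for c in range(cols)]
            for r in range(rows)]


def make_mask(projection, cutoff):
    """Absolute thresholding: True (white) above the cutoff, False (black) otherwise."""
    return [[value > cutoff for value in row] for row in projection]


def apply_mask(stack, mask):
    """Keep pixels inside the mask in every slice; set pixels outside it to 0."""
    return [[[p if keep else 0 for p, keep in zip(row, mask_row)]
             for row, mask_row in zip(img, mask)]
            for img in stack]


# Illustrative stack of two 2x3 slices.
stack = [
    [[0, 50, 5], [10, 80, 0]],
    [[3, 70, 0], [0, 90, 4]],
]

projection = sum_projection(stack)
mask = make_mask(projection, cutoff=20)
masked = apply_mask(stack, mask)
print(projection)
print(masked)
```

Because the mask is derived from the whole stack, faint pixels outside the object are removed from every slice, even in slices where they would have survived a per-slice threshold.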

Introduction to Image Segmentation

What is Image Segmentation?

Image segmentation is a partitioning of an image into related sections or regions. These regions may be later associated with informational labels, but the segmentation process simply gives each region a generic label (region 1, region 2, etc.). In the context of Earth remote sensing, informational labels would generally be a ground cover type or land use category. The regions consist of groupings of multispectral or hyperspectral image pixels that have similar data feature values. These data feature values may be the multispectral or hyperspectral data values themselves and/or they may be derived features such as band ratios or textural features.
Figure 1 shows an RGB representation of a Landsat Thematic Mapper (TM) scene from over St. Charles, Maryland. This scene was an early evaluation scene taken within a couple of weeks of the launch of Landsat-4 (launched July 16, 1982). Figure 2 shows a JavaScript animation of a five-level hierarchical segmentation of the Landsat TM scene displayed in Figure 1. The finest level of detail has 126 regions, and the successively coarser segmentations have 70 regions, 42 regions, 12 regions and 4 regions, respectively. (You can't necessarily discern all of the regions in the finer segmentations, as some of the regions are only a few pixels in size.) The regions colored with shades of green roughly correspond to wooded areas. The regions colored with shades of turquoise roughly correspond to grassy areas (mixed with residential at the coarser levels). The regions colored with shades of yellow correspond to roads, residential areas, shopping centers, etc. The regions colored with shades of blue correspond to water, and the red and pink areas correspond roughly to agricultural fields (with some mixing with grassy areas). Finally, the white to gray areas correspond to bare soil (gravel pits, land fill, plowed fields, construction areas, etc.). An approach to obtain a better labeling of the various regions is discussed below in the section on the "Region Labeling Tool".

Figure 1. RGB Representation of a Landsat TM scene.

Simple Classifiers

There are at least two ways to approach the design of a classifier:

Hypothesize a plausible solution and adjust it to fit the problem
Create a mathematical model of the problem and derive an optimal classifier

The first method is more intuitive, is frequently used in practice, and is the approach that we shall take. We start with a very simple solution, analyze its characteristics, identify its weaknesses, and complicate it only as necessary.

Clustering

It frequently happens that a given class is not homogeneous, but is composed of a number of distinct subclasses. In the example shown above, there are obviously three different kinds of letters in the "A" class, and the average or mean feature vector may not represent any one subclass, let alone all of them. In designing the classifier, it would make sense to have three categories A₁, A₂ and A₃, and say that the input is an "A" if it matches either A₁ or A₂ or A₃. In general, if we know that a class contains k subclasses, we could design a two-stage classifier, in which we first assign a feature vector x to a subclass, and then OR the results to identify the class.
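A two-stage classifier of this kind might look like the sketch below. The text does not say how a feature vector is matched to a subclass, so a nearest-mean (minimum Euclidean distance) rule is assumed here, and the subclass means and the extra class "B" are hypothetical values invented for the example.

```python
import math

# Hypothetical subclass means in a 2-D feature space (illustrative values only).
SUBCLASS_MEANS = {
    "A1": (1.0, 1.0),
    "A2": (4.0, 1.0),
    "A3": (2.5, 4.0),
    "B1": (8.0, 8.0),
}

# Second stage: every subclass maps to its parent class (the "OR" step).
SUBCLASS_TO_CLASS = {"A1": "A", "A2": "A", "A3": "A", "B1": "B"}


def nearest_subclass(x):
    """Stage 1: assign x to the subclass whose mean is closest (assumed rule)."""
    return min(SUBCLASS_MEANS, key=lambda name: math.dist(x, SUBCLASS_MEANS[name]))


def classify(x):
    """Stage 2: report the class that owns the winning subclass."""
    return SUBCLASS_TO_CLASS[nearest_subclass(x)]


print(nearest_subclass((3.9, 1.2)), classify((3.9, 1.2)))
print(nearest_subclass((7.5, 8.2)), classify((7.5, 8.2)))
```

Note that the overall mean of the three "A" subclass means, roughly (2.5, 2.0), is not close to any one of them, which is the weakness of a single-mean classifier that the paragraph describes.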

The problem of finding subclasses in a set of examples from a given class is called unsupervised learning. The problem is easiest when the feature vectors for examples in a subclass are close together and form a cluster. We will consider four popular methods for finding clusters:

Feature Extraction

A Pattern Recognition system is composed of

Pre-processing
Feature Extraction
Classification

Feature Extraction is a crucial step in Pattern Recognition. It is responsible for measuring features of objects in an image.
In this experiment we have a binary image with different objects. The feature used in this illustrative example is the first invariant moment. It measures the spread of pixels from the centroid of the object.

Original binary image

Labeling is an intermediate step in feature extraction. It allows the objects to be measured individually. The maximum pixel value of the labeled image shown below gives the number of objects: 28. Note that there are three very small objects that cannot be seen at first sight.

Labeled image

Based on the first invariant moment attribute of each object, it is possible to plot and visualize the graph below. We can count 13 objects with small values, which correspond to the ring screws. There are 9 large values corresponding to the nails and tee-pins. There are also 3 objects whose value for this attribute is close to zero, which correspond to the three small noise dots in the image.

Region number by first invariant moment
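The labeling and measurement steps can be sketched as follows. This is not the code used in the experiment: it assumes 4-connectivity for the labeling, takes the first invariant moment to be Hu's first moment (the sum of the normalized central moments η₂₀ + η₀₂, which for a binary object of N pixels is the summed squared distance to the centroid divided by N²), and uses a small made-up image.

```python
from collections import deque


def label(binary):
    """Label 4-connected foreground regions 1, 2, ...; background stays 0."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not labels[r][c]:
                count += 1                      # new object found
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:                    # breadth-first flood fill
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def first_invariant_moment(labels, k):
    """Spread of the pixels of object k about its centroid, normalized by area squared."""
    pts = [(r, c) for r, row in enumerate(labels) for c, v in enumerate(row) if v == k]
    n = len(pts)
    cy = sum(r for r, _ in pts) / n
    cx = sum(c for _, c in pts) / n
    spread = sum((r - cy) ** 2 + (c - cx) ** 2 for r, c in pts)
    return spread / n ** 2


# Illustrative image: a 2x2 square, a single noise pixel, and a 1x4 bar.
binary = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
]

labels, count = label(binary)
moments = [first_invariant_moment(labels, k) for k in range(1, count + 1)]
print(count, max(max(row) for row in labels))
print(moments)
```

In this toy image the single pixel scores zero, the compact square scores low, and the elongated bar scores highest, which mirrors the separation between noise dots, compact objects and elongated objects described above.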

Feature Vectors

It frequently happens that we can measure a fixed set of d features for any object or event that we want to classify. For example, we might always be able to measure

x₁ = area
x₂ = perimeter
...
x_d = arc_length / straight_line_distance

In this case, we can think of our feature set as a feature vector x, where x is the d-dimensional column vector

x = [x₁, x₂, ..., x_d]ᵀ

Equivalently, we can think of x as being a point in a d-dimensional feature space. By this process of feature measurement, we can represent an object or event abstractly as a point in feature space.
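As a small illustration of mapping objects to points in feature space, the sketch below computes a two-dimensional feature vector (area, perimeter) for shapes given as polygon vertex lists. Describing objects as polygons, the function names and the restriction to d = 2 are assumptions made for the example.

```python
import math


def polygon_area(vertices):
    """Area of a simple polygon by the shoelace formula."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2


def polygon_perimeter(vertices):
    """Sum of the edge lengths, closing the polygon back to its first vertex."""
    n = len(vertices)
    return sum(math.dist(vertices[i], vertices[(i + 1) % n]) for i in range(n))


def feature_vector(vertices):
    """x = (x1, x2) = (area, perimeter): one point in a 2-D feature space."""
    return (polygon_area(vertices), polygon_perimeter(vertices))


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
triangle = [(0, 0), (3, 0), (0, 4)]

x_square = feature_vector(square)
x_triangle = feature_vector(triangle)
print(x_square, x_triangle)
```

Each shape is now just a point, so any classifier that works on points in feature space can be applied without reference to the original pixels.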

Robust Analysis of Feature Spaces: Color Image Segmentation by Dorin Comaniciu and Peter Meer

Department of Electrical and Computer Engineering
Rutgers University, Piscataway, NJ 08855, USA

A general technique for the recovery of significant image features is presented. The technique is based on the mean shift algorithm, a simple nonparametric procedure for estimating density gradients. Drawbacks of the current methods (including robust clustering) are avoided. Feature spaces of any nature can be processed, and as an example, color image segmentation is discussed. The segmentation is completely autonomous; only its class is chosen by the user. Thus, the same program can produce a high quality edge image, or provide, by extracting all the significant colors, a preprocessor for content-based query systems. A 512x512 color image is analyzed in less than 10 seconds on a standard workstation. Gray level images are handled as color images having only the lightness coordinate.
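To give a feel for the mean shift idea, here is a deliberately simplified sketch, not the authors' implementation. It works on one-dimensional gray levels with a flat (uniform) kernel: a window is repeatedly moved to the mean of the samples it contains until it stops moving, and samples that end up at the same mode are grouped together. The bandwidth, the merge rule and the sample values are assumptions chosen for the example.

```python
def mean_shift_mode(samples, start, bandwidth, tol=1e-6, max_iter=100):
    """Follow the mean shift from `start` until the window centre stops moving."""
    x = float(start)
    for _ in range(max_iter):
        window = [s for s in samples if abs(s - x) <= bandwidth]
        if not window:                 # guard; not expected when start is a sample
            return x
        new_x = sum(window) / len(window)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x


def segment(samples, bandwidth):
    """Group samples by the density mode that each one converges to."""
    modes, labels = [], []
    for s in samples:
        m = mean_shift_mode(samples, s, bandwidth)
        for i, known in enumerate(modes):
            if abs(known - m) < bandwidth / 2:   # assumed rule for merging nearby modes
                labels.append(i)
                break
        else:
            modes.append(m)
            labels.append(len(modes) - 1)
    return modes, labels


# Illustrative gray levels: one dark and one bright population.
gray_levels = [10, 12, 11, 13, 200, 202, 198, 201]
modes, labels = segment(gray_levels, bandwidth=20)
print(modes)
print(labels)
```

The number of groups is not given in advance; it falls out of the number of distinct modes found, which is the property that makes the approach attractive for autonomous segmentation.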

Position Estimation of Micro-Rovers using a Spherical Coordinate Transform Color Segmenter

This work addresses position estimation of a micro-rover mobile robot as a larger robot tracks it through large spaces with unstructured lighting. We use the Spherical Coordinate Transform color segmenter commonly used in medical applications. Data was collected from 50 images taken in five types of lighting: fluorescent, tungsten, daylight lamp, natural daylight indoors and outdoors. The results show that average pixel error was 1.5, with an average error in distance estimation of 6.3 cm. The size of the error did not vary greatly with the type of lighting. In addition to giving segmentation results comparable to stereo triangulation, our approach has other advantages including low computational complexity O(n^2) and lightweight, inexpensive hardware.
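The abstract does not give the transform itself. The sketch below uses the formulation of the Spherical Coordinate Transform commonly given in the image-processing literature, which is an assumption here rather than something taken from this work: L is the length of the RGB vector, angle A is measured from the blue axis, and angle B is measured from the red axis within the red-green plane.

```python
import math


def sct(r, g, b):
    """Return (L, angle_a, angle_b) for an RGB triple; angles are in radians."""
    length = math.sqrt(r * r + g * g + b * b)
    if length == 0:
        return 0.0, 0.0, 0.0           # black: both angles are undefined, use 0
    angle_a = math.acos(b / length)
    sin_a = math.sin(angle_a)
    if sin_a < 1e-12:
        return length, angle_a, 0.0    # pure blue: angle_b is undefined, use 0
    ratio = r / (length * sin_a)
    angle_b = math.acos(max(-1.0, min(1.0, ratio)))  # clamp against rounding error
    return length, angle_a, angle_b


bright = sct(200, 100, 50)
dim = sct(100, 50, 25)     # same colour at half the intensity
print(bright)
print(dim)
```

Halving the intensity halves L but leaves both angles unchanged, which suggests why a segmenter that works on the two angles could be relatively insensitive to lighting level.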

Examples of color segmentation in different lighting conditions

[Figures: for each of the five lighting conditions (fluorescent lighting; tungsten lighting; daylight lamp, i.e. halogen with blue filter; indoor sunlight; outdoor sunlight), four panels: Original, SCT Segmentation, HSI Segmentation, RGB Threshold Segmentation.]

Segmentation

Introduction

This part of the Automated Inspection project is concerned with object detection and measurement techniques, particularly in images in which greyscale intensities are insufficient in themselves to do the task.
For instance, an intensity image may be processed morphologically to detect feature boundaries. However, these boundaries may be generated by shadows, or edges in colour which are not associated with features of interest. Assuming one has control of lighting, the problem of shadowing may be minimised by careful attention to illumination. However, the features of interest may still be confused with features of non-interest. This is particularly true when inspecting natural products such as fruit, vegetables, sea food products, timber, etc.
The project is therefore concerned with discriminating features of interest drawing upon information such as colour and texture.
Equipment

The technical development work is carried out on the Team's network of Sun SparcStations and Linux machines.
We have a Data Cell Ltd model S2200 framegrabber/display installed in one of the workstations. Its VisionTool control program enables 24-bit RGB images to be grabbed in either 512x512 or 768x576 pixel sizes. Client projects frequently involve PC-based framegrabbers and associated hardware.
The Team has numerous B&W cameras of varying grades of quality, with much of the work utilising standard security-type cameras. For accurate colour image acquisition we use a compact 3-chip RGB camera (Hitachi model HV-C20).
Software Tools

Most algorithm development is done using the functional building-blocks of Khoros/Cantata v1.5 and v2.1 on the Sun workstations. Khoros-compatible functions are written (under Craftsman control) only if absolutely necessary.
The C4.5 machine learning classifier developed by Ross Quinlan is used frequently in colour and texture segmentation work.
We have recently started using the WEKA package from the University of Waikato. This is the Waikato Environment for Knowledge Analysis, a workbench which allows users to explore the analysis of data with a range of standard machine learning techniques using common data file formats and an X-windows interface.
We are developing our own MBIS package (see next section) which specifically targets the use of classifiers with image data.
Results

HSV Colour Space

Conversion to HSV (hue, saturation, value) is one approach to decoupling the intensity component from the colour information, for instance, to isolate features by their colour content in the presence of shadows or other intensity changes. This decoupling makes HSV space attractive for colour segmentation algorithms if colour saturations are high. It is a non-linear transformation and has the disadvantage that the hue is unstable for low saturations.
The following images show the RGB and R,G,B components of a fruit image:

The following images (monochrome) show the corresponding HSV components in the sequence Hue, Saturation, Value. Note the noise scatter in the Hue image in the regions of low saturation (the dark areas in the Saturation image).
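Both effects can be reproduced on individual pixels with the `colorsys` module from the Python standard library. The pixel values below are made up for illustration; components are scaled to the range 0 to 1, as `colorsys.rgb_to_hsv` expects.

```python
import colorsys


def hsv(r, g, b):
    """Convert RGB components in [0, 1] to (hue, saturation, value), each in [0, 1]."""
    return colorsys.rgb_to_hsv(r, g, b)


# A saturated red and the same surface in shadow (half the intensity).
lit = hsv(0.8, 0.2, 0.2)
shadowed = hsv(0.4, 0.1, 0.1)

# Two nearly grey pixels that differ only by a little noise.
grey_a = hsv(0.50, 0.50, 0.51)
grey_b = hsv(0.51, 0.50, 0.50)

print("lit:     ", lit)
print("shadowed:", shadowed)
print("grey_a:  ", grey_a)
print("grey_b:  ", grey_b)
```

The lit and shadowed pixels share hue and saturation and differ only in value, while the two grey pixels, which are almost identical in RGB, land on opposite sides of the hue circle. That is the noise scatter seen in the hue image wherever saturation is low.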

While the hue image may be suited to global measurements such as colour grading, we have not found it to be particularly useful in practical applications to date.
Colour/Texture Segmentation

We have produced our own Multi Band Image Segmentation (MBIS) package. This package regards colour and other registered images as multi-banded ones. To date MBIS has 18 options for feature generation. These draw from a variety of 1x1, 3x3, 4x4 and 5x5 windows on or between each band, together with Markov modelling options, and various combinations of Markov, averages and (absolute) differences. New options may readily be added. To date, MBIS can invoke two classifiers, namely the Maximum Likelihood and C4.5 Machine Learning, or a combination of the two.
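The sketch below illustrates the general idea of generating per-pixel features from windows on and between bands. It is not MBIS code, and the two feature types shown (a 3x3 window mean on each band, and 1x1 absolute differences between each pair of bands) are hypothetical stand-ins for the package's actual options, which are not specified here.

```python
def window_mean(band, r, c, size=3):
    """Mean over a size x size window centred on (r, c), clipped at the image border."""
    half = size // 2
    values = [band[y][x]
              for y in range(max(0, r - half), min(len(band), r + half + 1))
              for x in range(max(0, c - half), min(len(band[0]), c + half + 1))]
    return sum(values) / len(values)


def pixel_features(bands, r, c):
    """Feature vector for one pixel: window means per band, then inter-band differences."""
    means = [window_mean(band, r, c) for band in bands]
    diffs = [abs(bands[i][r][c] - bands[j][r][c])
             for i in range(len(bands)) for j in range(i + 1, len(bands))]
    return means + diffs


# Illustrative 3x3 image with three registered bands.
red = [[10, 10, 10], [10, 100, 10], [10, 10, 10]]
green = [[30, 30, 30], [30, 30, 30], [30, 30, 30]]
blue = [[0, 0, 0], [0, 90, 0], [0, 0, 0]]

features = pixel_features([red, green, blue], 1, 1)
print(features)
```

A feature vector like this would be computed for every pixel and then passed to a classifier; any further registered band simply extends the list of bands.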
Fruit Classification

The above images show the classifications for three Fuji apples. The fruit surface is red and green; areas of low intensity are coded brown or black. The background is classified as blue or black. The full-size images are 768 x 256 pixels.
Mussel Classification

The above images show how green and blue mussels can be classified. These well-lit images show the two types against an arbitrary pink background. Note that although shadows are present, MBIS clearly distinguishes between the two types. The full-size images are 768 x 576 pixels.
Publications

P.W. Power and R.S. Clist, Comparison of supervised learning techniques applied to color segmentation of natural product images. Submitted for presentation at the SPIE Photonics East Symposium, in the Conference on Intelligent Robots and Computer Vision XV: Algorithms, Techniques, Active Vision, and Materials Handling, 18-22 November 1996.
