Vision Project - CS223B - Winter 97
Eric Frew, Andreas Huster, Edward LeMaster
Image Thresholding for Object Detection
Introduction
The HUMMINGBIRD project
at the Aerospace Robotics Lab is developing a
robot helicopter with a computer vision system used to survey a grass field and
locate objects of interest. These objects currently consist of orange disks and
black barrels and are designed to be easily identified. To detect objects, we currently
apply thresholds to an LUV color-segmented image. Objects are defined as regions
that exceed the threshold in one of the color components. Because this is
a real-time control system, algorithm speed, reliability, and robustness are all
important factors. While the thresholding process is fast enough, it is too sensitive
to the threshold levels and is not robust to changes in lighting. For scenes with
more than one object, we also have difficulty determining object locations.
In an effort to improve segmentation by thresholding and to add robustness
to lighting changes, we have implemented some of the improvements discussed in Section
3.3.1 of Nalwa's book [Nalwa 1993]. Specifically, we have examined three different
approaches to automatic segmentation. The first algorithm computes the histogram
of the image, and then uses local minima in the histogram to compute potential threshold
locations. The second algorithm computes the histogram based upon regions of the
image with high gradients in an attempt to find thresholds based to a greater extent
upon object edges. The final algorithm acts as a control, and simply sets the thresholds
based upon the percentage of pixels in the image above the threshold value. In each
case, we took advantage of the available color information in order to tailor thresholds
to each type of object. An additional post-processing algorithm we implemented simply
takes a binary image (black with white patches) and segments out the individual
objects. Overall success of each algorithm was calculated by examining each of a
series of test images, and then recording the number of times the objects were identified
correctly.
More information on the project goals and schedule can be found in the
Project Proposal.
Test Images
Any attempt to solve a problem with an algorithm requires a well-defined series
of tests to evaluate performance. To collect our
test images, we simulated a helicopter flight by holding the actual helicopter
cameras about 2 m above a grass field, looking downward. This approximates the
configuration encountered during flight. We then took a series of images while
varying several parameters. Any given image can contain
green grass, 4-inch orange metal disks, and 1-m-long black barrels. Representative
images can contain multiple objects and were taken in direct sunlight, partial shadow,
and complete shadow.
Preliminary image processing is provided by the Teleos (AutoDesk) Advanced Vision
Processing System (AVP). This includes a color digitizer which automatically segments
the pictures into three color components:
- L - The greyscale intensity
- U - Primarily blue (with some green) information
- V - Primarily red (with some yellow) information
We then converted these three components to PGM images for further processing. To
display the test images we also did an LUV to RGB conversion, outputting the files
in PPM format. Note that since the AVP digitizer automatically segments into LUV
components, we do not do any algorithm processing on the RGB components.
Algorithm #1
The first segmentation algorithm we implemented involved computing histograms for
each color component and then applying thresholds at the local minima, or 'valleys',
as per [Prewitt and Mendelsohn 1966]. Barrels are located in the segmented U-component
image, and disks are located in the V-component image. A quick outline of the algorithm
follows:
- Load raw image components (LUV)
- Smooth each component image
- Generate histograms
- Smooth histograms and take derivatives
- Apply thresholds at local minima (zero-crossings)
- Use logic to improve success rate
- Thresholds must be certain distance apart
- U-Thresholds must be above a certain value
- Objects in U-component must be larger than certain size
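The histogram-valley step above can be sketched as follows. This is a minimal Python/NumPy re-implementation for illustration (our actual code is in Matlab); the bin count and smoothing width are illustrative choices, not the values we used.

```python
import numpy as np

def valley_thresholds(image, bins=256, smooth_width=9):
    """Return candidate thresholds at local minima ('valleys') of the smoothed histogram."""
    hist, edges = np.histogram(image.ravel(), bins=bins, range=(0, 256))
    # Smooth the histogram with a simple moving average to suppress noise.
    kernel = np.ones(smooth_width) / smooth_width
    smoothed = np.convolve(hist, kernel, mode="same")
    # A valley is where the first difference changes sign from negative
    # to non-negative (a zero-crossing of the derivative).
    d = np.diff(smoothed)
    valleys = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    return edges[valleys]  # intensity values of candidate thresholds
```

The selection logic listed above (minimum spacing between thresholds, minimum U-threshold value, minimum object size) would then be applied to the candidates this function returns.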
More
detailed information can be found in the attached page. In addition,
a summary
of results for each test image is also available. The table below summarizes
the success rate of the algorithm.
| Images Examined             | 15    |
| Barrels Found               | 8/8   |
| Overall Barrel Success Rate | 100%  |
| Disks Found                 | 11/13 |
| Overall Disk Success Rate   | 93.3% |
Algorithm #2
The second algorithm selects the segmentation threshold by averaging the intensity
of pixels with high gradient values, i.e., edge pixels. Under the primary assumption
that the images contain only disks and barrels, edge pixels occur mainly in
the neighborhood of the boundaries between disks/barrels and the background. If
the image is properly smoothed, approximately half of the selected pixels are background
pixels and half belong to the object, so their average intensity is a good segmentation
threshold [Katz 1965]. The
algorithm is surprisingly stable, as indicated by the results below.
Following is a coarse outline of the algorithm:
- Smooth U-Image
- Calculate the gradients
- Calculate gradient histogram
- Calculate gradient threshold as a percentile
- Separate pixels with large gradients
- Calculate image threshold as average intensity of high-gradient pixels
- Ensure validity of threshold by comparing it to average image intensity
- Apply threshold to U-Image to find barrels
- Repeat on V-Image to find disks
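The steps above can be sketched in Python/NumPy as follows (our implementation is in Matlab); the gradient percentile is an illustrative parameter, not the value we tuned.

```python
import numpy as np

def edge_average_threshold(image, percentile=99.0):
    """Threshold = mean intensity of pixels with large gradient magnitude."""
    img = image.astype(float)
    gy, gx = np.gradient(img)                  # finite-difference gradients
    mag = np.hypot(gx, gy)                     # gradient magnitude
    g_thresh = np.percentile(mag, percentile)  # keep only the strongest edges
    edge_pixels = img[mag >= g_thresh]
    threshold = edge_pixels.mean()
    # The validity check in the outline compares this threshold against the
    # average image intensity; a near-equal value suggests no object is present.
    return threshold, img.mean()
```

Because roughly half the selected edge pixels lie on the object and half on the background, the returned threshold falls between the two intensity populations.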
A more
detailed discussion and a
summary of results for each test image can be found in the attached
pages. Quantitative results are tabulated below, and show that this algorithm is
extremely successful.
| Images Examined             | 15    |
| Barrels Found               | 8/8   |
| Overall Barrel Success Rate | 100%  |
| Disks Found                 | 13/13 |
| Overall Disk Success Rate   | 100%  |
Algorithm #3
The third algorithm determines a threshold such that a given percentage of pixels
remain after thresholding [Doyle 1962]. Because a fixed fraction of pixels always
survives, the algorithm necessarily reports false positives when no object is present.
The routine found all objects when they were present, but since it cannot determine
whether an object is present at all, it is not successful for our purposes. The
complete algorithm is outlined as follows:
- Load raw image components (LUV)
- Smooth each component image
- Generate histograms
- Calculate threshold to keep given percentage of pixels
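This p-tile rule reduces to a one-line percentile computation. Below is a Python/NumPy sketch (our code is in Matlab); the kept fraction is an illustrative parameter.

```python
import numpy as np

def ptile_threshold(image, keep_fraction=0.02):
    """Choose a threshold so that keep_fraction of the pixels lie above it."""
    # The (100 * (1 - keep_fraction))-th percentile of the intensity
    # distribution leaves keep_fraction of the pixels above the threshold.
    return np.percentile(image, 100.0 * (1.0 - keep_fraction))
```

The fixed fraction is exactly why the method cannot report "no object": the same number of pixels survives whether or not an object is in view.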
More detailed
information with examples is available on the attached page.
Segmentation Algorithm
The three algorithms discussed above perform pixel classification. That is, their
output is an image in which pixels representing objects are white and the background
is black. To make the results useful for a robot, these outputs have to be segmented
so that pixels are grouped into meaningful regions of connected pixels. For the
purpose of finding barrels, all regions that are greater than a certain minimum
size are detected as barrels. For disks, any connected region of pixels is identified.
Our segmentation algorithm is a simple search through the images classified by the
thresholding algorithms. Once an object pixel is found, the entire connected object
region is enumerated. For barrels, the region is kept only if it is larger than
100 pixels. Finally, the centroid of the region is calculated as a simple measure
of object location. A separate page shows cross hairs to identify object locations
calculated by this algorithm for all the test images. These are the
final results of this project.
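The segmentation search described above can be sketched as a flood fill over the binary image. This is a Python re-implementation for illustration (the original is in Matlab); the 100-pixel minimum applies to barrel regions.

```python
from collections import deque
import numpy as np

def segment(binary, min_size=1):
    """Return (size, centroid) for each 4-connected white region of at least min_size pixels."""
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    regions = []
    for r in range(h):
        for c in range(w):
            if binary[r, c] and not seen[r, c]:
                # Breadth-first flood fill enumerates the connected region.
                queue, pixels = deque([(r, c)]), []
                seen[r, c] = True
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and binary[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
                if len(pixels) >= min_size:
                    ys, xs = zip(*pixels)
                    # Centroid of the region as a simple measure of object location.
                    regions.append((len(pixels), (sum(ys) / len(ys), sum(xs) / len(xs))))
    return regions
```

Calling `segment(barrel_mask, min_size=100)` keeps only barrel-sized regions, while `segment(disk_mask)` reports every connected region, matching the rules described above.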
Created Code
To implement the algorithms above, we wrote a significant
amount of code. Unless otherwise noted, all of the code presented here is our own
implementation, although we used function packages where available. The code falls
into two main areas:
- AVP Code
- This is code we wrote to interface with the AVP processor, and to convert LUV
segmented images into more standard formats for display. All AVP code is in C/C++.
- Matlab
Code - Where possible, we tested and implemented our algorithms in Matlab to
reduce coding and debugging time.
Comparisons and Conclusions
On the whole, we were pleasantly surprised with the success of our algorithms. The
most important feature in this success is color segmentation of the images. This
allowed us to isolate each type of object within its own 'image type' (U for barrels
and V for disks), and thus ease the burden on the algorithms. Images with more types
of objects (especially those of different colors) would conceivably be segmentable
using these algorithms, but it would be more difficult and require more logic since
more than one type of object would be found in each component image.
Concerning the algorithms themselves, each has its own associated advantages and
disadvantages. Algorithm #2 clearly was the most accurate, with a 100% success rate.
Even though it is a very robust algorithm, it is inherently limited in that it can
only find one threshold. When we can assume that images contain only one type of
object, this algorithm works well. However, it is not easily extensible to situations
where multiple types of objects clutter the scene. This limitation becomes apparent
in the test images where the barrels cast shadows, which are effectively another
type of object.
Algorithm #1 is more flexible because it can detect several thresholds and can make
better use of assumptions about the objects expected in the scene. For example,
shadows appear in the same color component as barrels. When this happens, this algorithm
can identify more than one local minimum in the histogram and thus pick thresholds
that explicitly separate barrels and shadows. On the other hand, the logic required
to select among the candidate thresholds is cumbersome, relying upon magic numbers
and prior knowledge of the types of objects expected. Another shortcoming of
this algorithm is that it is difficult to find thresholds for small objects like
the disks because detection relies upon finding an extremely small dip in the image
histogram. That we could find them in most images is itself remarkable, but in the
one case where the algorithm failed we have no adequate explanation, since the disks
were quite distinct. Thus, this algorithm should be used very carefully when trying
to extract small objects from the image field.
Algorithm #3 was not well suited to the problem at hand. Because it uses a fixed
percentage value to determine the threshold, it implicitly assumes that some objects
of interest of known size will be present in the image. In our current case, however,
this is not guaranteed, and for actual flights the number of images without objects
generally exceeds the number with objects. Note that when the expected objects were
present, however, the algorithm worked quite well. Applying some extra logic to
this algorithm could significantly improve its suitability.
The motivation for this project is to identify robust object detection techniques
that are suitable for real-time control of an autonomous helicopter. At this time,
the algorithms are coded in Matlab and are not optimized for real-time performance.
Currently, we feel that the performance improvement of these algorithms is not large
enough compared to fixed thresholds to warrant the extra computational burden. However,
as we encounter situations where more robustness is required, these algorithms,
in an optimized form, would be useful. It would then make sense to apply algorithm
#1 to the analysis of the U-Images to find barrels (these contain shadows which
need to be separated from barrels) and algorithm #2 to the analysis of V-Images
to find disks (which are small and make up only a small area in the image).
References
Doyle, W. 1962. "Operations Useful for Similarity-Invariant Pattern Recognition,"
J. Assoc. Comput. Mach., pp. 259-267.
Katz, Y.H. 1965. "Pattern Recognition of Meteorological Satellite Cloud Photography,"
in Proc. Third Symposium on Remote Sensing of Environment, Institute of
Science and Technology, Univ. of Michigan, pp. 173-214.
Nalwa, V.S. 1993. A Guided Tour of Computer Vision, Addison-Wesley Publishing
Company, Menlo Park, CA.
Prewitt, J.M.S. 1970. "Object Enhancement and Extraction," in Picture Processing
and Psychopictorics, B.S. Lipkin and A. Rosenfeld, Eds., Academic Press,
New York, pp. 75-149.
Prewitt, J.M.S. and Mendelsohn, M.L. 1966. "The Analysis of Cell Images," in Ann.
N.Y. Acad. Sci., pp. 1035-1053.
Weszka, J.S. 1978. "A Survey of Threshold Selection Techniques," Computer Graphics
and Image Processing, Vol. 7, pp. 259-265.
Copyright © 1997 by Eric Frew, Andreas Huster, & Edward LeMaster.