Image Thresholding Demos

This demo shows how a image looks like after thresholding. The percentage of the thresholding means the threshold level between the maximum and minimum indesity of the initial image. Thresholding is a way to get rid of the effect of noise and to improve the signal-noise ratio. That is, it is a way to keep the significant imformation of the image while get rid of the unimportant part (under the condition that you choose a plausible thresholding level). In the Canny edge detector part, you will see that, before thinning, we first do some thresholding. You are encouraged to do thinning without thresholding and to see what is the advantage of thresholding.

lena.gif	10% threshold

20% threshold	30% threshold

Vision Project - CS223B - Winter 97
Eric Frew, Andreas Huster, Edward LeMaster

Algorithm #1 Detail

This section examines the effects of applying Algorithm #1 (finding histogram minima) to three specific images. In this manner, the reader can examine the method of operation and the reasoning behind these methods. Inputs to the algorithm are raw L, U, and V components of a color image. Output includes a list of suggested threshold locations for each component, as well as the images resulting from the application of these thresholds. This algorithm takes advantage of the fact that barrels appear very bright in the U-component, and disks in the V-component.
The algorithm begins by smoothing each of the components with a narrow pseudo-gaussian filter, as shown below. These filters are intended to repress the very high frequency noise associated with specular reflections off of grass. Although these filters have the same shape characteristics as a gaussian, they are constructed by dividing by integer values. This was done to speed up the image processing on the real-time helicopter vision system, since floating-point operations are considerably more 'expensive' than character operations. Note that the V-component filter is even narrower than the others. This is because the disks are very small in size (only a few pixels in some cases), and too much filtering would eliminate all information of value.

L/U Filter
1/16	1/8	1/16
1/8	1/4	1/8
1/16	1/8	1/16

V Filter
0	1/8	0
1/8	1/2	1/8
0	1/8	0

Following smoothing, a histogram is made of each image component. As you can see from the plots below, the preliminary smoothing has pushed the histograms up into a better-defined series of peaks and valleys, justifying our observation that much of the noise in the images is very high frequency. The resulting histogram is still too jagged, however, to effectively find local minima. We thus apply smoothing directly to the histogram. This flattens out small peaks, yet leaves the larger-scale trends. The easiest way to find the local minima is to look at the zero-crossings of the derivative. To save computational effort, we combine the two steps and convolve each histogram with the derivative of a gaussian filter with Sigma=5. This value was chosen experimentally for optimal results. The plots below come from Image 14, which will be used in the example until further notice.

Using zero crossings, we can easily calculate the locations of local minima. Examination of the output showed, however, that the algorithm occasionally gave threshold locations at very low intensity values, or at locations in very close proximity to each other, due to the fact that we could not perfectly smooth the histogram. To correct this problem, we applied some logic and only took threshold locations which separate a certain number of pixels. We did this by advancing up the (unsmoothed) histogram, and calculating the number of pixels below each suggested threshold location. If less than 20% of the total number of pixels were contained in the interval since the last applied threshold, then the suggested threshold was not applied.
We would like to note that the above logic is somewhat arbitrary, and in fact would give different results if you moved down from the top of the histogram. However, we knew that the disks are very small and thus would not meet the 20% requirement, while the grass and large shadows cover enough area for these areas to register. The proof that this is somewhat reasonable comes from the success of the algorithm on the test images.
The results of this process for Image 14 are the images shown below. We note that the algorithm only applied one threshold in each of the U- and V-component (denoted by magenta X's in the plots above), and that it did not segment the intensity image. Further note that the algorithm successfully segmented out the barrel and both disks, despite the fact that specular reflection washed out the disk on the barrel to the point where only three pixels remained!

Image 14
Original Image	Thresholded U-Component	Thresholded V-Component
	Threshold: 165 Barrels: 1/1	Threshold: 187 Disks: 2/2

Image 11

We now move our attention to Image 11, which is interesting for different reasons. As the histogram plots show, the algorithm applies 2 thresholds in both the L- and U-components, and none in the V-component. The last omission is desired, since no disk is present.

The output images show that the intensity image (L-component), although thresholded, really does not hold any meaningful information about the objects present in the images, although it does identify the shadows quite well by omission. On the flip side, the U-component successfully segmented the barrels from the shadows from the grassy background.

Image 11
Original Image	Thresholded L-Component (1)	Thresholded L-Component(2)
	Threshold: 85	Threshold: 154
Thresholded U-Component (1)	Thresholded U-Component (2)	Thresholded V-Component
Threshold: 130	Threshold: 163 Barrels: 2/2	Threshold: NA Disks: 0/0

Image 24

The final image we will look at in detail is Image 24. This image was the sole image which failed to segment fully using this algorithm. The barrel was found beautifully, but for some unexplained reason the disks were lost. A possible could be the fact the the two disks were in different lighting, which would 'flatten out' their histogram signatures and make them tougher to find. It is interesting to note that a simple threshold of 160 applied on the V-component does successfully find both disks.

Image 24
Original Image	Thresholded U-Component	Thresholded V-Component
	Threshold: 191 Barrels: 1/1	Threshold: NA Disks: 0/2

V-Component Thresholded at 160

Final Notes

After analysis, we did find that a minimal amount of added logic was necessary to reduce the number of false positives to zero. These checks were applied only to the U-component...

Ignore Thresholds below 150 - This is necessary if we have a picture with bright sunlight and shadow but no barrels (see Image 25). Otherwise, the algorithm will generally place an undesired threshold between the sunlit and shadowy grass.
Ignore blobs smaller than N pixels - Even with the smoothing, sometimes small blobs slip through. Although we cannot correct for this logically with the disks (they are too small already), we can easily do this when extracting multiple barrels from the U-component.

Vision Project - CS223B - Winter 97
Eric Frew, Andreas Huster, Edward LeMaster

Algorithm #2 Details

This page explains the details of how algorithm #2 works. The algorithm is essentially the same for both detecting barrels from the U-component and disks from the V-component. The differences are slight and will be explained as they are encountered.
Following is a sample result for image 14 of the test images. It is an interesting image because it contains both barrels and disks and exhibits other interesting features. The four quadrants of the composite results image show:

Barrels: The result of thresholding the U color component with the threshold calculated by the algorithm	Disks: The result of thresholding the V color component with the threshold calculated by the algorithm
Barrel Gradient: Pixels with high gradient values in the U component (used to compute barrel threshold)	Disk Gradient: Pixels with high gradient values in the V component (used to compute disk threshold)

Smoothing
The algorithm relies on gradients which are very sensitive to image noise. Therefore, the first step in the process is smoothing the image. This is achieved by a convolution with the following kernel:

0	1/32	1/16	1/32	0
1/32	1/32	1/16	1/32	1/32
1/16	1/16	1/8	1/16	1/16
1/32	1/32	1/16	1/32	1/32
0	1/32	1/16	1/32	0

Because this smoothing step is the most time consuming step in the algorithm, we chose the smallest kernel for which the algorithm works (5x5) and used powers of two for the weights. In this way, the convolution can be implemented using shift operations instead of floating point multiplications.

Gradients
Next we calculate the gradient with a second convolution using the following kernel. The kernel uses imaginary weights for calculating the two orthogonal gradient components using a single convolution.

0	i/2	0
1/2	0	-1/2
0	-i/2	0

Cropping
The smoothing and gradient operations introduce edge artifacts along the perimeter of the image. These are cropped before further processing.
Gradient Histogram
The next step in the algorithm is choosing pixels with high edge (gradient) values. We therefore calculate a gradient histogram and find the (100-n) th percentile. n is a small number, typically less than 5%. Subsequently, we find a set E of pixels with high gradient values by thresholding the gradients at the percentile above. For images with objects, the pixels in E lie close to object boundaries. In fact, in an ideal image, about half of the pixels in E are part of the object and half are part of the background. However, pixel sets drawn from real images will generally contain a number of other pixels as well. The bottom row in the composite above shows pixels in E in white. (Note that for illustration only, the edges were not cropped.)
For barrels, we chose n = 3% and for disks n = 1%. We found that the algorithm is not very sensitive to the actual value of n.
Threshold Calculation
The threshold used to classify the image is now simply the intensity average over the pixels in the set E. Because approximately half of the chosen pixels belong to the object and half to the background, the average of these pixels typically represents a robust thresholding value somewhere between the average object intensity and the average background intensity.
Threshold Logic
A feature of this algorithm is that it always selects exactly one threshold. If the image actually contains an object, this threshold is adequate, at the least, for detecting the object. However, if the image contains no objects, the threshold is clearly not meaningful. We can detect this situation by comparing the threshold with the intensity average of the entire image. In this algorithm, if the threshold is not at least 10% greater than the average over the entire image, it is assumed to be false and the algorithm concludes that the image contains no objects.
Discussion of Sample Results
This section discusses some of the results shown at the top of this page. Results to all of the test images are provided in the summary page.
The bottom row of the composite shows which pixels are used in the computation of the threshold. For the disks, the algorithm finds only the desired edge pixels and computes a threshold that leads to very successful detection, as seen in the top row. This process is easy for disks because of the wide separation between typical disk intensities and the background.
For barrels, the algorithm finds a large number of barrel-to-background edge pixels, but also a scattering of other pixels. These also enter into the averaging calculation, but because of the relative numbers of pixels, they do not contribute much to the final threshold. The result of thresholding shows some small blobs in addition to the barrel. The blobs occur in the shadow cast by the barrel which shows a weakness in the algorithm: the threshold is calculated between the object and the background and does not account for the shadows which have typical intensity values that are between the object and background intensities. The output is still a success because the blobs are small enough to be discarded by the segmentation algorithm.
The purpose of this project is to find thresholding algorithms that are robust to lighting changes where fixed thresholds become ineffective. Hence, the introduction of new "magic" numbers is troublesome. The magic numbers for this algorithm are the number of pixels used to compute the threshold (3% for barrels, 1% for disks), and the minimum threshold which is set to 10% greater than the average image intensity. Note, however, that these quantities are relative quantities and therefore they vary with the given image. Moreover, experience has showed us that the performance of the algorithm is not very sensitive to these parameters. Therefore, setting the magic numbers to values that work well for the test images is a valid procedure, because we can expect that the performance will not be affected greatly by conditions that may arise in images not previously tested.

Return to the main page.

Vision Project - CS223B - Winter 97
Eric Frew, Andreas Huster, Edward LeMaster

Image Thresholding for Object Detection

Introduction

The HUMMINGBIRD project at the Aerospace Robotics Lab is a robot helicopter with a computer vision system used to survey a grass field and locate objects of interest. These objects currently consist of orange disks and black barrels and are designed to be easily identified. To detect objects, we currently apply thresholds to an LUV color segmented image. Objects are defined as regions that are greater than the threshold in one of the color segments. Because this is a real-time control system, algorithm speed, reliability and robustness are all important factors. While the thresholding process is fast enough, it is too sensitive to the thresholding levels and not robust to changes in lighting. For scenes with more than one object, we currently have difficulty determining object locations.
In an effort to improve the segmentation by thresholding process and add robustness to lighting changes, we have implemented some of the improvements discussed in section 3.3.1 of Nalwa's book [Nalwa 1993]. Specifically, we have examined three different approaches to automatic segmentation. The first algorithm computes the histogram of the image, and then uses local minima in the histogram to compute potential threshold locations. The second algorithm computes the histogram based upon regions of the image with high gradients in an attempt to find thresholds based to a greater extent upon object edges. The final algorithm acts as a control, and simply sets the thresholds based upon the percentage of pixels in the image above the threshold value. In each case, we took advantage of the available color information in order to tailor thresholds to each type of object. An additional post-processing algorithm we implemented simply takes a binary image (black with white patches) and segments out the individual objects. Overall success of each algorithm was calculated by examining each of a series of test images, and then recording the number of times the objects were identified correctly.
More information on the project goals and schedule can be found in the Project Proposal.

Test Images

Any attempt to find an algorithm to solve a problem requires a rigidly-defined series of tests in order to evaluate performance. To collect our test images, we simulated a helicopter flight by holding the actual helicopter cameras about 2m above a grass field, looking downward. This is the approximate configuration encountered during flight conditions. We then took a series of images, with several varying parameters. Objects of interest in any given image include green grass, 4 inch orange metal disks, or 1 m long black barrels. Representative images can have multiple objects and were taken in direct sunlight, partial shadow, and complete shadow.
Preliminary image processing is provided by the Teleos (AutoDesk) Advanced Vision Processing System (AVP). This includes a color digitizer which automatically segments the pictures into three color components:

L - The greyscale intensity
U - Primarily blue (with some green) information
V - Primarily red (with some yellow) information

We then converted these three components to PGM images for further processing. To display the test images we also did an LUV to RGB conversion, outputting the files in PPM format. Note that since the AVP digitizer automatically segments into LUV components, we do not do any algorithm processing on the RGB components.

Algorithm #1

The first segmentation algorithm we implemented involved computing histograms for each color component and then applying thresholds at the local minima, or 'valleys', as per [Prewitt and Mendelsohn 1966]. Barrels are located in the segmented U-component image, and disks are located in the V-component image. A quick outline of the algorithm follows...

Load raw image components (LUV)
Smooth each component image
Generate histograms
Smooth histograms and take derivatives
Apply thresholds at local minima (zero-crossings)
Use logic to improve success rate
- Thresholds must be certain distance apart
- U-Thresholds must be above a certain value
- Objects in U-component must be larger than certain size

More detailed information can be found in in the attached page. In addition, a summary of results for each test image is also available. The table below summarizes the success rate of the algorithm.

Images Examined	15
Barrels Found	8/8
Overall Barrel Success Rate	100%
Disks Found	11/13
Overall Disk Success Rate	93.3%

Algorithm #2

The second algorithm selects the segmentation threshold by averaging the intensity of those pixels with high gradient values - edge pixels. Given the primary assumption that the images contain only disks and barrels, edge pixels will occur mainly in the neighborhood of the boundaries between disks/barrels and the background. If the image is properly smoothed, approximately half of the selected pixels are background pixels and half belong to the object. Thus, the algorithm calculates a segmentation threshold as the average intensity of the selected pixel values [Katz 1965]. The algorithm is surprisingly stable as indicated by the results below.
Following is a coarse outline of the algorithm:

Smooth U-Image
Calculate the gradients
Calculate gradient histogram
Calculate gradient threshold as a percentile
Separate pixels with large gradients
Calculate image threshold as average intensity of high-gradient pixels
Ensure validity of threshold by comparing it to average image intensity
Apply threshold to U-Image to find barrels
Repeat on V-Image to find disks

A more detailed discussion and a summary of results for each test image can be found in the attached pages. Quantitative results are tabulated below, and show that this algorithm is extremely successful.

Images Examined	15
Barrels Found	8/8
Overall Barrel Success Rate	100%
Disks Found	13/13
Overall Disk Success Rate	100%

Algorithm #3

The third algorithm determines a threshold such that a given percentage of pixels remain after thresholding [Doyle 1962]. Therefore the algorithm will always provide false positive information when an object is not present. The routine found all objects when present, but since it cannot distinguish the presence of an object or not, it is not successful for our purposes. The complete algorithm is outlined as follows:

Load raw image components (LUV)
Smooth each component image
Generate histograms
Calculate threshold to keep given percentage of pixels

More detailed information with examples.

Segmentation Algorithm

The three algorithms discussed above perform pixel classification. That is, their output is an image in which pixels representing objects are white and the background is black. To make the results useful for a robot, these outputs have to be segmented so that pixels are grouped into meaningful regions of connected pixels. For the purpose of finding barrels, all regions that are greater than a certain minimum size are detected as barrels. For disks, any connected region of pixels is identified.
Our segmentation algorithm is a simple search through the images classified by the thresholding algorithms. Once an object pixel is found, the entire connected object region is enumerated. For barrels, the region is kept only if it is larger than 100 pixels. Finally, the centroid of the region is calculated as a simple measure of object location. A separate page shows cross hairs to identify object locations calculated by this algorithm for all the test images. These are the final results of this project.

Created Code

In order to implement the algorithms above, we were required to generate a significant amount of code. Unless otherwise noted, all of the code presented here is our own implementation, although we used function packages where available. This code can be grouped into 2 main areas:

AVP Code - This is code we wrote to interface with the AVP processor, and to convert LUV segmented images into more standard formats for display. All AVP code is in C/C++.
Matlab Code - Where possible, we tested and implemented our algorithms in Matlab to reduce coding and debugging time.

Comparisons and Conclusions

On the whole, we were pleasantly surprised with the success of our algorithms. The most important feature in this success is color segmentation of the images. This allowed us to isolate each type of object within its own 'image type' (U for barrels and V for disks), and thus ease the burden on the algorithms. Images with more types of objects (especially those of different colors) would conceivably be segmentable using these algorithms, but it would be more difficult and require more logic since more than one type of object would be found in each component image.
Concerning the algorithms themselves, each has its own associated advantages and disadvantages. Algorithm #2 clearly was the most accurate, with a 100% success rate. Even though it is a very robust algorithm, it is inherently limited in that it can only find one threshold. When we can assume that images contain only one type of object, this algorithm works well. However, it is not easily extensible to situations where multiple types of objects clutter the scene. This limitation becomes apparent in the test images where the barrels cast shadows, which are effectively another type of object.
Algorithm #1 is more flexible because it can detect several thresholds and can make better use of assumptions about the objects expected in the scene. For example, shadows appear in the same color component as barrels. When this happens, this algorithm can identify more than one local minimum in the histogram and thus pick thresholds that explicitly separate barrels and shadows. On the other hand the logic required to select among the possible thresholds is cumbersome, relying upon magic numbers and a previous knowledge of the types of objects expected. Another shortcoming of this algorithm is that it is difficult to find thresholds for small objects like the disks because detection relies upon finding an extremely small dip in the image histogram. That we could find them in most images is amazing in itself, but in the one case where it failed, we do not have an adequate explanation, since the disks were quite distinct. Thus, this algorithm should be used very carefully when trying to extract small objects from the image field.
Algorithm #3 was not well suited to the problem at hand. Because it uses a fixed percentage value to determine the threshold, it implicitly assumes that some objects of interest of known size will be present in the image. In our current case, however, this is not guaranteed, and for actual flights the number of images without objects generally exceeds the number with objects. Note that when the expected objects were present, however, the algorithm worked quite well. Applying some extra logic to this algorithm could significantly improve its suitability.
The motivation for this project is to identify robust object detection techniques that are suitable for real-time control of an autonomous helicopter. At this time, the algorithms are coded in Matlab and are not optimized for real-time performance. Currently, we feel that the performance improvement of these algorithms is not large enough compared to fixed thresholds to warrant the extra computational burden. However, as we encounter situations where more robustness is required, these algorithms, in an optimized form, would be useful. It would then make sense to apply algorithm #1 to the analysis of the U-Images to find barrels (these contain shadows which need to be separated from barrels) and algorithm #2 to the analysis of V-Images to find disks (which are small and make up only a small area in the image).

References

Doyle, W. 1962. "Operations useful for similarity-invariant pattern recognition" in J. Assoc. Comput. Mach., pp.259-267.
Katz, Y.H. 1965. "Pattern recognition of meteorological satellite cloud photography", in Proc. Third Symposium on Remote Sensing of Environment, Institute of Science and Technology, Univ. of Michigan, pp.173-214.
Nalwa, V. S. 1993. A Guided Tour of Computer Vision, Addison-Wesley Publishing Company, Menlo Park, CA.
Prewitt, J.M.S. 1970. "Object Enhancement and Extraction," in Picture Processing and Psychopictorics, B.S. Lipkin and A. Rosenfeld, Eds., Academic Press, New York, pp.75-149.
Prewitt, J.M.S. and Mendelsohn, M.L. 1966. "The analysis of cell images," in Ann. N.Y. Acad. Sci, pp.1035-1053.
Weska, J.S. 1978. "A Survey of Threshold Selection Techniques," Computer Graphics and Image Processing, Vol. 7, pp.259-265.