Morphological Operations

(Based on material from Digital Imaging: Theory and Applications, H. E. Burdick, McGraw-Hill, 1997)
While point and neighborhood operations are generally designed to alter the look or appearance of an image for visual considerations, morphological operations are used to understand the structure or form of an image. This usually means identifying objects or boundaries within an image. Morphological operations play a key role in applications such as machine vision and automatic object detection.
There are three primary morphological functions: erosion, dilation, and hit-or-miss. Others are special cases of these primary operations or are cascaded applications of them. Morphological operations are usually performed on binary images where the pixel values are either 0 or 1. For simplicity, we will refer to pixels as 0 or 1, and will show a value of zero as black and a value of 1 as white. While most morphological operations focus on binary images, some also can be applied to grayscale images.
It is important to introduce the concepts of segmentation and connectivity. Consider a binary image where the predominant field of white pixels is divided (or segmented) into two parts by a black line. In this image there are three segments: the top group of white pixels, the bottom group of white pixels, and the group of black pixels that form the dividing line. Another three-segment image would be one with an outer border of white pixels, the black pixels that form a square, and a group of white pixels within the square. In both cases, every pixel of a segment is directly adjacent to at least one other pixel of the same classification; the pixels of a segment are all connected.
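The two example images can be checked with a small flood-fill sketch. This is only an illustration, not a method taken from the source: the function name count_segments is hypothetical, and it assumes that "directly adjacent" means sharing an edge (north, south, east, or west).

```python
from collections import deque

def count_segments(image):
    """Count groups of same-valued pixels joined by edge-adjacent steps.

    Assumption: adjacency means N, S, E, W only (no diagonals).
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    segments = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            # Unvisited pixel: it starts a new segment; flood-fill the rest of it.
            segments += 1
            value = image[r][c]
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((0, 1), (-1, 0), (0, -1), (1, 0)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] == value):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return segments

# White field split by a black line.
split_field = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

# White border, black square, white interior.
boxed = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]

print(count_segments(split_field), count_segments(boxed))
```

Both images yield three segments under this adjacency rule.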
Most morphological functions operate on 3 x 3 pixel neighborhoods. The pixels in a neighborhood are identified in one of two ways, sometimes interchangeably. The pixel of interest lies at the center of the neighborhood and is labeled X. The surrounding pixels are referred to as either X0 through X7, or by their compass coordinates E, NE, N, NW, W, SW, S, and SE. A pixel is four-connected if at least one of its neighbors in positions X0, X2, X4, or X6 (E, N, W, or S) has the same value. The pixel is eight-connected if at least one of its eight neighbors has the same value. Under eight-connectivity, a set of pixels is said to be minimally connected if the loss of a single pixel causes the remaining pixels to lose connectivity.
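The neighbor numbering can be made concrete with a short sketch. The helper names below are hypothetical. The code assumes X0 through X7 run counterclockwise from east, as the compass list suggests, and it reads eight-connected in the usual sense: at least one of the eight neighbors shares the pixel's value. Positions outside the image are treated as matching nothing, which is also an assumption.

```python
# (row offset, column offset) for X0..X7: E, NE, N, NW, W, SW, S, SE.
NEIGHBOR_OFFSETS = [
    (0, 1), (-1, 1), (-1, 0), (-1, -1),
    (0, -1), (1, -1), (1, 0), (1, 1),
]

def neighbor_values(image, r, c):
    """Return the values at X0..X7; None marks a position outside the image."""
    rows, cols = len(image), len(image[0])
    values = []
    for dy, dx in NEIGHBOR_OFFSETS:
        y, x = r + dy, c + dx
        values.append(image[y][x] if 0 <= y < rows and 0 <= x < cols else None)
    return values

def is_four_connected(image, r, c):
    """True if X0, X2, X4, or X6 (E, N, W, or S) has the same value as X."""
    values = neighbor_values(image, r, c)
    return any(values[i] == image[r][c] for i in (0, 2, 4, 6))

def is_eight_connected(image, r, c):
    """True if any of X0..X7 has the same value as X."""
    return any(v == image[r][c] for v in neighbor_values(image, r, c))

# The center pixel touches another white pixel only diagonally (at NW).
diagonal = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
print(is_four_connected(diagonal, 1, 1), is_eight_connected(diagonal, 1, 1))
```

In this example the center pixel is eight-connected but not four-connected.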
Binary Erosion and Dilation
Erosion and dilation are related to convolution but are used more for logical decision-making than for numeric calculation. Like convolution, binary morphological operators such as erosion and dilation combine a local neighborhood of pixels with a pixel mask to achieve the result. Figure 6.3 shows this relationship. The output pixel, O, is set to either a hit (1) or a miss (0) based on the logical AND relationship.
Binary erosion uses the following for its mask:
1 1 1
1 1 1
1 1 1
This means that every pixel in the neighborhood must be 1 for the output pixel to be 1. Otherwise, the pixel will become 0. No matter what value the neighboring pixels have, if the central pixel is 0 the output pixel is 0. Just a single 0 pixel anywhere within the neighborhood will cause the output pixel to become 0. Erosion can be used to eliminate unwanted white noise pixels from an otherwise black area. The only condition in which a white pixel will remain white in the output image is if all of its neighbors are white. The effect on a binary image is to diminish, or erode, the edges of a white area of pixels.
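A minimal sketch of this rule follows. The function name erode is an illustrative choice, and treating pixels beyond the image border as 0 is an assumption; the text does not say how borders are handled.

```python
def erode(image):
    """Binary erosion with the all-1 3 x 3 mask.

    Assumption: positions outside the image count as 0, so border
    pixels always erode to 0.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Logical AND over the nine pixels of the neighborhood.
            out[r][c] = int(all(
                0 <= r + dy < rows and 0 <= c + dx < cols
                and image[r + dy][c + dx] == 1
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            ))
    return out

# A 3 x 3 white block plus one stray white noise pixel in the corner.
block = [
    [1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
for row in erode(block):
    print(row)
```

The noise pixel disappears and the block shrinks to its single interior pixel, at row 2, column 3 (counting from 0).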
Dilation is the opposite of erosion. Its mask is:
0 0 0
0 0 0
0 0 0
This mask will make white areas grow, or dilate. The same matching rules used for erosion apply to dilation, but the logic is inverted: use the NAND rather than the AND logical operation, so that a neighborhood matching the all-0 mask produces a 0 and any other neighborhood produces a 1. Being the opposite of erosion, dilation will allow a black pixel to remain black only if all of its neighbors are black. This operator is useful for removing isolated black pixels from an image.
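The dilation rule can be sketched in the same way. The function name dilate is illustrative, and pixels beyond the border are again assumed to be 0.

```python
def dilate(image):
    """Binary dilation with the all-0 3 x 3 mask.

    The output is 0 only when the whole neighborhood is 0; otherwise it is 1.
    Assumption: positions outside the image count as 0.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            out[r][c] = int(any(
                0 <= r + dy < rows and 0 <= c + dx < cols
                and image[r + dy][c + dx] == 1
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            ))
    return out

# A single white pixel grows into a 3 x 3 block.
speck = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
grown = dilate(speck)

# An isolated black pixel inside a white area is filled in.
pinhole = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
]
filled = dilate(pinhole)
```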
Other functions can be performed using erosion and dilation as their basic operation. One of these is outlining. It is possible to perform a single erosion operation and then subtract the resultant image from the original. The result will be an image that shows a one-pixel outline of all objects. If two erosions are performed before the subtraction, a two-pixel outline is created. If desired, a dilation operation can be performed before the erosion to clear up any unwanted "holes" in the white areas, which may produce a cleaner outline image. This step is optional because, while making the image cleaner, it might also affect the borders of objects in the original image.
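The outlining procedure can be sketched as erode-then-subtract. The names erode and outline are illustrative, the passes parameter is a hypothetical convenience for the one-pixel and two-pixel cases, and pixels beyond the border are assumed to be 0.

```python
def erode(image):
    """Binary erosion with the all-1 3 x 3 mask (outside the image counts as 0)."""
    rows, cols = len(image), len(image[0])
    return [
        [
            int(all(
                0 <= r + dy < rows and 0 <= c + dx < cols
                and image[r + dy][c + dx] == 1
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            ))
            for c in range(cols)
        ]
        for r in range(rows)
    ]

def outline(image, passes=1):
    """Subtract the image eroded `passes` times from the original image."""
    eroded = image
    for _ in range(passes):
        eroded = erode(eroded)
    # An eroded pixel is never 1 where the original was 0, so the
    # pixelwise difference is always 0 or 1.
    return [
        [orig - ero for orig, ero in zip(orig_row, ero_row)]
        for orig_row, ero_row in zip(image, eroded)
    ]

square = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
ring = outline(square)
for row in ring:
    print(row)
```

With a single pass, the interior pixel of the square is removed and only its one-pixel border remains.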
Binary Hit-or-Miss Operators
Two operator masks have been discussed so far, one filled with 1's to perform erosion and another filled with 0's to perform dilation. There are other masks that could be useful for other types of conditional processing. For example, the following masks can be used to check to see if a pixel is four-connected to its neighbors:
0 0 0    0 1 0    0 0 0    0 0 0
0 1 1    0 1 0    1 1 0    0 1 0
0 0 0    0 0 0    0 0 0    0 1 0
A similar set of masks can be used to check for eight-connectivity. Bridges, which are defined to be single-pixel connections between groups of similar pixels, can be identified by the following masks:
1 0 1    1 1 1
1 1 1    0 1 0
1 0 1    1 1 1
There also are masks that check for corners or interior pixels or other conditions.
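A hit-or-miss pass over a set of masks might look like the sketch below. It assumes that a hit requires the neighborhood to match a mask exactly at all nine positions (the text does not mention "don't care" positions), and that pixels beyond the border are 0. The names neighborhood and hit_or_miss are illustrative; the four masks are the four-connectivity masks listed above.

```python
# The four four-connectivity masks, as tuples of rows.
FOUR_CONNECTED_MASKS = [
    ((0, 0, 0), (0, 1, 1), (0, 0, 0)),  # connected to the east
    ((0, 1, 0), (0, 1, 0), (0, 0, 0)),  # connected to the north
    ((0, 0, 0), (1, 1, 0), (0, 0, 0)),  # connected to the west
    ((0, 0, 0), (0, 1, 0), (0, 1, 0)),  # connected to the south
]

def neighborhood(image, r, c):
    """Return the 3 x 3 neighborhood of (r, c) as a tuple of row tuples."""
    rows, cols = len(image), len(image[0])
    return tuple(
        tuple(
            image[r + dy][c + dx]
            if 0 <= r + dy < rows and 0 <= c + dx < cols else 0
            for dx in (-1, 0, 1)
        )
        for dy in (-1, 0, 1)
    )

def hit_or_miss(image, masks):
    """Output 1 (hit) where the neighborhood equals any mask, else 0 (miss)."""
    return [
        [int(neighborhood(image, r, c) in masks) for c in range(len(image[0]))]
        for r in range(len(image))
    ]

pair = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
hits = hit_or_miss(pair, FOUR_CONNECTED_MASKS)
```

Here the left white pixel matches the east mask and the right one matches the west mask, so both are hits.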
Performing multiple passes on the same image to check for every possible condition of interest can become time-consuming. To solve this problem, a concept can be borrowed from the image point operators: look-up tables. Because each pixel in a binary image is either one or zero, it can be treated as a bit that is grouped with the other pixels in the neighborhood to form a numerical value. The neighborhood of 9 binary pixels becomes a 9-bit number that can be used as an index into a look-up table to determine if the output pixel should be a hit or a miss. This table is known as a 9-to-1 LUT since the 9-bit input value results in a 1-bit output value. The table has 2^9 = 512 entries, the number of possible conditions of the 3 x 3 binary pixel neighborhood.
Obviously, the challenge of using this technique is generating the proper look-up table, because all possible conditions of pixel neighborhoods must be considered. Once this task is completed, however, the resultant processing is much faster.
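One way to build and apply a 9-to-1 LUT is sketched below. The bit order (row-major, top-left pixel as the most significant bit) and the function names are assumptions made for illustration; any fixed order works as long as table construction and lookup agree. Pixels beyond the border are assumed to be 0.

```python
def neighborhood_index(image, r, c):
    """Pack the 3 x 3 neighborhood of (r, c) into a 9-bit index (0..511)."""
    rows, cols = len(image), len(image[0])
    index = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            y, x = r + dy, c + dx
            bit = image[y][x] if 0 <= y < rows and 0 <= x < cols else 0
            index = (index << 1) | bit
    return index

def build_lut(rule):
    """Evaluate `rule` once for each of the 512 possible neighborhoods."""
    lut = []
    for index in range(512):
        # Unpack in the same order used by neighborhood_index.
        bits = [(index >> (8 - i)) & 1 for i in range(9)]
        hood = (tuple(bits[0:3]), tuple(bits[3:6]), tuple(bits[6:9]))
        lut.append(int(rule(hood)))
    return lut

def apply_lut(image, lut):
    """One pass over the image: one table lookup per pixel."""
    return [
        [lut[neighborhood_index(image, r, c)] for c in range(len(image[0]))]
        for r in range(len(image))
    ]

# Erosion and dilation expressed as look-up tables.
erosion_lut = build_lut(lambda hood: all(all(row) for row in hood))
dilation_lut = build_lut(lambda hood: any(any(row) for row in hood))

square = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
eroded = apply_lut(square, erosion_lut)
```

Several mask tests can be folded into a single rule function, so the per-pixel cost stays at one lookup however many conditions the table encodes.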
Pipelined Processing
The morphological operators described so far are performed by applying a single 3 x 3 pixel mask. There are others, such as shrinking, thinning, and skeletonization, for which 3 x 3 will not suffice. A 5 x 5 mask is needed to perform these functions. But a 5 x 5 binary neighborhood has 2^25, or over 33 million, possible patterns that would have to be considered for each pixel! A far more efficient method is to use a two-stage pipeline processing technique, with both stages using 3 x 3 masks. The first stage processes the image, checking for pixels that might be operated upon, and generates a new binary image that marks the likely candidates. The second stage then uses the original binary image and the marked image to determine whether each pixel is a hit or a miss for the desired function. The look-up table method of processing is used, so these checks become very fast. The result is the equivalent of 33 million checks per pixel performed in two passes of a look-up table.
Shrinking will reduce objects in a binary image to a single point located at the geometric center of the object. This can be thought of as finding the center of mass of an object. For objects that do not have holes in them, a single point is generated. If there is a hole, the process will produce a ring of pixels that surrounds the hole and is equidistant from the nearest boundary.
The thinning function is similar to shrinking, except that thinning generates a minimally connected line that is equidistant from the boundaries. Some of the structure of the object is maintained. Thinning also is useful when the binary sense of the image is reversed, creating black objects on a white background. If the thinning function is used on this reversed image, the results are minimally connected lines that form equidistant boundaries between the objects.
Skeletonization also is similar to thinning, except that it maintains more information about the internal structure of objects. The classic way to think about skeletonization is to set fire (mentally, of course) to pixels around the outer edge of an object simultaneously. As the fire burns inward toward the center of the object, eventually it will meet burning pixels from the opposite direction. When two opposing fires meet they extinguish one another, leaving behind a single (or double) pixel boundary, or skeleton, of the object.
Grayscale Morphological Operations
While morphological operations usually are performed on binary images, some processing techniques also apply to grayscale images. These operations are for the most part limited to erosion and dilation. Grayscale erosions and dilations produce results identical to the nonlinear minimum and maximum filters.
The minimum operator will interrogate a 3 x 3 (or any other size) neighborhood and select the smallest pixel value to become the output value. This has the effect of causing the bright areas of an image to shrink, or erode. Similarly, grayscale dilation is performed by using the maximum operator to select the greatest value in a neighborhood.
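The minimum and maximum operators can be sketched with one shared helper. The function names are illustrative. As an assumption, the neighborhood is clipped at the image border, so border pixels use only the neighbors that exist.

```python
def gray_morph(image, pick):
    """Apply `pick` (min or max) over each pixel's 3 x 3 neighborhood.

    Assumption: the neighborhood is clipped at the image border.
    """
    rows, cols = len(image), len(image[0])
    return [
        [
            pick(
                image[y][x]
                for y in range(max(r - 1, 0), min(r + 2, rows))
                for x in range(max(c - 1, 0), min(c + 2, cols))
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]

def gray_erode(image):
    """Grayscale erosion: the smallest value in each neighborhood."""
    return gray_morph(image, min)

def gray_dilate(image):
    """Grayscale dilation: the greatest value in each neighborhood."""
    return gray_morph(image, max)

# One bright pixel on a dark background.
spot = [
    [10, 10, 10],
    [10, 200, 10],
    [10, 10, 10],
]
darker = gray_erode(spot)     # the bright spot is eroded away
brighter = gray_dilate(spot)  # the bright spot spreads to every neighbor
```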
Morphological functions that are based on hit-or-miss processing, such as thinning and skeletonization, do not translate well to grayscale images.