Bio-inspired Image Processing.

3. Object Recognition in Natural Environments.

     The recognition of objects in natural environments is a very difficult task, mainly because of the wide variety of possible objects and the impossibility of controlling external conditions. One example of such an environment is the interior of sewage pipes. At the FhG-IPK a system for the automatic visual inspection of sewage pipes was developed [Ruiz-del-Solar and Köppen, 1996]. Two important tasks to be performed in this system are the automatic detection of the pipes' sockets and the segmentation of internal wall images. In section 3.1 a neural architecture for the automatic detection of sockets in video images, the SBCS-Architecture, is presented. The FUZZ-GIDO, an operator for the detection of symmetrical objects that is based on the same biological principles as the FHAAR-Architecture, is described in section 3.2. The detection of edges is a fundamental part of the process of image segmentation. A fuzzy-based architecture, the FHAAR-Architecture, was developed to perform this task in highly variable environments. In section 3.3 the FHAAR-Architecture and its application to the segmentation of internal wall images of pipe scenes are described. It should be stressed that the main feature of these architectures and of the FUZZ-GIDO is their robustness.

3.1. Pipe's Socket Detection using the SBCS-Architecture.

     Sewage pipes have to be inspected periodically for economic, environmental and legal reasons. The small diameter of the pipes does not allow direct human inspection. Visual inspection through the processing of images of the pipe interior is an established way to perform this task. The inspection is based on a video signal taken by a CCD camera mounted on a remote-controlled canal robot, which moves through the inside of the pipes (see figure 3.1). Automating the visual inspection process saves human time and effort and can provide accurate, objective, and reproducible results. Additionally, automation can eliminate human errors resulting from fatigue or lack of concentration. A system for the automation of sewage pipe inspection was developed (see description in [Ruiz-del-Solar and Köppen, 1996]).

Fig. 3.1. Sewage pipes' inspection system.

     The automated inspection system works as follows: the camera-car moves through the pipes and looks for the pipes' sockets, because most of the pipes' faults are located in the area surrounding a socket. Each time a socket is detected, the camera-car films its surrounding area. This information is later analyzed off-line. From the above description it is clear that the automatic socket detection must be performed in real time. The detection system must also be very robust because of the variable environmental conditions inside the pipes (variable illumination, lack of equidistance between the sockets, presence of physical obstacles such as solid substances and water, etc.). To implement the detection of sockets a very robust, neural-based architecture, the SBCS-Architecture, was designed. The mechanisms used in the architecture are motivated and justified by evidence from psychophysics and neurophysiology. They were adapted and simplified taking into account the main system characteristics, which are real-time processing, variable environmental conditions inside the pipes, and some a priori knowledge of the physical system properties (geometry of the pipes, CCD camera, and camera-car). The block diagram of the SBCS-Architecture is shown in figure 3.2. This architecture is composed of three subsystems: PSS (Preattentive Segmentation Subsystem), ORS (Object Recognition Subsystem), and FS (Foveation Subsystem). The PSS segments the input image or, more exactly, a reduced image obtained from the original one. It has two inputs, the VIS (Video Input Signal) and the PFS (Parameter Feedback Signal), which is a signal coming from the ORS that allows adjustment of local parameters. The ORS detects the pipes' sockets taking as input the output of the segmentation process (SOS - Segmentation Output Signal). Finally, the FS keeps the camera focus centered in relation to the main axis of the pipes. It receives an input signal (SMS - Spatial Mapping Signal) from the PSS and sends the FFS (Foveation Feedback Signal) to the camera-car.

Fig. 3.2. Block diagram of the SBCS-Architecture.

3.1.1. Preattentive Segmentation Subsystem (PSS).

     The Preattentive Segmentation Subsystem (PSS) is formed by three modules (see the block diagram shown in figure 3.3), called: SCLM (Spatial Complex Logarithmic Mapping), DOI (Discount of Illuminant), and SBCS (Simplified Boundary Contour System).

3.1.2. SCLM (Spatial Complex Logarithmic Mapping).

     The SCLM module performs a complex logarithmic mapping of the input signal. This mapping is implemented using the Polar-Logarithmic Transformation (explained in section 7.1). The SCLM module takes advantage of the circular symmetry of the system by focusing the analysis on circular image segments, which allows a significant reduction of the data to be processed. This data reduction is produced by the logarithmic sampling of the input signal in the radial direction and by the constant sampling (the same number of points is taken) in each angular sector to be transformed. Additionally, the complex logarithmic mapping provides an invariant representation of the objects, because rotations and scalings of the input signal are transformed into translations [Schwartz, 1980], which can be easily compensated.
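The sampling scheme described above can be illustrated with a minimal sketch. It is not the SCLM implementation: the function name `log_polar_map`, its parameters, and the nearest-neighbour sampling are assumptions made for illustration only. Radii grow by a constant ratio (logarithmic sampling) and every angular sector receives the same number of samples.

```python
import math

def log_polar_map(image, center, r_min, r_max, n_rings, n_sectors):
    """Sample `image` (a list of rows) on a log-polar grid.

    Rows of the result index the angle, columns index the log-radius.
    Nearest-neighbour sampling; points outside the image read as 0.
    """
    cx, cy = center
    height, width = len(image), len(image[0])
    # Constant ratio between consecutive radii (logarithmic radial sampling).
    growth = (r_max / r_min) ** (1.0 / (n_rings - 1))
    out = []
    for a in range(n_sectors):
        theta = 2.0 * math.pi * a / n_sectors
        row = []
        for k in range(n_rings):
            r = r_min * growth ** k
            x = int(round(cx + r * math.cos(theta)))
            y = int(round(cy + r * math.sin(theta)))
            inside = 0 <= x < width and 0 <= y < height
            row.append(image[y][x] if inside else 0)
        out.append(row)
    return out

# Synthetic test image: a bright ring centred on the optical axis.
size = 65
ring = [[1 if 6 <= math.hypot(x - 32, y - 32) <= 11 else 0 for x in range(size)]
        for y in range(size)]
mapped = log_polar_map(ring, (32, 32), r_min=2, r_max=32, n_rings=5, n_sectors=16)
```

In the mapped image the ring becomes a single straight column: every angular row has the same content, which is the kind of regularity a socket produces when the camera is centered on the pipe axis.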

Fig. 3.3. Preattentive segmentation subsystem (PSS).

3.1.3. DOI - Discount of Illuminant.

     In this stage variable illumination conditions are discounted by a shunting on-center off-surround network (defined in [Grossberg, 1983]), which models the receptive-field response of the ganglion cells of the retina. As a consequence of the discounting process, image regions of high relative contrast are amplified and regions of low relative contrast are attenuated.
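The discounting effect can be illustrated with the equilibrium response of a generic shunting on-center off-surround cell, x = (B·C − D·S) / (A + C + S), where C is the center input and S the surround input. The sketch below is a one-dimensional simplification under stated assumptions: the center is the pixel itself, the surround is the mean of its two neighbours, and the constants A, B, D are hypothetical. The kernels and constants of the DOI module are not given in the text.

```python
def shunting_on_center_off_surround(signal, a=1.0, b=1.0, d=1.0):
    """Equilibrium response of a 1-D shunting on-center off-surround network.

    Center input: the pixel itself. Surround input: mean of the two
    neighbours (borders are replicated). a, b, d are hypothetical constants.
    """
    n = len(signal)
    out = []
    for i, center in enumerate(signal):
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, n - 1)]
        surround = 0.5 * (left + right)
        out.append((b * center - d * surround) / (a + center + surround))
    return out

# The same step edge under dim and under ten times brighter illumination.
dim = [10, 10, 10, 30, 30, 30]
bright = [10 * v for v in dim]
resp_dim = shunting_on_center_off_surround(dim)
resp_bright = shunting_on_center_off_surround(bright)
```

Because the input appears in both numerator and denominator, the two responses are almost identical although the illumination differs by a factor of ten, and uniform regions give a zero response.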

3.1.4. SBCS - Simplified Boundary Contour System.

     The SBCS module corresponds to a simplified and modified version of the Boundary Contour System (BCS) developed at Boston University [Grossberg and Mingolla, 1985]. The BCS model is based primarily on psychophysical data related to perceptual illusions. Its processing stages are linked to stages in the visual pathway: LGN Parvo->Interblob->Interstripe->V4 (see description in [Nicholls et al., 1992]). The BCS model generates emergent boundary segmentations that combine edge, texture, and shading information. The BCS operations occur automatically and without learning or explicit knowledge of the environment. The system performs an orientational decomposition of the input data, followed by short-range competitive and long-range cooperative interactions among neurons. The competitive interactions combine information from different positions and different orientations, and the cooperative interactions allow edge completion. The BCS has been shown to be robust and has been successfully used in different real applications such as the processing of synthetic aperture radar images [Cruthirds et al., 1992], the segmentation of magnetic resonance brain images [Lehar et al., 1990; Worth, 1993], and the segmentation of images of pieces of meat in an industrial environment [Díaz Pernas, 1993].

     In general the standard BCS algorithm requires a significant amount of execution time that does not allow its utilization in our real time application. For that reason the SBCS was developed. It uses monocular processing, only the "ON" processing channel, a single spatial scale, and three orientations (a description of these characteristics can be found in [Grossberg, 1994]). Each processing stage of the SBCS model is explained in the following paragraphs.

Oriented Filtering Stage (S1).

     Two-dimensional Gabor filters are used as oriented filters. These filters model the receptive fields of simple and complex cells in the visual cortex. Only odd-symmetric filters, which respond optimally to differences of average contrast across their axis of symmetry, are used. Taking into account the circular symmetry of the image and its subsequent logarithmic mapping, only three oriented filters are used (see figure 3.4).
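As an illustration, an odd-symmetric Gabor mask can be generated as a Gaussian envelope multiplied by a sine carrier. This is a generic sketch: the function name `odd_gabor_mask`, the mask size, the orientation, and the values of sigma and wavelength are assumptions, since the parameters of the three masks of figure 3.4 are not given in the text.

```python
import math

def odd_gabor_mask(size, theta, sigma, wavelength):
    """Odd-symmetric Gabor mask: Gaussian envelope times a sine carrier.

    `size` must be odd; `theta` is the direction of modulation in radians.
    """
    half = size // 2
    mask = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Coordinate along the direction of modulation.
            u = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
            row.append(envelope * math.sin(2.0 * math.pi * u / wavelength))
        mask.append(row)
    return mask

# Hypothetical parameters, chosen only for the example.
mask = odd_gabor_mask(7, math.pi / 3, sigma=2.0, wavelength=6.0)
```

The mask changes sign under point reflection about its center, so it sums to zero and responds to a contrast difference across its axis rather than to the mean intensity.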

Fig. 3.4. Oriented filter masks.

First Competitive Stage (S2).

     Cells in this stage compete across spatial position within their own orientation plane. This is done in the form of a standard shunting equation with two additional terms, a tonic input (T) and a feedback signal (V) that comes from a later stage (Feedback Stage). Our modified dynamic shunting equation is given by:


At equilibrium, this equation is determined by:


where W is the output of this stage; J is the output of the Oriented Filtering Stage; G is a Gaussian mask; k is the orientation index; p, q, i, j are position indexes; and A, B, C are constants.

Second Competitive Stage (S3).

     At this stage competition takes place only across the orientation dimension, i.e. cells compete with other cells that have the same position but a different orientation. As in [Lehar et al., 1990], the equilibrium condition for this dynamic competition is simulated by finding the maximal response across the orientation planes for each image position, and by multiplying all non-maximal values by a suppression factor.
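The simulated equilibrium described above can be sketched as follows. The data layout (`planes[k][i][j]`) and the value of the suppression factor are assumptions made for the example.

```python
def orientation_competition(planes, suppression=0.1):
    """Keep the maximal response across orientations, damp all others.

    planes[k][i][j] is the response of orientation k at position (i, j).
    `suppression` is a hypothetical value for the suppression factor.
    """
    n_k = len(planes)
    rows, cols = len(planes[0]), len(planes[0][0])
    out = [[[0.0] * cols for _ in range(rows)] for _ in range(n_k)]
    for i in range(rows):
        for j in range(cols):
            values = [planes[k][i][j] for k in range(n_k)]
            peak = max(values)
            for k in range(n_k):
                factor = 1.0 if values[k] == peak else suppression
                out[k][i][j] = values[k] * factor
    return out

# Three orientation planes of a 1x2 image.
planes = [[[0.2, 0.9]], [[0.8, 0.1]], [[0.4, 0.5]]]
competed = orientation_competition(planes, suppression=0.5)
```

At the first position the second orientation wins and is left unchanged, while the other two responses are halved.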

Oriented Cooperation Stage (S4).

     The oriented cooperation is performed in each orientation channel by bipole cells that act as long-range statistical AND-gates. Unlike the standard BCS model, we use bipole cells whose receptive fields have collinear and non-collinear branches (see figure 3.5). These receptive fields have properties consistent with the spatial relatability property (deduced from studies performed by [Kellman and Shipley, 1991]), which indicates that two boundaries can support an interpolation between themselves when their extensions intersect at an obtuse or right angle. The cooperation in the left-half receptive field (Lijk) is performed among neighboring cells with the same orientation, and the cooperation in the right-half receptive field (Rijk) among neighboring cells across all the orientations. At equilibrium, the output from this stage is defined by:






with Fpqijk the receptive field kernel; Bpqijk and Cpqijk the rotated coordinates; Θpqij the direction of position (p,q) with respect to the position (i,j); Y the output of the Second Competitive Stage; r0k and rmaxk the ranges of relatable orientations; D, E, P, and R constants; and Δθ the angle between orientations.

Fig. 3.5. Bipole cells with collinear and non-collinear branches. The bipole cells (*) are not used.

Feedback Stage (S5).

     Before cooperative signals are sent to the first competitive stage, a competition across the orientation dimension, and a competition across spatial position take place in order to pool and sharpen the signals that are fed back. Both competitions are implemented in one processing step.

3.1.5. Results.

     As an example of the system processing capabilities, figure 3.6a shows a sewage pipe image. In this image one can see a socket (in white). Figure 3.6b shows the SCLM-Module output. It can be seen that the spatial mapping allows a great data reduction (more than 10 times) that produces an equivalent reduction in the processing time. Figure 3.6c shows the Segmentation Output Signal. It can be seen that the input image (more exactly the transformed image) is segmented in two areas: a white area, which corresponds to the socket area, and a black one, which corresponds to the rest of the image. In the upper right quadrant one can see noise that corresponds to some geometrical distortion produced by the mapping, but does not disturb the sockets' detection. The segmented image is the input of the ORS where the pipes' sockets are finally detected. That is performed by a SOM (Self-Organizing Map) network.
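The size of this reduction can be checked from the image sizes given in the caption of figure 3.6:

$$\frac{376 \times 288}{128 \times 64} = \frac{108288}{8192} \approx 13.2$$

so the mapped image contains roughly 13 times fewer pixels than the input image.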




Fig. 3.6. (a) Input Image (376x288 pixels); (b) Spatial mapping signal (128x64 pixels);
(c) Segmentation output signal (128x64 pixels).

3.2. Fuzzy-based Detection of Symmetrical Objects in real-world Scenes.

     Since the development of pyramidal algorithms by Burt in the early eighties [Burt, 1981], many researchers have been working on applying the multiresolution principle to the development of image processing systems. As an example, Daugman developed a system for iris identification that uses a circular edge detector to locate the inner and outer boundaries of the iris [Daugman, 1993]. This detector is based on the use of integrodifferential operators to search, at different resolutions, for circular structures over the image domain. The circular detector of Daugman was generalized by Ruiz-del-Solar to detect symmetrical real-world objects such as pipes, valves and danger signals [Ruiz-del-Solar, Nowack and Nickolay, 1996; Ruiz-del-Solar, 1997]. An additional organizational principle present in the visual system is the variable sensitivity of the photoreceptors, which depends on the luminance of the background and allows the invariant perception of objects under different lighting conditions [Spillman and Werner, 1990]. Pal and Mukhopadhyay developed a fuzzy-based edge detector which makes use of this organizational principle [Pal and Mukhopadhyay, 1996]. This section describes the FUZZ-GIDO, an operator for the detection of symmetrical objects, which is based on the generalized Daugman's detector developed by Ruiz-del-Solar and uses the fuzzy-based concepts introduced by Pal and Mukhopadhyay. The main characteristics of this new detector are its robustness and its processing speed.

3.2.1 The FUZZ-GIDO Operator.

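     The circular edge detector proposed by Daugman is defined by [Daugman, 1993]:

$$\max_{(r,\,x_0,\,y_0)} \left| G_\sigma(r) * \frac{\partial}{\partial r} \oint_{r,\,x_0,\,y_0} \frac{I(x,y)}{2\pi r}\, ds \right|$$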


     This integrodifferential operator is based on the application of three operations: (a) a normalized contour integral along a circular arc over the pixels of the input image I(x,y), with (x0,y0) as center coordinates and r as radius, (b) a partial derivative of the contour integral with respect to the radius, and (c) a convolution (blurring) of the differentiation result with a smoothing Gaussian of scale σ, Gσ(r). All these operations are implemented discretely. To speed up the calculations, the order of convolution and differentiation is interchanged.
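A minimal discrete sketch of the three operations, for a fixed center and a list of candidate radii, is given below. It is an illustration under stated assumptions, not the implementation used by the authors: the function names, the number of contour samples, the nearest-neighbour sampling and the border handling of the smoothing are all chosen for the example, and the operations are applied in the order (a), (b), (c) rather than in the interchanged order.

```python
import math

def circle_mean(image, x0, y0, r, n_samples=64):
    """(a) Normalized contour integral: mean intensity on the circle of radius r."""
    total = 0.0
    for k in range(n_samples):
        t = 2.0 * math.pi * k / n_samples
        x = int(round(x0 + r * math.cos(t)))
        y = int(round(y0 + r * math.sin(t)))
        total += image[y][x]
    return total / n_samples

def daugman_radius(image, x0, y0, radii, sigma=1.0):
    """Return (radius, strength) of the strongest circular edge around (x0, y0)."""
    means = [circle_mean(image, x0, y0, r) for r in radii]
    # (b) Discrete derivative with respect to the radius.
    diffs = [means[i + 1] - means[i] for i in range(len(means) - 1)]
    # (c) Smoothing with a truncated, normalized Gaussian of scale sigma.
    half = max(1, int(3 * sigma))
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-half, half + 1)]
    norm = sum(kernel)
    best_r, best_val = None, -1.0
    for i in range(len(diffs)):
        acc = 0.0
        for k in range(-half, half + 1):
            j = min(max(i + k, 0), len(diffs) - 1)  # replicate the borders
            acc += kernel[k + half] * diffs[j]
        val = abs(acc / norm)
        if val > best_val:
            # diffs[i] lies between radii[i] and radii[i + 1]; report the inner one.
            best_r, best_val = radii[i], val
    return best_r, best_val

# Synthetic image: a dark disc of radius 10 on a bright background.
disc = [[0 if math.hypot(x - 20, y - 20) <= 10 else 255 for x in range(41)]
        for y in range(41)]
radius, strength = daugman_radius(disc, 20, 20, list(range(3, 18)))
```

For the synthetic disc the maximum is found at the disc boundary. A full detector would additionally search over the center coordinates (x0, y0).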

     By considering not only circular contours in the calculation of the line integral, the IDO can be generalized to detect symmetrical objects. In this case the contour integral is calculated over a domain D, defined by a function f() (analytical or non-analytical) that specifies the shape of the symmetrical object to be detected (see figure 3.7).

Fig. 3.7. Parameters of the GIDO (after [Ruiz-del-Solar, 1997]).

     This generalized IDO (GIDO) uses a scale factor s (instead of the radius) and the center coordinates (x0,y0) as parameters. It is (discretely) defined by [Ruiz-del-Solar, 1997]:


     The GIDO was improved to allow the invariant perception of objects under different lighting conditions. As mentioned above, the perceived contrast in an image depends on its local background intensity (luminance of the background). For this reason a new fuzzy-based operator for symmetrical object detection (FUZZ-GIDO), which is based on the GIDO and takes into account the local background intensity, was proposed in [Ruiz-del-Solar, 1998b].

     A fuzzy reasoning system with the generalized edge detector (GIDO) and the background intensity (B) as antecedent variables is implemented. The consequent variable (output) of this system is the proposed FUZZ-GIDO. The linguistic variables used are: PB (positive big), PS (positive small), MED (medium) and ZE (zero). Triangular fuzzy sets are used for all the variables. The IF-THEN rules employed are coded in Table 1.


Table 1. Codification of the 12 IF-THEN rules used (after [Pal and Mukhopadhyay, 1996]).

The background intensity (B) is calculated over the same domain D used to calculate the GIDO (see figure 3.7), and given by:


The defuzzification is performed by:


where ci represents the center of mass of the normalized triangular fuzzy sets of the FUZZ-GIDO linguistic variable, and λi corresponds to the firing strength of each rule, calculated as the minimum between the membership functions of the GIDO and the B linguistic variables.
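The reasoning scheme can be sketched as follows, assuming the usual weighted-average defuzzification, i.e. the output is the sum of λi·ci divided by the sum of λi. Everything concrete in the sketch is hypothetical and only serves the illustration: the 0 to 255 scale, the positions of the triangular sets, the use of the triangle peaks as centers of mass, and in particular the 12 rules in `RULES`, which are not the rules of Table 1.

```python
# Triangular fuzzy sets as (left, peak, right) on a 0..255 scale (hypothetical).
SETS = {
    "ZE": (-85, 0, 85),
    "PS": (0, 85, 170),
    "MED": (85, 170, 255),
    "PB": (170, 255, 340),
}

# Hypothetical rule base: (GIDO term, B term) -> output term.
RULES = {
    ("ZE", "PS"): "ZE", ("ZE", "MED"): "ZE", ("ZE", "PB"): "ZE",
    ("PS", "PS"): "MED", ("PS", "MED"): "PS", ("PS", "PB"): "ZE",
    ("MED", "PS"): "PB", ("MED", "MED"): "MED", ("MED", "PB"): "PS",
    ("PB", "PS"): "PB", ("PB", "MED"): "PB", ("PB", "PB"): "MED",
}

def triangle(x, left, peak, right):
    """Membership degree of x in a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def fuzzy_output(gido, background):
    """Weighted average of the output-set centers, weighted by firing strength."""
    num = den = 0.0
    for (g_term, b_term), out_term in RULES.items():
        # Firing strength: minimum of the two antecedent memberships.
        strength = min(triangle(gido, *SETS[g_term]),
                       triangle(background, *SETS[b_term]))
        num += strength * SETS[out_term][1]  # peak used as center of mass
        den += strength
    return num / den if den > 0 else 0.0
```

With these hypothetical rules the same edge response of 85 yields an output of 170 on a dark background (B = 85) and 0 on a bright background (B = 255), which mimics the dependence of perceived contrast on the background luminance.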

3.2.2. The Multiresolution Algorithm for the FUZZ-GIDO.

     The algorithm looks for the objects in the image by applying the FUZZ-GIDO over all possible combinations of center coordinates and scale factors. This search is performed at different resolutions by using the concept of a virtual pixel (a group of real pixels). The FUZZ-GIDO is computed at each virtual pixel position. All the operations involved in this computation are performed over real pixels. The simultaneous use of virtual pixels (search) and real pixels (calculations) makes it possible to achieve a high processing speed without a loss of precision. The algorithm iterates from coarse to fine resolutions by decreasing the size of the virtual pixels. At each iteration step two Regions of Interest (ROIs) are defined, a square one for the center coordinates and a linear one for the scale factor [Ruiz-del-Solar, Nowack and Nickolay, 1996]. Inside these ROIs all possible combinations of center coordinates and scale factors are used to find the maximum value of the FUZZ-GIDO. This maximum value is used to decide on the possible presence of an object in the ROI. After the maximum value of the FUZZ-GIDO is found, the size of the virtual pixels is halved, and new ROIs are defined around the virtual pixel position (in (x,y,s) space) of the maximum. The algorithm stops when the virtual pixels have the same size as the real ones. Typical values for the initial size of the virtual pixels are 4 or 8, which means that only three or four iterations are necessary to perform the search for an object. The pseudocode of the algorithm is given by:

  • Initialization:
    • Construction of the virtual pixels (size is application dependent). Typical values are 4 or 8.
    • Definition of the starting ROIs (size is application dependent), a square one for the center coordinates and a linear one for the scale factor.
  • While (size of the virtual pixels ≥ 1) {
    • Application of the FUZZ-GIDO in each virtual pixel position of the ROIs. The FUZZ-GIDO is computed over real pixels.
    • Calculation of the virtual pixel position (center coordinates and scale factor combination) corresponding to the maximum value of the FUZZ-GIDO.
    • Definition of new ROIs around the virtual pixel position of the maximum (extension of the linear ROI: 3 virtual pixels; extension of the square ROI: 3x3 virtual pixels).
    • Reduction by a factor of 2 of the size of the virtual pixels.
  }
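The coarse-to-fine search can be sketched in Python as follows. This is an interpretation under stated assumptions, not the original implementation: `score` stands for any function that evaluates the FUZZ-GIDO at real-pixel coordinates (x, y, s), the initial virtual pixel size is a power of two, the starting ROIs cover the full given ranges, and each new ROI spans one old virtual pixel on either side of the maximum, sampled at the halved step.

```python
from itertools import product

def axis_candidates(center, step, lim):
    """Positions center + d*step for d = -2..2, clipped to the closed range lim."""
    lo, hi = lim
    return [center + d * step for d in range(-2, 3) if lo <= center + d * step <= hi]

def coarse_to_fine_search(score, x_lim, y_lim, s_lim, vsize=8):
    """Locate the maximum of score(x, y, s) with shrinking virtual pixels."""
    # Starting ROIs: the full ranges, sampled once per virtual pixel.
    xs = list(range(x_lim[0], x_lim[1] + 1, vsize))
    ys = list(range(y_lim[0], y_lim[1] + 1, vsize))
    ss = list(range(s_lim[0], s_lim[1] + 1, vsize))
    while True:
        # Evaluate every (center, scale) combination inside the current ROIs.
        best = max(product(xs, ys, ss), key=lambda p: score(*p))
        if vsize == 1:  # virtual pixels have reached real-pixel size
            return best, score(*best)
        vsize //= 2
        # New ROIs around the maximum, sampled with the halved virtual pixel.
        xs = axis_candidates(best[0], vsize, x_lim)
        ys = axis_candidates(best[1], vsize, y_lim)
        ss = axis_candidates(best[2], vsize, s_lim)

# Hypothetical stand-in for the FUZZ-GIDO response, with a single peak.
def toy_score(x, y, s):
    return -((x - 37) ** 2 + (y - 21) ** 2 + (s - 13) ** 2)

position, value = coarse_to_fine_search(toy_score, (0, 63), (0, 63), (4, 36), vsize=8)
```

With an initial virtual pixel size of 8 the loop runs four times (sizes 8, 4, 2 and 1). Such a search only finds the global maximum when the response is smooth enough for the coarse grid to land near it, as is the case for the single-peak `toy_score`.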

3.2.3 Results.

Some examples of the detection of real-world objects are presented. Figure 3.8a exemplifies the detection of a pipe union in an internal image of a pipe. Figure 3.8b shows the detection of the inner boundary of the iris. Figure 3.8c shows the detection of circular structures in a valve image. Finally, figure 3.8d exemplifies the detection of a real danger signal. The presented objects are very difficult to detect because of the poor lighting conditions, as for example in figure 3.8a, because they contain hidden areas (see also figure 3.8a), and because of distorted or poorly defined edges, as in figures 3.8b and 3.8c. The extra complexity of the detection performed in figure 3.8d is due to the fact that the detected danger signal does not exactly match the searched one, which corresponds to a perfect rhombus.





Fig. 3.8. (a) Internal image of a pipe. Detection of a pipe union; (b) Eye image. Detection of the inner boundary of the iris;
(c) Valve image. Detection of the brightest circular structure; (d) Detection of a real danger signal.

3.3. FHAAR: An Architecture for the Automatic Edge Detection in real-world Scenes.

     Taking the work described in the previous section as a starting point and using an edge detector that follows Canny's paradigm and is derived from Haar wavelets, a robust architecture for automatic edge detection in real-world scenes was developed [Lohmann, Nowack, and Ruiz-del-Solar, 1999]. The motivation for developing this architecture was the need for a robust and fast edge detection module in the System for the Automatic Segmentation of real Pipe Scenes described in section 3.1, and in a System for the Recognition of Number Plates (also developed at the FhG-IPK Berlin).

3.3.1 The FHAAR - Architecture.

     In this architecture the data flow and the control flow are separated (see the block diagram of figure 3.9). The data flow contains four stages: Noise Filtering, Averaging, Edge Detection and Fuzzy Processing. The control flow contains the Histogram Processing and the Fuzzy Parameter Determination stages. The noise contained in the input images is filtered in the Noise Filtering stage. Depending on whether Gaussian or impulsive noise is present in the images, a Gaussian filter or a median filter is applied. In the Averaging stage, the background information of the original image is determined. A simple and hence fast averaging algorithm, which uses two 5-pixel line masks, one vertical and one horizontal, determines homogeneous areas that are larger than the relevant edge structure information. The relevant edge information is calculated in the Edge Detection stage by an edge detector that is based on the Haar wavelet. This edge detector is very robust against the variations found in natural patterns, and is also implemented using two 5-pixel line masks.

Fig. 3.9. Block diagram of the proposed FHAAR-Architecture.

     Based on the previously described data processing stages, the Fuzzy Processing stage calculates the final fuzzy edge image from an Edge Image and a Background Image. The inputs of this stage are data-dependent fuzzy sets, whose membership functions are dynamically determined using information obtained from the cumulative histograms of the Edge and the Background Images. It must be pointed out that in the proposed architecture the fuzzy set parameters are determined automatically and that the end-user must set only very few application-dependent parameters (the size of some filter masks and some threshold values). These parameters depend on the kind of images being processed (i.e. on the application) or, more specifically, on the kind of information contained in these images.

3.3.2 Edge Detection.

     In general, wavelets are mathematical functions that cut up data into different frequency components and then allow one to study each component with a resolution matched to its scale. There is an important connection between wavelets and edge detection, since wavelets are well adapted to react locally to rapid changes in images. Moreover, wavelet theory allows one to understand how to combine multiscale edge information to characterize different types of edges. Canny's edge detector is one of the most frequently used detectors in vision algorithms. Canny's detection is equivalent to the detection of the local maxima of a wavelet transform [Mallat, 1996].

     Canny provides valuable insights into the desired properties of an edge detector [Canny, 1986]. The basic requirement is that an edge detector should exhibit good performance under noisy conditions. The detector should fulfil the following goals: Good Detection, Good Localization and Clear Response (see explanation in [Canny, 1986]). The Canny model assumes an ideal step edge corrupted by Gaussian noise. In practice this is not an exact model, but it represents an approximation of the effects of sensor noise, sampling and quantization. Following Canny's paradigm, an image is first subjected to low-pass smoothing and then a derivative operation is applied (for edge detection by local maxima).

     The proposed edge detector follows Canny's paradigm, but is implemented using a wavelet filter (according to [Mallat, 1996], both approaches are equivalent). The main reasons for choosing a wavelet implementation are its fast processing capabilities and the possibility of developing a multiscale version of our detector in the near future. Due to its simplicity and the possibility of a fast implementation, the Haar wavelet was chosen as our wavelet basis. The Haar wavelet is the simplest and the oldest of all wavelets. It is a step function that is defined as [Strang and Nguyen, 1996]:

$$\psi(t) = \begin{cases} 1, & 0 \le t < 1/2 \\ -1, & 1/2 \le t < 1 \\ 0, & \text{otherwise} \end{cases}$$


     By taking into account the equivalence between wavelets and filter banks [Strang and Nguyen, 1996], a separable filter, which is based on a redundant Haar decomposition, was derived. This decomposition performs a multiscale smoothing of the local image information, followed by a derivative operation. The proposed filter, called Haar-Filter (HFilter), has a vertical and a horizontal five-pixel component defined by the mask [1, 2, 0, -2, -1]. The modulus of both perpendicular components is used to calculate the local maxima of the transform or, equivalently, to find the edges' location (see [Mallat, 1996]). Some optimizations (approximations!) were performed to speed up the filter computation. First, a new five-pixel mask, given by [1, 1, 0, -1, -1], was defined. The edge detection capability of this second mask is almost the same as that of the first one, but its processing is much faster. Secondly, to find the edges' location, the calculation of the modulus of both perpendicular components was replaced by the simpler calculation of the maximum of the absolute values of the two perpendicular components.
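The optimized filter described above can be sketched as follows: the five-pixel mask [1, 1, 0, -1, -1] is applied horizontally and vertically, and the edge strength is the maximum of the two absolute responses. The function name `hfilter` and the border handling (the two outermost rows and columns are left at zero) are assumptions made for the example.

```python
MASK = [1, 1, 0, -1, -1]

def hfilter(image):
    """Edge strength = max(|horizontal response|, |vertical response|).

    `image` is a list of rows of gray values. A border of two pixels is
    left at zero (an assumption; the text does not specify border handling).
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(2, rows - 2):
        for x in range(2, cols - 2):
            horiz = sum(m * image[y][x + k - 2] for k, m in enumerate(MASK))
            vert = sum(m * image[y + k - 2][x] for k, m in enumerate(MASK))
            out[y][x] = max(abs(horiz), abs(vert))
    return out

# Vertical step edge between columns 3 and 4.
step = [[0, 0, 0, 0, 10, 10, 10, 10] for _ in range(7)]
edges = hfilter(step)
```

For the step image, the row `edges[3]` reads `[0, 0, 10, 20, 20, 10, 0, 0]`: the response peaks on the two columns adjacent to the step and falls off symmetrically.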

3.3.3 Fuzzy Processing.

     The described HFilter can be improved to allow invariant edge detection under different lighting conditions. As pointed out above, the perceived contrast in an image depends on its local background intensity (luminance of the background). The fuzzy-based edge detector described here, the so-called Fuzzy-Haar-Filter (FHFilter), is inspired by the work of Pal and Mukhopadhyay (see section 3.2), but uses our HFilter as the basic edge detector and makes some other modifications to the original architecture. A fuzzy reasoning system with the Haar Edges (HEdges) and the background intensity (B) as antecedent variables is implemented. The consequent variable (output) of this system is called Fuzzy-Haar Edges (FHEdges). The linguistic variables used are [Pal and Mukhopadhyay, 1996]: PB (positive big), PS (positive small), MED (medium) and ZE (zero). Triangular membership functions are used for all the variables. The IF-THEN rules employed are the same as those presented in Table 1. The defuzzification is performed by:


where ci represents the center of mass of the normalized triangular membership functions of the FHEdges linguistic variable, and λi corresponds to the firing strength of each rule, calculated as the minimum between the membership functions of the HEdges and the B linguistic variables. To speed up the computations, all fuzzy sets were implemented using look-up tables. These look-up tables are initialized at the beginning of the fuzzy processing.
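A look-up table for one triangular membership function can be built once, before the image is processed, so that the membership degree of a pixel is obtained by a single indexing operation. The sketch below assumes 256 gray levels; the function name `build_lut` and the set position (85, 170, 255) are hypothetical.

```python
def build_lut(left, peak, right, levels=256):
    """Tabulate a triangular membership function for every gray level."""
    lut = []
    for g in range(levels):
        if g <= left or g >= right:
            mu = 0.0
        elif g <= peak:
            mu = (g - left) / (peak - left)
        else:
            mu = (right - g) / (right - peak)
        lut.append(mu)
    return lut

# Hypothetical "MED" set; during processing the membership of gray level g
# is then simply med_lut[g].
med_lut = build_lut(85, 170, 255)
```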

3.3.4 Histogram processing and dynamic determination of fuzzy sets parameters.

     The proposed fuzzy reasoning system has data-dependent fuzzy sets, whose membership functions are dynamically determined using information obtained from the cumulative histograms of the Edge and Background Images. Thanks to this automatic parameter adjustment, our architecture works without parameter adjustment within each kind of application, and only a few parameters (the size of some filter masks and some threshold values) must be adjusted between different kinds of applications. In the Histogram Processing stage, the cumulative histograms of the background and edge images are determined. By using these histograms and some application-dependent threshold values (percentages of the cumulative histograms), the locations of the membership functions of the background and edge fuzzy sets are automatically determined in the Fuzzy Parameter Determination stage. In figure 3.10, the process of determination of the background fuzzy sets is exemplified. In the case of non-existing high histogram peaks (only a histogram peak near gray level zero), a minimal fuzzy set is initialized. As can be seen in figure 3.9, the determination of the edge fuzzy sets and of the background fuzzy sets is performed separately. The membership functions of the consequent variable (FHEdges) are the same as those of the HEdges variable.
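The idea of locating membership functions from a cumulative histogram can be sketched as follows. The sketch only shows how gray levels are read off the cumulative histogram at given percentages. The fraction values (0.25, 0.5, 0.9) are hypothetical, and how the resulting levels are turned into the parameters of the individual fuzzy sets is not specified here.

```python
def cumulative_histogram(image, levels=256):
    """Cumulative gray-level histogram of an image given as a list of rows."""
    hist = [0] * levels
    for row in image:
        for g in row:
            hist[g] += 1
    total, cum = 0, []
    for count in hist:
        total += count
        cum.append(total)
    return cum

def level_at_fraction(cum, fraction):
    """Smallest gray level whose cumulative count reaches `fraction` of all pixels."""
    target = fraction * cum[-1]
    for g, c in enumerate(cum):
        if c >= target:
            return g
    return len(cum) - 1

def membership_levels(image, fractions=(0.25, 0.5, 0.9)):
    """Gray levels at the given (hypothetical) cumulative-histogram thresholds."""
    cum = cumulative_histogram(image)
    return [level_at_fraction(cum, f) for f in fractions]

# Tiny background image: two dark, four medium and two bright pixels.
background = [[10, 10, 50, 50], [50, 50, 200, 200]]
levels = membership_levels(background)
```

Because the levels follow the distribution of the data, a darker or brighter image of the same scene shifts the fuzzy sets accordingly, without any manual retuning.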

Fig. 3.10. Automatic determination of background fuzzy sets using the cumulative histogram. A background image (behind), its normal histogram (top-left), its cumulative histogram and the resulting fuzzy sets (bottom-right) are shown. X, Y and Z correspond to application-dependent threshold values.

3.3.5 Comparison with other edge detection algorithms.

     Applications operating on images of real-world scenes, especially outdoor scenes, place high demands on the reliability of the edge detection process. These demands are varied, sometimes ambiguous, not well defined or even contradictory. Hence, it is sometimes not possible to express these different conditions in a formalism. Accordingly, the suitability of an edge detector is judged by the results achieved in a real-world application. The operator has to prove its ability to process images with various kinds of data. It must also be application independent. Finally, it has to be robust. In this section the application of our architecture to the detection of edges in two real-world images is exemplified (figures 3.11 and 3.12). The main difficulty is that inhomogeneous lighting conditions strongly affect the task of edge detection. For instance, edges starting in a bright area of the image and ending in a dark one are more difficult to detect, because edges are perceived as stronger in bright areas. However, the structures in every part of the image have the same importance and have to be detected.

     Two images are processed using three different edge detectors. The first image is an airplane image (see figure 3.11) and the second one a casting image (see figure 3.12). The edge detectors used are the Sobel operator, the HFilter and the fuzzy edge detector described here. Concerning the first two operators, the Sobel operator is sensitive to noise, while the HFilter is insensitive to noise but does not adapt to different local intensities. The examples illustrate the suitability of the proposed fuzzy edge detector for applications that need similar results in image areas with different lighting conditions.

Fig. 3.11. Original airplane image and its edge images: Sobel Edge Image, Haar Edge Image, Fuzzy Edge Image
(from left to right).

Fig. 3.12. Original casting image with an ore-defect (acquired by an endoscope camera) and its edge images: Sobel Edge Image, Haar Edge Image, Fuzzy Edge Image (from left to right).

3.3.6 Integration of the proposed architecture in Vision Systems.

Automatic Segmentation of internal Wall Images in real Pipe Scenes.

     The first step in evaluating the pipes' internal surface is the segmentation of wall images. This segmentation is difficult, because the pipes are subject to natural influences. In particular, sedimentation, encrustation, horizontal or vertical offset between two pipes, shards, roots and remaining water lead to very different appearances. Furthermore, the lighting is irregular, which means, for example, that gaps and sockets are more prominent in bright than in dark areas of the image. For the subsequent steps of the pipes' processing, a segmentation result that is invariant to these lighting conditions is essential. The FHAAR-Architecture is used to detect the edges with high reliability. Examples of edge detection by our architecture are shown in figure 3.13. The final segmentation of the images is performed using the Watershed Transformation [Beucher, 1982], which needs an edge image and a label image. The edge image is obtained using the described architecture. An example of the complete segmentation process of a real socket scene is shown in figure 3.14.


Fig. 3.13. Original (upper row) and edge (lower row) images of sewage pipe scenes: Socket Image, Shard Image and Fissure Image (from left to right).



Fig. 3.14. Segmentation of a socket scene. Original socket scene with (from left to right)
the wall area, the so-called blending area, the socket area, the blending area and the wall area (a);
Edge Image (obtained with the proposed architecture) (b); Label Image (c); and Segmentation Image (d).

Recognition of Number Plates.

     An interesting application field for the proposed edge detection architecture is vehicle identification. At the FhG-IPK (Berlin) a system consisting of two modules was developed [Lohmann et al., 1997]: Recognition of Number Plates and Identification of Shape Typical Vehicle Features. The first module performs the following tasks: (1) Location of the number plate structures and the characters' area; (2) Characters' Cleaning; (3) Characters' Separation; and (4) Characters' Recognition.

     One fundamental step of the recognition procedure is the determination of the characters' area (task 1), especially in images of real-world scenes, which are influenced by varying environmental conditions. Characters' areas are detected from their structural properties (nearly parallel borders in a connected image region). Filtering (low-pass and high-pass) and morphological operations are used to find the sequence of characters. The extracted regions are verified by evaluating regional features, e.g. features describing the holes inside the connected components. Because these steps rely on border and region structure, a robust edge detector is fundamental to the number plate location process. The FHAAR-Architecture was used to perform this task. Some results of its application are shown in figure 3.15. This figure shows the real gray-value images and the results of applying the edge detection architecture, as an example of the search for number plate structures and form-typical contour features of a car (figure 3.15a) and a lorry (figure 3.15b). As can be seen from these two examples, the proposed architecture detects all relevant structural information, especially the number plate information, although the original images are very inhomogeneously illuminated. It can be concluded that the proposed architecture is suitable for the number plate location stage of the described number plate recognition system. Figure 3.16 illustrates the successful number plate recognition for the car shown in figure 3.15a.


Fig. 3.15. Original car image and its fuzzy edge image (a); and original lorry image and its fuzzy edge image (b).

Fig. 3.16. Number plate recognition procedure.

3.3.7 Future Work.

     The FHAAR-Architecture was modified to process vertical and horizontal edges separately (see figure 3.17). A vertical and a horizontal edge image are used. These images are obtained by using a V-Edge Operator and an H-Edge Operator defined by the masks [1, 1, 0, -1, -1]T and [1, 1, 0, -1, -1], respectively. After the convolution with each mask, the absolute value operator is applied. Two fuzzy controllers, working in parallel, process the two directional edge images. The final edges are found by taking, at each image position, the maximum response of the two fuzzy controllers. The detection capabilities of this second architecture are being investigated. A comparison with the FHAAR-Architecture will also be performed.
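
     The directional pre-processing and the final maximum can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the fuzzy controllers are not specified in this section, so a pass-through function (`placeholder_controller`, a hypothetical name) stands in for them, and replicating border pixels before the convolution is an assumption.

```python
import numpy as np

# Mask quoted in the text; applied as a column ([...]T) and as a row.
MASK = np.array([1.0, 1.0, 0.0, -1.0, -1.0])


def convolve_along_axis(image, mask, axis):
    """Convolve each 1-D line of `image` along `axis`, keeping the image size."""
    pad = len(mask) // 2
    pad_width = [(0, 0)] * image.ndim
    pad_width[axis] = (pad, pad)
    padded = np.pad(image, pad_width, mode="edge")  # assumed border handling
    return np.apply_along_axis(
        lambda line: np.convolve(line, mask, mode="valid"), axis, padded
    )


def directional_edges(image):
    """Absolute responses of the column mask (V) and the row mask (H)."""
    image = np.asarray(image, dtype=float)
    v_edge = np.abs(convolve_along_axis(image, MASK, axis=0))  # mask [...]T
    h_edge = np.abs(convolve_along_axis(image, MASK, axis=1))  # mask [...]
    return v_edge, h_edge


def placeholder_controller(edge_image):
    """Hypothetical stand-in for a fuzzy controller: returns its input unchanged."""
    return edge_image


def combine(v_edge, h_edge,
            controller_v=placeholder_controller,
            controller_h=placeholder_controller):
    """Pixel-wise maximum of the two controller outputs."""
    return np.maximum(controller_v(v_edge), controller_h(h_edge))


# Toy image: intensity changes only from left to right.
image = np.zeros((6, 8))
image[:, 4:] = 10.0
v_edge, h_edge = directional_edges(image)
print(combine(v_edge, h_edge))
```

     For this toy image only the row mask responds, since the intensity changes only along the rows, and the maximum therefore equals that response. Replacing the two placeholder functions with actual controllers would leave the rest of the pipeline unchanged.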

Fig. 3.17. Block diagram of the modified FHAAR-Architecture.


