Digital Image Processing Fundamentals
Digital image processing is electronic data processing on a 2-D array of numbers. The array is a numeric representation of an image. A real image is formed on a sensor when an energy emission strikes the sensor with sufficient intensity to create a sensor output. The energy emission can come from numerous sources (e.g., acoustic or optical). When the energy emission is in the form of electromagnetic radiation within the band limits of the human eye, it is called visible light [Banerjee]. Some objects only reflect electromagnetic radiation. Others produce their own, through a phenomenon called radiancy. Radiancy occurs in an object that has been heated sufficiently to cause it to glow visibly [Resnick]. Visible light images are a special case, yet they appear with great frequency in the image processing literature.
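To make the opening definition concrete, the following is a minimal sketch of an image held as a 2-D array of numbers, with two simple operations applied to it. The 4x4 pixel values, the 0-255 intensity scale and the function names `negate` and `mean_intensity` are illustrative assumptions, not taken from the text.

```python
# A tiny 4x4 grayscale "image": each number stands for a sensor output
# (assumed scale: 0 = dark, 255 = bright). The values are made up.
image = [
    [0, 64, 128, 255],
    [32, 96, 160, 224],
    [64, 128, 192, 255],
    [0, 0, 255, 255],
]


def negate(img, max_value=255):
    """Return the photographic negative: each pixel p becomes max_value - p."""
    return [[max_value - pixel for pixel in row] for row in img]


def mean_intensity(img):
    """Return the average pixel value over the whole array."""
    total = sum(sum(row) for row in img)
    count = sum(len(row) for row in img)
    return total / count


if __name__ == "__main__":
    print(negate(image)[0])
    print(mean_intensity(image))
```

Both functions treat the image purely as numbers, which is the point of the definition: once the sensor output is in an array, processing is ordinary arithmetic on that array.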
Synthetic images from computer graphics are another source of images. These images can provide controls on the illumination and material properties that are generally unavailable in the real image domain.
This chapter reviews some of the basic ideas in digital signal processing. The review includes a summary of some mathematical results that will be of use in Chapter 15. The math review is included here in order to strengthen the discourse on sampling.
5.1. The Human Visual System
A typical human visual system consists of stereo electromagnetic transducers (two eyes) connected to a large number of neurons (the brain). The neurons process the input, using poorly understood emergent properties (the mind). Our discussion will follow the eye, brain and mind ordering, taking views with a selective focus.
The ability of the human eye to perceive the spectral content of light is called color vision. A typical human eye has a spectral response that varies as a function of age and the individual. Using clinical research, the CIE (Commission Internationale de l'Éclairage) created a statistical profile of human vision called the standard observer. The response curves of the standard observer indicate that humans can see light whose wavelengths have the color names red, green and blue. When discussing wavelengths for visible light, we typically give the measurements in nanometers. A nanometer is 10^-9 meters and is abbreviated nm. The wavelengths of the red, green and blue peaks are about 570-645 nm, 526-535 nm, and 444-445 nm, respectively. The visible wavelength range is 380 to about 700-770 nm [Netravali] [Cohen].
Fig. 5-1. Sketch of a Human Eye
Fig. 5-1 shows a sketch of a human eye. When dimensions are given, they refer to the typical adult human eye unless otherwise stated. Light passes through the cornea and is focused on the retina by the lens. Physiological theories use biological components to explain behaviour. The optical elements in the eye (cornea, lens and retina) form the primary biological components of a photo sensor. Muscles alter the thickness of the lens and the diameter of the pupil, the opening in the iris that sits in front of the lens. The pupil diameter typically varies from 2 to 8 mm. The retina contains two types of photo sensor cells: rods and cones.
There are 75 to 150 million rod cells in the retina. The rods contain a blue-green absorbing pigment called rhodopsin. Rods are used primarily for night vision (also called the scotopic range) and typically have no role in color vision [Gonzalez and Woods].
Cones are used for daylight vision (called the photopic range). The tristimulus theory of color perception is based upon the existence of three types of cones: red, green and blue. The pigment in the cones is unknown [Hunt]. We do know that the phenomenon called adaptation (a process that permits eyes to alter their sensitivity) occurs because of a change in the pigments in the cones [Netravali]. The retinal cells may also inhibit one another, creating a high-pass filter for image sharpening. This phenomenon is known as lateral inhibition [Mylers].
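The high-pass behaviour attributed to lateral inhibition can be illustrated with a one-dimensional sketch. This is a toy model under stated assumptions, not a physiological one: each cell is inhibited only by its two immediate neighbors, the `strength` parameter is hypothetical, and the border cells reuse their own value as the missing neighbor.

```python
def lateral_inhibition(signal, strength=0.5):
    """Sharpen a 1-D row of cell responses with a center-surround rule.

    Each output is the cell's own input plus `strength` times the
    difference between that input and the mean of its two neighbors.
    Flat regions pass through unchanged; edges are exaggerated.
    """
    n = len(signal)
    out = []
    for i, center in enumerate(signal):
        left = signal[max(i - 1, 0)]        # replicate the border value
        right = signal[min(i + 1, n - 1)]
        surround = (left + right) / 2
        out.append(center + strength * (center - surround))
    return out


if __name__ == "__main__":
    step_edge = [10, 10, 10, 50, 50, 50]
    print(lateral_inhibition(step_edge))
    # -> [10.0, 10.0, 0.0, 60.0, 50.0, 50.0]
```

The undershoot and overshoot on either side of the step are what make the edge look sharper; the flat regions are left alone, which is the signature of a high-pass component added to the original signal.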
The current model for the retinal cells shows a cone cell density that ranges from 900 to 160,000 cells per square millimeter [Gibson]. There are 6 to 7 million cone cells, with the density increasing near the fovea. Further biological examination indicates that the cells are arranged in a noisy hexagonal array [Wehmeier].
Lest one be tempted to count the number of cells in the eye and draw a direct comparison to modern camera equipment, keep in mind that even the fixated eye is constantly moving. One study showed that the eyes perform over 3 fixations per second during a search of a complex scene [Williams]. Furthermore, there is nearly a 180-degree field of view (given two eyes). Finally, the eye-brain interface enables an integration between the sensors' polar coordinate scans, focus, iris adjustments and the interpretation engine. These interactions are not typical of most artificial image processing systems [Gonzalez and Woods]. Only recently have modern camcorders taken on the role of integrating the focus and exposure adjustment with the sensor.
The optic nerve has approximately 250,000 neurons connecting to the brain. The brain has two components associated with low-level vision operations: the lateral geniculate nucleus and the visual cortex. The cells are modeled using a circuit that has an inhibit input, capacitive-type electrical storage and voltage leaks, all driving a comparator with a variable voltage output.
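The circuit model described above (an inhibit input, capacitive storage, voltage leaks and a comparator) can be sketched as a discrete-time simulation. Everything numeric here is an assumption for illustration: the `leak` fraction, the `threshold`, and the choice to report the voltage in excess of the threshold as the "variable voltage output" are not taken from the text or from any cited model.

```python
def simulate_cell(drive, inhibit, leak=0.1, threshold=1.0):
    """Simulate one model cell over a sequence of time steps.

    drive     -- excitatory input per step
    inhibit   -- inhibitory input per step (subtracted from the drive)
    leak      -- fraction of the stored voltage lost each step (assumed)
    threshold -- comparator level (assumed)

    Returns the comparator output per step: 0.0 below the threshold,
    otherwise the amount by which the stored voltage exceeds it.
    """
    voltage = 0.0
    outputs = []
    for excite, inh in zip(drive, inhibit):
        voltage += excite - inh          # charge the capacitive store
        voltage -= leak * voltage        # voltage leak
        voltage = max(voltage, 0.0)      # the store cannot go negative
        if voltage > threshold:          # comparator stage
            outputs.append(voltage - threshold)
        else:
            outputs.append(0.0)
    return outputs


if __name__ == "__main__":
    steps = 10
    print(simulate_cell([0.5] * steps, [0.0] * steps))
    print(simulate_cell([0.5] * steps, [0.5] * steps))  # fully inhibited
```

With a steady drive the stored voltage builds up over several steps before the comparator responds, which is the kind of temporal smoothing that the capacitive storage contributes.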
The capacitive storage elements are held accountable for the critical fusion frequency response of the eye. The critical fusion frequency is the rate of display at which individual updates appear to be continuous. This frequency ranges from 10 to 70 Hz, depending on the color [Teevan] [Netravali]. At 70 Hz, the 250,000-element optic nerve should carry 17.5 million neural impulses per second. Given the signal-to-noise ratio of a human auditory response system (80 dB), we can estimate that there are 12.8 bits per nerve leading to the brain [Shamma]. This gives a bit rate of about 224 Mbps. The data has been pre-processed by the eye before it reaches the optic nerve. This preprocessing includes lateral inhibition between the retinal neurons. Also, we have assumed that there is additive white Gaussian noise on the channel, but this assumption may not be justified.
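The bit-rate estimate above is simple arithmetic and can be checked directly. The three input figures are taken from the text as given; in particular, the 12.8 bits figure is used as stated rather than rederived from the 80 dB signal-to-noise ratio. The variable names are illustrative.

```python
# Figures quoted in the text.
OPTIC_NERVE_NEURONS = 250_000      # neurons in the optic nerve
FUSION_FREQUENCY_HZ = 70           # upper critical fusion frequency
BITS_PER_IMPULSE = 12.8            # estimate quoted from [Shamma]

# One impulse per neuron per update at the fusion frequency.
impulses_per_second = OPTIC_NERVE_NEURONS * FUSION_FREQUENCY_HZ
bit_rate_bps = impulses_per_second * BITS_PER_IMPULSE

if __name__ == "__main__":
    print(f"{impulses_per_second / 1e6:.1f} million impulses per second")
    print(f"{bit_rate_bps / 1e6:.0f} Mbps")
```

The result scales linearly with each input, so a different neuron count or bits-per-impulse estimate changes the bit rate in direct proportion.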