In Section 1 of this chapter we review the existing models of peripheral vision, most of which are based on the appealing conception that peripheral vision is just a spatially scaled version of central vision. We argue against scaling models, however, on the grounds that they cannot account for the primary factor which limits resolving power in the periphery: neural undersampling of the optical image formed on the retina. In order to account for sampling effects, we adopt an engineering perspective to develop in Section 2 a simple model of optical and neural processing of the retinal image. In Section 3, we apply our neuro-optical model to human eyes in order to discover the relative importance of the optical and neural limits to pattern detection and resolution. The results show that, although the optical system of the eye is the dominant factor limiting central vision, spatial undersampling by the optic nerve cells of the retina limits resolution in the periphery. Neural undersampling of the retinal image leads to perceptual aliasing and spurious detection of patterns up to an order of magnitude finer than the Nyquist limit. Aliasing is curtailed at very high frequencies by a combination of optical filtering and spatial summation over the finite aperture of cone photoreceptors. We conclude with some comments on practical applications of the model.
Most of what is known about human vision is the result of experimental investigations of the central few degrees of the visual field. This emphasis is understandable since central vision is critical for everyday living and the loss of central vision to disease or aging is a severe physical handicap. Nevertheless, it is ironic to note that probably 99% of all research on the human visual system has been devoted to the central 1% of our visual field. As a result, the performance characteristics and mechanisms of peripheral vision are poorly understood by comparison to those of central vision.
Spatial resolving power is the most important difference between central and peripheral vision. In a classic study published in 1857 by Aubert and Förster 4 and summarized in Helmholtz's authoritative Treatise on Physiological Optics 83 (see p. 39), two fundamental facts were established. First, the minimum dimensions of letters and numerals that can be resolved in the peripheral field vary in direct proportion to the eccentricity of the letters from the fixation point. Second, the iso-acuity contours in peripheral vision for two-point resolution are slightly elliptical with the major axis oriented horizontally. Numerous subsequent studies have confirmed these findings (see Genter et al. 26 for a review), the most thorough of which was by Wertheim. 102 Using gratings as a test target, Wertheim performed an exhaustive series of measurements which documented the acuity of his own left eye throughout the visual field, an heroic feat which has only recently been repeated. 105 His results showed that the ability of the human eye to resolve the individual bars of a grating pattern varies by nearly two orders of magnitude over the visual field, falling from about 50 cyc/deg in central vision to less than 1 cyc/deg in the far periphery. Although Wertheim and many subsequent authors describe acuity as falling inversely with target eccentricity, this description simply reflects the convention of using spatial frequency, rather than its inverse, as a measure of acuity. 26
Accounting for the huge loss of spatial resolving power in peripheral vision is a major challenge to visual scientists and to the models of spatial vision which they invent. Such models are important not only for summarizing our understanding of the physical factors which limit visual resolution, but also for solving applied problems and for yielding sensible answers to practical questions such as: "What are the functional consequences of a 6 deg foveal scotoma?", "Is a central scotoma equivalent to low-pass filtering the image?", "Is scotopic vision just photopic vision with more low-pass filtering?", "Does contrast threshold for patterned targets scale with eccentricity?", "Over what range of stimulus parameters is detection equivalent to recognition, and does the answer vary with retinal eccentricity?"
Variation of resolving power across the visual field is, in a broader context, an example of the common biological strategy of distributing individual sensory transducers non-uniformly in order to selectively emphasize some regions of the environment at the expense of others. In the sense of touch, for example, there are many more tactile sensors in the fingertips and lips than along the arm or leg. As a result, spatial resolution of tactile stimuli is much higher for our fingers and lips, and a far greater portion of our brains is devoted to the representation and analysis of these areas by comparison to other parts of the body. 55 Similarly, many animals (including mammals) which live in an open habitat dominated visually by the horizon have retinas which contain a specialized region called the visual streak, 33, 85 a pronounced band of tightly packed retinal receptors that supports especially high visual acuity over a horizontal region of visual space.
The evolution of a visual streak in insects and other arthropods is just one of many examples of biological adaptation that Wehner calls "matched filtering". 97 According to this general principle, the fundamental spatial aspects of crucial sensory problems are incorporated into the spatial design of the sensory surface itself. By this simple trick, sensory systems avoid the need for complex neural circuitry to solve difficult problems such as navigation by honey bees or desert ants, which use the celestial pattern of polarization to steer their course, or long-distance navigation by migratory birds and homing pigeons. A similar problem described by Wehner is illustrated in Fig. 1. During forward locomotion (or during the approach of a predator) the retinal image of an off-axis object grows larger as it gets closer. In order to avoid misinterpreting this enlargement of the retinal image as physical growth of the object, the visual system must take into account the fact that the image is also moving into the periphery as it grows. It is conceivable that a neural circuit could be built to achieve the desired "neural size constancy" by extracting the correlation between image size and retinal eccentricity and using the result to calculate object size from image size. However, there is a much simpler solution which takes advantage of the geometry of Fig. 1. If the spatial grain of the retina increases in direct proportion to eccentricity, as is true in man (and sand crabs!), then the retinal image of an approaching object will stimulate the same number of neural elements regardless of viewing distance. One consequence of this arrangement is that if a visual object of size x is just resolvable in peripheral vision at one viewing distance, then it will be just resolvable at all viewing distances. From Wehner's perspective, then, the dramatic variation of resolving power of the human eye across the visual field is just one more example of the biological evolution of matched filtering to simplify sensory coding of the visual environment through the technique of spatial distortion of the retinal array.

Figure 1. Non-uniform resolving power of the eye as a mechanism for achieving neural size-constancy. An object of constant absolute size located a fixed distance from the visual axis doubles in angular size and its eccentricity doubles when the viewing distance is halved. For the neural image of the object to be the same size in either case, the minimum resolvable angle of the visual system should vary in direct proportion to eccentricity. The weight of experimental evidence in humans is consistent with this model. 26
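The geometry of Fig. 1 can be checked with a few lines of arithmetic. The following sketch is our hypothetical illustration, not part of the original analysis: it assumes the minimum resolvable angle grows in direct proportion to eccentricity (with an arbitrary constant k) and counts the resolvable elements spanned by an off-axis object at several viewing distances.

```python
def resolvable_elements(object_size, offset, distance, k=0.01):
    """Number of resolvable elements spanned by an off-axis object.

    object_size, offset and distance share the same linear units; k is
    the assumed (hypothetical) constant relating the minimum resolvable
    angle to eccentricity, both expressed in radians.
    """
    angular_size = object_size / distance      # small-angle approximation
    eccentricity = offset / distance
    min_resolvable_angle = k * eccentricity    # grain grows with eccentricity
    return angular_size / min_resolvable_angle

# Halving the viewing distance doubles both angular size and eccentricity,
# so the object always covers the same number of neural elements.
for d in (4.0, 2.0, 1.0):
    print(d, resolvable_elements(object_size=0.1, offset=1.0, distance=d))
```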
The aim of this chapter is to review the salient features of spatial vision in the peripheral field for the purpose of developing a useful model. In our view, the variation of spatial resolution across the visual field so prominent in man is due entirely to just two ocular factors: variation of optical image quality and changes in the architecture of the neural retina. The latter of these two factors is the topic of this chapter, whereas the former is dealt with in detail in the chapter by Bradley and Thibos (1995) in this book.
Perhaps the simplest approach to modeling spatial vision in the periphery is to adapt existing models of central vision by adjusting the parameters of the model. Current models of central vision emphasize two fundamental, physical factors which limit performance: filtering and noise. For example, in the "Static Performance Model" of the USAF Night Vision Laboratory 59 the human visual system is represented by a concatenation of two linear filters (one optical and the other a spatio-temporal filter of a neural contrast detector) and performance is quantified statistically to account for the effects of noise. 40 Such filtering-limited and noise-limited models have a firm scientific basis which is well documented in the literature on human foveal vision 20 and many applied models follow a similar approach, 50 including ACQUIRE, 40 VIDEM, 1 ORACLE, 51 and PHIND. 84
To adapt foveal models for peripheral vision, the usual approach is to broaden the filter's impulse response function (or, equivalently, reduce the low-pass cutoff frequency) sufficiently to account for the reduced resolving power in the peripheral field. A conceptually simple way to broaden a spatial filter is to rescale the spatial dimension and so it has become popular to conceive of peripheral vision as just a spatially-scaled version of central vision. This rescaling approach gained considerable momentum from early experiments by Rovamo and Virsu 62 and others 35, 36 which demonstrated that if visual targets were magnified in the periphery in order to compensate for the reduced representation of the peripheral field in the visual cortex, and if the size of targets were expressed in cortical dimensions rather than visual space dimensions, then visual performance is invariant across the visual field. This idea fit in neatly with the well-known anatomical fact that the cortical representation of the visual world is highly distorted in favor of the central field of view. 19 It is also consistent with emerging evidence that the "cortical magnification factor", defined as the number of mm on the surface of the cortex representing 1 deg of visual field, is correlated with peripheral visual acuity. 14, 92
The basic concept of the Rovamo-Virsu rescaling model in its current form 108 is illustrated schematically in Fig. 2. The foveal contrast sensitivity function (CSF) in Fig. 2A indicates the minimum contrast required by human observers to detect a small patch of sinusoidal grating as a function of the grating's spatial frequency. Since all points in the graph below the CSF represent grating contrasts which exceed psychophysical threshold, the CSF may be regarded as the upper boundary of a region of visibility. The lower boundary of this region of visibility is set by the horizontal line representing 100% contrast, which is the maximum physical contrast possible for sinusoidal gratings. The spatial frequency for which contrast sensitivity falls to unity thus represents the highest detectable spatial frequency, and this cutoff spatial frequency is a universally accepted definition of visual acuity, i.e. the resolving power of central vision. By comparison, the CSF for the same small patch of grating displayed in peripheral vision is displaced increasingly downwards and to the left as the visual stimulus moves progressively further into the periphery (Fig. 2B). Rovamo and Virsu argued that contrast sensitivity and resolving power fall dramatically when the target is imaged on peripheral retina because the amount of visual cortex stimulated by the visual target is much smaller. In support of this interpretation, they found that if the peripheral target was enlarged to compensate for the difference in cortical representation, then contrast sensitivity increased to the levels measured for central vision. By scaling the targets in this way, the CSFs for central and peripheral retina were found to be indistinguishable when spatial frequency was expressed as a fraction of the cutoff frequency, as shown in Fig. 2C. Thus, on the basis of their experimental data, Rovamo and Virsu proposed the unifying concept of a universal CSF which, when properly scaled by a "cortical magnification factor", applies anywhere in the visual field.

Figure 2. The Rovamo-Virsu filtering model of contrast detection across the visual field. Ordinate may be specified either as the contrast of a just detectable grating (right) or as contrast sensitivity (i.e. the inverse of threshold contrast, left). Abscissa is object spatial frequency specified either in absolute angular dimensions (A, B) or normalized by the cutoff spatial frequency for each curve (C). A: foveal vision. B: peripheral vision for targets of fixed angular size. C: a universal CSF which applies for all parts of the visual field, provided that the stimulus size is scaled according to cutoff spatial frequency.
Our objection to the rescaling model as described above is that it contains a hidden assumption which is not valid for peripheral vision. To expose this hidden assumption we return to our earlier interpretation of the cutoff spatial frequency, which was offered as a definition of the resolving power of the eye. In so doing, no distinction was made between the ability of observers to detect the presence of a luminance pattern and the ability to resolve the spatial structure of the pattern. Although this may sound like a minor distinction, it is in fact fundamental to the problem of modeling peripheral vision. In psychophysical experiments designed to measure contrast detection, observers are asked to distinguish the grating from a uniform patch of the same mean luminance. The rationale of this paradigm is that, in a properly controlled experiment, the presence of spatial contrast is the only feature of the grating required by subjects to successfully perform the task. To measure resolution on the other hand, the experimenter must force the observer to demonstrate some evidence that the features of the detected pattern are correctly perceived. The most common way to gather such evidence experimentally is to ask the subject to identify the orientation of the grating (e.g. is it vertical or horizontal?).
When these two different psychophysical tasks of detection and resolution are performed using central vision, they give essentially identical results. Consequently, there is normally no need to distinguish between visual acuity for detection and visual acuity for resolution when viewing targets foveally. However, in peripheral vision there can be as much as an order of magnitude difference between these two measures of visual acuity when high-contrast interference fringes are used to stimulate the retina, 82 thereby avoiding many of the optical limitations of the eye. 78 The difference between the cutoff frequencies for resolution and detection tasks is not as great when the retinal image is formed by the eye's optics in natural view, but substantial differences remain provided the retinal image is clearly focused. For example, at 20 deg in the nasal field, detection acuity for gratings displayed on an oscilloscope is four times greater than resolution acuity, as demonstrated by the psychometric performance curves of Fig. 3A. The open symbols in this graph show that performance for the resolution task is flawless below 5 cyc/deg but falls to 75% correct at about 6 cyc/deg and is no better than chance beyond about 7 cyc/deg. Nevertheless, just beyond this resolution limit the pattern is still detected without error (solid symbols), and detection performance does not fall to the 75% correct level until the spatial frequency exceeds about 24 cyc/deg. Thus there is a large range of spatial frequencies, shown by the shaded area, for which gratings can be detected but not resolved. For reasons that we discuss next, this area is labeled the aliasing zone.
Figure 3. A: Comparison of performance for resolution and detection tasks. B: Drawings of subjective appearance of gratings. Eccentricity = 20 deg, horizontal nasal visual field.
To gain some insight into the reason for the large difference in performance for the detection and resolution tasks in peripheral vision, we asked the observer in this experiment to sketch the subjective appearance of the stimulus. One observer's drawings of a 1.5 deg circular patch of vertical grating set to several different values of spatial frequency are shown in Fig. 3B. These sketches reveal that when the grating frequency was less than the resolution cutoff (i.e. < 6 cyc/deg), the subjective appearance was veridical. That is, the pattern appeared to contain the correct number of cycles and its orientation was correctly perceived. However, for frequencies beyond the resolution cutoff the subjective appearance was erroneous in several different ways. The pattern often did not look like a grating at all, but was fragmented and splotchy and it always appeared much coarser than when the stimulus was viewed foveally. On those occasions when the pattern did look like a grating, it frequently had the wrong orientation and always had too few cycles (for example, a 1.5 deg patch of 8 cyc/deg grating contains 12 cycles, but the drawing has only 5 cycles). Such percepts were highly unstable and changed significantly from moment to moment, as may be seen in the pairs of sketches drawn for the same test frequency. Despite this instability, the presence of spatial contrast was reliably observed across the full range of frequencies from the resolution limit to the detection limit.
Our interpretation of the non-veridical drawings in Fig. 3B is that subjects were experiencing the effects of neural undersampling of the retinal image. In other words, we believe the observer perceived an alias of the stimulus rather than the stimulus itself. Our reasoning is as follows. Because the continuous retinal image is sampled by the mosaic of retinal neurons, the fidelity of the discrete neural image will be constrained by Shannon's sampling theorem of communication theory. 67 Consequently, if the spatial frequency of the retinal image exceeds the Nyquist limit set by the density of retinal neurons, then the pattern will be misrepresented by the neural array. This erroneous pattern of neural activity is indistinguishable from the pattern of neural activity that would have been produced by a different, and much coarser, visual stimulus which is below the Nyquist limit. Thus, undersampling produces an essential ambiguity which cannot be resolved by appeal to the information contained in the neural signals leaving the eye. Consequently, the neural apparatus of the visual cortex (which may interpolate the neural image onto a finer scale 6 ) has no basis for attempting a reconstruction of the original stimulus rather than the sub-Nyquist alias. In other words, the observer's brain has no option but to misinterpret retinal signals generated by a relatively fine, undersampled pattern as being generated by a relatively coarse, oversampled stimulus.
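The ambiguity created by undersampling is easy to demonstrate numerically. In the following sketch (our illustration, with arbitrary sampling density and stimulus frequency), a grating above the Nyquist limit of a regular sampling array produces exactly the same samples as its coarser, sub-Nyquist alias.

```python
import numpy as np

fs = 10.0                    # samples per degree (density of the neural array)
nyquist = fs / 2.0           # Nyquist limit: 5 cyc/deg
f_fine = 8.0                 # stimulus frequency above the Nyquist limit
f_alias = abs(f_fine - round(f_fine / fs) * fs)   # folded-back alias: 2 cyc/deg

x = np.arange(0.0, 2.0, 1.0 / fs)        # sample positions of the array
fine_samples = np.cos(2 * np.pi * f_fine * x)
alias_samples = np.cos(2 * np.pi * f_alias * x)

# The two sample sets are identical, so downstream circuits have no
# basis for preferring the fine stimulus over its coarse alias.
print(f_alias, np.allclose(fine_samples, alias_samples))
```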
The sketches of Fig. 3 reveal that although the aliased percept is always on a coarse spatial scale, it is not always grating-like. This is probably because the array of retinal neurons is not completely regular. 52, 109 For this reason, in biology it is useful to apply the term "aliasing" in a more general sense than the moiré effect familiar in engineering. Accordingly, we define aliasing as the misperception of spatial features of objects caused by neural undersampling of the retinal image.
We suspect that the temporal instability of aliasing is due to random eye movements which cause the stimulus to land on a slightly different part of the retina at different instants. Such eye movements have little consequence for gratings below the Nyquist limit of the neural array, but for gratings beyond the Nyquist limit small displacements of the target can induce large changes in orientation, spatial frequency, phase, and spatial structure of the neural image. It is perhaps surprising that the brain does not take advantage of eye movements to increase the effective sampling density of the retinal mosaic through temporal integration of a sequence of neural "snap-shots", each with a slightly different position of the retina relative to the image. This seems like a reasonable strategy to avoid the misperception of aliasing, but experimental evidence of such interpolation is lacking.
A growing body of evidence supports the interpretation that neural undersampling, rather than spatial filtering, is the primary factor which limits pattern resolution in the periphery. Although the original observations of aliasing in peripheral vision were for interference fringes, 82 Smith & Cass 68 and Still 73 have independently reported observations of aliasing for naturally-viewed gratings in the peripheral field. In addition to errors in the perceived structure of visual patterns, the undersampling hypothesis also predicts that drifting gratings should appear to move in the wrong direction. Experimental observations confirming this prediction of motion reversal have been reported independently by several groups. 3, 13, 88 Finally, if filtering were the limiting factor for resolution in the periphery, then the contrast of the visual target would need to be 100% in order to achieve the resolution limit. However, Still 73 has shown that stimulus contrast as low as 10% is sufficient to achieve maximum resolution acuity at an eccentricity of 30 deg.
If the resolving power of peripheral vision is sampling-limited, how are we to interpret the earlier evidence that resolution is filtering-limited? One possible explanation is that the contrast sensitivity functions measured by Rovamo & Virsu were attenuated by optical filtering caused by uncorrected refractive errors, uncompensated changes in viewing distance of the stimulus, and off-axis optical aberrations of the eye (see the chapter by Bradley & Thibos in this book). To investigate this possibility, Still avoided the filtering effects of the eye's optical system by measuring subjects' ability to detect a small patch of high-contrast interference fringes formed directly on the peripheral retina. 73 His measurements of contrast sensitivity were much higher than those reported by Rovamo & Virsu, 62 which suggests that optical factors may have been responsible for the very low values of contrast sensitivity which motivated the original rescaling model. This seems a plausible explanation since optical filtering by the eye is also responsible for the lack of sensitivity in central vision for high-frequency gratings beyond the neural resolution limit. 106 The implication of this line of reasoning (which we will revisit in Section 3.2) is that the optical system of the eye provides an effective anti-aliasing filter for central vision but not necessarily for peripheral vision. If the peripheral optics are well focused, then patterns beyond the neural Nyquist limit are imaged on the retina and perceptual aliasing becomes possible. However, if the retinal image is sufficiently blurred by defocus or aberrations, even peripheral vision may be protected by the anti-aliasing effects of low-pass, optical filtering in the eye.
In summary, there is now good evidence that a fundamental, qualitative difference exists between central and peripheral vision which cannot be accounted for by spatial rescaling of filters. When modeling foveal vision it is often permissible to ignore the sampling effects of converting a continuous retinal image into a discrete neural image because the eye's optical system is an effective anti-aliasing filter which prevents spatial frequency components beyond the Nyquist limit from being imaged on the retina. However, the Nyquist sampling limit drops rapidly in the peripheral retina and therefore optical anti-aliasing filtering does not necessarily occur for peripheral targets. Consequently, the ultimate factor limiting target resolution in the periphery is not filtering but retinal undersampling. This explains why filter models adapted from the fovea are inadequate for the periphery and suggests that to improve applied models of spatial vision one must take sampling effects into account. This is the goal of the next section.
In this section we develop a mathematical model for the initial stages of image processing by the eye. Linear filter theory is used to describe the cascading effects of spatial filtering by the optical system of the eye and by neural receptive fields in the retina. The conditions necessary to prevent aliasing in the sampled neural image are then formulated in terms of optical filtering and the amount of spatial overlap of retinal samplers.

Figure 4. A: Coordinate reference frame for vision. The angular light distribution o(x) of the object is imaged on the retina as i(x). B: Point spread function p(x) for a diffraction-limited optical system (solid curve) and its equivalent width Deq.
Vision begins with the formation of a light image upon the retina by the optical system of the eye, as illustrated in Fig. 4A. In order to develop optical models of the eye which are independent of viewing distance, it is common practice to specify object and image dimensions in angular units of visual direction (x). Optical imperfections of the eye and diffraction at the pupil inevitably reduce image contrast in a way that may be described as low-pass spatial-filtering. The highest quality retinal image (i.e. least amount of filtering) occurs for a well-focused eye with a pupil diameter of about 2.5 mm. 9 For smaller pupils, diffraction at the margin of the iris is the major limiting factor whereas for larger pupils, ocular aberrations dominate. 11 Given an optimum pupil diameter, the modulation transfer function of the human eye is slightly less than that of an ideal, diffraction-limited optical system. 9, 10 The corresponding statement of image quality expressed in the spatial domain is that the smallest point-spread function of the eye is somewhat larger than that of a diffraction-limited system with 2.5 mm pupil. To quantify this lower bound on the width of the point-spread function, it is convenient to apply Bracewell's 8 concept of the equivalent width of a function, which is defined as the width of the rectangular function with the same height and area as the given function. As illustrated in Fig. 4B, the equivalent width of the ideal system (2.5 mm pupil, 550 nm light; see Goodman, 27 Eq. (6-31)) is 0.87 arcmin and so an approximate value of about 1 arcmin would be a reasonable figure-of-merit for the equivalent width of the eye's point-spread function under optimal conditions.
Psychophysical experiments 73, 74 and physical measurements 49 have shown that optical image quality at 30 degrees of eccentricity can be nearly as good as in central vision, provided refractive errors and off-axis astigmatism are corrected with appropriate spectacle lenses. Nevertheless, systematic changes in refractive power, magnification, and off-axis aberrations are well documented, 11 which means that a single optical transfer function is usually inadequate for describing image quality over the entire visual field. Despite this global non-uniformity, it is not unreasonable to assume that the optical system of the human eye is characterized on a local scale by a linear, shift-invariant system. Under this assumption, we may calculate the retinal image i(x) by convolution (denoted by *) of the optical point spread function p(x) of the eye with the intensity distribution of the object o(x). Thus the first stage of the visual system will be characterized by the equation
$$i(x) = p(x) * o(x) \qquad (1)$$
where the visual direction x applies interchangeably to angular distances in object space or image space.
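As a numerical illustration of Eq. (1), the sketch below blurs a bright bar to form a retinal image. For simplicity it assumes a Gaussian point-spread function with an equivalent width of about 1 arcmin, a stand-in for the eye's true PSF rather than a measured profile.

```python
import numpy as np

dx = 0.01                                    # sample spacing (arcmin)
x = np.arange(-30.0, 30.0, dx)               # visual direction (arcmin)

o = np.where(np.abs(x) < 5.0, 1.0, 0.0)      # object: 10 arcmin bright bar

# Assumed PSF: unit-area Gaussian whose equivalent width
# (area / central height) is 1 arcmin.
sigma = 1.0 / np.sqrt(2.0 * np.pi)
p = np.exp(-x**2 / (2.0 * sigma**2))
p /= p.sum() * dx                            # unit area preserves mean luminance

# Eq. (1): the retinal image is the object convolved with the PSF.
i = np.convolve(o, p, mode="same") * dx

# At the geometric edge of the bar the blurred image passes through half
# the bar's intensity, the classic signature of symmetric optical blur.
print(i[np.argmin(np.abs(x - 5.0))])
```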
Neural processing of the retinal image begins with the transduction of light energy into corresponding changes of membrane potential of individual light-sensitive neurons called photoreceptor cells. Photoreceptors are laid out as a thin sheet which varies systematically in composition across the retina. At the very center of the foveal region, which corresponds to our central field of view, the photoreceptors are exclusively cones, but just outside the fovea rods appear, and in the peripheral retina rods are far more numerous than cones. 58 Each photoreceptor is thought to integrate the light flux entering the cell through its own tiny aperture which, for foveal cones, is about 2.5 µm in diameter or 0.5 arcmin of visual angle. 17, 48 Since this entrance aperture is wholly within the body of each photoreceptor, apertures from neighboring receptors will not physically overlap on the retinal surface. Given this arrangement of the photoreceptor mosaic, we may characterize the first neural stage of the visual system as a sampling process by which a continuous optical image on the retina is transduced by an array of non-overlapping samplers into a discrete array of neural signals which we call a neural image.

Figure 5. Receptive fields of cone photoreceptors in the fovea. A: Cone apertures on retina are blurred by eye's optical system when projected into object space. B: Spatial sensitivity profile of foveal cone in object space (solid curve) is broader than in image space (dashed curve).
Often it is useful to think of the cone aperture as being projected back into object space where it can be compared with the dimensions of visual targets, as illustrated schematically in Fig. 5A. This back-projection can be accomplished mathematically by convolving p(x), the optical point-spread function of the eye, with the uniformly-weighted aperture function of the cone. (In taking this approach we are ignoring the effects of diffraction at the cone aperture, which would increase the cone aperture still further.) The result is a spatial weighting function called the receptive field of the cone. Before examining this receptive field in detail, we may draw two important qualitative conclusions. First, since foveal cones are tightly packed on the retinal surface, and since the effect of the eye's optical system is to broaden and blur the acceptance aperture of cones, the receptive fields of cones in object space must overlap to some degree. Second, the convolution result will be dominated by p(x) since the equivalent width of the optical point spread function is about double that of the cone aperture. To substantiate these inferences quantitatively, we convolved the optical point-spread function of Fig. 4B with the uniformly weighted aperture function of a cone (0.5 arcmin diameter, circular shape). The 1-dimensional profile of the result is shown by the solid curve in Fig. 5B. As expected, this profile is very similar to the point-spread function of the optics alone and it has a calculated equivalent width of 0.93 arcmin. Since this figure is about twice that of the cone aperture on the retina, we conclude that receptive fields of neighboring cones will overlap significantly. The functional implications of this overlap are discussed below in Section 2.6.
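The back-projection calculation just described can be approximated in a few lines. The sketch below is a one-dimensional simplification which substitutes a Gaussian of 0.87 arcmin equivalent width for the diffraction-limited point-spread function; under these assumptions it yields an equivalent width close to the 0.93 arcmin figure quoted above for the full two-dimensional calculation.

```python
import numpy as np

dx = 0.005
x = np.arange(-5.0, 5.0, dx)                 # visual direction (arcmin)

def equivalent_width(f, dx):
    """Bracewell's equivalent width: area divided by central height."""
    return f.sum() * dx / f.max()

# Stand-in for the diffraction-limited PSF: a Gaussian with the
# 0.87 arcmin equivalent width quoted in Section 2.1.
sigma = 0.87 / np.sqrt(2.0 * np.pi)
p = np.exp(-x**2 / (2.0 * sigma**2))

# Uniformly weighted cone aperture, 0.5 arcmin diameter (1-D profile).
aperture = np.where(np.abs(x) <= 0.25, 1.0, 0.0)

# Receptive field of the cone in object space: PSF blurred by aperture.
rf = np.convolve(p, aperture, mode="same") * dx

print(equivalent_width(p, dx), equivalent_width(rf, dx))   # ~0.87, ~0.95
```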
The neural image encoded by the cone mosaic is transmitted from eye to brain over an optic nerve which, in man, contains roughly one million individual fibers. Each fiber is an outgrowth of a third-order retinal neuron called a ganglion cell. It is a general feature of the vertebrate retina that ganglion cells are functionally connected to many rods and cones by means of intermediate, second-order neurons called bipolar cells, as illustrated schematically in Fig. 6A. As a result, a given ganglion cell typically responds to light falling over a relatively large region of the retina, with the middle of the receptive field weighted most heavily. Neighboring ganglion cells may receive input from the same receptor, which implies that ganglion cell receptive fields may overlap. Thus in general the mapping from photoreceptors to optic nerve fibers is both many-to-one and one-to-many. The net result, however, is a significant degree of image compression since the human eye contains about 5 times more cones than optic nerve fibers and about 100 times more rods. 16, 17

Figure 6. A: Formation of a neural response to the retinal image. Signals transduced by rods and cones are relayed by bipolar inter-neurons to ganglion cell output neurons for transmission along the optic nerve. B: Neural image for a point source of light on the retina is found by sampling the reflected receptive field at each visual direction represented by the array of output neurons.
A full account of spatial image processing by the retinal network would involve a description of the propagation of the neural image through each stage, taking account of neural pooling and sub-sampling of a fine mosaic by a coarse mosaic at multiple levels. Unfortunately, such a detailed model is beyond our present grasp despite the availability of a conceptual framework and mathematical tools. 25, 44 The reason is that although we understand retinal connectivity well enough to draw a rough schematic diagram for the mammalian eye, 43 detailed understanding of the complex retinal circuitry which connects input photoreceptors to output ganglion cells is lacking. Fortunately, such detailed knowledge is not necessary to achieve the more modest goal of describing the end result of retinal processing as revealed by the output neural image.
Ganglion cells come in many varieties, but one particular class called P-cells dominates throughout the primate retina. 60 Physiological experiments in monkey and cat indicate that a P-ganglion cell (or analogous X-cell in cat) responds to a linear combination of light falling on its receptive field. 34 Accordingly, the response r of an individual P-cell to the intensity distribution i(x) would be found by integrating the product of the input i(x) with the weighting function w(x) over the receptive field (rf) of the neuron. That is,
$$r = \int_{rf} i(x)\, w(x)\, dx \qquad (2)$$
Notice that the weighting function w(x) for the output neurons of the retina subsumes the spatial weighting effects of the two previous stages of neural processing, namely, integration over the photoreceptor aperture and filtering of the neural image by second-order inter-neurons. Also note that for this analysis of spatial vision it is not necessary to consider the final stage of retinal processing in which the time-continuous response r is encoded (perhaps non-linearly) for asynchronous digital transmission along the optic nerve.
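As an illustration of Eq. (2), the response of a single linear neuron can be computed by numerical quadrature. The image and receptive-field profile below are arbitrary choices made for the sketch, not measured quantities.

```python
import numpy as np

dx = 0.05
x = np.arange(-10.0, 10.0, dx)               # visual direction (arcmin)

# Retinal image: a grating of 4 arcmin period riding on a mean level.
i = 0.5 + 0.5 * np.cos(2.0 * np.pi * x / 4.0)

# Assumed center-weighted receptive field (Gaussian profile, arbitrary width).
w = np.exp(-x**2 / (2.0 * 1.0**2))

# Eq. (2): the cell's response is the receptive-field-weighted
# integral of the image over the field.
r = np.sum(i * w) * dx
print(r)
```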
In this section we build upon the foundation laid above in order to give a mathematical description of what the neural image looks like as it leaves the eye via the optic nerve. A global analysis encompassing the whole of the visual field is too difficult to attempt here because of the complications introduced by retinal inhomogeneity. Instead, attention will be focused on a local region where the neural architecture of the retina is relatively uniform. Accordingly, consider a homogeneous population of P-ganglion cells which are responsible for representing the retinal image in a small patch of retina as illustrated in Fig. 6B. Although the visual field is two-dimensional, a simpler one-dimensional analysis will be sufficient for developing the main results which follow. By the assumption of homogeneity, the weighting function of each receptive field has the same form but is centered on different x values for different output neurons. The cells need not be equally spaced for the following general results to hold.
If we let xj be the location of the receptive field center of the jth neuron in the array, then the weighting function for that particular cell will be
$$w_j(x) = w(x - x_j) \qquad (3)$$
and the corresponding response rj is found by combining Eqs. (2) and (3) to give
$$r_j = \int_{rf} i(x)\, w(x - x_j)\, dx \qquad (4)$$
It is important to emphasize at this point that although the neural image and light image are distinctly different entities, they share a common domain and similar language can be used to describe optical and neural images. For example, both kinds of image are functions of x, the visual direction, and a natural correspondence exists such that when the jth output neuron responds at level rj it is sending a message to the brain that a certain amount of light (weighted by the receptive field) has been received from visual direction xj. The variation of these neural responses across the array indicates the presence of contrast in the neural image just as the variation of light intensities across the retina signifies the presence of spatial contrast in the optical image. On the other hand, because of the regional specialization of the retina, the difference between visual directions of neighboring cells is small in the fovea and large in the periphery. Consequently, on a global level the neural image is spatially distorted causing the fovea to be highly magnified in comparison with the periphery. 20 This complication will be avoided in the present analysis by assuming local uniformity of scale over small regions.
The result embodied in Eq. (4) can be placed on more familiar ground by temporarily ignoring the fact that the neural image is discrete. That is, consider substituting for xj the continuous spatial variable u. Then Eq. (4) may be re-written as
$$r(u) = \int_{rf} i(x)\, w(x - u)\, dx \qquad (5)$$
which is recognized as a cross correlation integral. 8 In other words, the discrete neural image is interpolated by the cross-correlation of the input with the receptive weighting function of the neuron. We may therefore retrieve the neural image by evaluating the cross-correlation result at those specific locations xj which are represented by the neural array. Using the standard pentagram (★) notation for cross correlation, the result is
$$r(x_j) = \big[\, w(x) \star i(x) \,\big]_{x = x_j} \qquad (6)$$
Replacing the awkward cross correlation operation with convolution yields
$$r(x_j) = \big[\, i(x) * w(-x) \,\big]_{x = x_j} \qquad (7)$$
In other words, the discrete neural image is interpolated by the result of convolving the retinal image with the reflected weighting function of the neural receptive field.
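The equivalence of Eqs. (6) and (7) is easily verified numerically. In this sketch (with arbitrary image and weighting-function samples), cross-correlation with w gives the same interpolated neural image as convolution with the reflected kernel w(−x).

```python
import numpy as np

rng = np.random.default_rng(0)
i = rng.random(200)          # arbitrary retinal image samples
w = rng.random(31)           # arbitrary, deliberately asymmetric weighting

corr = np.correlate(i, w, mode="full")         # Eq. (6): cross correlation
conv = np.convolve(i, w[::-1], mode="full")    # Eq. (7): reflected kernel

# The two continuous interpolations agree everywhere, so sampling either
# one at the receptive-field centers x_j gives the same neural image.
print(np.allclose(corr, conv))
```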
By analogy with the optical point-spread function, it is useful to define the neural point-spread function n(xj) as the neural image for a point source of stimulation on the retina. An expression for the neural point-spread function is obtained by letting i(x) be the impulse function δ(x) in Eq. (7) and then applying the sifting property 8 of the impulse function

$$n(x_j) = w(-x_j) \qquad (8)$$
where n(xj) denotes the response of the neuron at location xj to the point source. This fundamental relationship between the neural point-spread function and the receptive field is summarized by the following theorem, which is illustrated graphically in Fig. 6B: The neural image formed by a homogeneous array of linear neurons in response to a point of light on the retina is equal to their common receptive field weighting function, reflected about the origin, and evaluated for those visual directions represented by the array. This neural image is called the neural point-spread function. In what follows, the discrete neural point-spread function will be designated n(xj) while its continuous interpolation will be designated n(x).
The concept of a neural point-spread function is useful for specifying the output neural image for an arbitrary visual stimulus. Combining Eqs. (1), (7) and (8) we obtain
$$r(x_j) = \big[\, o(x) * p(x) * n(x) \,\big]_{x = x_j} \qquad (9)$$
For the same reasons mentioned above in relation to Fig. 5, it is sometimes useful to interpret the term p(x)*n(x) as the neural image projected optically back into object space where it is convolved with the stimulus. To put these results in a form which emphasizes the role of neural sampling, we multiply the right-hand side of (9) by a continuous "array function" a(x), which is defined to be unity at each visual direction xj represented by the neural array but is zero for all the in-between directions. By this maneuver the result of Eq. (9) takes on its final form
$$r(x) = \big[\, o(x) * p(x) * n(x) \,\big] \cdot a(x) \qquad (10)$$
It may be helpful to think of this result as the neurophysiological counterpart to the well-known "array of arrays rule" in the engineering study of antennas. 8 According to this rule, the field pattern of a set of identical antennas is the product of the pattern of a single antenna and an "array factor", which is the pattern that would be obtained if the set of antennas were replaced by a set of point sources.
Equation (10), which is illustrated graphically in Fig. 7, exposes the conceptual simplicity of signal processing by the common P-class of optic nerve fibers of the human eye. The final neural image is seen to be the result of three sequential stages of processing. First the object is optically filtered (by convolution with the optical point-spread function) to produce a retinal image. Next the retinal image is neurally filtered (by convolution with the neural point-spread function) to form a hypothetical, continuous neural image. Finally, the continuous neural image is point-sampled by the array of ganglion cells to produce a discrete neural image ready for transmission along the optic nerve to the brain. Notice the change of viewpoint embodied in Eq. (10). Initially the output stage of the retina was portrayed as an array of finite, overlapping receptive fields which simultaneously sample and filter the retinal image. Now this dual function is split into two distinct stages: neuro-optical filtering by the receptive field followed by sampling with an array of point samplers. Thus we see that neural filtering by non-point samplers is functionally equivalent to more traditional forms of filtering, such as that provided by optical blurring, which occur prior to the sampling operation.
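The three-stage reading of Eq. (10) translates directly into a processing pipeline. The sketch below assumes Gaussian optical and neural point-spread functions and an idealized, evenly spaced array; it filters an object twice and then point-samples the result.

```python
import numpy as np

dx = 0.05
x = np.arange(-20.0, 20.0, dx)               # visual direction (arcmin)

def gaussian_kernel(x, w_eq, dx):
    """Unit-area Gaussian with a prescribed equivalent width."""
    sigma = w_eq / np.sqrt(2.0 * np.pi)
    g = np.exp(-x**2 / (2.0 * sigma**2))
    return g / (g.sum() * dx)

o = np.where(np.abs(x) < 5.0, 1.0, 0.0)      # object: bright bar
p = gaussian_kernel(x, 1.0, dx)              # assumed optical PSF
n = gaussian_kernel(x, 2.0, dx)              # assumed neural PSF (symmetric,
                                             # so reflection has no effect)

# Stages 1 and 2 of Eq. (10): optical then neural filtering.
continuous = np.convolve(np.convolve(o, p, mode="same") * dx,
                         n, mode="same") * dx

# Stage 3: point sampling by the array function a(x), here an
# idealized, evenly spaced array with spacing S.
S = 1.0
step = round(S / dx)
neural_image = continuous[::step]
print(neural_image.size)                     # one sample per arcmin
```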

Figure 7. Sequential model of neural image formation by the eye. Objects are first filtered by the optical system of the eye and then sampled by the retinal array of neurons. Conceptually a retinal array of non-point samplers is equivalent to a filter followed by point sampling.
A spectral description of the neural image may be obtained by either of two methods. One approach is to compute the Fourier transform of r(x), the hypothetical, continuous interpolation of the neural image specified in Eq. (10). One disadvantage of this method is that it invites the potential conceptual error of supposing that the spatial frequency spectrum of the neural image may contain frequency components beyond those which can actually be supported by the discrete neural array. An alternative approach is to determine the discrete Fourier transform of Eq. (9). This is straightforward when the sample points of the neural array are evenly spaced but is significantly more complex for an irregular, two-dimensional sampling mosaic. 110 In either case, aliasing in the neural image will occur unless the combined filtering action of the optics and receptive fields eliminates all frequency content beyond the Nyquist limit. This issue is the topic of the following section.
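For an evenly spaced array, the second (discrete-transform) approach amounts to taking the DFT of the sampled responses. In this sketch (arbitrary spacing and stimulus frequency, chosen so whole cycles fit the record), the energy of a super-Nyquist grating appears folded back at its alias frequency.

```python
import numpy as np

S = 0.1                               # sample spacing (deg); Nyquist = 5 cyc/deg
xj = np.arange(0.0, 10.0, S)          # 100 evenly spaced array positions
r = np.cos(2.0 * np.pi * 8.0 * xj)    # neural image of an 8 cyc/deg grating

spectrum = np.abs(np.fft.rfft(r))
freqs = np.fft.rfftfreq(r.size, d=S)

# The discrete spectrum cannot represent 8 cyc/deg (the band ends at
# 5 cyc/deg); all the energy appears folded back to the 2 cyc/deg alias.
print(freqs[np.argmax(spectrum)])
```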
An important design criterion for man-made digital communications systems is to avoid the prospect of aliasing caused by undersampling of analog signals. Often it is preferable to discard high-frequency information by pre-filtering the signal rather than to allow corruption of the remaining low-frequencies by aliasing. The design of the human eye seems to follow this good engineering practice to protect foveal vision from aliasing since the optical cutoff frequency is about equal to the Nyquist limit of the mosaic of cone photoreceptors. 9 On the other hand, the penalty of optical anti-alias filtering is an unavoidable loss of image contrast for all spatial frequencies. It is not obvious that the benefits gained by avoiding aliasing are worth the cost of contrast attenuation in biological vision systems. To the contrary, the fact that the optical cutoff frequency exceeds the neural sampling limit over most of the visual field in humans and other animals suggests that the cost of an optical solution is prohibitively high. 69, 76 From a different viewpoint, it may actually be advantageous to allow undersampling since this will reduce the statistical correlation between the outputs of neighboring ganglion cells, reduce redundancy in the neural image, and hence make more efficient use of the available channel capacity of the optic nerve. 42
A strategy which the visual system might adopt in order to avoid conspicuous moiré effects when undersampling periodic patterns is to introduce irregularity into the sampling mosaic. 109 This view is supported by anatomical evidence of a degree of irregularity in the mosaic of retinal ganglion cells, 96 although the array is surprisingly regular in humans even in the peripheral retina. 18 Perceptually, the subjective appearance of aliasing often lacks distinct periodicity (see Fig. 3B), an observation which lends further support to the irregularity hypothesis. However, it is not uncommon for the alias to look very much like a grating, which suggests that the degree of irregularity is not so great as to eliminate moiré effects altogether. Perhaps regularity persists even in peripheral retina because the penalties of irregular sampling outweigh the benefits. Irregular sampling causes a deterioration of the quality of the neural image because of image demodulation and distortion, 23, 29 which has serious consequences for spatial resolution 30 and for feature localization. 7
The processing cascade depicted in Fig. 7 suggests that the eye's optical system in combination with neural receptive fields could act as a low-pass, anti-aliasing filter. If low-pass filtering by visual neurons is to be an effective anti-aliasing filter, then neural receptive fields must be relatively large compared to the spacing of the array. We can develop this idea quantitatively without detailed knowledge of the shape of the receptive field weighting function by employing Bracewell's equivalent bandwidth theorem. 8 This theorem states that the product of equivalent width and equivalent bandwidth of a filter is unity. In the present context, the equivalent width of the neuro-optical filter equals the equivalent diameter (dE ) of the receptive fields of retinal ganglion cells as measured in object space. The equivalent bandwidth of this filter is the bandwidth of the ideal, low-pass filter which has the same height and area as the Fourier transform of the receptive field. If we adopt the equivalent bandwidth as a measure of the highest frequency passed to any significant extent by the filter, then by applying Bracewell's theorem we find that the cutoff frequency is 1/dE. (This is a conservative criterion. In Bracewell's terminology, the bandwidth of an ideal low-pass filter is twice the cutoff frequency. Here we are effectively assuming that the tail of the neural low-pass filter is insignificant beyond twice the cutoff frequency of the equivalent, ideal filter.) To avoid aliasing, the cutoff frequency of the filter must be less than the Nyquist frequency (0.5/S) as set by the characteristic spacing S of the array. Thus aliasing will be avoided when dE > 2S, that is, when the equivalent radius of the receptive field exceeds the spacing between fields.
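The one-dimensional criterion can be packaged as a simple test. The sketch below encodes the conservative cutoff 1/dE and the Nyquist limit 0.5/S used in the argument above; the numerical example is the foveal case discussed elsewhere in this chapter.

```python
def aliasing_expected(d_eq, spacing):
    """One-dimensional anti-aliasing test from the equivalent-bandwidth
    argument: the filter cutoff is taken as 1/d_eq and the Nyquist
    limit as 0.5/spacing (same angular units for both arguments).
    Returns True when the filter passes frequencies the array
    cannot represent, i.e. when d_eq < 2 * spacing.
    """
    return 1.0 / d_eq > 0.5 / spacing

# A 1 arcmin receptive field on a 0.5 arcmin lattice sits exactly at
# the critical condition, so no aliasing is predicted.
print(aliasing_expected(1.0, 0.5))      # False
print(aliasing_expected(0.6, 0.5))      # True: field too small for the spacing
```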
Figure 8. Coverage of visual field by square array of circular receptive fields. Left (A): visual field is subdivided into nearest-neighbor regions; S = spacing between fields, R = radius of field. Right (B): critical case where the cutoff spatial frequency for individual receptive fields just matches the Nyquist frequency of the array; λ = period of grating at the Nyquist frequency for the array.
A similar line of reasoning can be developed for two-dimensional receptive fields. In Fig. 8 the visual field is tessellated by an array of square tiles, with each tile containing the circular receptive field of a visual neuron. Assuming radial symmetry of the fields, the generalization of Bracewell's theorem to two dimensions states that the product of equivalent width and equivalent bandwidth is 4/π, and so (by the above conservative criterion) the cutoff frequency for individual neurons will be $4/(\pi d_E)$. The Nyquist frequency of the array will vary slightly with grating orientation, 57 but 0.5/S remains a useful lower bound. Thus the anti-aliasing requirement is that $d_E > 8S/\pi$. In other words, aliasing will be avoided if the equivalent radius of the receptive field exceeds $4/\pi$ times the spacing between fields. To within the level of approximation assumed by this analysis, 4/π is close enough to unity that the one-dimensional and two-dimensional requirements for avoiding aliasing are essentially the same. Thus, we conclude from these arguments that effective anti-alias filtering requires that the radius of receptive fields be greater than the spacing between fields (i.e., R > S). The critical case (R = S) is depicted in Fig. 8B, along with the grating stimulus which is simultaneously at the Nyquist frequency for the array and at the cutoff frequency of the neuro-optical filter.
Neuroscientists are well aware of the importance of aliasing as a limit to the fidelity of the visual system and so have devised a simple measure called the coverage factor to assess whether a given retinal architecture will permit aliasing. 21, 33, 52 Conceptually, the coverage factor of a neural array measures how much the receptive fields overlap. To calculate this overlap we tessellate the visual field into nearest-neighbor regions (also called Voronoi or Dirichlet regions) as illustrated for a square array in Fig. 8A and then define
$$\mathrm{coverage} = \frac{\text{area of receptive field}}{\text{area of Dirichlet region}} = \frac{\pi R^2}{S^2} \qquad (11)$$
(For a hexagonal array the area of a tile is $\sqrt{3}S^2/2$ and thus the coverage is $2\pi(R/S)^2/\sqrt{3}$.)
The utility of this measure of overlap is that it encapsulates into a single parameter the importance of the ratio of receptive field size to receptive field spacing as a determinant of aliasing. For the critical case shown in Fig. 8B, R = S and therefore the coverage factor equals π (for a square array) or 2π/√3 (for a hexagonal array). In other words, if the coverage is less than about 3 we can expect aliasing to result. Physiological evidence suggests that coverage may have to be as high as 4.5 to 6 in order to avoid aliasing completely, since retinal ganglion cells in cat 12, 52 and monkey 15 continue to respond above noise levels to gratings with spatial frequency 1.5 to 2 times greater than that estimated from their equivalent diameter. Such responses to very high frequencies may represent a kind of "spurious resolution" in which the phase of the response reverses, as is to be expected when the receptive field profile has sharp corners or multiple sensitivity peaks. 72, 81
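The coverage definitions above reduce to a one-line calculation. The sketch below evaluates the coverage factor for square and hexagonal arrays and reproduces the critical values π and 2π/√3 at R = S.

```python
import math

def coverage(R, S, lattice="square"):
    """Coverage factor: receptive-field area divided by the area of a
    Dirichlet tile, for fields of radius R on a lattice of spacing S."""
    tile = S**2 if lattice == "square" else math.sqrt(3.0) * S**2 / 2.0
    return math.pi * R**2 / tile

# Critical case R = S: coverage is pi (square) or 2*pi/sqrt(3) (hexagonal),
# both close to 3; arrays with lower coverage are expected to alias.
print(coverage(1.0, 1.0, "square"), coverage(1.0, 1.0, "hexagonal"))
```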
Analysis of the X-cells in cat 32, 33 and P-cells in monkey retina 15, 56, 90 suggests that the total population of these classes of ganglion cells has sufficient coverage to avoid aliasing, although this is not the case for other, more sparsely populated classes of ganglion cells. These conclusions were based upon physiological assessments of receptive field sizes in object space, which are much larger than corresponding dendritic fields in monkey peripheral retina. 15, 56 Consequently, if calculations are based on anatomical measurements of dendritic field size, then coverage is much less. 15, 56, 90 Furthermore, it has been argued that coverage should be assessed separately for the various sub-types of ganglion cells. 94 For example, X-cells in cat are split equally into two independent populations of opposite polarity ("ON-center" and "OFF-center") which, when considered separately, have insufficient coverage to avoid aliasing. 33, 52 This debate over whether it is more appropriate to consider the combined population of heterogeneous neurons, or the separate sub-populations of homogenous neurons, is compounded in monkey where P-cells are further subdivided into color-coded sub-populations (e.g., receptive field centers connected exclusively to either red-sensitive or green-sensitive cone receptors, with either ON- or OFF- polarity). The available evidence indicates that if each of these sub-populations is separately considered, then none would have sufficient coverage to avoid aliasing. 90
Recent studies on human retina 16-18, 61 and other primates 93 have provided valuable new data on the cell-to-cell spacing of cones and of retinal ganglion cells at various locations across the visual field. Human data for the horizontal meridian of the visual field are shown by the solid curves in Fig. 9A,B. A prominent feature of the human retina revealed by these data is that cells are more widely spaced in the nasal field (i.e. temporal retina) than they are at equivalent eccentricities in the temporal visual field. To be useful here in estimating coverage of the visual field, these anatomical dimensions must be projected into object space as described above in Section 2.2. To do this for cones, we first estimated the mean anatomical diameter of cone inner segments (believed to be the entrance apertures of cones 48 ) from published photomicrographs at various retinal locations (Figs. 2,3 of Curcio et al. 17 ). Assuming a circular cone aperture, we next convolved a two-dimensional cylinder function with the optical point-spread function for a diffraction-limited optical system (shown above in Section 2.1 to be the lower-bound estimate of the optimal point-spread function of the human eye for central and mid-peripheral vision). The result is the cone receptive field in object space, from which we calculated the equivalent radius (an example of this calculation for foveal cones was described above in Section 2.2). In Fig. 9A we compare the results (indicated by symbols) with cone spacing (solid curves, taken from Fig. 7 of Curcio et al. 17 ).

Figure 9. Coverage of visual field by cone receptors (A) and by retinal P-ganglion cells (B) in human retina. Solid curves show average linear spacing between cells and symbols show equivalent receptive field radius in object space. Data are from Curcio et al. and from Kolb et al. (see text).
The comparisons drawn in Fig. 9A reveal that, over most of the visual field, the receptive-field radius of cones is much smaller than their spacing. This implies that aliasing of peripheral targets by the cone array is inevitable when the eye's optics are well focused. In the fovea, however, the situation is quite different. As mentioned earlier (see Section 2.2), minimum cone radius in object space and cone spacing are both about 0.5 arcmin even though the physical radius of the cone aperture on the retinal surface is about half this value. This is precisely the critical situation depicted earlier in Fig. 8B (except that cones are typically packed in a more efficient hexagonal array). In effect, the eye's optics broaden the functional receptive fields of the foveal cones, thereby increasing their coverage factor to achieve the critical value needed to avoid aliasing. Thus we conclude that the neuro-optical filtering of the cone receptors is sufficient to provide effective anti-alias filtering for foveal vision but not for peripheral vision. It is interesting to note that cone undersampling appears to be widespread in the animal kingdom, and so the foveal case, which has spawned a well-entrenched view that the photoreceptor mosaic should "match" the optical image quality of the eye, is more the exception than the rule. 69
A similar analysis of coverage of the visual field by the retinal ganglion cell array in humans is depicted in Fig. 9B. The solid curves show the anatomical spacing between human P-ganglion cells as derived from the anatomical study of Curcio & Allen, 16 assuming that 80% of all ganglion cells are P-cells 56 and that 1 mm on the retina corresponds to 3.3 deg of visual angle. Notice that the curves do not extend to zero eccentricity since retinal ganglion cells which are functionally connected to foveal cones are anatomically displaced into the parafoveal region. Physiological receptive field sizes of retinal ganglion cells in humans are unknown and so we must rely upon indirect estimates based on the anatomical size of dendritic fields. It is generally accepted that the dendritic field of a ganglion cell provides the anatomical support for the central component of the cell's functional receptive field, and so the size of dendritic fields is a good, lower-bound estimate of the size of physiological receptive fields. 53 Anatomical data are available for human retina from several studies, 18, 39, 61 and in Fig. 9B we show the radius of P-cell dendritic fields recently reported by Kolb et al. 39 (their Fig. 19). Unlike other contemporary neuroanatomists, Kolb et al. distinguish between two sub-populations of P-cells which have significantly different dimensions of dendritic fields. In Fig. 9B these two populations are separately represented by closed symbols (P1-cells) and by open symbols (P2-cells). (Note that, unlike the previous analysis with cones, it was not necessary to convolve these relatively large, anatomical receptive fields with the point-spread function of the eye in order to achieve an acceptable level of accuracy.)
The relative proportion of P1 and P2 cell types in human retina is presently unknown. Nevertheless, the comparison drawn in Fig. 9B suggests that if all P-cells were of the larger (P2) type then receptive field radius would match cell-to-cell spacing, and so the coverage factor for this population of cells (about 3) would be sufficient to avoid most, but probably not all, of the aliasing effects of undersampling. It follows that any sub-population of P-cells will suffer an even greater degree of aliasing since coverage is necessarily lower in a sparser array. For example, if a significant number of P-cells are of the smaller (P1) type described by Kolb et al., then the coverage factor for the remaining P2 cells will be that much less than the critical value needed to suppress aliasing. Similarly, P1-cells must generate a significant degree of aliasing since they have such small receptive fields. As may be deduced from Fig. 9B, even if every P-cell were a P1-type, their coverage factor would still be less than 1, which is far below the critical value needed to avoid aliasing.
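As a concrete illustration of this argument, the short sketch below computes a coverage factor as receptive-field area multiplied by areal cell density, assuming hexagonal packing (density D = 2/(√3·S²) for spacing S). The radii and spacing are hypothetical values read loosely from Fig. 9B, not measurements.

```python
import numpy as np

def coverage_factor(radius, spacing):
    """Receptive-field area times areal density for a hexagonal array."""
    density = 2 / (np.sqrt(3) * spacing**2)
    return np.pi * radius**2 * density

# Hypothetical mid-peripheral values (arcmin), loosely after Fig. 9B
print(coverage_factor(radius=5.2, spacing=5.2))  # P2-like: ~3.6, near critical
print(coverage_factor(radius=2.0, spacing=5.2))  # smaller P1-like fields: < 1
```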
The most direct examination of coverage factor in human retina comes from an elegant series of recent experiments by Dacey. 18 By visualizing and staining individual neurons in fresh human tissue, Dacey was able to identify and study all of the P-ganglion cells in a small patch of retina. His observations revealed a large variation in the size of dendritic fields which did not fall neatly into P1 and P2 subclasses. However, he did confirm that P-cells form two independent mosaics (the "ON-" and "OFF-" systems described in greater detail in Section 3.1 below) just as they do in other species. 94 When these two mosaics were examined separately, Dacey found that adjacent dendritic trees apposed one another but did not overlap. Instead they fit together like pieces of a jigsaw puzzle, tiling the retinal surface with interlocking, irregularly shaped areas. This intriguing arrangement was first observed in cat, 94 and it is supposed that a highly specific mechanism must be at work during development of the fetal retina that allows the growing dendritic trees of cells in the same mosaic to expand to fill their space, yet not overlap, just as prescribed by a Dirichlet tessellation scheme. 95 The relevant point here is that, since the ON and OFF sub-mosaics each have coverage no greater than 1, the total population of P-cells will have coverage no greater than 2, which is too small to avoid aliasing.
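A Dirichlet tessellation of this kind is easy to visualize numerically. The sketch below builds one from a jittered lattice of hypothetical cell-body positions using SciPy's Voronoi routine (a Voronoi diagram and a Dirichlet tessellation are the same construction); the lattice and jitter parameters are arbitrary.

```python
import numpy as np
from scipy.spatial import Voronoi

rng = np.random.default_rng(0)
gx, gy = np.meshgrid(np.arange(10.0), np.arange(10.0))
points = np.c_[gx.ravel(), gy.ravel()] + rng.normal(0, 0.15, (100, 2))

vor = Voronoi(points)
# Each Voronoi region is one cell's Dirichlet domain: the tiles interlock,
# fill the plane without gaps, and never overlap -- a coverage of exactly 1.
print(len(vor.point_region), "cells tile the patch")
```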
In summary, if all P-cells participated in a unified sampling mosaic, then their coverage of the peripheral visual field would be sufficient to avoid most aliasing effects even when the retinal image is well focused. However, if P-cells are subdivided into the known functional sub-populations, all of the anatomical studies to date indicate that coverage would be too low to prevent aliasing in the peripheral field of a well-focused human eye. In the next Section we explore the quantitative implications of this conclusion.
In the preceding sections we have argued that neural sampling of the retinal image presents a fundamental limitation to visual resolution. We have also shown how each stage of neural processing in the retina can be conceptually and computationally projected back into object space and therefore be conceived as a sampling mosaic impressed upon the visual world. Given that the visual system consists of a series of anatomically distinct stages (e.g., photoreceptors, bipolars, ganglion cells), we may ask: which of these arrays of neurons sets the ultimate limit for peripheral acuity? Since the beginning of this century the textbook answer to this question has been stated in the negative: the cone array cannot be responsible since peripheral acuity declines much more rapidly with increasing eccentricity than does cone spacing. 111 Each subsequent generation of vision scientists, with its refined measurements of anatomical dimensions and visual performance, has reaffirmed that cones are too tightly packed to limit resolution in peripheral retina and pointed instead to post-receptoral neural circuitry as the determining factor. 3, 28, 75, 80, 104
It stands to reason that the sampling limit of the visual system will be set by the coarsest array of the visual pathway. A comparison of Figs. 9A,B reveals that over most of the retina (excluding the very central area) photoreceptors greatly outnumber ganglion cells, which implies that peripheral ganglion cells sub-sample the photoreceptor array. Wässle et al. have shown in monkey retina that out to 30 deg of eccentricity the number of cones matches the number of interneurons (midget bipolar cells) connecting cones to P-ganglion cells. 91 This implies that the cone array and the array of bipolar interneurons have the same Nyquist limit, which means the neural image carried by the cone mosaic is effectively transferred to the input stage of ganglion cells without loss of spatial resolution. Consequently, if retinal undersampling is the limiting factor for pattern resolution, then human resolution acuity should be well correlated with the Nyquist frequency of the ganglion cell array rather than the cone or bipolar arrays. This basic premise that ganglion cell spacing is the limiting factor for visual resolution has a long history in vision science, having been reaffirmed many times. 33, 75, 80, 93, 104
The comparison between anatomical predictions and human visual performance is drawn in Fig. 10. The symbols in this figure show how resolution acuity for interference fringes varies across the retina. 80, 82 In these experiments resolution acuity was defined as the highest spatial frequency which did not elicit perceptual aliasing (see Section 1.3). Given this criterion, we interpret these resolution data as a direct, non-invasive measurement of the sampling density of the underlying neural array. These estimates are compared with recent anatomical estimates of the Nyquist sampling rate of the population of P-type (i.e. midget) ganglion cells in humans 16 and of cone photoreceptors in humans. 17 For a given density D of neurons, the highest Nyquist frequency occurs when the sampling points are arranged in an hexagonal array. 57 In this case the center-to-center spacing S of the sample points is given by the formula S² = 2/(√3·D) and the Nyquist frequency which applies to gratings of all orientations is 1/(√3·S). 71 Using these two formulas, we computed the Nyquist frequency of neural arrays from anatomical estimates of cell density measured along the horizontal meridian of the human retina. 16, 17 The results for the mosaic of cone photoreceptors are shown by the dashed curve in Fig. 10. We estimated the Nyquist frequency of the sub-population of P-type ganglion cells from measurements of total ganglion cell density by assuming that 80% of all human ganglion cells are P-cells, just as in the monkey. 56 To plot these anatomical data in visual coordinates, the nonlinear projection of the retinal image (which has the effect of reducing the Nyquist rate at larger eccentricities) was computed according to the wide-angle model of Drasdo and Fowler. 70
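A direct transcription of these two formulas into code might look as follows (a minimal sketch; the example density is arbitrary):

```python
import numpy as np

def hex_nyquist(density):
    """Nyquist frequency (cyc/deg, all orientations) of an hexagonal array.

    density is in cells per square degree; spacing follows S**2 = 2/(sqrt(3)*D)
    and the Nyquist frequency is 1/(sqrt(3)*S).
    """
    spacing = np.sqrt(2 / (np.sqrt(3) * density))
    return 1 / (np.sqrt(3) * spacing)

print(hex_nyquist(100.0))   # an assumed 100 cells/deg^2 gives ~5.4 cyc/deg
```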

Figure 10. Comparison of human resolution acuity (symbols, from Thibos et al. 1987) with anatomical predictions based on the sampling density of retinal neurons (dashed = human cones (Curcio et al., 1990), solid = human ganglion cells (Curcio & Allen, 1990)). NF=nasal field, TF=temporal field.
Inspection of Fig. 10 reveals that the cone Nyquist frequency falls rapidly over the first few degrees of eccentricity before leveling off to a much shallower slope in the peripheral retina. It is difficult to measure the density of ganglion cells in central retina because, although they are functionally connected to central cones, their cell bodies are physically displaced from the fovea into the surrounding retina. (This accounts for the break in the solid curve near 0 deg. The other break, between 10 and 15 deg in the nasal retina, corresponds to the optic nerve head.) Despite this uncertainty, it is clear that beyond about 10 deg of eccentricity the sampling density of P-ganglion cells is significantly lower than that of cones. Comparing these two curves with the psychophysical measurements of resolution acuity for interference fringes shows that, beyond about 10-15 deg of eccentricity, human resolution acuity is much lower than that predicted by the sampling density of cones but closely matches the predictions based on ganglion cell density. These predictions are different for nasal and temporal visual fields (because of the well-known nasal/temporal asymmetry of the human and primate retinas 60 ) and the psychophysical data reflect these differences. In fact, the onset of aliasing is so sensitive to small changes in neural sampling density that this psychophysical technique can reveal subtle variations across the visual field associated with the human visual streak (see Section 1.1). 2
Similar experiments have been conducted on macaque monkeys trained to fixate a small spot of light while making resolution judgments about visual targets placed in the peripheral visual field. 46 While it is far easier to train humans to do the same task, the reward for pursuing this arduous experiment in monkey is that it provides invaluable behavioral data to compare with the substantial literature on the physiology and anatomy of the macaque visual system. The results indicated a decline in resolution acuity with eccentricity that followed the density of P-cells in a manner that is very similar to that shown for humans in Fig. 10.
Figure 11. Illustration of the spatial scale of neural arrays in human nasal retina, 30 deg from fovea. A: cone photoreceptors (rods fill the space between cones but are too small to show here). B: P-ganglion cells (receptive fields are shown at half their true diameter to avoid the confusion of overlapping fields). Anatomical dimensions are taken from Fig. 9. Grating patch is 1 deg square and has spatial frequency (5.3 cyc/deg) equal to the psychophysical resolution limit.
To gain an intuitive appreciation of these results, it may help the reader to inspect a scale drawing of the various neural arrays in the eye overlaid upon the visual stimulus. For example, in Fig. 11A we show a 1 deg × 1 deg patch of grating which is just at the resolution limit (5.3 cyc/deg, from Fig. 10) when placed 30 deg from the fovea in the temporal visual field. This target is imaged on nasal retina where cone diameter is about 1.5 arcmin and cone spacing is about 3.0 arcmin (from Fig. 9). Cones are depicted by circles arranged in an hexagonal array, but of course in reality the array is not perfectly regular. The space between the cones is filled by numerous rod photoreceptors which are about 0.5 arcmin in diameter. Rods are too small to illustrate here, but notice that the space available between neighboring cones is large enough to accommodate about 3 rods side-by-side. For comparison with the cone array we show in Fig. 11B the array of P-ganglion cells in the same retinal location. Spacing and receptive field radius are both about 5.2 arcmin for the P2-ganglion cells (Fig. 9) chosen for illustration. From this comparison it is clear that the cone array significantly oversamples the grating target but the ganglion cell array is just adequate to meet the Nyquist requirement of two samples per period of the grating. One misleading feature of Fig. 11 is the apparent lack of overlap: the receptive fields of ganglion cells in Fig. 11 are drawn at half their true diameter because, if drawn at full size, their overlap would make it difficult to appreciate their spacing. When drawn to scale, the P2-ganglion cell picture looks very much like Fig. 8B.
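The Nyquist bookkeeping behind Fig. 11B can be checked in two lines (values taken from the figure):

```python
spacing_deg = 5.2 / 60          # P2-cell spacing, arcmin -> deg
print(1 / (2 * spacing_deg))    # two samples per period: ~5.8 cyc/deg,
                                # close to the 5.3 cyc/deg resolution limit
```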
On the basis of the close correlations evident between anatomical predictions and human performance, we conclude that undersampling by retinal ganglion cells is the primary limiting factor for pattern resolution in human peripheral vision beyond about 10 deg of eccentricity (provided the retinal image is well focused). However, in the parafoveal region (less than 10 deg of eccentricity) the cone mosaic determines the neural sampling limit 107 since ganglion cells outnumber cones over this region of retina. 16, 93 The same conclusion was drawn for the macaque monkey by Merigan and Katz. 46
An important issue still to be resolved is the role of various sub-populations of P-cells in determining visual resolution performance. There is a large body of anatomical and physiological evidence from numerous vertebrate species demonstrating a fundamental dichotomy in the polarity of neural responses to light. Some cells are visually excited when the receptive field center receives more light than the surrounding areas, and conversely, are inhibited when the central region is darker than the surround. These are called "ON-cells" and are to be contrasted with "OFF-cells" which behave in just the opposite manner (i.e. are excited when the center is dark and inhibited when the center is light). This separation of visual signals into parallel neural channels signaling "brighter than ambient" and "darker than ambient" begins at the very first synapse in the visual system (between photoreceptors and bipolar cells) and is maintained by separate anatomical structures throughout the initial stages of the visual pathway. 90
Since nature seems to have gone to great lengths to separate the neural image into complementary ON- and OFF- neural channels, it might be thought that the two arrays should be considered separately when estimating the sampling density of retinal ganglion cells. On the other hand, it is also conceivable that at some stage the ON- and OFF- channels are recombined, with the overall sampling density determined by the combined array. The rationale for this view is that the ON array provides good dynamic range for signaling the bright bars of a grating and OFF cells provide good, but complementary, dynamic range for the dark bars. Thus ON/OFF parallel pathways in vision bear a striking resemblance to the positive and negative circuits in the classic, push-pull design of electronic amplifiers. Given the very close correlation evident in Fig. 10 between psychophysical resolution and anatomical predictions based on the combined (ON + OFF) populations of ganglion cells, it is tempting to reject the hypothesis that psychophysical performance is limited by sampling density of either sub-array alone. However, this would be pushing the data further than is warranted. Curcio and Allen 16 reported a large degree of inter-subject variability in the density of human ganglion cells, especially in the periphery. For the six retinas in their study, up to threefold differences in ganglion cell density were found at corresponding points in peripheral retina, which implies a √3-fold difference in Nyquist frequencies. Given such large variability between individuals, the psychophysical experiments of Fig. 10 cannot reveal the relatively small, √2-fold difference in Nyquist frequencies that distinguishes the combined (ON + OFF) hypothesis from the separate (ON or OFF) hypothesis. For the same reason it is not yet possible to decide whether there is any substantive disagreement between the approach of Thibos et al. 80 who compared resolution to the total P-cell population and that of Merigan & Katz 46 who compared monkey resolution to the sub-population of ON (or OFF) P-cells.
Results of our own systematic exploration of the limits to aliasing in human vision are summarized in Fig. 12. A series of experiments was conducted in which cutoff spatial frequency was measured for two different tasks (contrast detection, pattern resolution), for two different types of visual targets (interference fringes, and sinusoidal gratings displayed on a computer monitor with the eye's refractive error corrected by spectacle lenses), at various locations along the horizontal nasal meridian of the visual field. 74, 80, 82
Inspection of Fig. 12 reveals that, for the resolution task, cutoff spatial frequency was the same regardless of whether the visual stimulus was imaged on the retina by the eye's optical system (natural view; open triangles) or produced directly on the retina as high-contrast, interference fringes (closed triangles). This is consistent with our earlier conclusion that, for a well-focused eye, pattern resolution is limited by the ambiguity of aliasing caused by undersampling, rather than by contrast attenuation due to optical or neural filtering. Aliasing first occurs for frequencies just above the resolution limit, so the triangles in Fig. 12 also mark the lower limit to the aliasing portion of the spatial frequency spectrum. Recall that this lower boundary of the aliasing zone is accurately predicted by the Nyquist limit calculated for human P- ganglion cells (data from Fig. 10 replotted in Fig. 12 as a dotted curve).
The upper limit to the aliasing zone is determined by performance on the detection task. Detection acuity is significantly lower for natural viewing (open squares) than for interferometric viewing (filled squares) at all eccentricities. Consequently, the spectrum of frequencies for which aliasing occurs is narrower for natural viewing (shaded region) than for interference fringes (shaded + cross-hatched regions). This difference is directly attributable to imperfections of the eye's optical system 9 since all else is equal: in both cases the neural apparatus faces the identical task (contrast detection) of the same stimulus (sinusoidal gratings). Notice that for natural viewing the aliasing zone narrows with decreasing eccentricity and vanishes completely at the fovea, where contrast sensitivity for detection and for resolution of gratings is nearly identical. Thus under normal viewing conditions the fovea is protected from aliasing by optical low-pass filtering, whereas in the periphery the optical quality of the human eye remains quite good (assuming refractive errors are corrected) and so the eye's optics are ineffective as an anti-aliasing filter for the coarse sampling of the peripheral retina.

Figure 12. Summary of optical and neural limits to pattern detection and pattern resolution across the visual field in humans. Symbols show psychophysical performance (mean of 3 subjects from Thibos et al., 1987) for grating detection (squares) and resolution (triangles) tasks under normal viewing conditions (open symbols) or when viewing interference fringes (closed symbols). The aliasing zone extends from the resolution to the detection limits. Solid curve drawn through open squares indicates the optical cutoff of the eye and marks the upper limit to the aliasing zone for natural viewing (shaded). The expanded aliasing zone observed with interference fringes (cross-hatched area) extends beyond the optical cutoff to a higher value set by neural factors. Dashed curve shows computed detection limit of individual cones (from Curcio et al., 1990) and dotted curve shows computed Nyquist limit of retinal ganglion cells (RGC; from Curcio & Allen, 1990).
The neuro-optical model discussed in Section 2 above suggests that the optical limits to pattern detection revealed in Fig. 12 may be described in two ways, corresponding to image-space and object-space viewpoints. For an image space analysis, we conceive of the eye as a concatenated series of image processing stages as in Fig. 7. From this viewpoint the eye's optical system plays the role of a low-pass filter which attenuates the contrast of the retinal image, thus lowering the maximum detectable spatial frequency. Alternatively, if we imagine the projection of neural receptive fields into object space as in Fig. 5, then the effect of the eye's optical imperfections is to blur the neural receptive fields, creating a greater degree of overlap and thus lowering the cutoff frequency for contrast-detection by the neural array. From both of these viewpoints emerges the same question: which neural receptive fields limit pattern detection? Evidently the fields in question are extremely small since the cutoff spatial frequency for detecting interference fringes is over 30 cyc/deg even in the far periphery beyond 30 deg of eccentricity. This implies (if we assume, to first approximation, that cutoff spatial frequency is equal to the inverse of receptive field diameter) that the limiting field diameter must be less than 2 minutes of arc in visual angle, which corresponds to less than 10 microns on the retinal surface. The only visual neurons with such small receptive fields are the cone photoreceptors.
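The arithmetic of this inference is simple enough to spell out (the 3.3 deg/mm conversion is the one adopted earlier for Fig. 9B):

```python
cutoff = 30.0                     # cyc/deg, peripheral detection cutoff
diam_deg = 1 / cutoff             # assumed cutoff = 1/diameter
print(diam_deg * 60, "arcmin")    # ~2 arcmin of visual angle
print(diam_deg / 3.3, "mm")       # ~0.010 mm = 10 microns of retina
```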
To pursue the idea that human detection of interference fringes throughout the visual field is limited by the size of cone apertures, we estimated the cutoff spatial frequency of cones (assuming cutoff = inverse of aperture diameter) from the data of Curcio et al. 17 and compare this result with the upper boundary of the aliasing zone determined psychophysically. 80 As may be seen from Fig. 12, the calculated cutoff for cones (dashed curve) closely matches the psychophysical detection limit, thereby lending quantitative support to the suggestion that aliasing is curtailed by spatial averaging over the entrance aperture of individual cone photoreceptors. Although this argument is widely accepted for foveal vision 48, 70, 71, 106 it was surprising to find evidence in favor of this hypothesis also for peripheral vision since receptive fields of even the smallest primate retinal ganglion cells were thought to pool many cone receptive fields in peripheral retina. 5, 61 However, recent anatomical experiments add considerable weight to the idea that some human ganglion cells (the P1 class identified by Kolb et al. 39 ) have receptive field centers consisting of a single cone in peripheral vision, just as Polyak first described for foveal vision. 58
In his classic description of the architecture of the human fovea, 58 Polyak coined the term "monosynaptic pathway" to stand for the exclusive, 1-to-1 pathway from single cones to single ganglion cells via single bipolar cells. Although Polyak believed the monosynaptic pathways were restricted to the foveal region, a growing body of evidence suggests the existence of a monosynaptic pathway also in human peripheral retina. It has been known for 25 years that an individual midget bipolar cell of the primate retina makes exclusive synaptic contact with only one cone photoreceptor, 37 but only recently has it been reported that the output of some individual midget bipolars is directed to single midget ganglion cells (the P1 sub-class). 38, 39 Although there are enough midget bipolar cells to act as interneurons in such a monosynaptic pathway, 91 there are not enough ganglion cells available to provide a 1-to-1 pathway for every cone. Nevertheless, the anatomical evidence clearly shows that some P-ganglion cells exist in the periphery which are functionally connected to single bipolar cells, which in turn are connected to single cones. Corroborating physiological evidence of extremely small receptive fields has been reported by Crook et al., 15 who found that about 20% of cells recorded in peripheral monkey retina responded to gratings well above the Nyquist limit.
Although the anatomical and physiological evidence reviewed above is consistent with the psychophysical measurements of detection acuity, a 1-to-1 connection from cones to ganglion cells is not strictly required by the psychophysical evidence. A small degree of convergence of cone signals onto ganglion cells could be present and not markedly limit detection acuity because, unlike the foveal region, the peripheral cones are not tightly packed (numerous, smaller rods fill the space between cones). Given a grating with a half-period smaller than cone spacing yet larger than the cone radius, individual cones will act as functional subunits that sample the sinusoidal stimulus at random locations. If the number of these subunits per ganglion cell is small, a significant degree of neural contrast may persist across an array of such neurons. However, if the number of randomly located subunits is large, then their effects will tend to cancel and the contrast in the neural image will vanish, thus precluding the possibility of psychophysical detection.
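This cancellation argument can be illustrated with a small Monte Carlo sketch in which cones are treated as point samplers at random positions (ignoring the cone aperture) and each model ganglion cell averages a fixed number of such subunits; the frequency and subunit counts are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
freq = 20.0    # cyc/deg, well above the model ganglion-cell Nyquist limit

def residual_modulation(n_subunits, n_cells=2000):
    """Std. dev. across an array of cells, each averaging random cone samples."""
    x = rng.uniform(0, 1, (n_cells, n_subunits))   # cone positions (deg)
    return np.sin(2 * np.pi * freq * x).mean(axis=1).std()

for n in (1, 7, 50):
    print(n, "subunits ->", residual_modulation(n))
# Modulation falls roughly as 1/sqrt(n): a few subunits preserve usable
# neural contrast, but many subunits average it away.
```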

Figure 13. Frequency response characteristics of hypothetical P-ganglion cells. Top row shows receptive field maps of 3 examples (left: one cone/ganglion cell; middle: seven closely-packed cones per ganglion cell; right: seven widely-spaced cones per ganglion cell for which R/S=0.3). Bottom row shows contour plot of the magnitude portion of the 2-dimensional Fourier transform of the corresponding receptive field.
To quantify these arguments, we computed the Fourier transforms of three hypothetical receptive fields illustrated in the top row of Fig. 13. At the left is a small receptive field the size of a single cone, in the center is a medium-sized field formed by summing the outputs of 7 closely-packed cones, and on the right is a large field formed by summing the outputs of 7 widely spaced cones. To make the latter case representative of the mid-peripheral retina (30 deg eccentricity; see Fig. 9), cone radius R was set to 30% of the spacing S between cones. Directly below each receptive field map is shown a contour plot of the corresponding two-dimensional frequency spectrum of the receptive field. As expected from the equivalent width theorem, since the medium field is three times larger than the small field, it has one-third the bandwidth. However, for a peripheral ganglion cell connected to widely spaced cones, the equivalent width theorem is not a useful guide to understanding the more complicated filtering characteristics of the neuron. As may be seen in the bottom, right panel of Fig. 13, the overall bandwidth is similar to that of a single cone, but there are gaps in the spectrum at intermediate frequencies for which the cell is less sensitive. Evidence of such gaps in the spectrum has not been reported in perceptual aliasing experiments, but this is not surprising since it is very unlikely that all ganglion cells in a given region of the retina would be connected in precisely the same fashion. Instead, the anatomical evidence of large variance in dendritic field diameters indicates that neighboring ganglion cells are probably connected to a small but variable number of cones. 18 As a population, this would have the effect of filling in any gaps in the frequency spectrum and yet allow the array to signal the presence of patterns which are much finer than the Nyquist limit of the array and so produce the aliasing phenomenon of peripheral vision.
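The following sketch reconstructs receptive fields in the spirit of Fig. 13 and examines their spectra; the grid, the half-maximum bandwidth criterion, and the exact subunit geometry are assumptions for illustration, not the computation behind the figure.

```python
import numpy as np

dx = 0.5                                   # arcmin per sample (assumed)
x = np.arange(-64, 64) * dx
X, Y = np.meshgrid(x, x)

def disk(cx, cy, radius):
    return ((X - cx)**2 + (Y - cy)**2 <= radius**2).astype(float)

S = 5.2                                    # cone spacing, arcmin (Fig. 9)
Rc = 0.3 * S                               # cone radius, R/S = 0.3
angles = np.arange(6) * np.pi / 3          # six neighbors around a central cone

small = disk(0, 0, Rc)                                                  # 1 cone
medium = small + sum(disk(2*Rc*np.cos(a), 2*Rc*np.sin(a), Rc) for a in angles)
large = small + sum(disk(S*np.cos(a), S*np.sin(a), Rc) for a in angles)

for name, rf in (("1 cone", small), ("7 packed", medium), ("7 spread", large)):
    spec = np.abs(np.fft.fftshift(np.fft.fft2(rf)))
    # Fraction of the frequency plane above half-maximum: a crude bandwidth proxy
    print(name, (spec > 0.5 * spec.max()).mean())
```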
For many real-world tasks there is considerable pressure to increase the amount of information presented to a human operator through a visually-coupled interface. However, this demand inevitably leads to central vision overload, and so new strategies must be considered for alleviating some of the workload. One idea is to improve the information-handling capacity of central vision by avoiding the optical limitations normally present in the eye. Another idea is to shift some of the information from the overcrowded central portion of the visual field into the vastly larger peripheral field. This latter strategy is also an option for many forms of visual dysfunction in which central vision is reduced or lost completely due to disease, injury, or the normal aging process. In this section we critically examine these two approaches in the context of the neuro-optical model of human visual performance presented above. In that model, low-pass filtering by the optical system of the eye, followed by image sampling by the mosaic of retinal neurons, sets absolute limits to the transmission of visual information to the human operator. These limits are different for central and peripheral vision and thus have implications for both of the design strategies just mentioned. We conclude with a discussion of biological visual systems as a model for space-variant image processing by machine vision systems.
Developing methods to by-pass the optical limitations of central vision is an attractive goal of current research in visual and ophthalmic optics. The optical system of the human eye acts as a low-pass spatial filter which significantly reduces the contrast of the retinal image. 11 For small pupils the major limiting optical factor is diffraction at the pupil margin. Theory predicts that for a diffraction-limited optical system, the optical cutoff spatial frequency for a 2 mm pupil is in the range 60-100 cyc/deg for the visible spectrum. Although the diffraction cutoff is proportionally higher for larger pupils, these potential values are never realized in normal vision since optical aberrations of the eye limit image quality for larger pupils. On the other hand, it is conceivable that future optical designs of contact lenses, spectacles, or visual stimulation devices will correct the eye's aberrations and thus greatly increase the quality of the retinal image, approaching the very much higher limits set by diffraction. In fact, even the contrast attenuation due to diffraction can be circumvented to some extent by employing a Maxwellian-view design, in which light is concentrated within the pupil in the form of the Fourier transform of the object. 103 Consequently diffraction is avoided for all frequencies low enough to pass through the pupil, although optical aberrations of the eye are not necessarily avoided. 77, 79, 103
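The diffraction figures quoted above follow from the standard formula for the incoherent cutoff of a diffraction-limited system, f_c = d/λ cycles per radian, for pupil diameter d and wavelength λ. A brief sketch:

```python
import numpy as np

def diffraction_cutoff(pupil_mm, wavelength_nm):
    """Incoherent diffraction cutoff, converted from cycles/radian to cyc/deg."""
    cycles_per_radian = (pupil_mm * 1e-3) / (wavelength_nm * 1e-9)
    return cycles_per_radian * np.pi / 180

for wl in (400, 555, 700):    # span of the visible spectrum
    print(wl, "nm:", round(diffraction_cutoff(2.0, wl), 1), "cyc/deg")
# For a 2 mm pupil the cutoff runs from ~50 cyc/deg in the red to
# ~90 cyc/deg in the blue, in rough agreement with the range quoted above.
```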
Given these potential technologies for significantly improving retinal image quality, it is important to ask: what would be the consequences of avoiding the normal optical limitations on retinal image quality? As shown in Fig. 12, the optical cutoff for central vision closely matches the Nyquist limit imposed by the sampling mosaic of foveal cones, which implies that low-pass optical filtering in the naked eye protects central vision from aliasing due to undersampling. Consequently, very fine spatial details are normally filtered out by the eye's optical system before they can be undersampled by the retina. However, when these limitations are avoided experimentally by using Maxwellian-view interferometric stimulation, perceptual aliasing can occur when viewing patterns too fine to be resolved in the fovea. 106 Thus, in the attempt to avoid optical imperfections and improve image quality, the fovea becomes exposed to the effects of undersampling and the result is likely to be misperception of both stationary 82, 106 and moving 3, 13 spatial patterns. On the other hand, Snyder and colleagues have argued convincingly that the penalty of aliasing would be significantly offset by the greater contrast of sub-Nyquist image components resulting from improved image quality. 69
Our understanding of the performance limits of peripheral vision has expanded greatly over the past few years through basic research into the anatomy, physiology, and psychophysics of the primate/human visual system. We now know that the quality of the eye's optical system is nearly as good in the periphery as it is for foveal vision, provided that focusing errors are corrected with spectacle lenses. However, the functional sampling density of the retinal mosaic of neurons is much lower in the periphery than it is in the fovea. Consequently, the peripheral retinal image is subject to retinal undersampling, which will cause perceptual aliasing of spatial frequencies greater than the classical resolution limit. The bandwidth of the aliasing spectrum can be very large in the peripheral field since the lower limit (e.g. about 3 cyc/deg at 30 deg of eccentricity) is set by the sampling density of a very sparse array of retinal ganglion cells while the upper limit (about 30 cyc/deg) is set either by the optical system or ultimately by spatial averaging over the entrance aperture of individual cone photoreceptors. These findings imply that the aliasing spectrum of normal peripheral vision may extend up to an order of magnitude beyond the classical resolution limit, as shown in Fig. 12.
In orthodox engineering circles, aliasing is generally held to be an undesirable consequence of undersampling which is to be avoided by anti-alias, low-pass filtering. In a biological system, on the other hand, anti-alias filtering may have a greater cost than benefit. For example, optical low-pass filters attenuate not only the high frequencies to be rejected, but also the low frequencies to be passed. Evidently the design trade-off nature has made in the human visual system is to tolerate the possibility of erroneous perception caused by aliasing in exchange for improved retinal contrast. As pointed out by Snyder et al., 69 retinal image contrast is greatly increased by avoiding the substantial amount of low-pass filtering which would be required to avoid aliasing altogether. What is needed in future research is an assessment of the benefits and costs of various strategies for filtering the aliasing spectrum. Depending upon the specific visual task or application, such filtering may or may not be desirable. At this early stage of research we only know that the spectrum of spatial frequencies subject to aliasing is potentially much larger than the resolvable spectrum in the peripheral field.
Although laboratory experiments have demonstrated aliasing phenomena for simple grating patterns, the impact of aliasing on real-world stimuli with rich frequency spectra is largely unknown. Initial experiments in this area have used edges or compound gratings containing just a few harmonic components. Early results indicate that different grating components can interact, with one masking the appearance of the other. That is, although aliased components can still be seen in the presence of sub-Nyquist components, 86 their visibility is reduced 89 or completely eliminated. 24 Conversely, the visibility of sub-Nyquist gratings is reduced in the presence of supra-Nyquist gratings. 87 Given these interactions, it is perhaps not surprising that aliasing in peripheral vision was not discovered until recently, and that neural undersampling may not be a significant handicap in everyday vision.
The foregoing discussion raises another question relevant to the design of visually-coupled systems: what kind of information is appropriate for peripheral display? Although central vision is commonly regarded as greatly superior to peripheral vision, in many regards just the opposite is true. Night vision is an obvious example, for which the central scotoma is attributed to the lack of rods in the retinal fovea. Another broad area in which peripheral vision excels is the sensing and control of body movement. For example, the visual control of posture, locomotion, and head and eye movements is largely mediated by motor mechanisms sensitive to peripheral stimulation. 31, 45 Many of these functions of peripheral vision are thought of as reflex-like actions which, although they can be placed under voluntary control, largely work in an "automatic-pilot" mode with minimal demands for conscious attention. This suggests that information regarding body attitude, self-motion through the environment, and moving objects is ideally suited for peripheral display, since such a strategy matches the information to be displayed with the natural ability of the peripheral visual system to extract such information. However, retinal undersampling in the periphery can lead to erroneous perception of motion direction. 3, 13
In the beginning of this chapter we used the language of the vision scientist to motivate our investigation of peripheral vision by asking the question: what is the physiological and anatomical basis of the variation of acuity across the visual field? From the viewpoint of the design engineer, however, it is more interesting to ask what advantages accrue from the particular implementation of space-variant imaging found in the human visual system. We mentioned in Section 1.1 that one advantage of having the spatial grain of the retina increase in direct proportion to eccentricity is that it provides an anatomical substrate for neural size constancy. In an elegant series of papers, Schwartz and colleagues have shown that this simple idea has deep mathematical roots and profound implications for how the topographic mapping of retinal space onto the visual cortex is organized, and for its functional consequences. 22, 41, 64-66 Schwartz found that the distortion of the visual field revealed by anatomical and physiological mapping experiments in animals could be accounted for by a simple, logarithmic mapping function from retinal coordinates to cortical coordinates. That is, if the retinal image is referenced to a polar (r, θ) coordinate frame, and the neural image representation on the surface of the lateral geniculate nucleus (LGN) or visual cortex is referenced to a rectangular coordinate frame (u, v), then the transformation from one coordinate frame to the other is given by the pair of equations

u = log(r)
v = θ
This "log-polar" mapping function can be placed in more compact form by using complex-valued coordinates and is recognized as a complex logarithmic conformal map. 65

Figure 14. Retinotopic mapping from a non-uniform retina to a uniform lateral geniculate nucleus (LGN) distorts the brain's representation of the visual field. A: Schematic diagram of topographic map illustrates why the mapping magnifies the central field and compresses the periphery. B: In Schwartz's logarithmic conformal model the mapping function u=log(r) transforms a set of retinal neurons for which local spacing (open symbols, right ordinate) is proportional to eccentricity into a series of equally spaced locations in the LGN (filled symbols, left ordinate). The magnification du/dr of this transformation process varies as 1/r.
How might the log-polar transformation be implemented in the anatomical wiring of the visual pathway? Perhaps the simplest model is that the output ends of the optic nerve fibers spread uniformly over the surface of the LGN, the next stage of the primary visual pathway, and that this basic organization is transferred more or less intact to the visual cortex. A simplified schematic of the basic idea is shown in Fig. 14A. Although ganglion cells are tightly packed in the central regions of the retina and more widely spaced in the periphery, subsequent stages of the visual pathway are just the opposite: the LGN and visual cortex are conspicuously homogeneous. Consequently, if the optic nerve fibers of the eye fan out to cover the LGN surface uniformly, the result will be a distortion of the neural image yielding a disproportionately large representation of the foveal region of the visual field.
Quantitative support for this model emerges from its predictions about ganglion cell spacing. As illustrated in Fig. 14B, the spacing of a radial sequence of retinal cells must be directly proportional to their eccentricity if the logarithmic function is to be used to map their axons onto evenly spaced target-cells in the LGN. To see why this is true, let the local (linear) magnification of the topographic map be given by Δu/Δr, the ratio of the spacing of nearest neighbors in the LGN to the spacing of corresponding nearest neighbors in the retina. This LGN magnification factor is given, in the limit, by the derivative of u with respect to r:

du/dr = 1/r

Since the spacing Δu of the LGN cells is assumed constant, the implication is that the retinal spacing of neighboring cells must be proportional to eccentricity,

Δr = r·Δu ∝ r,
which is precisely the result inferred from psychophysical studies of visual acuity (see Section 1.1). Thus Schwartz's logarithmic model of cortical magnification, created initially to describe anatomical and physiological measurements of topographic mapping, also accounts for the variation of visual resolution across the visual field determined psychophysically. Furthermore, the model provides a mathematical and conceptual framework for understanding the intriguing finding that visual acuity is directly proportional to cortical magnification. 14 Since cortical magnification and the Nyquist limit are both proportional to ganglion cell density 92 they must be proportional to each other.
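The converse check is equally brief: exponentiating a set of evenly spaced LGN positions yields retinal eccentricities whose local spacing grows in proportion to eccentricity (the step size Δu below is arbitrary):

```python
import numpy as np

du = 0.1                         # constant LGN spacing (arbitrary units)
u = np.arange(20) * du           # evenly spaced LGN positions
r = np.exp(u)                    # corresponding eccentricities, r = exp(u)

print(np.diff(r) / r[:-1])       # constant ratio exp(du)-1: spacing grows as r
```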
We cannot hope to do justice here to the many and varied functional implications of the logarithmic conformal model of topographic mapping in biological or machine vision systems. 22, 41, 64-66, 98-101 Instead, we close with a brief description of just two of the more obvious features of the images transformed by this scheme. 65 We have already mentioned the idea of size constancy in the context of "matched filtering" (see Section 1). It is now possible to appreciate the full, two-dimensional nature of this idea and relate it to the concept of cortical magnification, as illustrated in the top of Fig. 15. Suppose an object viewed from afar occupies the area of the visual field marked by the shaded area (left panel). When the viewing distance is reduced, the object grows larger and more eccentric, as shown by the cross-hatched area (middle panel). However, in the transformed neural space of the brain, the neural image remains the same size and is merely translated to a new location (right panel). Thus the cortical image remains a fixed neural size for a looming object of fixed physical size, despite the fact that the retinal image is growing in proportion to its eccentricity from the fovea (see geometry of Fig. 1). Furthermore, the visibility of the object remains roughly constant as it moves into the peripheral field since contrast sensitivity in the periphery depends on the ratio of eccentricity to retinal image size 54 (see the chapter by Peli in this book), which is constant in this scenario.
A different kind of invariance is illustrated in the bottom half of Fig. 15. Here the object (or the eye) rotates around the visual axis, maintaining constant eccentricity. Once again the cortical image is translated, but this time in the orthogonal direction to that for zoom. Such a neural mapping scheme would seem ideally suited for detecting and correcting rotational misalignment of the two eyes about their individual visual axes, an important prerequisite for binocular fusion and depth perception.

Figure 15. Top: Perceptual size constancy is facilitated by the logarithmic conformal mapping scheme for image transformation. If retinal image size is proportional to eccentricity, cortical image size of a looming object is constant. Bottom: The logarithmic mapping scheme converts the rays and exponentially-spaced rings of a polar coordinate reference frame into an evenly spaced, orthogonal grid of a rectangular reference frame. As a result, rotation of the field about the visual axis (r=0) causes translation of the cortical neural image.
Sensors which mimic the design of the human retina, alternatively known as log spiral, conformal logarithmic, and polar exponential grid (PEG) arrays, have found applications in a variety of machine vision systems. These include robot vision, 47, 63 spacecraft docking, 99 and video compression for remote driving. 100 The advantages of log-polar mapping for machine vision would seem to be the same as for biological vision. For example, PEG arrays permit a wide field of view with high central resolution and progressively decreasing resolution for peripheral objects, which greatly reduces memory requirements and processing loads on image processing computers. 99 Rotation and zoom operations, which require matrix multiplication in rectangular coordinates, are more simply computed as addition of log-polar coordinates. Exponential sensor arrays also ease the burden of computing target range from perspective changes, time-to-collision from optic flow velocity, and stereographic depth from binocular images. 98
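The rotation-and-zoom property is immediate from the complex-logarithm form of the mapping: scaling and rotating the input multiplies the complex coordinate, which becomes an additive shift after the logarithm. A two-line check:

```python
import numpy as np

z = 3.0 + 4.0j                           # a retinal location in complex form
scale, rot = 2.0, np.pi / 6              # zoom by 2x and rotate by 30 deg
print(np.log(scale * np.exp(1j * rot) * z) - np.log(z))   # = log 2 + i*pi/6
```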
Perhaps no other aspect of vision science has resulted in such mutual benefit for engineers and biologists. The discovery of log-polar image processing by neurobiologists has been exploited by engineers to improve the design of man-made vision systems. In return, design engineers are revealing the practical advantages of such a system which previously had been mere biological speculation. Together, they have provided fresh ideas about how the eye evolved into such a marvelous instrument for extracting useful information from our visible world.
The ideas presented in this paper have developed in parallel with the dissertation research of D. Walsh, F. Cheney, D. Still, M. Wilkinson, R. Anderson, and Y. Wang. This research was supported by National Institutes of Health grant EY05109.