In Evolution of the Camera Eye we investigated how the simple camera eye first evolved over a period of just 364,000 years. This article looks at the advanced vision of some of our ancestors and explores different characteristics of our own advanced vision.
It was during the Devonian period 375 million years ago that our aquatic ancestor, Tiktaalik roseae, first ‘stepped’ out of the water and onto dry land.
Tiktaalik was a 9-foot (2.75 m) long predator with sharp teeth, a crocodile-like head and a flattened body. It probably had advanced vision that allowed it to see in the shallow freshwater environments in which it hunted and captured its prey.
By 286 million years ago our fish ancestors had evolved into synapsid reptiles like these Dimetrodons. All mammals living today can trace their ancestry directly back to synapsid reptiles, which dominated the land fauna of the Permian and early Triassic periods.
The composition of the soft tissue comprising synapsid eyes is not known, so we are unable to say with any certainty how good synapsid vision actually was. Suffice it to say that fossil evidence suggests that synapsids did have advanced vision. This conclusion has been reached by an examination of the fossilized skulls of synapsid reptiles, which show that they had scleral (sclerotic) rings inside their orbital cavities (eye sockets).
Birds, with their advanced vision, also have sclerotic rings. In birds, as can be seen in the skull of this owl, sclerotic rings offer protection and support to the soft tissue of large, delicate eyeballs, a role that the sclerotic rings of synapsids also performed.
During the Jurassic period 160 million years ago an early mammal species, Juramaia sinensis, lived in trees to escape feathered dinosaur predators. Juramaia would have required well-developed, advanced vision to prey on fast-moving insects and to help it negotiate branches as it ran through the forest canopy.
Early hominids, like this Ouranopithecus from the late Miocene, 9.6 million years ago, would have possessed eyes very similar to our own.
Advanced eyes allowed these early hominids to target their prey whilst simultaneously keeping a lookout for predators.
Human vision has a wide dynamic range. Dynamic range describes how well we see bright and dark objects. We can see objects that are both very bright and very dark, often both at the same time, as is demonstrated in the following example.
Not only does our vision have a wide dynamic range, but we can even see very faint objects billions of miles away.
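To put ‘dynamic range’ in rough numbers: it is often measured in ‘stops’, each stop being a doubling of luminance. The sketch below uses illustrative luminance figures for a sunlit and a starlit scene; they are ballpark values chosen for the example, not measurements from this article.

```python
import math

def dynamic_range_stops(l_max, l_min):
    """Dynamic range in photographic 'stops' (doublings of luminance)."""
    return math.log2(l_max / l_min)

# Illustrative luminance values in cd/m^2 (rough orders of magnitude).
sunlit_scene = 1e5
starlit_scene = 1e-3

print(round(dynamic_range_stops(sunlit_scene, starlit_scene), 1))  # → 26.6
```

A span of roughly 26–27 stops across such scenes is far wider than a single camera exposure can capture, which is why photographers resort to techniques like HDR bracketing.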
We have well-developed binocular vision, which we use to focus both our eyes on the same point. Our binocular field of vision, the region covered by both eyes at the same time, is about 120°.
This image provides a visual representation of the detail we can see with a single glance of our eyes using our 120° field of binocular vision.
Notice how in the central part of our vision we see in color, in 3D and with high ‘visual acuity’ (i.e. with sharpness and clarity). Towards the periphery our vision rapidly deteriorates. At the periphery our vision becomes less acute and we see colors less effectively.
In reality our vision at the periphery is not as bad as the above image would have you believe. This is because we focus on an image for longer than a quick ‘single glance’. The images we see at the periphery have less visual acuity and less color than our central vision, but the images we do see tend not to be distorted in shape.
In contrast, cameras often have difficulty capturing images at the periphery without shape distortion, although peripheral camera images invariably possess both visual acuity and strong colors.
This is illustrated in the photograph of the buildings below. Where the angle of view is wide, at 155°, the relative sizes of objects appear exaggerated and objects near the edges of the frame appear stretched. Where the angle of view is narrow, at 90°, the buildings in the distance lose all sense of depth.
The optimum angle of view is 120°; this is the angle where the relative sizes of buildings and windows are in proportion and where buildings possess a sense of depth.
The optimum angle of 120° just happens to be the dual eye overlap of our own binocular vision!
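For readers who want to relate these angles of view to camera settings: for a rectilinear lens the horizontal angle of view is 2·atan(w / 2f), where w is the sensor width and f the focal length. The sketch below assumes a full-frame sensor 36 mm wide; the focal lengths it computes are illustrative and are not taken from the article’s photographs.

```python
import math

def angle_of_view_deg(sensor_width_mm, focal_length_mm):
    """Horizontal angle of view of a rectilinear lens."""
    return math.degrees(2 * math.atan(sensor_width_mm / (2 * focal_length_mm)))

def focal_length_for_aov(sensor_width_mm, aov_deg):
    """Focal length needed to achieve a given horizontal angle of view."""
    return sensor_width_mm / (2 * math.tan(math.radians(aov_deg) / 2))

FULL_FRAME_WIDTH = 36.0  # mm, horizontal width of a 35 mm 'full-frame' sensor

for aov in (90, 120, 155):
    f = focal_length_for_aov(FULL_FRAME_WIDTH, aov)
    print(f"{aov}° needs a {f:.1f} mm lens")
```

On this assumption a 120° view needs a roughly 10 mm ultra-wide lens, which is exactly the regime where rectilinear lenses stretch objects near the frame edges, as described above.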
It takes technical expertise to ensure that details at the periphery of photographs are distortion free.
Our own brains possess this remarkable technical expertise which ensures our visual processing skills can construct 3D images which are distortion free at the periphery.
What’s even more remarkable about how our brains process what we see is the fact that the image captured by retinal tissue at the back of the eyeball is formed upside down. It is the visual cortex of our brains that reconstructs images so that we see them the right way up.
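The inversion itself is easy to picture: a simple lens projects the scene rotated 180°, and undoing it is just another 180° rotation. A toy sketch with a nested-list ‘image’ (purely illustrative; the brain does nothing this literal):

```python
# A toy 2x3 "image" as nested lists of pixel values.
image = [
    [1, 2, 3],
    [4, 5, 6],
]

# A simple lens projects the scene rotated 180 degrees onto the retina:
# flip top-to-bottom, then left-to-right.
retinal = [row[::-1] for row in image[::-1]]
print(retinal)  # → [[6, 5, 4], [3, 2, 1]]

# "Re-inverting" (another 180-degree rotation) recovers the original.
recovered = [row[::-1] for row in retinal[::-1]]
print(recovered == image)  # → True
```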
Our retinas each contain over 100 million specialized light-sensitive (‘photoreceptive’) cells, a type of neuron, that convert light into a series of electrical nerve impulses. Those electrical impulses are directed along our two optic nerves (the cranial nerves II) to the optic chiasm. It is at the optic chiasm that the optic nerve fibres cross over.
After passing through the optic chiasm, electrical nerve impulses pass along the optic tract to the visual cortex. The visual cortex in the brain interprets the image by extracting shape, meaning, memory and context from the nerve impulses.
Nerve impulses from the left eye are conveyed through the optic chiasm to the visual cortex in both the left and right sides (‘hemispheres’) of the brain. Similarly, nerve impulses from the right eye are conveyed through the optic chiasm to the visual cortex in both hemispheres.
The crossing over of optic nerve fibres at the optic chiasm allows the visual cortex to receive, and process, nerve impulses of everything that a person sees with both eyes.
Each eye sees a slightly different view of the same scene; the differences between these two views are referred to as ‘binocular disparities’.
Processing the slightly different signals from our two eyes allows our visual cortex to generate binocular vision. Binocular vision provides a sense of depth and allows us to see images in 3D.
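Binocular disparity is also how stereo cameras estimate depth. Under a pinhole-camera model the relation is Z = f·B / d, where B is the baseline between the two viewpoints and d the disparity. The sketch below uses a hypothetical stereo rig with roughly human eye spacing; the focal length in pixels is an assumed value, not something from this article.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Stereo depth under a pinhole-camera model: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px

BASELINE = 0.063   # metres; ~63 mm, roughly human interpupillary distance
FOCAL = 1000.0     # focal length expressed in pixels (assumed value)

for d in (100, 50, 10):  # larger disparity => nearer object
    z = depth_from_disparity(FOCAL, BASELINE, d)
    print(f"disparity {d:3d} px -> depth {z:.2f} m")
```

Note how depth resolution degrades with distance: halving the disparity doubles the estimated depth, which mirrors why human stereo depth perception is most useful at close range.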
‘Depth perception’ is an important attribute of human vision that is lacking when we can only see through one eye!
Even with good binocular vision our brains can still trick us into believing there is depth perception where none exists! ‘Trompe-l’œil’ (French for ‘deceive the eye’) is an art technique that uses realistic imagery to create the illusion that something exists in three dimensions.
Take another example: these two curious cheetahs. Do you, when looking at this image, attach equal visual weight to each part or pay more attention to individual parts?
Most people would probably look at the cheetahs’ faces first to identify the species of animal they are looking at.
Whenever we look at an image our eyes move around making small, rapid and jerky eye movements called saccadic movements. We will typically make 3 saccadic eye movements every second. After making a saccadic eye movement, our eyes stop (‘fixate’) and briefly maintain visual gaze on a single location before moving again to make the next saccadic eye movement.
This image tracks the saccadic eye movements and visual fixations of a person looking at a bust of Queen Nefertiti.
Our need for frequent saccadic eye movements and visual fixations can be explained by the composition of the central part of the human retina, known as the fovea. The fovea, which provides the high-resolution portion of our vision, is very small in humans.
Constant eye movement takes place to ensure that the fovea continues to occupy the central part of our vision. This can be illustrated by the image below showing an exaggerated representation of the effectiveness of our foveal vision compared to surrounding peripheral vision.
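The steepness of that falloff is often summarized with a simple model in vision science: acuity roughly halves every E2 degrees of eccentricity away from the fovea. The sketch below assumes E2 ≈ 2°, a commonly quoted ballpark figure, not a value taken from this article.

```python
def relative_acuity(eccentricity_deg, e2=2.0):
    """Simplified falloff model: acuity = e2 / (e2 + eccentricity).

    e2 is the eccentricity at which acuity drops to half its foveal
    value (assumed here to be ~2 degrees).
    """
    return e2 / (e2 + eccentricity_deg)

for ecc in (0, 2, 10, 30):
    print(f"{ecc:2d}° from fovea -> {relative_acuity(ecc):.2f} of foveal acuity")
```

Under this assumption, acuity at 10° eccentricity is already below a fifth of foveal acuity, which is why the eyes must keep making saccades to bring regions of interest onto the fovea.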
It was the renowned Russian psychologist Alfred Lukyanovich Yarbus (1914–1986) who pioneered the study of saccadic eye movements.
Yarbus demonstrated how the saccadic route that our eyes follow differs according to the task our brains ask us to perform. In an interesting experiment, Yarbus recorded the saccadic eye movements of human volunteers as they studied ‘The Unexpected Visitor’ by the Russian artist Ilya Repin.
In advance of each viewing, which lasted three minutes, Yarbus gave each of his observers a single question that he wanted them to answer.
The following provides a summary of the questions Yarbus wanted each observer to answer in addition to the route followed by each observer’s saccadic eye movements and visual fixations.
The first observer was allowed a free examination of the painting;
the second estimated how much wealth the family possessed;
the third estimated the age of each person;
the fourth speculated about what the family had been doing before the arrival of the ‘unexpected visitor’;
the fifth was asked to remember the clothes worn by the people;
the sixth was asked to remember the position of people and objects in the room;
the final observer estimated how long the unexpected visitor had been away from the family.
Yarbus showed that the trajectories followed by the gaze depended on the task that the observer had to perform. If the observer was asked a specific question about the painting, his or her eyes would concentrate on the areas of the image that were relevant to that question.
Our brains track the millions of signals originating from the retina in order to assemble and update a dynamic model of the spatial structure of the environment that we see. There is no actual “surface” in the brain that “we” are looking at.
When we peer into the world outside, the process of seeing is implemented by a network of billions of neurons (cells which transmit nerve impulses); each neuron exchanges signals with adjoining neurons.
The flow of signals along neurons transmits information about different parts of an image seen by our eyes.
In the image below, neurons would typically transmit the outline of the foxes, their color, the fine detail of their fur, the expression on their faces and the 3D definition of their ears. Successive ‘fixations’ made over a very short period of time are integrated into high-resolution visual representations of the foxes and their environment.
There is a huge number of sensory neurons regulating sight in the ‘central nervous system’. These sensory neurons make vast numbers of connections that make sense of what we are looking at and give us the ability to see.
With billions of neurons transmitting so many electrical signals to each other, our brains are sometimes ‘tricked’. There can sometimes be a mismatch between what we think something looks like and what that something really looks like. Everything that enters the retinas of our eyes needs to be interpreted through the brain -and these interpretations can go wrong!
Many scientists say that we should not call these types of image ‘optical’ illusions. Since the illusions occur as a result of brain, rather than eye activity, the term ‘visual’ illusion would be more appropriate.
Instead of thinking that you cannot trust your eyes when you see an illusion, you really should be saying, “I cannot always trust my visual system.”