Red, theory; black, fact.
My post on the thalamus suggests that in thinking about the brain, we should maintain a sharp distinction between temporal information (signals most usefully plotted against time) and spatial information (signals most usefully plotted against space). Remember that the theory of General Relativity, which posits a unified space-time, applies only to energy and distance scales far from the quotidian.
In the thalamus post, I theorized about how the brain could tremendously data-compress temporal information using the Laplace transform, by which a continuous time function, classically containing an infinite number of points, can be re-represented as a mere handful of summarizing points called poles and zeroes, scattered on a two-dimensional plot called the complex frequency plane. Infinity down to a handful. Pretty good data compression, I'd say. The brain will tend to evolve data-compression schemes if these reduce the number of neurons needed for processing (I hereby assume that they always do), because neurons are metabolically expensive to maintain and evolution favors parsimony in the use of metabolic energy.
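To make the "infinity down to a handful" idea concrete, here is a minimal sketch in Python with NumPy. It assumes a damped oscillation as the time function; the values of a and w are illustrative, not taken from the thalamus post. The transform of this signal has one zero and two poles, and the sketch rebuilds the whole waveform from those few numbers. This only works so neatly for signals whose transform is a ratio of polynomials, which is an assumption of the example.

```python
import numpy as np

# Hypothetical example signal: f(t) = exp(-a*t) * cos(w*t), a damped oscillation.
# The values of a and w are illustrative assumptions.
a, w = 0.5, 2.0

# Its Laplace transform is F(s) = (s + a) / ((s + a)^2 + w^2),
# written here as polynomial coefficients in s (highest power first).
numerator = np.array([1.0, a])
denominator = np.array([1.0, 2 * a, a**2 + w**2])

# The "handful of summarizing points": one zero and two poles.
zeros = np.roots(numerator)
poles = np.roots(denominator)

# For simple poles, the residue at pole p is N(p) / D'(p).
residues = np.polyval(numerator, poles) / np.polyval(np.polyder(denominator), poles)

# Rebuild the time function as a sum of complex exponentials, one per pole.
t = np.linspace(0.0, 10.0, 1001)
reconstructed = np.real(sum(r * np.exp(p * t) for r, p in zip(residues, poles)))

# Compare with the signal computed directly from its formula.
direct = np.exp(-a * t) * np.cos(w * t)
max_error = np.max(np.abs(reconstructed - direct))

print("zeros:", zeros)
print("poles:", poles)
print("largest reconstruction error:", max_error)
```

The three complex numbers printed first are the entire description of the signal; the thousand-odd sample points are regenerated from them on demand.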
Ultimately, the efficiency of the Laplace transform seems to come from the fact that naturally-occurring time functions tend to be pretty stereotyped and repetitious: a branch nodding in the wind, leaves on it oscillating independently and more rapidly, the whole performance decaying exponentially to stillness with each calming of the wind; an iceberg calving discontinuously into the sea; astronomical cycles of perfect regularity; and a bacterial population growing exponentially, then shifting gears to a regime of ever-slowing growth as resources become limiting, the whole sequence following what is called a logistic curve.
Nature is very often described by differential equations, such as Maxwell's equations, those of General Relativity, and Schrödinger's equation, the three greats. Other differential equations describe growth and decay processes, oscillations, diffusion, and passive but non-chemically energy-storing electrical and mechanical systems. A differential equation is one that contains at least one symbol representing the rate of change of a first variable versus a second variable. Moreover, differential equations seem to be relatively easy to derive from theories. The challenge is to solve the equation, not for a single number, but for a whole function that gives the actual value of the first variable versus the second variable, for purposes of making quantitative, testable predictions, thereby allowing testing of the theory itself. The Laplace transform greatly facilitates the solution of many of science's temporal differential equations, and these solutions are remarkably few and stereotyped: oscillations, growth/decay curves, and simple sums, magnifications, and/or products of these. Clearly, the complexity of the world comes not from its temporal information, but from its spatial information. However, spatial regularities that might be exploited for spatial data compression are weaker than in the temporal case.
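As a worked illustration of how the Laplace transform turns a temporal differential equation into algebra, consider simple exponential decay. The symbols k (a positive rate constant), y_0 (the starting value), and Y(s) (the transform of y(t)) are introduced here only for the example. Transforming both sides replaces the derivative with multiplication by s, the resulting equation is solved by ordinary algebra, and the answer is read back into the time domain:

```latex
\begin{align*}
\frac{dy}{dt} &= -k\,y, \qquad y(0) = y_0 \\
s\,Y(s) - y_0 &= -k\,Y(s) \\
Y(s) &= \frac{y_0}{s + k} \\
y(t) &= y_0\,e^{-kt}
\end{align*}
```

The transformed solution has a single pole at s = -k and no zeros, so one point on the complex frequency plane, together with the scale factor y_0, stands in for the whole decay curve.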
The main regularity in the spatial domain seems to be hierarchical clustering. For an example of this, let's return to the nodding branch. Petioles, veins, and teeth cluster to form a leaf. Leaves and twigs cluster to form a branch. Branches and trunk cluster to form a tree. Trees cluster to form a forest. This spatially clustered aspect of reality is being exploited currently in an approach to machine intelligence called "deep learning," where the successive stages in the hierarchy of the data are learned by successive hidden layers of simulated neurons in a neural net. Data is processed as it passes through the stack of layers, with successive layers learning to recognize successively larger clusters, representing these to the next layer as symbols simplified to aid further cluster recognition. This technology is based on discoveries about how the mammalian visual system operates. (For the seminal paper in the latter field, see Hubel and Wiesel, Journal of Physiology, 1959, 148[3], pp 574-591.)
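The claim that successive layers see successively larger clusters can be sketched with a little arithmetic. The Python function below is a toy calculation for a one-dimensional stack of simulated layers, not a model of any real network or of the visual cortex. The function name and the kernel and stride values are assumptions chosen for illustration. It tracks how many input points feed into a single unit at each layer.

```python
def receptive_field_sizes(num_layers, kernel=3, stride=2):
    """Return the input span seen by one unit at each successive layer.

    kernel: how many neighboring units of the previous layer each unit reads.
    stride: how far apart, in previous-layer units, neighboring units sample.
    Both defaults are illustrative assumptions.
    """
    sizes = []
    field = 1  # a raw input point sees only itself
    jump = 1   # spacing, in input points, between neighboring units
    for _ in range(num_layers):
        field += (kernel - 1) * jump  # each extra tap widens the span by one jump
        jump *= stride                # units in the next layer sit further apart
        sizes.append(field)
    return sizes


print(receptive_field_sizes(4))  # [3, 7, 15, 31]
```

With these settings the span roughly doubles at every layer, so a unit four layers up summarizes a patch about ten times wider than a unit in the first layer: the leaf-to-branch-to-tree progression in miniature.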
Visual information passes successively through visual areas Brodmann 17, 18, and 19, with receptive fields becoming progressively larger and more complex, as would be expected from a hierarchical process of cluster recognition. The latter two areas, 18 and 19, are classed as association cortex, of which humans have the greatest amount of any primate. However, cluster recognition requires the use of specialist neuron sub-types, each looking for a very particular stimulus. To cover even most of the possible cluster types, a large number of different specialists must be trained up. This does not seem like very good data compression from the standpoint of metabolic cost savings. Thus, the evolution of better ability with spatial information should require many more new neurons than in the case of temporal information.
My hypothesis here is that what is conferred by the comparatively large human cerebral cortex, especially the association cortices, is not general intelligence, but facility with using spatial information. We take it on and disgorge it like water-bombers. Think of a rock-climber sizing up a cliff face. Think of an architect, engineer, tool-and-die maker, or tradesperson reading a blueprint. Now look around you. Do we not have all these nice buildings to live and work in? Can any other animal claim as much? My hypothesis seems obvious when you look at it this way.
Mere possession of a well-developed sense of vision will not necessarily confer such ability with spatial information. The eyes of a predatory bird, for instance, could simply be gathering mainly temporal information modulated onto light, and used as a servo error for dynamically homing in on prey. To make a difference, the spatial information has to have someplace to go when it reaches the higher brain. Conversely, our sense of hearing is far from useless in providing spatial information. We possess an elaborate network of brain-stem auditory centers for accomplishing exactly this. Clearly, the spatial/temporal issue is largely dissociable from the issue of sensory modality.
You may argue that the uniquely human power of language suggests that our cortical advantage is used for processing temporal information, because speech is a spaceless phenomenon that unfolds only in time. However, the leading theory of linguistic meaning seems to be Wittgenstein's picture theory, which postulates that a statement shows its meaning by its logical structure, much as a picture shows what it depicts by the arrangement of its parts. On this view, the meaning carried by speech is structural, and therefore spatial in character, even though the sound itself arrives serially in time. Bottom line: language as currently understood is entirely consistent with my hypothesis that humans are specialized for processing spatial information.
Since fossil and comparative evidence suggests that our large brain is our most recently evolved attribute, it may well be evolving still. There may still be a huge existential premium on possession of improved spatial ability. For example, Napoleon's strategy for winning the decisive Battle of Austerlitz while badly outnumbered seems to have involved a lot of visualization. The cultural face of the zeitgeist may reflect this in shows and movies where the hero prevails as a result of superior use of spatial information (e.g., Star Wars, Back to the Future, and many Warner Bros. cartoons). Many if not most of our competitive games take place on fields, courts, or boards, suggesting that they test the spatial abilities of the contestants. By now, the enterprising reader will be thinking, "All I have to do is emphasize the spatial [whatever that means], and I'll be a winner! What a great take-home!"
Let me know how it goes, because all this is just theory.