Archive for the ‘Vision’ Category

1) Scale mismatch between the synapse-synapse level and the kind of description you want to acquire about the nervous system for a particular goal. He argues that the point at which the interesting neural computation works might be at the mesoscale. It might be enough to know the statistics of how nerve cells work at the synapse level if you want to predict behavior.

2) Structure-function relationships are elusive in the nervous system. It’s harder to understand the information that is being propagated through the nervous system because its purpose is so much more nebulous than that of a typical organ, like a kidney.

3) Computation-substrate relationships are elusive in general. The structure of an information processing machine doesn’t tell you about the processing it performs. For example, you can analyze in finest detail the microprocessor structure in a computer, and it will constrain the possible ways it can act, but it won’t tell you what actual operating system it is using.

Here is a link to the video of Movshon’s opening remarks. He also mentions the good-if-true point that the set of connections of C. elegans is known, but our understanding of its physiology hasn’t “been materially enhanced” by having that connectome.

The rest of the debate was entertaining but not all that vitriolic. Movshon and Seung do not appear to disagree on all that much.

I personally lean towards Seung’s side. This is not so much due to the specifics (many of which can be successfully quibbled with), but rather due to the reference class of genomics, a set of technologies and methods which have proven to be very fruitful and well worth the investment.


Certain visual inputs can be consistently interpreted in more than one way. One classic example of this is the young-woman/old-woman puzzle:

"Boring figure", via Wikipedia user Bryan Derksen

An important finding related to these types of illusions is that we don’t perceive both possibilities at once, but rather switch spontaneously between them.

Buesing et al.’s recent study formalized a network model of spiking neurons, equivalent to sampling from a probability distribution, and used it on a quantifiable model of such visual ambiguity, binocular rivalry.

This allowed them to show how spontaneous switches between perceptual states can be caused by a sampling process which produces successively correlated samples.

In particular, they constructed a computational model with 217 neurons, and assigned each neuron a tuning curve with a preferred orientation such that the full set of orientations covered the entire 180° interval.

They then ran a simulation of these neurons according to their rules for spiking and refraction, computed the joint probability distribution, projected it in 2-d, and drew the endpoints of the projections as dots, shown below. They took samples every millisecond for 20 seconds of biological time.

the "prior distribution"; each colored dot is a sampled network state; the relative orientation of each dot corresponds to the primary orientation of the perception at that time point; a dot's distance from the origin encodes the perception's "strength"; doi:10.1371/journal.pcbi.1002211.g004 part d

Note that there is a fairly homogeneous distribution across the whole orientation spectrum, indicating a lack of preference for one direction. You might think of the above as the resting state activity, as there was nothing to mimic external input to the system.

In order to add this input, the authors did another simulation in which they specified the states of a few of the neurons, “clamping” them to one value. In particular, they clamped two neurons with orientation preference ~45° to 1 (“firing”), two neurons with preference ~135° to 1, and four cells with preference ~90° to 0 (“not firing”).

Since the neurons set to firing are at opposite sides of the semicircle, this set-up mimics an ambiguous visual state. They then ran a simulation with the remaining 209 neurons as above, with the results shown below.

the "posterior distribution"; the black line shows the evolution of the network states z for 500 ms during a switch in perceptual state; doi:10.1371/journal.pcbi.1002211.g004 part e

As you can see, in this case the network samples preferentially from states that correspond to the clamped positions at either ~45° or ~135°. The black trace indicates that the network tends to remain in one high probability state for a while and then shift rapidly to the other.

As compared to the above “prior” distribution, this “posterior” distribution has greatly reduced variance.
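To get a feel for why correlated samples produce this kind of dwelling and switching, here is a minimal toy sketch. It is not the authors' spiking network: it swaps in a plain Metropolis sampler over a single orientation variable, and the bimodal density, the `kappa` sharpness, the step size, and all function names are my own assumptions for illustration.

```python
import math
import random


def target_density(theta_deg, kappa=4.0):
    """Unnormalized density over orientation with peaks near 45 and 135 degrees.

    Orientation repeats every 180 degrees, so the angle is doubled before
    taking the cosine. kappa is an assumed sharpness parameter.
    """
    def bump(center_deg):
        return math.exp(kappa * math.cos(math.radians(2.0 * (theta_deg - center_deg))))
    return bump(45.0) + bump(135.0)


def metropolis_chain(n_steps, step_deg=5.0, seed=0):
    """Draw successively correlated samples using small local proposals."""
    rng = random.Random(seed)
    theta = 45.0  # start in one of the two high-probability states
    samples = []
    for _ in range(n_steps):
        proposal = (theta + rng.gauss(0.0, step_deg)) % 180.0
        accept_prob = min(1.0, target_density(proposal) / target_density(theta))
        if rng.random() < accept_prob:
            theta = proposal
        samples.append(theta)
    return samples


def percept(theta_deg):
    """Label a sample by the nearer of the two peaks (45 or 135)."""
    return 45 if theta_deg < 90.0 else 135


def count_switches(samples):
    """Count how often consecutive samples carry different percept labels."""
    labels = [percept(t) for t in samples]
    return sum(1 for a, b in zip(labels, labels[1:]) if a != b)


if __name__ == "__main__":
    chain = metropolis_chain(20000)
    print("switches:", count_switches(chain), "out of", len(chain), "samples")
```

Because each proposal is a small step away from the current state, the chain has to diffuse through a low-probability region to get from one peak to the other, so switches are rare relative to the number of samples. Drawing independent samples from the same density would instead flip between the two labels about half the time.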

Although the ability of their network to explain perceptual bistability is fascinating, it is perhaps most interesting due to its broader implications for how cortical regions might be able to switch between cognitive states via sampling.


Buesing L, Bill J, Nessler B, Maass W (2011) Neural Dynamics as Sampling: A Model for Stochastic Computation in Recurrent Networks of Spiking Neurons. PLoS Comput Biol 7(11): e1002211. doi:10.1371/journal.pcbi.1002211


A top-down feedback system in vision

That is the focus of an interesting article called “Top-Down Predictions in the Cognitive Brain” from the journal Brain and Cognition. The authors summarize various models pertaining to how inputs from the retinas are transferred to high level areas in the brain.

They first draw attention to how complex and fast-moving our visual processes must be, by taking the example of driving. Drivers must be able to map two-dimensional images into three-dimensional models in seconds, while taking into account past experiences, watching their speed, and shooting off a quick text to a friend.

To account for this fast-moving process, there must be some sort of top-down filtering of information in addition to the obvious bottom-up stream from retina to thalamus to primary visual cortex. They presuppose this top-down filtering for a few reasons:

1) Given the multitude of shadows, angles, and changes in light patterns that appear in the everyday world, it would be improbable and highly taxing for a bottom-up system to compute which “edges” belong to which objects.

2) Resolution of ambiguous objects (something we know the human brain is capable of) is impossible without some sort of top-down feedback. The system must analyze the inputs based on its past experience with the images.

3) From computer vision research, we have learned how advantageous these top-down systems can be.

Most of these models rely on some sort of recursive, interactive loop between the top-down and the bottom-up systems. Once the higher-level systems have some amount of information, they send their predictions back “down,” where they are checked against the input data. Then, a separate process measures the amount of error between this prediction and the actual stimulus-generated activity. Depending on the level of error, the higher neural region will either create a new prediction (and continue this process until a prediction yields little error), or stop the cycle and conclude that the prediction must correspond to reality.
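The loop described above can be sketched in a few lines. This is a toy version under heavy assumptions, not any specific model from the paper: the "percept" is a single number, the hypothetical `generate` function stands in for the top-down mapping from a guess to predicted input, and `learning_rate` and `tolerance` are made-up parameters.

```python
def predict_and_correct(stimulus, generate, initial_guess,
                        learning_rate=0.25, tolerance=1e-3, max_iterations=100):
    """Refine a high-level guess until its prediction matches the input.

    Returns the final guess and the number of prediction cycles used.
    """
    guess = initial_guess
    for iteration in range(1, max_iterations + 1):
        prediction = generate(guess)      # top-down: what input should this guess produce?
        error = stimulus - prediction     # mismatch with the bottom-up input
        if abs(error) < tolerance:
            return guess, iteration       # little error: stop and accept the guess
        guess += learning_rate * error    # otherwise form a new prediction
    return guess, max_iterations


if __name__ == "__main__":
    # Assumed toy world: the sensory input is simply twice the hidden cause.
    guess, cycles = predict_and_correct(stimulus=6.0,
                                        generate=lambda cause: 2.0 * cause,
                                        initial_guess=0.0)
    print(guess, cycles)
```

The point of the sketch is only the control flow: predict, compare, and either revise or stop. A better initial guess (say, from past experience) means fewer cycles before the error is small, which is roughly the speed advantage the authors attribute to top-down predictions.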

The researchers discuss various specific models (with this general structure) of visual processing, and then they apply their models to explain priming, emotion, schizophrenia, and dyslexia. In their discussion they are a little too verbose about how advantageous cognitive predictions are (because, duh), but they also suggest some cool ways that this general visual model could be applied to other brain functions.


K. Kveraga, A.S. Ghuman, M. Bar, Top–down predictions in the cognitive brain, Brain and Cognition 65 (2007), pp. 145–168.


Shoval et al. recently tested the efficacy of TiN electrodes, fabricated on oxidized silicon substrates and coated with multi-wall carbon nanotubes, as a multielectrode array for the treatment of retinal degeneration. Vision loss due to retinal degeneration is common, but since the output retinal ganglion cells remain intact, it has been suggested that a retinal implant may be able to circumvent the photoreceptors and restore visual information transfer. The carbon nanotubes have a diameter of 82 nanometers with a narrow distribution. Due to their chemical inertness, resistance to mechanical damage, high conductivity, and ease of production, they are an attractive material to interface between electrodes and neural tissue.

Specifically, the researchers tested their design on rodent retinal slices and analyzed data from electrodes that had activity of 0.2 Hz or above. Compared to commercial TiN electrodes, the carbon nanotube interfaced electrodes had both 1) increased variability in the amplitude distribution (up to ~250 μV as opposed to ~100 μV for TiN electrodes), suggesting an improved electrical coupling between the neurons and the electrodes, and 2) increased signal-to-noise ratio, again suggesting an improved coupling between tissue and electrode, perhaps due to an increase in surface area contact between the cell’s processes (i.e., neurites) and the electrodes.

There are many other technical challenges to overcome in engineering retinal implants, such as achieving biocompatibility, stability, and a lack of current spread throughout the retina following focal stimulation, and fixing an insufficient ratio of electrodes to ganglion cells. But given that age-related macular degeneration affects 30% of individuals aged 75-85, this is a pressing issue that deserves attention.


Shoval A, et al. 2009 Carbon nanotube electrodes for effective interfacing with retinal tissue. Frontiers in Neuroengineering 2, 4. doi: 10.3389/neuro.16.004.2009.

Age-Related Macular Degeneration Info: http://www.agingeye.net/maculardegen/maculardegeninformation.php.


If humans see two objects in quick succession (less than 700 ms apart), they are only able to focus on one of them. This phenomenon, the attentional blink, is a specific type of the more general “information processing bottleneck” found in many cognitive systems.

Prior research has suggested that this deficiency in human processing is due to “later” mechanisms such as working memory and spatial selection. Williams et al. suggest in their 2008 paper that the problem may arise from earlier systems such as the primary visual cortex. To test this, they used a novel spatial arrangement, displaying the first object at the fovea and the second object in the periphery of the visual field.

Using fMRI, the researchers found a statistically significant correlation between BOLD activity in the primary visual cortex and the attentional blink activity (which they also found evidence for, but that is unsurprising).

They also found some intriguing data about an effect known as “lag-1 sparing,” in which the attentional blink deficit is smaller when the second object immediately follows the first than when the two are separated by 200 ms. The common explanation for this effect was that subjects can view both of the objects at once because they are in the same spatial location, but these results show that the objects need not be in the same location (recall that object one is at the fovea but object two is in the periphery of the visual field). One possible way to explain these data is that subjects could divide their attention between the objects, enabling both to pass through the same “attentional gate.” This is a testable hypothesis: if attention is divided, the details of either object would perhaps be perceived less clearly.

Their results throw a curveball into the literature, as the effect must now be considered as a feedback result from upstream processing instead of purely an attentional deficiency. An interesting paper with some avenues for further research.


Williams MA, Visser TA, Cunnington R, Mattingley JB. 2008 Attenuation of neural responses in primary visual cortex during the attentional blink. Journal of Neuroscience 28:9890-9894. doi:10.1523/JNEUROSCI.3057-08.


Weiss, Simoncelli, and Adelson’s 2002 paper Motion Illusions as Optimal Percepts sets out a useful model to explain a few of the inconsistencies in human vision. First, some background on a couple of these inconsistencies.

The aperture problem results when parallel lines move along a two dimensional slit. You can view an animation of it here. We perceive the motion as diagonal, but the lines could also be moving right or downward. In order to clear up this ambiguity, you need to be able to see the ends of the bars.

One application of the aperture problem that the researchers examine is in a rhombus. You should view it here, and compare the “thinnest” and “fattest” rhombi with high contrast and occluders on. The “thin” one should appear as diagonal movement, while the “fat” one should appear as horizontal movement.

Before this paper, the rules for estimating coherent pattern motion could not account for both of these effects at once. The “intersection of constraints” model is able to explain the horizontal perception in the “fat” rhombus, but not diagonal motion in the “thin” rhombus. The “vector average” model is able to explain the perception of diagonal motion in the “thin” rhombus but not the horizontal motion in the “fat” rhombus.

The three authors create a new model that can explain both of these phenomena at the same time. Their model also explains why perceived velocity increases when contrast increases. The paper becomes most interesting when they explain how they arrived at their model, which they break up into 5 steps:

1) They make an assumption of “intensity conservation,” meaning that points in the visual field move but do not alter in intensity over time.

2) They assume that intensity will not be conserved exactly and therefore that there will be some noise, for which they add a noise term n.

3) They assume that this noise is Gaussian (a bell-shaped curve) and that the velocity will remain constant in a small geometric space. They also make an assumption about intensity that I don’t fully follow (my error I’m sure) but that allows them to approximate intensity linearly in a short time frame.

4) They assume a prior favoring low speeds. This means that before any data come in, the most likely velocity is zero, and that, all else being equal, the slower of two candidate speeds is judged more likely.

5) They assume that the entire image moves based on a single translation velocity.

Through some substitution and calculus, they are able to write their equation using standard linear algebra. It is elegant, and predicts some empirical data well. This is an influential paper in vision modeling, and according to Scopus it has already been cited 114 times.
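Here is my rough sketch of what the five steps add up to, as I understand them: a Gaussian likelihood around the linearized constraint I_x·v_x + I_y·v_y + I_t = n, multiplied by a zero-centered Gaussian prior on velocity, gives a most-probable velocity that solves a small regularized linear system. The function names, the `noise_sd` and `prior_sd` parameters, and the way I fake a moving pattern are all my own assumptions, so treat this as an illustration rather than the paper's implementation.

```python
import math


def map_velocity(gradients, temporal_derivs, noise_sd=1.0, prior_sd=1.0):
    """Most-probable single translation (vx, vy) for a patch.

    gradients: list of (I_x, I_y) spatial derivatives, one per location.
    temporal_derivs: list of I_t values at the same locations.
    Solves (sum of g g^T + lam * Identity) v = -(sum of g * I_t),
    where lam = (noise_sd / prior_sd) ** 2 comes from the slow-speed prior.
    """
    lam = (noise_sd / prior_sd) ** 2
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (ix, iy), it in zip(gradients, temporal_derivs):
        a11 += ix * ix
        a12 += ix * iy
        a22 += iy * iy
        b1 += ix * it
        b2 += iy * it
    a11 += lam
    a22 += lam
    det = a11 * a22 - a12 * a12  # positive whenever lam > 0
    vx = -(a22 * b1 - a12 * b2) / det
    vy = -(-a12 * b1 + a11 * b2) / det
    return vx, vy


def moving_pattern(true_velocity, edge_normals_deg, contrast):
    """Fake noiseless measurements for edges translating at true_velocity.

    Each edge contributes a gradient of length `contrast` along its normal,
    and I_t follows from exact intensity conservation.
    """
    vx, vy = true_velocity
    gradients, temporal_derivs = [], []
    for phi in edge_normals_deg:
        ix = contrast * math.cos(math.radians(phi))
        iy = contrast * math.sin(math.radians(phi))
        gradients.append((ix, iy))
        temporal_derivs.append(-(ix * vx + iy * vy))
    return gradients, temporal_derivs


def speed(velocity):
    return math.hypot(velocity[0], velocity[1])


if __name__ == "__main__":
    for contrast in (0.1, 1.0, 10.0):
        grads, temps = moving_pattern((1.0, 0.0), [20.0, 70.0], contrast)
        print(contrast, speed(map_velocity(grads, temps)))
```

In this toy setup the estimated speed is always below the true speed and creeps up toward it as contrast rises, because weak gradients make the data less informative and the slow-speed prior pulls the estimate toward zero. That is the contrast effect mentioned above, at least qualitatively.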


Y. Weiss, E.P. Simoncelli, E.H. Adelson, Motion illusions as optimal percepts, Nature Neuroscience 5 (2002), pp. 598-604. doi: 10.1038/nn858.


At lower speeds, an object moving through a background with less contrast will appear slower. At around 8 Hz, perceived speed is not affected by contrast. And at higher speeds, reducing contrast will result in overestimation of speed. That is, the perceived speed of the object will increase.

That’s the thrust of the research by Thompson et al., who also present a model to explain their findings. In opposition to the two Bayesian models suggested by earlier researchers, theirs is a “very simple” ratio model, which includes only two parameters. We are assured that these parameters are physiologically plausible.

The actual vision effect is interesting and deserves further study. Why does contrast not affect perceived speed at 8 Hz? What’s so special about that frequency?


P. Thompson, K. Brooks, S. T. Hammett, Speed can go up as well as down at low contrast: Implications for models of motion perception, Vision Research 46 (2006), pp. 782-786.
