[Paper Notes] A Spatial Cognitive Model that Integrates the Effects of Endogenous and Exogenous Information









  • exogenous and endogenous information

  • hippocampus

  • striatum

  • spatial cognition

  • brain-inspired computation

1 Introduction

Research shows that the hippocampus in mammals is the core area of spatial cognition.


The hippocampus and its adjacent regions contain several kinds of neurons, such as place cells, grid cells and head-direction cells.

  1. Place cells fire when a rat arrives at a specific place, and the range corresponding to the firing activity is called the place field (PF).
  2. Place cells in the hippocampus link the external real world to the brain regions, which is the neurophysiological basis of the cognitive map.
  3. Many studies have focused on computational models of hippocampal place cells, which can be divided into three types by the way they deal with exogenous and endogenous information.
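To make the notion of a place field in item 1 concrete, here is a minimal sketch of a single place cell. The Gaussian tuning curve, the function name `place_cell_rate`, and the `width` and `peak` parameters are illustrative assumptions, not something taken from the paper.

```python
import math

def place_cell_rate(pos, center, width=0.2, peak=1.0):
    """Firing rate of one hypothetical place cell (assumed Gaussian tuning)."""
    dx, dy = pos[0] - center[0], pos[1] - center[1]
    # The rate is highest at the field centre and falls off with distance.
    return peak * math.exp(-(dx * dx + dy * dy) / (2.0 * width ** 2))

center = (0.5, 0.5)                                # centre of the place field
inside = place_cell_rate((0.5, 0.5), center)       # agent at the field centre
outside = place_cell_rate((1.5, 1.5), center)      # agent far from the field
```

In this sketch the cell fires at its peak rate inside the field and is practically silent far away from it, which is the location specificity the notes refer to.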

Exogenous information is the visual, olfactory and auditory information that animals see, smell and hear when they move freely in the environment.

Endogenous information is the self-motion information from their proprioception and vestibular sensations.

Some researchers think place cells depend only on exogenous information; the most representative model of this view is the boundary vector cell (BVC) model.


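However, other researchers think place cells depend entirely on endogenous information and use neural networks to relate the activities of place cells:

  1. Rolls: competitive learning network; lateral inhibition in the hippocampus; information processing
  2. Yu: back-propagation (BP) network
  3. Zhou and Wu: radial basis function neural network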



The striatum, a brain region closely connected to the hippocampus, is mainly responsible for reward learning and action selection, and is also believed to be involved in spatial cognition.

The striatum receives location information from the hippocampus and reward information from dopamine cells in the ventral tegmental area. It then integrates these neuronal signals to participate in action regulation.

The computational models based on the striatum mainly relate it to reinforcement learning or action selection, but few studies discuss how it works together with the hippocampus in spatial cognition.



A. Aggarwal. The sensori-motor model of the hippocampal place cells. Neurocomputing, vol.185, pp.142–152, 2016. DOI: 10.1016/j.neucom.2015.12.044.



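Arleo and Gerstner: endogenous and exogenous (vision only) information plus reward learning, applied to robot navigation.

However, the exogenous information and the endogenous information did not work simultaneously, which means the robot used only one kind of information at a time.

Rodents: they encode the environment with both endogenous and exogenous information and make full use of both cues to perform tasks. Experimental result: using only exogenous information was one day slower than using both.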


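  1. Both endogenous and exogenous information are taken into account;
  2. Both the hippocampus and the striatum are included;
  3. Although the hippocampus and the striatum are fully connected in the model, only active place cells can project to the striatum.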

2 Model

2.1 Model architecture


The model consists of the information perception module (IPM), the ventral tegmental area (VTA), the hippocampus (HPC) and the striatum (STR).

In the paper, the exogenous information comprises vision and olfaction.


Meanwhile, VTA, which contains many dopamine cells, is added to form the reward circuit with the striatum.

Both HPC and VTA are connected to the STR module, which simulates the striatum and is responsible for reward learning and action selection.

The neurons in STR are divided into two groups depending on their function: one group is responsible for action selection and consists of multiple neurons, each of which represents an action, while the other group is responsible for reward learning and includes only one neuron for simplicity.


2.2 Information perception module


2.3 Hippocampus module

Here, a two-layer feed-forward neural network is used to simulate HPC and to process the endogenous and exogenous information.

However, the exact proportion of the two kinds of information in cognition remains unclear.

  • The first layer is responsible for transmitting information from the IPM.
  • The second layer consists of N place cells and is responsible for generating the cognitive map for the environment.

The connection weights between the layers are modified in a "Winner-Take-All" manner. Only the place cells with the maximum firing rate are selected to update the related weights.

The HPC module works like a competitive neural network. With such a mechanism, we can find the place cells that best match the external and internal information. The firing activities of these place cells form a cognitive map, which completes the primary task in spatial cognition.
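The Winner-Take-All update described above can be sketched as a single competitive-learning step. This is only an illustration under assumptions: the dot-product firing rate, the learning rate `lr`, the input vector `x` and the function name `wta_step` are all hypothetical, and the paper's actual firing-rate and update equations are not reproduced in these notes.

```python
import random

def wta_step(weights, x, lr=0.1):
    """One competitive-learning step: only the best-matching place cell learns."""
    # Assumption: a cell's firing rate is the dot product of its weights and the input.
    rates = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in weights]
    winner = max(range(len(weights)), key=lambda i: rates[i])
    # Move only the winner's weights toward the current input.
    weights[winner] = [w_j + lr * (x_j - w_j)
                       for w_j, x_j in zip(weights[winner], x)]
    return winner, rates

rng = random.Random(0)
n_cells, n_inputs = 5, 4
weights = [[rng.random() for _ in range(n_inputs)] for _ in range(n_cells)]
x = [1.0, 0.0, 0.5, 0.2]                  # made-up input vector from the IPM
before = [row[:] for row in weights]      # copy, to compare after the update
winner, rates = wta_step(weights, x)
```

After the call, only `weights[winner]` differs from `before`; every other place cell keeps its weights, which is what makes the network competitive.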

2.4 Striatum module




  1. If $Q(s_t, a_t)$ is 0, the agent selects a random action with probability $P$, or keeps its current direction with probability $1-P$.
  2. If $Q(s_t, a_t)$ is not 0, the agent selects an action with a greedy policy, i.e. it picks the action with the largest $Q$ value, or moves randomly with probability $\epsilon$.
  3. After an action has been selected, the related Q value and the connection weight between the active place cells and the action-selection neuron are updated.
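The first two steps above can be sketched as a small selection function. The reading that "Q is 0" means all Q values of the current state are zero, the default values of `p_explore` and `epsilon`, and the name `select_action` are assumptions made for illustration.

```python
import random

def select_action(q_values, current_action, p_explore=0.5, epsilon=0.1, rng=random):
    """Pick an action index following the rule in steps 1 and 2 (assumed reading)."""
    n = len(q_values)
    if all(q == 0 for q in q_values):
        # Nothing learned for this state yet: random action with probability P,
        # otherwise keep the current direction.
        if rng.random() < p_explore:
            return rng.randrange(n)
        return current_action
    # Something has been learned: random move with probability epsilon,
    # otherwise the greedy action with the largest Q value.
    if rng.random() < epsilon:
        return rng.randrange(n)
    return max(range(n), key=lambda a: q_values[a])

# With exploration switched off, the two branches are deterministic.
keep = select_action([0.0, 0.0, 0.0, 0.0], current_action=2, p_explore=0.0)
greedy = select_action([0.1, 0.7, 0.3, 0.0], current_action=2, epsilon=0.0)
```

Setting `p_explore=0.0` and `epsilon=0.0` in the usage lines only serves to show each branch in isolation; in a learning run both would be positive.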

Here, the Q values are stored in the network in a distributed way, in the form of the connection weights.

Meanwhile, each update involves a few active place cells. This means that multiple locations are covered by one update, which helps reduce the state space and speed up the algorithm.

In addition, the introduction of the filter decreases the number of place cells involved in the update, which further improves the efficiency.
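One way to picture Q values stored as connection weights is the sketch below. The readout as a sum over active place cells, the TD-style error, the equal split of the update, and the threshold used as the filter are all assumptions for illustration; the notes do not give the paper's exact update rule or filter definition.

```python
def q_value(weights, active_cells, action):
    """Q(s, a) read out as the summed weights from active place cells to one action neuron."""
    return sum(weights[c][action] for c in active_cells)

def update(weights, active_cells, action, reward, next_q, alpha=0.1, gamma=0.9):
    """TD-style update (assumed form), applied only to the active place cells."""
    td_error = reward + gamma * next_q - q_value(weights, active_cells, action)
    for c in active_cells:
        # Design choice of this sketch: split the correction equally among active cells.
        weights[c][action] += alpha * td_error / len(active_cells)
    return td_error

n_cells, n_actions = 6, 4
weights = [[0.0] * n_actions for _ in range(n_cells)]   # place cell -> action neuron
rates = [0.05, 0.9, 0.7, 0.1, 0.0, 0.6]                 # made-up firing rates
threshold = 0.5                                         # hypothetical filter
active = [i for i, r in enumerate(rates) if r > threshold]
td = update(weights, active, action=2, reward=1.0, next_q=0.0)
```

Because only the three cells that pass the filter are touched, one update changes three weights instead of six, and neighbouring positions that share those cells inherit part of the learned value.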

2.5 Working algorithm and flow chart

3 Experiment

The water maze experiment forces animals to swim and to learn how to find a survival platform. It is mainly used to test the learning and memory ability of experimental animals in spatial cognition and is the first choice for behavioral research, especially learning and memory research.



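  • Able to measure the distance from itself to the walls, although it does not necessarily receive all the distance data at once
  • Able to smell the food on the platform, which provides exogenous information
  • Able to compute its true position, which provides endogenous information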

3.1 Parameters settings


3.2 Primary experiments: Spatial cognition in a simple environment


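  1. The agent is able to explore and reach the platform, and can perceive its environment
  2. The model can learn by itself, without supervision
  3. The model is flexible and adaptive
  4. The model learns progressively: performance is usually poor at the start of the experiment but improves as learning proceeds, until the agent finally reaches the platform. This is consistent with the law of exercise proposed by Thorndike et al.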


3.3 Advanced experiments: Spatial cognition in a more complicated environment



3.4 Contrast experiments

1) Comparison with reinforcement learning algorithm

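The essence of reinforcement learning is that the agent keeps interacting with the environment and learns from its feedback. Here, we compare our model with the $SARSA(\lambda)$ algorithm in the water maze experiment.

After many trials, the parameters of $SARSA(\lambda)$ are set as follows: $\alpha=0.02$, $\eta=0.9$, $\lambda=1$, where $\alpha$ is the learning rate, $\eta$ is the discount rate and $\lambda$ is the decay factor.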
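For readers unfamiliar with the baseline, the following is a generic tabular $SARSA(\lambda)$ sketch with accumulating eligibility traces, using the $\alpha$, $\eta$ and $\lambda$ values listed above. It is not the authors' implementation: the 1-D corridor environment, `epsilon`, the episode count, the step cap `max_steps` and the function name are assumptions chosen to keep the example small.

```python
import random

def sarsa_lambda(n_states=6, episodes=200, alpha=0.02, eta=0.9, lam=1.0,
                 epsilon=0.1, max_steps=500, seed=0):
    """Tabular SARSA(lambda) on a toy corridor; the rightmost state is the goal."""
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]    # actions: 0 = left, 1 = right

    def policy(s):
        # Epsilon-greedy, with random tie-breaking.
        if rng.random() < epsilon or q[s][0] == q[s][1]:
            return rng.randrange(2)
        return 0 if q[s][0] > q[s][1] else 1

    for episode in range(episodes):
        e = [[0.0, 0.0] for _ in range(n_states)]    # eligibility traces
        s, a = 0, policy(0)
        for step in range(max_steps):
            s2 = max(0, s - 1) if a == 0 else s + 1
            done = s2 == goal
            r = 1.0 if done else 0.0
            a2 = policy(s2)
            # On-policy TD error: bootstrap from the action actually chosen next.
            delta = r + (0.0 if done else eta * q[s2][a2]) - q[s][a]
            e[s][a] += 1.0
            for i in range(n_states):
                for j in range(2):
                    q[i][j] += alpha * delta * e[i][j]
                    e[i][j] *= eta * lam             # traces decay every step
            if done:
                break
            s, a = s2, a2
    return q

q = sarsa_lambda()
```

In this toy setting `q[4][1]`, the value of stepping right from the state next to the goal, moves toward the goal reward as episodes accumulate; the other entries receive credit through the traces.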

Our model is faster than the $SARSA(\lambda)$ algorithm and finds a shorter path to the goal.

2) Comparison with brain-inspired model










4 Discussions


4.1 Effect of the endogenous information



In our opinion, the exogenous information plays the primary role in spatial cognition, while the endogenous information is a necessary complement to it. The complement improves the accuracy of spatial perception and thus also improves spatial cognition.

4.2 Number of active place cells

Irrespective of how it changes, the number of active place cells for the model with combined information is always larger than that of the model with only exogenous information. This indicates that the overall firing rate of place cells for the combined information is always higher than for exogenous-only information.

A higher firing rate can activate more place cells, and more place cells can provide more details about the location because of their location specificity. Thus, a more accurate cognitive map can be built, and the performance in spatial cognition is improved.

Generally speaking, the fusion of both kinds of information increases the firing rate of place cells so that more place cells are activated.

In fact, the design in our model, in which multiple place cells are activated when the agent is at a given position, is fully in accordance with O'Keefe's findings, which supports the biological plausibility of our model.

J. O′Keefe, J. Dostrovsky. The hippocampus as a spatial map. Preliminary evidence from unit activity in the freely moving rat. Brain Research, vol.34, no.1, pp.171–175, 1971. DOI: 10.1016/0006-8993(71)90358-1.

