Journal of Integrative Neuroscience, Vol. 12, No. 4 (2013) 491–511 © Imperial College Press. DOI: 10.1142/S0219635213500301


A biologically inspired neural model for visual and proprioceptive integration including sensory training

Maryam Saidi, Farzad Towhidkhah*, Shahriar Gharibzadeh and Abdolaziz Azizi Lari
Department of Biomedical Engineering, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran, 15875-4413
*[email protected]

[Received 31 July 2013; Accepted 21 October 2013; Published 4 December 2013]

Humans perceive the surrounding world by integrating information from different sensory modalities. Earlier models of multisensory integration rely mainly on traditional Bayesian inference for a single cause (source) and causal Bayesian inference for two causes (for two senses, such as the visual and auditory systems). In this paper, a new recurrent neural model is presented for the integration of visual and proprioceptive information. This model is based on population coding and is able to mimic the multisensory integration performed by neural centers in the human brain. The simulation results agree with those obtained by causal Bayesian inference. The model can also simulate the sensory training process of visual and proprioceptive information in humans. The training process in multisensory integration has received little attention in the literature. The effect of proprioceptive training on multisensory perception was investigated through a set of experiments in our previous study. The current study evaluates the effect of training both modalities, i.e., visual and proprioceptive, and compares them with each other through a set of new experiments. In these experiments, the subject was asked to move his/her hand in a circle and estimate its position. The experiments were performed on eight subjects with proprioceptive training and eight subjects with visual training. The results of the experiments show three important points: (1) the visual learning rate is significantly higher than that of proprioception; (2) the means of visual and proprioceptive errors are decreased by training, but statistical analysis shows that this decrement is significant for proprioceptive error and non-significant for visual error; and (3) visual errors in the training phase, even at its beginning, are much smaller than the errors of the main test stage, because in the main test the subject has to focus on two senses. The results of the experiments in this paper are in agreement with the results of the neural model simulation.

Keywords: Multisensory integration; artificial neural networks; supervised training; sensory uncertainty; proprioception.

*Corresponding author.

1. Introduction

Humans perceive the surrounding world through different sensory modalities. The human brain combines information from multiple sensory modalities to form a coherent and unified percept. This has occupied researchers in different disciplines with different points of view, such as behavioral, physiological, and mathematical. Some mathematical models are based on behavioral experiments. For instance, in behavioral experiments, sensory uncertainties (i.e., input disturbances) have been used to evaluate the contribution of each sense to the multisensory integration output (Ernst & Banks, 2002; Heron et al., 2004; Kording & Wolpert, 2004). This uncertainty was imposed in two ways: (1) manipulating the variance of each sensory estimation (Ernst & Banks, 2002; Heron et al., 2004; Kording & Wolpert, 2004) and (2) exerting a bias on the unisensory estimations (Warren & Cleaves, 1971; Bresciani et al., 2006). An example of the bias uncertainty is a situation in which there is a shift between the real hand position and its visual feedback in a virtual reality environment (Van Beers et al., 1996, 1999; Sober & Sabes, 2003, 2005). In such cases, a mismatch between the senses helps us to assess the subject's reliance on each sense by considering the closeness of the integration result to the position of that sensory feedback. In this type of study, depending on the subject's awareness of the existence of the shift (bias), there are two possibilities that yield different results: (1) when the subject is not aware of the shift, the multisensory estimation may result from a linear combination of different sensory information, i.e., Bayesian inference (Van Beers et al., 1999; Alais & Burr, 2004; Kording & Wolpert, 2004; Beierholm et al., 2008); and (2) once the subject becomes aware of the existence of the shift, but not of its value, the multisensory estimation may result from a nonlinear combination of different sensory information, i.e., causal Bayesian inference (Kording et al., 2007; Kording & Tenenbaum, 2007; Beierholm et al., 2008; Hospedales & Vijayakumar, 2009). A reason for the nonlinear combination is that, in addition to the problem of noise and uncertainty in the unisensory estimations, the subject is not sure whether there is one cause or two distinct causes for all sensory cues.

Models such as the Bayesian one raise an important question: how does the brain solve the mathematical equations of the model (for example, calculating the variances of its sensory estimates)? Hence, models such as "population codes" that justify the neural mechanism of multimodal integration have been proposed (Deneve et al., 1999). In these models, modality information, such as stimulus position, is coded by a population of neurons (Deneve et al., 1999). The neurons in each population process the sensory information and benefit from recurrent connections. Deneve et al. (1999) proved that it is possible to simulate maximum likelihood estimation (MLE) of the position of a bimodal stimulus using a recurrent neural network with adjusted parameters. Recurrent feedbacks produce an attractor that provides a stable, noise-free multisensory estimation. Deneve & Pouget (2004) showed that such a model could be used to simulate Bayesian integration as well. Later, Ma et al. (2006) claimed that if the neural noise was assumed to be Poisson-like, a linear combination of the neural activities of
the population would automatically result in Bayesian integration. Ohshiro et al. (2011) proposed a normalization neural model to explain some fundamental response properties of multisensory neurons, including inverse effectiveness and the spatial principle. Inverse effectiveness states that multisensory enhancement is large for weak multimodal stimuli and decreases with stimulus intensity. The spatial/temporal principle of multisensory enhancement states that when stimuli are spatially congruent and temporally synchronous, robust multisensory enhancement occurs, whereas when spatial or temporal offsets are large, response suppression happens.

For situations in which the subject becomes aware of the existence of a shift between sensory stimulus modalities, but not of its value, causal Bayesian inference is used. No neural model has been suggested for this situation yet. The major purpose of this study is to propose a neural network model that accommodates the results of Bayesian and causal Bayesian inference. A further aim of this model is to simulate the effect of human attention and behavior under sensory training, as shown in our previous study (Saidi et al., 2012). In that study, the effect of proprioceptive modality training on hand location estimation was investigated through a set of experiments. The results of those experiments indicated that: (1) modality training increases the subject's reliance on the proprioceptive sensory information, i.e., the bias in sensory integration decreases; (2) increasing the discrepancy between the modalities leads to more uncertainty, i.e., variance in the estimation of hand position, but the variance of the final estimate is less than the variance of the proprioceptive estimate. To complete the investigation of the sensory training effect, a set of new experiments was designed in this paper. The experiments of the present study evaluated the effects of training both modalities, i.e., visual and proprioceptive, and compared them with each other.

2. Methods

In this section, the proposed model and the conducted experiments are explained. The model is inspired by neuroimaging findings and is based upon neural networks. Evaluating the effect of sensory training on multisensory integration was another aim of these experiments.

2.1. The model

According to neuroimaging studies, there are two hypotheses about neural interactions in multisensory integration that are not mutually contradictory: first, there are feedback connections from multisensory to unisensory areas, which transfer some information to unisensory neurons for processing (Macaluso et al., 2000); second, there are direct connections between unisensory areas (Cappe & Barone, 2005). Another important point in multisensory integration is sensory attention: Ciaramitaro et al. (2007) found through functional magnetic resonance imaging (fMRI) that attention to one sense decreases neural activity in the cortex of the other sense. The proposed model in this study considers all of these hypotheses.


2.1.1. Overall structure

The model is a three-layer neural network (Fig. 1). The first layer, i.e., the input layer, consists of two populations of neurons: one for visual and the other for proprioceptive information processing. These two populations are distinct, each corresponding to one sense. Each population in the first layer has 21 neurons with Gaussian tuning (activation) curves. The outputs of the first layer are fed into the second layer. The number of neurons in the second layer is the same as in the first layer, and they are sensory-specific as well. The output of each neuron in the second layer is a normalized linear combination of its inputs, such that the sum of the neurons' outputs equals one. The direct connections between the unisensory areas represent the modulation of the two sensory areas; this modulation is determined by the weights from the second to the third layer. The third layer is the multisensory layer, which also has 21 neurons. These neurons have linear activation functions; their outputs are normalized like those of the second layer and fed back to the input of the second layer. The output of each neuron in the third layer is multiplied by feedback coefficients and fed back to the corresponding neurons in both the visual and proprioceptive sections of the second layer. Figure 2 shows a more detailed schema of the model.

Each population in the first layer has 21 neurons with Gaussian tuning (activation) curves and some overlap. Each neuron has a spatial receptive field, with the highest response to a stimulus at the center of the field. Since the model is one-dimensional, the spatial receptive field is considered one-dimensional too. As the spatial coding uncertainty of visual neurons is less than that of proprioceptive ones, the overlap between the tuning curves of proprioceptive neurons is greater than that of visual neurons (Fig. 3). Therefore, the maximum response of visual neurons was assumed to be greater than that of proprioceptive ones.

Fig. 1. The overall structure of the three-layer model. The model includes two distinct sections; the right and left parts are dedicated to proprioceptive and visual processing, respectively. The outputs of the third-layer neurons are fed back to the inputs of the second layer. The bidirectional arrow between the two sections shows the modulation of the two sensory modalities.


Fig. 2. A more detailed schematic of the model in Fig. 1. The first layer of the model has 21 neurons with Gaussian tuning activation. There is a fully-connected structure between the first and second layers; the connection weights between these layers ($W_{p1to2}$ and $W_{v1to2}$) are based on a Gaussian profile. The neurons of the second layer have linear activation functions. The outputs of the second-layer neurons are multiplied by the connection weights between the second and third layers, which are also based on Gaussian curves ($W_{p2to3}$ and $W_{v2to3}$), and fed into the third layer. Direct connections between the two parts are based on the modification of the parameters (mean and standard deviation) of these Gaussian curves. The existence of a real shift between the two sensory sources and the announcement of a possible mismatch (the $I1$ input) are two causes of uncertainty regarding the number of sensory sources, which has an inhibitory effect on the multisensory neurons. The outputs of the third-layer neurons are normalized and fed back to the second layer. The feedback weights represent the effect of sensory attention. The parameters $v$ and $p$ are the proportions of attention on vision and proprioception, respectively (i.e., $p + v = 1$); $1 - v$ is the feedback weight from the multisensory to the visual section and $1 - p$ is that from the multisensory to the proprioceptive section in the second layer. Paying more attention to the visual sense increases the value of $v$; subsequently, $1 - v$ decreases and $1 - p$ increases. An analogous process occurs when more attention is paid to proprioception.

The Gaussian functions for the tuning curves of the visual ($TC_V$) and proprioceptive ($TC_P$) neurons presented in Fig. 3 are:

$$y_{vi} = TC_V(i) = \frac{1}{\sqrt{2\pi}\,S_V}\exp\left(-0.5\left(\frac{x_v - x_i}{S_V}\right)^2\right),$$
$$y_{pi} = TC_P(i) = \frac{1}{\sqrt{2\pi}\,S_P}\exp\left(-0.5\left(\frac{x_p - x_i}{S_P}\right)^2\right), \tag{1}$$

where $x_p$ and $x_v$ are the proprioceptive and visual stimulus positions, respectively, in one dimension of space. The subscript $i$, ranging from 1 to 21, indicates the neuron number, and $x_i$, ranging from $-10$ to $10$, is the center of the receptive field of the $i$th neuron (see Fig. 3). $S_V$ and $S_P$ (the standard deviations of the Gaussian functions) are set to 0.4 and 1.5 for visual and proprioceptive neurons, respectively. Note that, in order to display the difference between visual and proprioceptive neurons, the relative values of $S_V$ and $S_P$ are more important than their absolute values. Finally, $y_{vi}$ and $y_{pi}$ are the outputs of the $i$th neurons of the visual and proprioceptive sections in the first layer, respectively.
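For illustration, Eq. (1) can be sketched in Python/numpy as follows; the function and variable names are ours, not part of the original model specification, and the parameter values are those stated above.

```python
import numpy as np

CENTERS = np.linspace(-10, 10, 21)   # receptive-field centers x_i

def tuning_curves(x_stim, sd, centers=CENTERS):
    """Gaussian tuning-curve responses of a 21-neuron population (Eq. 1)."""
    return np.exp(-0.5 * ((x_stim - centers) / sd) ** 2) / (np.sqrt(2 * np.pi) * sd)

S_V, S_P = 0.4, 1.5                  # visual vs. proprioceptive tuning widths
y_v = tuning_curves(0.0, S_V)        # first-layer visual outputs, stimulus at 0
y_p = tuning_curves(0.0, S_P)        # first-layer proprioceptive outputs
```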


Fig. 3. The Gaussian tuning curves for the visual (grey) and proprioceptive (black) neurons of the first layer. The bold curves represent the tuning curves of the neurons most sensitive to the presence of a stimulus at the position indicated by 0.

The two sensory-specific sections of the model have a fully-connected structure between their first and second layers. The connection weight between two corresponding neurons in the first and second layers is the largest; the other weights change with the distance between the neurons according to a Gaussian profile. Thus $W_{p1to2}$ and $W_{v1to2}$, the connection weights between the first and second layers in the proprioceptive and visual sections respectively, are as follows:

$$W_{p1to2}(i,j) = \exp\left(-\left(\frac{x_i - \tilde{x}_j}{C_{prop}\,S_{1to2}}\right)^2\right), \quad S_{1to2} = 0.9,$$
$$W_{v1to2}(i,j) = \exp\left(-\left(\frac{x_i - \tilde{x}_j}{C_{vis}\,S_{1to2}}\right)^2\right), \quad S_{1to2} = 0.9. \tag{2}$$

Equation (2) gives the connection weight between two neurons: one placed at position $x_i$ in the first layer and the other placed at position $\tilde{x}_j$ in the second layer. The subscript $1to2$ indicates that the variables relate to the weights between the first and second layers (e.g., $p1to2$ and $v1to2$ are associated with the proprioceptive and visual sections, respectively). $S_{1to2}$ is the standard deviation of the Gaussian functions. The coefficients $C_{vis}$ and $C_{prop}$ are modified by sensory training; this modification is explained in the next section. As the neurons in the second layer have linear activation functions, the output of the $j$th neuron in this layer is:

$$z_{pj} = \sum_{i=1}^{21} y_{pi}\,W_{p1to2}(i,j) + O_j(1-p),$$
$$z_{vj} = \sum_{i=1}^{21} y_{vi}\,W_{v1to2}(i,j) + O_j(1-v), \tag{3}$$
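Continuing the sketch above, a possible implementation of Eqs. (2) and (3); the initial values ($C_{vis} = C_{prop} = 1$, attention proportions summing to one, zero feedback at the start) follow the text, but the code organization is our own.

```python
S_1TO2 = 0.9

def layer12_weights(c_mod, s=S_1TO2, centers=CENTERS):
    """Gaussian connection-weight matrix between layers 1 and 2 (Eq. 2)."""
    d = centers[:, None] - centers[None, :]     # x_i - x~_j for all pairs
    return np.exp(-(d / (c_mod * s)) ** 2)

C_vis, C_prop = 1.0, 1.0                        # modified later by training
W_v12, W_p12 = layer12_weights(C_vis), layer12_weights(C_prop)

p, v = 0.5, 0.5                                 # attention proportions, p + v = 1
O = np.zeros(21)                                # third-layer feedback, zero at t = 0

z_p = y_p @ W_p12 + O * (1 - p)                 # Eq. (3), proprioceptive section
z_v = y_v @ W_v12 + O * (1 - v)                 # Eq. (3), visual section
z_p, z_v = z_p / z_p.sum(), z_v / z_v.sum()     # second-layer outputs sum to one
```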


where $y_{vi}$ and $y_{pi}$ are the outputs of the $i$th neurons of the visual and proprioceptive sections in the first layer, respectively; $W_{p1to2}$ and $W_{v1to2}$ are the connection weights between the first and second layers defined in Eq. (2); $O_j$ is the output of the $j$th neuron in the third layer that is fed back to the second layer; $1-p$ and $1-v$ are the feedback coefficients (see Fig. 2); and finally $z_{vj}$ and $z_{pj}$ are the outputs of the $j$th neurons of the visual and proprioceptive sections in the second layer, respectively.

According to the outputs of the second layer, the location of the winner neuron in each sensory section determines the estimated location of the stimulus for that section. The winner neuron of each section is the one with the maximum output among the neurons of that section. The shift between the two sensory areas is defined as the difference between the stimulus locations estimated by the two sensory sections.

As Fig. 2 shows, the connection weights between the second and the third layers are also based on Gaussian curves. The parameters (mean and standard deviation) of these Gaussian curves are specified according to the shift between the two sensory areas. Therefore, the modulation of the two sensory areas is determined by the weights from the second to the third layer. Equation (4) shows the standard deviation of these Gaussian curves:

$$S_{2to3} = \begin{cases} 10, & t = 0, \\ 2 + 0.1\,\text{shift}, & t > 0, \end{cases} \tag{4}$$

where shift is the difference between the stimulus locations estimated by the two sensory sections. Before any integration takes place, there is no information in the output layer to feed back to the second layer, and therefore no interference between the senses. Hence, at the start time ($t = 0$) the standard deviation in Eq. (4) is set to a relatively large number (i.e., 10) to transfer information from the second layer to the third one without any change. After one iteration, integration occurs. In this phase, increasing the shift between the two sensory sections increases the value of $S_{2to3}$. In fact, the lower the shift, the more precise the estimation; conversely, the larger the shift, the smaller the effect of visual neurons on the coding of proprioceptive information, which leads to more deviation of the multisensory estimation. Moreover, the means of these Gaussian curves are modified by the shift between the two senses. This modification is done by attracting the peak of the neural activity of one sense toward the other. In other words, when there is a shift between the two senses, each modality adapts to incorporate this shift. If a true or target value is known, then each input should be adapted in the direction of this target (Ghahramani, 1995). Yet there is no explicit target in our model; thus, unsupervised learning occurs here. Therefore, the adaptation that relates the amount of the shift to the displacement of the peak of the Gaussian curves is defined by

$$AD = \frac{\text{shift}}{\text{shift} + 0.5}. \tag{5}$$

Equation (5) is the same as the weight modification equation in adaptive resonance theory networks (Grossberg, 2012).
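In code, Eqs. (4) and (5) might look like the following sketch, assuming (as in the simulations of Sec. 3) a non-negative shift:

```python
def s_2to3(shift, t):
    """Standard deviation of the layer-2-to-3 Gaussian weights (Eq. 4)."""
    return 10.0 if t == 0 else 2.0 + 0.1 * shift

def adaptation(shift):
    """Peak displacement AD of the Gaussian weight curves (Eq. 5)."""
    return shift / (shift + 0.5)
```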


In this way, the Gaussian curves that form the connection weights between the second and the third layers are:

$$W_{p2to3}(j,k) = \exp\left(-\left(\frac{x_k - (x_{pwin} + AD)}{S_{2to3}}\right)^2\right),$$
$$W_{v2to3}(j,k) = \exp\left(-\left(\frac{x_k - (x_{vwin} + AD)}{S_{2to3}}\right)^2\right), \tag{6}$$

where $x_{pwin}$ and $x_{vwin}$ are the positions of the winner neurons (those with maximum output) among the neurons of the proprioceptive and visual sections in the second layer, respectively; $AD$ is defined in Eq. (5); and $x_k$ is the position of the $k$th neuron in the third layer. The output of the third layer is calculated as:

$$O_k = \sum_{j=1}^{21} z_{pj}\,W_{p2to3}(j,k) + \sum_{j=1}^{21} z_{vj}\,W_{v2to3}(j,k) - O_{UMS}, \tag{7}$$

where $O_{UMS}$ is the effect of the uncertainty of multisensory sources (see Sec. 2.1.2).
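A sketch of Eqs. (6) and (7), continuing the code above. Note that, as written in Eq. (6), the weights depend only on the target index $k$; under that reading, the sums over $j$ in Eq. (7) factor into the totals of the normalized second-layer outputs, which is the assumption made here.

```python
def layer23_weights(x_win, ad, s23, centers=CENTERS):
    """Gaussian weights from a layer-2 section to the third layer (Eq. 6)."""
    return np.exp(-((centers - (x_win + ad)) / s23) ** 2)

x_pwin = CENTERS[np.argmax(z_p)]        # winner positions in the second layer
x_vwin = CENTERS[np.argmax(z_v)]
shift = abs(x_vwin - x_pwin)

ad, s23 = adaptation(shift), s_2to3(shift, t=1)
W_p23 = layer23_weights(x_pwin, ad, s23)
W_v23 = layer23_weights(x_vwin, ad, s23)

O_UMS = 0.0                             # uncertainty term, see Eq. (8)
O = z_p.sum() * W_p23 + z_v.sum() * W_v23 - O_UMS   # Eq. (7)
O = O / O.sum()                         # third-layer outputs are normalized
```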

The feedback weights determine how much information is transferred from the multisensory to the unisensory areas; therefore, they are based upon sensory attention. Two parameters, $v$ and $p$, are considered: the proportions of attention paid to vision and proprioception, respectively. Their values lie between 0 and 1 and $p + v = 1$, because it is assumed that there are no senses other than vision and proprioception. Also, $1 - v$ (see Fig. 2) is the feedback weight from the multisensory to the visual section and $1 - p$ is that from the multisensory to the proprioceptive section in the second layer. In this way, paying more attention to a modality reduces the feedback weight from the multisensory area to that modality's processing area; thus, the effect of the other sense on the integration fades out. For example, paying more attention to the visual sense increases the value of $v$; then $1 - v$ decreases and $1 - p$ increases. Therefore, the proprioceptive section is affected by the multisensory output more than the visual section, and the visual section plays the more important role in the multisensory estimation. In the extreme case where all attention is devoted to the visual sense, $1 - v = 0$ and $1 - p = 1$: the proprioceptive area is completely driven by the multisensory area while the visual area is independent of it, so the visual information ultimately dominates. Moreover, the feedback mechanism makes the network recurrent and helps it reach a stable estimation as the result of multisensory integration. The iteration stops when the output of the third (multisensory) layer becomes stable.

2.1.2. Uncertainty of multisensory sources

There are two causes of uncertainty regarding the number of sensory sources: (1) the announcement of a possible mismatch ($I1$ in Fig. 2); for example, if the examiner informs the subject of possible differences (i.e., mismatches) between the data received from different sensory sources, even if this is not true, the subject's doubt about whether a single cause underlies all sensory cues increases; (2) the existence of a real shift between the two sensory sources. If this shift is small, the subject will not be aware of it; but as the shift increases, the subject's confidence in the existence of two causes increases.
To combine these two causes, a neuron with a sigmoid activation function is considered:

$$O_{UMS} = f(\text{shift} + I1), \quad f(Y) = \frac{1}{1 + e^{-Y}}, \quad I1 = \begin{cases} 0, & \text{inactivated}, \\ 1, & \text{activated}. \end{cases} \tag{8}$$

The value of $I1$ (0 or 1) is added to the value of the shift and used as the input of the function $f$. $O_{UMS}$ is the output of the neuron related to the uncertainty of multisensory sources. As seen in Eq. (7), the output of this neuron has an inhibitory effect on the multisensory neurons. Indeed, once the subject is sure that there are two sensory sources, his/her brain will likely process the information of the two sensory modalities independently; this inhibits the multisensory neurons (see Fig. 2).
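A one-line sketch of Eq. (8), continuing the code above:

```python
def uncertainty_of_sources(shift, i1):
    """Sigmoid neuron combining the real shift and the mismatch
    announcement I1 (Eq. 8); its output inhibits the multisensory layer."""
    return 1.0 / (1.0 + np.exp(-(shift + i1)))
```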

2.1.3. The effect of sensory training

Training a modality decreases the uncertainty (variance and bias) in the estimation of that sense. In order to expose this effect in the model, a training block is considered. Figure 4 shows this block and its effect on the multisensory model. In the training phase, this block compares the output uncertainty with the maximum acceptable error (which depends on the task); if the output uncertainty exceeds the maximum acceptable error, the learning process occurs.

Fig. 4. Sensory training block. This block compares the output uncertainty with the maximum acceptable error. Based on this comparison, a learning process occurs that leads to modification of the connection weights ($W_{p1to2}$ and $W_{v1to2}$) in the model.

Considering this process and an amnesia (forgetting) process, the coefficients $C_{prop}$ and $C_{vis}$ in Eq. (2) are determined such that:

i. If visual training occurs, then $C_{vis}(2) = C_{vis}(1) - \eta\,C_{vis}(1)$ and $C_{prop}(2) = C_{prop}(1)$.
ii. If proprioceptive training occurs, then $C_{prop}(2) = C_{prop}(1) - \eta\,C_{prop}(1)$ and $C_{vis}(2) = C_{vis}(1)$.
iii. If amnesia occurs, then $C_{prop}(2) = C_{prop}(1) + \eta\,C_{prop}(1)$ and $C_{vis}(2) = C_{vis}(1) + \eta\,C_{vis}(1)$,

where $\eta$ is the learning rate. If $\eta$ is very small, training is very slow; for a large value of $\eta$, it is impossible to reach a solution. Therefore, $\eta$ is set to 0.2, a small constant of proportionality consistent with the value assumed for the standard deviation in Eq. (2). These coefficients change the variance of the Gaussian curves of the connection weights between the first and second layers, such that learning decreases the variance of the curve and increases the precision; amnesia acts inversely. The training process is complete when the output uncertainty is less than the maximum acceptable error. For example, if in a task the maximum acceptable error is 1 cm, then the output uncertainty must fall below 1 cm for training to be considered complete. It should be mentioned that the aim of training in this section is sensory training, not network training. Sensory training occurs in the training phase, in which only one sense is engaged and there is no integration; this training is a type of supervised training. The proposed model of multisensory integration, in contrast, is a recurrent network that trains in an unsupervised way.
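A sketch of the coefficient updates (i)-(iii), with the learning rate 0.2 as in the text; the loop condition is schematic, since the paper does not specify how the output uncertainty is computed.

```python
ETA = 0.2   # learning rate

def train_step(c, eta=ETA):
    """One sensory-training update: shrinks the coefficient, narrowing the
    layer-1-to-2 Gaussian weights and increasing precision (rules i-ii)."""
    return c - eta * c

def amnesia_step(c, eta=ETA):
    """Forgetting acts inversely, widening the weight curves (rule iii)."""
    return c + eta * c

# Schematic training loop for the proprioceptive sense:
# while output_uncertainty > max_acceptable_error:
#     C_prop = train_step(C_prop)
```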

Fig. 5. The proposed model in detail. The factors that determine the amount of attention paid to each sense are: (1) the uncertainty of the senses (the comparator block compares the uncertainties of the two senses), (2) sensory training, and (3) multi-target estimation (the $I2$ input). These factors are the inputs of the AT neuron, which has a sigmoidal activation function; the output of this neuron determines the feedback weights.


2.1.4. Sensory attention

As described earlier, the feedback weights represent the effect of sensory attention. The values of these weights are adjusted based on three factors, which are added to the proposed model in Fig. 5: (1) the uncertainty of the senses: the sense with lower uncertainty attracts more attention; the comparator block in Fig. 5 compares the uncertainties of the two senses; (2) sensory training: if one sense is trained, the attention paid to that sense increases; (3) multi-target estimation: for instance, if the examiner asks the subject to report the locations estimated by both of his/her senses, the subject will not be able to focus his/her attention on only one sense. If multi-target estimation is requested, the $I2$ input in Fig. 5 is activated ($I2 = 1$). These three factors are the inputs of the AT neuron (see Fig. 5), which has a sigmoidal activation function. The output of this neuron determines the feedback weights as follows:

$$O_{AT} = f(O_{comparator} + O_{training\,block} + I2), \quad f(Y) = \frac{1}{1 + e^{-Y}},$$
$$I2 = \begin{cases} 0, & \text{inactivated}, \\ 1, & \text{activated}, \end{cases}$$
$$O_{training\,block} = \begin{cases} -1, & \text{visual training}, \\ 0, & \text{no training}, \\ +1, & \text{proprioception training}, \end{cases}$$
$$O_{comparator} = \begin{cases} -1, & \text{visual uncertainty} < \text{proprioception uncertainty}, \\ +1, & \text{visual uncertainty} > \text{proprioception uncertainty}, \end{cases} \tag{9}$$

where $O_{AT}$, $O_{comparator}$ and $O_{training\,block}$ are the outputs of the AT neuron, the uncertainty comparator, and the training block, respectively. The feedback weights are determined from $O_{AT}$ as

$$p = O_{AT} \;\xrightarrow{\,p+v=1\,}\; 1 - p = 1 - O_{AT} \quad \text{and} \quad 1 - v = O_{AT}, \tag{10}$$

where $1-p$ and $1-v$ are the feedback coefficients, and the parameters $v$ and $p$ are the proportions of attention on vision and proprioception, respectively. According to the definitions in Eq. (9), $O_{AT}$ is directly related to the proportion of attention paid to proprioception ($p$).
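The attention mechanism of Eqs. (9) and (10) can be sketched as follows, with the inputs encoded exactly as in Eq. (9); the function name is ours.

```python
def feedback_weights(o_comparator, o_training, i2):
    """AT neuron (Eqs. 9-10). o_comparator: -1 if vision is the less
    uncertain sense, +1 otherwise; o_training: -1 visual, 0 none,
    +1 proprioceptive training; i2: 1 if both estimates must be reported."""
    o_at = 1.0 / (1.0 + np.exp(-(o_comparator + o_training + i2)))
    p = o_at                           # attention proportion on proprioception
    fb_prop, fb_vis = 1.0 - p, o_at    # feedback weights 1-p and 1-v
    return fb_prop, fb_vis
```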

Figure 5 shows the proposed model in detail. The model is one-dimensional and could easily be extended to three dimensions.

2.2. Experiments

The aim of the experiments was to evaluate training of both sensory modalities (visual and proprioceptive); therefore, the following requirements had to be considered in designing the experiments:

1. In the visual (proprioceptive) training phase, all other senses, specifically proprioception (vision), had to be canceled.
2. Motor learning must not occur as a result of the sensory training.
3. The error feedback had to be presented to the subject appropriately.
4. The tasks in the visual and proprioceptive training phases had to be analogous.

According to these requirements, the setup and stages of the experiments were designed as follows.


2.2.1. Participants

Sixteen right-handed subjects, 18–28 years old, participated in the experiments. All subjects completed an informed consent form prior to the beginning of the experiments. All had normal or corrected-to-normal vision and participated voluntarily in this study. They were divided into two groups: group 1 included eight participants whose proprioception was trained; group 2 included eight participants whose visual sense was trained in the first stage of the experiments. All participants were aware of the purpose of the study.


2.2.2. Setup

The setup consisted of two spaces: a visual screen and a proprioceptive workspace. Participants sat in front of a table on which the setup was mounted. The visual screen was located above the proprioceptive workspace. The two spaces were aligned exactly, so that a one-to-one mapping existed between them. The chair was adjusted for each subject before starting the experiments. The proprioceptive workspace consisted of a movable handle and a digitizer tablet with its pen (Wacom Intuos 3), which recorded the X and Y pen-tip positions at a sampling frequency of 206 Hz. The tablet was connected to a computer via a USB port. The pen was mounted right above the handle. Moving the pen on the tablet resulted in cursor movement on the visual screen. The position data of the pen, which was the same as the position data of the cursor, was transferred to a vector in the MATLAB workspace. A rectangular region on the screen, the same size as the tablet, was chosen exactly above the tablet. Therefore, there was a linearly scaled mapping and a one-to-one correspondence between each point on the tablet and each pixel in the rectangular region of the screen; the mapping scale between the screen and the workspace was thus assumed to be one. The movable handle allowed only circular movement. Subjects were asked to move the handle with the tip of the index finger of their right hand; therefore, the subjects actively moved their arms and the proprioceptive information was acquired actively. Changing the length of the handle produced circles with different diameters (from 2 cm to 20 cm). All trials had the same start point. In our setup, the visual screen was a 19-inch Samsung monitor with a 60 Hz refresh rate, laid horizontally on the experiment table. The subjects were not able to see their hands, but they could see the screen while moving the handle. Using a MATLAB program, we created a figure that covered almost the entire screen. In each trial, depending on the experiment type, we were able to present or remove the visual feedback of the hand position. The visual feedback was presented as a moving marker, a circle with a diameter of 1 cm (approximately equal to the width of a fingertip) on the screen.

2.2.3. Stages of the experiments

The experiments included two stages: the "training stage" and the "main test". In the training stage, one of the two senses (vision or proprioception) was trained. In the main test, both senses were available and the subjects had to integrate them.


Participants were divided into two groups. The proprioceptive sense of group 1 was trained: they had no visual feedback and were asked to rotate the handle one turn. Once the circular movement was completed, a figure containing circular regions (Fig. 6) was shown to the subjects, and they were asked to specify the region corresponding to the performed circular movement. After the participant's response, the correct answer was told to him/her; in this way, the subject received error feedback, which helped train his/her proprioceptive sense. Then the handle was immediately returned to the start position and the next trial was initiated. The training was stopped after five consecutive correct answers. Analogously to group 1, the visual sense of group 2 was trained: they were asked to watch a moving marker on the visual screen while their hands rested. Once the marker's circular movement was completed, similarly to the proprioceptive training, the subjects had to specify the region in which the marker's circular movement occurred and were then told the correct answer. In this way their visual sense was trained. As Fig. 6 shows, the interval between circular regions is 2 cm, which is comparable to the size of the fingertip that provides the proprioceptive information.

In the main test, two independent sources of information were combined: proprioceptive information and visual information. The subjects were informed that the visual screen and the proprioceptive workspace were aligned exactly. Participants moved the handle while seeing the movement of the visual marker. The two modalities were matched, but the subjects were told that the two sensory modalities might be mismatched; in other words, they expected that their hands might draw a circle different from that of the visual marker. They were then asked to report separately the region numbers of the circles corresponding to the visual and proprioceptive senses, presented simultaneously. In this way, the participant had to pay attention to both sensory inputs. This stage included two repetitions for each circle in each region, i.e., 14 trials in total.

Fig. 6. The visual screen is divided into circular regions so that the subject can specify the region corresponding to the circular movement.


Participants made a two-dimensional movement, but they were not informed about the movement's geometry; they were asked to report the region numbers of the circles. Indeed, participants had to estimate the radius of each circle (see Fig. 6). Therefore, the estimation of the circle radius, considered as the target-position estimation, was requested in one dimension and can be substituted for $x_v$ or $x_p$ in the equations of the proposed model.

3. Results


3.1. Results of experiments

The recorded data were used to identify how sensory training and uncertainty about sensory sources affect multisensory perception. Therefore, the data analysis was performed in three categories: (1) comparing the training rates of the visual and proprioceptive senses; (2) comparing the effects of visual and proprioceptive training on the results of the main test stage (multisensory estimation); and (3) determining the effect of uncertainty about sensory sources in the main test on the unimodal estimations. In order to analyze the experimental data, the error was defined as the difference between the real and estimated circle regions, calculated by subtracting the numbers of the real and estimated circles. To compare the training rates of the visual and proprioceptive senses, the mean errors of the two groups are plotted against trial number in Fig. 7. The learning rate of the visual sense is significantly higher than that of proprioception, for two reasons: (1) in this task, the uncertainty of visual neurons is less than that of proprioceptive ones, meaning that visual training has, in effect, already occurred in the subjects' daily experience; (2) due to their special characteristics, visual neurons are able to learn comparatively more than proprioceptive neurons.

Figure 8 shows the visual and proprioceptive errors (sum of errors over all trials) in the main stage for both groups. To investigate the effect of each sensory training modality, a comparison between the two groups is used. In other words, to analyze

Fig. 7. Comparison of the mean error curves in the training phase of the experiments for the two groups (visually and proprioceptively trained).


Fig. 8. Means and standard deviations of errors: visual error (grey bar) and proprioceptive error (black bar) for the two groups (right: visually trained group; left: proprioceptively trained group).

the effect of proprioceptive training, the proprioceptive training group is compared with the non-proprioceptive training group, i.e., the visually trained participants. It can be observed in Fig. 8 that the mean visual and proprioceptive errors in the group with the corresponding sensory training are less than in the other group. A t-test showed a significant difference in proprioceptive error between the two groups (p = 0.35), but the difference in visual error is not significant (p = 0.78). This result suggests that the training was effective in decreasing the proprioceptive error but not the visual error.

The results show that the visual errors in the training phase, even at its beginning, are much smaller than those in the main test stage. In the visual training phase, the subject focuses only on the visual sense, while in the main test stage he/she has to focus on both senses equally. It must be noted that in optimal integration, subjects normally estimate a target using all their sensory modalities. Under this condition, they would also be sure that all modalities are matched, and their brain would give appropriate and optimal weights to each modality for a unified perception. Here, however, the subjects are not sure that the modalities match and are asked to report the estimates of both senses; therefore, their brains have to give the same, non-optimal weight to each sense.

3.2. Computer simulations

To evaluate the performance of the proposed model, we performed computer simulations corresponding to our experimental tests mentioned above, as well as the
reported experiments of other researchers (Van Beers et al., 1996, 1999; Kording et al., 2007; Hospedales & Vijayakumar, 2009; Saidi et al., 2012).

3.2.1. Experimental conditions to be simulated


We want to simulate five experimental conditions with our model:

1. In the first category, subjects were asked to estimate a target location with two sensory modalities. They were not told about a possible mismatch between the two modalities. No sensory training was performed. The amount of shift varied from trial to trial. For simplicity, the proprioceptive stimulus was placed at position 0 and the visual stimulus was placed at positions 0, 1, 2, or 3, depending on the amount of the shift. This experiment was carried out by Hospedales & Vijayakumar (2009).
2. The second category was similar to the first, except that the subject was informed about a possible discrepancy between the sources. This experiment was carried out by Kording et al. (2007).
3. The third category was similar to the second, but the proprioceptive sense of the subject was trained (Saidi et al., 2012).
4. In this category, the subject was asked to report his/her estimates from both senses. He/she was informed about a possible discrepancy between the sources. The proprioceptive sense of the subject was trained. In this condition, to investigate the effect of the shift between the two sensory modalities, the shift was increased as in the first category.
5. The fifth category was similar to the fourth, but the visual sense of the subject was trained instead of the proprioceptive sense.

3.2.2. Model tuning for different experimental conditions

In order to simulate the different experimental conditions mentioned above, the model parameters, including the $I1$ and $I2$ inputs, the output of the training block, the feedback weights, and $C_{vis}$ and $C_{prop}$, were tuned as follows. Note that since normalization between zero and one is applied at several levels in the model, an input was set to one when activated and to zero when inactivated. A compact summary of all five settings is given in the sketch after this list.

(1) For the first category simulation, the $I1$ and $I2$ inputs of the model were inactivated (set to zero). The output of the training block was also zero (i.e., no training was applied). Therefore, the feedback weights of the model were determined only by comparing the uncertainties of the unimodal estimates. In the normal condition, the uncertainty of proprioception is greater than that of the visual modality, i.e., the feedback weights in Fig. 5 satisfy $v > p$, or $1 - p > 1 - v$. Since no sensory training occurred in this condition, $C_{vis}$ and $C_{prop}$ (the coefficients modified by sensory training) in Eq. (2) were set to their initial values; the initial value of these coefficients is one.


(2) The tuning of the model in the second category simulation was similar to that of the first category, but, since in this condition the subject was informed about a possible discrepancy between the sources, the $I1$ input (announcement of possible mismatch) in the model (Fig. 5) was activated (set to one).

(3) In the third category simulation, it was first necessary to complete the training process of proprioception in the model. Thus, $C_{prop}$ in Eq. (2) was modified (decreased), and therefore the feedback weight $1 - p$ decreased. The other tuning parameters ($I1$ and $I2$ inputs, feedback weights, and $C_{vis}$) were set as in the second category.

(4) For the fourth category simulation, since in the fourth experimental condition the subject was asked to report his/her estimates from both senses, the $I2$ input (multi-target estimation) was activated. Since the subject was also informed about a possible discrepancy between the sources, the $I1$ input (announcement of possible mismatch) was activated (set to one). Therefore, the input of the multisensory neurons and the tuned feedback weights changed. The training process on proprioception was set up in the model as in the third category.

(5) The model tuning in the fifth category was the same as in the fourth category, except that the training process was applied to the visual sense instead of proprioception.
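As a compact summary, the five condition categories can be written as model settings; this dictionary is our own shorthand, with the training entry encoded as in Eq. (9) (-1 visual, 0 none, +1 proprioceptive).

```python
# Condition categories 1-5 as inputs to the model sketched above.
CONDITIONS = {
    1: dict(I1=0, I2=0, training=0),    # naive bimodal estimation
    2: dict(I1=1, I2=0, training=0),    # possible mismatch announced
    3: dict(I1=1, I2=0, training=+1),   # ... plus proprioceptive training
    4: dict(I1=1, I2=1, training=+1),   # ... plus reporting both estimates
    5: dict(I1=1, I2=1, training=-1),   # visual training instead
}
```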

Fig. 9. The results of the computer simulations in the different condition categories (categories 1–5 in rows A–E and columns a–d). The first column shows the position of the proprioceptive stimulus, located at position 0, and the second column shows the proprioceptive neurons' activity for that stimulus (just before entering the second layer). The first row shows the visual stimulus position, which changes from 0 to 3 across columns a–d, producing shifts of 0–3; the second row shows the visual neurons' activity for that stimulus (just before entering the second layer). In all plots (rows A–E, columns a–d), black dashed curves represent the model estimation in the multisensory area. Dark and light grey curves represent the visual and proprioceptive estimations in the respective unisensory areas (output of the second layer). Vertical and horizontal axes show normalized activity and position, respectively.

3.3. Results of computer simulations

The results of simulating the different experimental conditions applied to the model are summarized in Fig. 9 (rows A–E and columns a–d). In Fig. 9, the first column shows the position of the proprioceptive stimulus and the second column displays the proprioceptive neurons' activity for that stimulus just before entering the second layer; likewise, the first row shows the visual stimulus position and the second row displays the visual neurons' activity for that stimulus (just before entering the second layer). In the plots of the different conditions, three curves are drawn: (1) the model estimation in the multisensory area, i.e., the output of the neurons in the third layer (black dashed line); (2) the output of the proprioceptive neurons in the second layer (light grey line); and (3) the output of the visual neurons in the second layer (dark grey line).

In mandatory integration, in which participants perceive no discrepancy between the senses and the subject integrates the information of the two modalities to estimate one target (condition Aa in Fig. 9), Bayesian inference occurs. In this condition, the uncertainty of the bimodal estimation is less than that of the unimodal one. Increasing the shift between the two modalities increases the uncertainty of the bimodal estimation (conditions Ab–Ad). In this condition, the real shift between the modalities is the only cause that increases the uncertainty about modality mismatch, and the subject is not aware of the shift.

When the subject is aware of a possible discrepancy (the second experimental condition category, i.e., conditions Ba–Bd), the uncertainty of the bimodal estimation is
greater than in the previous mode (the first experimental condition category, i.e., conditions Aa–Ad in Fig. 9) (Kording et al., 2007). In this condition, if the subject's proprioceptive sense is trained, the uncertainty of proprioception decreases and, subsequently, the uncertainty of the bimodal estimation decreases (the third experimental condition category, i.e., conditions Ca–Cd), because larger shifts cause greater reliance of the participant on the sense that is closer to reality. In human experiments and natural conditions, manipulating visual feedback is usually easier than manipulating proprioception, which is an internal sense; thus, proprioception is usually closer to reality. The amount of the shift, as well as the participant's attention to one of the two modalities, is important for the reliability of the modalities (Saidi et al., 2012).

When the subject is aware of a possible shift and is asked to report the unimodal estimations separately, activity in the multisensory area decreases. Conditions Da–Dd and Ea–Ed in Fig. 9 (the fourth and fifth experimental condition categories) depict this. This result is consistent with the experimental results obtained in this paper. Proprioceptive training decreases the uncertainty of the final estimations, but visual training does not have a significant effect on it. Since the uncertainty of the visual sense is normally less than that of proprioception, training the visual sense modifies the output estimation less than training proprioception does. This result is consistent with the experimental results obtained in this study.

4. Conclusion

This paper proposed a neural computational model for multisensory integration. The model consists of a multilayer recurrent neural network with two populations of neurons that code the positions of visual and proprioceptive sensory stimuli. The populations of neurons with bell-shaped receptive fields resemble many neurons in the primary cortices. The model benefits from neuroimaging findings, including the existence of feedback connections that transfer some information to unisensory neurons for processing, and of direct connections between unisensory areas. Sensory attention is incorporated in the feedback weights to indicate how much multisensory information is used by each sensory modality: the sense that attracts more attention relies more on its own information and uses less information from the other sense (or from the multisensory area combining the two sensory modalities). The attention paid to each sense is determined by internal and external factors; for instance, external forcing to focus on one sense was considered in the model. Awareness of a mismatch between sensory sources has an inhibitory effect on the multisensory neurons of the model and decreases neural activity in the multisensory area. Owing to these features, the model is able to mimic the multisensory integration behavior of neural centers in the human brain. Previously proposed sensory integration models, including Bayesian and causal Bayesian inference, excluded the sensory training process.

For model validation, a new setup and experiments were designed and carried out in this study. In our experiments, the estimation of hand movement in visual-proprioceptive integration was studied in two groups of subjects, with trained visual and trained proprioceptive senses. The results of the experiments indicate that the visual learning rate is significantly higher than that of proprioception. The means of the visual and proprioceptive errors are decreased by training.
Statistical analysis shows that, while this decrease is significant for the proprioceptive error, it is insignificant for the visual error. Visual errors in the training phase are much smaller than the main test stage errors, even at the beginning of the phase, because in the main test the subject has to focus on two senses. The presented model is consistent with the results of Bayesian and causal Bayesian inference. Furthermore, the model can also represent the human sensory training procedure. Finally, the proposed model could be used in clinical applications, such as diseases related to sensory integration problems. For example, this model lends itself to the study of autism, in which patients have difficulty integrating sensory information properly.

REFERENCES

Alais, D. & Burr, D. (2004) The ventriloquist effect results from near optimal bimodal integration. Curr. Biol., 14, 257–262.
Beierholm, U., Kording, K., Shams, L. & Ma, W.J. (2008) Comparing Bayesian models for multisensory cue combination without mandatory integration. Adv. Neural Inform. Process. Syst., 20, 1–8.
Bresciani, J.P., Dammeier, F. & Ernst, M.O. (2006) Vision and touch are automatically integrated for the perception of sequences of events. J. Vision, 6, 554–564.
Cappe, C. & Barone, P. (2005) Heteromodal connections supporting multisensory integration at low levels of cortical processing in the monkey. Eur. J. Neurosci., 22, 2886–2902.
Ciaramitaro, V.M., Buracas, G.T. & Boynton, G.M. (2007) Spatial and cross-modal attention alter responses to unattended sensory information in early visual and auditory human cortex. J. Neurophysiol., 98, 2399–2413.
Deneve, S., Latham, P.E. & Pouget, A. (1999) Reading population codes: A neural implementation of ideal observers. Nat. Neurosci., 2, 740–745.
Deneve, S. & Pouget, A. (2004) Bayesian multisensory integration and cross-modal spatial links. J. Physiol. (Paris), 98, 249–258.
Ernst, M.O. & Banks, M.S. (2002) Humans integrate visual and haptic information in a statistically optimal fashion. Nature, 415, 429–433.
Ghahramani, Z. (1995) Computation and Psychophysics of Sensorimotor Integration. PhD thesis, Massachusetts Institute of Technology.
Grossberg, S. (2012) Adaptive resonance theory: How a brain learns to consciously attend, learn, and recognize a changing world. Neural Netw., 37, 1–47.
Heron, J., Whitaker, D. & McGraw, P.V. (2004) Sensory uncertainty governs the extent of audio-visual interaction. Vision Res., 44, 2875–2884.
Hospedales, T. & Vijayakumar, S. (2009) Multisensory oddity detection as Bayesian inference. PLoS ONE, 4, e4205.
Kording, K.P. & Wolpert, D.M. (2004) Bayesian integration in sensorimotor learning. Nature, 427, 244–247.
Kording, K.P., Beierholm, U., Ma, W.J., Quartz, S., Tenenbaum, J.B. & Shams, L. (2007) Causal inference in multisensory perception. PLoS ONE, 2, e943.
Kording, K.P. & Tenenbaum, J.B. (2007) Causal inference in sensorimotor integration. Adv. Neural Inform. Process. Syst., 19, 737–744.

Ma, W.J., Beck, J.M., Latham, P.E. & Pouget, A. (2006) Bayesian inference with probabilistic population codes. Nat. Neurosci., 9, 1432–1438.
Macaluso, E., Frith, C.D. & Driver, J. (2000) Modulation of human visual cortex by crossmodal spatial attention. Science, 289, 1206–1208.
Ohshiro, T., Angelaki, D.E. & DeAngelis, G.C. (2011) A normalization model of multisensory integration. Nat. Neurosci., 14, 775–782.
Saidi, M., Towhidkhah, F., Lagzi, F. & Gharibzadeh, S. (2012) The effect of proprioceptive training on multisensory perception under visual uncertainty. J. Integr. Neurosci., 11, 401–415.
Sober, S.J. & Sabes, P.N. (2003) Multisensory integration during motor planning. J. Neurosci., 23, 6982–6992.
Sober, S.J. & Sabes, P.N. (2005) Flexible strategies for multisensory integration during motor planning. Nat. Neurosci., 8, 490–497.
Van Beers, R.J., Sittig, A.C. & Van der Gon, J. (1996) How humans combine simultaneous proprioceptive and visual position information. Exp. Brain Res., 111, 253–261.
Van Beers, R.J., Sittig, A.C. & Van der Gon, J. (1999) Integration of proprioceptive and visual position information: An experimentally supported model. J. Neurophysiol., 81, 1355–1364.
Warren, D.H. & Cleaves, W. (1971) Visual-proprioceptive interaction under large amounts of conflict. J. Exp. Psychol., 90, 206–214.
