A statistical mixture method to reveal bottom-up and top-down factors guiding the eye-movements

The main idea of this work is to measure the spatial distribution of the visual attention when participants are gazing at a product. This should help to detect what kind of cognitive factors might direct the visual attention and to identify the properties (i.e. overall impressions or attributes) carried by the design which influence the product's assessment.


Introduction
In the industrial context of product evaluation, it appears relevant to use non-declarative methods to detect which properties of the product attract the customer's attention. Indeed, when a manufactured object is designed, it is necessary to adapt the characteristics of the object (i.e. its design) in order to match both the designer's intentions and the public's expectations.
While some researchers are using ethological methods (manipulations, interactions), sociological ones (interaction between users about the product), or sensorial ones (physical perception of the product), our contribution to this field is to analyze the visual attention of the customers.
The main idea of this work is to measure the spatial distribution of the visual attention when participants are gazing at a product. This should help to detect what kind of cognitive factors might direct the visual attention and to identify the properties (i.e. overall impressions or attributes) carried by the design which influence the product's assessment. In the present study we want to estimate the contribution of several perceptive and cognitive factors which could potentially guide the visual attention. While the overt visual attention can be inferred from eye-movements (Anderson & Dearborn, 1952) in various situations like driving or playing chess, the links between the visual and the cognitive activities are not obvious during an assessment task. Indeed, the visual information is not systematically processed serially (but in parallel): the eye positions do not reflect the encoding of local visual information (Rayner, 1998). Moreover, in situations with a complex task or stimulus, the visual attention is subject to multiple cognitive top-down factors (Yarbus, 1967; Carpenter & Just, 1983; Henderson, 2003) in interaction with bottom-up ones (Chauvin, 2003; Itti & Koch, 2001; Itti et al., 2005). Several authors suggest quantitative models to link eye positions and information encoding during scene processing (Reichle, Rayner & Pollatsek, 2003). Like De Graef (De Graef, 1998), Baccino (Baccino, 2002), or Tatler (Tatler et al., 2006), who use eye-tracking techniques to observe the attention process, we want to extract information from eye-movements about cognitive and affective processes during an assessment task of pictures of car cab interiors.
The previous studies that have focused on manufactured objects confirm the complexity of the relationship between the eye positions, the cognitive locus and the assessment task. Hammer (Hammer & Lengyel, 1991) shows that the eyes are directed towards the product areas that support text information. Sharmin (Sharmin, 2004) shows that during a mobile phone evaluation, visual attention may browse the scene depending on at least two strategies: neighboring exploration and holistic information analysis.
Therefore, our approach is not to propose a supplementary model completely dedicated to a precise context to explain the multiplicity of the involved processes, but thanks to this controlled context, we propose a general framework to design a statistical model linking eye movements and visual attention.
Our methodology uses a statistical additive mixture model to estimate the contribution of several a priori distributions to explain the fixation distribution, depending on the visual scene (here the manufactured product) and the task. The first step of the study is to measure the eye-movements of customers during a product evaluation task, and to aggregate them per task and product. Then, we make hypotheses about the factors which might guide the visual attention. For each of them, a statistical model is defined to build the corresponding spatial distribution. In this study, five factors are defined. The first two are independent of the visual scene: the random effect and the centrality bias. The next two depend on the visual scene: the visual saliency, which predicts the visual attention only from low-level local image features such as colors, edges, contrasts and luminance, and the information optimization, based on the relative position of the edges. The last one is semantic. The model for this last factor consists in extracting the relevant semantic information useful to solve the task. We suggest applying here the experimental "Bubbles" paradigm to evaluate this semantic information. Finally, the contributions of each factor are compared across experimental situations (task × product).
The first part of this article presents the computational principle of the additive mixture model. We first consider an additive Gaussian model in order to illustrate this model on our experimental eye tracking data and to set the background of the proposed method. In this model, the different modes in the additive model are not necessarily Gaussian but must implement the different a priori guiding factors. Then we detail the design of each a priori mode from empirical distributions, developing the use of the "Bubbles" experimental paradigm to build the semantic distribution. The second part exposes the experimental protocol and the results.

Additive mixture model to explain eye fixation distribution
The most common technique to create density maps is to convolve the fixation map with a Parzen kernel (here for example a Gaussian kernel adjusted to the fovea size). It is a non-parametric method which is not useful to extract the clustering structure of the data. So we prefer a parametric modeling using an additive mixture model. Moreover, in the case of noisy data or of lack of robust data, we complete this model with a "random mode" to extract a part of the noisy structure in the data. Therefore, in this section we present the use of an additive Gaussian mixture method to model the spatial distribution of eye-fixations, first without the "random mode", and secondly with it. This approach is commonly used to estimate the spatial gaze density function. This method is "image-dependent" because its interpretation is directly linked to the objects which compose the scene. The Gaussian additive mixture model is implemented with the "Expectation-Maximization" algorithm (Dempster, Laird & Rubin, 1977) as a statistical tool for density estimation. The density function $f(x)$ of a univariate or multivariate random variable $x$ is estimated by an additive mixture of $K$ Gaussian modes according to the following equation:

$$f(x) = \sum_{k=1}^{K} p_k \, \varphi(x; \theta_k),$$

with $K$ the a priori number of Gaussian modes, $p_k$ the weight of each mode ($p_1 + \dots + p_K = 1$), $\varphi(x; \theta_k)$ the Gaussian density of the $k$-th mode and $\theta_k$ its parameters (mean and covariance matrix).
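As an illustration of the EM fit described above, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes isotropic Gaussian modes (one variance per mode) instead of full covariance matrices, and the names `gauss2d` and `em_gmm` are hypothetical.

```python
import math

def gauss2d(x, y, mx, my, var):
    """Isotropic 2-D Gaussian density with mean (mx, my) and variance var."""
    d2 = (x - mx) ** 2 + (y - my) ** 2
    return math.exp(-0.5 * d2 / var) / (2.0 * math.pi * var)

def em_gmm(points, means, var0=1.0, n_iter=50):
    """Fit K isotropic Gaussian modes to 2-D points with plain EM.

    points: list of (x, y) fixations; means: K initial (mx, my) guesses.
    Returns (weights, means, variances).
    Assumption of this sketch: one variance per mode, not a full covariance.
    """
    k = len(means)
    n = len(points)
    weights = [1.0 / k] * k
    means = [tuple(m) for m in means]
    variances = [var0] * k
    for _ in range(n_iter):
        # E-step: responsibility of each mode for each fixation.
        resp = []
        for (x, y) in points:
            num = [weights[j] * gauss2d(x, y, means[j][0], means[j][1], variances[j])
                   for j in range(k)]
            total = sum(num) or 1e-300
            resp.append([v / total for v in num])
        # M-step: re-estimate the weight, mean and variance of each mode.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            mx = sum(r[j] * p[0] for r, p in zip(resp, points)) / nj
            my = sum(r[j] * p[1] for r, p in zip(resp, points)) / nj
            d2 = sum(r[j] * ((p[0] - mx) ** 2 + (p[1] - my) ** 2)
                     for r, p in zip(resp, points)) / nj
            weights[j] = nj / n
            means[j] = (mx, my)
            variances[j] = max(d2 / 2.0, 1e-6)  # both axes share one variance
    return weights, means, variances
```

Called on a list of fixation coordinates with K initial centers, the function returns the weights, which sum to one, together with the fitted centers and variances. As with any EM fit, the result can depend on the initial centers.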
The number of modes ($K$) is a priori unknown and must be chosen. Model selection has to balance the quality of fit (favoring a higher value of $K$) against robustness without overfitting (favoring a lower value of $K$). A classical approach is to use an information criterion which balances the likelihood of the model with its complexity (Hastie, Tibshirani & Friedman, 2001). Among the different available criteria, the Bayesian Information Criterion (BIC) (Schwarz, 1978) is preferred in a density estimation context (Keribin, 2000). A range of possible values of $K$ is chosen depending on the complexity of the visual scene. For each value of $K$ in this interval, the optimal parameters ($p_k$, $\theta_k$, $k = 1 \dots K$) of the mixture are found at the convergence of the Expectation-Maximization ("EM") algorithm. From all these sets of parameters, the "best" model is selected as the one which minimizes the BIC criterion:

$$\mathrm{BIC} = -2L + \nu \ln(n),$$

with $L$ the maximum log-likelihood of the estimated model at the "EM" convergence, $\nu$ the number of free parameters and $n$ the number of observed data. In this example the BIC criterion reaches its optimum for $K = 6$ when $K$ varies from one to eight. The best model has thus six Gaussian modes (figure 1.a). Figure 1.b shows the localization of these modes. Each mode is illustrated by its position (mean) and its spread at one standard deviation (ellipse). These modes describe the spatial areas which are fixated during the scene exploration, i.e. the probability for each area to be gazed at. This example also illustrates one common difficulty when using the classical "EM" algorithm with noisy data. If the latent data clustering is not very strong, some Gaussian modes may or may not be extracted depending on the random initial conditions. This is the case for the vertical Gaussian mode close to the left side of the image: its extraction depends on the initial conditions. To cope with such situations, we add a supplementary mode in the model: a uniform density.
The experimental observations which are not close to latent clusters contribute to this mode: it is the "noise" mode. The equation of the complete model is then:

$$f(x) = \sum_{k=1}^{K} p_k \, \varphi(x; \theta_k) + p_u \, U(x),$$

with $K$ the a priori number of Gaussian modes, $p_k$ the weight of each Gaussian mode, $p_u$ the weight of the uniform mode ($p_1 + \dots + p_K + p_u = 1$), $\varphi(x; \theta_k)$ the Gaussian density of the $k$-th mode and $U(x)$ the uniform constant density, normalized so that it integrates to one over the image area. The "EM" algorithm is adapted to this model in order to also estimate the contribution $p_u$. The model selection therefore covers both previous models, with or without the uniform mode. For the same data, the results are illustrated in figure 2. The minimum value of the BIC criterion is -295.60. This value is reached for the model with four Gaussian modes and the uniform mode. The contributions of each mode are presented in table 1. We notice that the contribution $p_u$ is significant compared to the contributions of the Gaussian modes. This uniform mode explains scattered data. Here there are scattered eye fixations which are not localized on specific areas (around 17% of all the fixations), i.e. "ambient" eye fixations which are very sensitive to inter-individual variability.
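The selection between candidate models, with or without the uniform mode, can be sketched as follows. The sketch assumes the usual form of the criterion, BIC = -2L + ν ln(n), which is consistent with a criterion that is minimized. The helper names and the parameter count (full covariance matrices, one extra free weight for the uniform mode) are illustrative assumptions, not taken from the original work.

```python
import math

def n_free_params(k, dim=2, uniform_mode=False):
    """Free parameters of a K-mode Gaussian mixture in `dim` dimensions.

    Each mode has `dim` mean terms and dim*(dim+1)/2 covariance terms.
    The weights sum to one, so one of them is not free.
    The uniform mode adds one weight and no shape parameter.
    """
    per_mode = dim + dim * (dim + 1) // 2
    n_weights = k + (1 if uniform_mode else 0)
    return k * per_mode + (n_weights - 1)

def bic(log_likelihood, nu, n):
    """BIC = -2 L + nu ln(n); lower is better under this sign convention."""
    return -2.0 * log_likelihood + nu * math.log(n)

def select_model(candidates, n):
    """candidates: list of (label, log_likelihood, nu) tuples.

    Returns the label of the candidate with the lowest BIC.
    """
    return min(candidates, key=lambda c: bic(c[1], c[2], n))[0]
```

Under these assumptions, a two-dimensional model with four Gaussian modes and the uniform mode has 24 free parameters, against 35 for six Gaussian modes alone, so the uniform mode can win the comparison even with a slightly lower likelihood.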

This approach, combining a Gaussian mixture and the "EM" algorithm, is commonly used to estimate density functions. Depending on their position and deviation, the modes can be interpreted according to the objects in the scene. Here, three of the four modes, located on the left side of the scene, are related to objects: the steering wheel, the pedals, and the central desk. These objects are very important for the interpretation of the scene. But this approach does not reveal, for a given task, whether some modes are induced by a similar factor, or whether a factor has similar effects on visual scenes which do not have similar semantic properties.
Nevertheless, we keep this statistical model as a general framework to set-up the new model in which the modes are not necessarily Gaussian. They must represent the candidate guiding factors across different visual scenes for the same task and not spatial concentrations of eye fixations for a given visual scene.
According to the previous section, the mixture model can be set depending on the density properties of the experimental data. But it can also be set from a priori hypotheses defining the number and the properties of each mode of the mixture. Then the global contribution of all the fixations to each mode is estimated. This approach is employed by Vincent et al. (Vincent et al., 2009), where the density of eye positions is modeled with a mixture of elementary densities defined a priori, each density representing a specific factor which might guide the visual attention. Thus, each mode of the additive mixture is defined by a density modeling one candidate factor which might drive the cognitive analysis of the visual task. The common properties of these two models are the additive mixture of the modes and the use of the "EM" algorithm to estimate their parameters. These factors describe both low-level and high-level processes. Each mode is used to assess the contribution of the associated factor. First, it is necessary to identify these candidate factors in relation to the visual task, and then their statistical density models. Each of these densities is represented by a spatial density map, obtained from a specific image processing, from a manual segmentation, or from statistical hypotheses, depending on the nature of the attention factor. The "EM" algorithm provides stable results if the a priori distributions are not strongly correlated. Each distribution which codes a guiding factor must provide a complementary effect on the studied process, the visual attention. At the convergence of the "EM" algorithm, the contributions of each density are estimated, maximizing the likelihood of the final model, which is then completely defined.
To summarize, a noticeable characteristic of this method is that the additive model contains a priori density maps, which are chosen depending on the stimuli, the tasks, and the assumptions to be investigated, and which must be previously characterized.
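When every mode is a fixed a priori map, the EM iteration reduces to updating the weights only. The following sketch illustrates that reduced iteration. The function name `em_fixed_maps` and the input layout (one row per fixation, one column per map, each entry being the map's density at that fixation) are assumptions made for the example.

```python
def em_fixed_maps(density_at_fix, n_iter=200, tol=1e-10):
    """Estimate mixture weights when every mode is a fixed density map.

    density_at_fix[i][m] is the value of map m at fixation i.
    Only the weights are updated; the maps themselves never change.
    """
    n_maps = len(density_at_fix[0])
    weights = [1.0 / n_maps] * n_maps
    for _ in range(n_iter):
        totals = [0.0] * n_maps
        for row in density_at_fix:
            mix = sum(w * d for w, d in zip(weights, row))
            if mix <= 0.0:
                continue  # fixation explained by no map: skipped in this sketch
            # E-step: share of this fixation attributed to each map.
            for m in range(n_maps):
                totals[m] += weights[m] * row[m] / mix
        norm = sum(totals)
        if norm == 0.0:
            return weights
        # M-step: the new weight is the average share over the fixations.
        new_weights = [t / norm for t in totals]
        change = max(abs(a - b) for a, b in zip(new_weights, weights))
        weights = new_weights
        if change < tol:
            break
    return weights
```

If two maps are nearly identical, many weight combinations give almost the same likelihood, which is one way to see why strongly correlated a priori distributions make the estimates unstable.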

Setting up the a priori distributions composing the model
Five factors are suggested to explain the observed fixation distribution, each one being modeled by one spatial distribution and being considered as one mode of the additive mixture.
First of all, if the eyes are guided by a random process, the distribution will follow a uniform law: each area of the space has the same probability of being gazed at. In the mixture, this map acts as a "trash" map, capturing fixations which are not explained by other assumptions.
The second factor is a process of central gazing (Tatler, 2007): "on screen" gazing produces a central bias. The eyes preferentially gaze at the center of the screen and tend to return to it regularly, regardless of the content of the image. This is also the initial gaze position and may also be a rest position. A "centrality map" is thus defined, where the central area has a higher probability of being fixated than peripheral ones. The density is determined by a Gaussian function centered on the image. In the original model proposed by Vincent, the mean and the variance-covariance matrix are adjusted during the algorithm as in a usual "EM" algorithm. Here these parameters are fixed because we want to evaluate the contribution of this factor in the central area (see Figure 3). Indeed, if these parameters were learned by the "EM" algorithm, this spatial mode could move to another place depending on the visual scene.
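A fixed centrality map of this kind can be sketched as a Gaussian evaluated on the pixel grid and normalized to sum to one. The function name and the choice of one standard deviation per axis are illustrative assumptions; the values actually used for the study are not given in the text.

```python
import math

def centrality_map(width, height, sigma_x, sigma_y):
    """Fixed Gaussian centred on the image, normalised to sum to one over the pixels.

    sigma_x and sigma_y are in pixels and stay fixed: they are not re-estimated by EM.
    """
    cx = (width - 1) / 2.0
    cy = (height - 1) / 2.0
    grid = [[math.exp(-0.5 * (((x - cx) / sigma_x) ** 2 + ((y - cy) / sigma_y) ** 2))
             for x in range(width)] for y in range(height)]
    total = sum(sum(row) for row in grid)
    return [[v / total for v in row] for row in grid]
```

The returned grid is indexed as `grid[y][x]`. Because the mean and the spread stay constant, the map is the same for every visual scene of a given size, which is what makes its estimated weight comparable across conditions.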
The third factor comes from the visual bottom-up saliency. One of the basic principles is that the eyes are attracted towards areas of high contrasts combining different low level visual features on textural luminance and chrominance variances. The bottom-up saliency model proposed by Itti (Itti & Koch, 2001) is very popular in this domain. In this work, we use a similar algorithm which is developed in our laboratory. It is based on the same general principles as Itti's, but using a more accurate model at the retina level (Ho, Guyader & Guérin-Dugué, 2009). This map is considered here as merging low-level visual information to predict the relative attractiveness of spatial areas without "top-down" attention factors (see Figure 4).
The fourth factor is based on an information maximization approach (see Figure 5): it comes from experimental observations (Renninger, Coughlan & Verghese, 2005) that the eyes can be spatially positioned in such a way as to optimize the acquisition of visual information instead of sweeping over an area to encode each piece of information. This approach needs to compute from which point of view the perception of details is maximized, limiting the number of saccadic movements. The edges are first extracted by an edge detector (here the "Sobel" detector). Then a spatial clustering algorithm is used to find the position of the barycenters of the edges. We use the "Mean Shift" algorithm (Fukunaga & Hostetler, 1975; Cheng, 1995). Finally, a Gaussian function, set up with the fovea size, is defined for each cluster and centered on its barycenter. For this factor, the highest probability of gazing at an area is located at the barycenters of the edges and their neighboring areas.
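The clustering step can be sketched with a basic mean shift on a list of edge-pixel coordinates. The sketch assumes a flat kernel with a fixed bandwidth, and assumes that the edge points have already been extracted, for instance by thresholding a Sobel response. The kernel and bandwidth used in the study are not specified in the text, and the function name is hypothetical.

```python
import math

def mean_shift(points, bandwidth, n_iter=50, merge_tol=1e-3):
    """Flat-kernel mean shift on 2-D points; returns the list of distinct modes.

    Each point is moved to the mean of the data points lying within
    `bandwidth` of its current position, until it stops moving.
    """
    modes = []
    for (x, y) in points:
        for _ in range(n_iter):
            near = [(px, py) for (px, py) in points
                    if math.hypot(px - x, py - y) <= bandwidth]
            if not near:
                break  # defensive guard; a start on a data point always has a neighbour
            nx = sum(p[0] for p in near) / len(near)
            ny = sum(p[1] for p in near) / len(near)
            moved = math.hypot(nx - x, ny - y)
            x, y = nx, ny
            if moved < 1e-9:
                break
        # Keep the mode only if it is not a duplicate of one already found.
        if all(math.hypot(x - mx, y - my) > merge_tol for (mx, my) in modes):
            modes.append((x, y))
    return modes
```

Each returned mode plays the role of an edge barycenter. The map described above would then be obtained by placing a fovea-sized Gaussian on every mode and normalizing the sum, a step left out of this sketch.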
The last factor is a semantic top-down factor: the visual attention is driven by cognitive processes and therefore by the semantic content of the visual scene. Thus, we have to extract the local areas of the scene which contain relevant information regarding the task. To model this factor, we use complementary experiments with the "Bubbles" paradigm using the same visual scenes and the same tasks. Our hypothesis is that the resulting "classification map" obtained by this paradigm gathers the top-down effects in the observed eye fixation density. The construction of these maps is detailed in the next section (see Figures 6 & 7).
After having estimated the spatial density distribution of each factor and measured the eye-movements, the "EM" algorithm is employed in combination with the additive mixture model to find the best parameters for the model. In our case, these parameters are the relative contribution of each mode (each factor) to the eye fixations density. In the next section, we provide the details of the method to build the semantic map, based on the "Bubbles" paradigm.

Material & Methods
For the top-down factor, we want to extract the relevant parts of the visual scene which should be observed to solve the given task. As far as we know, the "Bubbles" paradigm (Gosselin & Schyns, 2001) has never been employed to compare semantic information and bottom-up effects on visual attention. We apply it here to select, for a given image × task pair, the visual areas encoding relevant information to solve the judgment task. This paradigm was originally designed to identify facial areas associated with facial expression recognition (Humphreys et al., 2006). It consists in watching the scene through a mask, with only a few parts remaining visible (the "Bubbles", which are spatially set at random).
Therefore the participants must solve a decision task while gazing at the scene through the "Bubbles" mask. This method is relevant when solving the task requires the capture of local visual information in the scene. The fixation density distribution shows areas which attract the visual attention and the "Bubbles" method identifies the decision areas. Moreover, it shows whether the decision is really about local areas or not, and whether the judgment task is homogeneous or not between participants.
To summarize, this method is efficient when there is a "ground truth" which is related to the correct answers, when the decision is based on local visual areas, and when the decision criteria are stable.
Otherwise, if several "correct answers" exist, if the different visual scenes cannot be discriminated by local areas, or if the decision is subject-dependent, then the algorithm will not be able to extract local areas statistically associated with consensual decisions.
The "Bubbles" paradigm proposed by Gosselin (Gosselin & Schyns, 2001) statistically links the subjects' answers with the areas of the visual scene gazed at during a decision task. These local areas are called the "diagnostic areas".
The stimuli we employ are visually and semantically complex, and the decision activates high-level processes. Moreover, a consensus between participants is necessary in order to extract stable diagnostic areas: one "right" answer and one "false" answer must exist, and this alternative must be the same for all the participants. Therefore, we adapt this paradigm to a pairwise comparison task, in order to assess the stimuli against a reference (Humphreys et al., 2006). The "Bubbles" are set at random for one image of the pair and the same locations are used for the second image (see Figure 7). The decision is taken after the visual inspection of both images of the pair, which have a similar masking (left and right sides of the screen): in the pair, one image has the required property and the second does not. The different properties studied for the car cab interiors are described in the next section. The algorithm automatically adjusts the number of "Bubbles" to the performance of the participants during the trials (from 70 up to 80% of correct answers), so that performance moves towards a preset threshold of correct answers. The size of each bubble is set so that its radius is one degree of visual angle (fovea). On average (depending on the complexity of the scene), between 10% and 15% of the picture is visible. Finally the algorithm estimates the correlations between the correct answers and the position of the visible areas, and provides a probability distribution. This is the probability for a spatial area to be associated with a right answer. In other words, the "Bubbles" paradigm provides a spatial map of right decision making: the classification map.
The experiment was designed with the Stat4Ci Matlab Toolbox provided by F. Gosselin (http://www.mapageweb.umontreal.ca/gosselif/labogo/Stat4Ci.html). The classification task is carried out with 10 participants per condition and 320 decisions per subject. At least 900 trials are performed on each pair and each task. For each trial, the pictures are partially masked by the bubbles. Consequently, the participant has to make a decision on partial information. If the answer is false, we can consider that the information visible through the "Bubbles" is not related to the assessment task. Otherwise, if the answer is right, the visual information is sufficient: the parts visible through the "Bubbles" are significant with regard to the task. By cumulating the answers of all the participants, the correct answers are statistically correlated with the visible areas.
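The accumulation principle described above can be illustrated with a deliberately simplified sketch: for each pixel, the proportion of correct answers among the trials in which that pixel was visible. This is a stand-in written for illustration and does not reproduce the statistics computed by the Stat4Ci toolbox; the function name and the 0/1 mask representation are assumptions.

```python
def classification_map(masks, correct):
    """Per-pixel proportion of correct answers among trials where the pixel was visible.

    masks: list of 2-D 0/1 grids, one per trial (1 = visible through a bubble).
    correct: list of booleans, one per trial.
    Pixels that were never revealed get None.
    """
    height = len(masks[0])
    width = len(masks[0][0])
    seen = [[0] * width for _ in range(height)]
    hits = [[0] * width for _ in range(height)]
    for mask, ok in zip(masks, correct):
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    seen[y][x] += 1
                    if ok:
                        hits[y][x] += 1
    return [[hits[y][x] / seen[y][x] if seen[y][x] else None
             for x in range(width)] for y in range(height)]
```

Pixels with a proportion well above the overall rate of correct answers are candidates for diagnostic areas. A real analysis would also smooth the map and test it statistically, which this sketch does not attempt.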
The participants are 64% male and 36% female; average age is 31.4 years old. They are not working in the automotive sector (design, marketing or communication).
Two car cab interiors are chosen to be judged by participants: Peugeot 207 (207) and Citroën C6 (C6), designed in two versions: sports versus standard for 207, white versus black for C6. We therefore obtain four visual stimuli.
Two tasks are chosen: for the 207, the participants assess the "sport" character of the car's design or the "quality level" of the interior cabs. For the C6, the instructions are to evaluate the "quality level" or the "luxury quality" of the interior cabs.

Results
It must be noted that the quality of the results depends on the experimental conditions and the number of trials. Figure 6.a presents the classification results for the 207 (sport character), and this map actually represents decision areas. Moreover, for the task on the "high level" assessment, the decision areas appear less locally accurate than for the task on "sport" type.
For C6, the decision areas are not spatially localized. Actually, for the "quality level" assessment, the choice is not homogeneous among the subjects: the "Bubbles" algorithm cannot extract converging areas. For the "luxury" assessment task (see Figure 6b), the choice effectively points to the white modality ("correct answer"), but the only decision criterion is the color.
Oculometry measures: assessment of the factors driving the eyes

Material & method
We have defined in the preceding section eight experimental conditions (4 pictures * 2 assessments). In this experiment, we add a control task: four groups observe each stimulus without instructions (4 conditions).
Twelve groups are thus formed, with ten participants per group; for the control (free viewing task), four groups are formed, one per interior compartment. The participants are selected in such a way as to have homogeneous groups in terms of age and gender; they all bought a medium-range car in the last two years; they live in France and do not work in the automotive sector. The participants' ages are homogeneously distributed from 20 to 60 years old (median: 35 y.o.).
The car pictures are displayed at scale 0.80 of their real size on a screen of 160 cm × 125 cm at 2 m from the participant (42° visual angle). The eye-tracker employed is the FACELAB® 4.1, used in precision mode (1.5°) with a sampling frequency of 60 Hz. Two warm-up trials are done before the first judgment trial. Each participant performs two assessment tasks, one on a 207 and one on a C6. This order (207 or C6 first) is counterbalanced. After a calibration step, the sequence begins with a black slide, and the participant is asked to stare at a dot located at the center of the screen. Then, the image is displayed for 8 seconds. Finally the participant gives his answer, and the sequence is repeated.
Twenty eye-movement sequences are recorded per product and task. After eliminating wrong measures or failed trials (around 20%), we obtain around sixty sequences of 8 seconds per condition.

Results
In order to explain the spatial fixation density for each experimental condition, we have defined the statistical model based on an additive mixture of the five distributions previously described, which might compete to guide the visual attention in this context. At the convergence of the "EM" algorithm, we obtain the contribution of each of these five factors. These contributions are considered as the relative effect of each factor guiding the eye-movements (the sum of the contributions is equal to one). Table 2 shows these contributions for each experimental condition (stimulus × task). Table 3 shows the results for the free viewing task.
First of all, the "Random" factor does not contribute to the model: the fixations are not randomly distributed and they can be explained by the other factors. Secondly, the distinction between 207 and C6 conditions is highlighted. For 207, the semantic map explains the eye-movements better than the low-level maps ("Saliency" and "InfoMax"), which have weak contributions. For C6, the situation is reversed: the low-level maps explain the experimental data better than the high-level semantic map. Moreover, the centrality bias is stronger for C6. A Principal Component Analysis appears here as a very useful way to compare the various eye movement sets in order to highlight the similarities and differences between the different models. For this, a dataset is created with the eight experimental conditions for the assessment task, merged with the four experimental conditions for the free viewing task. This provides twelve situations described by the four factors with a non-null contribution, corresponding to three degrees of freedom. There is one constraint: the sum of the contributions is set to one. The resulting biplot of the projection on the first two principal components (see Figure 8) shows mainly two trends: the C6 products versus the 207 products, depending on the semantic factor contribution.
For the C6 products, the visual attention seems to be guided by bottom-up factors (visual saliency, maximization). For the 207 products, two subgroups are discriminated by the contribution of the "centrality" factor: the assessment of the sporting property induces attention more focused on the decision areas than the other tasks (highest cognitive level).
Therefore, the eye positions depend of course on the task and may in some cases overlap with the visual decision criteria. When the central areas are strongly gazed at, it might be because the attention is less directed to local areas, or because the central local information is watched.
But, when participants gaze freely at the 207 products (without a judgment task), the fixation distribution is well explained by the semantic map (which is built by the "Bubbles" experiment on the basis of judgment process). Thus, even without instructions, the decision areas are particularly observed for the 207. The interpretation of the link between visual attention and decision is therefore complex. For the C6, we notice that the free viewing attention is well explained by the centrality hypothesis, but not by saliency or maximization, as might have been foreseen.

Discussion
This approach seems very efficient to highlight the relative weights of the different factors which can guide the distribution of the visual attention. This method allows several experimental conditions to be compared, and specific features of the attention processes to be identified for each situation.
About the results, several points must be discussed. First, the respective effects of the factors are quite similar for the 207 (either sport or standard) with and without instruction: this does not tell us whether the participants gaze at these areas because they carry information useful for the task or because they are physically or cognitively distinctive. Even if the task requires local visual information, these areas might be gazed at because of the judgment processes, but also because of attractiveness, visual complexity, or object identification processes. Second, while the random effect appears null, the centrality one is rather highly weighted. This converges with Tatler's observations (Tatler, 2007), showing that either in a free viewing task or in a search task, the eyes tend to stay in the center. Our results therefore confirm that even in a judgment task, this bias is strong, but with our experimental protocol, we cannot decide whether the bias comes from the initial position of the eyes or from the center being an optimal location to gaze at the scene. Third, the saliency appears to be null for the 207 cars while the semantic contribution is high, and in all the cases the saliency weights are lower than those of the information-maximization hypothesis. The saliency map is considered to model the locations where the overt attention goes, but it does not seem to model the areas which the participants often gaze at. We can suggest that in this kind of judgment task, the saliency effects are counterbalanced, not by the task (cf. free viewing results) but by the knowledge of the object, and therefore by the semantic content of the scene. The fact that the information-maximization map seems to contribute well can be explained by the fact that this map models the optimal locations to gaze at the objects of the scene, which is linked with the a priori knowledge of the spatial structure of the objects.
About the materials employed, we use pictures of car cabin interiors; therefore some information is missing between the real object and its representation (three-dimensional depth, ecological immersion, etc.). Moreover, these objects are very well known (pre-existing cognitive structures) and are consistent (the percepts are organized in a coherent manner), the representation is dense (a lot of objects can be gazed at simultaneously), and they may be interpreted at multiple levels, from colors and textures to the presence or lack of functionalities, the relative position of the object, the overall attractiveness, etc. These observations converge to confirm our methodology in various experimental contexts, i.e. to measure the attention processes in real scenes, and to test other tasks and several kinds of objects.
The analysis of eye-movement data is one of the most striking points of this kind of behavioral experiment. If the experimenters have some hypothesis about the areas which will be gazed at, the definition of areas of interest independent of the measures is interesting. The transition occurrences between areas, the temporal patterns of fixation, or the characteristics of the eye-movements per area (fixation frequencies, delays before the first fixation, duration, saccade amplitude) can be studied. If there is no hypothesis about the areas which will be observed, or if we need to compare the similarities between several trials, the analysis using density estimation is relevant. After having established the densities of the fixations, they can either be compared directly, or be employed to compare several trials with regard to a specific density reference (Tatler et al., 2006). The data employed to estimate the density will have an effect on the number of local maxima, their highest values, and their topology.
Regarding the statistical method proposed by Vincent, we slightly modify it. First, we use one generic map instead of several dedicated maps to model the bottom-up factors on the basis of a visual saliency process (chrominance, edges, and luminance) regardless of the task. Secondly, we add semantic information based on the "Bubbles" paradigm (regardless of the physical characteristics of the scene), which seems well adapted to build a top-down map, to be defined for each task and each picture. This is our main contribution to this method. Lastly, the relevance of the additive mixture model comes from the assumption that each driving factor is complementary to the others. As a consequence of these properties, the weaknesses of this method are the following. First, the fusion model of the different factors is simple (additive mixture). If there are complex interactions, they are not taken into account. Moreover, if the factors are strongly correlated, the "EM" algorithm will be unstable. A second limitation comes from the "Bubbles" method: if there is a weak consensus among participants, or if the decision areas are not local, common diagnostic areas do not appear. Finally, concerning the global model, we can note that the factor estimates are relatively stable across different trials and initial conditions. Nevertheless, confidence estimates could have been computed using bootstrap resampling to confirm this robustness.
In this context we have, on the one hand, a complex process of visual attention depending on factors which interact with each other and, on the other hand, experimental data which reflect the great variability of the subjects' behavior. A statistical approach performed on a sufficient number of trials is relevant. Even if the assumptions of the statistical model are simple, as is the case for the additive mixture, this approach remains relevant when the aim is to capture the main effects. So the additive mixture model appears especially well adapted to such kinds of paradigms.