July 27, 2021, 15 min read
Thomas, A. W., Molter, F., & Krajbich, I. (2021). Uncovering the computational mechanisms underlying many-alternative choice. eLife, 10, e57012. https://doi.org/10.7554/eLife.57012
In our everyday lives, we often have to choose between many different options. When deciding what to order off a menu, for example, or what type of soda to buy in the supermarket, we have a range of possibilities to consider. Despite the prevalence of these choices in our everyday lives, there is surprisingly little decision-modeling work with large choice sets. Moreover, there is an apparent disconnect between research on small choice sets, which supports a process of gaze-driven evidence accumulation, and research on larger choice sets, which argues for models of optimal choice, satisficing, and hybrids of the two. With this work, we bridged this divide by developing and comparing different versions of these models in a many-alternative value-based choice experiment with 9, 16, 25, or 36 alternatives (see Figure 1).
Figure 1. Choice task. (A) Subjects chose a snack food item (e.g., chocolate bars, chips, gummy bears) from choice sets with 9, 16, 25, or 36 items. There were no time restrictions during the choice phase. Subjects indicated when they had made a choice by pressing the spacebar of a keyboard in front of them. Subsequently, subjects had 3 s to indicate their choice by clicking on their chosen item with a mouse cursor that appeared at the centre of the screen. Subjects used the same hand to press the space bar and navigate the mouse cursor. Trials from the four set sizes were randomly intermixed. Before the beginning of each choice trial, subjects had to fixate a central fixation cross for 0.5 s. Eye movement data were collected during the central fixation and choice phase. (B) After completing the choice task, subjects indicated how much they would like to eat each snack food item on a 7-point rating scale from −3 (not at all) to 3 (very much).
To first develop an intuition of how individuals behave in this task, let's take a quick look at some exemplary trials for each choice set size (9: top left, 16: top right, 25: lower left, 36: lower right):
In general, subjects began their search at the centre of the screen, coinciding with the preceding fixation cross. Subjects then typically transitioned to the top left corner and then moved from top to bottom. Over the course of the trial, subjects generally focused their search more on highly rated and larger items, while the probability that their gaze returned to an item also steadily increased, as did the durations of these returning gazes. In general, the effects of item position and size on the search process decreased over time. Overall, the fraction of total trial time that subjects spent looking at an item depended on the item's liking rating, size, and position, as well as on the number of items in the choice set (see the original paper for a more detailed analysis of individuals' visual search).
To next test the computational mechanisms that individuals use to form a decision, we considered the following set of decision models, spanning the space between rational choice and gaze-driven evidence accumulation:
The optimal choice model with zero search costs is based on the framework of rational decision-making. It assumes that individuals look at all the items of a choice set and then choose the best seen item with a fixed probability, while making a probabilistic choice over the set of seen items otherwise.
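This choice rule can be sketched in a few lines; note that the softmax over seen items and the parameter values below are illustrative assumptions for this sketch, not the paper's exact specification:

```python
import math
import random

def optimal_choice(seen_values, p_best=0.9, seed=0):
    """Choose the best seen item with fixed probability p_best;
    otherwise make a probabilistic (here: softmax) choice over all
    seen items. Returns the index of the chosen item."""
    rng = random.Random(seed)
    if rng.random() < p_best:
        # deterministic choice of the best seen item
        return max(range(len(seen_values)), key=lambda i: seen_values[i])
    # probabilistic choice over the set of seen items
    weights = [math.exp(v) for v in seen_values]
    return rng.choices(range(len(seen_values)), weights=weights, k=1)[0]
```

With `p_best=1.0` the rule reduces to always picking the best seen item; lowering `p_best` mixes in value-graded noise.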
The hard satisficing model assumes that individuals search until they either find an item that satisfies their reservation value or they have looked at all items. In the former case, individuals immediately stop their search and choose the first item that satisfies the reservation value. In the latter case, individuals make a probabilistic choice over the set of seen items, as in the optimal choice model.
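A minimal simulation of this stopping rule might look as follows; the random search order and the value-proportional fallback choice are simplifying assumptions of this sketch:

```python
import random

def hard_satisficing(values, reservation, seed=0):
    """Inspect items in a random order; immediately choose the first
    item whose value meets the reservation value. If no item does,
    fall back on a probabilistic (here: value-proportional) choice
    over all seen items. Returns the index of the chosen item."""
    rng = random.Random(seed)
    order = list(range(len(values)))
    rng.shuffle(order)
    seen = []
    for i in order:
        seen.append(i)
        if values[i] >= reservation:
            return i  # stop searching: first satisfactory item
    # no item satisfied the reservation value: choice over seen items
    weights = [max(values[i], 0.0) + 1e-9 for i in seen]
    return rng.choices(seen, weights=weights, k=1)[0]
```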
Based on the findings by Reutskaja and colleagues, we also considered a probabilistic version of satisficing, which combines elements from the optimal choice and hard satisficing models. Specifically, the probabilistic satisficing model (PSM) assumes that the probability with which individuals stop their search and make a choice at a given time point increases with elapsed time in the trial and the cached (i.e., highest-seen) item value. Once the search ends, individuals make a probabilistic choice over the set of seen items, as in the other two models.
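The PSM's stopping rule can be illustrated with a simple monotone function of elapsed time and cached value; the logistic link and the coefficient values here are illustrative assumptions, not the fitted model:

```python
import math

def stop_probability(elapsed_time, cached_value,
                     beta_t=0.1, beta_v=0.5, bias=-3.0):
    """Probability of terminating search at the current time point.
    It increases with elapsed time in the trial and with the cached
    (highest-seen) item value; link and coefficients are illustrative."""
    z = bias + beta_t * elapsed_time + beta_v * cached_value
    return 1.0 / (1.0 + math.exp(-z))
```

The key qualitative property is monotonicity: the longer the search has lasted, and the better the best item seen so far, the more likely the search is to end.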
We further considered an independent evidence accumulation model (IAM), in which evidence for an item begins accumulating once the item is looked at. Importantly, each accumulator evolves independently from the others, based on the subjective value of the represented item. Once the accumulated evidence for an alternative reaches a predefined decision threshold, a choice is made for that alternative (much like deciding whether the item satisfies a reservation value).
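The core of such an independent race can be sketched as a discrete-time simulation; for simplicity this sketch assumes all items have already been seen, and the threshold and noise settings are illustrative:

```python
import random

def independent_race(values, threshold=10.0, noise=0.5, seed=0):
    """Each item accumulates its own noisy evidence, independently of
    the other items; the first accumulator to reach the threshold
    determines the choice. Returns (chosen index, number of steps)."""
    rng = random.Random(seed)
    evidence = [0.0] * len(values)
    step = 0
    while True:
        step += 1
        for i, v in enumerate(values):
            evidence[i] += v + rng.gauss(0.0, noise)
            if evidence[i] >= threshold:
                return i, step
```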
Lastly, we also considered a relative evidence accumulation model (as captured by the GLAM), which assumes that individuals accumulate and compare noisy evidence in favour of each item relative to the others. As with the IAM, a choice is made as soon as the accumulated relative evidence for an item reaches a predetermined decision threshold.
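The difference to the independent race is that each accumulator is driven by a relative signal. A minimal sketch, with an illustrative "value minus strongest competitor" comparison standing in for the GLAM's full specification:

```python
import random

def relative_race(values, threshold=5.0, noise=0.5, seed=0):
    """Accumulate evidence for each item relative to its strongest
    competitor (its value minus the maximum of the other values); the
    first accumulator to reach the threshold wins, as in relative
    accumulation models. Returns (chosen index, number of steps)."""
    rng = random.Random(seed)
    evidence = [0.0] * len(values)
    step = 0
    while True:
        step += 1
        for i, v in enumerate(values):
            rel = v - max(values[:i] + values[i + 1:])  # relative signal
            evidence[i] += rel + rng.gauss(0.0, noise)
            if evidence[i] >= threshold:
                return i, step
```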
Importantly, we considered two different accounts of gaze in the decision process. The passive account of gaze assumes that gaze allocation solely determines the set of items that are being considered; an item is only considered once it is looked at. In contrast, the active account of gaze (as indicated by the addition of a “+” to the model name) assumes that gaze influences the subjective value of an item in the decision process, thereby generating higher choice probabilities for items that are looked at longer.
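The active account is commonly formalized as a multiplicative discount on the value of items that are not currently looked at, as in attentional accumulation models; the parameterization below is a sketch of that idea:

```python
def gaze_weighted_value(value, gaze_share, gamma=0.3):
    """Active account of gaze: while an item is not looked at, its
    value signal is discounted by gamma (0 <= gamma < 1), so items
    that are looked at longer contribute more evidence overall.
    gaze_share is the fraction of trial time spent on the item."""
    return gaze_share * value + (1.0 - gaze_share) * gamma * value
```

With `gamma = 1` gaze has no active effect; the smaller `gamma`, the stronger the choice advantage for items that are looked at longer.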
Figure 2. Choice psychometrics for each set size. (A) The subjects were very likely to choose one of the highest-rated (i.e., best) items that they looked at, in all set sizes. (B, C) The fraction of items of a choice set that subjects looked at in a trial decreased with set size (B), while subjects’ mean response times (RTs) increased (C). (D) Subjects chose the item that they looked at last in a trial about half the time. (E) Subjects generally exhibited a positive association of gaze allocation and choice behaviour (as indicated by the gaze influence measure, describing the mean increase in choice probability for an item that is looked at longer than the others, after correcting for the influence of item value on choice probability; for details on this measure, see Qualitative model comparison). (F) Associations of the behavioural measures shown in (A–E) (as indicated by Spearman’s rank correlation). Correlations are computed using the pooled subject means across the set size conditions. Correlations with p-values smaller than 0.01 (Bonferroni corrected for multiple comparisons: 0.1/10) are printed in bold font. Different colours in (A–E) represent the set size conditions. Violin plots show a kernel density estimate of the distribution of subject means with boxplots inside of them.
Qualitatively, we find that subjects' choice behaviour does not match the assumptions of the optimal choice model with zero search costs or the hard satisficing model: subjects do not look at all items in a trial (Figure 2B) and do not consistently choose the item they looked at last (Figure 2D). However, subjects' choices were generally in line with the assumptions of the PSM and the two accumulator models.
We also probed the behavioural association of gaze allocation and choice. To this end, we utilized a previously proposed measure of gaze influence, which quantifies the average increase in choice probability for the item that is looked at longest in each trial. We found that all subjects exhibited positive values on this measure in all set sizes (Figure 2E; with values ranging from 1.7% to 75%) and that it increased with set size (Figure 2E), indicating an overall positive association between gaze allocation and choice.
To further discriminate between the evidence accumulation and probabilistic satisficing models, we fitted them to each subject’s choice and RT data for each set size and compared their fit by means of the widely applicable information criterion (WAIC).
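For readers unfamiliar with the WAIC, it can be computed from a matrix of pointwise log-likelihoods over posterior samples; the sketch below uses the standard log-pointwise-predictive-density formulation with a variance-based penalty, on the log-score scale where larger values indicate better expected fit:

```python
import math
from statistics import variance

def waic(log_lik):
    """WAIC on the log-score (expected log pointwise predictive
    density) scale, so larger values indicate better expected fit.
    log_lik[s][n] is the log-likelihood of data point n under
    posterior sample s."""
    n_samples = len(log_lik)
    n_points = len(log_lik[0])
    lppd = 0.0    # log pointwise predictive density
    p_waic = 0.0  # variance-based effective-parameter penalty
    for n in range(n_points):
        lls = [log_lik[s][n] for s in range(n_samples)]
        m = max(lls)  # log-sum-exp for numerical stability
        lppd += m + math.log(sum(math.exp(l - m) for l in lls) / n_samples)
        p_waic += variance(lls)
    return lppd - p_waic
```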
Figure 3. Relative model fit. (A–D) Individual WAIC values for the probabilistic satisficing model (PSM), independent evidence accumulation model (IAM), and gaze-weighted linear accumulator model (GLAM) for each set size. Model variants with an active influence of gaze are marked with an additional '+'. The WAIC is based on the log-score of the expected pointwise predictive density such that larger values in WAIC indicate better model fit. Violin plots show a kernel density estimate of the distribution of individual values with boxplots inside of them. (E–H) Difference in individual WAIC values for each pair of the active-gaze model variants. Asterisks indicate that the two distributions of WAIC values are meaningfully different in a Mann–Whitney U test with a Bonferroni adjusted alpha level of 0.0042 per test (0.05/12). Colours indicate set sizes.
According to the WAIC, the choices and RTs of the vast majority of subjects were best captured by the model variants with an active account of gaze (82% [40/49], 94% [46/49], 90% [44/49], and 86% [42/49] for 9, 16, 25, and 36 items respectively; Figure 3A–D). Specifically, the PSM+ won the plurality of individual WAIC comparisons in each set size (39% [19/49], 65% [32/49], 47% [23/49], and 51% [25/49] in the sets with 9, 16, 25, and 36 items, respectively), while the plurality of the remaining WAIC comparisons was won by the GLAM+ for 9 and 16 items (29% [14/49] and 16% [8/49] subjects, respectively) and by the IAM+ for 25 and 36 items (22% [11/49] and 24% [12/49] subjects, respectively).
To further test whether there was a winning model for each set size, we performed a comparison of the distributions of individual WAIC values resulting from each of the three active-gaze model variants (Figure 3E–H). This analysis revealed that the WAIC distributions of the PSM+ and GLAM+ were not meaningfully different from one another in any set size, while the PSM+ was meaningfully better than the IAM+ for 16, 25, and 36 items. The GLAM+ was not meaningfully better than the IAM+ in any set size.
Figure 4. Absolute model fit. Predictions of mean RT (A–C), probability of choosing the highest-rated (i.e., best) item (D–F), and gaze influence on choice probability (G–I) by the active-gaze variants of the probabilistic satisficing model (PSM+; A, D, G), independent evidence accumulation model (IAM+; B, E, H), and gaze-weighted linear accumulator model (GLAM+; C, F, I). (A–C) The PSM+ and GLAM+ accurately recover mean RT, while the IAM+ underestimates short and overestimates long mean RTs. (D–F) The PSM+ provides the overall best account of choice accuracy, followed by the GLAM+, and IAM+. (G–I) The PSM+ and IAM+ clearly underestimate strong influences of gaze on choice; the GLAM+ provides the best account of this association and only slightly underestimates strong influences of gaze on choice. Gray lines indicate mixed-effects regression fits of the model predictions (including a random intercept and slope for each set size) and black diagonal lines represent ideal model fit. Model predictions are simulated using parameter estimates obtained from individual model fits. Colours and shapes represent different set sizes, while scatters indicate individual subjects.
Yet, WAIC only tells us about relative model fit. To determine how well each model fit the data in an absolute sense, we simulated data for each individual with each model and regressed the simulated mean RTs, probability of choosing the highest-rated item, and gaze influence on choice probability onto the observed subject values for each of these measures, in a linear mixed-effects regression analysis with one random intercept and slope for each set size (Figure 4). If a model captures the data well, the resulting fixed-effects regression line should have an intercept of 0 and a slope of 1 (as indicated by the black diagonal lines in Figure 4).
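The logic of this recovery check can be illustrated with plain least squares; this sketch deliberately omits the random effects per set size used in the paper's mixed-effects analysis:

```python
def recovery_fit(observed, simulated):
    """Ordinary least squares of simulated onto observed values, a
    simplified stand-in for a mixed-effects regression. A model that
    captures the data perfectly yields intercept 0 and slope 1."""
    n = len(observed)
    mean_x = sum(observed) / n
    mean_y = sum(simulated) / n
    s_xy = sum((x - mean_x) * (y - mean_y)
               for x, y in zip(observed, simulated))
    s_xx = sum((x - mean_x) ** 2 for x in observed)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope  # (intercept, slope)
```

Systematic deviations, such as a slope below 1, would indicate that the model compresses the range of the behavioural measure, e.g., underestimating strong gaze influences.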
The PSM+ and GLAM+ both accurately recovered mean RT (Figure 4A,C), while the IAM+ underestimated short and overestimated long mean RTs (Figure 4B). All three models generally underestimated high probabilities of choosing the highest-rated item from a choice set (Figure 4D–F); the PSM+ provided the overall most accurate account of this metric (Figure 4D), followed by the GLAM+ (Figure 4F) and the IAM+ (Figure 4E).
Turning to the gaze data, the PSM+ and IAM+ both slightly overestimated weak associations between gaze and choice while clearly underestimating stronger associations between them (Figure 4G–H). The GLAM+, in contrast, only slightly underestimated strong associations of gaze and choice (Figure 4I).
In sum, this work shows that subjects' behaviour is not qualitatively well captured by optimal choice or by standard instantiations of satisficing. Once active effects of gaze were incorporated into a probabilistic version of satisficing, however, the resulting model explained the data well, slightly outperforming the evidence accumulation models in fitting choice and RT data. Yet the relative accumulation model with active gaze influences provided by far the best fit to the observed association between gaze allocation and choice behaviour, which was not explicitly accounted for in the likelihood-based model comparison. Gaze-driven relative evidence accumulation thus provides the most comprehensive account of behaviour in many-alternative choice.