Have I done enough planning or should I plan more?

People’s decisions about how to allocate their limited computational resources are essential to human intelligence. An important component of this metacognitive ability is deciding whether to continue thinking about what to do or to move on to the next decision. Here, we show that people acquire this ability through learning, and we reverse-engineer the underlying learning mechanisms. Using a process-tracing paradigm that externalises human planning, we find that people quickly adapt how much planning they perform to the cost and benefit of planning. To discover the underlying metacognitive learning mechanisms, we augmented a set of reinforcement learning models with metacognitive features and performed Bayesian model selection. Our results suggest that the metacognitive ability to adjust the amount of planning might be learned through a policy-gradient mechanism that is guided by metacognitive pseudo-rewards that communicate the value of planning.



While current trends in artificial intelligence depend on the exponential increase in computational power, the human mind’s cognitive resources are much more limited. How, then, are people nevertheless able to outperform computers on a wide range of difficult real-world tasks? One critical capacity that enables people to do more with less computation is metareasoning, that is, reasoning about reasoning [11]. In the context of planning, this means making intelligent decisions about when and how to plan, and thereby about whether and how to allocate computational resources. In AI research, optimal metareasoning is often regarded as intractable [27]. This raises the question of how people are able to solve the apparently intractable metareasoning problem despite their limited computational resources. One intriguing possibility is that people learn an approximate solution through trial and error. This idea is known as metacognitive reinforcement learning [18, 17, 21].
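The trial-and-error idea can be illustrated with a toy sketch. This is not the authors' model and it leaves out their metacognitive pseudo-rewards; it is a plain REINFORCE-style policy-gradient learner that only decides whether to take another planning step or to stop. The payoff function (diminishing returns of planning minus a fixed cost per step), the function names, and all parameter values are assumptions made up for this illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def episode_return(n_steps, cost, benefit=10.0):
    """Hypothetical payoff: planning has diminishing returns, each step has a fixed cost."""
    return benefit * (1.0 - 0.5 ** n_steps) - cost * n_steps


def learn_continue_probability(cost, episodes=5000, lr=0.01, max_steps=50, seed=0):
    """Learn P(continue planning) by trial and error with a REINFORCE update."""
    rng = random.Random(seed)
    theta = 0.0     # logit of P(continue); the learner starts at 0.5
    baseline = 0.0  # running average of returns, used to reduce variance
    for _ in range(episodes):
        p = sigmoid(theta)
        n_steps = 0
        grad_log_pi = 0.0  # sum of d/dtheta log pi over this episode's decisions
        while n_steps < max_steps:
            if rng.random() < p:
                n_steps += 1
                grad_log_pi += 1.0 - p  # derivative of log(p) w.r.t. theta
            else:
                grad_log_pi -= p        # derivative of log(1 - p) w.r.t. theta
                break
        ret = episode_return(n_steps, cost)
        theta += lr * (ret - baseline) * grad_log_pi
        baseline += 0.05 * (ret - baseline)
    return sigmoid(theta)


if __name__ == "__main__":
    for cost in (0.1, 3.0):
        p = learn_continue_probability(cost)
        print(f"cost per planning step = {cost}: learned P(continue) = {p:.2f}")
```

Under these assumptions, the learner trained with cheap planning steps should end up with a higher probability of continuing to plan than the learner trained with expensive ones, which is the qualitative pattern of adapting the amount of planning to its cost and benefit.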


According to previous research, the brain is equipped with multiple decision systems that interact in a variety of ways [8, 7]. The model-based system, in contrast to Pavlovian and model-free systems, allows for flexible reasoning about which action is preferable but demands a process for deciding which information should be considered in a given decision. Therefore, an important part of deciding how to decide is to efficiently balance decision quality and decision time given a huge amount of information. This is known as meta-decision-making [2].


Previous research has used this concept to describe how people learn to choose between different cognitive strategies [9, 24, 18], how many steps to plan ahead [17], when and how much cognitive control to exercise [21], and which information to consider [13].