Have I done enough planning or should I plan more?

People’s decisions about how to allocate their limited computational resources are essential to human intelligence. An important component of this metacognitive ability is deciding whether to continue thinking about what to do or to move on to the next decision. Here, we show that people acquire this ability through learning, and we reverse-engineer the underlying learning mechanisms. Using a process-tracing paradigm that externalises human planning, we find that people quickly adapt how much planning they perform to the cost and benefit of planning. To discover the underlying metacognitive learning mechanisms, we augmented a set of reinforcement learning models with metacognitive features and performed Bayesian model selection. Our results suggest that the metacognitive ability to adjust the amount of planning might be learned through a policy-gradient mechanism that is guided by metacognitive pseudo-rewards that communicate the value of planning.

https://arxiv.org/abs/2201.00764

While current trends in artificial intelligence depend on the exponential increase in computational power, the human mind’s cognitive resources are much more limited. How, then, are people nevertheless able to outperform computers on a wide range of difficult real-world tasks? One critical capacity that enables people to do more with less computation is meta-reasoning, that is, reasoning about reasoning [11]. In the context of planning, this means making intelligent decisions about when and how to plan, and thereby whether and how to allocate computational resources. In AI research, optimal metareasoning is often regarded as intractable [27]. This raises the question of how people are able to solve the apparently intractable metareasoning problem despite their limited computational resources. One intriguing possibility is that people learn an approximate solution through trial and error. This idea is known as metacognitive reinforcement learning [18, 17, 21].
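
To make the trial-and-error idea concrete, here is a minimal sketch of a policy-gradient (REINFORCE) learner that adjusts how many planning steps it takes in response to the cost of planning. It is not the model from the paper: the payoff function, the cost values, the function names, and all parameters are assumptions chosen for illustration, and the metacognitive pseudo-rewards mentioned in the abstract are left out.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def learn_planning_amount(cost_per_step, max_steps=5, episodes=5000,
                          learning_rate=0.02, seed=0):
    """Learn how many planning steps to take using REINFORCE.

    Returns the expected number of planning steps under the learned policy.
    """
    rng = random.Random(seed)
    prefs = [0.0] * (max_steps + 1)  # one preference per planning amount
    baseline = 0.0                   # running average of the return

    for _ in range(episodes):
        probs = softmax(prefs)
        k = rng.choices(range(max_steps + 1), weights=probs)[0]

        # Hypothetical environment (an assumption, not from the paper):
        # each extra planning step helps less than the previous one,
        # and every step has a fixed cost.
        payoff = 10.0 * (1.0 - 0.5 ** k) + rng.gauss(0.0, 1.0)
        ret = payoff - cost_per_step * k

        advantage = ret - baseline
        baseline += 0.1 * (ret - baseline)

        # Policy-gradient step for a softmax policy:
        # d log pi(k) / d prefs[a] = 1[a == k] - probs[a]
        for a in range(max_steps + 1):
            indicator = 1.0 if a == k else 0.0
            prefs[a] += learning_rate * advantage * (indicator - probs[a])

    final = softmax(prefs)
    return sum(a * p for a, p in enumerate(final))


if __name__ == "__main__":
    for cost in (0.5, 4.0):
        steps = learn_planning_amount(cost_per_step=cost)
        print(f"cost per step {cost}: about {steps:.2f} planning steps")
```

In this toy setting the learner should settle on more planning when planning is cheap and less when it is expensive, which is the qualitative pattern the abstract reports for people.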

According to previous research, the brain is equipped with multiple decision systems that interact in a variety of ways [8, 7]. The model-based system, in contrast to Pavlovian and model-free systems, allows for flexible reasoning about which action is preferable but demands a process for deciding which information should be considered in a given decision. Therefore, an important part of deciding how to decide is to efficiently balance decision quality and decision time given a huge amount of information. This is known as meta-decision-making [2].

Previous research has used this concept to describe how people learn to choose between different cognitive strategies [9, 24, 18], how many steps to plan ahead [17], when to exercise how much cognitive control [21] and how people learn which information to consider [13].

Ryan Watkins

Professor at George Washington University

I am a Professor with the Human-Technology Collaboration and Educational Technology programs at George Washington University in Washington DC. I have written 12 books and more than 100 articles, and I co-host the Parsing Science podcast, where scientists tell the stories behind their research. I am also the developer of the WeShareScience.com online platform for sharing research videos, and of SciencePods.com, where researchers can create free podcasts about their science. My research interests include human interactions with intelligent machines, needs, needs assessments, and instructional design.