While current trends in artificial intelligence depend on exponential increases in computational power, the human mind’s cognitive resources are much more limited. How, then, are people nevertheless able to outperform computers on a wide range of difficult real-world tasks? One critical capacity that enables people to do more with less computation is metareasoning, that is, reasoning about reasoning. In the context of planning, this means making intelligent decisions about when and how to plan, and thereby about whether and how to allocate computational resources. In AI research, optimal metareasoning is often regarded as intractable. This raises the question of how people are able to solve the apparently intractable metareasoning problem despite their limited computational resources. One intriguing possibility is that people learn an approximate solution through trial and error. This idea is known as metacognitive reinforcement learning [18, 17, 21].
According to previous research, the brain is equipped with multiple decision systems that interact in a variety of ways [8, 7]. The model-based system, in contrast to the Pavlovian and model-free systems, allows for flexible reasoning about which action is preferable, but it requires a process for deciding which information should be considered in a given decision. An important part of deciding how to decide is therefore to balance decision quality against decision time efficiently when a large amount of information is available. This is known as meta-decision-making.
Previous research has used this concept to describe how people learn to choose between different cognitive strategies [9, 24, 18], how many steps to plan ahead, when and how much cognitive control to exercise, and which information to consider.