PMID- 37645053
OWN - NLM
STAT- PubMed-not-MEDLINE
LR  - 20240216
IS  - 2331-8422 (Electronic)
IS  - 2331-8422 (Linking)
DP  - 2023 Aug 15
TI  - Planning to Learn: A Novel Algorithm for Active Learning during Model-Based Planning.
LID - arXiv:2308.08029v1
AB  - Active Inference is a recently developed framework for modeling decision processes under uncertainty. Over the last several years, empirical and theoretical work has begun to evaluate the strengths and weaknesses of this approach and how it might be extended and improved. One recent extension is the "sophisticated inference" (SI) algorithm, which improves performance on multi-step planning problems through a recursive decision tree search. However, little work to date has compared SI to other established planning algorithms in reinforcement learning (RL). In addition, SI was developed with a focus on inference as opposed to learning. The present paper therefore has two aims. First, we compare the performance of SI to Bayesian RL schemes designed to solve similar problems. Second, we present and compare an extension of SI - sophisticated learning (SL) - that more fully incorporates active learning during planning. SL maintains beliefs about how model parameters would change under the future observations expected under each policy. This allows a form of counterfactual retrospective inference in which the agent considers what could be learned from current or past observations given different future observations. To accomplish these aims, we make use of a novel, biologically inspired environment that requires an optimal balance between goal-seeking and active learning, and which was designed to highlight the problem structure for which SL offers a unique solution. This setup requires an agent to continually search an open environment for available (but changing) resources in the presence of competing affordances for information gain. Our simulations demonstrate that SL outperforms all other algorithms in this context - most notably, Bayes-adaptive RL and upper confidence bound (UCB) algorithms, which aim to solve multi-step planning problems using similar principles (i.e., directed exploration and counterfactual reasoning about belief updates given different possible actions/observations). These results provide added support for the utility of Active Inference in solving this class of biologically relevant problems and offer additional tools for testing hypotheses about human cognition.
FAU - Hodson, Rowan
AU  - Hodson R
AD  - Laureate Institute for Brain Research, Tulsa, OK, USA.
FAU - Bassett, Bruce
AU  - Bassett B
AD  - University of Cape Town, South Africa.
AD  - African Institute for Mathematical Sciences, Muizenberg, Cape Town.
AD  - South African Astronomical Observatory, Observatory, Cape Town.
FAU - van Hoof, Charel
AU  - van Hoof C
AD  - Delft University of Technology, Department of Cognitive Robotics.
FAU - Rosman, Benjamin
AU  - Rosman B
AD  - University of the Witwatersrand, South Africa.
FAU - Solms, Mark
AU  - Solms M
AD  - University of Cape Town, South Africa.
FAU - Shock, Jonathan P
AU  - Shock JP
AD  - University of Cape Town, South Africa.
AD  - INRS, Montreal, Canada.
FAU - Smith, Ryan
AU  - Smith R
AD  - Laureate Institute for Brain Research, Tulsa, OK, USA.
LA  - eng
GR  - P20 GM121312/GM/NIGMS NIH HHS/United States
PT  - Preprint
DEP - 20230815
PL  - United States
TA  - ArXiv
JT  - ArXiv
JID - 101759493
PMC - PMC10462173
EDAT- 2023/08/30 06:48
MHDA- 2023/08/30 06:49
PMCR- 2023/08/15
CRDT- 2023/08/30 03:46
PHST- 2023/08/30 06:48 [pubmed]
PHST- 2023/08/30 06:49 [medline]
PHST- 2023/08/30 03:46 [entrez]
PHST- 2023/08/15 00:00 [pmc-release]
AID - 2308.08029 [pii]
PST - epublish
SO  - ArXiv [Preprint]. 2023 Aug 15:arXiv:2308.08029v1.
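
Editor's note: the abstract above describes scoring policies by the counterfactual belief updates expected under their predicted observations. The Python sketch below is only a minimal illustration of that general idea under assumed Dirichlet beliefs over outcome probabilities; the function names, the one-step horizon, and the toy reward values are hypothetical and are not the paper's implementation of sophisticated learning.

# Illustrative sketch (not the authors' code): score each action by expected
# extrinsic reward plus the expected information gain from a counterfactual
# Dirichlet belief update under each imagined future observation.
import numpy as np
from scipy.special import digamma, gammaln

def dirichlet_kl(alpha_q, alpha_p):
    """KL(Dir(alpha_q) || Dir(alpha_p)) for 1-D concentration vectors."""
    a_q, a_p = np.sum(alpha_q), np.sum(alpha_p)
    return (gammaln(a_q) - np.sum(gammaln(alpha_q))
            - gammaln(a_p) + np.sum(gammaln(alpha_p))
            + np.sum((alpha_q - alpha_p) * (digamma(alpha_q) - digamma(a_q))))

def expected_info_gain(alpha):
    """Average KL between the counterfactual posterior and the current belief,
    weighted by the current predictive probability of each observation."""
    pred = alpha / alpha.sum()            # predictive distribution over outcomes
    gain = 0.0
    for o, p_o in enumerate(pred):        # imagine each possible future observation
        alpha_post = alpha.copy()
        alpha_post[o] += 1.0              # counterfactual belief update for outcome o
        gain += p_o * dirichlet_kl(alpha_post, alpha)
    return gain

# Toy example: two candidate actions, each with its own Dirichlet belief
# over three possible outcomes (values chosen arbitrarily for illustration).
beliefs = {"action_a": np.array([1.0, 1.0, 1.0]),     # poorly known: high info gain
           "action_b": np.array([20.0, 2.0, 2.0])}    # well known: low info gain
rewards = {"action_a": 0.3, "action_b": 0.4}          # assumed expected rewards

scores = {a: rewards[a] + expected_info_gain(alpha) for a, alpha in beliefs.items()}
print(scores, "->", max(scores, key=scores.get))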