Physics Colloquium with Xaq Pitkow on Rational Thoughts in Neural Codes

Xaq Pitkow of Rice University (hosted by Wessel) will present the colloquium "Rational Thoughts in Neural Codes."

Complex behaviors are often driven by an internal model, which integrates sensory information over time and facilitates long-term planning to reach subjective goals. We interpret behavioral data by assuming an agent behaves rationally; that is, it takes actions that optimize its subjective reward according to its understanding of the task and its relevant causal variables. We apply a new method, Inverse Rational Control (IRC), to learn an agent's internal model and reward function by maximizing the likelihood of its measured sensory observations and actions. Technically, we model an animal's strategy as the solution to a Partially Observable Markov Decision Process (POMDP), and we invert this model to find the task parameters and subjective costs with maximum likelihood. This generalizes both Inverse Reinforcement Learning and Inverse Optimal Control. Our mathematical formulation thereby extracts rational, interpretable thoughts of the agent from its behavior.
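To make the inversion idea concrete, here is a toy sketch of the IRC logic on an invented two-state task: an agent updates a Bayesian belief using an assumed cue reliability θ and acts through a softmax policy; the scientist then recovers θ by maximizing the likelihood of the agent's actions. This is a minimal illustration under stated assumptions, not the authors' implementation; the task, the softmax policy, and all names (`simulate_agent`, `log_likelihood`, `theta`, `beta`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_agent(theta, n_trials=500, t_steps=5, true_p=0.8, beta=5.0):
    """Agent sees noisy binary cues about a hidden binary world state,
    updates a belief with its ASSUMED cue reliability theta (which may
    differ from the true reliability true_p), then acts via softmax."""
    data = []
    for _ in range(n_trials):
        s = rng.integers(2)                      # hidden world state
        b = 0.5                                  # belief: P(state = 1)
        obs_seq = []
        for _ in range(t_steps):
            o = int(rng.random() < (true_p if s else 1 - true_p))
            obs_seq.append(o)
            lik1 = theta if o else 1 - theta     # P(o | state=1) under agent's model
            lik0 = (1 - theta) if o else theta   # P(o | state=0)
            b = lik1 * b / (lik1 * b + lik0 * (1 - b))
        p_act1 = 1 / (1 + np.exp(-beta * (2 * b - 1)))
        a = int(rng.random() < p_act1)
        data.append((obs_seq, a))
    return data

def log_likelihood(theta, data, beta=5.0):
    """Log-likelihood of the observed actions under a candidate theta:
    the 'inverse' step, replaying the agent's belief dynamics."""
    ll = 0.0
    for obs_seq, a in data:
        b = 0.5
        for o in obs_seq:
            lik1 = theta if o else 1 - theta
            lik0 = (1 - theta) if o else theta
            b = lik1 * b / (lik1 * b + lik0 * (1 - b))
        p_act1 = 1 / (1 + np.exp(-beta * (2 * b - 1)))
        ll += np.log(p_act1 if a else 1 - p_act1)
    return ll

# Simulate a suboptimal agent (assumed reliability 0.7, true reliability 0.8),
# then grid-search the theta that maximizes the action likelihood.
data = simulate_agent(theta=0.7)
grid = np.linspace(0.55, 0.95, 9)
theta_hat = grid[np.argmax([log_likelihood(t, data) for t in grid])]
print(theta_hat)
```

Note that the recovered parameter is the agent's subjective model, not the true world statistics: even though the cues are 80% reliable, the maximum-likelihood fit lands near the 0.7 the agent actually used, which is the point of the method.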

The thoughts imputed to the animal can then serve as latent targets for neural analyses. Using these targets, we provide a framework for interpreting the linked processes of encoding, recoding, and decoding of neural data in light of the rational model for behavior. When applied to behavioral and neural data from simulated agents performing suboptimally on a naturalistic foraging task, this method successfully recovers their internal model and reward function, as well as the computational dynamics within the neural manifold that represents the task. Overall, this approach may identify explainable structure in complex neural activity patterns, and thereby lay a foundation for discovering how the brain represents and computes with dynamic beliefs.

Image from PNAS. Graphical model of a POMDP. Open circles denote latent variables, and solid circles denote observable variables. For the POMDP, the agent knows its beliefs but must infer the world state. For IRC, the scientist knows the world state but must infer the beliefs. The real-world dynamics depend on parameters ϕ, while the belief dynamics and actions of the agent depend on parameters θ, which include both its assumptions about the stochastic world dynamics and observations and its own subjective rewards and costs.

Register to attend the colloquium through Zoom.