The Limits of Pure Exploration in POMDPs: When the Observation Entropy is Enough
TL;DR: In POMDPs, pure exploration over latent states can be addressed by looking at observations only, and the mismatch this induces is far from hopeless. We show when this is the case and give a simple way to overcome the possible (structural) limitations, assuming that at least the observation model is known. https://arxiv.org/pdf/2406.12795
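As a rough illustration of the mismatch between observation entropy and latent-state entropy, the sketch below pushes a latent state distribution through a known observation model and compares the two Shannon entropies. It is a toy example under stated assumptions, not the method of the paper: the function names, the three-state distribution `d_s`, and both observation matrices are hypothetical values chosen for illustration.

```python
import numpy as np


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution, skipping zero entries."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return float(-np.sum(nz * np.log(nz)))


def observation_distribution(d_s, obs_model):
    """Distribution over observations induced by a latent state distribution.

    obs_model[s, o] is assumed to hold P(o | s), so each row sums to 1.
    """
    return np.asarray(d_s, dtype=float) @ np.asarray(obs_model, dtype=float)


# Hypothetical latent state distribution over three states.
d_s = np.array([0.7, 0.2, 0.1])

# Fully informative observations: each state emits its own observation.
identity_model = np.eye(3)

# Noisy observations: the matching observation is emitted with probability 0.8.
noisy_model = np.array([
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
])

d_o_noisy = observation_distribution(d_s, noisy_model)

# Gap between observation entropy and latent-state entropy in each case.
gap_identity = entropy(observation_distribution(d_s, identity_model)) - entropy(d_s)
gap_noisy = entropy(d_o_noisy) - entropy(d_s)

print("gap with identity model:", gap_identity)
print("gap with noisy model:", gap_noisy)
```

With the identity model the two entropies coincide, so maximizing one maximizes the other. With the noisy model they differ, and the size of that gap depends on the observation matrix, which is why knowing the observation model is a natural assumption when trying to correct for the mismatch.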