Recent studies, employing for example the roving paradigm, have established that perceptual learning (PL) is substantially influenced by the statistical structure of the task conditions. However, existing computational models, based primarily on feedforward architectures, struggle to account for learning effects that depend on higher-level statistical contingencies embedded in the structure of the task. We propose a Bayesian framework that uses contextual inference to represent multiple learning contexts simultaneously, together with their corresponding stimuli. The model infers the extent to which each reference-context might have contributed to a given trial and gradually learns the transitions between reference-contexts from experience. In turn, correct inference of the current reference-context supports efficient allocation of neural resources for encoding the stimuli expected to occur in that context. This reallocation of resources maximizes discrimination performance and strongly modulates PL. Our model not only parsimoniously reconciles previously unexplained roving effects observed in PL studies but also provides new predictions for learning and generalization. These results demonstrate that statistical learning and its higher-level generalization, structure learning, form a functional symbiosis with lower-level perceptual learning processes.
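
As a minimal sketch of the contextual-inference step, under assumed notation (the symbols below are ours, not necessarily the paper's): let $c_t$ denote the reference-context active on trial $t$, $x_t$ the observed stimulus, and $T_{ij}$ the learned probability of transitioning from context $i$ to context $j$. Standard Bayesian filtering would then give the posterior responsibility of each context on the current trial,
\[
p(c_t = j \mid x_{1:t}) \;\propto\; p(x_t \mid c_t = j) \sum_{i} T_{ij}\, p(c_{t-1} = i \mid x_{1:t-1}),
\]
with the transition estimates $T_{ij}$ themselves updated from experience, for example via Dirichlet-multinomial counts over inferred context sequences.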