The authors investigated how human adults encode and remember parts of multielement scenes composed of recursively embedded visual shape combinations. The authors found that shape combinations that are parts of larger configurations are less well remembered than shape combinations of the same kind that are not embedded.
Combined with basic echanisms of statistical learning, this embeddedness constraint enables the development of complex new features for acquiring internal representations efficiently without being computationally intractable. The resulting representations also encode parts and wholes by chunking the visual input into components according to the statistical coherence of their constituents. These results suggest that a bootstrapping approach of constrained statistical learning offers a unified framework for investigating the formation of different internal representations in pattern and scene perception.