High-throughput neuroscience presents unique challenges for exploratory data analysis. Clustering often helps experimenters make sense of data, but model-based clustering techniques, including Dirichlet-process mixture models, have difficulty when different subsets of dimensions are best explained by different clusterings. As a result, they can be misled by irrelevant dimensions, they easily miss structure that dimensionality-reduction methods find, and they often predict less accurately than discriminative alternatives. We introduce cross-categorization, a modeling technique for heterogeneous, high-dimensional tabular data that addresses these limitations. Based on an efficient Markov chain Monte Carlo (MCMC) inference scheme for a novel nonparametric Bayesian model, cross-categorization infers which groups of dimensions share a common generative history and are therefore mutually predictive.