We investigated the structure of image features that support human object recognition using a novel 2-AFC form coherence paradigm. Grayscale images of everyday objects were analyzed with a multi-scale bank of Gabor-wavelet filters whose responses defined the positions, orientations and phases of Gabor patches that were used to reconstruct a facsimile of the original image. Signal Gabors were assigned the parameters of the original image, whereas noise Gabors were assigned random positions, leaving the other parameters, and therefore the overall amplitude spectrum, unchanged. Observers were shown the reconstructed, 100% signal image and were then required to discriminate a target image containing a proportion of signal elements from one containing only noise elements. A staircase determined the proportion of signal elements required for correct identification on 75% of trials. We used the statistics of the original image to determine which elements were designated signal and which were designated noise in seven conditions. Signal elements were selected at random or from areas where local orientation variability, density or luminance contrast was either high or low in the original scene. Thresholds were the same for the random, orientation variability and density conditions, but were significantly lower for the high contrast condition and significantly higher for the low contrast condition. Importantly, this effect of contrast held whether the contrast of the Gabors in the reconstructed scene was fixed at a single value or followed the contrast of the original scene. This means that recognition performance is determined by the feature structure in the high-contrast regions of the original scene and not by the high-contrast elements of the experimental image.
These results show that, in general, image identification depends on specific relationships among the local features that define natural scenes and not on basic statistical measures such as feature density, variability or the contrast values of individual features.
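The signal/noise manipulation described above can be illustrated with a minimal sketch covering only the random-selection condition. The `Gabor` record, the `make_stimulus` function, the `scale` field and the uniform sampling of noise positions over a square image are all hypothetical choices made for illustration; they are not taken from the study's actual implementation.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Gabor:
    """One Gabor element (hypothetical record; field names are assumptions)."""
    x: float
    y: float
    orientation: float
    phase: float
    scale: float
    is_signal: bool = True


def make_stimulus(elements, signal_proportion, image_size, rng):
    """Keep a random proportion of elements as signal and scatter the rest.

    Noise elements receive a new random position; orientation, phase and
    scale are left untouched, so the set of element parameters other than
    position is identical to that of the 100% signal image.
    """
    n_signal = round(signal_proportion * len(elements))
    signal_idx = set(rng.sample(range(len(elements)), n_signal))
    stimulus = []
    for i, g in enumerate(elements):
        if i in signal_idx:
            stimulus.append(g)  # signal: original parameters preserved
        else:
            stimulus.append(
                replace(
                    g,
                    x=rng.uniform(0, image_size),  # assumed uniform scatter
                    y=rng.uniform(0, image_size),
                    is_signal=False,
                )
            )
    return stimulus


# Usage: 100 elements on a 256-pixel image, 25% signal.
rng = random.Random(0)
elements = [
    Gabor(x=float(i), y=float(i), orientation=i * 1.8, phase=0.0, scale=1.0)
    for i in range(100)
]
stimulus = make_stimulus(elements, 0.25, 256.0, rng)
```

In the other conditions, the random draw of `signal_idx` would be replaced by a selection based on a local image statistic (orientation variability, density or luminance contrast), and a staircase would adjust `signal_proportion` from trial to trial.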