Learning abstract rules in the auditory and visual domains is customarily investigated with the AAB vs. ABB paradigm, in which each scene contains three auditory events or visual objects, and either the identity or an attribute of these items, such as the size of the objects, follows a same-same-different (i.e., AAB) pattern during a training period. In a subsequent test session, never-before-seen items are used, and subjects’ preference to judge AAB over ABB arrangements as familiar is taken as evidence that they have acquired the abstract rule. We asked whether 2D retinal or 3D perceptual size is the basis of this judgment in the case of visual rule learning of size arrangements. We used three triplets of colored 3D computer-graphics objects arranged in perspective so that, by physical extent on the screen, they followed a large-large-small (AAb) template, but due to perspective their perceptual appearance was small-large-large (aBB). After 2 minutes of randomly ordered sequential presentation of the triplets, for 2 s each and without any explicit task, two tests were administered with two versions of the instruction. In the first test (No Context), context and perspective were removed and the triplets were presented horizontally on a white background; in the second (Context), exactly the same context was used as during training. The instruction was either “choose the more familiar scene” (Naïve) or “considering size, choose the more familiar scene” (Cued). In the Naïve-No Context condition, subjects showed no preference between AAb and aBB, which changed to a significant aBB preference in the Cued-No Context condition. In the Cued-Context condition, subjects showed a strong aBB preference. In the Naïve-Context condition, however, they switched to a significant AAb preference. Thus, size-rule coding seems to rely on high-level perceptual coding of size when subjects are explicitly directed to size, whereas in implicit familiarity tasks the more veridical retinal coding has a stronger influence.
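The dissociation between on-screen and perceived size can be illustrated with a minimal sketch. It assumes a simple pinhole projection, in which on-screen extent is proportional to physical size divided by viewing distance. The object sizes, distances, focal length, and the helper names `projected_size` and `pattern` are all hypothetical and chosen only for illustration; they are not the stimulus parameters used in the study.

```python
import math

# Hypothetical values only: a pinhole-projection sketch, not the actual stimuli.
FOCAL = 1.0  # arbitrary focal length (assumed)

# (physical 3D size, viewing distance) for the three objects in one triplet.
# The first object is small and near; the other two are large and farther away.
triplet = [(1.0, 2.0), (2.0, 4.0), (2.0, 8.0)]


def projected_size(size, distance, focal=FOCAL):
    """On-screen (2D retinal) extent of an object under pinhole projection."""
    return focal * size / distance


def pattern(sizes):
    """Classify three sizes as 'AAB', 'ABB', or 'other' by neighbor equality."""
    first, second, third = sizes
    first_two_same = math.isclose(first, second)
    last_two_same = math.isclose(second, third)
    if first_two_same and not last_two_same:
        return "AAB"
    if last_two_same and not first_two_same:
        return "ABB"
    return "other"


# 2D extents on the screen: large, large, small (the AAb template).
retinal_sizes = [projected_size(s, d) for s, d in triplet]
# 3D sizes a size-constant observer would perceive: small, large, large (aBB).
perceived_sizes = [s for s, _ in triplet]

print(retinal_sizes, pattern(retinal_sizes))
print(perceived_sizes, pattern(perceived_sizes))
```

Under these assumed values the same triplet is classified as AAB by its screen extents and as ABB by its 3D sizes, which is the conflict the two test conditions are designed to probe.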