The brain effectively and accurately encodes dynamic sensory information from different modalities into structured representations by relying on various biases of different complexities. While the ultimate representation is multimodal, the biases used for encoding have been defined at the level of individual modalities, despite a growing body of evidence showing that integration is already present in low-level cortical areas. We hypothesized that, for efficient multimodal encoding, the applied biases should also be similar across modalities. We focused on two modalities, auditory and visual, and measured whether a well-established bias in auditory research, the Iambic-Trochaic Law (ITL), also appears in the visual modality when the core dimensions of processing dynamic information are properly controlled. According to the ITL, when the temporal structure of an auditory stimulus train consists of short (S) and long (L) objects separated by silence (…SSLSSLSSL…), the auditory system tends to rapidly interpret the stream as a repeating SSL pattern rather than any other alternative (e.g., SLS), as confirmed by decreased detection accuracy for short gaps randomly inserted at the borders of the perceived pattern. To test the universality of the ITL, we first used a change detection go/no-go paradigm in a within-subject design (N=18) to assess the effect of simple feature changes on temporal resolution: we manipulated the duration (presentation time) of the alternating “object” (tone/square) and “background” (silence/gray background), as well as object intensity. Next, we used the same paradigm to test the ITL in the two modalities (N=15). We found that although overall performance was lower in vision, biases in sensitivity towards objects and shorter repetitions were shared principles across modalities, and the ITL temporal chunking bias could also be identified in the visual results. These results provide an important initial step toward building a unified computational model of modality-independent sensory perception.
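
To make the stimulus structure concrete, the following minimal Python sketch (not part of the study) builds an …SSL… train as alternating object/background events and marks the background intervals at the borders of the perceived SSL chunks, where the ITL would predict reduced gap-detection accuracy. All durations are placeholder assumptions; the abstract does not report the actual presentation times.

```python
# Illustrative sketch only: assumed durations, not the study's parameters.
SHORT_MS = 100   # assumed duration of a short (S) object
LONG_MS = 300    # assumed duration of a long (L) object
GAP_MS = 150     # assumed duration of the background (silence / gray) interval

def build_ssl_train(n_patterns):
    """Return a flat list of (label, duration_ms) events for n_patterns
    repetitions of the S-S-L pattern, each object followed by a background."""
    events = []
    for _ in range(n_patterns):
        for label, dur in (("S", SHORT_MS), ("S", SHORT_MS), ("L", LONG_MS)):
            events.append(("object_" + label, dur))
            events.append(("background", GAP_MS))
    return events

def chunk_border_indices(events):
    """Indices of background intervals that follow an L object, i.e. the
    borders of the perceived S-S-L chunks; per the ITL, short gaps inserted
    here should be detected less accurately than gaps inside a chunk."""
    return [i for i in range(1, len(events))
            if events[i][0] == "background" and events[i - 1][0] == "object_L"]

if __name__ == "__main__":
    train = build_ssl_train(n_patterns=3)
    print(train)
    print("chunk-border background intervals:", chunk_border_indices(train))
```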