Statistical learning is a mechanism for detecting associations among co-occurring elements in many domains and species. A key controversy is whether it leads to memory for discrete chunks composed of these associated elements, or merely to pairwise associations among elements. Critical evidence for the mere-association view comes from the ``phantom-word'' phenomenon, where learners recognize statistically coherent but unattested items better than actually presented items with weaker internal associations, suggesting that they prioritize pairwise associations over memories for discrete units. However, this phenomenon has been demonstrated only for sequentially presented stimuli, and not for simultaneously presented visual shapes, where some evidence suggests that learners might prioritize discrete units over pairwise associations. Here, I ask whether the phantom-word phenomenon can be observed with simultaneously presented visual shapes. Learners were familiarized with scenes combining two triplets of visual shapes (hereafter ``words''). They were then tested on their recognition of these words vs. part-words (attested items with weaker internal associations), of phantom-words (unattested items with strong internal associations) vs. part-words, and of words vs. phantom-words. Learners preferred both words and phantom-words over part-words and showed no preference for words over phantom-words. These results suggest that, as for sequentially presented elements, statistical learning in simultaneously presented shapes leads primarily to pairwise associations rather than to memories for discrete chunks. However, because the preference for words over part-words was slightly higher in some analyses than the preference for phantom-words over part-words, the results do not rule out that learners might also have a limited sensitivity to frequency of occurrence.