Cross-modal retrieval (CMR) is the task of retrieving semantically related items across different modalities. For example, given an image query, the task is to retrieve relevant text descriptions or audio clips. One of the major challenges in CMR is the modality gap: the features and representations used to encode information differ from one modality to another. To address the modality gap, researchers have developed techniques such as joint embedding, in which features from different modalities are mapped to a common embedding space where they can be compared directly. Items in this space can be represented in two ways, binary-valued or real-valued. A binary-valued representation is a discrete representation that encodes each item using only the values 0 and 1. A real-valued representation, on the other hand, encodes each item as a vector of real numbers. Both types of representation have advantages and disadvantages, and researchers continue to explore new techniques for generating representations that improve the performance of CMR systems. For the first time, the work presented here generates both representations and compares them through experiments on standard benchmark datasets, using mean average precision (MAP) as the evaluation metric. The results suggest that real-valued representations outperform binary-valued representations in terms of MAP, especially when the data is complex and high-dimensional. On the other hand, binary codes are more memory-efficient than real-valued embeddings and can be computed much faster. Moreover, binary codes can be easily stored and transmitted, making them more suitable for large-scale retrieval tasks.
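The comparison between binary-valued and real-valued representations under MAP can be illustrated with a minimal sketch. It is not the method of this work. It assumes that query and database embeddings already lie in a common space, binarizes them by thresholding at zero (an assumption made only for illustration), ranks real-valued vectors by cosine similarity and binary codes by Hamming distance, and scores each ranking with MAP. All function names and the toy data are hypothetical.

```python
import numpy as np


def average_precision(relevant):
    """Average precision for one query; `relevant` is a 0/1 sequence in ranked order."""
    relevant = np.asarray(relevant, dtype=float)
    if relevant.sum() == 0:
        return 0.0
    hits = np.cumsum(relevant)                      # relevant items seen up to each rank
    ranks = np.arange(1, len(relevant) + 1)
    precision_at_k = hits / ranks
    return float((precision_at_k * relevant).sum() / relevant.sum())


def mean_average_precision(scores, query_labels, db_labels):
    """scores[i, j]: higher means database item j is a better match for query i."""
    aps = []
    for i in range(scores.shape[0]):
        order = np.argsort(-scores[i], kind="stable")   # best match first
        relevant = db_labels[order] == query_labels[i]  # same label counts as relevant
        aps.append(average_precision(relevant))
    return float(np.mean(aps))


def cosine_scores(queries, database):
    """Real-valued comparison: cosine similarity between every query and database item."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    d = database / np.linalg.norm(database, axis=1, keepdims=True)
    return q @ d.T


def binarize(x):
    """Illustrative binarization (an assumption): 1 where the value is positive, else 0."""
    return (x > 0).astype(np.uint8)


def hamming_scores(query_codes, db_codes):
    """Binary comparison: negative Hamming distance, so that higher is better."""
    dist = (query_codes[:, None, :] != db_codes[None, :, :]).sum(axis=2)
    return -dist


if __name__ == "__main__":
    # Hypothetical toy data: two "modalities" simulated as noisy views of class prototypes.
    rng = np.random.default_rng(0)
    prototypes = rng.normal(size=(5, 32))
    db_labels = rng.integers(0, 5, size=200)
    query_labels = rng.integers(0, 5, size=50)
    database = prototypes[db_labels] + rng.normal(size=(200, 32))
    queries = prototypes[query_labels] + rng.normal(size=(50, 32))

    real_map = mean_average_precision(cosine_scores(queries, database), query_labels, db_labels)
    bin_map = mean_average_precision(
        hamming_scores(binarize(queries), binarize(database)), query_labels, db_labels
    )
    print(f"real-valued MAP: {real_map:.3f}")
    print(f"binary MAP:      {bin_map:.3f}")
```

In this sketch a 32-dimensional real-valued vector stored as 64-bit floats occupies 256 bytes, whereas the corresponding 32-bit binary code fits in 4 bytes once packed, which is the memory trade-off noted above.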