This thesis investigates novel algorithms for image representation that address two important areas of computer vision: content-based image retrieval (CBIR) and scene recognition. CBIR can be divided into two types, instance-level retrieval and category-level retrieval; this thesis addresses the former. Motivated by our joint work with INCIBE, we build deep learning-based systems that can help Law Enforcement Agencies match evidence in crime scene investigations, among a wide range of other applications. In particular, we propose two algorithms for CBIR, one based on the colour description of objects and the other on the texture description of image patches, and a third algorithm for scene prediction and retrieval that relies on the combination of local and global scene content.

CBIR for instance-level retrieval aims to retrieve, from an image or video database, the images that contain the same object or scene as the one depicted in a query image. We introduce two algorithms for this task that gain robustness against colour and texture variations, respectively. On the one hand, we propose colour neural descriptors composed of convolutional neural network (CNN) features obtained by combining different colour spaces and colour channels. In contrast to previous works, which rely on fine-tuning pre-trained networks, we compute the proposed descriptors from the activations of a pre-trained CNN without fine-tuning. We also take advantage of an object detector to optimize the proposed instance retrieval architecture so that it generates features at both local and global scales. In addition, we introduce a stride-based query expansion technique to retrieve objects from multi-view datasets. 
Finally, we experimentally demonstrate that the proposed colour neural descriptors obtain state-of-the-art results on the Paris 6K, Revisiting-Paris 6k, INSTRE-M and COIL-100 datasets, with mean average precision of 81.70%, 82.02%, 78.8% and 97.9%, respectively.

On the other hand, we focus on the texture properties of images. In crime scene investigations, some clues may come from texture patches of images that carry little information about the object contour, such as a t-shirt lying on the floor. To characterize such images, texture patterns are the prime cues for [...]

[...] on three public datasets: MIT-67 Indoor, NYU-v2 and Hotels-50k. The accuracy achieved (94.5% on MIT-67 Indoor, 74.5% on NYU-v2, and a top-1 accuracy of 10.1% without occlusion and 7.8% with medium occlusion on Hotels-50k) demonstrated the effectiveness of the proposed method, which also significantly outperforms existing state-of-the-art approaches.

This thesis contributes to the development of methods for creating descriptors that are robust to changes in colour, texture and viewpoint, and presents frameworks for using them in CBIR and scene recognition systems.