Zero-Shot Deep Learning for Media Mining: Person Spotting and Face Clustering in Video Big Data

Abdallah, Mohamed S.; Kim, HyungWon; Ragab, Mohammad Ehab; Hemayed, Elsayed E.

doi:10.3390/electronics8121394

Cited by 9 publications

(10 citation statements)

References 33 publications

(48 reference statements)

“…The results showed effectiveness on large datasets, with the model outperforming other state-of-the-art approaches on very low-resolution images and images with some disguises. Abdallah et al [16] proposed a zero-shot learning model consisting of 19 CNN layers for person spotting and face clustering in video stream data. The proposed network extracts face feature vectors similar to FaceNet-extracted embeddings from the pre-whitening processed video frames.…”

Section: Related Work (mentioning)

confidence: 99%

Lightweight and Resource-Constrained Learning Network for Face Recognition with Performance Optimization

Deng

Chiang

2020

Sensors

Despite considerable progress in face recognition technology in recent years, deep learning (DL) and convolutional neural networks (CNN) have revealed commendable recognition effects with the advent of artificial intelligence and big data. FaceNet was presented in 2015 and is able to significantly improve the accuracy of face recognition, while also being powerfully built to counteract several common issues, such as occlusion, blur, illumination change, and different angles of head pose. However, not all hardware can sustain the heavy computing load in the execution of the FaceNet model. In applications in the security industry, lightweight and efficient face recognition are two key points for facilitating the deployment of DL and CNN models directly in field devices, due to their limited edge computing capability and low equipment cost. To this end, this paper provides a lightweight learning network improved from FaceNet, which is called FN13, to break through the hardware limitation of constrained computational resources. The proposed FN13 takes the advantage of center loss to reduce the variations of the between-class features and enlarge the difference of the within-class features, instead of the triplet loss by using FaceNet. The resulting model reduces the number of parameters and maintains a high degree of accuracy, only requiring few grayscale reference images per subject. The validity of FN13 is demonstrated by conducting experiments on the Labeled Faces in the Wild (LFW) dataset, as well as an analytical discussion regarding specific disguise problems.

“…Recently, face image clustering has attracted a lot of interest from researchers [7][8][9][10][11]. L. Zhang et al [7] presented a clustering method based on a spectral clustering and selforganizing feature mapping (SOM) neural network.…”

Section: Introduction (mentioning)

confidence: 99%

“…M.S. Abdallah et al [9] proposed a TV media mining system based on a DCNN to rapidly identify a specific individual in real-time processing video data. I. Ahn et al [10] proposed a multiple segmentation technique combined with constrained spectral clustering to label facial images containing objects with complicated boundaries.…”

Section: Introduction (mentioning)

confidence: 99%

Adaptive Facial Imagery Clustering via Spectral Clustering and Reinforcement Learning

Shen

Qian

2021

Applied Sciences

In an era of big data, face images captured in social media and forensic investigations, etc., generally lack labels, while the number of identities (clusters) may range from a few dozen to thousands. Therefore, it is of practical importance to cluster a large number of unlabeled face images into an efficient range of identities or even the exact identities, which can avoid image labeling by hand. Here, we propose adaptive facial imagery clustering that involves face representations, spectral clustering, and reinforcement learning (Q-learning). First, we use a deep convolutional neural network (DCNN) to generate face representations, and we adopt a spectral clustering model to construct a similarity matrix and achieve clustering partition. Then, we use an internal evaluation measure (the Davies–Bouldin index) to evaluate the clustering quality. Finally, we adopt Q-learning as the feedback module to build a dynamic multiparameter debugging process. The experimental results on the ORL Face Database show the effectiveness of our method in terms of an optimal number of clusters of 39, which is almost the actual number of 40 clusters; our method can achieve 99.2% clustering accuracy. Subsequent studies should focus on reducing the computational complexity of dealing with more face images.

“…All wirelessly connected devices collect petabytes of data that allow detecting objects and processes on an unprecedented scale [3]. The most notable examples are automatic face recognition [4], classification of hyperspectral data [5], automatic object detection and classification [6][7][8], remote gesture sensing [9][10][11], wireless detection, and location [12][13][14] -all powered by internet data. Many of these applications are the kind of regression problem for which DNN is the right solution [15].…”

Section: Introduction (mentioning)

confidence: 99%

Darknet on OpenCL: A Multi-platform Tool for Object Detection and Classification

Sowa

Izydorczyk

2020

Preprint

The article’s goal is to overview challenges and problems on the way from the state of the art CUDA accelerated neural networks code to multi-GPU code. For this purpose, the authors describe the journey of porting the existing in the GitHub, fully-featured CUDA accelerated Darknet engine to OpenCL. The article presents lessons learned and the techniques that were put in place to make this port happen. There are few other implementations on the GitHub that leverage the OpenCL standard, and a few have tried to port Darknet as well. Darknet is a well known convolutional neural network (CNN) framework. The authors of this article investigated all aspects of the porting and achieved the fully-featured Darknet engine on OpenCL. The effort was focused not only on the classification with the use of YOLO1, YOLO2, and YOLO3 CNN models. They also covered other aspects, such as training neural networks, and benchmarks to look for the weak points in the implementation. The GPU computing code substantially improves Darknet computing time compared to the standard CPU version by using underused hardware in existing systems. If the system is OpenCL-based, then it is practically hardware independent. In this article, the authors report comparisons of the computation and training performance compared to the existing CUDA-based Darknet engine in the various computers, including single board computers, and, different CNN use-cases. The authors found that the OpenCL version could perform as fast as the CUDA version in the compute aspect, but it is slower in memory transfer between RAM (CPU memory) and VRAM (GPU memory). It depends on the quality of OpenCL implementation only. Moreover, loosening hardware requirements by the OpenCL Darknet can boost applications of DNN, especially in the energy-sensitive applications of Artificial Intelligence (AI) and Machine Learning (ML).
