Abstract-A copy-move forgery is created by copying and pasting content within the same image, and potentially postprocessing it. In recent years, the detection of copy-move forgeries has become one of the most actively researched topics in blind image forensics. A considerable number of different algorithms have been proposed focusing on different types of postprocessed copies. In this paper, we aim to answer which copy-move forgery detection algorithms and processing steps (e. g. , matching, filtering, outlier detection, affine transformation estimation) perform best in various postprocessing scenarios. The focus of our analysis is to evaluate the performance of previously proposed feature sets. We achieve this by casting existing algorithms in a common pipeline. In this paper, we examined the 15 most prominent feature sets. We analyzed the detection performance on a per-image basis and on a per-pixel basis. We created a challenging real-world copy-move dataset, and a software framework for systematic image manipulation. Experiments show, that the keypoint-based features SIFT and SURF, as well as the block-based DCT, DWT, KPCA, PCA and ZERNIKE features perform very well. These feature sets exhibit the best robustness against various noise sources and downsampling, while reliably identifying the copied regions.
Electroluminescence (EL) imaging is a useful modality for the inspection of photovoltaic (PV) modules. EL images provide high spatial resolution, which makes it possible to detect even finest defects on the surface of PV modules. However, the analysis of EL images is typically a manual process that is expensive, time-consuming, and requires expert knowledge of many different types of defects.In this work, we investigate two approaches for automatic detection of such defects in a single image of a PV cell. The approaches differ in their hardware requirements, which are dictated by their respective application scenarios. The more hardware-efficient approach is based on hand-crafted features that are classified in a Support Vector Machine (SVM). To obtain a strong performance, we investigate and compare various processing variants. The more hardware-demanding approach uses an end-to-end deep Convolutional Neural Network (CNN) that runs on a Graphics Processing Unit (GPU). Both approaches are trained on 1,968 cells extracted from high resolution EL intensity images of mono-and polycrystalline PV modules. The CNN is more accurate, and reaches an average accuracy of 88.42 %. The SVM achieves a slightly lower average accuracy of 82.44 %, but can run on arbitrary hardware. Both automated approaches make continuous, highly accurate monitoring of PV cells feasible.
In this paper, we present a new deep learning framework for 3-D tomographic reconstruction. To this end, we map filtered back-projection-type algorithms to neural networks. However, the back-projection cannot be implemented as a fully connected layer due to its memory requirements. To overcome this problem, we propose a new type of cone-beam back-projection layer, efficiently calculating the forward pass. We derive this layer's backward pass as a projection operation. Unlike most deep learning approaches for reconstruction, our new layer permits joint optimization of correction steps in volume and projection domain. Evaluation is performed numerically on a public data set in a limited angle setting showing a consistent improvement over analytical algorithms while keeping the same computational test-time complexity by design. In the region of interest, the peak signal-to-noise ratio has increased by 23%. In addition, we show that the learned algorithm can be interpreted using known concepts from cone beam reconstruction: the network is able to automatically learn strategies such as compensation weights and apodization windows.
No abstract
Compressed sensing-based Magnetic Resonance Imaging (CS-MRI) is a promising paradigm allowing to accelerate MRI acquisition by reconstructing images from only a fraction of the normally required k-space measurements. Traditionally, sparsity-based methods and their data-driven variants such as dictionary learning [10] have been popular due to their mathematically robust formulation for perfect reconstruction. However, these methods are limited in acceleration factor and also suffer from high computational complexity. More recently, several deep learning-based architectures have been proposed as an attractive alternative for CS-MRI. The advantages of these techniques are their computational efficiency, G. Yang and J. Schlemper/D. Rueckert and A. Maier share second/last coauthorship.
Deep Convolutional Neural Networks (CNN) have shown great success in supervised classification tasks such as character classification or dating. Deep learning methods typically need a lot of annotated training data, which is not available in many scenarios. In these cases, traditional methods are often better than or equivalent to deep learning methods. In this paper, we propose a simple, yet effective, way to learn CNN activation features in an unsupervised manner. Therefore, we train a deep residual network using surrogate classes. The surrogate classes are created by clustering the training dataset, where each cluster index represents one surrogate class. The activations from the penultimate CNN layer serve as features for subsequent classification tasks. We evaluate the feature representations on two publicly available datasets. The focus lies on the ICDAR17 competition dataset on historical document writer identification (Historical-WI). We show that the activation features trained without supervision are superior to descriptors of state-of-theart writer identification methods. Additionally, we achieve comparable results in the case of handwriting classification using the ICFHR16 competition dataset on historical Latin script types (CLaMM16).
The encoding of local features is an essential part for writer identification and writer retrieval. While CNN activations have already been used as local features in related works, the encoding of these features has attracted little attention so far. In this work, we compare the established VLAD encoding with triangulation embedding. We further investigate generalized max pooling as an alternative to sum pooling and the impact of decorrelation and Exemplar SVMs. With these techniques, we set new standards on two publicly available datasets (ICDAR13, KHATT).
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
334 Leonard St
Brooklyn, NY 11211
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.