The photograph and our understanding of photography is ever changing and has transitioned from a world of unprocessed rolls of C-41 sitting in a fridge 50 years ago to sharing photos on the 1.5" screen of a point and shoot camera 10 years back. And today the photograph is again something different. The way we take photos is fundamentally different. We can view, share, and interact with photos on the device they were taken on. We can edit, tag, or "filter" photos directly on the camera at the same time the photo is being taken. Photos can be automatically pushed to various online sharing services, and the distinction between photos and videos has lessened. Beyond this, and more importantly, there are now lots of them. To Facebook alone more than 250 billion photos have been uploaded and on average it receives over 350 million new photos every day [6], while YouTube reports that 300 hours of video are uploaded every minute [22]. A back of the envelope estimation reports 10% of all photos in the world were taken in the last 12 months, and that was calculated already more than three years ago [8].Today, a large number of the digital media objects that are shared have been uploaded to services like Flickr or Instagram, which along with their metadata and their social ecosystem form a vibrant environment for finding solutions to many research questions at scale. Photos and videos provide a wealth of information about the universe, covering entertainment, travel, personal records, and various other aspects of life in general as it was when they were taken.
A thorough investigation of the application of support vector regression (SVR) to the superresolution problem is conducted through various frameworks. Prior to the study, the SVR problem is enhanced by finding the optimal kernel. This is done by formulating the kernel learning problem in SVR form as a convex optimization problem, specifically a semi-definite programming (SDP) problem. An additional constraint is added to reduce the SDP to a quadratically constrained quadratic programming (QCQP) problem. After this optimization, investigation of the relevancy of SVR to superresolution proceeds with the possibility of using a single and general support vector regression for all image content, and the results are impressive for small training sets. This idea is improved upon by observing structural properties in the discrete cosine transform (DCT) domain to aid in learning the regression. Further improvement involves a combination of classification and SVR-based techniques, extending works in resolution synthesis. This method, termed kernel resolution synthesis, uses specific regressors for isolated image content to describe the domain through a partitioned look of the vector space, thereby yielding good results.
This paper introduces the Voices Obscured In Complex Environmental Settings (VOICES) corpus, a freely available dataset under Creative Commons BY 4.0. This dataset will promote speech and signal processing research of speech recorded by far-field microphones in noisy room conditions. Publicly available speech corpora are mostly composed of isolated speech at close-range microphony. A typical approach to better represent realistic scenarios, is to convolve clean speech with noise and simulated room response for model training. Despite these efforts, model performance degrades when tested against uncurated speech in natural conditions. For this corpus, audio was recorded in furnished rooms with background noise played in conjunction with foreground speech selected from the Lib-riSpeech corpus. Multiple sessions were recorded in each room to accommodate for all foreground speech-background noise combinations. Audio was recorded using twelve microphones placed throughout the room, resulting in 120 hours of audio per microphone. This work is a multi-organizational effort led by SRI International and Lab41 with the intent to push forward state-of-the-art distant microphone approaches in signal processing and speech recognition.
We propose an image interpolation algorithm that is nonparametric and learning-based, primarily using an adaptive k-nearest neighbor algorithm with global considerations through Markov random fields. The empirical nature of the proposed algorithm ensures image results that are data-driven and, hence, reflect "real-world" images well, given enough training data. The proposed algorithm operates on a local window using a dynamic k -nearest neighbor algorithm, where k differs from pixel to pixel: small for test points with highly relevant neighbors and large otherwise. Based on the neighbors that the adaptable k provides and their corresponding relevance measures, a weighted minimum mean squared error solution determines implicitly defined filters specific to low-resolution image content without yielding to the limitations of insufficient training. Additionally, global optimization via single pass Markov approximations, similar to cited nearest neighbor algorithms, provides additional weighting for filter generation. The approach is justified in using a sufficient quantity of training per test point and takes advantage of image properties. For in-depth analysis, we compare to existing methods and draw parallels between intuitive concepts including classification and ideas introduced by other nearest neighbor algorithms by explaining manifolds in low and high dimensions.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.