Computational approaches to language identification often yield a high number of false positives and low recall rates, especially when the languages involved come from the same subfamily. In this paper, we aim to determine the cause of this problem by measuring language similarity through trigrams. Religious and literary texts were used as training data. Our experiments on language identification show that the number of common trigrams for a given language pair is inversely proportional to precision and recall rates, whereas the average word length is directly proportional to the number of true positives. Future directions include improving language modeling and providing an approach to increase precision and recall.
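The measurement above hinges on counting trigrams shared between two languages. A minimal sketch of that step, assuming character trigrams drawn from whitespace-normalized text and a hypothetical `top_k` profile-size cutoff:

```python
from collections import Counter

def trigram_profile(text, top_k=300):
    """Return the set of the top_k most frequent character trigrams."""
    text = " ".join(text.lower().split())  # normalize whitespace and case
    counts = Counter(text[i:i + 3] for i in range(len(text) - 2))
    return {t for t, _ in counts.most_common(top_k)}

def common_trigrams(text_a, text_b, top_k=300):
    """Count trigrams shared by the two texts' profiles."""
    return len(trigram_profile(text_a, top_k) & trigram_profile(text_b, top_k))
```

For a closely related language pair, `common_trigrams` would be large, which the experiments associate with lower precision and recall.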
Autonomous robots for smart homes and smart cities generally require depth perception in order to interact with their environments. However, depth maps are usually captured at a lower resolution than RGB color images due to the inherent limitations of the sensors. Naively increasing their resolution often leads to loss of sharpness and incorrect estimates, especially in regions with depth discontinuities or depth boundaries. In this paper, we propose a novel Generative Adversarial Network (GAN)-based framework for depth map super-resolution that preserves smooth areas as well as the sharp edges at depth boundaries. Our proposed model is trained on two modalities, namely color images and depth maps; at test time, however, it requires only the depth map to produce a higher-resolution version. We evaluated our model both quantitatively and qualitatively, and our experiments show that our method performs better than existing state-of-the-art models.
This paper discusses a super-resolution (SR) system implemented on a mobile device. We utilized an Android device's camera to take successive shots and applied a classical multiple-image super-resolution (SR) technique that utilizes a set of low-resolution (LR) images. Images taken from the mobile device are subjected to our proposed filtering scheme, wherein images with a noticeable presence of blur are discarded to prevent outliers from affecting the produced high-resolution (HR) image. The remaining subset of images is subjected to non-local means denoising, then feature-matched against the first reference LR image. Successive images are then aligned with respect to the first image via affine and perspective warping transformations. The LR images are then upsampled using bicubic interpolation. An L2-norm minimization approach, which is essentially taking the pixel-wise mean of the aligned images, is performed to produce the final HR image. Our study shows that our proposed method performs better than bicubic interpolation, making its implementation on a mobile device feasible. We have also shown in our experiments that there are substantial differences between images captured using burst mode that can be utilized by an SR algorithm to create an HR image.
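The final fusion step can be sketched in a few lines: the pixel-wise mean of the aligned, upsampled frames is the closed-form minimizer of the L2 (least-squares) reconstruction error across the stack. This is an illustrative helper, not the authors' implementation; it assumes the frames are already registered and equally sized:

```python
import numpy as np

def fuse_aligned(images):
    """Fuse aligned, upsampled LR frames into one HR estimate.

    The pixel-wise mean minimizes sum_k ||x - I_k||^2 over the stack,
    i.e. it is the L2-norm minimization described in the pipeline.
    """
    stack = np.stack([np.asarray(im, dtype=np.float64) for im in images])
    return stack.mean(axis=0)
```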
In this study, we present Dice's coefficient on trigram profiles as a metric for language similarity. As a testbed, we focused on eight Philippine languages, for which no known language similarity values exist. Documents containing transcribed audio recordings, news articles, and religious and literary texts were taken from an online corpus and used as training data. Character trigram profiles were then generated using an n-gram generator, and language similarity was computed. The results were matched against those reported in the literature and against the language family tree. To evaluate the metric, it was applied to five languages with known similarity values. The results were then compared with an existing lexical similarity metric; the average difference is 27%. Analyses of the results reveal that phonetic spelling plays an important role in language similarity. As future work, the metric can be applied to phonetic transcriptions.
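Dice's coefficient on two trigram profiles is a standard set-overlap measure; a minimal sketch, treating each profile as a set of character trigrams:

```python
def dice_similarity(profile_a, profile_b):
    """Dice's coefficient: 2|A ∩ B| / (|A| + |B|), ranging from 0 to 1."""
    a, b = set(profile_a), set(profile_b)
    if not a and not b:
        return 0.0  # two empty profiles: define similarity as 0
    return 2.0 * len(a & b) / (len(a) + len(b))
```

A value near 1 indicates nearly identical trigram inventories, which for orthographically transcribed text is where phonetic spelling exerts its influence.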
A "triaxial velocity sensor" consists of three uniaxial velocity sensors, which are nominally identical, orthogonally oriented among themselves, and co-centered at one point in space. A triaxial velocity sensor measures the acoustic particle velocity vector, by its three Cartesian components, individually component-by-component, thereby offering azimuth-elevation two-dimensional spatial directivity, despite the physical compactness that comes with the collocation of its three components. This sensing system's azimuth-elevation beam-pattern has been much analyzed in the open literature, but only for an idealized case of the three uniaxial velocity sensors being exactly identical in gain. If this nominal identity is violated among the three uniaxial velocity sensors, as may occur in practical hardware, what would happen to the corresponding "spatial matched filter" beam-pattern's peak direction? How would this effective peak direction deviate from the nominal "look direction"? This paper, by modeling each uniaxial velocity sensor's gain as stochastic, derives this deviation's statistical mean and variance, analytically in closed mathematical forms. This analytical derivation is verified by Monte Carlo simulations.
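The Monte Carlo verification described above can be sketched under a simplified model (assumed here, not taken from the paper): each uniaxial sensor's gain is drawn as N(1, σ²), and the gain-perturbed matched-filter peak is taken to lie along the element-wise product of the gains and the nominal look direction. The parameters `sigma` and `trials` are illustrative.

```python
import numpy as np

def peak_deviation_stats(az, el, sigma=0.05, trials=100_000, seed=0):
    """Monte Carlo mean and variance (radians) of the beam-peak
    deviation angle under stochastic per-axis gains ~ N(1, sigma^2)."""
    rng = np.random.default_rng(seed)
    u = np.array([np.cos(el) * np.cos(az),      # nominal look direction,
                  np.cos(el) * np.sin(az),      # as a Cartesian unit vector
                  np.sin(el)])
    g = rng.normal(1.0, sigma, size=(trials, 3))  # stochastic gains per axis
    v = g * u                                     # perturbed steering vectors
    cosang = np.clip(v @ u / np.linalg.norm(v, axis=1), -1.0, 1.0)
    dev = np.arccos(cosang)                       # angle off the look direction
    return dev.mean(), dev.var()
```

The closed-form mean and variance derived in the paper would be compared against these sample statistics.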
In this work, we present a network architecture with parallel convolutional neural networks (CNNs) for removing perspective distortion in images. While other works generate corrected images through generative adversarial networks or encoder-decoder networks, we propose a method wherein three CNNs are trained in parallel, each predicting a certain element pair of the 3×3 transformation matrix M̂. The corrected image is produced by transforming the distorted input image using M̂⁻¹. The networks are trained on our generated distorted-image dataset built from KITTI images. Experimental results show promise in this approach: our method is capable of correcting perspective distortions in images and outperforms other state-of-the-art methods. Our method also recovers the intended scale and proportion of the image, which is not observed in other works.
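The correction step amounts to applying the inverse of the predicted homography. A minimal sketch of mapping points through a 3×3 matrix (an illustrative helper; real image warping would resample pixels, e.g. with OpenCV's `warpPerspective`):

```python
import numpy as np

def warp_points(M, points):
    """Apply a 3x3 homography M to an array of (x, y) points."""
    pts = np.asarray(points, dtype=np.float64)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # to homogeneous coords
    mapped = homog @ M.T                              # apply M to each point
    return mapped[:, :2] / mapped[:, 2:3]             # back to Cartesian
```

Undistorting then corresponds to `warp_points(np.linalg.inv(M_hat), points)`, where `M_hat` stands for the network-predicted matrix.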