Word embedding models have become a fundamental component in a wide range of Natural Language Processing (NLP) applications. However, embeddings trained on human-generated corpora have been demonstrated to inherit strong gender stereotypes that reflect social constructs. To address this concern, in this paper, we propose a novel training procedure for learning gender-neutral word embeddings. Our approach aims to preserve gender information in certain dimensions of word vectors while compelling other dimensions to be free of gender influence. Based on the proposed method, we generate a Gender-Neutral variant of GloVe (GN-GloVe). Quantitative and qualitative experiments demonstrate that GN-GloVe successfully isolates gender information without sacrificing the functionality of the embedding model.
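As a rough illustration of the idea of confining gender information to certain dimensions, the sketch below splits a vector into a neutral part and a reserved part and compares words using only the neutral part. The choice of reserving the trailing dimensions, the function names, and the toy vectors are assumptions made for this example; it does not reproduce the GN-GloVe training procedure.

```python
import math


def split_embedding(vec, num_gender_dims):
    """Split a vector into (neutral part, reserved gender part).

    Assumption for this sketch: the reserved dimensions are the trailing ones.
    """
    cut = len(vec) - num_gender_dims
    return vec[:cut], vec[cut:]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def neutral_cosine(u, v, num_gender_dims):
    """Compare two words using only the dimensions outside the reserved block."""
    u_neutral, _ = split_embedding(u, num_gender_dims)
    v_neutral, _ = split_embedding(v, num_gender_dims)
    return cosine(u_neutral, v_neutral)


# Hypothetical toy vectors: identical neutral parts, opposite reserved values.
word_a = [1.0, 0.0, 5.0]
word_b = [1.0, 0.0, -5.0]
```

With these toy vectors, `neutral_cosine(word_a, word_b, 1)` treats the two words as identical, while the full-vector `cosine(word_a, word_b)` is negative because of the reserved dimension.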
We present a conceptually simple yet effective algorithm to detect wireframes [14] in a given image. Compared to the previous methods [14,33], which first predict an intermediate heat map and then extract straight lines with heuristic algorithms, our method is end-to-end trainable and can directly output a vectorized wireframe that contains semantically meaningful and geometrically salient junctions and lines. To better understand the quality of the outputs, we propose a new metric for wireframe evaluation that penalizes overlapped line segments and incorrect line connectivities. We conduct extensive experiments and show that our method significantly outperforms the previous state-of-the-art wireframe and line extraction algorithms [14,33,32]. We hope our simple approach can serve as a baseline for future wireframe parsing studies. Code has been made publicly available at https://github.com/zhou13/lcnn.
Adversarial attacks against machine learning models have threatened various real-world applications such as spam filtering and sentiment analysis. In this paper, we propose a novel framework, learning to discriminate perturbations (DISP), to identify and adjust malicious perturbations, thereby blocking adversarial attacks for text classification models. To identify adversarial attacks, a perturbation discriminator estimates how likely it is that each token in the text has been perturbed and provides a set of potential perturbations. For each potential perturbation, an embedding estimator learns to restore the embedding of the original word based on the context, and a replacement token is chosen based on approximate kNN search. DISP can block adversarial attacks for any NLP model without modifying the model structure or training procedure. Extensive experiments on two benchmark datasets demonstrate that DISP significantly outperforms baseline methods in blocking adversarial attacks for text classification. In addition, in-depth analysis shows the robustness of DISP across different situations.
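The token-recovery step described above can be sketched as a nearest-neighbor lookup: given an estimated embedding, pick the vocabulary token whose vector is closest. This is a minimal sketch that uses exact brute-force cosine search rather than the approximate kNN search the abstract mentions; the function names and the toy vocabulary are assumptions made for illustration, not part of DISP.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_tokens(estimated, vocab, k=1):
    """Return the k tokens whose embeddings are most similar to `estimated`.

    Exact search over the whole vocabulary; a real system would likely use an
    approximate index instead.
    """
    ranked = sorted(
        vocab,
        key=lambda tok: cosine(estimated, vocab[tok]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy vocabulary with 3-dimensional embeddings.
TOY_VOCAB = {
    "good": [0.9, 0.1, 0.0],
    "bad": [-0.9, 0.1, 0.0],
    "movie": [0.0, 0.2, 0.9],
}
```

For instance, `nearest_tokens([0.8, 0.2, 0.1], TOY_VOCAB)` selects `"good"` as the replacement candidate, since its vector points in nearly the same direction as the estimated embedding.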
In this work, we introduce the novel problem of identifying dense canonical 3D coordinate frames from a single RGB image. We observe that each pixel in an image corresponds to a surface in the underlying 3D geometry, where a canonical frame can be identified as represented by three orthogonal axes, one along its normal direction and two in its tangent plane. We propose an algorithm to predict these axes from RGB. Our first insight is that canonical frames computed automatically with recently introduced direction field synthesis methods can provide training data for the task. Our second insight is that networks designed for surface normal prediction provide better results when trained jointly to predict canonical frames, and even better when trained to also predict 2D projections of canonical frames. We conjecture this is because projections of canonical tangent directions often align with local gradients in images, and because those directions are tightly linked to 3D canonical frames through projective geometry and orthogonality constraints. In our experiments, we find that our method predicts 3D canonical frames that can be used in applications ranging from surface normal estimation and feature matching to augmented reality.
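To make the notion of a frame with one axis along the normal and two in the tangent plane concrete, the sketch below builds such an orthonormal frame from a normal and a rough tangent hint, using a Gram-Schmidt projection followed by a cross product. It only illustrates the geometry: the paper predicts these axes with a network, and the function names and the use of a tangent hint are assumptions made for this example.

```python
import math


def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of two 3D vectors."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def normalize(v):
    """Scale a vector to unit length; a zero vector has no direction."""
    length = math.sqrt(dot(v, v))
    if length == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return [x / length for x in v]


def frame_from_normal(normal, tangent_hint):
    """Return (n, t1, t2): a unit normal and two unit tangents, all orthogonal.

    `tangent_hint` must not be parallel to `normal`; its component along the
    normal is removed (Gram-Schmidt) to obtain the first tangent axis.
    """
    n = normalize(normal)
    along_normal = dot(tangent_hint, n)
    t1 = normalize([h - along_normal * c for h, c in zip(tangent_hint, n)])
    t2 = cross(n, t1)  # already unit length because n and t1 are orthonormal
    return n, t1, t2
```

Calling `frame_from_normal([0.0, 0.0, 2.0], [1.0, 1.0, 1.0])` yields a normal along the z-axis and two perpendicular tangent directions lying in the xy-plane.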
In this paper, we propose a method to obtain a compact and accurate 3D wireframe representation from a single image by effectively exploiting global structural regularities. Our method trains a convolutional neural network to simultaneously detect salient junctions and straight lines, as well as predict their 3D depth and vanishing points. Compared with the state-of-the-art learning-based wireframe detection methods, our network is much simpler and more unified, leading to better 2D wireframe detection. With global structural priors such as the Manhattan assumption, our method further reconstructs a full 3D wireframe model, a compact vector representation suitable for a variety of high-level vision tasks such as AR and CAD. We conduct extensive evaluations on a large synthetic dataset of urban scenes as well as real images. Our code and datasets will be released.
QuadriFlow is a scalable algorithm for generating quadrilateral surface meshes based on the Instant Field‐Aligned Meshes of Jakob et al. (ACM Trans. Graph. 34(6):189, 2015). We modify the original algorithm such that it efficiently produces meshes with many fewer singularities. Singularities in quadrilateral meshes cause problems for many applications, including parametrization and rendering with Catmull‐Clark subdivision surfaces. Singularities can rarely be entirely eliminated, but it is possible to keep their number small. Local optimization algorithms usually produce meshes with many singularities, whereas the best algorithms tend to require non‐local optimization, and therefore are slow. We propose an efficient method to minimize singularities by combining the Instant Meshes objective with a system of linear and quadratic constraints. These constraints are enforced by solving a global minimum‐cost network flow problem and local boolean satisfiability problems. We have verified the robustness and efficiency of our method on a subset of ShapeNet comprising 17,791 3D objects in the wild. Our evaluation shows that the quality of the quadrangulations generated by our method is as good as, if not better than, those from other methods, achieving about four times fewer singularities than Instant Meshes. Other algorithms that produce similarly few singularities are much slower; we take less than ten seconds to process each model. Our source code is publicly available.
Terahertz (THz) spectroscopic sensing and imaging has shown potential in a number of areas, such as standoff security screening at portals, explosive detection on battlefields, and biomedical research. Driven by these needs, the development of an intense and broadband THz source has been a focus of THz research. In this work, we report an intense (~10 mW) and ultra-broadband (~150 THz) THz to infrared (IR) source with a Gaussian wavefront, emitted from nano-pore-structured metallic thin films under femtosecond laser pulse excitation. The underlying mechanism has been proposed to be thermal radiation. In addition, an intense coherent THz signal was generated through the optical rectification process simultaneously with the strong thermal signal. This unique feature opens up new avenues in biomedical research.