Seraphina Nix scite author profile

The Global Network of Optical Magnetometers to search for Exotic physics (GNOME) is a network of geographically separated, time-synchronized, optically pumped atomic magnetometers that is being used to search for correlated transient signals heralding exotic physics. The GNOME is sensitive to nuclear- and electron-spin couplings to exotic fields from astrophysical sources such as compact dark-matter objects (for example, axion stars and domain walls). Properties of the GNOME sensors such as sensitivity, bandwidth, and noise characteristics are studied in the present work, and features of the network's operation (e.g., data acquisition, format, storage, and diagnostics) are described. Characterization of the GNOME is a key prerequisite to searches for and identification of exotic physics signatures.


Adversarial Training for High-Stakes Reliability

Ziegler, Nix, Chan et al. 2022. Preprint.


In the future, powerful AI systems may be deployed in high-stakes settings, where a single failure could be catastrophic. One technique for improving AI safety in high-stakes settings is adversarial training, which uses an adversary to generate examples to train on in order to achieve better worst-case performance. In this work, we used a language generation task as a testbed for achieving high reliability through adversarial training. We created a series of adversarial training techniques, including a tool that assists human adversaries, to find and eliminate failures in a classifier that filters text completions suggested by a generator. In our simple "avoid injuries" task, we determined that we can set very conservative classifier thresholds without significantly impacting the quality of the filtered outputs. With our chosen thresholds, filtering with our baseline classifier decreases the rate of unsafe completions from about 2.4% to 0.003% on in-distribution data, which is near the limit of our ability to measure. We found that adversarial training significantly increased robustness to the adversarial attacks that we trained on, without affecting in-distribution performance. We hope to see further work in the high-stakes reliability setting, including more powerful tools for enhancing human adversaries and better ways to measure high levels of reliability, until we can confidently rule out the possibility of catastrophic deployment-time failures of powerful models.




Characterization of the global network of optical magnetometers to search for exotic physics (GNOME)
