In this paper we attempt to determine the effectiveness of using entropy, as defined in NIST SP800-63, as a measurement of the security provided by various password creation policies. This is accomplished by modeling the success rate of current password cracking techniques against real user passwords. The password data sets were collected from several different websites, the largest one containing over 32 million passwords. This focus on actual attack methodologies and real user passwords quite possibly makes this one of the largest studies on password security to date. In addition, we examine what these results mean for standard password creation policies, such as minimum password length and character set requirements.
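As a rough illustration of the entropy measure under evaluation, the sketch below follows our reading of the per-character scoring heuristic in Appendix A of NIST SP800-63 for user-chosen passwords. The function name and its flags are hypothetical, and the dictionary bonus is simplified to its stated upper bound; it is a sketch under those assumptions, not the scoring code used in the study.

```python
def nist_entropy_bits(length: int,
                      composition_rule: bool = False,
                      dictionary_check: bool = False) -> float:
    """Estimate password entropy in bits from length and policy alone.

    Assumed scoring (our reading of NIST SP800-63, Appendix A):
      - first character: 4 bits
      - characters 2 through 8: 2 bits each
      - characters 9 through 20: 1.5 bits each
      - characters 21 and beyond: 1 bit each
      - +6 bits if the policy requires upper-case and non-alphabetic characters
      - up to +6 bits for a dictionary check (simplified here to a flat 6)
    """
    if length <= 0:
        return 0.0
    bits = 4.0                                   # first character
    bits += 2.0 * max(0, min(length, 8) - 1)     # characters 2..8
    bits += 1.5 * max(0, min(length, 20) - 8)    # characters 9..20
    bits += 1.0 * max(0, length - 20)            # characters 21 and beyond
    if composition_rule:
        bits += 6.0
    if dictionary_check:
        bits += 6.0  # simplification: the guideline gives this only as an upper bound
    return bits


if __name__ == "__main__":
    for n in (6, 8, 10, 16):
        print(n, nist_entropy_bits(n), nist_entropy_bits(n, composition_rule=True))
```

Note that the score depends only on the password's length and the policy flags, never on the password itself. This is why it can be tested against observed cracking success rates on real passwords.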
To advance models of multimodal context, we introduce a simple yet powerful neural architecture for data that combines vision and natural language. The "Bounding Boxes in Text Transformer" (B2T2) also leverages referential information binding words to portions of the image in a single unified architecture. B2T2 is highly effective on the Visual Commonsense Reasoning benchmark, achieving a new state-of-the-art with a 25% relative reduction in error rate compared to published baselines and obtaining the best performance to date on the public leaderboard (as of May 22, 2019). A detailed ablation analysis shows that the early integration of the visual features into the text analysis is key to the effectiveness of the new architecture. A reference implementation of our models is provided.
Certain English constructions permit two syntactic alternations. (1) a. I looked up the number. b. I looked the number up. (2) a. He is often at the office. b. He often is at the office. This study investigates the relationship between syntactic alternations and processing difficulty. What cognitive mechanisms are responsible for our attraction to some alternations and our aversion to others? This article reviews three psycholinguistic models of the relationship between syntactic alternations and processing: Maximum Per Word Surprisal (building on the ideas of Hale, in Proceedings of the 2nd Meeting of the North American Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, Pittsburgh, PA, pp 159-166, 2001), Uniform Information Density (UID) (Levy and Jaeger in Adv Neural Inf Process Syst 19:849-856, 2007; inter alia), and Dependency Length Minimization (DLM) (Gildea and Temperley in Cognit Sci 34:286-310, 2010). Each theory makes predictions about which alternations native speakers should favor. Subjects were recruited using Amazon Mechanical Turk and asked to judge which of two competing syntactic alternations sounded more natural. Logistic regression analysis on the resulting data suggests that both UID and DLM are powerful predictors of human preferences. We conclude that alternations that approach uniform information density and minimize dependency length are easier to process than those that do not.
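To make the DLM prediction concrete, the sketch below computes total dependency length for the two variants in example (1). The head assignments are our own illustrative parses (verb as root, particle and object attached to the verb, determiner attached to the noun) and the function name is hypothetical; neither is taken from the study's materials.

```python
from typing import List, Optional


def total_dependency_length(heads: List[Optional[int]]) -> int:
    """Sum of |head position - dependent position| over all non-root words.

    heads[i] is the 0-based position of word i's head, or None for the root.
    """
    return sum(abs(head - i) for i, head in enumerate(heads) if head is not None)


# (1a) I(0) looked(1) up(2) the(3) number(4)
# Assumed parse: I->looked, up->looked, the->number, number->looked
heads_1a = [1, None, 1, 4, 1]

# (1b) I(0) looked(1) the(2) number(3) up(4)
# Assumed parse: I->looked, the->number, number->looked, up->looked
heads_1b = [1, None, 3, 1, 1]

if __name__ == "__main__":
    print("1a:", total_dependency_length(heads_1a))  # 1 + 1 + 1 + 3 = 6
    print("1b:", total_dependency_length(heads_1b))  # 1 + 1 + 2 + 3 = 7
```

Under these assumed parses, (1a) has the shorter total dependency length, so DLM would predict a preference for it. The gap between the two variants would grow with a longer object noun phrase.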
Statistical or machine learning approaches have become quite prominent in the Natural Language Processing literature. Common techniques include generative models such as Hidden Markov Models or Probabilistic Context-Free Grammars, and more general noisy-channel models such as the statistical approach to machine translation pioneered by researchers at IBM in the early 1990s. Recent work has considered discriminative methods such as (conditional) Markov random fields, or large-margin methods. This tutorial will describe several of these techniques. The methods will be motivated through a number of natural language problems, from part-of-speech tagging and parsing to machine translation, dialogue systems, and information extraction. I will also concentrate on links to the COLT and kernel methods literature, for example covering kernels over the discrete structures found in NLP, online algorithms for NLP problems, and the issues in extending generalization bounds from classification problems to NLP problems such as parsing.