We introduce a new spin chain which is a deformation of the Fredkin spin chain and has a phase transition between bounded and extensive entanglement entropy scaling. In this chain, spins interact locally through terms acting on three nearest neighbors. The Hamiltonian is frustration-free and its ground state can be described analytically as a weighted superposition of Dyck paths. In the purely spin-1/2 case, the entanglement entropy obeys an area law: for t > 1, it is bounded from above by a constant as the block size n increases. When a local color degree of freedom is introduced, the entanglement entropy increases linearly with the size of the block (again for t > 1). The entanglement entropy of half of the chain is tightly bounded by n log s, where n is the size of the block and s is the number of colors. Our chain provides a new example of a significant boost to entropy and of the associated critical rainbow phase, in which the entanglement entropy scales with volume, recently discovered by Zhang et al. [1].
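As a rough illustration of the area-law and volume-law behavior described above, the following sketch enumerates Dyck paths for small n and computes a half-chain entanglement entropy. It rests on assumptions that the abstract does not state: that the ground-state amplitude of a path is t^(A/2), with A the area under the path, and that in the colored case each of the h steps left unmatched at the cut contributes an equally weighted color label. The function names are hypothetical.

```python
import math


def dyck_height_profiles(n):
    """Yield the height profile (h_0, ..., h_2n) of every Dyck path with 2n steps."""
    def walk(profile, ups_left, height):
        if len(profile) == 2 * n + 1:
            yield tuple(profile)
            return
        if ups_left > 0:                      # take an up step
            yield from walk(profile + [height + 1], ups_left - 1, height + 1)
        if height > 0:                        # take a down step (never below zero)
            yield from walk(profile + [height - 1], ups_left, height - 1)
    yield from walk([0], n, 0)


def half_chain_entropy(n, t, colors=1):
    """Entanglement entropy (in nats) across the middle cut of a 2n-site chain.

    Assumption: a path with area A has probability weight t**A, so the
    Schmidt sectors are labeled by the height h at the cut. With `colors` > 1,
    each sector is assumed to split into colors**h equal Schmidt components.
    """
    weight_by_mid_height = {}
    for profile in dyck_height_profiles(n):
        area = sum(profile)                   # exact area for integer vertices
        h = profile[n]                        # height at the middle cut
        weight_by_mid_height[h] = weight_by_mid_height.get(h, 0.0) + t ** area
    total = sum(weight_by_mid_height.values())
    entropy = 0.0
    for h, w in weight_by_mid_height.items():
        p = w / total
        entropy += -p * math.log(p) + p * h * math.log(colors)
    return entropy


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, half_chain_entropy(n, t=2.0), half_chain_entropy(n, t=2.0, colors=2))
```

Under these assumptions the colored entropy equals the colorless one plus log s times the mean height at the cut, so it can never exceed the colorless value plus n log s. Brute-force enumeration grows like the Catalan numbers, so this is only practical for small n.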
Common grounding is the process of creating, repairing and updating mutual understandings, which is a critical aspect of sophisticated human communication. However, traditional dialogue systems have limited capability of establishing common ground, and we also lack task formulations which introduce natural difficulty in terms of common grounding while enabling easy evaluation and analysis of complex models. In this paper, we propose a minimal dialogue task which requires advanced skills of common grounding under continuous and partially-observable context. Based on this task formulation, we collected a large-scale dataset of 6,760 dialogues which fulfills essential requirements of natural language corpora. Our analysis of the dataset revealed important phenomena related to common grounding that need to be considered. Finally, we evaluate and analyze baseline neural models on a simple subtask that requires recognition of the created common ground. We show that simple baseline models perform decently but leave room for further improvement. Overall, we show that our proposed task will be a fundamental testbed where we can train, evaluate, and analyze dialogue systems' ability for sophisticated common grounding.
Abstract. We investigate ground- and excited-state properties of the deformed Fredkin spin chain proposed by Salberger, Zhang, Klich, Korepin, and the authors. This model is a one-parameter deformation of the Fredkin spin chain, whose Hamiltonian is 3-local and translationally invariant in the bulk. The model is frustration-free and its unique ground state can be expressed as a weighted superposition of colored Dyck paths. We focus on the case where the deformation parameter t > 1. By using a variational method, we prove that the finite-size gap decays at least exponentially with increasing system size. We prove that the magnetization in the ground state is along the z-direction, namely $\langle s^x \rangle = \langle s^y \rangle = 0$, and show that the z-component $\langle s^z \rangle$ exhibits a domain-wall structure. We then study the entanglement properties of the chain. In particular, we derive upper and lower bounds for the von Neumann and Rényi entropies, and entanglement spectrum for any bipartition of the chain.
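For reference, the entropies mentioned above have the following standard definitions. The notation is an assumption, not taken from the abstract: $\rho_A$ denotes the reduced density matrix of one part of the bipartition, $\lambda_i$ its eigenvalues (the entanglement spectrum is usually quoted as $-\log \lambda_i$), and $\alpha$ the Rényi index.

```latex
\begin{align}
  S_{\mathrm{vN}}(\rho_A) &= -\operatorname{Tr}\bigl(\rho_A \log \rho_A\bigr)
                           = -\sum_i \lambda_i \log \lambda_i, \\
  S_\alpha(\rho_A) &= \frac{1}{1-\alpha} \log \operatorname{Tr}\bigl(\rho_A^{\alpha}\bigr)
                    = \frac{1}{1-\alpha} \log \sum_i \lambda_i^{\alpha},
  \qquad \alpha > 0,\ \alpha \neq 1, \\
  \lim_{\alpha \to 1} S_\alpha(\rho_A) &= S_{\mathrm{vN}}(\rho_A).
\end{align}
```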
Off-policy Evaluation (OPE), or offline evaluation in general, evaluates the performance of hypothetical policies leveraging only offline log data. It is particularly useful in applications where online interaction is high-stakes and expensive, such as precision medicine and recommender systems. Since many OPE estimators have been proposed and some of them have hyperparameters to be tuned, there is an emerging challenge for practitioners to select and tune OPE estimators for their specific application. Unfortunately, identifying a reliable estimator from results reported in research papers is often difficult because the current experimental procedure evaluates and compares the estimators' performance on a narrow set of hyperparameters and evaluation policies. Therefore, it is difficult to know which estimator is safe and reliable to use. In this work, we develop Interpretable Evaluation for Offline Evaluation (IEOE), an experimental procedure to evaluate OPE estimators' robustness to changes in hyperparameters and/or evaluation policies in an interpretable manner. Then, using the IEOE procedure, we perform extensive evaluation of a wide variety of existing estimators on Open Bandit Dataset, a large-scale public real-world dataset for OPE. We demonstrate that our procedure can evaluate the estimators' robustness to the hyperparameter choice, helping us avoid using unsafe estimators. Finally, we apply IEOE to real-world e-commerce platform data and demonstrate how to use our protocol in practice.
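To make the hyperparameter-sensitivity concern concrete, here is a minimal sketch of one common OPE estimator, inverse probability weighting (IPW) with weight clipping, evaluated on synthetic context-free bandit logs. This is not the IEOE procedure itself. The policies, reward means, clipping values, and function names are all hypothetical and chosen only for illustration.

```python
import random


def clipped_ipw(actions, rewards, behavior_probs, eval_probs, clip):
    """Clipped IPW estimate of the evaluation policy's value from logged data.

    Each logged reward is reweighted by min(pi_e(a) / pi_b(a), clip).
    """
    total = 0.0
    for a, r in zip(actions, rewards):
        weight = min(eval_probs[a] / behavior_probs[a], clip)
        total += weight * r
    return total / len(actions)


def simulate_logs(n, behavior_probs, reward_means, rng):
    """Draw n (action, Bernoulli reward) pairs under the behavior policy."""
    arms = list(range(len(behavior_probs)))
    actions = rng.choices(arms, weights=behavior_probs, k=n)
    rewards = [1.0 if rng.random() < reward_means[a] else 0.0 for a in actions]
    return actions, rewards


if __name__ == "__main__":
    rng = random.Random(0)
    behavior = [0.6, 0.3, 0.1]    # hypothetical logging policy
    evaluation = [0.1, 0.2, 0.7]  # hypothetical policy to be evaluated
    means = [0.2, 0.5, 0.8]       # hypothetical true reward means per action
    true_value = sum(p * m for p, m in zip(evaluation, means))

    actions, rewards = simulate_logs(20000, behavior, means, rng)
    # Sweep the clipping hyperparameter and report the squared error of each setting.
    for clip in (1.0, 2.0, 5.0, 100.0):
        estimate = clipped_ipw(actions, rewards, behavior, evaluation, clip)
        print(f"clip={clip:6.1f}  estimate={estimate:.4f}  "
              f"squared_error={(estimate - true_value) ** 2:.6f}")
```

Small clipping thresholds reduce variance but bias the estimate, so the error can change substantially across settings. A robustness evaluation of the kind the abstract describes would look at the distribution of such errors over many hyperparameter and policy choices rather than a single hand-picked configuration.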
Recent models achieve promising results in visually grounded dialogues. However, existing datasets often contain undesirable biases and lack sophisticated linguistic analyses, which make it difficult to understand how well current models recognize their precise linguistic structures. To address this problem, we make two design choices: first, we focus on OneCommon Corpus (Udagawa and Aizawa, 2019, 2020), a simple yet challenging common grounding dataset which contains minimal bias by design. Second, we analyze their linguistic structures based on spatial expressions and provide comprehensive and reliable annotation for 600 dialogues. We show that our annotation captures important linguistic structures including predicate-argument structure, modification and ellipsis. In our experiments, we assess the model's understanding of these structures through reference resolution. We demonstrate that our annotation can reveal both the strengths and weaknesses of baseline models in essential levels of detail. Overall, we propose a novel framework and resource for investigating fine-grained language understanding in visually grounded dialogues.
Purpose: Abnormalities of the running pattern of the choroidal vessels have been reported in eyes with pachychoroid diseases. However, it is difficult for clinicians to judge the running pattern with high reproducibility. Thus, the purpose of this study was to compare the running pattern of the choroidal vessels determined by artificial intelligence (AI) with that determined by experienced clinicians and to assess their degree of concordance. Methods: The running pattern of the choroidal vessels in en face images of Haller’s layer of 413 normal and pachychoroid diseased eyes was classified as symmetrical or asymmetrical by human raters and by three supervised machine learning models: the support vector machine (SVM), Xception, and random forest models. The data from the human raters were used as the supervised data. The accuracy rates of the human raters and the certainty of the AI’s answers were compared using confidence scores (CSs). Results: The choroidal vascular running pattern could be determined by each AI model with an area under the curve better than 0.94. The random forest method discriminated with the highest accuracy among the three AI models. In the CS analyses, the percentage of certainty was highest (66.4%) and that of uncertainty was lowest (6.1%) in the agreement group. On the other hand, the rate of uncertainty was highest (27.3%) in the disagreement group. Conclusion: AI algorithms can automatically classify, even with ambiguous criteria, the presence or absence of a symmetrical blood vessel running pattern of the choroid. The classification was as good as that of supervised humans in accuracy and reproducibility.