The random phase approximation (RPA) is an increasingly popular post-Kohn-Sham correlation method, but its high computational cost has limited molecular applications to systems with few atoms. Here we present an efficient implementation of RPA correlation energies based on a combination of resolution of the identity (RI) and imaginary frequency integration techniques. We show that the RI approximation to four-index electron repulsion integrals leads to a variational upper bound to the exact RPA correlation energy if the Coulomb metric is used. Auxiliary basis sets optimized for second-order Møller-Plesset (MP2) calculations are well suited for RPA, as is demonstrated for the HEAT [A. Tajti et al., J. Chem. Phys. 121, 11599 (2004)] and MOLEKEL [F. Weigend et al., Chem. Phys. Lett. 294, 143 (1998)] benchmark sets. Using imaginary frequency integration rather than diagonalization to compute the matrix square root necessary for RPA, evaluation of the RPA correlation energy requires only O(N^4 log N) operations and O(N^3) storage; the price for this dramatic improvement over existing algorithms is a numerical quadrature. We propose a numerical integration scheme that is exact in the two-orbital case and converges exponentially with the number of grid points. For most systems, 30-40 grid points yield μH accuracy in triple-zeta basis sets, but much larger grids are necessary for small-gap systems. The lowest-order approximation to the present method is a post-Kohn-Sham frequency-domain version of opposite-spin Laplace-transform RI-MP2 [J. Jung et al., Phys. Rev. B 70, 205107 (2004)]. Timings for polyacenes with up to 30 atoms show speed-ups of two orders of magnitude over previous implementations. The present approach makes it possible to routinely compute RPA correlation energies of systems well beyond 100 atoms, as is demonstrated for the octapeptide angiotensin II.
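The idea of replacing a square root by a frequency integral can be illustrated in the scalar case. The sketch below is not the quadrature proposed in the abstract; it assumes the elementary identity sqrt(a) = (2/π) ∫₀^∞ a/(a + ω²) dω for a > 0, the substitution ω = tan θ, and a plain midpoint rule. The function name `sqrt_by_frequency_integral` and the default grid size are hypothetical choices made for illustration only.

```python
import math

def sqrt_by_frequency_integral(a, n_points=40):
    """Approximate sqrt(a) for a > 0 without calling a square-root routine.

    Uses sqrt(a) = (2/pi) * integral_0^inf a / (a + w^2) dw.
    With w = tan(theta) the integral becomes
        (2/pi) * integral_0^{pi/2} a / (a*cos^2(theta) + sin^2(theta)) dtheta,
    which is evaluated here with a midpoint rule (an illustrative choice).
    """
    if a <= 0:
        raise ValueError("a must be positive")
    h = (math.pi / 2) / n_points  # width of each subinterval
    total = 0.0
    for k in range(n_points):
        theta = (k + 0.5) * h  # midpoint of the k-th subinterval
        total += a / (a * math.cos(theta) ** 2 + math.sin(theta) ** 2)
    return (2 / math.pi) * h * total

if __name__ == "__main__":
    for a in (2.0, 4.0):
        print(a, sqrt_by_frequency_integral(a), math.sqrt(a))
```

Because the transformed integrand is smooth and periodic in θ, the midpoint rule converges quickly in this toy setting; more grid points are needed when a is far from 1.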
Abstract. We describe a new optimization scheme for finding high-quality clusterings in planar graphs that uses weighted perfect matching as a subroutine. Our method provides lower bounds on the energy of the optimal correlation clustering that are typically fast to compute and tight in practice. We demonstrate our algorithm on the problem of image segmentation, where this approach outperforms existing global optimization techniques in minimizing the objective and is competitive with the state of the art in producing high-quality segmentations.
Abstract. Cell detection and segmentation in microscopy images is important for quantitative high-throughput experiments. We present a learning-based method that is applicable to different modalities and cell types, in particular to cells that appear almost transparent in the images. We first train a classifier to detect (partial) cell boundaries. The resulting predictions are used to obtain superpixels and a weighted region adjacency graph, in which edge weights can be either positive (attractive) or negative (repulsive). The graph partitioning problem is then solved using correlation clustering segmentation. One variant newly proposed here uses a length constraint and achieves state-of-the-art performance, with improvements on some datasets. This constraint is approximated using nonplanar correlation clustering. We demonstrate very good performance in various bright-field and phase-contrast microscopy experiments.
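To make the graph partitioning objective concrete, the following is a minimal sketch of correlation clustering on a signed, weighted graph. It assumes the common convention that the energy of a partition is the sum of the weights of all edges whose endpoints fall in different clusters, so cutting an attractive (positive) edge costs energy and cutting a repulsive (negative) edge lowers it. The toy graph, the function names, and the exhaustive search are illustrative assumptions and are not the solvers described in the abstracts above.

```python
from itertools import product

def cut_energy(edges, labels):
    """Sum of weights over edges whose endpoints have different cluster labels."""
    return sum(w for (u, v), w in edges.items() if labels[u] != labels[v])

def best_partition(nodes, edges):
    """Exhaustively search all labelings; only feasible for a handful of nodes."""
    best_labels, best_energy = None, float("inf")
    for assignment in product(range(len(nodes)), repeat=len(nodes)):
        labels = dict(zip(nodes, assignment))
        energy = cut_energy(edges, labels)
        if energy < best_energy:
            best_labels, best_energy = labels, energy
    return best_labels, best_energy

# Hypothetical region adjacency graph: positive = attractive, negative = repulsive.
nodes = ["a", "b", "c", "d"]
edges = {
    ("a", "b"): 2.0,
    ("b", "c"): 1.5,
    ("c", "d"): -3.0,
    ("a", "d"): -1.0,
    ("b", "d"): 0.5,
}

if __name__ == "__main__":
    labels, energy = best_partition(nodes, edges)
    print(labels, energy)
```

In this toy graph the lowest-energy partition groups `a`, `b`, and `c` together and separates `d`. Note that the number of clusters is not fixed in advance; it follows from the edge weights.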
In this paper, we introduce a new optimization approach to entity resolution. Traditional approaches tackle entity resolution with hierarchical clustering, which does not benefit from a formal optimization formulation. In contrast, we model entity resolution as correlation clustering, which we treat as a weighted set-packing problem and write as an integer linear program (ILP). In this formulation, sources in the input data correspond to elements, and entities in the output data correspond to sets (clusters). We tackle optimization of weighted set packing by relaxing integrality in our ILP formulation. The set of potential clusters cannot be explicitly enumerated, which motivates optimization via column generation. In addition to the novel formulation, we introduce new dual optimal inequalities (DOIs), which we call flexible dual optimal inequalities (F-DOIs); these tightly lower-bound dual variables during optimization and accelerate column generation. We apply our formulation to entity resolution (also called de-duplication of records) and achieve state-of-the-art accuracy on two popular benchmark datasets. Our F-DOIs can be extended to other weighted set-packing problems.
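The weighted set-packing view can be sketched on a toy instance. The example below assumes a maximization form in which each candidate cluster carries a weight and the chosen clusters must be pairwise disjoint; the record names, weights, and the function `best_packing` are hypothetical. It enumerates all combinations of candidates, which is only possible for tiny inputs and is exactly what the column generation approach described above is meant to avoid.

```python
from itertools import combinations

def best_packing(candidates):
    """Pick pairwise-disjoint candidate sets with maximum total weight.

    candidates: list of (frozenset_of_elements, weight) pairs.
    Returns (best_value, tuple_of_chosen_indices). Brute force, for illustration.
    """
    best_value, best_choice = 0.0, ()
    for r in range(1, len(candidates) + 1):
        for choice in combinations(range(len(candidates)), r):
            covered = set()
            disjoint = True
            for i in choice:
                members = candidates[i][0]
                if covered & members:  # an element would be used twice
                    disjoint = False
                    break
                covered |= members
            if not disjoint:
                continue
            value = sum(candidates[i][1] for i in choice)
            if value > best_value:
                best_value, best_choice = value, choice
    return best_value, best_choice

# Hypothetical records r1..r4 and candidate entity clusters with made-up weights.
candidates = [
    (frozenset({"r1", "r2"}), 3.0),
    (frozenset({"r2", "r3"}), 2.5),
    (frozenset({"r3", "r4"}), 2.0),
    (frozenset({"r1", "r2", "r3"}), 4.0),
    (frozenset({"r4"}), 0.5),
]

if __name__ == "__main__":
    print(best_packing(candidates))
```

Here the best packing selects the clusters {r1, r2} and {r3, r4}, which beats the single heaviest cluster {r1, r2, r3} combined with {r4}.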