Penalized regression methods aim to retrieve reliable predictors among a large set of putative ones from a limited number of measurements. In particular, penalized regression with singular penalty functions is important for sparse reconstruction algorithms. For large-scale problems, these algorithms exhibit sharp phase transition boundaries where sparse retrieval breaks down. Large optimization problems associated with sparse reconstruction have been analyzed in the literature by setting up corresponding statistical mechanical models at a finite temperature. Using the replica method for the mean field approximation, and subsequently taking a zero-temperature limit, this approach reproduces the algorithmic phase transition boundaries. Unfortunately, the replica trick and the nontrivial zero-temperature limit obscure the underlying reasons for the failure of a sparse reconstruction algorithm, and of penalized regression methods in general. In this paper, we employ the "cavity method" to give an alternative derivation of the mean field equations, working directly in the zero-temperature limit. This derivation provides insight into the origin of the different terms in the self-consistency conditions. The cavity method naturally involves a quantity, the average local susceptibility, whose behavior distinguishes different phases in this system. This susceptibility can be generalized for analysis of a broader class of sparse reconstruction algorithms.
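As a concrete instance of penalized regression with a singular penalty, the sketch below solves an L1-penalized least-squares (LASSO) problem by coordinate descent. It is only an illustration of the kind of sparse reconstruction algorithm the abstract refers to, not the cavity-method analysis itself. The function names, the objective scaling (1/2)||y - Xw||^2 + lam*||w||_1, and the fixed number of sweeps are all assumptions made for this example.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values with |z| <= lam become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=200):
    """Minimize 0.5 * ||y - X w||^2 + lam * ||w||_1 (assumed objective scaling).

    X is a list of rows, y a list of measurements. Hypothetical helper,
    written for illustration with plain Python lists.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - X w, since w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]

    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that excludes w[j].
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n))
            w_new = soft_threshold(rho, lam) / col_sq[j]
            delta = w_new - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = w_new
    return w


# Orthonormal design: each coefficient is just the soft-thresholded measurement.
X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [3.0, 0.5, -2.0]
w = lasso_coordinate_descent(X, y, lam=1.0)
```

With an orthonormal design the penalty acts on each coefficient independently, so the small measurement is set exactly to zero while the large ones are shrunk. That exact zeroing is what makes the L1 penalty "singular" and useful for sparse retrieval.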
The packaging of chromatin within the nucleus of eukaryotic cells is achieved through several levels of spatial organization. The lowest levels give rise to nucleosomes and the 30-nm chromatin fiber while the higher levels involve folding of the chromatin fiber into chromosomes. These higher levels, often referred to as the higher order organization of chromatin, are still poorly understood but are actively being investigated through a new class of experiments known as chromatin conformation capture (3C), and its high-throughput derivative called Hi-C. These experiments detect contacts between different genomic loci, yielding contact probabilities (CPs) that may be used to elucidate the higher order organization of chromatin. Here we present a computational method for recovering chromatin conformation ensembles from reference CPs (Meluzzi D and Arya G. Nucleic Acids Research. 2013 41:63). The conformations are generated by simulating a bead-chain polymer model that represents the 30-nm chromatin fiber. Selected parameters of this polymer model are optimized iteratively until the CPs estimated from the conformation ensembles match the reference CPs. To minimize the size of ensembles required to reliably compute the CPs, we have developed a method that estimates CPs by fitting the extended generalized lambda distribution to simulated inter-bead distances. We show that our overall approach enables the recovery of conformation ensembles for genomic lengths on the order of 1 Mbp and that these ensembles can be used to investigate the shape and spatial properties of biologically relevant chromatin domains.
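To make the notion of a contact probability concrete, the sketch below estimates CPs from an ensemble of bead-chain conformations by simple counting: the CP of a bead pair is the fraction of conformations in which the two beads lie within a cutoff distance. This is a plain empirical estimator used here for illustration, not the extended generalized lambda distribution fit described in the abstract. The function name, the cutoff value, and the toy coordinates are assumptions.

```python
import math


def contact_probabilities(ensemble, cutoff):
    """Empirical contact probabilities from an ensemble of conformations.

    ensemble: list of conformations, each a list of (x, y, z) bead positions.
    cutoff:   distance at or below which two beads count as in contact
              (an assumed parameter; units match the coordinates).
    Returns a symmetric matrix cp where cp[i][j] is the fraction of
    conformations in which beads i and j are in contact.
    """
    n_beads = len(ensemble[0])
    counts = [[0] * n_beads for _ in range(n_beads)]
    for conformation in ensemble:
        for i in range(n_beads):
            for j in range(i + 1, n_beads):
                if math.dist(conformation[i], conformation[j]) <= cutoff:
                    counts[i][j] += 1
                    counts[j][i] += 1
    n_conf = len(ensemble)
    return [[c / n_conf for c in row] for row in counts]


# Toy ensemble: two conformations of a three-bead chain (made-up coordinates).
ensemble = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.5, 0.0, 0.0)],
]
cp = contact_probabilities(ensemble, cutoff=1.5)
```

A counting estimator like this needs many conformations before rare contacts are resolved reliably, which is the motivation the abstract gives for fitting a distribution to the inter-bead distances instead.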
In this note we examine the autoregressive generalization of the FNet algorithm, in which self-attention layers from the standard Transformer architecture are substituted with a trivial sparse-uniform sampling procedure based on Fourier transforms. Using the Wikitext-103 benchmark, we demonstrate that FNetAR retains state-of-the-art performance (25.8 ppl) on the task of causal language modeling compared to a Transformer-XL baseline (24.2 ppl) with only half the number of self-attention layers, thus providing further evidence for the superfluity of deep neural networks with heavily compounded attention mechanisms. The autoregressive Fourier transform could likely be used for parameter reduction on most Transformer-based time-series prediction models.