Joey Öhman scite author profile

Joey Öhman

3Publications

25Citation Statements Received

47Citation Statements Given

How they've been cited

How they cite others

Affiliations

Publications

Order By: Most citations

Compiling Universal Probabilistic Programming Languages with Efficient Parallel Sequential Monte Carlo Inference

Lundén

Öhman²,

Kudlicka

et al. 2022

View full text Add to dashboard Cite

Probabilistic programming languages (PPLs) allow users to encode arbitrary inference problems, and PPL implementations provide general-purpose automatic inference for these problems. However, constructing inference implementations that are efficient enough is challenging for many real-world problems. Often, this is due to PPLs not fully exploiting available parallelization and optimization opportunities. For example, handling probabilistic checkpoints in PPLs through continuation-passing style transformations or non-preemptive multitasking—as is done in many popular PPLs—often disallows compilation to low-level languages required for high-performance platforms such as GPUs. To solve the checkpoint problem, we introduce the concept of PPL control-flow graphs (PCFGs)—a simple and efficient approach to checkpoints in low-level languages. We use this approach to implement RootPPL: a low-level PPL built on CUDA and C++ with OpenMP, providing highly efficient and massively parallel SMC inference. We also introduce a general method of compiling universal high-level PPLs to PCFGs and illustrate its application when compiling Miking CorePPL—a high-level universal PPL—to RootPPL. The approach is the first to compile a universal PPL to GPUs with SMC inference. We evaluate RootPPL and the CorePPL compiler through a set of real-world experiments in the domains of phylogenetics and epidemiology, demonstrating up to 6$$\times $$ × speedups over state-of-the-art PPLs implementing SMC inference.

show abstract

Fine-Grained Controllable Text Generation Using Non-Residual Prompting

Carlsson¹,

Öhman²,

Liu³

et al. 2022

View full text Add to dashboard Cite

DaLAJ-GED - a dataset for Grammatical Error Detection tasks on Swedish

Volodina¹,

Mohammed²,

Berdičevskis³

et al. 2023

View full text Add to dashboard Cite

DaLAJ-GED is a dataset for linguistic acceptability judgments for Swedish, covering five head classes: lexical, morphological, syntactical, orthographical and punctuation. DaLAJ-GED is an extension of DaLAJ.v1 dataset (Volodina et al., 2021a,b). Both DaLAJ datasets are based on the SweLL-gold corpus (Volodina et al., 2019) and its correction annotation categories.DaLAJ-GED presented here contains 44,654 sentences, distributed (almost) equally between correct and incorrect ones and is primarily aimed at linguistic acceptability judgment task, but can also be used for other tasks related to grammatical error detection (GED) on a sentence level. DaLAJ-GED is included into the Swedish SuperLim 2.0 collection, 1 an extension of SuperLim (Adesam et al., 2020), a benchmark for Natural Language Understanding (NLU) tasks for Swedish. This paper gives a concise overview of the dataset and presents a few benchmark results for the task of linguistic acceptability, i.e. binary classification of sentences as either correct or incorrect.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Joey Öhman

Compiling Universal Probabilistic Programming Languages with Efficient Parallel Sequential Monte Carlo Inference

Fine-Grained Controllable Text Generation Using Non-Residual Prompting

DaLAJ-GED - a dataset for Grammatical Error Detection tasks on Swedish

Contact Info

Product

Resources

About