In this work we present a flexible, probabilistic, and reference-free method of error correction for high-throughput DNA sequencing data. The key is to exploit the high coverage of sequencing data and model short sequence outputs (reads) as independent realizations of a Hidden Markov Model (HMM). We pose the problem of error correction of reads as one of maximum-likelihood sequence detection over this HMM. While time and memory considerations rule out an implementation of the optimal Baum-Welch algorithm (for parameter estimation) and the optimal Viterbi algorithm (for error correction), we propose low-complexity approximate versions of both. Specifically, we propose an approximate Viterbi algorithm and an algorithm based on sequential decoding for error correction. Our results show that, when compared with Reptile, a state-of-the-art error correction method, our methods consistently achieve superior performance on both simulated and real data sets.
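To make the decoding step concrete, the following is a minimal sketch of exact Viterbi decoding over a generic discrete HMM. The transition matrix `A`, emission matrix `B`, initial distribution `pi`, and the toy numbers are illustrative placeholders, not the read model or the low-complexity approximation proposed in the paper.

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely hidden-state path for a discrete HMM (log domain).

    obs : sequence of observation indices
    pi  : (S,) initial state probabilities
    A   : (S, S) transitions, A[i, j] = P(state j | state i)
    B   : (S, O) emissions,   B[i, k] = P(obs k | state i)
    """
    S, T = len(pi), len(obs)
    logA, logB = np.log(A), np.log(B)
    delta = np.log(pi) + logB[:, obs[0]]   # best log-prob ending in each state
    psi = np.zeros((T, S), dtype=int)      # back-pointers
    for t in range(1, T):
        scores = delta[:, None] + logA     # (S, S): previous state -> next state
        psi[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + logB[:, obs[t]]
    path = [int(delta.argmax())]           # backtrack from the best final state
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1]

# Toy usage with two hidden states and four symbols (illustrative numbers only).
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.7, 0.1, 0.1, 0.1],
              [0.1, 0.1, 0.1, 0.7]])
pi = np.array([0.5, 0.5])
print(viterbi([0, 0, 3, 3], pi, A, B))
```

The exact dynamic program above costs O(T·S²) time, which is what motivates the paper's low-complexity approximate versions.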
The impact of softmax on the value function itself in reinforcement learning (RL) is often viewed as problematic, because it leads to sub-optimal value (or Q) functions and interferes with the contraction properties of the Bellman operator. Surprisingly, despite these concerns, and independent of its effect on exploration, the softmax Bellman operator, when combined with Deep Q-learning, leads to Q-functions with superior policies in practice, even outperforming its double Q-learning counterpart. To better understand how and why this occurs, we revisit theoretical properties of the softmax Bellman operator, and prove that (i) it converges to the standard Bellman operator exponentially fast in the inverse temperature parameter, and (ii) the distance of its Q-function from the optimal one can be bounded. These alone do not explain its superior performance, so we also show that the softmax operator can reduce the overestimation error, which may give some insight into why a sub-optimal operator leads to better performance in the presence of value function approximation. A comparison among different Bellman operators is then presented, showing the trade-offs involved when selecting among them.
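As a rough illustration of the comparison being made, here is a sketch of a softmax backup versus the standard hard-max backup for a single transition; `beta` is the inverse temperature, and all numeric values are made up for the example.

```python
import numpy as np

def softmax_backup(q_next, beta):
    """Softmax-weighted next-state value: sum_a softmax(beta * q)_a * q_a.

    beta is the inverse temperature; as beta -> inf this approaches the
    hard max used by the standard Bellman operator.
    """
    z = q_next - q_next.max()          # shift for numerical stability
    w = np.exp(beta * z)
    w /= w.sum()
    return float(w @ q_next)

# One-transition target r + gamma * V(s') under each operator.
# The action values and constants below are illustrative only.
q_next = np.array([1.0, 0.9, 0.2])
r, gamma = 0.5, 0.99
for beta in (1.0, 5.0, 50.0):
    print(f"beta={beta}: target={r + gamma * softmax_backup(q_next, beta):.4f}")
print(f"hard max: target={r + gamma * q_next.max():.4f}")
```

As `beta` grows, the softmax backup approaches the hard max, consistent with the exponential convergence result stated above; at small `beta` it pulls the target below the max, which is one way an operator can counteract overestimation.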
In efforts to resolve social dilemmas, reinforcement learning offers an alternative to imitation and exploration in evolutionary game theory. While imitation and exploration rely on the performance of neighbors, in reinforcement learning individuals alter their strategies based on their own past performance. For example, according to the Bush–Mosteller model of reinforcement learning, an individual’s strategy choice is driven by whether or not the received payoff satisfies a preset aspiration. Stimuli thus play a key role in reinforcement learning, in that they determine whether a strategy should be kept. Here we use the Monte Carlo method to study pattern formation and phase transitions towards cooperation in social dilemmas driven by reinforcement learning. We distinguish local and global players according to the source of the stimulus they experience: while global players receive their stimuli from the whole neighborhood, local players focus solely on their individual performance. We show that global players play a decisive role in ensuring cooperation, whereas local players fail in this regard, although both types of players exhibit properties of ‘moody cooperators’. In particular, global players evoke stronger conditional cooperation in their neighborhoods based on direct reciprocity, which is rooted in the emerging spatial patterns and the stronger interfaces around cooperative clusters.
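For reference, a minimal sketch of one common textbook form of the Bush–Mosteller update follows; the clipping convention, parameter names, and numeric values are assumptions for illustration and may differ from the exact variant used in the study.

```python
def bush_mosteller_update(p_repeat, payoff, aspiration, learning_rate, payoff_scale):
    """One Bush-Mosteller step: adjust the probability of repeating the
    action just played, based on whether the payoff met the aspiration.

    This is one common textbook form; the paper's variant may differ.
    """
    # Stimulus in [-1, 1]: positive iff the payoff satisfied the aspiration.
    stimulus = max(-1.0, min(1.0, (payoff - aspiration) / payoff_scale))
    if stimulus >= 0:
        # Satisfied: reinforce the action, moving p towards 1.
        return p_repeat + (1.0 - p_repeat) * learning_rate * stimulus
    # Dissatisfied: weaken the action, moving p towards 0.
    return p_repeat + p_repeat * learning_rate * stimulus

# Illustrative: a payoff above the aspiration reinforces, below weakens.
p = 0.5
print(bush_mosteller_update(p, payoff=1.0, aspiration=0.5,
                            learning_rate=0.5, payoff_scale=1.0))  # > 0.5
print(bush_mosteller_update(p, payoff=0.0, aspiration=0.5,
                            learning_rate=0.5, payoff_scale=1.0))  # < 0.5
```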
The progression of game theory from classical to evolutionary and spatial games has provided a powerful means to study cooperation and enabled a better understanding of general cooperation-promoting mechanisms. However, current standard models assume that at any given point players must choose either cooperation or defection, meaning that regardless of the spatial structure in which they exist, they cannot differentiate between their neighbours and adjust their behaviour accordingly. This is at odds with interactions among organisms in nature, which are well capable of behaving differently towards different members of their communities. We account for this natural fact by introducing a new type of player, dubbed the link player, who can adjust its behaviour towards each individual neighbour. This is in contrast to the more common node players, whose behaviour affects all neighbours in the same way. We proceed to study cooperation in pure and mixed populations, showing that cooperation peaks at moderately low densities of link players. In such conditions, players naturally specialize in different roles: node players tend to be either cooperators or defectors, while link players form social insulation between cooperative and defecting clusters by acting as both cooperators and defectors. Such fairly complex processes emerging from a simple model reflect some of the complexities observed in experimental studies on social behaviour in microbes and pave the way for the development of richer game models.
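To illustrate the node-player/link-player distinction, here is a minimal sketch in Python; the payoff values, class layout, and usage are illustrative assumptions rather than the paper's exact model.

```python
# Prisoner's-dilemma-style payoffs for (my action, neighbour's action).
# These numbers are placeholders, not the parameters used in the paper.
PAYOFF = {("C", "C"): 1.0, ("C", "D"): 0.0,   # reward / sucker's payoff
          ("D", "C"): 1.5, ("D", "D"): 0.1}   # temptation / punishment

class NodePlayer:
    """Plays one strategy towards every neighbour."""
    def __init__(self, strategy):
        self.strategy = strategy              # "C" or "D"
    def action_towards(self, neighbour):
        return self.strategy                  # same action on every link

class LinkPlayer:
    """Can play a different strategy on each link."""
    def __init__(self, strategies=None):
        self.strategies = strategies or {}    # dict: neighbour -> "C" or "D"
    def action_towards(self, neighbour):
        return self.strategies[neighbour]

def total_payoff(player, neighbours):
    """Accumulated payoff of `player` over all of its links."""
    return sum(PAYOFF[(player.action_towards(n), n.action_towards(player))]
               for n in neighbours)

# Toy usage: a link player cooperating with one neighbour, defecting on another.
a, b = NodePlayer("C"), NodePlayer("D")
lp = LinkPlayer()
lp.strategies = {a: "C", b: "D"}
print(total_payoff(lp, [a, b]))   # (C vs C) + (D vs D) = 1.0 + 0.1 = 1.1
```

The key design point is that `action_towards` is resolved per edge for link players but is constant for node players, which is exactly the behavioural asymmetry described above.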