In this report, we summarize the takeaways from the first NeurIPS 2021 NetHack Challenge. Participants were tasked with developing a program or agent that can win (i.e., 'ascend' in) the popular dungeon-crawler game NetHack by interacting with the NetHack Learning Environment (NLE), a scalable, procedurally generated, and challenging Gym environment for reinforcement learning (RL). The challenge showcased community-driven progress in AI, with many diverse approaches significantly beating the previous best results on NetHack. Furthermore, it served as a direct comparison between neural (e.g., deep RL), symbolic, and hybrid systems, demonstrating that on NetHack symbolic bots currently outperform deep RL by a large margin. Lastly, no agent came close to winning the game, illustrating NetHack's suitability as a long-term benchmark for AI research.
School of Computing, University of Eastern Finland
ABSTRACT
It is well known that for the speaker recognition task, gender-dependent acoustic modeling performs better than gender-independent modeling. The common practice is to use ground-truth gender labels to train gender-dependent models. However, such information is not necessarily available, especially if speakers are enrolled remotely. A way to overcome this is to use a gender classification system, which introduces an additional layer of uncertainty. To date, such uncertainty has not been studied. We implement two gender classifier systems and test them with two different corpora and speaker verification systems. We find that estimated gender information can improve speaker verification accuracy over gender-independent methods. Our detailed analysis suggests that gender estimation must be sufficiently accurate to yield improvements in speaker verification performance.
In video-based training, clinicians practice and advance their skills on surgeries performed by their colleagues and themselves. Although microsurgeries are recorded daily, training centers lack the workforce to manually annotate the segments important for practitioners, such as instrument presence and position. In this work, we propose intelligent instrument detection using a Convolutional Neural Network (CNN) to augment microsurgical training. The network was trained on real microsurgical practice videos for which human annotators manually gathered a large corpus of instrument positions. Under the challenging conditions of a highly magnified and often blurred view, the CNN was able to correctly detect a needle-holder (a dominant tool in suturing practice) with 78.3% accuracy (F-score = 0.84) at a recognition speed above 15 FPS. The result is promising in the emerging domain of augmented medical training, where instrument recognition benefits microsurgical training.
As automatic speaker verification (ASV) systems are vulnerable to spoofing attacks, they are typically used in conjunction with spoofing countermeasure (CM) systems to improve security. For example, the CM can first determine whether the input is human speech, then the ASV can determine whether this speech matches the speaker's identity. The performance of such a tandem system can be measured with a tandem detection cost function (t-DCF). However, ASV and CM systems are usually trained separately, using different metrics and data, which does not optimize their combined performance. In this work, we propose to optimize the tandem system directly by creating a differentiable version of the t-DCF and by employing techniques from reinforcement learning. The results indicate that these approaches offer better outcomes than fine-tuning, with our method providing a 20% relative improvement in the t-DCF on the ASVspoof 2019 dataset in a constrained setting.
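The core idea of making a detection cost differentiable can be illustrated with a small sketch. Detection costs such as the t-DCF are built from miss and false-alarm rates, which involve non-differentiable hard threshold decisions; replacing the indicator function with a steep sigmoid yields soft error counts that admit gradients. The function names, the `alpha` steepness parameter, and the simplification to a single detector are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def soft_error_rates(scores, labels, threshold, alpha=10.0):
    """Soft (differentiable) miss and false-alarm rates for one detector.

    The hard decision 1[score >= threshold] is replaced by
    sigmoid(alpha * (score - threshold)), so the error rates become
    smooth functions of the scores and gradients can flow back to
    the scoring model. alpha controls how closely the soft decision
    approximates the hard one.
    """
    p_accept = sigmoid(alpha * (scores - threshold))
    p_miss = np.mean((1.0 - p_accept)[labels == 1])  # targets rejected
    p_fa = np.mean(p_accept[labels == 0])            # non-targets accepted
    return p_miss, p_fa
```

In a full tandem setting, soft error rates for both the CM and the ASV subsystem would be combined with the t-DCF's cost and prior weights into a single differentiable objective.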
Mapping states to actions in deep reinforcement learning is mainly based on visual information. The commonly used approach for dealing with visual information is to extract pixels from images and use them as the state representation for the reinforcement learning agent. However, any vision-only agent is handicapped by being unable to sense audible cues. Using hearing, animals are able to sense targets that are outside of their visual range. In this work, we propose the use of audio as information complementary to vision in the state representation. We assess the impact of such a multi-modal setup on reach-the-goal tasks in the ViZDoom environment. Results show that the agent improves its behaviour when visual information is accompanied by audio features.
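The multi-modal state described above amounts to concatenating a visual representation with audio features into one vector that the agent's policy consumes. The sketch below, with assumed names and stand-in features (flattened pixels and pooled FFT magnitudes rather than a learned CNN encoder and mel spectrograms), shows the basic construction:

```python
import numpy as np

def build_state(frame, audio_chunk, n_bands=16):
    """Concatenate visual and audio features into a single state vector.

    frame: HxWxC uint8 image from the environment.
    audio_chunk: 1-D array of waveform samples for the current step.
    A real agent would typically pass the frame through a CNN encoder
    and compute mel-spectrogram features; flattened pixels and a
    band-pooled FFT magnitude serve as simple stand-ins here.
    """
    visual = (frame.astype(np.float32) / 255.0).ravel()
    spectrum = np.abs(np.fft.rfft(audio_chunk))
    # pool the spectrum into a fixed number of coarse frequency bands
    bands = np.array_split(spectrum, n_bands)
    audio = np.array([b.mean() for b in bands], dtype=np.float32)
    return np.concatenate([visual, audio])
```

The fixed-length result can be fed directly to any standard policy network, which is what makes simple concatenation an attractive first approach to multi-modal RL states.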