2022
DOI: 10.48550/arxiv.2202.11929
Preprint

Word Segmentation on Discovered Phone Units with Dynamic Programming and Self-Supervised Scoring

Abstract: Recent work on unsupervised speech segmentation has used self-supervised models with a phone segmentation module and a word segmentation module that are trained jointly. This paper compares this joint methodology with an older idea: bottom-up phone-like unit discovery is performed first, and symbolic word segmentation is then performed on top of the discovered units (without influencing the lower level). I specifically describe a duration-penalized dynamic programming (DPDP) procedure that can be used for eith…
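The duration-penalized dynamic programming named in the abstract can be illustrated with a minimal sketch: a DP over candidate boundaries in which each segment pays a scoring cost plus a per-segment duration penalty, so that fewer, longer segments are preferred when their scores allow it. The names (`dpdp_segment`, `segment_cost`) and the use of a constant per-segment penalty are illustrative assumptions, not the paper's implementation; in the paper the segment score would come from a self-supervised model.

```python
def dpdp_segment(frames, segment_cost, dur_penalty=1.0, max_len=10):
    """Sketch of duration-penalized DP segmentation.

    `segment_cost` stands in for a self-supervised scoring function
    (e.g. a reconstruction error over the segment); `dur_penalty` is a
    fixed cost per segment that discourages over-segmentation.
    Returns the segment end indices of the minimum-cost segmentation.
    """
    n = len(frames)
    best = [0.0] + [float("inf")] * n  # best[t]: min cost of segmenting frames[:t]
    back = [0] * (n + 1)               # backpointer to the previous boundary
    for t in range(1, n + 1):
        for s in range(max(0, t - max_len), t):
            cost = best[s] + segment_cost(frames[s:t]) + dur_penalty
            if cost < best[t]:
                best[t], back[t] = cost, s
    # Recover boundaries by walking backpointers from the end.
    bounds, t = [], n
    while t > 0:
        bounds.append(t)
        t = back[t]
    return sorted(bounds)
```

With a toy cost (within-segment variance), `dpdp_segment([0, 0, 0, 5, 5, 5], var_cost, dur_penalty=0.5)` splits the sequence at the change point, since one mixed segment would incur a large variance cost that outweighs the saved penalty.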


Cited by 4 publications (4 citation statements)
References 43 publications (103 reference statements)
“…Here, we took the system described in Algayres et al (2022) out of the box, and we showed good performance in speech segmentation compared to the state of the art, but there was still a large margin of improvement compared to text-based systems. A recent unpublished paper (Kamper, 2022) based on the non-lexical principle came to our attention and showed similar or slightly better results than ours on a subset of the ZR17 languages. Kamper (2022) also uses a segmentation lattice that resembles ours for inference.…”
Section: Conclusion and Open Questions (supporting)
confidence: 76%
“…Strictly speaking, new words refer to the types of words that appear first or are used with new meanings. When dealing with texts, a critical problem lies in the "word segmentation" phase, and almost all subsequent results rely on the first segmentation step (Kamper, 2022). Therefore, the accuracy of word segmentations significantly affects the subsequent processing.…”
Section: The Results Of Discovering New Words (mentioning)
confidence: 99%
“…Further, a language model is utilized with beam search to decode the outputs of the acoustic model. Interestingly, the discrete representations enable the unsupervised discovery of acoustic units where phonemes are automatically mapped to a small set of discrete representations, enabling phoneme discovery and segmentation [54][55][56][57]. This resulting property of automatic discovery of ground truth phonemes is of particular interest, as we hypothesize that it allows us to derive the atomic units of human movements from wearable sensor data by learning a mapping of discrete representations to spans of sensor data.…”
Section: Discrete Representations Learning In Other Domains (mentioning)
confidence: 99%