2021
DOI: 10.1017/s0954394521000041

Lects in Helsinki Finnish - a probabilistic component modeling approach

Abstract: This article examines Finnish lects spoken in Helsinki from the 1970s to the 2010s with a probabilistic model called Latent Dirichlet Allocation. The model searches for underlying components based on the linguistic features used in the interviews. Several coherent lects were discovered as components in the data, which counters the results of previous studies that report only weak covariation between features that are assumed to be present in the same lect. The speakers, however, are not categorical in their li…

citations
Cited by 1 publication
(2 citation statements)
references
References 57 publications
(79 reference statements)
“…The latter would anyway be hard to detect in the examined dataset of US-American tweets. Kuparinen et al (2021) used a Latent Dirichlet Allocation model (see Section 3.3) to discover lects in Helsinki Finnish. Although the aim of the study was similar to the current one, the data was differently pre-processed.…”
Section: Topic Models and Dimensionality Reduction In Dialectometry
Citation type: mentioning
confidence: 99%
“…Secondly, the dialectologically meaningful features are tied to the words, which means we are not actually calculating the frequency of the variants themselves (cf. Kuparinen et al, 2021), but the combinations of words and variants. If we modify the example from before, the occurrences talosa "in a house," koulusa "in a school," and kirkosa "in a church" would all end up as different tokens in the corpus, although they all have the same dialectal variant -sa of the inessive case.…”
Section: Applying Topic Models To Phonetically Transcribed Dialect Co...
Citation type: mentioning
confidence: 99%
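
The distinction drawn in the second citation statement, between counting word-plus-variant combinations and counting the variants themselves, can be illustrated with a small sketch. It uses only the three word forms quoted above; the helper `variant_of` and its suffix list are hypothetical and are not taken from either paper.

```python
from collections import Counter

# Toy corpus: the three inessive forms quoted in the citation statement.
tokens = ["talosa", "koulusa", "kirkosa"]

# Word-level counting: each word form is its own token type,
# so the shared dialectal ending is spread over three entries.
word_counts = Counter(tokens)


def variant_of(token, suffixes=("ssa", "sa")):
    """Return the inessive ending of a token, or None if no ending matches.

    The suffix list is an assumption for illustration: "-ssa" is the
    standard ending and "-sa" the dialectal variant. Longer suffixes
    are checked first so "talossa" is not misread as "-sa".
    """
    for suffix in suffixes:
        if token.endswith(suffix):
            return "-" + suffix
    return None


# Variant-level counting: only the ending is counted,
# so all three words contribute to the same entry.
variant_counts = Counter(variant_of(t) for t in tokens)

print(len(word_counts))       # number of distinct word-level token types
print(variant_counts["-sa"])  # occurrences of the dialectal variant
```

Under word-level counting the three forms yield three separate types, while variant-level counting groups them under a single `-sa` entry. Real data would need proper morphological analysis rather than suffix matching.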