2018 International Conference on Bangla Speech and Language Processing (ICBSLP) 2018
DOI: 10.1109/icbslp.2018.8554474

Pipilika N-Gram Viewer: An Efficient Large Scale N-Gram Model for Bengali

Cited by 7 publications (5 citation statements)
References 4 publications
“…Different studies recommend the number of words to use in a keyphrase (Ahmad et al, 2018; Gledec et al, 2019; Zammit et al, 2020). In this research, we wanted to understand how the size of a keyphrase impacts accuracy and performance.…”
Section: Proposed Solution (mentioning)
confidence: 99%
“…In text mining such keyphrases are also referred to as N‐grams (Ribeiro et al, 2017) and various research has been done to explore their use, including for non‐English languages (Ahmad et al, 2018; Gledec et al, 2019). A possible solution for an effective keyphrase assignment framework is to gather data and train a classification algorithm, but some authors outlined this as a disadvantage since it has a dependency on external documents (Gledec et al, 2019) and training requires time and effort.…”
Section: Introduction (mentioning)
confidence: 99%
“…There are multiple versions of ngram: 1-gram (unigram), 2-gram (bigram), 3-gram (trigram), etc. [15] Discussion and Analysis:…”
Section: N-gram (mentioning)
confidence: 99%
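The unigram, bigram and trigram distinction in the statement above can be illustrated with a minimal sketch. The helper name `ngrams`, the English sample sentence and the whitespace tokenisation are assumptions made for illustration only; none of them comes from the cited papers, and Bengali text would likely need a proper tokenizer.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sequence of k tokens contains k - n + 1 n-grams (none if k < n).
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Assumed sample input, split on whitespace for simplicity.
tokens = "the cat sat on the mat".split()

unigrams = ngrams(tokens, 1)  # 1-gram: single tokens
bigrams = ngrams(tokens, 2)   # 2-gram: adjacent pairs
trigrams = ngrams(tokens, 3)  # 3-gram: adjacent triples

# Counting n-grams is the usual first step towards an n-gram frequency table.
bigram_counts = Counter(ngrams("the cat and the cat".split(), 2))
```

Raising `n` shortens the list by one item per step, and an `n` larger than the token count simply yields an empty list.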
“…The out-of-vocabulary (OOV) words, that are not included in the speech/text corpus, are handled by our lexicon-free acoustic model without any modification. This text corpus is built by web scraping open-source websites, mainly Bangla newspaper portals [17] and Wikipedia pages. We use the KenLM language modeling toolkit to build the LM [18].…”
Section: Text Corpus (mentioning)
confidence: 99%