2013
DOI: 10.1215/00031283-2691424

Site-Restricted Web Searches for Data Collection in Regional Dialectology

Abstract: This paper presents a new method for data collection in regional dialectology based on site-restricted web searches. The method allows for the values of many lexical alternation variables to be measured across a region of interest using common search engines such as Google or Bing. The method involves estimating the proportions of the variants of a lexical alternation variable over a series of cities by counting the number of webpages that contain these variants on newspaper websites originating from these cit…
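To make the proportion step in the abstract concrete, here is a minimal sketch of how per-city variant proportions could be derived from page counts. It is an illustration under stated assumptions, not the authors' implementation: the function name `variant_proportions`, the city labels, and all counts are hypothetical, and the sketch does not query any search engine.

```python
def variant_proportions(hit_counts):
    """Convert page counts per variant into proportions, city by city.

    hit_counts maps a city name to a dict of {variant: page_count}.
    Cities with no hits for any variant map to None, because a
    proportion is undefined when the total count is zero.
    """
    proportions = {}
    for city, counts in hit_counts.items():
        total = sum(counts.values())
        if total == 0:
            proportions[city] = None
            continue
        proportions[city] = {variant: n / total for variant, n in counts.items()}
    return proportions


# Hypothetical page counts for one alternation variable (bag vs. sack).
hit_counts = {
    "City A": {"bag": 300, "sack": 100},
    "City B": {"bag": 50, "sack": 150},
    "City C": {"bag": 0, "sack": 0},
}

result = variant_proportions(hit_counts)
for city, props in result.items():
    print(city, props)
```

Returning `None` for a city with no hits keeps missing data distinct from a true proportion of zero, which matters if the values are later mapped or correlated.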

Citations: cited by 30 publications (6 citation statements)
References: 20 publications
“…There is, however, no standard method for bivariate map comparison in dialectology. Other than visually comparing dialect maps (e.g., Grieve et al, 2013), the simplest approach is to correlate the two maps by calculating a correlation coefficient (e.g., Pearson's r), essentially comparing the values of the two maps at every pair of locations. This was the approach taken in Grieve (2013), for example, where Pearson correlation coefficients were calculated to compare a small number of maps representing general regional patterns of grammatical and phonetic variation.…”
Section: Methods (mentioning)
confidence: 99%
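The map comparison described in this statement can be sketched as a Pearson correlation over the locations two maps share. This is a minimal illustration only: the function name `pearson_r`, the location labels, and the map values are hypothetical, and nothing here reproduces the cited studies' data or code.

```python
import math


def pearson_r(map_a, map_b):
    """Pearson's r between two maps, using only locations present in both.

    Each map is a dict of {location: value}, e.g. the proportion of one
    variant at that location.
    """
    shared = sorted(set(map_a) & set(map_b))
    if len(shared) < 2:
        raise ValueError("need at least two shared locations")

    xs = [map_a[loc] for loc in shared]
    ys = [map_b[loc] for loc in shared]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a map with constant values has no defined correlation")

    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for two maps over the same three locations.
map_a = {"City A": 0.1, "City B": 0.5, "City C": 0.9}
map_b = {"City A": 0.2, "City B": 0.6, "City C": 1.0}
map_c = {"City A": 0.9, "City B": 0.5, "City C": 0.1}

r_same = pearson_r(map_a, map_b)
r_opposite = pearson_r(map_a, map_c)
print(r_same, r_opposite)
```

Restricting the calculation to shared locations is one design choice among several; a study might instead interpolate missing locations before correlating.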
“…Despite the potential noise in the collected data, the site-restricted Web searches method was proved valid through an evaluation across the USA for lexical word alternation variables distribution attested by both this method and previous American English research. Site-restricted Web searches returned linguistic distribution results that were comparable to results obtained through traditional linguistic data collections (see details in Grieve et al, 2013). The validity of the method was therefore widely proven despite of the potential noise.…”
Section: Linguistic Data (mentioning)
confidence: 71%
“…The most efficient technique up to date to gather linguistic frequencies from online texts is site-restricted Web searches (Grieve et al, 2013). Starting from a list of suitable newspaper Web sites based in the geographical area to be investigated and from a list of lexical alternation variables formed by variants denoting different degrees of formality, the Google search engine was queried for the number of hits for each variant of the selected variables in the entire archive of each newspaper.…”
Section: Linguistic Data (mentioning)
confidence: 99%
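The querying step described in this statement pairs each variant with each newspaper site. A minimal sketch of how such query strings might be assembled, assuming the search engine's `site:` operator is used for the restriction; the function name `site_restricted_queries` and the domains are hypothetical placeholders, and the sketch builds strings only, without submitting them or reading hit counts.

```python
def site_restricted_queries(variants, domains):
    """Build one exact-phrase, site-restricted query per (domain, variant) pair.

    Quoting the variant requests an exact match; `site:` limits the
    results to pages on the given domain.
    """
    return [f'"{variant}" site:{domain}' for domain in domains for variant in variants]


# Hypothetical inputs: one alternation variable and two placeholder domains.
variants = ["bag", "sack"]
domains = ["example.com", "example.org"]

queries = site_restricted_queries(variants, domains)
for query in queries:
    print(query)
```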
“…Grieve et al (2011) then used this data set to investigate American regional lexical variation. Grieve et al (2014) also focused on data from the web and introduced site-restricted web searches to dialectometry. Their approach consisted of searching (e.g., via Google) the websites of local newspapers in the United States for the occurrences of certain lexical alternations (e.g., 'bag' and 'sack').…”
Section: Correcting Transcription Inconsistencies Dialectometrically (mentioning)
confidence: 99%