Language revitalization theory suggests that one way to improve the health of a language is to increase the number of domains where the language is used. Social network platforms provide a variety of domains where indigenous-language communities are able to communicate in their own languages. Although the capability exists, is social networking being used by indigenous-language communities? This paper reports on one particular social networking platform, Twitter, by using two separate methodologies. First, Twitter statistics collated from the Indigenous Tweets website are analysed. The data show that languages such as Basque, Haitian Creole, Welsh, Irish Gaelic, Frisian and Kapampangan do have a presence in the "Twittersphere". Further analysis for te reo Māori (the Māori language) shows that tweets in te reo Māori are rising and peak when certain events occur. The second methodology involved gathering empirical data by tweeting in te reo Māori. This served two purposes: it allowed an ancillary check on the validity of the Indigenous Tweets data and it allowed the opportunity to determine if the number of indigenous-language tweets could be influenced by the actions of one tweeter.
The digital era has transformed how people live their lives and interact with the world and knowledge systems around them. In Aotearoa/New Zealand a range of initiatives incorporating Indigenous knowledge have been implemented to collect, catalog, maintain, and organize digital objects. In this article, we report on the ethics, processes, and procedures associated with the digitization of the manuscripts, works, and collected taonga (treasures) of the late Dr. Pei Te Hurinui Jones, and describe how the collection was transformed into a digital library. It discusses the decision-making processes
Māori loanwords are widely used in New Zealand English for various social functions by New Zealanders within and outside of the Māori community. Motivated by the lack of linguistic resources for studying how Māori loanwords are used in social media, we present a new corpus of New Zealand English tweets. We collected tweets containing selected Māori words that are likely to be known by New Zealanders who do not speak Māori. Since over 30% of these words turned out to be irrelevant (e.g., mana is a popular gaming term, Moana is a character from a Disney movie), we manually annotated a sample of our tweets into relevant and irrelevant categories. This data was used to train machine learning models to automatically filter out irrelevant tweets.
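The filtering step described above can be illustrated with a minimal sketch. The abstract does not say which machine learning models were trained, so the multinomial Naive Bayes classifier below is an assumption chosen for brevity, and the class name `NaiveBayesFilter`, the tokenizer, and the six example tweets are all hypothetical stand-ins rather than material from the corpus.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep letter runs, including macron vowels used in te reo Māori.
    return re.findall(r"[a-zāēīōū#@']+", text.lower())


class NaiveBayesFilter:
    """Hypothetical multinomial Naive Bayes with add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training tweets
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.doc_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            for tok in tokens:
                score += math.log((counts[tok] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy examples standing in for a manually annotated sample.
texts = [
    "the mana of our tupuna lives on in this marae",
    "huge respect for her mana and leadership in the community",
    "swimming at the moana with the whanau today",
    "out of mana again, need a potion before the boss fight",
    "my mage deck burns too much mana every turn",
    "watching moana the disney movie with the kids tonight",
]
labels = ["relevant"] * 3 + ["irrelevant"] * 3

model = NaiveBayesFilter().fit(texts, labels)
print(model.predict("need more mana for the boss fight"))
print(model.predict("the mana of the marae and our community"))
```

With only six training tweets the predictions are fragile; a realistic filter would need a much larger annotated sample and a held-out evaluation set.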
Twitter constitutes a rich resource for investigating language contact phenomena. In this paper, we report findings from the analysis of a large-scale diachronic corpus of over one million tweets, containing loanwords from te reo Māori, the indigenous language spoken in New Zealand, into (primarily, New Zealand) English. Our analysis focuses on hashtags comprising mixed-language resources (which we term hybrid hashtags), bringing together descriptive linguistic tools (investigating length, word class, and semantic domains of the hashtags) and quantitative methods (Random Forests and regression analysis). Our work has implications for language change and the study of loanwords (we argue that hybrid hashtags can be linked to loanword entrenchment), and for the study of language on social media (we challenge proposals of hashtags as "words," and show that hashtags have a dual discourse role: a micro-function within the immediate linguistic context in which they occur and a macro-function within the tweet as a whole).
Digital libraries have a pivotal role to play in the preservation and maintenance of international cultures in general and minority languages in particular. This paper outlines a software tool for building digital libraries that is well adapted for creating and distributing local information collections in minority languages, and describes some contexts in which it is used. The system can make multilingual documents available in structured collections, and allows them to be accessed via multilingual interfaces. It is issued under a free open source license, which encourages participatory design of the software, and an end-user interface allows community-based localization of the various language interfaces, of which there are many.