Francisco Guzmán scite author profile

This paper shows that pretraining multilingual language models at scale leads to significant performance gains for a wide range of cross-lingual transfer tasks. We train a Transformer-based masked language model on one hundred languages, using more than two terabytes of filtered CommonCrawl data. Our model, dubbed XLM-R, significantly outperforms multilingual BERT (mBERT) on a variety of cross-lingual benchmarks, including +14.6% average accuracy on XNLI, +13% average F1 score on MLQA, and +2.4% F1 score on NER. XLM-R performs particularly well on low-resource languages, improving 15.7% in XNLI accuracy for Swahili and 11.4% for Urdu over previous XLM models. We also present a detailed empirical analysis of the key factors that are required to achieve these gains, including the trade-offs between (1) positive transfer and capacity dilution and (2) the performance of high and low resource languages at scale. Finally, we show, for the first time, the possibility of multilingual modeling without sacrificing per-language performance; XLM-R is very competitive with strong monolingual models on the GLUE and XNLI benchmarks. We will make our code, data and models publicly available.

Unsupervised Cross-lingual Representation Learning at Scale

Conneau¹,

Khandelwal²,

Goyal³

et al. 2019

Preprint

252

392

The evolution of brand management thinking over the last 25 years as recorded in the Journal of Product and Brand Management

Veloutsou

Guzmán

2017

JPBM

170

184

A consumer-perceived consumer-based brand equity scale

2016

The FLORES Evaluation Datasets for Low-Resource Machine Translation: Nepali–English and Sinhala–English

Guzmán¹,

Chen²,

Ott³

et al. 2019

130

161

For machine translation, a vast majority of language pairs in the world are considered low-resource because they have little parallel data available. Besides the technical challenges of learning with limited supervision, it is difficult to evaluate methods trained on low-resource language pairs because of the lack of freely and publicly available benchmarks. In this work, we introduce the FLORES evaluation datasets for Nepali-English and Sinhala-English, based on sentences translated from Wikipedia. Compared to English, these are languages with very different morphology and syntax, for which little out-of-domain parallel data is available and for which relatively large amounts of monolingual data are freely available. We describe our process to collect and cross-check the quality of translations, and we report baseline performance using several learning settings: fully supervised, weakly supervised, semi-supervised, and fully unsupervised. Our experiments demonstrate that current state-of-the-art methods perform rather poorly on this benchmark, posing a challenge to the research community working on low-resource MT. Data and code to reproduce our experiments are available at https://github.com/facebookresearch/flores.

Co-creation of brand identities: consumer and industry influence and motivations

Kennedy¹,

Guzmán²

2016

JCM

Purpose: This paper aims to develop an understanding of the phenomena of co-creation and how this practice is used in shaping brand identities. This research provides answers to questions on both the consumer and industry sides of co-creation. Design/methodology/approach: Two studies are developed. First, a qualitative study is used to gain insight from key decision-makers with responsibility for a brand. Second, a study of millennial consumers is used to develop the antecedents of consumer motivations of co-creation of brand identities. Findings: When combined, the outcomes of these studies create a comprehensive framework that encompasses two models of brand identity co-creation. The qualitative study leads to the emergence of two major constructs, which, combined with the consumer study, lead to the development of two models that represent the antecedents of co-creation from a managerial and consumer perspective. Research limitations/implications: For Study one, a larger pool of respondents or different data collection method might have led to additional managerial insights. The study two sample was limited to millennials. Although this group of consumers is identified as highly engaged with brands, the study could have benefited from a more general consumer sample. Practical implications: The organization framework could help managers gain a deeper understanding for effectively co-creating their brand identities with all stakeholders, in particular consumers. Originality/value: This research contributes to theory and practice by analyzing the process of stakeholder brand identity co-creation.

Particulate Air Pollution in Mexico City: A Collaborative Research Project

Edgerton

Bian

Doran

et al. 1999

Journal of the Air & Waste Management Association

132

PM10, PM2.5, precursor gas, and upper-air meteorological measurements were taken in Mexico City, Mexico, from February 23 to March 22, 1997, to understand concentrations and chemical compositions of the city's particulate matter (PM). Average 24-hr PM10 concentrations over the period of study at the core sites in the city were 75 μg/m³. The 24-hr PM10 standard of 150 μg/m³ was exceeded for seven samples taken during the study period; the maximum 24-hr concentration measured was 542 μg/m³. Nearly half of the PM10 was composed of fugitive dust from roadways, construction, and bare land. About 50% of the PM10 consisted of PM2.5, with higher percentages during the morning hours. Organic and black carbon constituted up to half of the PM2.5. PM concentrations were highest during the early morning and after sunset, when the mixed layers were shallow. Meteorological measurements taken during the field campaign show that on most days air was transported out of the Mexico City basin during the afternoon with little day-to-day carryover.

How CSR reputation, sustainability signals, and country-of-origin sustainability reputation contribute to corporate brand performance: An exploratory study

Cowan

Guzmán

2020

Journal of Business Research

114

instance, Interbrand (2012) revealed that consumers undervalue brand sustainability efforts. Consequently, investing in sustainability and CSR initiatives may increase costs for a corporate brand without delivering the desired benefits (Sen & Bhattacharya, 2001). In other words, although consumers use corporate reputation as a signal globally (Swoboda, Puchert, & Morschett, 2016), low consumer awareness can reduce the effectiveness of reputation signals (e.g., CSR or sustainability) that brands use to enhance brand performance (Sen, Bhattacharya, & Korschun, 2006), especially in global markets (Sen & Bhattacharya, 2001; Sen et al., 2006). Furthermore, when consumers are unaware of corporate brand reputation, they rely on other signals, including corporate brand rankings, e.g., Fortune's "World Most Admired Companies" (Chabowski et al., 2011). However, the effectiveness of these signals can be affected by corporate brand country of origin (
