Justin Davidson

Associate Professor of Spanish and Romance Linguistics

Ph.D., University of Illinois at Urbana-Champaign, 2015. Spanish Linguistics, Romance Linguistics, SLATE (Second Language Acquisition and Teacher Education).

Research Expertise and Interest

Sociolinguistics, contact linguistics and language contact, language variation and change, Romance linguistics, quantitative methods (statistics, variable rule analyses for sociolinguistics, and computer software for statistics), sociohistorical linguistics, sociophonetics, bilingualism, Catalan, Spanish, dialectal diversification, foreign language pedagogy.

Research Profile

My main research agenda addresses language variation and language change in contact situations, specifically the empirical assessment of linguistic influence via language contact, and draws on a variety of linguistic frameworks and methodologies. In particular, I have explored bi-directional effects of language contact between Spanish and Catalan as manifested phonetically in the speech of the diverse community of Catalan-Spanish bilingual speakers in Barcelona and Valencia, Spain. I am interested in the dynamics of language use in bilingual speech communities, particularly as a consequence of the complex interplay between linguistic and social factors, and my research aims to account for why, and by what processes, certain linguistic features (and not others) propagate throughout the wider community of speakers. Central to this line of research is the pursuit of the best quantitative models in sociolinguistics, which has led to a vested interest in evaluating (and combining) various statistical toolkits, as well as an effort to help new R users become more accustomed to analyzing data with R (see the files at the bottom of this page!). I have also published on the diachronic development of diaspora varieties of Catalan within a framework of sociohistorical linguistics, as well as on the variable acquisition of Spanish inflectional morphology by U.S. heritage speakers and L2 learners, using empirical methodologies informed by the fields of second language acquisition and psycholinguistics.



Select Publications by Topic


Davidson_2022_On Catalan as a Minority Language: Barcelona Lateral Production

Fagyal_Davidson_2022_Sociophonetics in Romance Languages

Davidson_2021_Catalonian Spanish Fricatives

Davidson_2020_On the Gradient Nature of Lateral Categories

Davidson_2019_Covert and Overt Attitudes Toward Barcelonan Laterals and Fricatives



Davidson_2022_On (Not) Acquiring Sociolinguistic Stereotypes

Montrul_et al_2014_Heritage/L2 Gender Agreement Processing

Montrul_et al_2012_Heritage/L2 Gender Production via Diminutives



Davidson_2020_Directionality of Language Contact Effects in Barcelona and Valencia

Davidson_2019_Andean Spanish Fricative Voicing in Contact with Quechuan Varieties

Davidson_2010_Catalan Linguistic Drift


Active On-Site Project

As of Fall 2016, the Corpus of Bay Area Spanish (CBAS) is underway. Data collection, in the form of both formal (3-4 word phrase readings) and informal (casual interview) Spanish speech, is ongoing and responds to specific questions regarding Spanish-English language contact as manifested in the diverse population of Spanish speakers living in the Bay Area (approximately 25% of the total Bay Area population according to recent Census data). Though initial linguistic analysis will focus on acoustic (phonetic) elements of Bay Area Spanish from the perspectives of Variationist Sociolinguistics and Contact Linguistics, the creation of a formal and online-accessible Corpus will permit future analyses of multiple linguistic features (from phonology/phonetics to morphosyntax and the lexicon) from a diverse set of linguistic perspectives, including Second Language Acquisition and Heritage Language Acquisition.

For prospective and current undergraduate and graduate students, the CBAS project offers the opportunity to engage first-hand in corpus-based sociolinguistic research. Undergraduate and graduate students are openly invited to collaborate in participant recruitment and actual data collection (i.e., meet with participants and conduct interview sessions to record speech) along with ongoing data analysis, leading to possibilities for advanced (Hispanic) Linguistics undergraduate research in the form of a Senior Thesis, or, for graduate students in (Hispanic) Linguistics, opportunities for professional research and publications. Interested students should contact me via e-mail for an appointment to discuss CBAS collaboration.

Participants interested in contributing to the CBAS project should review the CBAS recruitment flyers below, and contact me at the e-mail address provided.






PDF File: R Tutorial for the Non-Coding-Inclined (version 4.0.3.t1)

R File: R Tutorial for the Non-Coding-Inclined (version 4.0.3.t1)


The files above contain R code and explanations for numerous kinds of quantitative analysis, as well as templates/examples. The files are current with respect to the most recent update to R, but will change as R and the packages within it are updated. Feel free to e-mail me if you believe any of the code is no longer valid.

As for the PDF version, the first 17 pages cover the minimal coding you’ll need, general tips and terminology, a flow-chart for deciding which test is appropriate for your data, the interpretation of ANOVA outputs vs. regression outputs, and a series of example R outputs with the corresponding tables/prose that one could create from them for a publication. Beyond those pages, each page is titled with the name of the test covered. Crucially, expectations for the data (i.e., what kinds of tests are suitable for which kind of data) appear below each title.

The R file version allows for perhaps easier copy-pasting, since users will not need to constantly click between R and their PDF-viewer program, but at the cost of color-coding. The R file differs from the PDF in only two other ways: it omits the examples of R outputs and their interpretation, and, since Chi-Squared in R is difficult to describe in pure prose, it omits this test (which is still present as the final page of the PDF file).


The only coding knowledge required relates to the independent variables included in a model, which is covered in the legend at the top of each test. For example, for models working with 4 IVs (GENDER as fixed, COUNTRY as fixed, VERB as fixed, and PARTICIPANT as random), the following “IVDump” notations are possible:

GENDER + COUNTRY                 (fixed effects model with 2 main effects)

GENDER * COUNTRY                 (fixed effects model with 2 main effects and their interaction)

GENDER + COUNTRY * VERB          (fixed effects model with 3 main effects and 1 interaction)

GENDER + (1|PARTICIPANT)         (mixed effects model with 1 main effect and 1 random intercept)

GENDER + (GENDER|PARTICIPANT)    (same as above but with the addition of a random slope)
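
As a rough illustration of how R reads the notations above, the sketch below asks base R to list the terms it derives from each fixed-effects notation. It is not taken from the tutorial files: the helper iv_terms, the dependent variable name RT, and the data frame name d are hypothetical, and the mention of the lme4 package for fitting mixed models is an assumption rather than something the tutorial is known to prescribe.

```r
# Hypothetical helper (not from the tutorial files): list the model terms
# that R derives from an "IVDump" notation written as a string.
iv_terms <- function(iv_dump) {
  # Build a one-sided formula from the notation, then read off its term labels.
  attr(terms(as.formula(paste("~", iv_dump))), "term.labels")
}

two_main   <- iv_terms("GENDER + COUNTRY")
crossed    <- iv_terms("GENDER * COUNTRY")
three_main <- iv_terms("GENDER + COUNTRY * VERB")

print(two_main)    # "GENDER" "COUNTRY"
print(crossed)     # "GENDER" "COUNTRY" "GENDER:COUNTRY"
print(three_main)  # "GENDER" "COUNTRY" "VERB" "COUNTRY:VERB"

# Mixed-effects notations are written the same way on the right-hand side of
# a formula. RT is a hypothetical dependent variable. Fitting such a model
# requires a mixed-modeling package; lme4 is assumed here, e.g.
#   lme4::lmer(random_intercept, data = d)   # d is a hypothetical data frame
random_intercept <- RT ~ GENDER + (1 | PARTICIPANT)
random_slope     <- RT ~ GENDER + (GENDER | PARTICIPANT)
```

Note that `*` is shorthand: R expands GENDER * COUNTRY into both main effects plus the GENDER:COUNTRY interaction, which is why the third notation yields three main effects but only the single COUNTRY:VERB interaction.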


Finally, below is an Excel spreadsheet with templates to show how different kinds of data are to be organized for analysis with R, as well as example data to practice analyses with.


Templates & Example Datasets for R