I am a sociolinguist: my research centers on how society and language are intertwined and how they affect one another.

I use a mixture of statistical/computational and qualitative methods to investigate sociolinguistic questions. Some view quantitative methods and critical qualitative methods as incompatible, but I think they complement each other very well! Quantitative methods are great for identifying broad patterns and not so great for understanding language in context; qualitative methods are great for understanding micro-level aspects of language in context but can be difficult to apply to large datasets. In my research I try to bridge the gap between these seemingly disparate approaches.

I’m always happy to answer questions about my research at [email protected]. I’d also be thrilled to talk about potential collaboration, especially if your interests overlap with mine, or if you’re a qualitative analyst who is quant-curious (or vice versa!).

Primary research interests

Dissertation: Personal narratives of restaurant service

I am currently working on my dissertation, which is a discourse analysis of narratives told by restaurant servers about restaurant service during COVID-19. Some of the questions I’m interested in are:

My first step is an analysis of narratives from the Reddit storytelling forum /r/TalesFromYourServer. Later, I plan to conduct some oral interviews to elicit narratives from people face-to-face (or monitor-to-monitor as the case may be).

For more information and some related references, you can download the slides from my thesis proposal defense.

Online climate discourse

My second qualifying paper (download) was a computational analysis of data from five different Reddit communities that focus on climate change: /r/climate, /r/climate_science, /r/ClimateOffensive, /r/collapse, and /r/climateskeptics. Perhaps unsurprisingly, the way these communities talk about climate change reflects their communal ideologies about it, and this can be detected computationally and analyzed qualitatively. For example, in a classification model, the words agenda and narrative, which characterize their referent as fictional or constructed, are distinctive of /r/climateskeptics (whose users tend to believe that climate change is either completely made up or not as bad as ‘alarmists’, i.e., mainstream scientists, journalists and activists, claim).

There’s a lot to talk about, but I think one of the main contributions of the paper is to show that, contrary to their popular image as anti-science deniers, climate-change “skeptics” do not dispute the value of science. What they actually claim is that prominent climate-science deniers are the real scientists and that actual climate scientists have abandoned the scientific method. They perform this ideology through their discourse: they use scientific jargon almost as much as the users of /r/climate_science! This shows that online skeptics are mirroring discourse strategies used by large fossil-fuel-industry-funded think tanks like the Heartland Institute (see this fascinating paper by Taylor-Neu 2020 for discussion and analysis of this).

This work was supervised by Barend Beekhuizen, the second reader was Atiqa Hachimi, and the third reader was Nathan Sanders.

Um and uh

For my MA degree paper (download), I conducted a study of um and uh in instant messaging (IM). Recent variationist work (e.g., Wieling et al., 2016) has identified a change in progress: um is rising in frequency relative to uh, which might be related to an emerging functional difference between the two. My goal was to explore that idea using IM data, in which the use of the words seems to be more intentional: while some would argue we “automatically” produce them in speech, using them in IM requires more conscious effort and thus may provide clues as to their discourse-pragmatic function. This work was supervised by Derek Denis, and the second reader was Sali Tagliamonte.

For my first PhD qualifying paper (download), I extended the findings of my MA research with an experiment testing readers’ and listeners’ social and interactional perceptions of speakers depending on whether they use um, uh, or neither. I found that while both um and uh were perceived as hesitant, um was perceived as more feminine, more polite, and more thoughtful than both uh and neither. I argue that, at least for my participants, uh indexes just plain hesitation, and um indexes a specific type of thoughtful and feminine hesitation. This work was supervised by Derek Denis; Jessamyn Schertz was the second reader, and Atiqa Hachimi was the third reader.

Derek and I have also analyzed um and uh in corpus data from Ontario farmers born in the late 1800s and early 1900s: a time before the rise of um. We find that during this time period, it’s uh that increases in frequency, not um; and it seems to have expanded its functional range, appearing more frequently in non-sentence-initial contexts. We speculate this may be due to a general increase in mid-sentence pause-filling, and that uh was recruited by speakers to fill pauses that may have previously gone unfilled.

Other projects

Speech rate normalization in Japanese

This is joint work with Yoonjung Kang. Using a series of online experiments, we’ve been investigating how Japanese speakers’ perceptions of short and long consonants and vowels differ depending on how fast the surrounding speech is (perceptual speech rate normalization), along with the many factors that affect this normalization.

Covariation and complexity in heritage languages

This is joint work with Naomi Nagy that builds on the efforts of dozens of current and former research assistants on the Heritage Language Variation and Change (HLVC) project.

Using the HLVC corpora, we’ve been investigating:

Projects I haven’t found time to write a little blurb for yet