Chuang, Yu-Ying; Brown, Dunstan; Evans, Roger; Baayen, Harald. 2021. Using word vectors to understand Russian lexemes: defectiveness and case-number paradigms. Invited talk at the Linguistics Department colloquium, Stony Brook University, November 19, 2021.

  • Yu-Ying Chuang (Invited speaker)
  • Dunstan Brown (Invited speaker)
  • Roger Evans (Invited speaker)
  • Harald Baayen (Invited speaker)

Activity: Talk or presentation (Invited talk)


We report on ongoing work that takes another look at Russian nouns that are defective in the genitive plural. Rather than examining the form-based properties that contribute to the uncertainty of exponence in these nouns, we approach the problem from a new perspective: the distributional semantics of case and number, studied using word vectors and t-SNE. There are two motivations for considering defectiveness from this angle. The first is Sims’ (2015: 101) conjecture that defectiveness may be the preferred option when there is semantic incongruity; the second is the modelling work in Linear Discriminative Learning (Chuang and Baayen 2021), which seeks to understand the relationship between form and distributional semantics. Two measures can be used to compare defective nouns with non-defective nouns: i) cosine similarity ('angle'); ii) Euclidean distance ('distance'). The first of these has been found to be predictive of perceived semantic similarity. Interestingly, the meanings associated with the inflected variants of defective nouns are less similar to each other with respect to angle than the meanings of the inflected variants of non-defective nouns are. However, there also appears to be potential structure in the distances between the meanings of inflected forms, and non-defective and defective nouns show different patterns with respect to distance as well. Defective nouns are further away from their idealized case-number vectors and closer in semantic space to their idealized lexeme vectors. The current analysis therefore shows not only that angle matters, but also that distance is clearly relevant for case and number inflection in Russian, and hence for understanding defectiveness. The challenge for our research is to understand the linguistic and cognitive aspects of these distances.
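To make the difference between the two measures concrete, the following is a minimal sketch in Python. It is not the analysis used in the talk: the three-dimensional vectors are invented toy values, and the function names `cosine_similarity` and `euclidean_distance` are hypothetical helpers introduced only for illustration. It shows that two vectors can point in exactly the same direction (angle-based similarity of 1) while still being far apart in Euclidean terms, which is why the two measures can carry different information.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (the 'angle' measure)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between u and v (the 'distance' measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


# Toy vectors (assumed values, not data from the study).
a = [1.0, 2.0, 2.0]
b = [2.0, 4.0, 4.0]  # same direction as a, twice the length
c = [2.0, 1.0, 2.0]  # different direction, same length as a

cos_ab = cosine_similarity(a, b)
dist_ab = euclidean_distance(a, b)
cos_ac = cosine_similarity(a, c)
dist_ac = euclidean_distance(a, c)

print(f"a vs b: cosine={cos_ab:.3f}, distance={dist_ab:.3f}")
print(f"a vs c: cosine={cos_ac:.3f}, distance={dist_ac:.3f}")
```

In this toy case, `b` is more similar to `a` than `c` is by angle, yet `c` is closer to `a` than `b` is by distance, so the two measures rank the same pair of neighbours differently.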
Period: 19 Nov 2021
Held at: Department of Linguistics, Stony Brook University, United States
Degree of Recognition: International