Conlang Project:
Constructed languages (conlangs) are languages that are deliberately designed for specific
purposes (for TV shows, for books, or as a hobby). As a result, they are not subject
to the same evolutionary and large-scale communicative pressures as natural languages.
We use information theory and Bayesian probability models to investigate whether this difference
in development is computationally detectable. The result would have implications for our
understanding of how deeply ingrained evolutionary properties are in human language production.
Decipherment Project:
Deciphering texts written in an unknown language and an unknown script is a notoriously
difficult task, even with advances in computational decipherment techniques.
We investigate an information-theoretic approach to predicting syllable boundaries
that looks for “peaks” in informativity and does not depend on language-specific rules
or knowledge. Using Bayesian and neural network models, we test the viability of this
approach across multiple languages. As a tool, this method can be used to syllabify
texts in unknown languages and scripts and to help recover the phonotactics of
undeciphered languages.
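
As a rough illustration of the peak-finding idea, here is a minimal Python sketch. It is not the project's actual model: a character bigram model with add-one smoothing stands in for the Bayesian and neural network models, "informativity" is approximated by surprisal (negative log probability of a symbol given the previous one), and a boundary is proposed before any symbol whose surprisal is a strict local maximum. The function names (train_bigram, peak_boundaries, segment) and the toy corpus are hypothetical and chosen only for this example.

```python
import math
from collections import Counter


def train_bigram(corpus, alpha=1.0):
    """Fit a character bigram model on a non-empty list of words.

    Returns a function surprisal(prev, cur) = -log2 P(cur | prev),
    using add-alpha smoothing (an assumption of this sketch).
    """
    vocab = set("".join(corpus))
    pair_counts = Counter()
    prev_counts = Counter()
    for word in corpus:
        padded = "#" + word  # "#" marks the start of a word
        for prev, cur in zip(padded, padded[1:]):
            pair_counts[(prev, cur)] += 1
            prev_counts[prev] += 1
    v = max(len(vocab), 1)

    def surprisal(prev, cur):
        p = (pair_counts[(prev, cur)] + alpha) / (prev_counts[prev] + alpha * v)
        return -math.log2(p)

    return surprisal


def peak_boundaries(word, surprisal):
    """Return indices i such that a boundary is proposed before word[i]."""
    padded = "#" + word
    # s[i] is the surprisal of word[i] given the symbol before it.
    s = [surprisal(prev, cur) for prev, cur in zip(padded, padded[1:])]
    # A peak is a strict local maximum; the first and last symbols are skipped.
    return [i for i in range(1, len(s) - 1) if s[i - 1] < s[i] > s[i + 1]]


def segment(word, surprisal):
    """Split a word at the proposed boundaries."""
    cuts = [0] + peak_boundaries(word, surprisal) + [len(word)]
    return [word[a:b] for a, b in zip(cuts, cuts[1:])]


# Toy corpus: "a" after "b" is rare, so it carries high surprisal.
model = train_bigram(["ab", "ab", "ab", "abab"])
print(segment("abab", model))  # ['ab', 'ab']
```

Nothing in the sketch refers to language-specific rules: the only input is raw symbol sequences, which is why the same procedure can in principle be run on text in an unknown script. The strict local-maximum criterion is one simple choice among several; a threshold on the rise in surprisal would be another.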