University of North Texas

Conlang Project:
Constructed languages (conlangs) are languages that people deliberately create for a particular purpose, such as a TV show, a book, or a hobby. As a result, they are not subject to the same evolutionary and large-scale communicative pressures as natural languages. We use information theory and Bayesian probability models to investigate whether this difference in development is computationally detectable. The findings would have implications for our understanding of how deeply evolutionary properties are ingrained in human language production.

Decipherment Project:
Deciphering texts written in an unknown language and an unknown script is notoriously difficult, even with advances in computational decipherment techniques. We investigate an information-theoretic approach to predicting syllable boundaries that looks for “peaks” in informativity, which do not depend on language-specific rules or knowledge. Using Bayesian and neural network models, we test the viability of this approach across multiple languages. As a tool, this method can be used to syllabify texts in unknown languages and scripts and to help recover the phonotactics of undeciphered languages.
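To make the idea of informativity "peaks" concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the project's actual method: it treats informativity as per-character surprisal, estimates it with a simple add-alpha smoothed character bigram model (a stand-in for the Bayesian and neural network models mentioned above), and places a candidate boundary before every strict local maximum. All function names are hypothetical.

```python
import math
from collections import Counter


def train_bigram(words, alpha=1.0):
    """Return a surprisal function from an add-alpha smoothed character bigram model.

    Assumption: a bigram model stands in for the richer models the project uses.
    """
    bigrams, contexts, alphabet = Counter(), Counter(), set()
    for word in words:
        padded = "#" + word  # '#' marks the start of a word
        alphabet.update(word)
        for prev, cur in zip(padded, padded[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    vocab_size = len(alphabet)

    def surprisal(prev, cur):
        # Surprisal in bits: -log2 P(cur | prev)
        p = (bigrams[(prev, cur)] + alpha) / (contexts[prev] + alpha * vocab_size)
        return -math.log2(p)

    return surprisal


def surprisal_profile(word, surprisal):
    """One surprisal value per character of the word, given the preceding character."""
    padded = "#" + word
    return [surprisal(prev, cur) for prev, cur in zip(padded, padded[1:])]


def peak_boundaries(profile):
    """Indices of strict local maxima, excluding the first and last positions."""
    return [
        i
        for i in range(1, len(profile) - 1)
        if profile[i] > profile[i - 1] and profile[i] > profile[i + 1]
    ]


def segment(word, surprisal):
    """Split a word before each character whose surprisal is a local peak."""
    cuts = peak_boundaries(surprisal_profile(word, surprisal))
    pieces, start = [], 0
    for cut in cuts:
        pieces.append(word[start:cut])
        start = cut
    pieces.append(word[start:])
    return pieces


if __name__ == "__main__":
    # Toy training data, invented purely for illustration.
    model = train_bigram(["banana", "bandana", "nabana", "dabana"])
    print(segment("banadana", model))
```

The sketch relies only on character statistics from the text itself, which is what makes this kind of approach attractive for unknown scripts. Whether the resulting cuts line up with true syllable boundaries is exactly the empirical question the project tests.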

Projects