The uniform information density (UID) hypothesis, which posits that speakers
behaving optimally tend to distribute information uniformly across a linguistic
signal, has gained traction in psycholinguistics as an explanation for certain
syntactic, morphological, and prosodic choices. In this work, we explore
whether the UID hypothesis can be operationalized as an inductive bias for
statistical language modeling. Specifically, we augment the canonical
maximum-likelihood (MLE) objective for training language models with a
regularizer that encodes UID.
In experiments on ten languages spanning five language families, we find that
UID regularization consistently improves perplexity in language models, with a
larger effect when training data is limited. Moreover, via an analysis
of generated sequences, we find that UID-regularized language models have other
desirable properties, e.g., they generate text that is more lexically diverse.
Our results not only suggest that UID is a reasonable inductive bias for
language modeling, but also provide an alternative validation of the UID
hypothesis using modern-day NLP tools.
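To make the objective concrete, below is a minimal PyTorch sketch of one way a UID-encoding regularizer can be added to the MLE objective, using the variance of per-token surprisals as the uniformity penalty. The function name, the `beta` weight, and the choice of variance as the UID term are illustrative assumptions, not necessarily the paper's exact formulation.

```python
import torch
import torch.nn.functional as F


def uid_regularized_loss(logits, targets, beta=0.01):
    """Cross-entropy (MLE) loss plus a UID-style regularizer.

    The regularizer here is the variance of per-token surprisals,
    one common operationalization of uniform information density.
    logits: (seq_len, vocab_size); targets: (seq_len,)
    """
    # Per-token surprisal: -log p(y_t | y_<t)
    surprisals = F.cross_entropy(logits, targets, reduction="none")
    nll = surprisals.mean()                       # standard MLE term
    uid_penalty = surprisals.var(unbiased=False)  # penalize uneven information
    return nll + beta * uid_penalty


if __name__ == "__main__":
    # Toy check: random logits for a 10-token sequence over a 100-word vocabulary.
    logits = torch.randn(10, 100)
    targets = torch.randint(0, 100, (10,))
    print(uid_regularized_loss(logits, targets))
```

With `beta = 0`, the loss reduces to ordinary MLE; increasing `beta` trades off likelihood against how evenly surprisal is spread across the sequence.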