Machine-based prediction of real-world events is garnering attention due to
its potential for informed decision-making. Whereas traditional forecasting
predominantly hinges on structured data like time-series, recent breakthroughs
in language models enable predictions using unstructured text. In particular,
Zou et al. (2022) introduce AutoCast, a new benchmark that employs news articles
for answering forecasting queries. Nevertheless, existing methods still trail
behind human performance. The cornerstone of accurate forecasting, we argue,
lies in identifying a concise, yet rich subset of news snippets from a vast
corpus. With this motivation, we introduce AutoCast++, a zero-shot
ranking-based context retrieval system, tailored to sift through expansive news
document collections for event forecasting. Our approach first re-ranks
articles based on zero-shot question-passage relevance, homing in on
semantically pertinent news. Following this, the chosen articles are subjected
to zero-shot summarization to attain succinct context. Leveraging a pre-trained
language model, we conduct both the relevance evaluation and article
summarization without needing domain-specific training. Notably, recent
articles can sometimes be at odds with preceding ones due to new facts or
unanticipated incidents, leading to fluctuating temporal dynamics. To tackle
this, our re-ranking mechanism gives preference to more recent articles, and we
further regularize the multi-passage representation learning to align with
human forecaster responses made on different dates. Empirical results
show marked improvements across multiple metrics: performance improves by
48% on multiple-choice questions (MCQ) and by up to 8% on true/false (TF)
questions. Code is available at
this https URL
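
As a rough illustration of the recency-aware re-ranking step described above, the following Python sketch combines a per-article relevance score with a recency weight and keeps the top-ranked articles. Everything in it is an assumption made for illustration: the `Article` fields, the `recency_weight` and `rerank` helpers, the exponential half-life decay, and the multiplicative combination are hypothetical and are not taken from the AutoCast++ implementation. The relevance scores are supplied by hand here rather than produced by a pre-trained language model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Article:
    title: str
    published: date
    # Assumed to be a zero-shot question-passage relevance score in [0, 1];
    # in this sketch it is given, not computed by a language model.
    relevance: float


def recency_weight(published: date, reference: date,
                   half_life_days: float = 30.0) -> float:
    """Hypothetical recency weight: halves every `half_life_days` of age."""
    # Articles dated after the reference date are treated as age zero.
    age_days = max((reference - published).days, 0)
    return 0.5 ** (age_days / half_life_days)


def rerank(articles: list[Article], reference: date, top_k: int = 3,
           half_life_days: float = 30.0) -> list[Article]:
    """Sort by relevance times recency weight and keep the top_k articles."""
    ranked = sorted(
        articles,
        key=lambda a: a.relevance
        * recency_weight(a.published, reference, half_life_days),
        reverse=True,
    )
    return ranked[:top_k]
```

Under these assumptions, a highly relevant but old article can rank below a moderately relevant recent one, which mirrors the stated preference for more recent articles. The selected articles would then be passed to a separate summarization step, which is not sketched here.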