“For the times they are a-changin’”
Date
2021-02
Abstract
This paper deals with the problem of deriving consistent time series from newspaper content-based
topic models. In the first part, we recapitulate a few of our own failed attempts; in the
second, we show some results using a twin strategy that we call prototyping and seeding.
Given the popularity news-based indicators have assumed in econometric analyses in recent
years, this seems to be a valuable exercise for researchers working on related issues.
Building on earlier writings, where we use the topic modelling approach Latent Dirichlet
Allocation (LDA) to gauge economic uncertainty perception, we show the difficulties that arise
when a number of one-shot LDAs, performed at different points in time, are used to produce
something akin to a time series. The models’ topic structures differ considerably from
computation to computation. Neither parameter variations nor the accumulation of several
topics into broader categories of related content are able to solve the problem of
incompatibility. The cause is not just the content that is added at each observation point, but the
very properties of LDA itself: since it uses random initializations and conditional reassignments
within the iterative process, fundamentally different models can emerge when the algorithm
is executed several times, even if the data and the parameter settings are identical. To tame
LDA’s randomness, we apply a newish “prototyping” approach to the corpus, upon which our
Uncertainty Perception Indicator (UPI) is built. Still, the outcomes vary considerably over time.
To get closer to our goal, we drop the notion that LDA models should be allowed to take
various forms freely at each run. Instead, the topic structure is fixed, using a “seeding”
technique that distributes incoming new data to our model’s existing topic structure. This
approach seems to work quite well, as our consistent and plausible results show, but it, too, is
bound to run into difficulties over time.
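
The idea of taming run-to-run randomness by picking a "prototype" can be illustrated with a minimal sketch. It assumes that several LDA runs on the same corpus are already available and that each topic is summarised by its set of top words. The similarity measure (a symmetrised best-match Jaccard overlap), the function names and the toy word lists are illustrative assumptions, not the measure or data used in the paper. The run that is on average most similar to all other runs is taken as the prototype.

```python
def jaccard(a, b):
    """Jaccard overlap of two word collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def run_similarity(run_a, run_b):
    """Symmetrised mean best-match Jaccard between the topics of two runs.

    Assumption: this is a simple stand-in for a proper model-similarity
    measure; it is not the measure used in the paper.
    """
    def one_way(src, dst):
        return sum(max(jaccard(t, u) for u in dst) for t in src) / len(src)

    return 0.5 * (one_way(run_a, run_b) + one_way(run_b, run_a))


def select_prototype(runs):
    """Return (index, scores): the run with the highest mean similarity
    to all other runs, plus the mean similarity of every run."""
    if len(runs) < 2:
        raise ValueError("need at least two runs to select a prototype")
    scores = []
    for i, run in enumerate(runs):
        sims = [run_similarity(run, other)
                for j, other in enumerate(runs) if j != i]
        scores.append(sum(sims) / len(sims))
    best = max(range(len(runs)), key=scores.__getitem__)
    return best, scores


# Hypothetical top-word sets from three LDA runs on identical data.
runs = [
    [{"bank", "rate", "ecb"}, {"virus", "lockdown", "vaccine"}],
    [{"bank", "rate", "tariff"}, {"virus", "lockdown", "mask"}],
    [{"bank", "rate", "tariff"}, {"election", "party", "vote"}],
]

best, scores = select_prototype(runs)
print(best, scores)
```

In this toy setting the second run (index 1) is selected, because it overlaps with both of the other runs. A selection of this kind only reduces the arbitrariness of a single run; it does not by itself make models fitted at different points in time comparable.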
Keywords
uncertainty, economic policy, business cycles, Covid-19, latent Dirichlet allocation, seeded LDA