Yu, Honglin
Description
Understanding the popularity evolution of online media has become an important
research topic. There are a number of key questions which have high scientific significance
and wide practical relevance. In particular, what are the statistical characteristics
of online user behaviors? What are the main factors that affect online
collective attention? How can one predict the popularity of online content? Recently,
researchers have tried to understand the way popularity evolves from both
theoretical and empirical perspectives. A number of important insights have been
gained: e.g., most videos obtain the majority of their viewcounts at the early stage
after uploading; for videos having identical content, there is a strong “first-mover”
advantage, so that early uploads have the most views; YouTube video viewcount dynamics
strongly correlate with video quality. Building upon these insights, the main
contributions of the thesis are as follows. We propose two new representations of viewcount
dynamics. One is the popularity scale, which represents each video’s popularity by
its relative viewcount rank in a large-scale dataset. The other is the popularity
phase, which models the rise and fall of a video’s daily viewcount over time. We also
propose four computational tools. The first is an efficient viewcount phase detection
algorithm which not only automatically determines the number of phases each
video has, but also finds the phase parameters and boundaries. The second is a
phase-aware viewcount prediction method which utilizes phase information to significantly
improve on the existing state-of-the-art method. The third is a phase-aware
viewcount clustering method which can better capture “pulse patterns” in viewcount
data. The fourth is a novel method of predicting viewcounts using external information
from the Twitter network. Finally, this thesis sets out results from a large-scale,
longitudinal measurement study of YouTube video viewcount histories. For example, we find that videos of different popularity and categories have distinctive phase histories.
We also observe a non-trivial number of concave phases. Such dynamics cannot
be explained by existing models, and the terminology and tools introduced
here have the potential to spark fresh analysis efforts and further research. In all, the
methods and insights developed in the thesis improve our understanding of online
collective attention. They also have considerable potential use in online marketing,
recommendation, and information dissemination, e.g., during emergencies and natural
disasters.
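
As a rough illustration of the popularity scale idea described above (popularity expressed as a relative viewcount rank within a dataset), the following minimal sketch maps each video's total viewcount to a percentile rank. The function name `popularity_scale`, the sample viewcounts, and the choice of a "fraction of videos with viewcount less than or equal to this one" definition are all assumptions made for illustration; the thesis's exact definition may differ.

```python
from bisect import bisect_right


def popularity_scale(viewcounts):
    """Map each viewcount to a percentile rank in (0, 1].

    Hypothetical helper: the rank is the fraction of videos in the
    dataset whose viewcount is less than or equal to this video's.
    """
    ordered = sorted(viewcounts)
    n = len(ordered)
    # bisect_right counts how many sorted values are <= v.
    return [bisect_right(ordered, v) / n for v in viewcounts]


# Made-up viewcounts for five videos (illustrative data only).
views = [120, 5_000, 40, 1_000_000, 5_000]
scale = popularity_scale(views)
for v, s in zip(views, scale):
    print(f"{v:>9} views -> popularity scale {s:.1f}")
```

A rank-based scale of this kind is insensitive to the heavy tail of raw viewcounts: the video with a million views simply sits at the top of the scale rather than dominating every comparison.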