Activity Recognition in Videos with Segmented Streams
Date
2019-06
Authors
Cai, Zixian
Gould, Stephen
Publisher
The Australian National University
Abstract
We investigate a Convolutional Neural Network (CNN) architecture for activity recognition in short video clips. Applications are ubiquitous, ranging from guiding unmanned vehicles to captioning video clips. While the use of CNN architectures on large image datasets (such as ImageNet) has been successfully demonstrated in many prior works, there is still no clear answer as to how one can adapt CNNs to video data. Several different architectures have been explored, such as C3D and two-stream networks. However, they all use the RGB frames of the video clips as is. In this work, we introduce segmented streams, where each stream consists of the original RGB frames segmented by motion type. We find that, after training on the UCF101 dataset, fusing our segmented streams improves over the original two-stream work.
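The abstract does not say how frames are segmented or how the streams are fused, so the following is only a minimal illustrative sketch under stated assumptions, not the authors' implementation. It assumes that "segmented by motion type" can be approximated by thresholding a per-pixel motion magnitude, and that fusion is a weighted average of per-stream softmax scores (one common late-fusion choice). The function names, the threshold, and the toy data are all hypothetical.

```python
import math


def segment_by_motion(frame, motion, threshold):
    """Split one frame into a 'moving' and a 'static' copy.

    frame:  2D list of pixel intensities.
    motion: 2D list of per-pixel motion magnitudes (same shape as frame).
    Pixels outside a segment are zeroed. Thresholding is an assumption
    made for this sketch, not a detail taken from the report.
    """
    moving = [[p if m > threshold else 0 for p, m in zip(frow, mrow)]
              for frow, mrow in zip(frame, motion)]
    static = [[p if m <= threshold else 0 for p, m in zip(frow, mrow)]
              for frow, mrow in zip(frame, motion)]
    return moving, static


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(stream_logits, weights=None):
    """Late fusion: weighted average of per-stream softmax scores.

    stream_logits: one list of class logits per stream.
    weights:       optional per-stream weights (defaults to equal weights).
    """
    if weights is None:
        weights = [1.0] * len(stream_logits)
    norm = sum(weights)
    probs = [softmax(logits) for logits in stream_logits]
    n_classes = len(probs[0])
    return [sum(w * p[c] for w, p in zip(weights, probs)) / norm
            for c in range(n_classes)]


# Toy usage: a 2x2 frame, its motion magnitudes, and two streams' logits.
frame = [[10, 20], [30, 40]]
motion = [[0.0, 2.0], [0.5, 3.0]]
moving, static = segment_by_motion(frame, motion, threshold=1.0)

fused = fuse_streams([[2.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

In a real pipeline each segmented copy would be fed to its own CNN stream, and the per-stream logits would come from those networks rather than from hand-written lists.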
Keywords
activity recognition, two-stream
Type
Report (Student work)
Access Statement
Open Access