Activity Recognition in Videos with Segmented Streams

Date

2019-06

Authors

Cai, Zixian
Gould, Stephen

Publisher

The Australian National University

Abstract

We investigate a Convolutional Neural Network (CNN) architecture for activity recognition in short video clips. Applications are ubiquitous, ranging from guiding unmanned vehicles to captioning video clips. While the use of CNN architectures on large image datasets (such as ImageNet) has been successfully demonstrated in many prior works, there is still no clear answer as to how one can adapt CNNs to video data. Several different architectures have been explored, such as C3D and two-stream networks. However, they all use the RGB frames of the video clips as is. In this work, we introduce segmented streams, where each stream consists of the original RGB frames segmented by motion type. We find that, after training on the UCF101 dataset, we are able to improve over the original two-stream work by fusing our segmented streams.

Keywords

activity recognition, two-stream

Type

Report (Student work)

Access Statement

Open Access
