Papers
Topics
Authors
Recent
Search
2000 character limit reached

Modelling Temporal Information Using Discrete Fourier Transform for Recognizing Emotions in User-generated Videos

Published 20 Mar 2016 in cs.CV | (1603.06568v2)

Abstract: With the widespread of user-generated Internet videos, emotion recognition in those videos attracts increasing research efforts. However, most existing works are based on framelevel visual features and/or audio features, which might fail to model the temporal information, e.g. characteristics accumulated along time. In order to capture video temporal information, in this paper, we propose to analyse features in frequency domain transformed by discrete Fourier transform (DFT features). Frame-level features are firstly extract by a pre-trained deep convolutional neural network (CNN). Then, time domain features are transferred and interpolated into DFT features. CNN and DFT features are further encoded and fused for emotion classification. By this way, static image features extracted from a pre-trained deep CNN and temporal information represented by DFT features are jointly considered for video emotion recognition. Experimental results demonstrate that combining DFT features can effectively capture temporal information and therefore improve emotion recognition performance. Our approach has achieved a state-of-the-art performance on the largest video emotion dataset (VideoEmotion-8 dataset), improving accuracy from 51.1% to 62.6%.

Citations (8)

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Authors (2)

Collections

Sign up for free to add this paper to one or more collections.