Abstract
In this paper, we discuss an audio-visual approach to automatic
web video categorization. We propose content descriptors
which exploit audio, temporal, and color content. The
power of our descriptors was validated both in the context of
a classification system and as part of an information retrieval
approach. For this purpose, we used a real-world scenario,
comprising 26 video categories from the blip.tv media platform
(up to 421 hours of video footage). Additionally, to
bridge the descriptor semantic gap, we propose a new relevance
feedback technique which is based on hierarchical clustering.
Experiments demonstrated that retrieval performance
can be increased significantly and becomes comparable to that
of high-level semantic textual descriptors.
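The abstract does not detail how the hierarchical-clustering-based relevance feedback operates; the following is a minimal sketch of one plausible realization, assuming agglomerative clustering of the descriptor vectors in the current result list and promotion of clusters containing user-marked relevant items. The function name `relevance_feedback_rerank`, the parameter `n_clusters`, and the re-ranking rule are illustrative assumptions, not the authors' method.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def relevance_feedback_rerank(features, ranked_ids, relevant_ids, n_clusters=5):
    """Illustrative sketch: re-rank results by promoting items that fall into
    the same hierarchical clusters as the user-marked relevant items."""
    X = features[ranked_ids]                       # descriptor vectors of the current result list
    Z = linkage(X, method="ward")                  # agglomerative (hierarchical) clustering
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")
    relevant_set = set(relevant_ids)
    # clusters containing at least one relevant item are treated as "positive"
    positive = {labels[i] for i, vid in enumerate(ranked_ids) if vid in relevant_set}
    # stable re-ordering: items from positive clusters first, original order kept within groups
    promoted = [vid for i, vid in enumerate(ranked_ids) if labels[i] in positive]
    demoted = [vid for i, vid in enumerate(ranked_ids) if labels[i] not in positive]
    return promoted + demoted

# Toy usage with random audio-visual descriptors (hypothetical data)
rng = np.random.default_rng(0)
feats = rng.normal(size=(20, 8))
print(relevance_feedback_rerank(feats, list(range(20)), relevant_ids=[0, 3, 7]))
```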
Original language | English
---|---
Title of host publication | Proceedings of the 20th European Signal Processing Conference
Pages | 375-379
Number of pages | 5
Publication status | Published - 2012
Fields of science
- 102 Computer Sciences
- 102001 Artificial intelligence
- 102003 Image processing
JKU Focus areas
- Computation in Informatics and Mathematics
- Engineering and Natural Sciences (in general)