A framework for estimating geometric distortions in video copies based on visual-audio fingerprints
Date
2015
Publisher
Springer-Verlag London Ltd
Abstract
Spatio-temporal alignment and distortion model estimation between pirated and master video content are prerequisites for approximating the location of an illegal capture in a theater. State-of-the-art techniques exploit only the visual features of videos to align watermarked sequences and estimate their distortion model, while little effort has been devoted to acoustic features and non-watermarked video content. To address this gap, we propose a distortion model estimation framework based on multimodal signatures that integrates four components: (1) a compact representation of a video using visual-audio fingerprints derived from Speeded Up Robust Features (SURF) and Mel-Frequency Cepstral Coefficients (MFCC); (2) a segmentation-based bipartite matching scheme to obtain accurate temporal alignments; (3) extraction of stable frame pairs, followed by filtering policies, to achieve geometric alignment; and (4) distortion model estimation in terms of a homography matrix. Experiments on camcorded datasets show promising results for the proposed framework compared with the reference methods. © 2013, Springer-Verlag London.
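The final step named in the abstract, estimating the distortion model as a homography, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes exactly four point correspondences between a master frame and a camcorded frame, fixes the last matrix entry to 1 (an inhomogeneous variant of the direct linear transform), and solves the resulting 8x8 linear system. The function names `solve_linear`, `estimate_homography`, and `apply_homography` are hypothetical and chosen for illustration only.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def estimate_homography(src, dst):
    """Estimate a 3x3 homography H with H[2][2] = 1 from four (x, y) pairs.

    Assumption of this sketch: exactly four correspondences, no three
    collinear. Each pair (x, y) -> (u, v) contributes two linear equations.
    """
    if len(src) != 4 or len(dst) != 4:
        raise ValueError("exactly four correspondences are required")
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(H, point):
    """Map a 2D point through H using homogeneous coordinates."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return (u, v)
```

With more than four noisy correspondences, as would be expected from matched SURF keypoints, a least-squares or robust estimator would normally replace the exact solve used here.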
Keywords
Alignment, Geometry, Speech recognition, Video recording, DLT, Duplicate video, Frame alignments, Geometric distortion, MFCC, SURF, Frequency estimation
Citation
Signal, Image and Video Processing, 2015, 9, 1, pp. 201-210
