domenec.puig@urv.cat
Abstract
This paper analyzes and discusses the performance of Bag of Visual Words (BoVW), a well-known image encoding and classification technique utilized to recognize object categories, in the particular application scope of complex scene recognition. Given a set of training images containing examples of the different objects of interest, a dictionary of prototypical SIFT descriptors (visual words) is first obtained by applying unsupervised clustering. The contents of any input image can then be encoded by computing a histogram that denotes the relative frequency of every visual word in the SIFT descriptors of that input image. A Support Vector Machine (SVM) is then trained for every object category by using as positive examples the histograms corresponding to training images with objects belonging to that category, and as negative examples,