Weighting video information into a multikernel SVM for human action recognition

Jordi Bautista-Ballester, Jaume Vergés-Llahí and Domenec Puig

domenec.puig@urv.cat

Abstract

Action classification using a Bag of Words (BoW) representation has shown computational simplicity and good performance, but the increasing number of categories, including actions with high confusion, and the addition of significant contextual information has led most authors to focus their efforts on the combination of image descriptors. In this approach we code the action videos using a BoW representation with diverse image descriptors and introduce them to the optimal SVM kernel as a linear combination of learning weighted single kernels. Experiments have been carried out on the action database HMDB and the upturn achieved with our approach is much better than the state of the art, reaching an improvement of 14.63% of accuracy. © (2015) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.

[su_note note_color=”#bbbbbb” text_color=”#040404″]@inproceedings{bautista2015weighting,
title={Weighting video information into a multikernel SVM for human action recognition},
author={Bautista-Ballester, Jordi and Verg{\'e}s-Llah{\'\i}, Jaume and Puig, Domenec},
booktitle={Eighth International Conference on Machine Vision},
pages={98750J–98750J},
year={2015},
organization={International Society for Optics and Photonics}}[/su_note]
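
The abstract above describes the SVM kernel as a linear combination of weighted single kernels, one per image descriptor. A minimal sketch of that combination step follows, assuming pure-Python Gram matrices. The histograms, the choice of a linear and an RBF kernel, and the fixed weights are all hypothetical illustrations; the paper learns the weights, which this sketch does not attempt.

```python
import math


def linear_kernel(x, y):
    """Dot product between two histograms."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian kernel on the squared Euclidean distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def gram_matrix(kernel, samples):
    """Pairwise kernel values for one descriptor's histograms."""
    return [[kernel(x, y) for y in samples] for x in samples]


def combine_kernels(gram_matrices, weights):
    """Weighted sum K = sum_d w_d * K_d of equally sized Gram matrices."""
    if len(gram_matrices) != len(weights):
        raise ValueError("one weight per kernel is required")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative to keep K a valid kernel")
    n = len(gram_matrices[0])
    return [
        [sum(w * k[i][j] for w, k in zip(weights, gram_matrices)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical BoW histograms of three videos under two descriptors.
descriptor_a = [[0.2, 0.8], [0.5, 0.5], [0.9, 0.1]]
descriptor_b = [[0.1, 0.3, 0.6], [0.4, 0.4, 0.2], [0.7, 0.2, 0.1]]

gram_a = gram_matrix(linear_kernel, descriptor_a)
gram_b = gram_matrix(rbf_kernel, descriptor_b)

# Hand-picked weights; the paper learns them, which this sketch does not do.
weights = [0.7, 0.3]
combined = combine_kernels([gram_a, gram_b], weights)

for row in combined:
    print([round(value, 3) for value in row])
```

A matrix such as `combined` could then be passed to an SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.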


Toward an optimal convolutional neural network for traffic sign recognition

Hamed Habibi Aghdam, Elnaz Jahani Heravi and Domenec Puig

hamed.habibi@urv.cat, elnaz.jahani@urv.cat, domenec.puig@urv.cat

Abstract

Convolutional Neural Networks (CNN) beat the human performance on the German Traffic Sign Benchmark competition. Both the winner and the runner-up teams trained CNNs to recognize 43 traffic signs. However, both networks are not computationally efficient since they have many free parameters and they use highly computational activation functions. In this paper, we propose a new architecture that reduces the number of the parameters by 27% and 22% compared with the two networks. Furthermore, our network uses Leaky Rectified Linear Units (ReLU) as the activation function that only needs a few operations to produce the result. Specifically, compared with the hyperbolic tangent and rectified sigmoid activation functions utilized in the two networks, Leaky ReLU needs only one multiplication operation, which makes it computationally much more efficient than the two other functions. Our experiments on the German Traffic Sign Benchmark dataset show a 0.6% improvement on the best reported classification accuracy while it reduces the overall number of parameters by 85% compared with the winner network in the competition. © (2015) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
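
The efficiency argument in the abstract above, that Leaky ReLU needs only one multiplication, can be illustrated with a minimal sketch. The slope value `0.01` is an assumption chosen for illustration, since the abstract does not state the slope used in the network; the hyperbolic tangent is shown only for comparison.

```python
import math


def leaky_relu(x, slope=0.01):
    """Leaky ReLU: one comparison and at most one multiplication.

    The slope of 0.01 is an assumed value, not taken from the paper.
    """
    return x if x > 0 else slope * x


def tanh_activation(x):
    """Hyperbolic tangent, which relies on exponentials internally."""
    return math.tanh(x)


for value in (-2.0, 0.0, 3.0):
    print(value, leaky_relu(value), tanh_activation(value))
```

Positive inputs pass through unchanged, while negative inputs are scaled by the slope, so no exponential is evaluated at any point.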

[su_note note_color=”#bbbbbb” text_color=”#040404″]@inproceedings{aghdam2015toward,
title={Toward an optimal convolutional neural network for traffic sign recognition},
author={Aghdam, Hamed Habibi and Heravi, Elnaz Jahani and Puig, Domenec},
booktitle={Eighth International Conference on Machine Vision},
pages={98750K–98750K},
year={2015},
organization={International Society for Optics and Photonics}}[/su_note]


A deep convolutional neural network for recognizing foods

Elnaz Jahani Heravi, Hamed Habibi Aghdam and Domenec Puig

elnaz.jahani@urv.cat, hamed.habibi@urv.cat, domenec.puig@urv.cat

Abstract

Controlling the food intake is an efficient way that each person can undertake to tackle the obesity problem in countries worldwide. This is achievable by developing a smartphone application that is able to recognize foods and compute their calories. State-of-the-art methods are chiefly based on hand-crafted feature extraction methods such as HOG and Gabor. Recent advances in large-scale object recognition datasets such as ImageNet have revealed that deep Convolutional Neural Networks (CNN) possess more

[su_note note_color=”#bbbbbb” text_color=”#040404″]@inproceedings{heravi2015deep,
title={A deep convolutional neural network for recognizing foods},
author={Heravi, Elnaz Jahani and Aghdam, Hamed Habibi and Puig, Domenec},
booktitle={Eighth International Conference on Machine Vision},
pages={98751D–98751D},
year={2015},
organization={International Society for Optics and Photonics}}[/su_note]
