Víctor Campos, Amaia Salvador, Xavier Giro-i-Nieto, Brendan Jou

A joint collaboration between:

Universitat Politècnica de Catalunya (UPC), UPC ETSETB TelecomBCN, UPC Image Processing Group, Columbia University, and the Digital Video and Multimedia Lab (DVMM)


Visual media are powerful means of expressing emotions and sentiments. The constant generation of new content in social networks highlights the need for automated visual sentiment analysis tools. While Convolutional Neural Networks (CNNs) have established a new state of the art in several vision problems, their application to the task of sentiment analysis is mostly unexplored, and there are few studies on how to design CNNs for this purpose. In this work, we study the suitability of fine-tuning a CNN for visual sentiment prediction and explore performance-boosting techniques within this deep learning setting. Finally, we provide a deep-dive analysis of a benchmark, state-of-the-art network architecture to gain insight into how to design patterns for CNNs on the task of visual sentiment prediction.


Please cite with the following BibTeX code:

  title={Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction},
  author={Campos, Victor and Salvador, Amaia and Giro-i-Nieto, Xavier and Jou, Brendan},
  booktitle={Proceedings of the 1st International Workshop on Affect \& Sentiment in Multimedia},

You may also want to refer to our publication with the more human-friendly APA style:

Campos, V., Salvador, A., Giro-i-Nieto, X., & Jou, B. (2015, October). Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction. In Proceedings of the 1st International Workshop on Affect & Sentiment in Multimedia (pp. 57-62). ACM.

A preprint of the paper is publicly available on arXiv. More details can be found in our slides at ASM 2015, and the related bachelor thesis report and video by Victor Campos at ETSETB TelecomBCN in July 2015.






The weights for the best CNN model can be downloaded from here (217 MB).

The deep network was developed on top of Caffe, the framework by the Berkeley Vision and Learning Center (BVLC). You will need to follow these instructions to install Caffe.
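Once Caffe is installed and the weights are downloaded, running the model on an image could look roughly like the sketch below. It uses the standard pycaffe calls (`caffe.Net`, `caffe.io.Transformer`, `caffe.io.load_image`), but several details are assumptions not stated on this page: the file names `deploy.prototxt` and `sentiment_cnn.caffemodel` are hypothetical, the input and output blobs are assumed to be named `data` and `prob`, and the two-class label order is a guess. Mean subtraction is omitted for brevity and may be required by the actual model.

```python
# Hypothetical file names: this page does not state what the downloaded
# files are called, so adjust these to match your local copies.
MODEL_DEF = "deploy.prototxt"
MODEL_WEIGHTS = "sentiment_cnn.caffemodel"

# Assumed label order; check it against the model's training setup.
LABELS = ("negative", "positive")


def top_label(probs, labels=LABELS):
    """Return the (label, probability) pair with the highest probability."""
    probs = [float(p) for p in probs]
    if len(probs) != len(labels):
        raise ValueError("expected %d probabilities, got %d"
                         % (len(labels), len(probs)))
    idx = max(range(len(probs)), key=lambda i: probs[i])
    return labels[idx], probs[idx]


def predict(image_path, model_def=MODEL_DEF, model_weights=MODEL_WEIGHTS):
    """Run one image through the network with pycaffe and return its top label."""
    import caffe  # imported lazily so top_label() works without Caffe

    net = caffe.Net(model_def, model_weights, caffe.TEST)

    # Usual pycaffe preprocessing: HxWxC RGB in [0, 1] -> CxHxW BGR in [0, 255].
    # Assumes the input blob is named "data".
    transformer = caffe.io.Transformer({"data": net.blobs["data"].data.shape})
    transformer.set_transpose("data", (2, 0, 1))
    transformer.set_raw_scale("data", 255)
    transformer.set_channel_swap("data", (2, 1, 0))

    image = caffe.io.load_image(image_path)
    net.blobs["data"].data[0, ...] = transformer.preprocess("data", image)
    output = net.forward()

    # Assumes the deploy definition ends in a softmax blob named "prob".
    return top_label(output["prob"][0])
```

Calling `predict("photo.jpg")` (a hypothetical image path) would return a pair such as a label and its probability; the `top_label` helper can be checked on its own without Caffe installed.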


We would like to especially thank Albert Gil Moreno and Josep Pujal from our technical support team at the Image Processing Group at UPC.

We gratefully acknowledge the support of NVIDIA Corporation with the donation of the GeForce GTX Titan Z and Titan X used in this work.
The Image Processing Group at the UPC is an SGR14 Consolidated Research Group recognized and sponsored by the Catalan Government (Generalitat de Catalunya) through its AGAUR office.
This work has been developed in the framework of the project BigGraph TEC2013-43935-R, funded by the Spanish Ministerio de Economía y Competitividad and the European Regional Development Fund (ERDF).


If you have a general question about our work or code that may be of interest to other researchers, please use the public issues section of this GitHub repo. Alternatively, drop us an e-mail at xavier.giro@upc.edu.