Deep Learning for Computer Vision Barcelona

Summer seminar UPC TelecomBCN (July 4-8, 2016)

Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of big annotated data and affordable GPU hardware has allowed the training of neural networks for data analysis tasks that had previously been addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles of deep learning and its applications to computer vision problems, such as image classification, object detection or image captioning.

Course Instructors


Xavier Giro-i-Nieto (XG)	Elisa Sayrol (ES)	Amaia Salvador (AS)	Jordi Torres (JT)	Eva Mohedano (EM)	Kevin McGuinness (KM)

Teaching assistants


Junting Pan	Míriam Bellver	Albert Jiménez	Andrea Ferri	Alberto Montes	Maurici Yagües

Organizers


UPC ETSETB TelecomBCN	UPC Image Processing Group	Universitat Politecnica de Catalunya (UPC)	Barcelona Supercomputing Center	Insight Centre for Data Analytics	Dublin City University (DCU)

Lecture Slides and Videos

Topic	Speaker	Slideshare	YouTube
D1L1 Welcome	XG	Slides
D1L2 Classification	EM	Slides
D1L3 Deep networks	ES	Slides
D1L4 Backward Propagation	ES	Slides	Video
D1L5 Training	EM	Slides
D1L6 Software Frameworks	KM	Slides	Video
D2L1 Memory & Computation	KM	Slides	Video
D2L2 Data Augmentation	EM	Slides
D2L3 Visualization	AS	Slides	Video
D2L4 ImageNet Challenge	XG	Slides
D2L5 Transfer & Adaptation	KM	Slides	Video
D2L6 Recurrent Networks	XG	Slides	Video
D3L1 Unsupervised Learning	KM	Slides	Video
D3L2 Saliency Prediction	ES	Slides
D3L3 Optimization	KM	Slides
D3L4 Object Detection	AS	Slides	Video
D3L5 Face Recognition	ES	Slides	Video
D3L6 Image retrieval	EM	Slides	Video
D4L1 Generative Models	KM	Slides
D4L2 Segmentation	AS	Slides	Video
D4L3 Language and Vision	XG	Slides	Video
D4L4 Video Analytics	XG	Slides	Video
D4L5 Medical Imaging	ES	Slides	Video
D4L6 Attention Models	AS	Slides	Video
D5L Closing	XG	Slides

Hands on TensorFlow

The seminar included five practical sessions on TensorFlow, the open source software library for machine intelligence developed by Google. These sessions were taught by Professor Jordi Torres, with the teaching assistance of Maurici Yagües. Both are members of the Barcelona Supercomputing Center (BSC).

Topic	Slides
D1T Linear regressor	Slides
D2T Clustering	Slides
D3T Neuron & Tensorboard	Slides
D4T CNN & SLIM	Slides
D5T RNN	Slides

The full course with code snippets is available in this repo.

Student projects

Master's students, together with some bachelor's students, were organized in teams of five members. Each team solved four directed tasks and developed an open project. The projects were carried out during the single week of the course. The teams' slides and source code are available from their repos. If you are interested in hiring or contacting the students, some of them have provided their LinkedIn profiles on their project pages.

Team	Project	Page	Slides	Repo
Team 1	Character autorotation + Autoencoders	Web	Slides	Repo
Team 2	Neural Style	-	Slides	Repo
Team 3	Generative Adversarial Network	-	Slides	Repo
Team 4	Multi-layer Neural Style	-	Slides	Repo
Team 5	Deep Dream	-	Slides	Repo

Schedule

When	Monday 4	Tuesday 5	Wednesday 6	Thursday 7	Friday 8
3:00-3:20	Welcome	Memory	Unsupervised	Adversarial	Project Expo 3
3:20-3:40	Classification	Augmentation	Saliency	Segmentation	Project Expo 4
3:40-4:00	Deep	Visualization	Optimization	Language	Project Expo 5
4:00-5:00	TensorFlow	TensorFlow	TensorFlow	TensorFlow	TensorFlow
4:00-5:00	Project	Project	Project	Project	Closing 3,4,5
5:00-5:20	Backpropagation	ImageNet	Objects	Video	Project Expo 1
5:20-5:40	Training	Transfer	Faces	Medical	Project Expo 2
5:40-6:00	Frameworks	Recurrent	Ranking	Attention	Break
6:00-7:00	Project	Project	Project	Project	Closing 1,2
6:00-7:00	TensorFlow	TensorFlow	TensorFlow JT	TensorFlow JT	TensorFlow

Practical

Course code: 230360 (PhD & master) / 230324 (bachelor)
ECTS credits: 2.5 (PhD & master) / 2 (bachelor) (corresponds to full-time dedication during the week of the course)
Teaching language: English
The course is offered for both master and bachelor students, but under two study programmes adapted to each profile.
Class Dates: 4-8 July, 2016
Class Schedule: 3-7pm (you will need 6 extra hours a day for homework during the week of the course)
Capacity: 14 MSc students + 16 BSc students
Location: Campus Nord UPC, Module D5, Room 010

Registration

Registration is closed for this edition of the seminar. The 30 available seats were filled by UPC TelecomBCN students.

We greatly appreciate the interest of several other students who could not register. We are planning a new edition of this seminar for June-July 2017. A new seminar on Deep Learning for Speech and Language is also planned for January 2017.

You are also encouraged to share your questions and solutions in the public issues section for future reference and quality improvement of the course.

Video recordings

Sessions will be recorded in video and posted afterwards, together with the slides.

Contact

This term we will be using Piazza for class discussion. The system is designed to get you help quickly and efficiently from classmates, the teaching assistants, and the instructors. Rather than emailing questions to the teaching staff, we encourage you to post them on Piazza. If you have any problems or feedback for the developers, email team@piazza.com.

Find us at the class page.

Related courses

Fei-Fei Li, Andrej Karpathy, Justin Johnson, “CS231n: Convolutional Neural Networks for Visual Recognition”. Stanford University, Spring 2016.
Sanja Fidler, “Deep Learning in Computer Vision”. University of Toronto, Winter 2016.
Hugo Larochelle, “Neural Networks”. Université de Sherbrooke.
Joan Bruna, “Stats212b: Topics on Deep Learning”. University of California, Berkeley, Spring 2016.
Yann LeCun, “Deep Learning: Nine Lectures at Collège de France”. Collège de France, Spring 2016. [Facebook page]
Dhruv Batra, “ECE 6504: Deep learning for perception”. Virginia Tech, Fall 2015.
Vincent Vanhoucke, Arpan Chakraborty, “Deep Learning”. Google 2016.
Xavier Giro-i-Nieto, “Deep learning for computer vision: Image, Object, Videos Analytics and Beyond”. LaSalle URL. May 2016.
German Ros, Joost van de Weijer, Marc Masana, Yaxing Wang, “Hands-on Deep Learning with Matconvnet”. Computer Vision Center (CVC) 2015.

Acknowledgements

This course is co-funded by the Erasmus+ programme of the European Union.
This course is supported by the NVIDIA GPU Center of Excellence at the Barcelona Supercomputing Center & Universitat Politecnica de Catalunya.