Deep Learning for Computer Vision Barcelona
Summer seminar UPC TelecomBCN (July 4-8, 2016)
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of big annotated data and affordable GPU hardware has allowed the training of neural networks for data analysis tasks that had until now been addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles of deep learning and its applications to computer vision problems, such as image classification, object detection or image captioning.
Course Instructors
Xavier Giro-i-Nieto (XG) | Elisa Sayrol (ES) | Amaia Salvador (AS) | Jordi Torres (JT) | Eva Mohedano (EM) | Kevin McGuinness (KM) |
Teaching assistants
Junting Pan | Míriam Bellver | Albert Jiménez | Andrea Ferri | Alberto Montes | Maurici Yagües |
Organizers
UPC ETSETB TelecomBCN | UPC Image Processing Group | Universitat Politecnica de Catalunya (UPC) | Barcelona Supercomputing Center | Insight Centre for Data Analytics | Dublin City University (DCU) |
Lecture Slides and Videos
Topic | Speaker | Slideshare | YouTube |
---|---|---|---|
D1L1 Welcome | XG | Slides | |
D1L2 Classification | EM | Slides | |
D1L3 Deep networks | ES | Slides | |
D1L4 Backward Propagation | ES | Slides | Video |
D1L5 Training | EM | Slides | |
D1L6 Software Frameworks | KM | Slides | Video |
D2L1 Memory & Computation | KM | Slides | Video |
D2L2 Data Augmentation | EM | Slides | |
D2L3 Visualization | AS | Slides | Video |
D2L4 Imagenet Challenge | XG | Slides | |
D2L5 Transfer & Adaptation | KM | Slides | Video |
D2L6 Recurrent Networks | XG | Slides | Video |
D3L1 Unsupervised Learning | KM | Slides | Video |
D3L2 Saliency Prediction | ES | Slides | |
D3L3 Optimization | KM | Slides | |
D3L4 Object Detection | AS | Slides | Video |
D3L5 Face Recognition | ES | Slides | Video |
D3L6 Image retrieval | EM | Slides | Video |
D4L1 Generative Models | KM | Slides | |
D4L2 Segmentation | AS | Slides | Video |
D4L3 Language and Vision | XG | Slides | Video |
D4L4 Video Analytics | XG | Slides | Video |
D4L5 Medical Imaging | ES | Slides | Video |
D4L6 Attention Models | AS | Slides | Video |
D5L Closing | XG | Slides | |
Hands on TensorFlow
The seminar included five practical sessions on TensorFlow, the open source software library for machine intelligence developed by Google. These sessions were taught by Professor Jordi Torres, with the teaching assistance of Maurici Yagües, both from the Barcelona Supercomputing Center (BSC).
Topic | Slides |
---|---|
D1T Linear regressor | Slides |
D2T Clustering | Slides |
D3T Neuron & Tensorboard | Slides |
D4T CNN & SLIM | Slides |
D5T RNN | Slides |
The full course with code snippets is available in this repo.
Student projects
Master's students, together with some bachelor's students, were organized in teams of five members. Each team solved four directed tasks and developed an open project. The projects were carried out during the single week of the course. Their slides and source code are available from their repos. If you are interested in hiring or contacting the students, some of them have provided their LinkedIn profiles on their project pages.
Team | Project | Page | Slides | Repo |
---|---|---|---|---|
Team 1 | Character autorotation + Autoencoders | Web | Slides | Repo |
Team 2 | Neural Style | - | Slides | Repo |
Team 3 | Generative Adversarial Network | - | Slides | Repo |
Team 4 | Multi-layer Neural Style | - | Slides | Repo |
Team 5 | Deep Dream | - | Slides | Repo |
Schedule
When | Monday 4 | Tuesday 5 | Wednesday 6 | Thursday 7 | Friday 8 |
---|---|---|---|---|---|
3:00-3:20 | Welcome | Memory | Unsupervised | Adversarial | Project Expo 3 |
3:20-3:40 | Classification | Augmentation | Saliency | Segmentation | Project Expo 4 |
3:40-4:00 | Deep | Visualization | Optimization | Language | Project Expo 5 |
4:00-5:00 | TensorFlow | TensorFlow | TensorFlow | TensorFlow | TensorFlow |
4:00-5:00 | Project | Project | Project | Project | Closing 3,4,5 |
5:00-5:20 | Backpropagation | ImageNet | Objects | Video | Project Expo 1 |
5:20-5:40 | Training | Transfer | Faces | Medical | Project Expo 2 |
5:40-6:00 | Frameworks | Recurrent | Ranking | Attention | Break |
6:00-7:00 | Project | Project | Project | Project | Closing 1,2 |
6:00-7:00 | TensorFlow | TensorFlow | TensorFlow JT | TensorFlow JT | TensorFlow |
Practical
- Course code: 230360 (PhD & master) / 230324 (bachelor)
- ECTS credits: 2.5 (PhD & master) / 2 (bachelor) (corresponds to full-time dedication during the course week)
- Teaching language: English
- The course is offered for both master and bachelor students, but under two study programmes adapted to each profile.
- Class Dates: 4-8 July, 2016
- Class Schedule: 3-7pm (you will need 6 extra hours a day for homework during the course week)
- Capacity: 14 MSc students + 16 BSc students
- Location: Campus Nord UPC, Module D5, Room 010
Registration
Registration is sold out for this edition of the seminar. The 30 available seats were covered by UPC TelecomBCN students.
We greatly appreciate the interest of several other students who could not register. We are planning a new edition of this seminar for June-July 2017. A new seminar on Deep Learning for Speech and Language is also planned for January 2017.
You are also encouraged to share your questions and solutions in the public issues section, for future reference and to help improve the quality of the course.
Video recordings
Sessions will be recorded in video and posted afterwards, together with the slides.
Contact
This term we will be using Piazza for class discussion. The system is highly catered to getting you help fast and efficiently from classmates, the TA, and myself. Rather than emailing questions to the teaching staff, I encourage you to post your questions on Piazza. If you have any problems or feedback for the developers, email team@piazza.com.
Find us at the class page.
Related courses
- Fei-Fei Li, Andrej Karpathy, Justin Johnson, “CS231n: Convolutional Neural Networks for Visual Recognition”. Stanford University, Spring 2016.
- Sanja Fidler, “Deep Learning in Computer Vision”. University of Toronto, Winter 2016.
- Hugo Larochelle, “Neural Networks”. Université de Sherbrooke.
- Joan Bruna, “Stats212b: Topics on Deep Learning”. University of California, Berkeley. Spring 2016.
- Yann LeCun, “Deep Learning: Nine Lectures at Collège de France”. Collège de France, Spring 2016. [Facebook page]
- Dhruv Batra, “ECE 6504: Deep learning for perception”. Virginia Tech, Fall 2015.
- Vincent Vanhoucke, Arpan Chakraborty, “Deep Learning”. Google 2016.
- Xavier Giro-i-Nieto, “Deep learning for computer vision: Image, Object, Videos Analytics and Beyond”. LaSalle URL. May 2016.
- German Ros, Joost van de Weijer, Marc Masana, Yaxing Wang, “Hands-on Deep Learning with Matconvnet”. Computer Vision Center (CVC) 2015.
Acknowledgements
This course is co-funded by the Erasmus+ programme of the European Union.
This course is supported by the NVIDIA GPU Center of Excellence at the Barcelona Supercomputing Center & Universitat Politecnica de Catalunya.