Instructor
Xavier Giro-i-Nieto (XG)
Slides
Video Lecture
(to be added)
Related Work & Resources
- Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., & Fei-Fei, L. (2014, June). Large-scale video classification with convolutional neural networks. In Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on (pp. 1725-1732). IEEE.
- Tran, Du, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri. “Learning spatiotemporal features with 3D convolutional networks.” In Proceedings of the IEEE International Conference on Computer Vision, pp. 4489-4497. 2015. [video-caffe]
- Srivastava, Nitish, Elman Mansimov, and Ruslan Salakhutdinov. “Unsupervised learning of video representations using LSTMs.” arXiv preprint arXiv:1502.04681 (2015).
- Yue-Hei Ng, Joe, Matthew Hausknecht, Sudheendra Vijayanarasimhan, Oriol Vinyals, Rajat Monga, and George Toderici. “Beyond short snippets: Deep networks for video classification.” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4694-4702. 2015.
- Sharma, Shikhar, Ryan Kiros, and Ruslan Salakhutdinov. “Action Recognition using Visual Attention.” arXiv preprint arXiv:1511.04119 (2015). [code]
- Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas, C., Golkov, V., van der Smagt, P., Cremers, D. and Brox, T., 2015. FlowNet: Learning Optical Flow With Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2758-2766).
- P. Ondruska and I. Posner, “Deep Tracking: Seeing Beyond Seeing Using Recurrent Neural Networks”, in The Thirtieth AAAI Conference on Artificial Intelligence (AAAI), Phoenix, Arizona USA, 2016. [code]
- Xiaolong Wang, Ali Farhadi and Abhinav Gupta, “Actions ~ Transformations”. CVPR 2016.
- Feichtenhofer, Christoph, Axel Pinz, and Andrew Zisserman. “Convolutional Two-Stream Network Fusion for Video Action Recognition.” CVPR 2016.
- Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, and Luc Van Gool, “Temporal Segment Networks: Towards Good Practices for Deep Action Recognition”. ECCV 2016, Amsterdam, The Netherlands. [source code]
- Mathieu, Michael, Camille Couprie, and Yann LeCun. “Deep multi-scale video prediction beyond mean square error.” ICLR 2016. [Torch] [TensorFlow]
Other resources
- Xavier Giro-i-Nieto, Deep Learning for Computer Vision: Video Analytics. Master in Multimedia Creation. URL La Salle Barcelona. May 2016.
Related work at UPC
- Alberto Montes, bachelor thesis (ETSETB 2016)