Spring 2015 - CIS 5543


Computer Vision   


Basic Information:

·  Lecture time: Tuesday 5:30-8:00pm, TTLMAN 0403B

·  Instructor: Haibin Ling | Wachman Hall, Room 305 | 215-204-6973 | hbling AT temple.edu

·  Office Hours: Tuesday 3:00-5:00pm, or by appointment

·  Syllabus: PDF





Topics and Materials (Tentative, will be revised FREQUENTLY)

Additional Readings

Week 1

General introduction. (Slide 1, 2)

·  Background

·  Topics in visual data analysis

·  Applications

·  Related fields
·  Image formation


·  Szeliski's book, Ch. 1-2





Week 2

Background – Elementary Math and Statistics (Slides)

·  Elementary probability and statistics

·  Linear analysis (PCA, LDA, etc.)

·  Graphical models (HMM, MRF, etc.)


Materials: (Slides)

·  Szeliski's book, Appendix A-C

·  A Global Geometric Framework for Nonlinear Dimensionality Reduction, Joshua B. Tenenbaum, Vin de Silva, John C. Langford, Science, 2000.

·  Nonlinear Dimensionality Reduction by Locally Linear Embedding, Sam T. Roweis, Lawrence K. Saul, Science, 2000.





Week 3


Paper selection due!

Background - Optimization (Slides)

·  Basic idea and linear optimization

·  Convex optimization


·  Convex Optimization, S. Boyd and L. Vandenberghe.


·  Additive logistic regression: a statistical view of boosting, J. Friedman, T. Hastie, and R. Tibshirani, The Annals of Statistics 2000.




Week 4




Segmentation (Slides)

·  Clustering, perceptual grouping, segmentation



·  Szeliski's book, Ch. 5

Paper presentation:

· [Presenter: Mian Wang] Efficient Graph-Based Image Segmentation, P.F. Felzenszwalb and Daniel P. Huttenlocher, IJCV, 59(2), 2004.

· [Presenter: Brian Thibodeau] Region-based particle filter for video object segmentation, David Varas and Ferran Marques, CVPR 2014

Additional Readings:

· Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, S. Geman and D. Geman. PAMI, 6:721--741, 1984.
· Normalized Cuts and Image Segmentation, by J. Shi and J. Malik, PAMI 2000

· Fast Approximate Energy Minimization via Graph Cuts, Y. Boykov, O. Veksler and R. Zabih, PAMI, 2001.

· P. Arbelaez, M. Maire, C. Fowlkes and J. Malik. Contour Detection and Hierarchical Image Segmentation. IEEE TPAMI, Vol. 33, No. 5, pp. 898-916, May 2011. [web page]

· R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Süsstrunk, SLIC superpixels compared to state-of-the-art superpixel methods, IEEE TPAMI, vol. 34, no. 11, pp. 2274–2282, 2012.




Week 5

Features and Representation (Slides)

·  What are features?

·  Invariant features and representations



·  Szeliski's book, Ch. 4


Paper presentation

· [Presenter: Tian Bai] Hinton, G. E. and Salakhutdinov, R. R., Reducing the dimensionality of data with neural networks. Science, Vol. 313, no. 5786, pp. 504-507, 28 July 2006.

· [Presenter: Abdulsalam Hdadi] All about VLAD, R. Arandjelovic, A. Zisserman, IEEE Conference on Computer Vision and Pattern Recognition, 2013

· [Presenter: Peter Dane Mollica] Visualizing and Understanding Convolutional Networks, M. D. Zeiler and R. Fergus, ECCV 2014.

Additional Readings:

· C. Schmid & R. Mohr, Local Grayvalue Invariants for Image Retrieval, PAMI, 19(5), 530-535, 1997.

· Distinctive image features from scale-invariant keypoints, D. Lowe, IJCV 2004.

· F. Perronnin, J. Sanchez, and T. Mensink. Improving the Fisher kernel for large-scale image classification. In Proc. ECCV, 2010

· Deep Fisher Kernels – End to End Learning of the Fisher Kernel GMM Parameters. Vladyslav Sydorov, Mayu Sakurada, Christoph H. Lampert, CVPR 2014.





Week 6

Matching, Alignment, and Registration (slides)

·  Feature matching

·  Surface matching

·  Image matching


·  Szeliski's book, Ch. 6


Paper presentation:

· [Presenter: Yiyuan Zhang] X. Zhu and D. Ramanan. Face detection, pose estimation, and landmark localization in the wild. In CVPR, pages 2879–2886. IEEE, 2012.
· [Presenter: Cong Rao] Face Alignment at 3000 FPS via Regressing Local Binary Features, S. Ren, X. Cao, Y. Wei, J. Sun, CVPR 2014.

· [Presenter: Miriam Fuchs] SIFT Flow: Dense Correspondence across Scenes and its Applications, Ce Liu, Jenny Yuen, Antonio Torralba, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 33, No. 5, 2011.

Additional Readings:

· Principal Warps: Thin-Plate Splines and the Decomposition of Deformations, by F. Bookstein, PAMI 1989, Vol 11, No 6.

· Iterative Point Matching for Registration of Free-Form Curves and Surfaces, Z. Zhang, IJCV, Vol.13, No.2, pages 119-152, 1994

· A tensor-based algorithm for high-order graph matching, O. Duchenne, F. Bach, I. Kweon, and J. Ponce, TPAMI, 33(12):2383–2395, 2011.
· Alignment by Maximization of Mutual Information, by P. Viola and W. M. Wells III, IJCV, 1997

· Gauss-Newton Deformable Part Models for Face Alignment in-the-Wild, Georgios Tzimiropoulos, Maja Pantic, CVPR 2014.

· Incremental Face Alignment in the Wild, Akshay Asthana, Stefanos Zafeiriou, Shiyang Cheng, and Maja Pantic, CVPR 2014




Week 7

Shape (Slides)

·  Shape space - Procrustes analysis

·  Shape matching - TPS, RANSAC

·  Shape classification - shape context


·  Szeliski's book, Ch. 14


Paper presentation:

· [Presenter: Pengpeng Liang] Hedi Tabia, Hamid Laga, David Picard, Philippe-Henri Gosselin, Covariance Descriptors for 3D Shape Matching and Retrieval, CVPR 2014.

· [Presenter: Tao Wang] D.C. Hauagge, N. Snavely. Image Matching using Local Symmetry Features. CVPR, 2012.

· [Presenter: Liang Du] Xinghai Sun, Changhu Wang, Chao Xu, Lei Zhang. Indexing Billions of Images for Sketch-based Retrieval, ACM Multimedia 2013.

Additional Readings:

· Active shape models - their training and application, T. F. Cootes, C. J. Taylor, D. H. Cooper, and J. Graham, CVIU, 61(1):38-59, 1995.

· A survey of content based 3D shape retrieval methods. J. W. Tangelder and R.C. Veltkamp. Multimedia Tools Appl., 39(3):441-471, 2008.

· Shape Matching and Object Recognition Using Shape Contexts, S. Belongie, J. Malik and J. Puzicha, PAMI, 24(4):509-522, 2002.

· Full-Angle Quaternions for Robustly Matching Vectors of 3D Rotations, Stephan Liwicki, Minh-Tri Pham, Stefanos Zafeiriou, Maja Pantic, Bjorn Stenger, CVPR 2014.





Week 8

Visual recognition - objects and faces (Slides)

·  Eigenface


·  ASM


·  Szeliski's book, Ch. 14


Paper presentation:

· [Presenter: Runzhong Huang] Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, ImageNet Classification with Deep Convolutional Neural Networks, NIPS, 2012

· [Presenter: Yiran Li] Yi Sun, Yuheng Chen, Xiaogang Wang, and Xiaoou Tang, Deep Learning Face Representation by Joint Identification-Verification, NIPS'14

Additional Reading:

· Eigenfaces for recognition, M. Turk and A. Pentland, Journal of Cognitive Neuroscience, 3:71-86, 1991.

· Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection, P. Belhumeur, J. Hespanha and D. Kriegman, PAMI, 19(7), pp. 711-20, July 1997.

· Lambertian Reflectance and Linear Subspaces, R. Basri and D. Jacobs, IEEE Trans. on Pattern Analysis and Machine Intelligence, 25(2):218-233, 2003.

· Robust Face Recognition via Sparse Representation, J. Wright, A. Yang, A. Ganesh, S. Sastry, and Y. Ma, PAMI'09

· Yaniv Taigman, Ming Yang, Marc'Aurelio Ranzato, Lior Wolf, DeepFace: Closing the Gap to Human-Level Performance in Face Verification, CVPR 2014.




Week 9


Midterm - Project proposal due

·  Electronic version due before class (email to me before the class starts)

·  No presentation needed

·  Template: use the CVPR 2015 template, available at


·  Requirement:

-        Strictly follow the CVPR template above, including font and page sizes

-        Should contain at least the following sections: (1) Introduction, (2) Proposed research, (3) Evaluation plan, and (4) References

-        Minimum length is 2 pages, NOT including the references.


Category Classification and Scene Understanding

·  Bag-of-words


·  Szeliski's book, Ch. 14

·  Tutorial by L. Fei-Fei, R. Fergus, and A. Torralba


Paper presentation:

· [Presenter: Xiang Li] Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images, S. Gupta, P. Arbelaez, and J. Malik, CVPR 2013.

· [Presenter: Sang Yu] Clement Farabet, Camille Couprie, Laurent Najman and Yann LeCun: Learning Hierarchical Features for Scene Labeling, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013.

· [Presenter: Shuai Di] Finding Things: Image Parsing with Regions and Per-Exemplar Detectors, Joseph Tighe, Svetlana Lazebnik, CVPR 2013.

Additional Reading:

· The Pyramid Match Kernel: Efficient Learning with Sets of Features.  K. Grauman and T. Darrell.  Journal of Machine Learning Research (JMLR), 8 (Apr): 725--760, 2007.

· Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. S. Lazebnik, C. Schmid, and J. Ponce, CVPR 2006  

· Modeling the shape of the scene: a holistic representation of the spatial envelope, Aude Oliva, Antonio Torralba, International Journal of Computer Vision, Vol. 42(3): 145-175, 2001.





Week 10

Object detection  (Slides)

·  Human detection

·  Face detection

·  General object detection

Paper presentation:

· [Presenter: Ning Wang] Object Detection with Discriminatively Trained Part Based Models, P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan, PAMI, Vol. 32, No. 9, 2010

· [Presenter: Xiao Xiao] Fast, Accurate Detection of 100,000 Object Classes on a Single Machine, T. Dean, M. A. Ruzon, M. Segal, J. Shlens, S. Vijayanarasimhan, and J. Yagnik, CVPR 2013


Additional Reading:

· P. Viola and M. Jones, Robust Real-time Object Detection, IJCV 2002.

· N. Dalal and B. Triggs, Histograms of Oriented Gradients for Human Detection, CVPR, 2005.

· Human detection using partial least squares analysis. W. R. Schwartz, A. Kembhavi, D. Harwood, L.S. Davis, ICCV 2009.

· P. Dollar, C. Wojek, B. Schiele, and P. Perona. Pedestrian detection: An evaluation of the state of the art. TPAMI, 2011.

· Cong Yao, Xiang Bai, Baoguang Shi, and Wenyu Liu, Strokelets: A Learned Multi-Scale Representation for Scene Text Recognition, CVPR 2014

· Rich feature hierarchies for accurate object detection and semantic segmentation, Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik, CVPR 2014.

· 30Hz Object Detection with DPM V5, Mohammad Amin Sadeghi and David Forsyth, ECCV 2014.




Week 11

Video analysis – Tracking (Slides)

·  Survey of visual tracking



· A. Yilmaz, O. Javed, and M. Shah, Object Tracking: A Survey, ACM Computing Surveys, Vol. 38, No. 4, 2006.

· W. Luo, X. Zhao, T-K. Kim, Multiple Object Tracking: A Review, arXiv:1409.7618, 2014


Paper presentation:

· [Presenter: Tuan Anh Vo] Multi-target Tracking by Lagrangian Relaxation to Min-Cost Network Flow. Asad A. Butt and Robert T. Collins, CVPR 2013.

· [Presenter: Yiyi Zhu] Multi-Forest Tracker: A Chameleon in Tracking, David Joseph Tan, Slobodan Ilic, CVPR 2014.

· [Presenter: Yu Pang] High-Speed Tracking with Kernelized Correlation Filters, Joao F. Henriques, Rui Caseiro, Pedro Martins, Jorge Batista, PAMI, 2015


Additional Reading:

· CONDENSATION - conditional density propagation for visual tracking, M. Isard and A. Blake, IJCV 1998.

· R. Collins. Multitarget data association with higher-order motion models. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012

· H. Pirsiavash, D. Ramanan, and C. C. Fowlkes. Globally optimal greedy algorithms for tracking a variable number of objects. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2011.

· J. Berclaz, F. Fleuret, E. Turetken, and P. Fua. Multiple object tracking using k-shortest paths optimization. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 33(9):1806-1819, 2011.

· Multi-Target Tracking by Online Learning a CRF Model of Appearance and Motion Patterns, Bo Yang and Ram Nevatia, IJCV 2014




Week 12

Video analysis - Activity Understanding (Slides)

·  Features and representation

·  Low and high level models



·  J. K. Aggarwal and M. S. Ryoo, "Human Activity Analysis: A Review", ACM Computing Surveys (CSUR), 43(3), April 2011.

· P. Turaga, R. Chellappa, V.S. Subrahmanian and O. Udrea, "Machine Recognition of Human Activities: A Survey", IEEE Trans. on Circuits and Systems for Video Technology, 18:1473-1488, 2008


Paper presentation:

· [Presenter: Nikhil Reddy Mogulla] Large-scale Video Classification with Convolutional Neural Networks, Andrej Karpathy, George Toderici, Sanketh Shetty, Thomas Leung, Rahul Sukthankar, and Li Fei-Fei, CVPR 2014

· [Presenter: Yuxi Wang] Robust Motion Segmentation with Unknown Correspondences, Pan Ji, Hongdong Li, Mathieu Salzmann, and Yuchao Dai, ECCV 2014

Additional Reading:

· On space-time interest points, I. Laptev, IJCV 2005.
· Actions as space-time shapes, Gorelick, L., Blank, M., Shechtman, E., Irani, M., and Basri, R. PAMI 2007.

· Range-Sample Depth Feature for Action Recognition. Cewu Lu, Jiaya Jia, Chi-Keung Tang, CVPR 2014

· Efficient Action Localization with Approximately Normalized Fisher Vectors, Dan Oneata, Jakob Verbeek, Cordelia Schmid, CVPR 2014.

· A Hierarchical Context Model for Event Recognition in Surveillance Video, Xiaoyang Wang and Qiang Ji, CVPR 2014.





Week 13

Advanced Topics

· Medical image analysis  (Slides)


Final - Project presentation

Slot 1: Runzhong Huang | Slot 2: Miriam Fuchs | Slot 3: Yu Sang | Slot 4: Cong Rao | Slot 5: Ning Wang | Slot 6: Mian Wang |





Week 14

Advanced Topics



Final - Project presentation

Slot 7: Tuan Anh Vo | Slot 8: Yiyi Zhu | Slot 9: Abdulsalam Aref Hdadi | Slot 10: Brian Thibodeau and Peter Dane Mollica | Slot 11: Xiao Xiao | Slot 12: Tian Bai and  |






Project report due!