Fall 2013 - CIS 5543|4360


Computer Vision   


Basic Information:

·  Lecture time: Tuesday 5:30-8:00pm, TTLMAN 0403B

·  Instructor: Haibin Ling | Wachman Hall, Room 305 | 215-204-6973 | hbling AT temple.edu

·  Office Hours: Tuesday 3:00-5:00pm, or by appointment

·  Syllabus: PDF





Topics and Materials (Tentative, subject to change)

Week 1

General introduction. (Slide 1, 2)

·  Background

·  Topics in visual data analysis

·  Applications

·  Related fields
·  Image formation


·  Szeliski's book, Ch. 1-2



Week 2

Background – Elementary Math and Statistics (Slides)

·  Elementary probability and statistics

·  Linear analysis (PCA, LDA, etc.)

·  Graphical models (HMM, MRF, etc.)


Materials: (Slides)

·  Szeliski's book, Appendix A-C

·  A Global Geometric Framework for Nonlinear Dimensionality Reduction, Joshua B. Tenenbaum, Vin de Silva, John C. Langford, Science, 2000.

·  Nonlinear Dimensionality Reduction by Locally Linear Embedding, Sam T. Roweis, Lawrence K. Saul, Science, 2000.



Week 3

Background - Optimization (Slides)

·  Basic idea and linear optimization

·  Convex optimization


·  Convex Optimization, S. Boyd and L. Vandenberghe.

·  Additive logistic regression: a statistical view of boosting, J. Friedman, T. Hastie, and R. Tibshirani, The Annals of Statistics, 2000.



Week 4



Paper selection due!

Segmentation (Slides)

·  Clustering, perceptual grouping, segmentation



·  Szeliski's book, Ch. 5

Paper presentation:

· [Presenter: David Dobor, slides] Fast Approximate Energy Minimization via Graph Cuts, Y. Boykov, O. Veksler and R. Zabih, PAMI, 2001.

· [Presenter: Feipeng Zhao, slides] Video Object Segmentation through Spatially Accurate and Temporally Dense Extraction of Primary Object Regions, D. Zhang, O. Javed, M. Shah, CVPR 2013


Additional Readings:

· Normalized Cuts and Image Segmentation, by J. Shi and J. Malik, PAMI 2000

· Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, S. Geman and D. Geman, PAMI, 6:721-741, 1984.
· Learning to Combine Bottom-Up and Top-Down Segmentation, A. Levin and Y. Weiss, IJCV 2009.

· Random walks for image segmentation, L. Grady, PAMI, 2006.
· Efficient Graph-Based Image Segmentation, P.F. Felzenszwalb and Daniel P. Huttenlocher, IJCV, 59(2), 2004.

· Spatial Inference Machines, Roman Shapovalov, Dmitry Vetrov, Pushmeet Kohli, CVPR 2013

· Hierarchy and adaptivity in segmenting visual scenes, E. Sharon, M. Galun, D. Sharon, R. Basri, and A. Brandt, Nature, 442(7104):810-813, August 17, 2006.



Week 5

Features and Representation (Slides)

·  What are features?

·  Invariant features and representations



·  Szeliski's book, Ch. 4


Paper presentation:

· [Presenter: Joseph Catrambone] Distinctive image features from scale-invariant keypoints, D. Lowe, IJCV 2004.

· [Presenter: Yufeng Wang] Performance Evaluation of 3D Keypoint Detectors, F. Tombari, S. Salti, L. Di Stefano, International Journal of Computer Vision (IJCV), 102(1-3):198-220, 2013


Additional Readings:

· C. Schmid & R. Mohr, Local Grayvalue Invariants for Image Retrieval, PAMI, 19(5), 530-535, 1997.

· Herbert Bay, Andreas Ess, Tinne Tuytelaars, Luc Van Gool, SURF: Speeded Up Robust Features, CVIU, Vol. 110, No. 3, pp. 346-359, 2008



Week 6

Matching, Alignment, and Registration (Slides)

·  Feature matching

·  Surface matching

·  Image matching


·  Szeliski's book, Ch. 6


Paper presentation:

· [Presenter: Masih Tabrizi] ORB: an efficient alternative to SIFT or SURF, E. Rublee, V. Rabaud, K. Konolige and G. Bradski, ICCV 2011.

· [Presenter: Peiyi Li] A tensor-based algorithm for high-order graph matching, O. Duchenne, F. Bach, I. Kweon, and J. Ponce, TPAMI, 33(12):2383–2395, 2011.

Additional Readings:

· A new point matching algorithm for non-rigid registration, H. Chui and A. Rangarajan, Computer Vision and Image Understanding (CVIU), 89:114-141, 2003.

· Alignment by Maximization of Mutual Information, by P. Viola and W. M. Wells III, IJCV, 1997
· Principal Warps: Thin-Plate Splines and the Decomposition of Deformations, by F. Bookstein, PAMI 1989, Vol 11, No 6.

· Iterative Point Matching for Registration of Free-Form Curves and Surfaces, Z. Zhang, IJCV, Vol.13, No.2, pages 119-152, 1994




Week 7

Shape (Slides)

·  Shape space - Procrustes analysis

·  Shape matching - TPS, RANSAC

·  Shape classification - shape context


·  Szeliski's book, Ch. 14


Paper presentation:

· [Presenter: Qingyuan Liu] Shape Matching and Object Recognition Using Shape Contexts, S. Belongie, J. Malik and J. Puzicha, PAMI, 24(4):509-522, 2002.

· [Presenter: Samantha Claudet] The Shape Boltzmann Machine: a Strong Model of Object Shape, S. M. A. Eslami, N. Heess, J. Winn, CVPR 2012.


Additional Readings:

· 3D model retrieval using probability density-based shape descriptors, C. B. Akgul, B. Sankur, Y. Yemez, and F. Schmitt, PAMI, vol. 31, no. 6, 2009.

· Shape distributions, R. Osada, T. Funkhouser, B. Chazelle, and D. Dobkin, ACM Trans. Graphics, vol. 21, no. 4, pp. 807–832, 2002.

· Active shape models - their training and application, T. F. Cootes, C. J. Taylor, D. H. Cooper, and J. Graham, CVIU, 61(1):38-59, 1995.

· Rotation invariant spherical harmonic representation of 3D shape descriptors, M. Kazhdan, T. Funkhouser, and S. Rusinkiewicz. Eurographics/ACM SIGGRAPH symposium on Geometry processing (SGP). 2003.

· A survey of content based 3D shape retrieval methods, J. W. Tangelder and R. C. Veltkamp, Multimedia Tools and Applications, 39(3):441-471, 2008.

· Patchwork of Parts Models for Object Recognition, Y. Amit and A. Trouve, IJCV 2007.



Week 8

Visual recognition - objects and faces (Slides)

·  Eigenface


·  ASM


·  Szeliski's book, Ch. 14


Paper presentation:

· [Presenter: Jun Yang] Robust Face Recognition via Sparse Representation, J. Wright, A. Yang, A. Ganesh, S. Sastry, and Y. Ma, PAMI'09

· [Presenter: Shuang Liang] Describable Visual Attributes for Face Verification and Image Search, by N. Kumar, A. C. Berg, P. N. Belhumeur, and S. K. Nayar, PAMI'11


Additional Readings:

· Eigenfaces for recognition, M. Turk and A. Pentland, Journal of Cognitive Neuroscience, 3:71-86, 1991.

· Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection, P. Belhumeur, J. Hespanha and D. Kriegman, PAMI, 19(7), pp. 711-20, July 1997.

· Blessing of Dimensionality: High-dimensional Feature and Its Efficient Compression for Face Verification, D. Chen, X. Cao, F. Wen, and J. Sun, CVPR 2013
· Multilinear Analysis of Image Ensembles: TensorFaces, M. A. O. Vasilescu, D. Terzopoulos, ECCV, 2002.

· Lambertian Reflectance and Linear Subspaces, R. Basri and D. Jacobs, IEEE Trans. on Pattern Analysis and Machine Intelligence, 25(2):218-233, 2003.

· Face recognition using Laplacianfaces, X. He, S. Yan, Y. Hu, and P. Niyogi, IEEE Trans. on Pattern Analysis and Machine Intelligence, 2005.



Week 9


Midterm - Project proposal due

·  Electronic version due before class (email to me before the class starts)

·  No presentation needed

·  Template, using CVPR 2013 template, available at


·  Requirements:

-        Strictly follow the CVPR template above, including font and page sizes

-        Should contain at least the following sections: (1) Introduction, (2) Proposed research, (3) Evaluation plan, and (4) References

-        Minimum length: 2 pages, NOT including the references.


Category Classification and Scene Understanding

·  Bag-of-words


·  Szeliski's book, Ch. 14

·  Tutorial by L. Fei-Fei, R. Fergus, and A. Torralba


Paper presentation:

· [Presenter: Chen Shen] Modeling the shape of the scene: a holistic representation of the spatial envelope, Aude Oliva, Antonio Torralba, International Journal of Computer Vision, Vol. 42(3): 145-175, 2001.

· [Presenter: Ferria Amzovski] Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images, S. Gupta, P. Arbelaez, and J. Malik, CVPR 2013.


Additional Readings:

· The Pyramid Match Kernel: Efficient Learning with Sets of Features. K. Grauman and T. Darrell. Journal of Machine Learning Research (JMLR), 8 (Apr): 725-760, 2007.

· Locality-constrained Linear Coding for image classification. J. Wang, J. Yang, K. Yu, F. Lv, T. Huang, Y. Gong, CVPR, 2010.

· Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. S. Lazebnik, C. Schmid, and J. Ponce, CVPR 2006  



Week 10

Object detection  (Slides)

·  Human detection

·  Face detection

·  General object detection

Paper presentation:

· [Presenter: Longfei Wu] Object Detection with Discriminatively Trained Part Based Models, P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan, PAMI, Vol. 32, No. 9, 2010

· [Presenter: Liya Ma] Fast, Accurate Detection of 100,000 Object Classes on a Single Machine, T. Dean, M. A. Ruzon, M. Segal, J. Shlens, S. Vijayanarasimhan, and J. Yagnik, CVPR 2013

· [Presenter: Adama Coulibaly] Human detection using partial least squares analysis. W. R. Schwartz, A. Kembhavi, D. Harwood, L.S. Davis, ICCV 2009.


Additional Readings:

· P. Viola and M. Jones, Robust Real-time Object Detection, IJCV 2002.

· N. Dalal and B. Triggs, Histograms of Oriented Gradients for Human Detection, CVPR, 2005.

· P. Dollar, C. Wojek, B. Schiele, and P. Perona. Pedestrian detection: An evaluation of the state of the art. TPAMI, 2011.



Week 11

Video analysis – Tracking (Slides)

·  Survey of visual tracking



· A. Yilmaz, O. Javed, and M. Shah, Object Tracking: A Survey, ACM Computing Surveys, Vol. 38, No. 4, 2006.

Paper presentation:

· [Presenter: Nicholas J. Woodward] J. Berclaz, F. Fleuret, E. Turetken, and P. Fua. Multiple object tracking using k-shortest paths optimization. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 33(9):1806-1819, 2011.

· [Presenter: Shuang Liang] I. Saleemi and M. Shah, Multiframe Many-Many Point Correspondence for Vehicle Tracking in High Density Wide Area Aerial Videos, International Journal of Computer Vision, 2013.


Additional Readings:

· CONDENSATION - conditional density propagation for visual tracking, M. Isard and A. Blake, IJCV 1998.

· Kernel-Based Object Tracking, D. Comaniciu, V. Ramesh, P. Meer, PAMI, Vol. 25, No. 5, 564-575, 2003

· Robust Object Tracking with Online Multiple Instance Learning, B. Babenko, M.-H. Yang, and S. Belongie, PAMI, vol. 33, no. 8, pp. 1619-1632, 2011.

· Tracking-Learning-Detection, Z. Kalal, K. Mikolajczyk, and J. Matas, PAMI 2012

· Multi-target tracking by on-line learned discriminative appearance models, C.H. Kuo, C. Huang, and R. Nevatia, CVPR 2010.

· Struck: Structured output tracking with kernels, S. Hare, A. Saffari, and P. Torr, ICCV, 2011

· Detection and Tracking of Multiple, Partially Occluded Humans by Bayesian Combination of Edgelet based Part Detectors, B. Wu and R. Nevatia, IJCV 2007.

· R. Collins. Multitarget data association with higher-order motion models. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012.

· Multi-target Tracking by Lagrangian Relaxation to Min-Cost Network Flow. Asad A. Butt and Robert T. Collins, CVPR 2013.

· H. Pirsiavash, D. Ramanan, and C. C. Fowlkes. Globally optimal greedy algorithms for tracking a variable number of objects. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2011.



Week 12

Video analysis - Activity Understanding (Slides)

·  Features and representation

·  Low and high level models



·  J. K. Aggarwal and M. S. Ryoo, "Human Activity Analysis: A Review", ACM Computing Surveys (CSUR), 43(3), April 2011.


Paper presentation:

· [Presenter: Hongxu Zhang] Learning realistic human actions from movies, I. Laptev, M. Marszalek, C. Schmid and B. Rozenfeld, CVPR, 2008.

· [Presenter: Lakesh Kansakar] Context-Aware Modeling and Recognition of Activities in Video. Y. Zhu, N. M. Nayak, A. K. Roy-Chowdhury, CVPR 2013.


Additional Readings:

· On space-time interest points, I. Laptev, IJCV 2005.

· Behavior Recognition via Sparse Spatio-Temporal Features, P. Dollar, V. Rabaud, G. Cottrell, S. Belongie, VS-PETS, 2005.

· P. Turaga, R. Chellappa, V.S. Subrahmanian and O. Udrea, "Machine Recognition of Human Activities: A Survey", IEEE Trans. on Circuits and Systems for Video Technology, 18:1473-1488, 2008

· Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words, J. C. Niebles, H. Wang and L. Fei-Fei, IJCV 2008.

· Actions as space-time shapes, L. Gorelick, M. Blank, E. Shechtman, M. Irani, and R. Basri, PAMI 2007.



Week 13

Advanced Topics

·  TBA


Final - Project presentation

· F. Amzovski (1), J. Catrambone (2), D. Dobor (3), C. Shen (4), F. Zhao (5), Y. Wang & H. Zhang (6), J. Yang (7), P. Li (8)



Week 14

Advanced Topics

·  TBA


Final - Project presentation

· L. Ma (9), S. Claudet (10), S. Liang & L. Wu (11), N. Woodward (12), Q. Liu (13), L. Kansakar (14), A. Coulibaly (15), M. Tabrizi (16)




Project report due!