User:Yhcai/My Reading Area

4.22

Stochastic Adaptive Tracking In A Camera Network, ICCV2007

Geolocating Static Cameras, ICCV-workshop 2007
Learning Higher-order Transition Models in Medium-scale Camera Networks, ICCV2007

Xiaowan's Master's thesis: Image Point Correspondence and its Applications

Four steps of finding image point correspondence among images:

1) Feature space: feature selection, such as SIFT, Shape context, SPIN, RIFT

2) Search space: determine the geometric transformation models: global or local, then determine parameters of models.

Global geometric transformation models include homography, the epipolar constraint, etc.

Local mapping models include piecewise-linear mapping, spline functions, and surface fitting.

3) Similarity measure: how to calculate the similarity between features.

4) Searching strategy: how to efficiently find the model parameters and the correspondence between features (possibly 1-to-many?).

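Common search strategies include iterative relaxation, branch and bound, gradient descent, the Levenberg-Marquardt (LM) method, k-d trees, and hashing.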

Additional steps: how to use geometric constraints to remove outliers, and how to propagate the correspondences to neighboring points.
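The outlier-removal step can be sketched as follows, assuming a global homography H has already been estimated. The function names, the 2-pixel threshold, and the toy matches are illustrative assumptions, not taken from the thesis.

```python
import math

def apply_homography(H, pt):
    """Map a 2D point through a 3x3 homography (given as nested lists)."""
    x, y = pt
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

def filter_inliers(H, matches, max_error=2.0):
    """Keep matches (p, q) whose transfer error |H p - q| is at most max_error pixels.

    max_error is a hypothetical threshold chosen for illustration.
    """
    inliers = []
    for p, q in matches:
        px, py = apply_homography(H, p)
        if math.hypot(px - q[0], py - q[1]) <= max_error:
            inliers.append((p, q))
    return inliers

# Toy example: H is a pure translation by (5, -3).
H = [[1.0, 0.0, 5.0], [0.0, 1.0, -3.0], [0.0, 0.0, 1.0]]
matches = [((0, 0), (5, -3)),        # exact match
           ((10, 10), (15.5, 7.2)),  # small localization error
           ((4, 4), (40, 40))]       # gross outlier
inliers = filter_inliers(H, matches)
```

In practice H itself is usually estimated robustly (for example with RANSAC) from the same putative matches; this sketch only shows the consistency check.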

Useful information:

Rotation-invariant feature descriptor: Fourier Mellin, MSER, SPIN, RIFT

Scale-invariant feature descriptor:

Affine-invariant feature descriptor: SIFT, Shape context

How to compare two features: Chamfer distance, Hausdorff distance, SAD, ZNCC.

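For SAD, the best match is where the matching function reaches its minimum.

For ZNCC, the best match is where the correlation function reaches its maximum (my classmate Shu thinks very poorly of it).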
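A minimal sketch of the two measures on flattened patches, to make the min/max convention concrete. The patch values are made up, and treating a flat patch as correlation 0 is an assumption of this sketch.

```python
import math

def sad(a, b):
    """Sum of absolute differences between two equal-sized patches (lower is better)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def zncc(a, b):
    """Zero-mean normalized cross-correlation in [-1, 1] (higher is better)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        return 0.0  # flat patch: correlation is undefined, treated as 0 here (assumption)
    return sum(x * y for x, y in zip(da, db)) / denom

def best_match(query, candidates):
    """Return (index of the best SAD match, index of the best ZNCC match)."""
    sad_scores = [sad(query, c) for c in candidates]
    zncc_scores = [zncc(query, c) for c in candidates]
    return sad_scores.index(min(sad_scores)), zncc_scores.index(max(zncc_scores))

query = [10, 20, 30, 40]
candidates = [
    [12, 19, 33, 41],  # nearly identical intensities
    [60, 70, 80, 90],  # same pattern, uniformly brighter by 50
    [40, 30, 20, 10],  # reversed pattern
]
print(best_match(query, candidates))  # (0, 1)
```

The two measures disagree here because ZNCC ignores a uniform brightness offset while SAD penalizes it.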

Local geometric constraint: divide the whole image into small blocks; each block can be approximated by a homography or an affine transformation model.

Biggest takeaway: improvements to the SIFT descriptor:

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

edge-based sift descriptor: Shape recognition with edge-based features.

SIFT+ shape context: A sift descriptor with global context

Color sift: CSIFT: a sift descriptor with color invariant characteristics

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

4.24

Correspondence-Free Multi-Camera Activity Analysis and Scene Modeling

6.15

Spatial random partition for common visual pattern discovery: very interesting paper; its formulas are worth borrowing.

Real-time foreground-background segmentation using codebook model

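A method similar to online k-means that produces a codebook. Its color and brightness component could be considered for color constancy.

Each codeword in the codebook stores the pixel's minimum and maximum brightness, the frequency with which the codeword occurred, the maximum negative run-length (the longest interval during the training period in which the codeword did not recur), and the first and last times the codeword occurred.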
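A simplified sketch of how these per-codeword statistics could be maintained for one pixel. It matches on brightness only, which is an assumption made to keep it short (the paper also uses a color-distortion test). The class and function names, the tolerance `tol`, and the wrap-around rule for the run-length follow my recollection of the paper and are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Codeword:
    min_b: float  # minimum brightness seen
    max_b: float  # maximum brightness seen
    freq: int     # number of times the codeword occurred
    mnrl: int     # maximum negative run-length
    first: int    # first time the codeword occurred
    last: int     # last time the codeword occurred

def train_pixel(brightness_seq, tol=10.0):
    """Build the codebook of one pixel from its training sequence (time is 1-based)."""
    codebook = []
    for t, b in enumerate(brightness_seq, start=1):
        for cw in codebook:
            # Brightness-only match (assumption; no color-distortion test here).
            if cw.min_b - tol <= b <= cw.max_b + tol:
                cw.min_b = min(cw.min_b, b)
                cw.max_b = max(cw.max_b, b)
                cw.freq += 1
                cw.mnrl = max(cw.mnrl, t - cw.last)
                cw.last = t
                break
        else:
            # No codeword matched: start a new one; it was absent for t - 1 frames.
            codebook.append(Codeword(b, b, 1, t - 1, t, t))
    # Wrap around: join the gap after the last occurrence with the gap before the first.
    n = len(brightness_seq)
    for cw in codebook:
        cw.mnrl = max(cw.mnrl, n - cw.last + cw.first - 1)
    return codebook

codebook = train_pixel([100, 102, 200, 101, 99, 201, 100, 100])
```

A codeword with a large `mnrl` recurs rarely, so it is a candidate for removal as a transient foreground observation.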

Multiscale categorical object recognition using contour fragments: a star-shaped model.

each scale-normalized part F is composed of a contour fragment with expected offset from centroid and spatial uncertainty.

The Chamfer distance gives the mean distance of edgels in a template T to their closest edgels in an edge map E.

Orientation term: the mean difference in orientation between edgels in template T and the nearest edgels in edge map E.
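A brute-force sketch of the two quantities above. It searches for the nearest edgel directly instead of using a distance transform, and it treats orientations as equivalent modulo pi; both choices are assumptions of this sketch, as is the (x, y, theta) edgel format.

```python
import math

def oriented_chamfer(template, edge_map):
    """Return (mean distance, mean orientation difference).

    template, edge_map: lists of (x, y, theta) edgels, theta in radians.
    Each template edgel is compared with its closest edge-map edgel.
    """
    dist_sum = 0.0
    orient_sum = 0.0
    for tx, ty, tt in template:
        nearest = min(edge_map, key=lambda e: (e[0] - tx) ** 2 + (e[1] - ty) ** 2)
        dist_sum += math.hypot(nearest[0] - tx, nearest[1] - ty)
        d = abs(tt - nearest[2]) % math.pi
        orient_sum += min(d, math.pi - d)  # edge orientation is defined modulo pi
    n = len(template)
    return dist_sum / n, orient_sum / n

# Toy example: a horizontal template one pixel below the edge map;
# the last edge-map edgel is vertical instead of horizontal.
template = [(0, 0, 0.0), (1, 0, 0.0), (2, 0, 0.0)]
edge_map = [(0, 1, 0.0), (1, 1, 0.0), (2, 1, math.pi / 2)]
mean_dist, mean_orient = oriented_chamfer(template, edge_map)
```

For real edge maps a distance transform makes each lookup constant-time; the brute-force search is only meant to show the definition.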

1. Build a codebook: one codebook per class.

2. Fragment clustering, using the k-medoids algorithm.

3. Use mean shift to find a set of subclusters whose spatial positions are consistent.

local descriptors of contour:

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

shape recognition with edge-based features

shape matching and object recognition using shape contexts

groups of adjacent contour segments for object detection.

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

6.22

Applying color names to image description:

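Photometric-invariant methods have difficulty distinguishing achromatic colors such as black, grey, and white. Examples of such methods are hue, normalized RGB, etc.

This paper proposes a color-name-based method that clusters the color space into 11 basic colors.

Pipeline for each image: Harris-Laplace detector => normalization => descriptor based on (SIFT, color) => SVM classification.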

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Compact object descriptors from local color invariant histograms.

Scene classification via PLSA

color research and application

region-based image retrieval with high-level semantic color names

Categorizing nine visual classes using local appearance descriptors

a visual vocabulary for flower classification.

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

8.19

Detecting Global motion patterns in complex videos, humin, ICPR08

[10] modeled the motion of all the moving objects performing the same activity by analyzing the temporal deformation of the shape which was constructed by joining the locations of objects in each frame.

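(1) For every point i in every frame, compute a flow vector containing position and velocity information.

(2) Apply a threshold to remove the vectors that lie on the background.

(3) Use Gaussian ART [13] for dimensionality reduction.

(4) When searching for the end point of a trajectory, consider not only the previous velocity but also the velocities of its neighbors; this yields a sink and its associated sink path.

(5) Cluster the sinks and sink paths online to obtain the so-called super tracks, which represent the dominant motion directions.

(6) Super tracks can be matched for video retrieval; the matching uses a bipartite graph.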

Fast Human Detection from Videos Using Covariance Features

a cascade of LogitBoost classifiers

Selected features: 8-dimensional.

The covariance matrix is a very informative descriptor which encodes information about the variance of the features, their correlations with each other, and spatial layout. It can be computed efficiently using integral images.
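A small sketch of why integral images help: the region covariance depends only on the first-order sums P and the second-order sums Q of the feature vectors, and those are the quantities integral images return in constant time per region. Here the sums are accumulated directly. The function name and the 2-dimensional toy features are assumptions (the paper uses 8-dimensional features).

```python
def region_covariance(features):
    """Covariance matrix of the feature vectors inside a region.

    features: list of d-dimensional feature vectors, one per pixel.
    Uses C = (Q - P P^T / n) / (n - 1), where P is the sum of the vectors
    and Q is the sum of their outer products.
    """
    n = len(features)
    d = len(features[0])
    P = [0.0] * d
    Q = [[0.0] * d for _ in range(d)]
    for f in features:
        for i in range(d):
            P[i] += f[i]
            for j in range(d):
                Q[i][j] += f[i] * f[j]
    return [[(Q[i][j] - P[i] * P[j] / n) / (n - 1) for j in range(d)]
            for i in range(d)]

# Toy region of three pixels with two features each.
C = region_covariance([[1, 2], [2, 4], [3, 6]])
print(C)  # [[1.0, 2.0], [2.0, 4.0]]
```

With integral images of every feature and every pairwise product, P and Q for any rectangle come from a few table lookups, so the descriptor cost no longer depends on the region size.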

Multi-Object Tracking Using Color, Texture and Motion

Combine color histogram, correlogram, LBP, motion smoothness, geometric distance for object tracking.

The tracking framework follows the method of Yang Tao (杨涛).

Recognizing Action as Clouds of Space-Time Interest Points

Spatio-temporal interest points are taken at different temporal scales. Each scale gives one cloud, 6 clouds in total. For each cloud, compute its shape, speed, and density, as well as its shape and location information relative to the object's centroid. With S scales in total, the feature dimension per frame is 8S+2. For each feature, a histogram with Nb bins is computed.

For feature selection, a feature is deemed informative and relevant to the recognition task if its value varies little for actions of the same class but varies significantly for actions of different classes.
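One way to turn this criterion into a score is a Fisher-style ratio of between-class variance to within-class variance. These notes do not record the paper's exact measure, so the sketch below is a stand-in; the function name and the toy values are hypothetical.

```python
def relevance_score(values_by_class):
    """Between-class variance of the class means divided by the mean within-class variance.

    values_by_class: dict mapping a class label to the values of one feature
    observed for that class. A higher score means a more informative feature.
    """
    def mean(xs):
        return sum(xs) / len(xs)

    def var(xs):
        m = mean(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    class_means = [mean(v) for v in values_by_class.values()]
    between = var(class_means)
    within = mean([var(v) for v in values_by_class.values()])
    return between / within if within > 0 else float("inf")

# Stable within each class and different across classes: informative.
informative = {"walk": [1.0, 1.1, 0.9], "run": [5.0, 5.1, 4.9]}
# Varies a lot within each class, with similar class means: not informative.
uninformative = {"walk": [1.0, 5.0, 3.0], "run": [1.2, 4.8, 3.0]}

scores = (relevance_score(informative), relevance_score(uninformative))
```

Features could then be ranked by this score and the top ones kept.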