UNC Crowd Scene Analysis Dataset

Pedestrian Count

Lighting Condition

Camera Viewpoint

Environment (Indoor vs Outdoor)

Background
(Real vs Simulated)

Crowd Behaviour (Aggresive vs Shy)

Overview video

Synthetic Videos
Labels: High density, Max Pedestrian Count: 382, Good Light Condition, Indoor Environment, Simulated Background, Crowd Behavior: Aggressive, Congestion

The main goal of our work is to produce a rich labeled dataset for machine learning. In order to learn high-level features from videos or images, for example performing crowd counting and motion segmentation, a ground truth is often necessary for training or verifies the results. Eventually, we will produce over a million videos, and over 20 million images with ground truth labeled. Our video dataset is generated with a diversity of variations, including: