Pedestrian Detection for Mobile Platforms



Project Goal

"Produce a deep network capable of detecting and localizing nearby, potentially occluded pedestrians under reasonable lighting conditions, that can be integrated into a mobile robot (eg. Tegra X1 chip)"

Compression Techniques


KD

Knowledge Distillation

Train a large network as a teacher model and transfer its knowledge to a smaller student model [Hinton, 2015]
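As an illustration of the idea, the sketch below computes a distillation loss for a single example in plain Python. It follows the general recipe from [Hinton, 2015] (temperature-softened teacher targets combined with the ordinary hard-label loss), but the temperature, the weighting `alpha`, and the logits are hypothetical values chosen for the example, not the settings used in this project.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax over a list of logits, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # Cross-entropy between the softened teacher and student distributions,
    # scaled by T^2 so gradient magnitudes stay comparable across temperatures.
    soft = -sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))
    soft *= temperature ** 2
    # Ordinary cross-entropy against the ground-truth class index.
    hard = -math.log(softmax(student_logits)[label])
    return alpha * soft + (1.0 - alpha) * hard

# Hypothetical logits for two classes (background, pedestrian).
teacher = [0.5, 3.0]
good_student = [0.4, 2.8]   # agrees with the teacher
poor_student = [2.0, 0.1]   # disagrees with the teacher
loss_good = distillation_loss(good_student, teacher, label=1)
loss_poor = distillation_loss(poor_student, teacher, label=1)
print(loss_good < loss_poor)
```

A student whose logits track the teacher's receives the lower loss, which is the signal that pulls the small network toward the large one during training.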

MC

Model Confidence

Estimate teacher confidence by enabling dropout at test time [Gal, 2015], and maximize the log-likelihood of a multivariate Gaussian distribution
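A minimal sketch of this step, under stated assumptions: the teacher is stood in for by a toy linear layer with dropout on its inputs, its output is sampled over repeated stochastic passes, and the Gaussian is assumed to have a diagonal covariance built from the per-output sample variance. The layer weights, keep probability, and number of passes are hypothetical. Maximizing the log-likelihood is written as minimizing the negative log-likelihood.

```python
import math
import random

def dropout_forward(weights, x, keep_prob, rng):
    """One stochastic pass of a toy linear layer with dropout on its inputs."""
    # Inverted dropout: zero each input with prob 1 - keep_prob, rescale the rest.
    masked = [xi / keep_prob if rng.random() < keep_prob else 0.0 for xi in x]
    return [sum(w * m for w, m in zip(row, masked)) for row in weights]

def mc_dropout_stats(weights, x, keep_prob=0.8, passes=200, seed=0):
    """Per-output mean and variance over repeated stochastic passes."""
    rng = random.Random(seed)
    samples = [dropout_forward(weights, x, keep_prob, rng) for _ in range(passes)]
    dims = len(weights)
    mean = [sum(s[d] for s in samples) / passes for d in range(dims)]
    var = [sum((s[d] - mean[d]) ** 2 for s in samples) / passes
           for d in range(dims)]
    return mean, var

def gaussian_nll(y, mean, var, eps=1e-6):
    """Negative log-likelihood of y under a diagonal Gaussian N(mean, var)."""
    nll = 0.0
    for yi, mi, vi in zip(y, mean, var):
        vi += eps  # guard against a zero variance
        nll += 0.5 * (math.log(2.0 * math.pi * vi) + (yi - mi) ** 2 / vi)
    return nll

teacher_weights = [[0.2, -0.5, 1.0], [0.7, 0.1, -0.3]]  # hypothetical 3 -> 2 layer
features = [1.0, 2.0, 0.5]
mean, var = mc_dropout_stats(teacher_weights, features)

# A student output at the teacher mean is more likely than one shifted away.
nll_near = gaussian_nll(mean, mean, var)
nll_far = gaussian_nll([m + 1.0 for m in mean], mean, var)
print(nll_near < nll_far)
```

Because each squared error is divided by the teacher's variance, outputs on which the teacher is uncertain are penalized less, so the student is not forced to match unreliable targets.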

HL

Hint Layer

Increase the dimensionality of the teacher's guidance by adding an extra fully connected layer, allowing a better representation of the teacher's output distribution

HF

Hand-designed Features

Use Aggregate Channel Features [Dollar, 2014] as hand-designed features to increase the student model's complexity without introducing significant overhead

Pipeline Comparison


Results


Log-average miss rate on Caltech
(lower is better)
Model                       Log-avg MR   Drop vs. Teacher
Teacher                     17.5%        0.0%
Student                     24.5%        7.0%
Student+KD                  24.8%        7.3%
Student+KD+Conf             23.7%        6.2%
Student+KD+Hint             23.1%        5.6%
Student+KD+Conf+Hint        22.4%        4.9%
Student+KD+ACF              25.2%        7.7%
Student+KD+ACF+Conf+Hint    23.4%        5.9%
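For readers unfamiliar with the metric, the sketch below shows one common way the log-average miss rate is computed on Caltech: the miss rate is sampled at nine false-positives-per-image (FPPI) values evenly spaced in log space between 10^-2 and 10^0, and the samples are averaged in the log domain. This is an assumption about the evaluation protocol, not a description of the exact script behind the table, and the curve values are hypothetical.

```python
import math

def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of the miss rate at FPPI points log-spaced in [1e-2, 1e0].

    fppi must be sorted in increasing order, with miss_rate aligned to it.
    """
    refs = [10.0 ** (-2.0 + 2.0 * i / (num_points - 1)) for i in range(num_points)]
    logs = []
    for ref in refs:
        # Use the operating point with the largest FPPI not exceeding the reference.
        eligible = [m for f, m in zip(fppi, miss_rate) if f <= ref]
        # If the curve starts above the reference FPPI, count it as a full miss.
        m = eligible[-1] if eligible else 1.0
        logs.append(math.log(max(m, 1e-10)))
    return math.exp(sum(logs) / len(logs))

# Hypothetical curve: the miss rate falls as more false positives are allowed.
fppi = [0.005, 0.01, 0.03, 0.1, 0.3, 1.0]
miss = [0.60, 0.45, 0.35, 0.25, 0.18, 0.12]
lamr = log_average_miss_rate(fppi, miss)
print(f"log-average miss rate: {lamr:.3f}")
```

Averaging in the log domain keeps a single poor operating point from dominating the summary, which is why one number can stand in for the whole miss-rate curve.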

Resource Usage
(Measured on Titan-X)
Model                   Params    Memory     Speed
ResNet-200 (Teacher)    63 M      4.93 GB    24 ms
ResNet-18               11 M      612 MB     3 ms
ResNet-18-Thin          2.8 M     308 MB     3 ms
ResNet-18-Small         0.16 M    240 MB     3 ms
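To make the savings easier to compare, the short script below derives reduction factors relative to the teacher from the figures in the resource table. It assumes the "Speed" column is a time in milliseconds and that 1 GB equals 1024 MB; the helper name `reduction_factors` is introduced here for illustration only.

```python
# Figures copied from the resource table: (params in millions, memory in MB,
# time in ms). The teacher's 4.93 GB is converted assuming 1 GB = 1024 MB.
models = {
    "ResNet-200 (Teacher)": (63.0, 4.93 * 1024, 24.0),
    "ResNet-18": (11.0, 612.0, 3.0),
    "ResNet-18-Thin": (2.8, 308.0, 3.0),
    "ResNet-18-Small": (0.16, 240.0, 3.0),
}

def reduction_factors(models, baseline="ResNet-200 (Teacher)"):
    """Divide the baseline's figures by each model's figures."""
    base = models[baseline]
    return {name: tuple(b / v for b, v in zip(base, vals))
            for name, vals in models.items()}

factors = reduction_factors(models)
for name, (params, memory, time) in factors.items():
    print(f"{name}: {params:.1f}x fewer params, "
          f"{memory:.1f}x less memory, {time:.1f}x faster")
```

Under these assumptions all three students run 8x faster than the teacher, so the choice among them comes down to parameter count and memory footprint.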