Human Joint Pose Estimator


Disclaimer

This project is an attempt to reproduce the results of DeepPose: Human Pose Estimation via Deep Neural Networks (project link). I made no contribution beyond the original paper; all I did was recreate its results.


Figure 1: My result from reproducing DeepPose

1. Introduction

Human pose estimation, the problem of locating human joints in images, has been studied extensively in the computer vision community. It is challenging because of strong articulations, occlusions, and the need to capture context. Most work in this field has modeled articulations with part-based models, which have limited expressiveness and capture only a small subset of the interactions between body parts. Holistic methods have also been proposed, but with limited success on real-world problems.

The DeepPose authors instead propose an approach to human pose estimation based on deep neural networks (DNNs). They show that DNNs can capture the full context of each body joint, and they formulate the problem as a simple yet powerful joint regression task. A cascade of DNN-based pose predictors then refines the joint predictions, achieving state-of-the-art results on four widely used benchmarks. The approach is shown to perform well on images with strong variation in appearance and articulation, and to generalize well across datasets.

2. Architecture

2.1 Pose Vector

2.2 CNN Architecture

2.3 DNN Regressor

Figure 2: An overview of the model: Image from the original paper

3. Accuracy Metrics

3.1 Percentage of Correct Parts (PCP)

The Percentage of Correct Parts (PCP) measures the detection rate of limbs. A limb is considered detected if the distances between its two predicted joint locations and the true limb joint locations are no more than half of the limb length. The drawback of PCP is that it penalizes shorter limbs, which are usually also the harder ones to detect.
Figure 3: PCP during training
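
The PCP criterion described above can be sketched in a few lines of Python. This is only an illustration, not the code used in this project: the function name `pcp`, the `limbs` index pairs, and the toy coordinates are all made up for the example, and it takes the strict reading in which both endpoints of a limb must be within half the limb length of the ground truth.

```python
import math

def pcp(pred, true, limbs, threshold=0.5):
    """Fraction of limbs counted as detected (hypothetical helper).

    pred, true: lists of (x, y) joint coordinates in the same joint order.
    limbs: list of (joint_a, joint_b) index pairs, one pair per limb.
    """
    detected = 0
    for a, b in limbs:
        # Limb length is measured on the ground-truth pose.
        limb_len = math.dist(true[a], true[b])
        # Assumption: both endpoints must be within threshold * limb length.
        ok_a = math.dist(pred[a], true[a]) <= threshold * limb_len
        ok_b = math.dist(pred[b], true[b]) <= threshold * limb_len
        detected += ok_a and ok_b
    return detected / len(limbs)

# Toy pose with three joints on a line and two limbs of length 10.
true_joints = [(0, 0), (10, 0), (20, 0)]
pred_joints = [(1, 0), (12, 0), (30, 0)]
limbs = [(0, 1), (1, 2)]

pcp_score = pcp(pred_joints, true_joints, limbs)
print(pcp_score)
```

In this toy pose the first limb is detected (endpoint errors of 1 and 2 against an allowed 5), while the second is not (its far endpoint is off by 10), so only half of the limbs count as correct.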

3.2 Percent of Detected Joints (PDJ)

Another metric called the Percent of Detected Joints (PDJ) has been proposed to address the limitations of the PCP metric. This metric measures joint detection rates by considering a joint to be detected if the distance between the predicted and true joint locations is within a certain fraction of the torso diameter. By adjusting this fraction, detection rates can be obtained for different levels of localization precision.
Figure 4: PDJ during training
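
A similar sketch for PDJ, again only as an illustration under stated assumptions: the function name `pdj`, the `torso` index pair, and the toy coordinates are hypothetical, and the torso diameter is taken here as the distance between two chosen ground-truth joints (for example a shoulder and the opposite hip).

```python
import math

def pdj(pred, true, torso, fraction=0.2):
    """Fraction of joints counted as detected (hypothetical helper).

    pred, true: lists of (x, y) joint coordinates in the same joint order.
    torso: pair of joint indices whose ground-truth distance is used
           as the torso diameter (an assumption of this sketch).
    """
    torso_diameter = math.dist(true[torso[0]], true[torso[1]])
    limit = fraction * torso_diameter
    # A joint is detected if its error is within the allowed distance.
    hits = sum(math.dist(p, t) <= limit for p, t in zip(pred, true))
    return hits / len(true)

# Toy pose: four joints, torso measured between joints 0 and 3.
true_joints = [(0, 0), (10, 0), (0, 20), (10, 20)]
pred_joints = [(1, 0), (10, 3), (0, 26), (10, 20)]

pdj_tight = pdj(pred_joints, true_joints, torso=(0, 3), fraction=0.2)
pdj_loose = pdj(pred_joints, true_joints, torso=(0, 3), fraction=0.5)
print(pdj_tight, pdj_loose)
```

Sweeping `fraction` over a range of values gives the detection rate at each level of localization precision, which is how a PDJ curve is usually drawn.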

4. Results

After training the model, we tested it on both the training and testing datasets to see its performance.

4.1 Train images

Figure 5: Model's output on a training image

4.2 Test images

Figure 6: Model's output on a testing image

5. Conclusion

As you can see, the results on both training and testing images are good. However, the results on the training images are clearly better, so the way to improve performance on the testing images is to increase the number of training samples. The overfitting effect is also visible in Figure 3 and Figure 4.

More information