SOCA-PRNet: Spatially Oriented Attention-Infused Structured-Feature-Enabled PoseResNet for 2D Human Pose Estimation

Ali Zakir; Sartaj Ahmed Salman; Hiroki Takahashi

doi:10.3390/s24010110

Sensors (Basel). 2023 Dec 25;24(1):110. doi: 10.3390/s24010110.

Authors

Ali Zakir¹, Sartaj Ahmed Salman¹, Hiroki Takahashi¹,²

Affiliations

¹ Department of Informatics, Graduate School of Informatics and Engineering, The University of Electro-Communications, Tokyo 182-8585, Japan.
² Artificial Intelligence Exploration Research Center/Meta-Networking Research Center, The University of Electro-Communications, Tokyo 182-8585, Japan.

Abstract

In recent years, 2D human pose estimation (HPE) has become an integral part of advanced computer vision (CV) applications, particularly those concerned with understanding human behavior. Despite challenges such as occlusion, unfavorable lighting, and motion blur, advances in deep learning have significantly improved the performance of 2D HPE by enabling automatic feature learning from data and better model generalization. Because 2D HPE depends on accurately identifying and classifying human body joints, further optimization is imperative. In response, we introduce the Spatially Oriented Attention-Infused Structured-Feature-enabled PoseResNet (SOCA-PRNet) for enhanced 2D HPE. The model incorporates a novel element, Spatially Oriented Attention (SOCA), designed to improve accuracy without significantly increasing the parameter count. Building on a ResNet34 backbone and integrating Global Context Blocks (GCBs), SOCA-PRNet captures detailed human poses precisely. Empirical evaluations demonstrate that our model outperforms existing state-of-the-art approaches, achieving a head-normalized Percentage of Correct Keypoints at the 0.5 (50%) threshold (PCKh@0.5) of 90.877 and a Mean Precision (Mean@0.1) score of 41.137. These results underscore the potential of SOCA-PRNet in real-world applications such as robotics, gaming, and human-computer interaction, where precise and efficient 2D HPE is paramount.
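For readers unfamiliar with the PCKh metric reported above, the following is a minimal sketch of how a head-normalized Percentage of Correct Keypoints is commonly computed. It is not the authors' evaluation code. The function name `pckh`, the argument layout, the use of a single head-size value per person, and the inclusive `<=` comparison are all assumptions made for illustration; the abstract does not specify these details.

```python
import math


def pckh(pred, gt, head_sizes, alpha=0.5):
    """Return the percentage of keypoints within alpha * head size of ground truth.

    pred, gt   : one list of (x, y) joints per person (hypothetical layout)
    head_sizes : one head-segment length per person, in pixels (assumed given)
    alpha      : fraction of the head size used as the distance threshold
    """
    correct = 0
    total = 0
    for pred_joints, gt_joints, head in zip(pred, gt, head_sizes):
        for (px, py), (gx, gy) in zip(pred_joints, gt_joints):
            # Euclidean distance between predicted and ground-truth joint
            dist = math.hypot(px - gx, py - gy)
            # Assumption: a joint exactly on the threshold counts as correct
            if dist <= alpha * head:
                correct += 1
            total += 1
    return 100.0 * correct / total if total else 0.0


# Toy data: one person, head size 10 px, joint errors of 0, 3, 5, and 8 px
gt = [[(0, 0), (0, 0), (0, 0), (0, 0)]]
pred = [[(0, 0), (3, 0), (0, 5), (8, 0)]]
score_050 = pckh(pred, gt, head_sizes=[10], alpha=0.5)  # threshold is 5 px
score_010 = pckh(pred, gt, head_sizes=[10], alpha=0.1)  # threshold is 1 px
```

With this toy data, three of the four joints fall within the 5 px threshold at `alpha=0.5`, while only the exact match survives the 1 px threshold at `alpha=0.1`. A tighter threshold always gives a score no higher than a looser one, which is consistent with the Mean@0.1 figure in the abstract being much lower than the PCKh@0.5 figure, assuming Mean@0.1 denotes the same kind of measurement at a 0.1 threshold.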

Keywords: 2D human pose estimation; CV; Global Context Blocks; SOCA-PRNet.

MeSH terms

Humans
Lighting*
Motion
Robotics*

Grants and funding

This research received no external funding.