Abeuov, Nurmukhammed and Absatov, Daniyar and Mutaliyev, Yelnur and Serek, Azamat (2025) Accurate Crowd Counting Using an Enhanced LCDANet with Multi-Scale Attention Modules. Buletin Ilmiah Sarjana Teknik Elektro, 7 (3). pp. 657-667.
14391-Article Text-65260-1-10-20251016.pdf - Published Version
Abstract
Accurate crowd counting remains a challenging task due to occlusion, scale variation, and complex scene layouts. This study proposes ME-LCDANet, an enhanced deep learning framework built upon the LCDANet backbone, integrating multi-scale feature extraction via Micro Atrous Spatial Pyramid Pooling (MicroASPP) and attention refinement using CBAMLite modules. A preprocessing pipeline with Gaussian-based density maps, synchronized augmentations, and a dual-objective loss function combining density and count supervision supports effective training and generalization. Experimental evaluation on the ShanghaiTech Part B dataset demonstrates a Mean Absolute Error (MAE) of 11.50 (95% CI: 10.20–12.91) and a Root Mean Squared Error (RMSE) of 11.54 (95% CI: 10.26–12.99). Training dynamics indicate steadily declining loss and reduced validation MAE, while gradient norm analysis suggests reliable convergence. Comparative results show that, although CSRNet and SaNet achieve slightly lower MAE, ME-LCDANet attains a notably reduced RMSE, reflecting robustness against large prediction deviations. While the study focuses on a single benchmark dataset, the proposed architecture offers a promising approach for robust crowd counting in diverse scenarios.
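The abstract names three standard ingredients without spelling them out: Gaussian-based density maps, a loss that combines density and count supervision, and the MAE/RMSE metrics. The following is a minimal NumPy sketch of what these typically look like, not the authors' implementation. The function names, the fixed `sigma`, the use of pixel-wise MSE for the density term, and the `count_weight` value are all assumptions made for illustration; the paper's actual kernel and loss weighting are not given in this record.

```python
import numpy as np

def gaussian_density_map(points, shape, sigma=4.0):
    """Build a density map by placing one unit-mass Gaussian per head annotation.

    points: iterable of (row, col) positions; sigma is an assumed fixed kernel width.
    """
    h, w = shape
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    density = np.zeros(shape, dtype=np.float64)
    for r, c in points:
        kernel = np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma ** 2))
        # Normalise so each person contributes exactly 1 to the map's sum.
        density += kernel / kernel.sum()
    return density

def dual_objective_loss(pred, target, count_weight=0.01):
    """Pixel-wise MSE on the density map plus a weighted absolute count error.

    count_weight is a hypothetical balancing factor, not a value from the paper.
    """
    density_term = np.mean((pred - target) ** 2)
    count_term = abs(pred.sum() - target.sum())
    return density_term + count_weight * count_term

def mae_rmse(pred_counts, true_counts):
    """Mean Absolute Error and Root Mean Squared Error over per-image counts."""
    err = np.asarray(pred_counts, dtype=float) - np.asarray(true_counts, dtype=float)
    return np.mean(np.abs(err)), np.sqrt(np.mean(err ** 2))
```

Because each kernel is normalised, the sum of a density map equals the number of annotated people, which is what allows the count term to be computed directly from the predicted map. RMSE squares each error before averaging, so it penalises large per-image deviations more heavily than MAE and is never smaller than it.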
| Item Type: | Article |
|---|---|
| Subjects: | T Technology > TK Electrical engineering. Electronics. Nuclear engineering |
| Depositing User: | BISTE UAD |
| Date Deposited: | 16 May 2026 16:47 |
| Last Modified: | 16 May 2026 16:47 |
| URI: | https://alxiv.org/id/eprint/868 |
