Mangkunegara, Iis Setiawan and Purwono, Purwono and Ma’arif, Alfian and Basil, Noorulden and Marhoon, Hamzah M. and Sharkawy, Abdel-Nasser (2025) Transformer Models in Deep Learning: Foundations, Advances, Challenges and Future Directions. Buletin Ilmiah Sarjana Teknik Elektro, 7 (2). pp. 231-241.
13053-Article Text-58269-1-10-20250624.pdf - Published Version
Abstract
Transformer models have significantly advanced deep learning by introducing parallel processing and enabling the modeling of long-range dependencies. Despite their performance gains, their high computational and memory demands hinder deployment in resource-constrained environments such as edge devices or real-time systems. This review analyzes and compares Transformer architectures by categorizing them into encoder-only, decoder-only, and encoder-decoder variants and examining their applications in natural language processing (NLP), computer vision (CV), and multimodal tasks. Representative models (BERT, GPT, T5, ViT, and MobileViT) are selected based on architectural diversity and relevance across domains. Core components, including self-attention mechanisms, positional encoding schemes, and feed-forward networks, are dissected using a systematic review methodology, supported by a visual framework to improve clarity and reproducibility. Performance comparisons are discussed using standard evaluation metrics such as accuracy, F1-score, and Intersection over Union (IoU), with particular attention to trade-offs between computational cost and model effectiveness. Lightweight models such as DistilBERT and MobileViT are analyzed for their deployment feasibility. Major challenges, including quadratic attention complexity, hardware constraints, and limited generalization, are explored alongside solutions such as sparse attention mechanisms, model distillation, and hardware accelerators. Additionally, ethical aspects including fairness, interpretability, and sustainability are critically reviewed in relation to Transformer adoption across sensitive domains. This study offers a domain-spanning overview and proposes practical directions for future research aimed at building scalable, efficient, and ethically aligned Transformer-based systems suited for mobile, embedded, and healthcare applications.
| Item Type: | Article |
|---|---|
| Subjects: | T Technology > TK Electrical engineering. Electronics Nuclear engineering |
| Depositing User: | Alfian Ma'arif |
| Date Deposited: | 08 Apr 2026 08:48 |
| Last Modified: | 08 Apr 2026 08:48 |
| URI: | https://alxiv.org/id/eprint/7 |
