AdaMoment: A unified adaptive-momentum framework for robust learning rate optimization
| DC Field | Value |
| --- | --- |
| dc.authorid | 0000-0002-2073-8956 |
| dc.contributor.author | Dagal, Idriss |
| dc.date.accessioned | 2026-01-31T15:08:20Z |
| dc.date.available | 2026-01-31T15:08:20Z |
| dc.date.issued | 2026 |
| dc.department | İstanbul Beykent Üniversitesi |
| dc.description.abstract | The learning rate remains one of the most critical yet poorly understood hyperparameters in machine learning optimization, where adaptive methods (e.g., Adam) and momentum-based techniques (e.g., Nesterov acceleration) often suffer from inconsistent convergence, hyperparameter sensitivity, and limited robustness. To address these gaps, we propose AdaMoment, a unified framework integrating adaptive learning rate scaling, Nesterov momentum, and gradient smoothing, featuring (1) dynamic decay mechanisms for stable adaptation, (2) lookahead gradient updates to reduce oscillation, and (3) bias-corrected dual-moment tracking for noise robustness. Experiments across convex and non-convex benchmarks demonstrate AdaMoment's superiority over SGD, Adam, and RMSProp, achieving 27% faster convergence, 18-24% lower final loss, 15-20% reductions in MAE/RMSE, and strong generalization (R² > 0.95). Theoretically, we provide non-convex convergence guarantees, bridging adaptive and momentum-based methods; practically, our framework reduces manual tuning while scaling across architectures. This work advances robust, automated learning rate selection and elucidates the adaptation-momentum interplay in optimization. |
| dc.identifier.doi | 10.1016/j.knosys.2025.114739 |
| dc.identifier.issn | 0950-7051 |
| dc.identifier.issn | 1872-7409 |
| dc.identifier.scopus | 2-s2.0-105022263724 |
| dc.identifier.scopusquality | Q1 |
| dc.identifier.uri | https://doi.org/10.1016/j.knosys.2025.114739 |
| dc.identifier.uri | https://hdl.handle.net/20.500.12662/10658 |
| dc.identifier.volume | 332 |
| dc.identifier.wos | WOS:001627929100001 |
| dc.identifier.wosquality | Q1 |
| dc.indekslendigikaynak | Web of Science |
| dc.indekslendigikaynak | Scopus |
| dc.language.iso | en |
| dc.publisher | Elsevier |
| dc.relation.ispartof | Knowledge-Based Systems |
| dc.relation.publicationcategory | Article - International Peer-Reviewed Journal - Institutional Faculty Member |
| dc.rights | info:eu-repo/semantics/closedAccess |
| dc.snmz | KA_WoS_20260128 |
| dc.subject | Adaptive optimization |
| dc.subject | Learning rate scheduling |
| dc.subject | Nesterov momentum |
| dc.subject | Non-convex optimization |
| dc.subject | Hybrid optimizer |
| dc.subject | Gradient noise robustness |
| dc.title | AdaMoment: A unified adaptive-momentum framework for robust learning rate optimization |
| dc.type | Article |
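The abstract above describes the optimizer's ingredients only at a high level, and the record is closed-access, so the following is a minimal illustrative sketch rather than the paper's algorithm: an Adam-style update combining a Nesterov lookahead gradient, bias-corrected dual-moment tracking, and a simple dynamic decay schedule. The function name `adamoment_step`, the `decay` hyperparameter, the 1/(1 + decay·t) schedule, and all default values are assumptions for illustration.

```python
import numpy as np

def adamoment_step(theta, grad_fn, m, v, t, lr=1e-3,
                   beta1=0.9, beta2=0.999, decay=1e-4, eps=1e-8):
    """One illustrative AdaMoment-style step (an assumption, not the paper's exact rule)."""
    # Nesterov-style lookahead: evaluate the gradient at a momentum-shifted point.
    g = grad_fn(theta - lr * beta1 * m)
    # Gradient smoothing / dual-moment tracking: EMAs of the gradient and its square.
    m = beta1 * m + (1.0 - beta1) * g
    v = beta2 * v + (1.0 - beta2) * g * g
    # Bias correction for the zero-initialized moment estimates (as in Adam).
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    # Dynamic decay: a simple 1/(1 + decay * t) schedule stands in for the
    # paper's unspecified decay mechanism.
    lr_t = lr / (1.0 + decay * t)
    # Adaptive per-coordinate update.
    return theta - lr_t * m_hat / (np.sqrt(v_hat) + eps), m, v

# Toy check on a convex quadratic f(x) = 0.5 * ||x||^2, whose gradient is x.
theta = np.ones(3)
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 2001):
    theta, m, v = adamoment_step(theta, lambda x: x, m, v, t)
print(theta)  # approaches the minimizer at the origin
```

In this form the sketch essentially mirrors NAdam plus a time-based learning rate decay; the paper's actual decay and smoothing mechanisms may differ.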