AdaMoment: A unified adaptive-momentum framework for robust learning rate optimization

dc.authorid: 0000-0002-2073-8956
dc.contributor.author: Dagal, Idriss
dc.date.accessioned: 2026-01-31T15:08:20Z
dc.date.available: 2026-01-31T15:08:20Z
dc.date.issued: 2026
dc.department: İstanbul Beykent Üniversitesi
dc.description.abstract: The learning rate remains one of the most critical yet poorly understood hyperparameters in machine learning optimization, where adaptive methods (e.g., Adam) and momentum-based techniques (e.g., Nesterov acceleration) often suffer from inconsistent convergence, hyperparameter sensitivity, and limited robustness. To address these gaps, we propose AdaMoment, a unified framework integrating adaptive learning rate scaling, Nesterov momentum, and gradient smoothing, featuring (1) dynamic decay mechanisms for stable adaptation, (2) lookahead gradient updates to reduce oscillation, and (3) bias-corrected dual-moment tracking for noise robustness. Experiments across convex and non-convex benchmarks demonstrate AdaMoment's superiority over SGD, Adam, and RMSProp, achieving 27% faster convergence, 18-24% lower final loss, 15-20% reductions in MAE/RMSE, and strong generalization (R² > 0.95). Theoretically, we provide non-convex convergence guarantees, bridging adaptive and momentum-based methods; practically, our framework reduces manual tuning while scaling across architectures. This work advances robust, automated learning rate selection and elucidates the adaptation-momentum interplay in optimization.
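The abstract names the algorithmic ingredients (Nesterov lookahead, dynamic decay, bias-corrected dual-moment tracking) without giving the update rule, and the article itself is closed access. The Python sketch below is therefore a minimal, hypothetical assembly of those ingredients in the style of Adam plus a Nesterov-type lookahead; the function name adamoment_step, its signature, and the 1/(1 + decay*t) schedule are illustrative assumptions, not the authors' published method.

import numpy as np

def adamoment_step(theta, grad_fn, state, lr=0.1, beta1=0.9, beta2=0.999,
                   decay=1e-4, eps=1e-8):
    # One step of a hypothetical AdaMoment-style update: Adam-style
    # bias-corrected dual moments, a Nesterov lookahead gradient, and a
    # smooth 1/(1 + decay*t) step-size decay. Illustrative only.
    t = state["t"] + 1
    # Lookahead: evaluate the gradient at the momentum-extrapolated point,
    # which is what damps oscillation in Nesterov-type methods.
    g = grad_fn(theta - lr * beta1 * state["m"])
    m = beta1 * state["m"] + (1 - beta1) * g        # first moment
    v = beta2 * state["v"] + (1 - beta2) * g ** 2   # second moment
    m_hat = m / (1 - beta1 ** t)                    # bias correction
    v_hat = v / (1 - beta2 ** t)
    step = lr / (1 + decay * t)                     # dynamic decay
    theta = theta - step * m_hat / (np.sqrt(v_hat) + eps)
    return theta, {"m": m, "v": v, "t": t}

# Toy usage: minimize f(x) = ||x||^2, whose gradient is 2x.
state = {"m": np.zeros(2), "v": np.zeros(2), "t": 0}
theta = np.array([3.0, -2.0])
for _ in range(300):
    theta, state = adamoment_step(theta, lambda x: 2 * x, state)
print(theta)  # both coordinates shrink toward 0

Taking the gradient at the extrapolated point is the oscillation-reducing lookahead the abstract refers to, while the Adam-style denominator supplies the per-parameter scaling it calls adaptive learning rate scaling.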
dc.identifier.doi: 10.1016/j.knosys.2025.114739
dc.identifier.issn: 0950-7051
dc.identifier.issn: 1872-7409
dc.identifier.scopus: 2-s2.0-105022263724
dc.identifier.scopusquality: Q1
dc.identifier.uri: https://doi.org/10.1016/j.knosys.2025.114739
dc.identifier.uri: https://hdl.handle.net/20.500.12662/10658
dc.identifier.volume: 332
dc.identifier.wos: WOS:001627929100001
dc.identifier.wosquality: Q1
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.language.iso: en
dc.publisher: Elsevier
dc.relation.ispartof: Knowledge-Based Systems
dc.relation.publicationcategory: Article - International Peer-Reviewed Journal - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/closedAccess
dc.snmz: KA_WoS_20260128
dc.subject: Adaptive optimization
dc.subject: Learning rate scheduling
dc.subject: Nesterov momentum
dc.subject: Non-convex optimization
dc.subject: Hybrid optimizer
dc.subject: Gradient noise robustness
dc.title: AdaMoment: A unified adaptive-momentum framework for robust learning rate optimization
dc.type: Article
