Abstract
The performance of optimizers, particularly in deep learning, depends
considerably on their chosen hyperparameter configuration. The efficacy of
optimizers is often studied under near-optimal problem-specific
hyperparameters, and finding these settings may be prohibitively costly for
practitioners. In this work, we argue that a fair assessment of optimizers'
performance must take the computational cost of hyperparameter tuning into
account, i.e., how easy it is to find good hyperparameter configurations using
an automatic hyperparameter search. Evaluating a variety of optimizers on an
extensive set of standard datasets and architectures, we find that Adam is the
most practical choice, particularly in low-budget scenarios.