A Larger Stepsize Improves Gradient Descent in Classification Problems

Abstract Number:

3762 

Submission Type:

Contributed Abstract 

Contributed Abstract Type:

Paper 

Participants:

Jingfeng Wu (1), Matus Telgarsky (2), Bin Yu (1), Peter Bartlett (3)

Institutions:

(1) University of California at Berkeley, N/A, (2) New York University, N/A, (3) University of California at Berkeley, N/A

Co-Author(s):

Matus Telgarsky  
New York University
Bin Yu  
University of California at Berkeley
Peter Bartlett  
University of California at Berkeley

First Author:

Jingfeng Wu  
University of California at Berkeley

Presenting Author:

Jingfeng Wu  
N/A

Abstract Text:

Gradient Descent (GD) and Stochastic Gradient Descent (SGD) are pivotal in machine learning, particularly in neural network optimization. Conventional wisdom favors smaller stepsizes for stability, yet in practice larger stepsizes often yield faster convergence and better generalization, despite initial instability. This talk examines the dynamics of GD for logistic regression with linearly separable data in the regime where the stepsize η is constant but large, so that the loss initially oscillates. We show that GD exits this initial oscillatory phase rapidly, within O(η) iterations, and subsequently achieves a risk of Õ(1 / (t η)). This analysis reveals that, without momentum techniques or variable stepsize schedules, GD can attain an accelerated error rate of Õ(1/T^2) after T iterations with a stepsize of η = Θ(T). In contrast, if the stepsize is small enough that the loss does not oscillate, we prove an Ω(1/T) lower bound. Our results further extend to general classification loss functions, nonlinear models in the neural tangent kernel regime, and SGD with large stepsizes, and are validated with experiments on neural networks.
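
For illustration only (not part of the submitted abstract or its experiments), the following minimal Python sketch runs full-batch GD on logistic regression with linearly separable synthetic data and a deliberately large constant stepsize; the data, dimensions, and stepsize value are illustrative assumptions, chosen simply to exhibit the behavior the abstract describes.

import numpy as np

# Synthetic linearly separable data (illustrative; not the paper's setup).
rng = np.random.default_rng(0)
n, d = 32, 5
X = rng.normal(size=(n, d))
w_star = rng.normal(size=d)
y = np.sign(X @ w_star)            # labels in {-1, +1}

def logistic_loss(w):
    # mean log(1 + exp(-y x^T w)), computed stably via logaddexp
    return np.mean(np.logaddexp(0.0, -y * (X @ w)))

def gradient(w):
    # gradient of the mean logistic loss; clip margins to avoid overflow
    margins = np.clip(y * (X @ w), -500, 500)
    p = 1.0 / (1.0 + np.exp(margins))
    return -(X.T @ (p * y)) / n

eta = 50.0                          # large constant stepsize (illustrative value)
w = np.zeros(d)
for t in range(1, 201):
    w -= eta * gradient(w)
    if t % 20 == 0:
        print(f"iter {t:4d}  loss {logistic_loss(w):.6f}")

# Expected qualitative behavior: the loss may oscillate during the first
# iterations, then decrease rapidly once GD leaves the unstable phase,
# consistent with the regime studied in the abstract.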

Keywords:

logistic regression|gradient descent|optimization|neural network|acceleration|edge of stability

Sponsors:

IMS

Tracks:

Foundations of Machine Learning

Can this be considered for alternate subtype?

Yes

Are you interested in volunteering to serve as a session chair?

Yes

I have read and understand that JSM participants must abide by the Participant Guidelines.

Yes

I understand that JSM participants must register and pay the appropriate registration fee by June 1, 2024. The registration fee is non-refundable.

I understand