RAD Collaboratory SURF
New Optimization Approaches to Large Language Model Pretraining
Project Summary
We will explore several new optimization approaches to accelerate the pretraining of large language models (LLMs). Candidate approaches include second-order optimization and gradient orthogonalization. The student will examine the performance of these optimizers on GPT model families. We will also design new approaches to lower the communication cost of these optimizers.
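As one illustration of the kind of technique mentioned above, gradient orthogonalization replaces a gradient matrix with a nearby (semi-)orthogonal matrix before the update. A minimal sketch of one known scheme for this, the Newton-Schulz iteration, is below; the function name and step count are illustrative assumptions, not the project's actual method.

```python
import numpy as np

def orthogonalize(grad: np.ndarray, steps: int = 20) -> np.ndarray:
    """Approximate the nearest semi-orthogonal matrix to `grad` via the
    Newton-Schulz iteration. This is one standard orthogonalization
    scheme; the project may use a different or tuned variant."""
    # Normalizing by the Frobenius norm bounds the spectral norm by 1,
    # which keeps all singular values inside the iteration's convergence region.
    X = grad / (np.linalg.norm(grad) + 1e-12)
    for _ in range(steps):
        # Each step pushes every singular value of X toward 1,
        # leaving the singular vectors unchanged.
        X = 1.5 * X - 0.5 * X @ X.T @ X
    return X

rng = np.random.default_rng(0)
G = rng.standard_normal((4, 3))   # stand-in for a weight-matrix gradient
O = orthogonalize(G)
print(np.round(O.T @ O, 3))       # columns are now approximately orthonormal
```

Because the iteration uses only matrix multiplications, it maps well onto accelerators, which is one reason orthogonalization-based optimizers are attractive for LLM pretraining.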
