
Generative AI

I. The Hidden Mastermind Behind GPT-5 Training: Hired by OpenAI Based on a Muon Blog Post

  1. Researcher Keller Jordan was hired by OpenAI on the strength of a blog post about the Muon optimizer, which may be used to train GPT-5.
  2. Muon is an optimizer for the hidden layers of neural networks. It uses Newton-Schulz iteration to approximately orthogonalize each update matrix, which reportedly yields faster training than AdamW.
  3. Keller criticized the optimizer research literature as full of methods that were never adopted, and argued that new methods should prove their effectiveness on competitive training tasks.
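
To make the orthogonalization step in point 2 concrete, here is a minimal NumPy sketch of a Newton-Schulz iteration. It uses the textbook cubic form of the iteration, which is an assumption for illustration and not necessarily the variant or coefficients Muon uses. The function name `newton_schulz_orthogonalize` and the `steps` parameter are hypothetical, and the sketch assumes a full-rank matrix with at least as many rows as columns.

```python
import numpy as np


def newton_schulz_orthogonalize(update, steps=30):
    """Approximate the nearest semi-orthogonal matrix to `update`.

    Illustrative sketch only: this is the classic cubic Newton-Schulz
    iteration X <- 1.5 * X - 0.5 * X @ X.T @ X, not necessarily the
    variant used by Muon. Assumes `update` has full column rank.
    """
    x = np.asarray(update, dtype=np.float64)
    # Scale by the Frobenius norm so every singular value is at most 1,
    # which keeps the iteration inside its convergence region.
    x = x / (np.linalg.norm(x) + 1e-12)
    for _ in range(steps):
        # Each step pushes every singular value of x toward 1 while
        # leaving the singular vectors unchanged.
        x = 1.5 * x - 0.5 * (x @ x.T @ x)
    return x


if __name__ == "__main__":
    # Hypothetical 4x3 "update matrix" standing in for a gradient or
    # momentum buffer of one hidden layer.
    g = np.array([[2.0, 0.0, 1.0],
                  [0.0, 1.0, 0.0],
                  [1.0, 0.0, 3.0],
                  [0.0, 2.0, 1.0]])
    q = newton_schulz_orthogonalize(g)
    # Columns of q should be close to orthonormal.
    print(np.round(q.T @ q, 6))
```

The appeal of this kind of iteration is that it needs only matrix multiplications, so it avoids computing an explicit singular value decomposition for every update.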
