Generative AI
I. The Hidden Mastermind Behind GPT-5 Training: Hired by OpenAI Based on a Muon Blog Post
- Researcher Keller Jordan was hired by OpenAI on the strength of a single blog post about the Muon optimizer, which may be used to train GPT-5.
- Muon is an optimizer for the hidden layers of neural networks. It uses Newton-Schulz iteration to orthogonalize the update matrix, and it is reported to train faster than AdamW.
- Keller criticized the optimizer research literature as full of methods that were never adopted, and argued that new methods should prove their effectiveness in competitive training tasks.
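
To make the orthogonalization step above concrete, here is a minimal sketch of Newton-Schulz iteration in plain Python. It is an illustration under stated assumptions, not Muon's actual code: it uses the textbook cubic iteration X <- 1.5*X - 0.5*X*X^T*X, whereas Muon's own polynomial and coefficients may differ. The function name, the step count, and the 2x2 example matrix are all hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def newton_schulz_orthogonalize(g, steps=15):
    """Approximate the nearest (semi-)orthogonal matrix to g.

    Assumption: textbook cubic Newton-Schulz iteration, not Muon's tuned variant.
    """
    # Scale by the Frobenius norm so every singular value is at most 1,
    # which keeps the iteration inside its convergence region.
    norm = math.sqrt(sum(v * v for row in g for v in row))
    x = [[v / norm for v in row] for row in g]
    for _ in range(steps):
        # Each step pushes every singular value of x toward 1.
        xxt_x = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * a - 0.5 * b for a, b in zip(ra, rb)] for ra, rb in zip(x, xxt_x)]
    return x


update = [[3.0, 1.0], [-1.0, 2.0]]  # hypothetical update matrix
q = newton_schulz_orthogonalize(update)
qtq = matmul(transpose(q), q)  # should be close to the identity matrix
```

The iteration needs only matrix multiplications, so it avoids an explicit singular value decomposition. In an optimizer, the orthogonalized matrix would stand in for the raw update before the learning rate is applied.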