AI Dev News | Machine Learning Engineering

AI Dev News covers applied AI engineering, LLM integration, and practical ML operations.

ONNX Runtime optimization levels: which fusions fire where

19 mins read

AI/ML

May 22, 2026 | Kwesi Mensah | Tagged: Basic, Basic vs Extended, Extended, ONNX, onnx runtime optimization levels, Runtime

Understand ONNX Runtime optimization levels, their main trade-offs, and the practical checks to run before relying on them.

JAX Gradient Checkpointing on TPU v5e: 40% Memory Cut at 12% Speed Cost

12 mins read

Cloud Computing

April 12, 2026, updated May 22, 2026 | Jia Li Song | Tagged: Anthropic News, Cohere News, Hugging Face News, JAX News, Keras News, Mistral AI News, OpenAI News, PyTorch News, Stability AI News, TensorFlow News

In this article: How does JAX gradient checkpointing reduce memory on TPU v5e? What is the checkpoint policy that drives the 40% memory saving?

DeepSpeed Ulysses: A Breakthrough in Training Extreme Long-Sequence AI Models

14 mins read

AI/ML

August 6, 2025, updated December 27, 2025 | Priya Sharma | Tagged: DeepSpeed News

Introduction: Breaking the Sequence Length Barrier in Transformer Models. The world of artificial intelligence is in a constant race to build larger, more...