Machine Learning - AI Dev News | Machine Learning Engineering

Mistral-7B-v0.3 QLoRA on Modal A100-40GB: nf4 + bf16_compute Beat My RunPod H100 Spot Cost Per Step

11 mins read

Cloud Computing

TL;DR: For a Mistral-7B-v0.3 QLoRA fine-tune at sequence length 2048 and micro-batch 4, a Modal A100-40GB container running bitsandbytes nf4 with bfloat16.

torch.compile in PyTorch 2.5: Where the Speedup Comes From and Where It Disappears

10 mins read

Deep Learning

April 8, 2026 | April 19, 2026 | Mateo Santiago | Tagged Inductor, ML training, PyTorch, torch.compile, TorchDynamo

PyTorch 2.5 made torch.compile good enough that you can drop it into a real training script and expect a speedup most of the time.

How to Automate Hyperparameter Tuning in PyTorch With Optuna

16 mins read

Automation

April 6, 2026 | Elara Vance | Tagged IBM Watson News

I still remember the early days of my machine learning career, sitting in front of a terminal at 2 AM, manually tweaking learning rates, batch sizes, and.

How to Convert PyTorch Models to ONNX Format for Faster Inference

15 mins read

DevOps

April 6, 2026 | Li Mei Fong | Tagged Qdrant News

I remember the first time I deployed a PyTorch model to production. I wrapped a beautifully trained ResNet model in a Flask API, spun up a Docker.

The Stable Kaggle CLI Fixes My Biggest Authentication Headache

6 mins read

Automation

April 3, 2026 | April 19, 2026 | Kwesi Mensah | Tagged Kaggle News

I was staring at my terminal at 11pm last Tuesday, watching a GitHub Actions runner fail for the third time. The error was always the same.

Massive AI Models Are Failing. Small Fast.ai Builds Win.

5 mins read

Cloud Computing

April 2, 2026 | April 19, 2026 | Mateo Solano | Tagged Fast.ai News

I was staring at my AWS bill last Tuesday, trying to figure out how a simple image classification microservice managed to rack up $840 in three weeks.

Meta’s $100B AMD Pact Actually Fixes PyTorch’s Biggest Headache

4 mins read

AI/ML

April 1, 2026 | April 19, 2026 | Elara Vance | Tagged Meta AI News

The Monopoly Tax Is Getting Old: I spent three hours yesterday trying to provision a single H100 instance on AWS. Three hours. For one node.

TensorRT Just Fixed Local Image Generation

2 mins read

AI/ML

April 1, 2026 | April 26, 2026 | Elara Vance | Tagged TensorRT News

Running modern, heavy diffusion models locally has felt like trying to stuff a mattress into a compact car for months now.

Hiding Android Malware in Hugging Face Repos

8 mins read

Android Development

March 13, 2026 | April 19, 2026 | aidev_news_com | Tagged Hugging Face News

I spent my entire Tuesday morning cleaning up a mess because a junior developer treated Hugging Face like a trusted package manager. It isn’t.

Ditching Heavy Transformers for Static Embeddings

8 mins read

Cloud Computing

March 12, 2026 | April 19, 2026 | aidev_news_com | Tagged Sentence Transformers News

Well, I have to admit, I actually stumbled upon this solution by accident. There I was, staring at our AWS bill at 2am last Tuesday, trying to figure out.