Cloud Computing
Milvus vs Chroma self-hosted: filter selectivity, not scale
At first glance, the Milvus-vs-Chroma choice is a scale question: Chroma for small projects, Milvus, but under 10M vectors the deciding factor is not.
Replicate vs Modal for image-generation APIs: per-second billing, autoscaling, cold-start
By Mateo Santiago. If you are choosing between Replicate and Modal to serve FLUX, SDXL, or a fine-tuned diffusion model, the honest answer is that they are.
JAX 0.5.1 Flips PjRt Default on TPU v5p: Compile Time Down 28%
Dated: February 5, 2026 (jax 0.5.1). Contents: Why the PjRt migration matters on TPU v5p. How should you measure the compile-time delta?
MLflow 2.20.1 Fixed the S3 Artifact Upload EndpointConnectionError in AWS GovCloud
If you run MLflow on AWS GovCloud and you saw botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL.
Migrating from W&B to MLflow 2.15: Savings, Gaps, and Hidden Costs
In this article: What does migrating from W&B to MLflow 2.15 actually cost? How do you actually rewrite the training loop?
JAX Gradient Checkpointing on TPU v5e: 40% Memory Cut at 12% Speed Cost
In this article: How does JAX gradient checkpointing reduce memory on TPU v5e? What is the checkpoint policy that drives the 40% memory saving?
Mistral-7B-v0.3 QLoRA on Modal A100-40GB: nf4 + bf16_compute Beat My RunPod H100 Spot Cost Per Step
TL;DR: For a Mistral-7B-v0.3 QLoRA fine-tune at sequence length 2048 and micro-batch 4, a Modal A100-40GB container running bitsandbytes nf4 with bfloat16.
Massive AI Models Are Failing. Small Fast.ai Builds Win.
I was staring at my AWS bill last Tuesday, trying to figure out how a simple image classification microservice managed to rack up $840 in three weeks.
Meta’s $100B AMD Pact Actually Fixes PyTorch’s Biggest Headache
The Monopoly Tax is Getting Old: I spent three hours yesterday trying to provision a single H100 instance on AWS. Three hours. For one node.
Ditching Heavy Transformers for Static Embeddings
Well, I have to admit, I actually stumbled upon this solution by accident. There I was, staring at our AWS bill at 2am last Tuesday, trying to figure out.
