MLOps
Migrating from W&B to MLflow 2.15: Savings, Gaps, and Hidden Costs
In this article: What does migrating from W&B to MLflow 2.15 actually cost? And how do you rewrite the training loop?
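For the training-loop question, the swap is mostly one-for-one. Here is a minimal sketch of the logging migration, assuming per-step metrics were previously logged with wandb.log; the experiment name, hyperparameters, and training stubs are placeholders, not code from the article:

```python
import mlflow

def train_step(batch):           # stand-in for the real forward/backward pass
    return 0.42                  # pretend loss value

loader = range(10)               # stand-in for the DataLoader

mlflow.set_experiment("wandb-migration-demo")          # replaces wandb.init(project=...)
with mlflow.start_run():
    mlflow.log_params({"lr": 3e-4, "batch_size": 32})  # replaces wandb.config
    for step, batch in enumerate(loader):
        loss = train_step(batch)
        # replaces wandb.log({"train/loss": loss}, step=step)
        mlflow.log_metric("train/loss", loss, step=step)
```

Passing the explicit step argument is what keeps step-aligned charts intact after the move.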
Mistral-7B-v0.3 QLoRA on Modal A100-40GB: nf4 + bf16_compute Beat My RunPod H100 Spot Cost Per Step
TL;DR: For a Mistral-7B-v0.3 QLoRA fine-tune at sequence length 2048 and micro-batch 4, a Modal A100-40GB container running bitsandbytes nf4 with bfloat16 compute beat my RunPod H100 spot instance on cost per step.
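The recipe the TL;DR names maps onto a standard transformers + bitsandbytes setup. A minimal sketch, assuming the Hugging Face BitsAndBytesConfig path; double quantization is my assumption, not something the article states:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # 4-bit NormalFloat weights
    bnb_4bit_compute_dtype=torch.bfloat16,  # bf16 compute for the matmuls
    bnb_4bit_use_double_quant=True,         # assumption: not stated in the piece
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.3",
    quantization_config=bnb_config,
    device_map="auto",  # a 7B model in nf4 fits comfortably on an A100-40GB
)
```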
vLLM 0.6 Continuous Batching Cut My Llama 3 Latency in Half
Upgrading a Llama 3 8B endpoint from vLLM 0.5.4 to 0.6.x is the rare dependency bump where the numbers on the dashboard actually move.
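Part of why the bump is painless: continuous batching lives in vLLM's scheduler, so client code barely changes between 0.5.x and 0.6.x. A minimal offline sketch, with the model id and sampling settings as my assumptions:

```python
from vllm import LLM, SamplingParams

# Continuous batching happens inside vLLM's scheduler; no client-side opt-in needed.
llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct")
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Summarize continuous batching in one sentence."], params)
print(outputs[0].outputs[0].text)
```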
SageMaker HyperPod Finally Fixed the Checkpoint Bottleneck
I lost three days of Llama-3 fine-tuning last November because a single EC2 node decided to panic. The cluster halted.
Dropping my local tracking server for Comet’s new free tier
The 2 AM breaking point: there I was, staring at my terminal at 1:30 AM on a Thursday, watching my training loop crash for the fourth time.
Production AI Is Hell: My Love-Hate Relationship With Triton
I was staring at a Grafana dashboard at 11:30 PM on a Tuesday when I finally admitted defeat.
Optuna’s New Rust Storage Backend Is Absurdly Fast
I spent three hours last Tuesday staring at a progress bar that simply refused to move. You know the feeling.
OpenAI Weights on SageMaker: Hell Froze Over
Honestly, I had to check the URL three times. Then I checked the SSL certificate. Then I texted a buddy at Amazon to ask if their marketing team had gone…
Azure ML Security: It’s Not Magic, It’s Just Someone Else’s Computer
I had a conversation last week with a Data Science lead that nearly made me choke on my coffee. We were reviewing their infrastructure, and when I pointed…
Taming the LLM Chaos: My Real-World MLflow Setup
I still remember the exact moment I realized my “custom” MLOps setup was a disaster waiting to happen. It was 2:00 AM on a Tuesday, and I was trying to…
