Explore real-world engineering experiences from top tech companies.
This post describes a 24-hour speedrun for training a text-to-image diffusion model using 32 H200 GPUs and a ~$1500 compute budget.
Pinterest investigates the online–offline discrepancy in L1 conversion rate (CVR) models in its ads funnel.
TabPFN, by Prior Labs, applies the pre-trained LLM paradigm to tabular data, removing the need for traditional ML preprocessing and per-task training.
Meta open-sources RCCLX, an enhanced GPU communication library for AMD platforms that significantly improves AI training and inference performance.
Airbnb recaps its 2025 academic research at KDD, CIKM, and EMNLP covering ML, NLP, and recommendation systems.
Netflix introduces MediaFM, an in-house tri-modal (audio, video, text) foundation model for deep media content understanding at scale.
This post explains how to fine-tune small LLMs for free using Unsloth and Hugging Face Jobs, with support for coding agents like Claude Code and Codex.
Amazon SageMaker Inference now offers generally available (GA) support for deploying custom Amazon Nova models for production-grade inference.
Pinterest introduces a GPU-served two-tower model using an MMOE-DCN architecture for lightweight ads engagement prediction.
This post introduces an agent skill that enables coding agents (Claude and Codex) to write production-ready CUDA kernels for Hugging Face's diffusers and transformers libraries.
This article explores low-bit inference techniques that make large AI models faster and more cost-efficient to serve in production.
This post from Lyft explains how the company validates and diagnoses doubly robust (augmented inverse probability weighting, AIPW) models used for causal inference when A/B testing is not feasible.