Explore real-world engineering experiences from top tech companies.
This article presents Delta Weight Sync, a technique for efficiently synchronizing model weights in async reinforcement learning by transmitting only changed parameters.
This article argues that specialized models, aligned to specific deployment tasks, can outperform much larger frontier models at significantly lower cost.
Databricks Genie and TabPFN combine to enable business users to ask predictive questions in natural language through a multi-agent orchestrator.
OlmoEarth v1.1 is a more efficient family of transformer-based models for processing satellite imagery.
IBM released Granite Embedding Multilingual R2, two new multilingual embedding models balancing model size with retrieval quality.
This article explains the AWS infrastructure building blocks for training foundation models and running inference on them at scale.
This article presents Google Cloud's cluster-level reliability framework for TPUs, designed to optimize infrastructure availability when training trillion-parameter AI models.
Databricks built an evaluation framework using LLM judges aligned with human experts through MemAlign to assess the quality of Genie Code-generated machine learning notebooks.
This paper presents a Contextual Sequential Two-Tower Model for Pinterest ads that integrates real-time context into sequential recommender systems.
EMO is a mixture-of-experts model trained to develop modular expert groups that can be selectively used for specific tasks.
This article demonstrates LoRA fine-tuning of Qwen3-1.7B on MedMCQA on an AMD MI300X with ROCm, enabling clinical question answering without CUDA.
This article describes fixing a train-inference mismatch in reinforcement learning that appeared when migrating PipelineRL from vLLM V0 to V1.