Genkit introduces a middleware system for intercepting and customizing AI generation calls in agentic applications.
Arm SME2 and Google AI Edge enable efficient on-device AI inference by combining hardware acceleration with optimized software tools.
This tutorial demonstrates how to use Google's Agent Development Kit to build long-running AI agents that maintain state across weeks without losing context.
This article presents DFlash, a diffusion-style speculative decoding method that achieves 3x speedups for LLM inference on Google TPUs by generating entire token blocks in a single forward pass instead of sequentially.
Gemini Embedding 2 is a multimodal embedding model that maps text, images, video, audio, and documents into a single semantic space.
Google Cloud announces Rapid Bucket, integrating Colossus storage with PyTorch for accelerated AI/ML training.
LiteRT is a cross-platform on-device AI framework that leverages Neural Processing Units (NPUs) to enable fast, efficient AI features on mobile, desktop, and IoT devices.
Google Cloud introduces Agents CLI in Agent Platform, a unified tool designed to streamline AI agent development from the local environment to production deployment.
This article covers five key lessons for building production-ready AI agents, using a refactored sales research agent as an example.
A2UI v0.9 is a framework-agnostic standard for generative UI that enables AI agents to generate UI components in real time using a client's existing design system and component catalog.
MaxText now supports Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) on single-host TPU configurations for post-training large language models.
Gemini CLI now supports subagents: specialized agents that handle complex tasks with their own isolated context windows, tools, and system instructions, keeping the main session focused and efficient.