Agent Pulse
Posts
Google’s TurboQuant shrinks LLM memory 6x with zero accuracy loss

Google’s TurboQuant shrinks LLM memory 6x with zero accuracy loss

Gemini 3.1 Flash Live API for real-time agent interactions

O P
March 27, 2026

In partnership with

The Future of AI in Marketing. Your Shortcut to Smarter, Faster Marketing.

Unlock a focused set of AI strategies built to streamline your work and maximize impact. This guide delivers the practical tactics and tools marketers need to start seeing results right away:

7 high-impact AI strategies to accelerate your marketing performance
Practical use cases for content creation, lead gen, and personalization
Expert insights into how top marketers are using AI today
A framework to evaluate and implement AI tools efficiently

Stay ahead of the curve with these top strategies AI helped develop for marketers, built for real-world results.

Download the Free Report

AIonFire | AArena | Submit Agent | Advertise | SEO Backlink

Welcome back! OP here again, helping you with another addition of Agent Pulse - your go-to spot for agentic news, insights and more.

In today’s:

👉 TOP Agentic News
✨ Featured Agents
🎙️ What to Watch this Week
⚔️ Agent Arena Battleboard
🏆 Agents Leaderboard
🗺️ Agents Landscape Map

Achieve First Page Rankings through Guest Posts

Google launches Gemini 3.1 Flash Live API for real-time agent interactions

Google introduced Gemini 3.1 Flash Live via the Gemini Live API in Google AI Studio, enabling developers to build real-time, streaming AI experiences with low latency. The model is designed for continuous interaction—supporting voice, text, and multimodal inputs—making it well-suited for agents that need to respond instantly rather than in discrete turns. Unlike traditional request/response APIs, the Live API allows persistent sessions, where agents can maintain context and react dynamically as new input arrives. Strategically, this pushes Gemini further into agent infrastructure territory, positioning it as a foundation for interactive, always-on agents across use cases like copilots, customer support, and real-time assistants.

❝

“Every business, just like they have a website, and a phone number, and an email address, is also going to have an AI.”

Mark Zuckerberg

Google’s TurboQuant shrinks LLM memory 6x and unlocks real-time agent scaling

Google Research introduced TurboQuant, a new compression method targeting the key-value (KV) cache—the core memory bottleneck in large language model inference. The system reduces KV cache size by at least 6x and delivers up to 8x speed improvements, while maintaining zero accuracy loss across benchmarks.

Technically, TurboQuant combines two innovations: PolarQuant, which simplifies vector geometry for high-quality compression, and Quantized Johnson-Lindenstrauss (QJL), a 1-bit error correction method that removes bias in attention calculations. Unlike traditional quantization approaches, it requires no retraining and introduces virtually no runtime overhead, making it deployable on existing models.

The strategic implication is significant: KV cache limits are one of the main constraints on long-context and multi-turn agents. By drastically reducing memory footprint, TurboQuant makes persistent, real-time, and multi-agent systems far more practical—shifting the bottleneck from hardware scaling to software efficiency.

Learn AI in 5 minutes a day

This is the easiest way for a busy person wanting to learn AI in as little time as possible:

Sign up for The Rundown AI newsletter
They send you 5-minute email updates on the latest AI news and how to use it
You learn how to become 2x more productive by leveraging AI

SAP buys Reltio to make enterprise data agent-ready

SAP said it will acquire Reltio to strengthen SAP Business Data Cloud and make both SAP and non-SAP enterprise data AI-ready. The strategic point is the data layer: SAP is positioning trusted, unified, low-latency master data as the backbone for Joule and multi-agent workflows, not just a back-office cleanup project. The release explicitly ties the deal to enterprise-wide agentic AI, with Reltio’s entity resolution, data unification, and MCP support meant to help agents act on cleaner, more interoperable business context. This is one of the clearest March 27 signals that big enterprise vendors see agentic AI as a data architecture problem as much as a model problem.

Salesforce says agents now handle 40% of its internal IT support cases

Salesforce published a behind-the-scenes deployment story showing that its own IT support agents now handle 40% of support cases. The company says that translated into 9,500 automatically resolved cases and $57,000 in savings within the first two months, across an environment managing about 25,000 support tickets per month for 76,000 employees. Strategically, this matters because it is not just product marketing; it is a concrete internal operations example with reported volume, savings, and rollout details around sandboxing, data masking, and least-privilege access. For Agent Pulse, this is a notable real-world proof point that agents are moving from demo mode into measurable enterprise service workflows.

Optimove launches a command center for collaborating marketing agents

Optimove introduced AI Decisioning Studio, a control layer where marketers can manage and monitor a team of AI agents from one interface. It launched with four coordinated agents focused on journeys, offers, content, and send-time optimization, with shared context designed so each decision informs the others. The strategic implication is that marketing vendors are now packaging agent systems less as one-off copilots and more as orchestrated multi-agent teams that compound performance across campaigns. That makes this a meaningful step toward operationalized agentic marketing rather than isolated AI features.

88% resolved. 22% loyal. Your stack has a problem.

Those numbers aren't a CX issue — they're a design issue. Gladly's 2026 Customer Expectations Report breaks down exactly where AI-powered service loses customers, and what the architecture of loyalty-driven CX actually looks like.

Get the report

Featured AI Agents

Orloj - Declarative agent infrastructure as code for multi-agent AI orchestration
Handinger - Turn any website into clean Markdown
Agentman - AI agents for your entire medical practice back office
Dzine AI - All-in-one AI image and video generator
Vibe Otter - Build a professional website in just 30 minutes
Nano Banana - Prompt-based photo editing with character consistency
TheLibrarian.io - Your WhatsApp AI sidekick
Blocks - Platform that brings coding agents into your development workflow