---
title: "LiveKit Agents hits 10K stars: shipping STT integrations"
description: "LiveKit's realtime voice AI agent framework merges 171 PRs in 30 days, adds Pulse STT, Inworld STT, and avatar playback signaling. 10,153 stars, 100 commits."
tldr: "LiveKit Agents crossed 10,153 stars with 171 merged PRs in the last 30 days. Recent work focuses on new STT provider integrations (Pulse, Inworld), TTS model additions (MiniMax, Qwen 3), and avatar session lifecycle improvements including playback_started RPC signaling."
url: "https://aigentic.blog/livekit-agents-sttt-avatar-shipping"
publishedAt: "2026-04-22T13:00:04.560Z"
updatedAt: "2026-04-22T13:00:04.560Z"
category: "repo-pulse"
tags: ["livekit-agents","python","realtime-voice","ai-agents","stt-integrations"]
---

# LiveKit Agents hits 10K stars: shipping STT integrations

> LiveKit Agents crossed 10,153 stars with 171 merged PRs in the last 30 days. Recent work focuses on new STT provider integrations (Pulse, Inworld), TTS model additions (MiniMax, Qwen 3), and avatar session lifecycle improvements including playback_started RPC signaling.

LiveKit Agents is a Python framework for building realtime voice AI agents with video and audio I/O. The project has reached 10,153 stars and is actively shipping integrations, bugfixes, and infrastructure improvements at a sustained pace: 100 commits, 40 contributors, and 171 merged PRs over the last 30 days signal a healthy, production-focused codebase.

## By the numbers

| Metric | Value |
|--------|-------|
| Current stars | 10,153 |
| 30-day commits | 100 |
| 30-day contributors | 40 |
| 30-day PRs merged | 171 |
| 30-day issues opened | 61 |
| 30-day issues closed | 79 |
| Latest release | livekit-agents@1.5.5 (April 20, 2026) |
| Release cadence (90d) | 10 releases in 31 days |

The release velocity is notably consistent: six releases between March 19 and April 20 show a roughly 4 to 5 day cycle at the patch version level. This cadence reflects active development and rapid iteration on both core functionality and plugin integrations.

## What's shipping

Recent merged work breaks into three main themes: new STT provider integrations, TTS model expansion, and avatar session lifecycle improvements. Bugfixes and incremental provider parameter exposure round out the list.

**New STT providers**: PR #5312 added Pulse STT support from Smallest AI, including both real-time streaming and batch transcription modes. PR #5451 integrated Inworld STT as a new plugin provider. These additions expand the framework's speech recognition options beyond existing integrations and signal demand for multi-provider flexibility in production deployments.

**TTS model expansion**: PR #5518 added new MiniMax TTS models, addressing user requests for additional voice options. PR #5474 introduced Qwen 3 TTS support via the Simplismart LiveKit plugin. These are not framework-level changes but rather plugin-layer additions that let developers access more model variants without framework upgrades.

**Avatar session improvements**: PR #5511 introduced a playback_started RPC for remote avatar workers, enabling better synchronization between avatar animation and audio playback. PR #5499 added an AvatarSession base class and warnings for sync misconfiguration, addressing lifecycle management issues that developers encountered when wiring avatar state.

**Bugfixes and maintenance**: PR #5522 fixed a critical bug in MovingAverage.reset() that failed to clear the internal history buffer, causing stale averages after reset. PR #5502 cleaned up OpenAI response handling by dropping the prompt_cache_retention field. PR #5467 improved FrameProcessor lifecycle management in room I/O with ownership-aware cleanup. These fixes address correctness and memory issues likely surfaced in production.
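To make the MovingAverage bug concrete, here is a minimal sketch of a fixed-window moving average whose reset clears both the running total and the history buffer. The class below is illustrative only: its name mirrors the one in the PR, but the fields, method names, and window logic are assumptions, not LiveKit's actual implementation.

```python
from collections import deque


class MovingAverage:
    """Fixed-window moving average (illustrative; not LiveKit's code)."""

    def __init__(self, window_size: int) -> None:
        if window_size <= 0:
            raise ValueError("window_size must be positive")
        # Hypothetical internal state: a bounded history plus a running total.
        self._history = deque(maxlen=window_size)
        self._total = 0.0

    def add(self, value: float) -> None:
        # When the window is full, the oldest sample is about to be evicted,
        # so subtract it from the running total first.
        if len(self._history) == self._history.maxlen:
            self._total -= self._history[0]
        self._history.append(value)
        self._total += value

    def average(self) -> float:
        return self._total / len(self._history) if self._history else 0.0

    def reset(self) -> None:
        # Resetting only the total would leave old samples in the buffer,
        # which is the kind of stale state the fix describes.
        self._total = 0.0
        self._history.clear()
```

If reset left the history in place, the first samples added afterwards would still be averaged against (or evicted from) pre-reset data, which is why both pieces of state are cleared together here.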

**Provider parameter exposure**: PR #5493 exposed endpointing parameters in the X.AI STT integration, and PR #5486 added flux-general-multi model support to Deepgram STT v2. These are incremental improvements that give developers finer control over provider-specific behavior.

The overall pattern is clear: the team is shipping breadth (new providers, new models) alongside depth (lifecycle fixes, parameter exposure). The 171 merged PRs in 30 days, from 40 contributors, suggest a well-organized review process and active community contribution.

## Open questions

Eight issues remain open, and they reveal real pain points in production use.

**Timing and turn-taking**: Issue #5509 reports that agents transition from thinking to speaking after the user has already begun speaking, causing stale replies to play over new user turns. This is a fundamental problem in realtime conversation management: the agent's decision latency creates race conditions. Issue #5510 compounds this, noting that Deepgram turn detection in agent sessions triggers frequent user_state_changed events, which in turn cause incorrect filler audio. Both issues point to the difficulty of coordinating voice activity detection (VAD) across multiple providers and agent decision loops.
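One common way to reason about the stale-reply race is a turn counter: the agent records which user turn it is answering and drops the reply if a newer turn has started by the time it is ready to speak. The sketch below shows that idea with plain asyncio. TurnGuard, think_then_speak, and the sleep-based latency are hypothetical stand-ins, not part of the LiveKit Agents API, and a real fix would also have to handle audio that is already playing.

```python
import asyncio


class TurnGuard:
    """Tracks user turns so a reply generated for an old turn can be dropped."""

    def __init__(self) -> None:
        self._turn_id = 0

    def user_started_speaking(self) -> None:
        # Every new user turn invalidates replies still being generated.
        self._turn_id += 1

    def snapshot(self) -> int:
        return self._turn_id

    def is_current(self, turn_id: int) -> bool:
        return turn_id == self._turn_id


async def think_then_speak(guard: TurnGuard, think_seconds: float, spoken: list) -> bool:
    turn_id = guard.snapshot()          # the turn this reply is being generated for
    await asyncio.sleep(think_seconds)  # stand-in for model latency
    if not guard.is_current(turn_id):
        return False                    # user spoke again: discard the stale reply
    spoken.append(f"reply to turn {turn_id}")
    return True


async def demo() -> list:
    guard, spoken = TurnGuard(), []
    guard.user_started_speaking()
    stale = asyncio.create_task(think_then_speak(guard, 0.05, spoken))
    await asyncio.sleep(0.01)
    guard.user_started_speaking()       # user interrupts while the agent is thinking
    fresh = asyncio.create_task(think_then_speak(guard, 0.01, spoken))
    await asyncio.gather(stale, fresh)
    return spoken


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this toy run only the reply to the second turn is spoken; the reply generated for the first turn is discarded at the thinking-to-speaking transition.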

**Provider-specific timeouts**: Issue #5508 flags that the Anthropic plugin's default 5-second httpx timeout is too aggressive for adaptive-thinking and large-context models. This is a practical scaling problem: as models become more capable, inference latency increases, and hardcoded timeouts become liabilities. The fix likely requires configurable or adaptive timeout strategies.
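As a rough illustration of what a configurable or adaptive timeout could look like, the sketch below derives a timeout from whether thinking is enabled and how large the context is, while letting an explicit override win. Everything here is an assumption made for illustration: TimeoutPolicy, its field names, and the multipliers are invented, and only the 5-second base comes from the issue as described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TimeoutPolicy:
    """Hypothetical policy; field names and numbers are illustrative only."""

    base_seconds: float = 5.0            # the default the issue describes
    thinking_multiplier: float = 6.0     # assumed headroom when thinking is on
    seconds_per_10k_tokens: float = 1.0  # assumed headroom for large contexts
    max_seconds: float = 120.0           # upper bound so requests cannot hang forever

    def resolve(
        self,
        *,
        thinking: bool,
        context_tokens: int,
        override: Optional[float] = None,
    ) -> float:
        if override is not None:
            return override  # explicit caller configuration always wins
        timeout = self.base_seconds * (self.thinking_multiplier if thinking else 1.0)
        timeout += self.seconds_per_10k_tokens * (context_tokens / 10_000)
        return min(timeout, self.max_seconds)


if __name__ == "__main__":
    policy = TimeoutPolicy()
    print(policy.resolve(thinking=False, context_tokens=0))
    print(policy.resolve(thinking=True, context_tokens=100_000))
```

The resolved value could then be handed to whatever HTTP client the plugin uses, for example wrapped in an httpx.Timeout. How the Anthropic plugin would actually accept such a setting is not specified in the issue summary.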

**Feature requests and proposals**: Issue #5512 proposes a native monetization layer via Merxex, and issue #5507 proposes a new rnnoise plugin for self-hosted noise cancellation. These are not bugs but architectural questions about scope and community contribution models. They suggest users are thinking about production deployment and operational concerns beyond the core agent loop.

**Provider support gaps**: Issue #5513 was about MiniMax TTS not supporting the speech-2.8-hd model; it was closed, implying the model was added (consistent with PR #5518). This type of issue is typical for plugin-based architectures: model catalogs evolve faster than integrations, and users hit unsupported variants.

The open issues suggest the framework is being used in ways that stress its concurrency model and provider integration patterns. The timing issues in particular are not simple to solve and likely require deeper changes to how agent state and user input are synchronized.

## Takeaways

**Provider breadth is a core competitive lever**: The last 30 days saw two new STT providers (Pulse, Inworld) and two TTS additions (Qwen 3 via Simplismart, plus new MiniMax models). LiveKit is not building all the AI models; it is building the plumbing to let developers swap providers without rewriting agent logic. This is the right architectural bet for a framework, and the merge velocity shows the team is executing on it.
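The "swap providers without rewriting agent logic" idea can be sketched with a small structural interface. The names below (SpeechToText, FakePulseSTT, FakeInworldSTT, handle_turn) are hypothetical and exist only to show the pattern; they are not the LiveKit Agents plugin API, and the fake providers return canned strings instead of calling any service.

```python
from typing import Protocol


class SpeechToText(Protocol):
    """Minimal interface the agent logic depends on (hypothetical)."""

    def transcribe(self, audio: bytes) -> str: ...


class FakePulseSTT:
    # Stand-in for one provider; a real plugin would call the vendor's service.
    def transcribe(self, audio: bytes) -> str:
        return f"pulse transcript ({len(audio)} bytes)"


class FakeInworldSTT:
    # Stand-in for a second provider with the same method signature.
    def transcribe(self, audio: bytes) -> str:
        return f"inworld transcript ({len(audio)} bytes)"


def handle_turn(stt: SpeechToText, audio: bytes) -> str:
    # Agent logic only sees the interface, so the provider can be swapped
    # at construction time without touching this function.
    return f"heard: {stt.transcribe(audio)}"


if __name__ == "__main__":
    for provider in (FakePulseSTT(), FakeInworldSTT()):
        print(handle_turn(provider, b"\x00\x01\x02"))
```

The design choice is that the agent code is written once against the interface, and each new provider only has to satisfy it.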

**Avatar synchronization and turn-taking are still rough**: Two recent PRs (#5511, #5499) target avatar playback, and two open issues (#5509, #5510) center on turn-taking. The framework is shipping RPC signals and base classes to help, but the underlying problem (coordinating VAD, agent latency, and avatar animation) remains hard. Expect more work here as video agents become more common.

**Community contribution is scaling**: 40 contributors in 30 days, with PRs from external developers (Smallest AI, Simplismart, Inworld), shows the plugin model is working. Developers are not just filing issues; they are submitting integrations. This is a sign of ecosystem health and a network effect that benefits all users.

**Correctness bugs still surface in production**: The MovingAverage buffer bug (PR #5522), FrameProcessor lifecycle issues (PR #5467), and timeout problems (issue #5508) are not exotic edge cases; they are problems that real deployments hit. The team is responsive (bug reported and fixed within hours), but the volume suggests the framework is being pushed hard and the test coverage may not yet reflect all production patterns.

## Further reading

- [LiveKit Agents GitHub repository](https://github.com/livekit/agents) - Source code, issue tracker, and release history for the framework.
- [LiveKit Agents documentation](https://docs.livekit.io/agents/) - Official guides for building and deploying voice agents.
- [Smallest AI Pulse STT integration PR #5312](https://github.com/livekit/agents/pull/5312) - Details on real-time streaming and batch transcription support.
- [Avatar session improvements PR #5511](https://github.com/livekit/agents/pull/5511) - Playback synchronization and RPC signaling for remote avatar workers.
- [LiveKit realtime communication platform](https://livekit.io/) - Core infrastructure that LiveKit Agents builds upon.

## Frequently asked

### What new STT providers were added to LiveKit Agents recently?

In the last 30 days, Pulse STT from Smallest AI (PR #5312) and Inworld STT (PR #5451) were integrated. Pulse supports both real-time streaming and batch transcription modes. These additions give developers more speech recognition options beyond existing providers.

### How often does LiveKit Agents release new versions?

The project maintains a roughly 4 to 5 day release cycle at the patch version level. In the 90 days prior to April 2026, 10 releases shipped, with the latest being livekit-agents@1.5.5. This pace reflects active development and rapid iteration on integrations and bugfixes.

### What turn-taking and synchronization issues are open?

Issue #5509 reports agents starting to speak after the user has already begun a new turn, so stale replies overlap the new turn. Issue #5510 notes that Deepgram turn detection triggers frequent user_state_changed events, causing incorrect filler audio. Both stem from the difficulty of coordinating voice activity detection and agent decision latency in realtime conversation.

### How many contributors are actively working on LiveKit Agents?

40 contributors merged code in the last 30 days, with 171 PRs merged. Contributors include both LiveKit staff and external developers from provider companies (Smallest AI, Inworld, Simplismart), indicating a healthy ecosystem around the plugin model.

### What was the MovingAverage bug in PR #5522?

The MovingAverage.reset() method did not clear its internal history buffer, causing stale averages to persist after reset. This was a correctness bug likely surfaced in production and was fixed within hours of being reported.
