Pipecat Voice Assistant Builder is a development claude skill built by sickn33. Best for: Developers building real-time conversational voice agents need a production-ready pipeline blueprint that orchestrates STT, LLM, and TTS with multi-provider support..
- What it does
- Build low-latency voice assistants using Pipecat with Gemini, OpenAI, and Whisper integration.
- Category
- development
- Created by
- sickn33
- Last updated
Pipecat Voice Assistant Builder
Build low-latency voice assistants using Pipecat with Gemini, OpenAI, and Whisper integration.
Skill instructions
name: pipecat-friday-agent description: "Build a low-latency, Iron Man-inspired tactical voice assistant (F.R.I.D.A.Y.) using Pipecat, Gemini, and OpenAI." category: voice-agents risk: safe source: community date_added: "2026-03-10" tags: [pipecat, voice, gemini, openai, python] tools: [pipecat]
Pipecat Friday Agent
Overview
This skill provides a blueprint for building F.R.I.D.A.Y. (Replacement Integrated Digital Assistant Youth), a local voice assistant inspired by the tactical AI from the Iron Man films. It uses the Pipecat framework to orchestrate a low-latency pipeline:
- STT: OpenAI Whisper (
whisper-1) orgpt-4o-transcribe - LLM: Google Gemini 2.5 Flash (via a compatibility shim)
- TTS: OpenAI TTS (
novavoice) - Transport: Local Audio (Hardware Mic/Speakers)
When to Use This Skill
- Use when you want to build a real-time, conversational voice agent.
- Use when working with the Pipecat framework for pipeline-based AI.
- Use when you need to integrate multiple providers (Google and OpenAI) into a single voice loop.
- Use when building Iron Man-themed or tactical-themed voice applications.
How It Works
Step 1: Install Dependencies
You will need the Pipecat framework and its service providers installed:
pip install pipecat-ai[openai,google,silero] python-dotenv
Step 2: Configure Environment
Create a .env file with your API keys:
OPENAI_API_KEY=your_openai_key
GOOGLE_API_KEY=your_google_key
Step 3: Run the Agent
Execute the provided Python script to start the interface:
python scripts/friday_agent.py
Core Concepts
Pipeline Architecture
The agent follows a linear pipeline: Mic -> VAD -> STT -> LLM -> TTS -> Speaker. This allows for granular control over each stage, unlike end-to-end speech-to-speech models.
Google Compatibility Shim
Since Google's Gemini API has a different message format than OpenAI's standard (which Pipecat aggregators expect), the script includes a GoogleSafeContext and GoogleSafeMessage class to bridge the gap.
Best Practices
- ✅ Use Silero VAD: It is robust for local hardware and prevents background noise from triggering the LLM.
- ✅ Concise Prompts: Tactical agents should give short, data-dense responses to minimize latency.
- ✅ Sample Rate Match: OpenAI TTS outputs at 24kHz; ensure your
audio_out_sample_ratematches to avoid high-pitched or slowed audio. - ❌ No Polite Fillers: Avoid "Hello, how can I help you today?" Instead, use "Systems nominal. Ready for commands."
Troubleshooting
- Problem: Audio is choppy or delayed.
- Solution: Check your
OUTPUT_DEVICEindex. Run a script liketest_audio_output.pyto find the correct hardware index for your OS.
- Solution: Check your
- Problem: "Validation error" for message format.
- Solution: Ensure the
GoogleSafeContextshim is correctly translating OpenAI-style dicts to Gemini-style schema.
- Solution: Ensure the
Related Skills
@voice-agents- General principles of voice AI.@agent-tool-builder- Add tools (Search, Lights, etc.) to your Friday agent.@llm-architect- Optimizing the LLM layer.
Limitations
- Use this skill only when the task clearly matches the scope described above.
- Do not treat the output as a substitute for environment-specific validation, testing, or expert review.
- Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are missing.
Use this skill
Most skills are portable instruction packages. Claude Code supports SKILL.md directly. Other agents can use adapted files like AGENTS.md, .cursorrules, and GEMINI.md.
Claude Code
Save SKILL.md into your Claude Skills folder, then restart Claude Code.
mkdir -p ~/.claude/skills/pipecat-voice-assistant-builder && curl -L "https://raw.githubusercontent.com/sickn33/antigravity-awesome-skills/HEAD/skills/pipecat-friday-agent/SKILL.md" -o ~/.claude/skills/pipecat-voice-assistant-builder/SKILL.mdInstalls to ~/.claude/skills/pipecat-voice-assistant-builder/SKILL.md.
Use cases
Developers building real-time conversational voice agents need a production-ready pipeline blueprint that orchestrates STT, LLM, and TTS with multi-provider support.
Reviews
No reviews yet. Be the first to review this skill.
No signup required
Stats
Creator
Ssickn33
@sickn33