kiwi-voice Security Report — Trusted | ClawSafe

5/100

kiwi-voice

Multi-language real-time voice assistant with OpenClaw AI backend integration, speaker identification, and voice security controls

Kiwi Voice is a legitimate multi-language voice assistant with robust security controls including Telegram approval for dangerous commands, speaker identification with priority levels, and two-layer command filtering. No malicious behavior detected.

Skill Name: kiwi-voice

Duration: 75.3s

Engine: pi

✓

Safe to install

The skill is safe to use, and no security concerns require action. Keep the .env file that holds the API keys secure, and do not commit it to version control.
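One way to act on this recommendation is to check at startup that the expected keys are set, without ever printing their values. The sketch below is illustrative only: `missing_keys` and `CHECKED_KEYS` are hypothetical names, not part of kiwi-voice. The two variable names are the ones listed in the Environment row of this report, and which keys are actually needed depends on the providers you enable.

```python
import os
from typing import Iterable, Mapping

# Names taken from this report's Environment row; the list is not exhaustive,
# and treating both as required is an assumption made for illustration.
CHECKED_KEYS = ("KIWI_ELEVENLABS_API_KEY", "KIWI_TELEGRAM_BOT_TOKEN")


def missing_keys(env: Mapping[str, str], required: Iterable[str] = CHECKED_KEYS) -> list[str]:
    """Return the names (never the values) of keys that are unset or blank."""
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    absent = missing_keys(os.environ)
    if absent:
        # Only key names are reported, so secrets never reach logs.
        print("Missing environment variables:", ", ".join(absent))
```

Reporting names rather than values keeps secrets out of logs even when the check fails.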

Resource	Declared	Inferred	Status	Evidence
Filesystem	`READ`	`READ`	✓ Aligned	Voice profiles stored in voice_profiles/ directory, logs in logs/ directory - bo…
Network	`READ`	`READ`	✓ Aligned	WebSocket to 127.0.0.1:18789 (local), REST API on port 7789, Telegram/HA/TTS ext…
Shell	`NONE`	`NONE`	—	No shell execution from user input; subprocess only for legitimate CLI tools (pa…
Environment	`READ`	`READ`	✓ Aligned	Reads API keys from .env (KIWI_ELEVENLABS_API_KEY, KIWI_TELEGRAM_BOT_TOKEN, etc.…
Skill Invoke	`NONE`	`NONE`	—	No skill chaining detected
Clipboard	`NONE`	`NONE`	—	No clipboard access found
Browser	`NONE`	`NONE`	—	No browser automation detected
Database	`NONE`	`NONE`	—	No database access detected
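The Status column can be read as a comparison between what the skill declares and what the scan infers. The sketch below reproduces the two outcomes visible in this table. It is not ClawSafe's implementation, and the mismatch label is an assumption, since no mismatching row appears in this report.

```python
def alignment_status(declared: str, inferred: str) -> str:
    """Mirror the Status column for one resource row."""
    declared, inferred = declared.upper(), inferred.upper()
    if declared == "NONE" and inferred == "NONE":
        return "\u2014"  # the dash shown when a resource is neither declared nor used
    if declared == inferred:
        return "✓ Aligned"
    # Assumed label: the report contains no example of a mismatch.
    return "✗ Mismatch"


if __name__ == "__main__":
    for resource, declared, inferred in [
        ("Filesystem", "READ", "READ"),
        ("Shell", "NONE", "NONE"),
    ]:
        print(resource, alignment_status(declared, inferred))
```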

1 Critical 32 findings

💀

Critical Dangerous Command (Dangerous Shell Command)

rm -rf /

docs/features/voice-security.md:17

🔗

Medium External URL

https://kiwi-voice.com

README.md:2

🔗

Medium External URL

https://kiwi-voice.com/assets/og-image.svg

README.md:3

🔗

Medium External URL

https://img.shields.io/badge/license-MIT-blue.svg

README.md:14

🔗

Medium External URL

https://www.python.org/downloads/

README.md:15

🔗

Medium External URL

https://img.shields.io/badge/python-3.10%2B-blue.svg

README.md:15

🔗

Medium External URL

https://img.shields.io/badge/backend-OpenClaw-orange.svg

README.md:16

🔗

Medium External URL

https://docs.kiwi-voice.com

README.md:19

🔗

Medium External URL

https://docs.openclaw.ai

README.md:21

🔗

Medium External URL

http://homeassistant.local:8123

README.md:429

🔗

Medium External URL

https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html

docs/deployment/docker.md:70

🔗

Medium External URL

https://download.pytorch.org/whl/cu121

docs/deployment/gpu.md:10

🔗

Medium External URL

https://www.home-assistant.io/

docs/features/home-assistant.md:3

🔗

Medium External URL

http://192.168.1.100:7789

docs/features/home-assistant.md:20

🔗

Medium External URL

https://t.me/botfather

docs/features/voice-security.md:60

🔗

Medium External URL

https://api.telegram.org/bot

docs/features/voice-security.md:67

🔗

Medium External URL

https://developer.mozilla.org/en-US/docs/Web/API/AudioWorklet

docs/features/web-microphone.md:7

🔗

Medium External URL

https://ffmpeg.org/download.html

docs/getting-started/installation.md:52

🔗

Medium External URL

http://192.168.1.100:8123

kiwi/integrations/homeassistant.py:30

🔗

Medium External URL

https://api.elevenlabs.io/v1/text-to-speech

kiwi/tts/elevenlabs.py:93

🔗

Medium External URL

https://api.runpod.ai/v2/

kiwi/tts/runpod.py:44

🔗

Medium External URL

http://www.w3.org/2000/svg

kiwi/web/index.html:11

🔗

Medium External URL

https://console.runpod.io/serverless

runpod/README.md:27

🔗

Medium External URL

https://api.runpod.ai/v2/YOUR_ENDPOINT_ID

runpod/README.md:172

🔗

Medium External URL

http://www.apache.org/licenses/LICENSE-2.0

runpod/qwen_tts/__init__.py:9

🔗

Medium External URL

https://huggingface.co/papers/2103.17239

runpod/qwen_tts/core/tokenizer_12hz/modeling_qwen3_tts_tokenizer_v2.py:394

🔗

Medium External URL

https://huggingface.co/papers/2006.08195

runpod/qwen_tts/core/tokenizer_12hz/modeling_qwen3_tts_tokenizer_v2.py:588

🔗

Medium External URL

https://discuss.pytorch.org/t/how-to-generate-variable-length-mask/23397/3

runpod/qwen_tts/core/tokenizer_25hz/modeling_qwen3_tts_tokenizer_v1.py:233

🔗

Medium External URL

https://huggingface.co/papers/2005.07143

runpod/qwen_tts/core/tokenizer_25hz/modeling_qwen3_tts_tokenizer_v1.py:345

🔗

Medium External URL

https://arxiv.org/pdf/2107.03312.pdf

runpod/qwen_tts/core/tokenizer_25hz/vq/core_vq.py:336

🔗

Medium External URL

https://arxiv.org/abs/2305.02765

runpod/qwen_tts/core/tokenizer_25hz/vq/core_vq.py:479

🔗

Medium External URL

https://colab.research.google.com/drive/1q1oe2zOyZp7UsB3jJiQ1IFn8z5YfjwEb

scripts/train_wake_word.py:17
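The External URL findings above mix public endpoints with LAN placeholders such as http://192.168.1.100:7789 and http://homeassistant.local:8123. As a rough triage aid, the sketch below separates the two groups by host. It is an assumption about how a reader might sort the list, not how ClawSafe classifies findings, and `host_scope` is a hypothetical helper.

```python
import ipaddress
from urllib.parse import urlsplit


def host_scope(url: str) -> str:
    """Classify a URL's host as 'local' (loopback, private IP, localhost, .local) or 'public'."""
    host = urlsplit(url).hostname or ""
    if host == "localhost" or host.endswith(".local"):
        return "local"
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: treat any other DNS name as public.
        return "public"
    return "local" if (ip.is_private or ip.is_loopback) else "public"


if __name__ == "__main__":
    for url in (
        "http://192.168.1.100:7789",
        "http://homeassistant.local:8123",
        "https://api.telegram.org/bot",
    ):
        print(host_scope(url), url)
```

A host-based split says nothing about whether a URL is contacted at runtime. The file path next to each finding (for example README.md versus kiwi/tts/elevenlabs.py) is the better hint for that.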

File Tree

146 files · 1.4 MB · 40646 lines

Python 79f · 24900L; YAML 18f · 8512L; Markdown 35f · 3575L; CSS 2f · 1831L; JavaScript 3f · 1372L; HTML 1f · 282L; JSON 3f · 69L; Text 1f · 53L; TOML 2f · 43L; Config 1f · 9L

├─ ▾ 📁 custom_components

│ └─ ▾ 📁 kiwi_voice

│ ├─ ▾ 📁 translations

│ │ └─ 📋 en.json JSON 29L · 843 B

│ ├─ 🐍 __init__.py Python 185L · 6.0 KB

│ ├─ 🐍 button.py Python 108L · 3.6 KB

│ ├─ 🐍 config_flow.py Python 133L · 4.4 KB

│ ├─ 🐍 const.py Python 5L · 116 B

│ ├─ 🐍 coordinator.py Python 159L · 5.6 KB

│ ├─ 📋 manifest.json JSON 11L · 312 B

│ ├─ 🐍 sensor.py Python 165L · 5.6 KB

│ ├─ 📋 services.yaml YAML 36L · 994 B

│ ├─ 📋 strings.json JSON 29L · 843 B

│ ├─ 🐍 switch.py Python 75L · 2.7 KB

│ └─ 🐍 tts.py Python 89L · 2.8 KB

├─ ▾ 📁 docs

│ ├─ ▾ 📁 api

│ │ ├─ 📝 rest.md Markdown 268L · 3.9 KB

│ │ └─ 📝 websocket.md Markdown 118L · 3.0 KB

│ ├─ ▾ 📁 assets

│ │ └─ 📦 favicon.svg 115 B

│ ├─ ▾ 📁 deployment

│ │ ├─ 📝 docker.md Markdown 82L · 1.6 KB

│ │ ├─ 📝 gpu.md Markdown 76L · 1.6 KB

│ │ ├─ 📝 local.md Markdown 76L · 1.6 KB

│ │ ├─ 📝 reverse-proxy.md Markdown 47L · 1.2 KB

│ │ └─ 📝 systemd.md Markdown 58L · 1.1 KB

│ ├─ ▾ 📁 development

│ │ ├─ 📝 architecture.md Markdown 81L · 3.4 KB

│ │ ├─ 📝 code-patterns.md Markdown 129L · 2.4 KB

│ │ └─ 📝 contributing.md Markdown 86L · 2.8 KB

│ ├─ ▾ 📁 features

│ │ ├─ 📝 home-assistant.md Markdown 82L · 2.4 KB

│ │ ├─ 📝 multilanguage.md Markdown 113L · 3.3 KB

│ │ ├─ 📝 souls.md Markdown 95L · 2.5 KB

│ │ ├─ 📝 speaker-id.md Markdown 70L · 2.4 KB

│ │ ├─ 📝 streaming-tts.md Markdown 42L · 1.7 KB

│ │ ├─ 📝 stt-engines.md Markdown 94L · 2.7 KB

│ │ ├─ 📝 tts-providers.md Markdown 122L · 2.6 KB

│ │ ├─ 📝 voice-security.md Markdown 91L · 3.2 KB

│ │ ├─ 📝 wake-word.md Markdown 91L · 2.7 KB

│ │ ├─ 📝 web-dashboard.md Markdown 47L · 1.8 KB

│ │ └─ 📝 web-microphone.md Markdown 47L · 1.7 KB

│ ├─ ▾ 📁 getting-started

│ │ ├─ 📝 configuration.md Markdown 170L · 4.0 KB

│ │ ├─ 📝 first-run.md Markdown 81L · 2.2 KB

│ │ └─ 📝 installation.md Markdown 81L · 1.7 KB

│ ├─ ▾ 📁 stylesheets

│ │ └─ 📄 extra.css CSS 230L · 4.8 KB

│ └─ 📝 index.md Markdown 130L · 4.7 KB

├─ ▾ 📁 kiwi

│ ├─ ▾ 📁 api

│ │ ├─ 🐍 __init__.py Python 2L · 83 B

│ │ ├─ 🐍 audio_bridge.py Python 259L · 8.9 KB

│ │ └─ 🐍 server.py Python 952L · 39.5 KB

│ ├─ ▾ 📁 integrations

│ │ ├─ 🐍 __init__.py Python 1L · 54 B

│ │ └─ 🐍 homeassistant.py Python 230L · 8.7 KB

│ ├─ ▾ 📁 locales

│ │ ├─ 🐍 __init__.py Python 0 B

│ │ ├─ 📋 ar.yaml YAML 533L · 17.3 KB

│ │ ├─ 📋 de.yaml YAML 541L · 15.8 KB

│ │ ├─ 📋 en.yaml YAML 557L · 14.7 KB

│ │ ├─ 📋 es.yaml YAML 535L · 15.2 KB

│ │ ├─ 📋 fr.yaml YAML 541L · 15.7 KB

│ │ ├─ 📋 hi.yaml YAML 533L · 23.6 KB

│ │ ├─ 📋 id.yaml YAML 535L · 14.6 KB

│ │ ├─ 📋 it.yaml YAML 539L · 15.5 KB

│ │ ├─ 📋 ja.yaml YAML 525L · 17.5 KB

│ │ ├─ 📋 ko.yaml YAML 524L · 15.4 KB

│ │ ├─ 📋 pl.yaml YAML 535L · 15.1 KB

│ │ ├─ 📋 pt.yaml YAML 539L · 15.2 KB

│ │ ├─ 📋 ru.yaml YAML 591L · 21.9 KB

│ │ ├─ 📋 tr.yaml YAML 530L · 14.7 KB

│ │ └─ 📋 zh.yaml YAML 526L · 14.0 KB

│ ├─ ▾ 📁 mixins

│ │ ├─ 🐍 __init__.py Python 15L · 478 B

│ │ ├─ 🐍 audio_playback.py Python 331L · 13.4 KB

│ │ ├─ 🐍 dialogue_pipeline.py Python 699L · 32.2 KB

│ │ ├─ 🐍 llm_callbacks.py Python 247L · 10.4 KB

│ │ ├─ 🐍 stream_watchdog.py Python 292L · 12.5 KB

│ │ └─ 🐍 tts_speech.py Python 585L · 24.0 KB

│ ├─ ▾ 📁 souls

│ │ ├─ 📝 comedian.md Markdown 29L · 1.2 KB

│ │ ├─ 📝 hype-person.md Markdown 27L · 1.2 KB

│ │ ├─ 📝 mindful-companion.md Markdown 37L · 1.7 KB

│ │ ├─ 📝 siren.md Markdown 33L · 2.0 KB

│ │ └─ 📝 storyteller.md Markdown 29L · 1.3 KB

│ ├─ ▾ 📁 stt

│ │ ├─ 🐍 __init__.py Python 1L · 52 B

│ │ ├─ 🐍 elevenlabs.py Python 196L · 6.0 KB

│ │ └─ 🐍 mlx_whisper.py Python 139L · 4.0 KB

│ ├─ ▾ 📁 tts

│ │ ├─ 🐍 __init__.py Python 2L · 110 B

│ │ ├─ 🐍 base.py Python 138L · 4.4 KB

│ │ ├─ 🐍 elevenlabs_ws.py Python 710L · 27.0 KB

│ │ ├─ 🐍 elevenlabs.py Python 302L · 11.2 KB

│ │ ├─ 🐍 kokoro.py Python 193L · 5.9 KB

│ │ ├─ 🐍 piper.py Python 143L · 4.9 KB

│ │ ├─ 🐍 qwen_local.py Python 197L · 7.1 KB

│ │ ├─ 🐍 runpod.py Python 265L · 8.2 KB

│ │ └─ 🐍 streaming.py Python 356L · 14.1 KB

│ ├─ ▾ 📁 web

│ │ ├─ ▾ 📁 static

│ │ │ ├─ 📜 app.js JavaScript 1000L · 31.7 KB

│ │ │ ├─ 📜 audio-client.js JavaScript 288L · 8.0 KB

│ │ │ ├─ 📜 audio-worklet.js JavaScript 84L · 2.3 KB

│ │ │ └─ 📄 style.css CSS 1601L · 34.2 KB

│ │ └─ 📄 index.html HTML 282L · 15.0 KB

│ ├─ 🐍 __init__.py Python 7L · 247 B

│ ├─ 🐍 __main__.py Python 4L · 78 B

│ ├─ 🐍 config_loader.py Python 720L · 34.4 KB

│ ├─ 🐍 event_bus.py Python 466L · 13.6 KB

│ ├─ 🐍 hardware_aec.py Python 466L · 16.2 KB

│ ├─ 🐍 i18n.py Python 71L · 1.9 KB

│ ├─ 🐍 listener.py Python 2665L · 122.8 KB

│ ├─ 🐍 openclaw_cli.py Python 254L · 9.6 KB

│ ├─ 🐍 openclaw_ws.py Python 1648L · 64.8 KB

│ ├─ 🐍 service.py Python 748L · 29.8 KB

│ ├─ 🐍 soul_manager.py Python 258L · 9.6 KB

│ ├─ 🐍 speaker_id.py Python 451L · 15.5 KB

│ ├─ 🐍 speaker_manager.py Python 625L · 23.3 KB

│ ├─ 🐍 state_machine.py Python 11L · 437 B

│ ├─ 🐍 text_processing.py Python 242L · 9.3 KB

│ ├─ 🐍 unified_vad.py Python 398L · 12.9 KB

│ ├─ 🐍 utils.py Python 148L · 4.7 KB

│ ├─ 🐍 voice_security.py Python 624L · 22.6 KB

│ └─ 🐍 wake_word.py Python 104L · 3.1 KB

├─ ▾ 📁 runpod

│ ├─ ▾ 📁 qwen_tts

│ │ ├─ ▾ 📁 cli

│ │ │ └─ 🐍 demo.py Python 634L · 28.5 KB

│ │ ├─ ▾ 📁 core

│ │ │ ├─ ▾ 📁 tokenizer_12hz

│ │ │ │ ├─ 🔑 configuration_qwen3_tts_tokenizer_v2.py ⚠ Python 172L · 7.8 KB

│ │ │ │ └─ 🔑 modeling_qwen3_tts_tokenizer_v2.py ⚠ Python 1025L · 39.5 KB

│ │ │ ├─ ▾ 📁 tokenizer_25hz

│ │ │ │ ├─ ▾ 📁 vq

│ │ │ │ │ ├─ 🐍 core_vq.py Python 522L · 19.6 KB

│ │ │ │ │ ├─ 🐍 speech_vq.py Python 356L · 14.5 KB

│ │ │ │ │ └─ 🐍 whisper_encoder.py Python 406L · 14.0 KB

│ │ │ │ ├─ 🔑 configuration_qwen3_tts_tokenizer_v1.py ⚠ Python 332L · 14.2 KB

│ │ │ │ └─ 🔑 modeling_qwen3_tts_tokenizer_v1.py ⚠ Python 1528L · 55.1 KB

│ │ │ └─ 🐍 __init__.py Python 18L · 990 B

│ │ ├─ ▾ 📁 inference

│ │ │ ├─ 🐍 qwen3_tts_model.py Python 877L · 36.3 KB

│ │ │ └─ 🔑 qwen3_tts_tokenizer.py ⚠ Python 410L · 15.3 KB

│ │ ├─ 🐍 __init__.py Python 23L · 839 B

│ │ └─ 🐍 __main__.py Python 24L · 800 B

│ ├─ 🐍 download_model.py Python 22L · 734 B

│ ├─ 🐍 handler.py Python 415L · 13.0 KB

│ └─ 📝 README.md Markdown 238L · 6.1 KB

├─ ▾ 📁 scripts

│ ├─ 🐍 generate_cues.py Python 94L · 2.8 KB

│ ├─ 🐍 noise_monitor.py Python 85L · 2.6 KB

│ ├─ 🐍 send_voice.py Python 33L · 916 B

│ ├─ 🐍 telegram_voice.py Python 170L · 5.1 KB

│ └─ 🐍 train_wake_word.py Python 112L · 3.1 KB

├─ ▾ 📁 tests

│ ├─ 🐍 __init__.py Python 0 B

│ ├─ 🐍 test_api.py Python 27L · 704 B

│ ├─ 🐍 test_audio_bridge.py Python 106L · 3.0 KB

│ ├─ 🐍 test_config.py Python 58L · 1.9 KB

│ ├─ 🐍 test_i18n.py Python 78L · 2.2 KB

│ ├─ 🐍 test_smoke.py Python 80L · 2.6 KB

│ ├─ 🐍 test_text_processing.py Python 137L · 4.2 KB

│ └─ 🐍 test_voice_security.py Python 72L · 2.5 KB

├─ 📝 CLAUDE.md Markdown 228L · 8.8 KB

├─ 📋 config.yaml YAML 283L · 9.9 KB

├─ 📋 mkdocs.yml YAML 109L · 2.9 KB

├─ 📄 mypy.ini Config 9L · 217 B

├─ 📄 pyproject.toml TOML 26L · 830 B

├─ 📝 README.md Markdown 441L · 17.0 KB

├─ 📄 requirements.txt Text 53L · 947 B

├─ 📄 ruff.toml TOML 17L · 489 B

├─ 📝 SKILL.md Markdown 124L · 3.3 KB

└─ 📝 SOUL.md Markdown 12L · 851 B

Dependencies 7 items

Package	Version	Source	Known Vulns	Notes
`faster-whisper`	`>=1.0.0`	pip	No	STT engine, industry standard
`sounddevice`	`>=0.4.6`	pip	No	Audio input/output
`pyannote.audio`	`>=3.1.0`	pip	No	Speaker identification
`requests`	`>=2.28.0`	pip	No	HTTP client for TTS APIs
`websocket-client`	`>=1.6.0`	pip	No	WebSocket protocol
`aiohttp`	`>=3.9.0`	pip	No	REST API server
`cryptography`	`>=41.0.0`	pip	No	Ed25519 key handling
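Every entry in the table uses a `>=` minimum-version specifier. The sketch below shows how such a floor could be checked against an installed version string. It handles only plain dotted integers and is an illustration, not part of the scan. `parse_version` and `meets_minimum` are hypothetical helpers, and a real project would more likely rely on `packaging.version`.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn '3.9.0' into (3, 9, 0); only plain dotted integers are supported."""
    return tuple(int(part) for part in text.split("."))


def _pad(version: tuple[int, ...], width: int) -> tuple[int, ...]:
    """Right-pad with zeros so that '1.0' and '1.0.0' compare as equal."""
    return version + (0,) * (width - len(version))


def meets_minimum(installed: str, spec: str) -> bool:
    """Check an installed version against a '>=X.Y.Z' specifier as shown in the table."""
    if not spec.startswith(">="):
        raise ValueError(f"unsupported specifier: {spec!r}")
    have, need = parse_version(installed), parse_version(spec[2:])
    width = max(len(have), len(need))
    return _pad(have, width) >= _pad(need, width)


if __name__ == "__main__":
    print(meets_minimum("3.9.1", ">=3.9.0"))
```

The installed version string can be obtained with `importlib.metadata.version("aiohttp")` from the standard library. A version floor alone does not rule out known vulnerabilities in newer releases.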

Security Positives

✓ Two-layer security architecture: pre-LLM regex filter + post-LLM shell execution approval

✓ Speaker identification with priority levels (OWNER > FRIEND > GUEST > BLOCKED)

✓ Telegram approval workflow with inline keyboard buttons for dangerous commands

✓ Voice security patterns defined per-language in locale YAML files

✓ API authentication with scope-based access control (read, control, tts, speakers, admin)

✓ Ed25519 device identity for WebSocket authentication

✓ Comprehensive crash protection with thread-safe logging

✓ Voice profiles stored as JSON with embeddings, not raw audio

✓ No credential harvesting or exfiltration patterns

✓ Well-documented i18n system with 15 supported languages

✓ OpenClaw session isolation for NSFW soul routing
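To make the first two items more concrete, the sketch below combines the speaker priority ordering with a regex pre-filter applied before a request reaches the LLM. It is a minimal illustration under stated assumptions, not the code in kiwi/voice_security.py: the `Priority` class, the single pattern, and the three decision labels are hypothetical, and the report states that the real patterns are defined per language in the locale YAML files.

```python
import re
from enum import IntEnum


class Priority(IntEnum):
    """Speaker levels in the order given in the report: OWNER > FRIEND > GUEST > BLOCKED."""
    BLOCKED = 0
    GUEST = 1
    FRIEND = 2
    OWNER = 3


# Hypothetical pattern list for illustration; it matches the `rm -rf` form
# flagged in the critical finding above.
DANGEROUS_PATTERNS = [re.compile(r"\brm\s+-rf\b", re.IGNORECASE)]


def pre_filter(text: str, speaker: Priority) -> str:
    """First layer: decide what to do with a transcript before the LLM sees it."""
    if speaker == Priority.BLOCKED:
        return "deny"
    if any(pattern.search(text) for pattern in DANGEROUS_PATTERNS):
        # Assumed policy: dangerous requests are escalated for out-of-band
        # approval (the report mentions a Telegram approval workflow).
        return "needs_approval"
    return "allow"


if __name__ == "__main__":
    print(pre_filter("please run rm -rf / now", Priority.OWNER))
    print(pre_filter("what is the weather", Priority.GUEST))
```

Using an `IntEnum` lets the priority levels be compared directly, so a check such as `speaker >= Priority.FRIEND` expresses a minimum level in one expression.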
