Trusted — Risk Score 5/100
Last scan: 2 days ago
kiwi-voice
Multi-language real-time voice assistant with OpenClaw AI backend integration, speaker identification, and voice security controls
Kiwi Voice is a legitimate multi-language voice assistant with robust security controls including Telegram approval for dangerous commands, speaker identification with priority levels, and two-layer command filtering. No malicious behavior detected.
Skill Name: kiwi-voice
Duration: 75.3s
Engine: pi
Safe to install
The skill is safe to use, and no security concerns require action. Keep the .env file containing API keys secure and do not commit it to version control.
Resource | Declared | Inferred | Status | Evidence
Filesystem | READ | READ | ✓ Aligned | Voice profiles stored in voice_profiles/ directory, logs in logs/ directory - bo…
Network | READ | READ | ✓ Aligned | WebSocket to 127.0.0.1:18789 (local), REST API on port 7789, Telegram/HA/TTS ext…
Shell | NONE | NONE | | No shell execution from user input; subprocess only for legitimate CLI tools (pa…
Environment | READ | READ | ✓ Aligned | Reads API keys from .env (KIWI_ELEVENLABS_API_KEY, KIWI_TELEGRAM_BOT_TOKEN, etc.…
Skill Invoke | NONE | NONE | | No skill chaining detected
Clipboard | NONE | NONE | | No clipboard access found
Browser | NONE | NONE | | No browser automation detected
Database | NONE | NONE | | No database access detected
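The Status column above compares what the skill declares with what the scan inferred for each resource. A minimal sketch of that comparison follows, assuming access levels can be ordered and that "aligned" means the inferred level does not exceed the declared one. The `Access` enum, its `WRITE` level, and `is_aligned` are hypothetical names for illustration, not the scanner's actual logic.

```python
from enum import IntEnum


class Access(IntEnum):
    NONE = 0
    READ = 1
    WRITE = 2  # hypothetical level; it does not appear in this report


def is_aligned(declared: Access, inferred: Access) -> bool:
    """True when the observed access does not exceed the declared access."""
    return inferred <= declared


# Three rows taken from the table above as (declared, inferred) pairs.
rows = {
    "Filesystem": (Access.READ, Access.READ),
    "Network": (Access.READ, Access.READ),
    "Shell": (Access.NONE, Access.NONE),
}

for name, (declared, inferred) in rows.items():
    print(name, "aligned" if is_aligned(declared, inferred) else "MISMATCH")
```

Under this assumption, a skill that declares NONE for a resource but is observed reading it would be reported as a mismatch.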
1 Critical · 32 findings
Severity | Type | Value | Location
Critical | Dangerous Command | rm -rf / | docs/features/voice-security.md:17
Medium | External URL | https://kiwi-voice.com | README.md:2
Medium | External URL | https://kiwi-voice.com/assets/og-image.svg | README.md:3
Medium | External URL | https://img.shields.io/badge/license-MIT-blue.svg | README.md:14
Medium | External URL | https://www.python.org/downloads/ | README.md:15
Medium | External URL | https://img.shields.io/badge/python-3.10%2B-blue.svg | README.md:15
Medium | External URL | https://img.shields.io/badge/backend-OpenClaw-orange.svg | README.md:16
Medium | External URL | https://docs.kiwi-voice.com | README.md:19
Medium | External URL | https://docs.openclaw.ai | README.md:21
Medium | External URL | http://homeassistant.local:8123 | README.md:429
Medium | External URL | https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html | docs/deployment/docker.md:70
Medium | External URL | https://download.pytorch.org/whl/cu121 | docs/deployment/gpu.md:10
Medium | External URL | https://www.home-assistant.io/ | docs/features/home-assistant.md:3
Medium | External URL | http://192.168.1.100:7789 | docs/features/home-assistant.md:20
Medium | External URL | https://t.me/botfather | docs/features/voice-security.md:60
Medium | External URL | https://api.telegram.org/bot | docs/features/voice-security.md:67
Medium | External URL | https://developer.mozilla.org/en-US/docs/Web/API/AudioWorklet | docs/features/web-microphone.md:7
Medium | External URL | https://ffmpeg.org/download.html | docs/getting-started/installation.md:52
Medium | External URL | http://192.168.1.100:8123 | kiwi/integrations/homeassistant.py:30
Medium | External URL | https://api.elevenlabs.io/v1/text-to-speech | kiwi/tts/elevenlabs.py:93
Medium | External URL | https://api.runpod.ai/v2/ | kiwi/tts/runpod.py:44
Medium | External URL | http://www.w3.org/2000/svg | kiwi/web/index.html:11
Medium | External URL | https://console.runpod.io/serverless | runpod/README.md:27
Medium | External URL | https://api.runpod.ai/v2/YOUR_ENDPOINT_ID | runpod/README.md:172
Medium | External URL | http://www.apache.org/licenses/LICENSE-2.0 | runpod/qwen_tts/__init__.py:9
Medium | External URL | https://huggingface.co/papers/2103.17239 | runpod/qwen_tts/core/tokenizer_12hz/modeling_qwen3_tts_tokenizer_v2.py:394
Medium | External URL | https://huggingface.co/papers/2006.08195 | runpod/qwen_tts/core/tokenizer_12hz/modeling_qwen3_tts_tokenizer_v2.py:588
Medium | External URL | https://discuss.pytorch.org/t/how-to-generate-variable-length-mask/23397/3 | runpod/qwen_tts/core/tokenizer_25hz/modeling_qwen3_tts_tokenizer_v1.py:233
Medium | External URL | https://huggingface.co/papers/2005.07143 | runpod/qwen_tts/core/tokenizer_25hz/modeling_qwen3_tts_tokenizer_v1.py:345
Medium | External URL | https://arxiv.org/pdf/2107.03312.pdf | runpod/qwen_tts/core/tokenizer_25hz/vq/core_vq.py:336
Medium | External URL | https://arxiv.org/abs/2305.02765 | runpod/qwen_tts/core/tokenizer_25hz/vq/core_vq.py:479
Medium | External URL | https://colab.research.google.com/drive/1q1oe2zOyZp7UsB3jJiQ1IFn8z5YfjwEb | scripts/train_wake_word.py:17
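Many of the External URL findings above sit in Markdown documentation or point at LAN addresses such as 192.168.1.100. One way a reader might triage such a list is to label each URL by where its host points. The sketch below is an illustration only and not the scanner's logic; `classify_url` and its three labels are hypothetical, and treating `.local` names as private is an assumption.

```python
import ipaddress
from urllib.parse import urlparse


def classify_url(url: str) -> str:
    """Label a URL as 'loopback', 'private', or 'public' based on its host."""
    host = urlparse(url).hostname or ""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. Assumption: mDNS-style *.local names are LAN hosts.
        return "private" if host.endswith(".local") else "public"
    if ip.is_loopback:
        return "loopback"
    return "private" if ip.is_private else "public"


# A few URLs taken from the findings list above.
for url in (
    "http://192.168.1.100:7789",
    "http://homeassistant.local:8123",
    "https://api.elevenlabs.io/v1/text-to-speech",
):
    print(classify_url(url), url)
```

A label of "public" says nothing about whether the endpoint is contacted at runtime; the file location in each finding still has to be checked for that.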

File Tree

146 files · 1.4 MB · 40646 lines
Python 79f · 24900L, YAML 18f · 8512L, Markdown 35f · 3575L, CSS 2f · 1831L, JavaScript 3f · 1372L, HTML 1f · 282L, JSON 3f · 69L, Text 1f · 53L, TOML 2f · 43L, Config 1f · 9L
├─ 📁 custom_components
│ └─ 📁 kiwi_voice
│ ├─ 📁 translations
│ │ └─ 📋 en.json JSON 29L · 843 B
│ ├─ 🐍 __init__.py Python 185L · 6.0 KB
│ ├─ 🐍 button.py Python 108L · 3.6 KB
│ ├─ 🐍 config_flow.py Python 133L · 4.4 KB
│ ├─ 🐍 const.py Python 5L · 116 B
│ ├─ 🐍 coordinator.py Python 159L · 5.6 KB
│ ├─ 📋 manifest.json JSON 11L · 312 B
│ ├─ 🐍 sensor.py Python 165L · 5.6 KB
│ ├─ 📋 services.yaml YAML 36L · 994 B
│ ├─ 📋 strings.json JSON 29L · 843 B
│ ├─ 🐍 switch.py Python 75L · 2.7 KB
│ └─ 🐍 tts.py Python 89L · 2.8 KB
├─ 📁 docs
│ ├─ 📁 api
│ │ ├─ 📝 rest.md Markdown 268L · 3.9 KB
│ │ └─ 📝 websocket.md Markdown 118L · 3.0 KB
│ ├─ 📁 assets
│ │ └─ 📦 favicon.svg 115 B
│ ├─ 📁 deployment
│ │ ├─ 📝 docker.md Markdown 82L · 1.6 KB
│ │ ├─ 📝 gpu.md Markdown 76L · 1.6 KB
│ │ ├─ 📝 local.md Markdown 76L · 1.6 KB
│ │ ├─ 📝 reverse-proxy.md Markdown 47L · 1.2 KB
│ │ └─ 📝 systemd.md Markdown 58L · 1.1 KB
│ ├─ 📁 development
│ │ ├─ 📝 architecture.md Markdown 81L · 3.4 KB
│ │ ├─ 📝 code-patterns.md Markdown 129L · 2.4 KB
│ │ └─ 📝 contributing.md Markdown 86L · 2.8 KB
│ ├─ 📁 features
│ │ ├─ 📝 home-assistant.md Markdown 82L · 2.4 KB
│ │ ├─ 📝 multilanguage.md Markdown 113L · 3.3 KB
│ │ ├─ 📝 souls.md Markdown 95L · 2.5 KB
│ │ ├─ 📝 speaker-id.md Markdown 70L · 2.4 KB
│ │ ├─ 📝 streaming-tts.md Markdown 42L · 1.7 KB
│ │ ├─ 📝 stt-engines.md Markdown 94L · 2.7 KB
│ │ ├─ 📝 tts-providers.md Markdown 122L · 2.6 KB
│ │ ├─ 📝 voice-security.md Markdown 91L · 3.2 KB
│ │ ├─ 📝 wake-word.md Markdown 91L · 2.7 KB
│ │ ├─ 📝 web-dashboard.md Markdown 47L · 1.8 KB
│ │ └─ 📝 web-microphone.md Markdown 47L · 1.7 KB
│ ├─ 📁 getting-started
│ │ ├─ 📝 configuration.md Markdown 170L · 4.0 KB
│ │ ├─ 📝 first-run.md Markdown 81L · 2.2 KB
│ │ └─ 📝 installation.md Markdown 81L · 1.7 KB
│ ├─ 📁 stylesheets
│ │ └─ 📄 extra.css CSS 230L · 4.8 KB
│ └─ 📝 index.md Markdown 130L · 4.7 KB
├─ 📁 kiwi
│ ├─ 📁 api
│ │ ├─ 🐍 __init__.py Python 2L · 83 B
│ │ ├─ 🐍 audio_bridge.py Python 259L · 8.9 KB
│ │ └─ 🐍 server.py Python 952L · 39.5 KB
│ ├─ 📁 integrations
│ │ ├─ 🐍 __init__.py Python 1L · 54 B
│ │ └─ 🐍 homeassistant.py Python 230L · 8.7 KB
│ ├─ 📁 locales
│ │ ├─ 🐍 __init__.py Python 0 B
│ │ ├─ 📋 ar.yaml YAML 533L · 17.3 KB
│ │ ├─ 📋 de.yaml YAML 541L · 15.8 KB
│ │ ├─ 📋 en.yaml YAML 557L · 14.7 KB
│ │ ├─ 📋 es.yaml YAML 535L · 15.2 KB
│ │ ├─ 📋 fr.yaml YAML 541L · 15.7 KB
│ │ ├─ 📋 hi.yaml YAML 533L · 23.6 KB
│ │ ├─ 📋 id.yaml YAML 535L · 14.6 KB
│ │ ├─ 📋 it.yaml YAML 539L · 15.5 KB
│ │ ├─ 📋 ja.yaml YAML 525L · 17.5 KB
│ │ ├─ 📋 ko.yaml YAML 524L · 15.4 KB
│ │ ├─ 📋 pl.yaml YAML 535L · 15.1 KB
│ │ ├─ 📋 pt.yaml YAML 539L · 15.2 KB
│ │ ├─ 📋 ru.yaml YAML 591L · 21.9 KB
│ │ ├─ 📋 tr.yaml YAML 530L · 14.7 KB
│ │ └─ 📋 zh.yaml YAML 526L · 14.0 KB
│ ├─ 📁 mixins
│ │ ├─ 🐍 __init__.py Python 15L · 478 B
│ │ ├─ 🐍 audio_playback.py Python 331L · 13.4 KB
│ │ ├─ 🐍 dialogue_pipeline.py Python 699L · 32.2 KB
│ │ ├─ 🐍 llm_callbacks.py Python 247L · 10.4 KB
│ │ ├─ 🐍 stream_watchdog.py Python 292L · 12.5 KB
│ │ └─ 🐍 tts_speech.py Python 585L · 24.0 KB
│ ├─ 📁 souls
│ │ ├─ 📝 comedian.md Markdown 29L · 1.2 KB
│ │ ├─ 📝 hype-person.md Markdown 27L · 1.2 KB
│ │ ├─ 📝 mindful-companion.md Markdown 37L · 1.7 KB
│ │ ├─ 📝 siren.md Markdown 33L · 2.0 KB
│ │ └─ 📝 storyteller.md Markdown 29L · 1.3 KB
│ ├─ 📁 stt
│ │ ├─ 🐍 __init__.py Python 1L · 52 B
│ │ ├─ 🐍 elevenlabs.py Python 196L · 6.0 KB
│ │ └─ 🐍 mlx_whisper.py Python 139L · 4.0 KB
│ ├─ 📁 tts
│ │ ├─ 🐍 __init__.py Python 2L · 110 B
│ │ ├─ 🐍 base.py Python 138L · 4.4 KB
│ │ ├─ 🐍 elevenlabs_ws.py Python 710L · 27.0 KB
│ │ ├─ 🐍 elevenlabs.py Python 302L · 11.2 KB
│ │ ├─ 🐍 kokoro.py Python 193L · 5.9 KB
│ │ ├─ 🐍 piper.py Python 143L · 4.9 KB
│ │ ├─ 🐍 qwen_local.py Python 197L · 7.1 KB
│ │ ├─ 🐍 runpod.py Python 265L · 8.2 KB
│ │ └─ 🐍 streaming.py Python 356L · 14.1 KB
│ ├─ 📁 web
│ │ ├─ 📁 static
│ │ │ ├─ 📜 app.js JavaScript 1000L · 31.7 KB
│ │ │ ├─ 📜 audio-client.js JavaScript 288L · 8.0 KB
│ │ │ ├─ 📜 audio-worklet.js JavaScript 84L · 2.3 KB
│ │ │ └─ 📄 style.css CSS 1601L · 34.2 KB
│ │ └─ 📄 index.html HTML 282L · 15.0 KB
│ ├─ 🐍 __init__.py Python 7L · 247 B
│ ├─ 🐍 __main__.py Python 4L · 78 B
│ ├─ 🐍 config_loader.py Python 720L · 34.4 KB
│ ├─ 🐍 event_bus.py Python 466L · 13.6 KB
│ ├─ 🐍 hardware_aec.py Python 466L · 16.2 KB
│ ├─ 🐍 i18n.py Python 71L · 1.9 KB
│ ├─ 🐍 listener.py Python 2665L · 122.8 KB
│ ├─ 🐍 openclaw_cli.py Python 254L · 9.6 KB
│ ├─ 🐍 openclaw_ws.py Python 1648L · 64.8 KB
│ ├─ 🐍 service.py Python 748L · 29.8 KB
│ ├─ 🐍 soul_manager.py Python 258L · 9.6 KB
│ ├─ 🐍 speaker_id.py Python 451L · 15.5 KB
│ ├─ 🐍 speaker_manager.py Python 625L · 23.3 KB
│ ├─ 🐍 state_machine.py Python 11L · 437 B
│ ├─ 🐍 text_processing.py Python 242L · 9.3 KB
│ ├─ 🐍 unified_vad.py Python 398L · 12.9 KB
│ ├─ 🐍 utils.py Python 148L · 4.7 KB
│ ├─ 🐍 voice_security.py Python 624L · 22.6 KB
│ └─ 🐍 wake_word.py Python 104L · 3.1 KB
├─ 📁 runpod
│ ├─ 📁 qwen_tts
│ │ ├─ 📁 cli
│ │ │ └─ 🐍 demo.py Python 634L · 28.5 KB
│ │ ├─ 📁 core
│ │ │ ├─ 📁 tokenizer_12hz
│ │ │ │ ├─ 🔑 configuration_qwen3_tts_tokenizer_v2.py Python 172L · 7.8 KB
│ │ │ │ └─ 🔑 modeling_qwen3_tts_tokenizer_v2.py Python 1025L · 39.5 KB
│ │ │ ├─ 📁 tokenizer_25hz
│ │ │ │ ├─ 📁 vq
│ │ │ │ │ ├─ 🐍 core_vq.py Python 522L · 19.6 KB
│ │ │ │ │ ├─ 🐍 speech_vq.py Python 356L · 14.5 KB
│ │ │ │ │ └─ 🐍 whisper_encoder.py Python 406L · 14.0 KB
│ │ │ │ ├─ 🔑 configuration_qwen3_tts_tokenizer_v1.py Python 332L · 14.2 KB
│ │ │ │ └─ 🔑 modeling_qwen3_tts_tokenizer_v1.py Python 1528L · 55.1 KB
│ │ │ └─ 🐍 __init__.py Python 18L · 990 B
│ │ ├─ 📁 inference
│ │ │ ├─ 🐍 qwen3_tts_model.py Python 877L · 36.3 KB
│ │ │ └─ 🔑 qwen3_tts_tokenizer.py Python 410L · 15.3 KB
│ │ ├─ 🐍 __init__.py Python 23L · 839 B
│ │ └─ 🐍 __main__.py Python 24L · 800 B
│ ├─ 🐍 download_model.py Python 22L · 734 B
│ ├─ 🐍 handler.py Python 415L · 13.0 KB
│ └─ 📝 README.md Markdown 238L · 6.1 KB
├─ 📁 scripts
│ ├─ 🐍 generate_cues.py Python 94L · 2.8 KB
│ ├─ 🐍 noise_monitor.py Python 85L · 2.6 KB
│ ├─ 🐍 send_voice.py Python 33L · 916 B
│ ├─ 🐍 telegram_voice.py Python 170L · 5.1 KB
│ └─ 🐍 train_wake_word.py Python 112L · 3.1 KB
├─ 📁 tests
│ ├─ 🐍 __init__.py Python 0 B
│ ├─ 🐍 test_api.py Python 27L · 704 B
│ ├─ 🐍 test_audio_bridge.py Python 106L · 3.0 KB
│ ├─ 🐍 test_config.py Python 58L · 1.9 KB
│ ├─ 🐍 test_i18n.py Python 78L · 2.2 KB
│ ├─ 🐍 test_smoke.py Python 80L · 2.6 KB
│ ├─ 🐍 test_text_processing.py Python 137L · 4.2 KB
│ └─ 🐍 test_voice_security.py Python 72L · 2.5 KB
├─ 📝 CLAUDE.md Markdown 228L · 8.8 KB
├─ 📋 config.yaml YAML 283L · 9.9 KB
├─ 📋 mkdocs.yml YAML 109L · 2.9 KB
├─ 📄 mypy.ini Config 9L · 217 B
├─ 📄 pyproject.toml TOML 26L · 830 B
├─ 📝 README.md Markdown 441L · 17.0 KB
├─ 📄 requirements.txt Text 53L · 947 B
├─ 📄 ruff.toml TOML 17L · 489 B
├─ 📝 SKILL.md Markdown 124L · 3.3 KB
└─ 📝 SOUL.md Markdown 12L · 851 B

Dependencies 7 items

Package | Version | Source | Known Vulns | Notes
faster-whisper | >=1.0.0 | pip | No | STT engine, industry standard
sounddevice | >=0.4.6 | pip | No | Audio input/output
pyannote.audio | >=3.1.0 | pip | No | Speaker identification
requests | >=2.28.0 | pip | No | HTTP client for TTS APIs
websocket-client | >=1.6.0 | pip | No | WebSocket protocol
aiohttp | >=3.9.0 | pip | No | REST API server
cryptography | >=41.0.0 | pip | No | Ed25519 key handling

Security Positives

✓ Two-layer security architecture: pre-LLM regex filter + post-LLM shell execution approval
✓ Speaker identification with priority levels (OWNER > FRIEND > GUEST > BLOCKED)
✓ Telegram approval workflow with inline keyboard buttons for dangerous commands
✓ Voice security patterns defined per-language in locale YAML files
✓ API authentication with scope-based access control (read, control, tts, speakers, admin)
✓ Ed25519 device identity for WebSocket authentication
✓ Comprehensive crash protection with thread-safe logging
✓ Voice profiles stored as JSON with embeddings, not raw audio
✓ No credential harvesting or exfiltration patterns
✓ Well-documented i18n system with 15 supported languages
✓ OpenClaw session isolation for NSFW soul routing
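
The first three positives describe a two-layer design: a regex filter before the LLM, speaker priority levels, and an approval step before dangerous shell commands run. A minimal sketch of how those pieces could fit together is shown below. It is not Kiwi Voice's implementation: the patterns, the function names, the owner-only rule, and the `approve` callback (standing in for the Telegram approval) are all assumptions for illustration.

```python
import re
from enum import IntEnum
from typing import Callable


class Priority(IntEnum):
    # Ordering taken from the report: OWNER > FRIEND > GUEST > BLOCKED.
    BLOCKED = 0
    GUEST = 1
    FRIEND = 2
    OWNER = 3


# Hypothetical patterns; per the report, the real ones are defined
# per-language in the locale YAML files.
DANGEROUS_SPEECH = [re.compile(r"\bdelete (all|everything)\b", re.I)]
DANGEROUS_SHELL = [re.compile(r"\brm\s+-rf\s+/")]


def pre_llm_filter(transcript: str, speaker: Priority) -> bool:
    """Layer 1: decide whether a transcript may be forwarded to the LLM."""
    if speaker == Priority.BLOCKED:
        return False
    flagged = any(p.search(transcript) for p in DANGEROUS_SPEECH)
    # Assumption: only the owner may issue requests matching a dangerous pattern.
    return not flagged or speaker == Priority.OWNER


def post_llm_gate(command: str, approve: Callable[[str], bool]) -> bool:
    """Layer 2: require out-of-band approval before a dangerous command runs."""
    if any(p.search(command) for p in DANGEROUS_SHELL):
        return approve(command)
    return True
```

In this sketch a guest saying "delete everything" is stopped at layer 1, while an owner's request passes layer 1 but still needs `approve` to return True at layer 2 if the resulting command matches a dangerous shell pattern.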