kiwi-voice Security Scan Report: Trusted | ClawSafe

5/100

kiwi-voice

Multi-language real-time voice assistant with OpenClaw AI backend integration, speaker identification, and voice security controls

Kiwi Voice is a legitimate multi-language voice assistant with robust security controls including Telegram approval for dangerous commands, speaker identification with priority levels, and two-layer command filtering. No malicious behavior detected.

Skill name: kiwi-voice

Analysis time: 75.3s

Engine: pi

✓

Safe to install

The skill is safe to use. No security concerns require action. Ensure .env file with API keys is kept secure and not committed to version control.
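One way to act on the `.env` recommendation is to check that the file is covered by `.gitignore`. The sketch below is illustrative only and is not part of kiwi-voice or ClawSafe: `env_file_is_ignored` is a hypothetical helper, and it recognizes only a few common rule spellings rather than full gitignore semantics (negation, nested paths, and arbitrary globs are not handled).

```python
def env_file_is_ignored(gitignore_text: str) -> bool:
    """Return True if a .gitignore body contains a simple rule covering `.env`.

    Assumption: only these exact spellings count as covering rules.
    """
    covering_rules = {".env", "/.env", ".env*", "*.env"}
    for raw_line in gitignore_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line in covering_rules:
            return True
    return False


if __name__ == "__main__":
    sample = "logs/\nvoice_profiles/\n.env\n"
    print(env_file_is_ignored(sample))
```

In practice the text would come from reading the repository's own `.gitignore`; `git check-ignore .env` gives an authoritative answer when Git is available.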

Resource type	Declared permission	Inferred permission	Status	Evidence
File system	`READ`	`READ`	✓ Consistent	Voice profiles stored in voice_profiles/ directory, logs in logs/ directory - bo…
Network access	`READ`	`READ`	✓ Consistent	WebSocket to 127.0.0.1:18789 (local), REST API on port 7789, Telegram/HA/TTS ext…
Command execution	`NONE`	`NONE`	—	No shell execution from user input; subprocess only for legitimate CLI tools (pa…
Environment variables	`READ`	`READ`	✓ Consistent	Reads API keys from .env (KIWI_ELEVENLABS_API_KEY, KIWI_TELEGRAM_BOT_TOKEN, etc.…
Skill invocation	`NONE`	`NONE`	—	No skill chaining detected
Clipboard	`NONE`	`NONE`	—	No clipboard access found
Browser	`NONE`	`NONE`	—	No browser automation detected
Database	`NONE`	`NONE`	—	No database access detected
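The status column in the table above appears to follow a simple rule: a check mark when the declared and inferred permissions agree, and a dash when neither is present. A minimal sketch of that comparison follows. The function name and the mismatch label are assumptions for illustration, since the report shows no mismatching row and does not describe how ClawSafe computes the column.

```python
NOT_APPLICABLE = "\u2014"        # the dash shown when both values are NONE
CONSISTENT = "\u2713 Consistent"  # the check mark shown when both values agree
MISMATCH = "\u2717 Mismatch"      # hypothetical label; no such row appears in the report


def permission_status(declared: str, inferred: str) -> str:
    """Compare a declared permission with the one inferred from the code."""
    if declared == "NONE" and inferred == "NONE":
        return NOT_APPLICABLE
    if declared == inferred:
        return CONSISTENT
    return MISMATCH


if __name__ == "__main__":
    rows = [("File system", "READ", "READ"), ("Clipboard", "NONE", "NONE")]
    for resource, declared, inferred in rows:
        print(resource, permission_status(declared, inferred))
```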

1 critical · 32 findings

💀

Critical · Dangerous command: dangerous shell command

rm -rf /

docs/features/voice-security.md:17
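The finding above is reported as a matched string plus a `path:line` location. A pattern scanner of roughly the following shape would produce that kind of entry. This is a sketch under assumptions, not ClawSafe's actual rule set: the `RM_ROOT` regex, the `Finding` type, and `scan_text` are all hypothetical. A match of this kind shows only that the string appears on that line, not that it is ever executed.

```python
import re
from typing import Iterator, NamedTuple


class Finding(NamedTuple):
    location: str  # "path:line", as in the report
    match: str     # the matched text


# Hypothetical rule: forced recursive delete aimed at the root directory.
# The lookahead keeps "rm -rf /tmp/x" from matching.
RM_ROOT = re.compile(r"\brm\s+-(?:rf|fr)\s+/(?=\s|$|[`\"'])")


def scan_text(path: str, text: str) -> Iterator[Finding]:
    """Yield one Finding per line that contains the dangerous pattern."""
    for lineno, line in enumerate(text.splitlines(), start=1):
        hit = RM_ROOT.search(line)
        if hit:
            yield Finding(f"{path}:{lineno}", hit.group(0))


if __name__ == "__main__":
    doc = "intro\nBlocked example: `rm -rf /`\nrm -rf /tmp/cache\n"
    for finding in scan_text("docs/example.md", doc):
        print(finding.location, finding.match)
```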

Severity	Type	URL	Location
Medium	External URL	https://kiwi-voice.com	README.md:2
Medium	External URL	https://kiwi-voice.com/assets/og-image.svg	README.md:3
Medium	External URL	https://img.shields.io/badge/license-MIT-blue.svg	README.md:14
Medium	External URL	https://www.python.org/downloads/	README.md:15
Medium	External URL	https://img.shields.io/badge/python-3.10%2B-blue.svg	README.md:15
Medium	External URL	https://img.shields.io/badge/backend-OpenClaw-orange.svg	README.md:16
Medium	External URL	https://docs.kiwi-voice.com	README.md:19
Medium	External URL	https://docs.openclaw.ai	README.md:21
Medium	External URL	http://homeassistant.local:8123	README.md:429
Medium	External URL	https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html	docs/deployment/docker.md:70
Medium	External URL	https://download.pytorch.org/whl/cu121	docs/deployment/gpu.md:10
Medium	External URL	https://www.home-assistant.io/	docs/features/home-assistant.md:3
Medium	External URL	http://192.168.1.100:7789	docs/features/home-assistant.md:20
Medium	External URL	https://t.me/botfather	docs/features/voice-security.md:60
Medium	External URL	https://api.telegram.org/bot	docs/features/voice-security.md:67
Medium	External URL	https://developer.mozilla.org/en-US/docs/Web/API/AudioWorklet	docs/features/web-microphone.md:7
Medium	External URL	https://ffmpeg.org/download.html	docs/getting-started/installation.md:52
Medium	External URL	http://192.168.1.100:8123	kiwi/integrations/homeassistant.py:30
Medium	External URL	https://api.elevenlabs.io/v1/text-to-speech	kiwi/tts/elevenlabs.py:93
Medium	External URL	https://api.runpod.ai/v2/	kiwi/tts/runpod.py:44
Medium	External URL	http://www.w3.org/2000/svg	kiwi/web/index.html:11
Medium	External URL	https://console.runpod.io/serverless	runpod/README.md:27
Medium	External URL	https://api.runpod.ai/v2/YOUR_ENDPOINT_ID	runpod/README.md:172
Medium	External URL	http://www.apache.org/licenses/LICENSE-2.0	runpod/qwen_tts/__init__.py:9
Medium	External URL	https://huggingface.co/papers/2103.17239	runpod/qwen_tts/core/tokenizer_12hz/modeling_qwen3_tts_tokenizer_v2.py:394
Medium	External URL	https://huggingface.co/papers/2006.08195	runpod/qwen_tts/core/tokenizer_12hz/modeling_qwen3_tts_tokenizer_v2.py:588
Medium	External URL	https://discuss.pytorch.org/t/how-to-generate-variable-length-mask/23397/3	runpod/qwen_tts/core/tokenizer_25hz/modeling_qwen3_tts_tokenizer_v1.py:233
Medium	External URL	https://huggingface.co/papers/2005.07143	runpod/qwen_tts/core/tokenizer_25hz/modeling_qwen3_tts_tokenizer_v1.py:345
Medium	External URL	https://arxiv.org/pdf/2107.03312.pdf	runpod/qwen_tts/core/tokenizer_25hz/vq/core_vq.py:336
Medium	External URL	https://arxiv.org/abs/2305.02765	runpod/qwen_tts/core/tokenizer_25hz/vq/core_vq.py:479
Medium	External URL	https://colab.research.google.com/drive/1q1oe2zOyZp7UsB3jJiQ1IFn8z5YfjwEb	scripts/train_wake_word.py:17
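The external URL findings above mix public services, LAN addresses such as `http://192.168.1.100:7789`, and an mDNS name (`homeassistant.local`). When triaging a list like this, it can help to separate local targets from public ones. The sketch below uses only the Python standard library; the category names and the `classify_host` helper are assumptions for illustration and are not something the report itself produces.

```python
import ipaddress
from urllib.parse import urlsplit


def classify_host(url: str) -> str:
    """Roughly classify where a URL points: loopback, LAN, or public."""
    host = urlsplit(url).hostname or ""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so fall back to name-based heuristics.
        if host == "localhost":
            return "loopback"
        return "lan-name" if host.endswith(".local") else "public-name"
    if addr.is_loopback:  # checked first: loopback addresses are also "private"
        return "loopback"
    if addr.is_private:
        return "private"
    return "public-ip"


if __name__ == "__main__":
    for url in (
        "http://192.168.1.100:7789",
        "http://homeassistant.local:8123",
        "https://api.elevenlabs.io/v1/text-to-speech",
    ):
        print(url, classify_host(url))
```

Name-based results are heuristics only: a public-looking hostname can still resolve to a private address, so this is a triage aid rather than a security control.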

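Directory structure

146 files · 1.4 MB · 40,646 lines

Python 79 files · 24,900 lines | YAML 18 files · 8,512 lines | Markdown 35 files · 3,575 lines | CSS 2 files · 1,831 lines | JavaScript 3 files · 1,372 lines | HTML 1 file · 282 lines | JSON 3 files · 69 lines | Text 1 file · 53 lines | TOML 2 files · 43 lines | Config 1 file · 9 lines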

├─ ▾ 📁 custom_components

│ └─ ▾ 📁 kiwi_voice

│ ├─ ▾ 📁 translations

│ │ └─ 📋 en.json JSON 29L · 843 B

│ ├─ 🐍 __init__.py Python 185L · 6.0 KB

│ ├─ 🐍 button.py Python 108L · 3.6 KB

│ ├─ 🐍 config_flow.py Python 133L · 4.4 KB

│ ├─ 🐍 const.py Python 5L · 116 B

│ ├─ 🐍 coordinator.py Python 159L · 5.6 KB

│ ├─ 📋 manifest.json JSON 11L · 312 B

│ ├─ 🐍 sensor.py Python 165L · 5.6 KB

│ ├─ 📋 services.yaml YAML 36L · 994 B

│ ├─ 📋 strings.json JSON 29L · 843 B

│ ├─ 🐍 switch.py Python 75L · 2.7 KB

│ └─ 🐍 tts.py Python 89L · 2.8 KB

├─ ▾ 📁 docs

│ ├─ ▾ 📁 api

│ │ ├─ 📝 rest.md Markdown 268L · 3.9 KB

│ │ └─ 📝 websocket.md Markdown 118L · 3.0 KB

│ ├─ ▾ 📁 assets

│ │ └─ 📦 favicon.svg 115 B

│ ├─ ▾ 📁 deployment

│ │ ├─ 📝 docker.md Markdown 82L · 1.6 KB

│ │ ├─ 📝 gpu.md Markdown 76L · 1.6 KB

│ │ ├─ 📝 local.md Markdown 76L · 1.6 KB

│ │ ├─ 📝 reverse-proxy.md Markdown 47L · 1.2 KB

│ │ └─ 📝 systemd.md Markdown 58L · 1.1 KB

│ ├─ ▾ 📁 development

│ │ ├─ 📝 architecture.md Markdown 81L · 3.4 KB

│ │ ├─ 📝 code-patterns.md Markdown 129L · 2.4 KB

│ │ └─ 📝 contributing.md Markdown 86L · 2.8 KB

│ ├─ ▾ 📁 features

│ │ ├─ 📝 home-assistant.md Markdown 82L · 2.4 KB

│ │ ├─ 📝 multilanguage.md Markdown 113L · 3.3 KB

│ │ ├─ 📝 souls.md Markdown 95L · 2.5 KB

│ │ ├─ 📝 speaker-id.md Markdown 70L · 2.4 KB

│ │ ├─ 📝 streaming-tts.md Markdown 42L · 1.7 KB

│ │ ├─ 📝 stt-engines.md Markdown 94L · 2.7 KB

│ │ ├─ 📝 tts-providers.md Markdown 122L · 2.6 KB

│ │ ├─ 📝 voice-security.md Markdown 91L · 3.2 KB

│ │ ├─ 📝 wake-word.md Markdown 91L · 2.7 KB

│ │ ├─ 📝 web-dashboard.md Markdown 47L · 1.8 KB

│ │ └─ 📝 web-microphone.md Markdown 47L · 1.7 KB

│ ├─ ▾ 📁 getting-started

│ │ ├─ 📝 configuration.md Markdown 170L · 4.0 KB

│ │ ├─ 📝 first-run.md Markdown 81L · 2.2 KB

│ │ └─ 📝 installation.md Markdown 81L · 1.7 KB

│ ├─ ▾ 📁 stylesheets

│ │ └─ 📄 extra.css CSS 230L · 4.8 KB

│ └─ 📝 index.md Markdown 130L · 4.7 KB

├─ ▾ 📁 kiwi

│ ├─ ▾ 📁 api

│ │ ├─ 🐍 __init__.py Python 2L · 83 B

│ │ ├─ 🐍 audio_bridge.py Python 259L · 8.9 KB

│ │ └─ 🐍 server.py Python 952L · 39.5 KB

│ ├─ ▾ 📁 integrations

│ │ ├─ 🐍 __init__.py Python 1L · 54 B

│ │ └─ 🐍 homeassistant.py Python 230L · 8.7 KB

│ ├─ ▾ 📁 locales

│ │ ├─ 🐍 __init__.py Python 0 B

│ │ ├─ 📋 ar.yaml YAML 533L · 17.3 KB

│ │ ├─ 📋 de.yaml YAML 541L · 15.8 KB

│ │ ├─ 📋 en.yaml YAML 557L · 14.7 KB

│ │ ├─ 📋 es.yaml YAML 535L · 15.2 KB

│ │ ├─ 📋 fr.yaml YAML 541L · 15.7 KB

│ │ ├─ 📋 hi.yaml YAML 533L · 23.6 KB

│ │ ├─ 📋 id.yaml YAML 535L · 14.6 KB

│ │ ├─ 📋 it.yaml YAML 539L · 15.5 KB

│ │ ├─ 📋 ja.yaml YAML 525L · 17.5 KB

│ │ ├─ 📋 ko.yaml YAML 524L · 15.4 KB

│ │ ├─ 📋 pl.yaml YAML 535L · 15.1 KB

│ │ ├─ 📋 pt.yaml YAML 539L · 15.2 KB

│ │ ├─ 📋 ru.yaml YAML 591L · 21.9 KB

│ │ ├─ 📋 tr.yaml YAML 530L · 14.7 KB

│ │ └─ 📋 zh.yaml YAML 526L · 14.0 KB

│ ├─ ▾ 📁 mixins

│ │ ├─ 🐍 __init__.py Python 15L · 478 B

│ │ ├─ 🐍 audio_playback.py Python 331L · 13.4 KB

│ │ ├─ 🐍 dialogue_pipeline.py Python 699L · 32.2 KB

│ │ ├─ 🐍 llm_callbacks.py Python 247L · 10.4 KB

│ │ ├─ 🐍 stream_watchdog.py Python 292L · 12.5 KB

│ │ └─ 🐍 tts_speech.py Python 585L · 24.0 KB

│ ├─ ▾ 📁 souls

│ │ ├─ 📝 comedian.md Markdown 29L · 1.2 KB

│ │ ├─ 📝 hype-person.md Markdown 27L · 1.2 KB

│ │ ├─ 📝 mindful-companion.md Markdown 37L · 1.7 KB

│ │ ├─ 📝 siren.md Markdown 33L · 2.0 KB

│ │ └─ 📝 storyteller.md Markdown 29L · 1.3 KB

│ ├─ ▾ 📁 stt

│ │ ├─ 🐍 __init__.py Python 1L · 52 B

│ │ ├─ 🐍 elevenlabs.py Python 196L · 6.0 KB

│ │ └─ 🐍 mlx_whisper.py Python 139L · 4.0 KB

│ ├─ ▾ 📁 tts

│ │ ├─ 🐍 __init__.py Python 2L · 110 B

│ │ ├─ 🐍 base.py Python 138L · 4.4 KB

│ │ ├─ 🐍 elevenlabs_ws.py Python 710L · 27.0 KB

│ │ ├─ 🐍 elevenlabs.py Python 302L · 11.2 KB

│ │ ├─ 🐍 kokoro.py Python 193L · 5.9 KB

│ │ ├─ 🐍 piper.py Python 143L · 4.9 KB

│ │ ├─ 🐍 qwen_local.py Python 197L · 7.1 KB

│ │ ├─ 🐍 runpod.py Python 265L · 8.2 KB

│ │ └─ 🐍 streaming.py Python 356L · 14.1 KB

│ ├─ ▾ 📁 web

│ │ ├─ ▾ 📁 static

│ │ │ ├─ 📜 app.js JavaScript 1000L · 31.7 KB

│ │ │ ├─ 📜 audio-client.js JavaScript 288L · 8.0 KB

│ │ │ ├─ 📜 audio-worklet.js JavaScript 84L · 2.3 KB

│ │ │ └─ 📄 style.css CSS 1601L · 34.2 KB

│ │ └─ 📄 index.html HTML 282L · 15.0 KB

│ ├─ 🐍 __init__.py Python 7L · 247 B

│ ├─ 🐍 __main__.py Python 4L · 78 B

│ ├─ 🐍 config_loader.py Python 720L · 34.4 KB

│ ├─ 🐍 event_bus.py Python 466L · 13.6 KB

│ ├─ 🐍 hardware_aec.py Python 466L · 16.2 KB

│ ├─ 🐍 i18n.py Python 71L · 1.9 KB

│ ├─ 🐍 listener.py Python 2665L · 122.8 KB

│ ├─ 🐍 openclaw_cli.py Python 254L · 9.6 KB

│ ├─ 🐍 openclaw_ws.py Python 1648L · 64.8 KB

│ ├─ 🐍 service.py Python 748L · 29.8 KB

│ ├─ 🐍 soul_manager.py Python 258L · 9.6 KB

│ ├─ 🐍 speaker_id.py Python 451L · 15.5 KB

│ ├─ 🐍 speaker_manager.py Python 625L · 23.3 KB

│ ├─ 🐍 state_machine.py Python 11L · 437 B

│ ├─ 🐍 text_processing.py Python 242L · 9.3 KB

│ ├─ 🐍 unified_vad.py Python 398L · 12.9 KB

│ ├─ 🐍 utils.py Python 148L · 4.7 KB

│ ├─ 🐍 voice_security.py Python 624L · 22.6 KB

│ └─ 🐍 wake_word.py Python 104L · 3.1 KB

├─ ▾ 📁 runpod

│ ├─ ▾ 📁 qwen_tts

│ │ ├─ ▾ 📁 cli

│ │ │ └─ 🐍 demo.py Python 634L · 28.5 KB

│ │ ├─ ▾ 📁 core

│ │ │ ├─ ▾ 📁 tokenizer_12hz

│ │ │ │ ├─ 🔑 configuration_qwen3_tts_tokenizer_v2.py ⚠ Python 172L · 7.8 KB

│ │ │ │ └─ 🔑 modeling_qwen3_tts_tokenizer_v2.py ⚠ Python 1025L · 39.5 KB

│ │ │ ├─ ▾ 📁 tokenizer_25hz

│ │ │ │ ├─ ▾ 📁 vq

│ │ │ │ │ ├─ 🐍 core_vq.py Python 522L · 19.6 KB

│ │ │ │ │ ├─ 🐍 speech_vq.py Python 356L · 14.5 KB

│ │ │ │ │ └─ 🐍 whisper_encoder.py Python 406L · 14.0 KB

│ │ │ │ ├─ 🔑 configuration_qwen3_tts_tokenizer_v1.py ⚠ Python 332L · 14.2 KB

│ │ │ │ └─ 🔑 modeling_qwen3_tts_tokenizer_v1.py ⚠ Python 1528L · 55.1 KB

│ │ │ └─ 🐍 __init__.py Python 18L · 990 B

│ │ ├─ ▾ 📁 inference

│ │ │ ├─ 🐍 qwen3_tts_model.py Python 877L · 36.3 KB

│ │ │ └─ 🔑 qwen3_tts_tokenizer.py ⚠ Python 410L · 15.3 KB

│ │ ├─ 🐍 __init__.py Python 23L · 839 B

│ │ └─ 🐍 __main__.py Python 24L · 800 B

│ ├─ 🐍 download_model.py Python 22L · 734 B

│ ├─ 🐍 handler.py Python 415L · 13.0 KB

│ └─ 📝 README.md Markdown 238L · 6.1 KB

├─ ▾ 📁 scripts

│ ├─ 🐍 generate_cues.py Python 94L · 2.8 KB

│ ├─ 🐍 noise_monitor.py Python 85L · 2.6 KB

│ ├─ 🐍 send_voice.py Python 33L · 916 B

│ ├─ 🐍 telegram_voice.py Python 170L · 5.1 KB

│ └─ 🐍 train_wake_word.py Python 112L · 3.1 KB

├─ ▾ 📁 tests

│ ├─ 🐍 __init__.py Python 0 B

│ ├─ 🐍 test_api.py Python 27L · 704 B

│ ├─ 🐍 test_audio_bridge.py Python 106L · 3.0 KB

│ ├─ 🐍 test_config.py Python 58L · 1.9 KB

│ ├─ 🐍 test_i18n.py Python 78L · 2.2 KB

│ ├─ 🐍 test_smoke.py Python 80L · 2.6 KB

│ ├─ 🐍 test_text_processing.py Python 137L · 4.2 KB

│ └─ 🐍 test_voice_security.py Python 72L · 2.5 KB

├─ 📝 CLAUDE.md Markdown 228L · 8.8 KB

├─ 📋 config.yaml YAML 283L · 9.9 KB

├─ 📋 mkdocs.yml YAML 109L · 2.9 KB

├─ 📄 mypy.ini Config 9L · 217 B

├─ 📄 pyproject.toml TOML 26L · 830 B

├─ 📝 README.md Markdown 441L · 17.0 KB

├─ 📄 requirements.txt Text 53L · 947 B

├─ 📄 ruff.toml TOML 17L · 489 B

├─ 📝 SKILL.md Markdown 124L · 3.3 KB

└─ 📝 SOUL.md Markdown 12L · 851 B

Dependency analysis: 7 items

Package	Version	Source	Known vulnerabilities	Notes
`faster-whisper`	`>=1.0.0`	pip	No	STT engine, industry standard
`sounddevice`	`>=0.4.6`	pip	No	Audio input/output
`pyannote.audio`	`>=3.1.0`	pip	No	Speaker identification
`requests`	`>=2.28.0`	pip	No	HTTP client for TTS APIs
`websocket-client`	`>=1.6.0`	pip	No	WebSocket protocol
`aiohttp`	`>=3.9.0`	pip	No	REST API server
`cryptography`	`>=41.0.0`	pip	No	Ed25519 key handling
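Every version constraint in the table above is a lower bound of the form `>=X.Y.Z`. As a rough illustration of what such a bound means, the sketch below compares a plain dotted version against one. `meets_minimum` is a hypothetical helper that handles only numeric dotted versions and the `>=` operator; real tooling such as pip handles the full version specifier syntax.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse a purely numeric dotted version such as '3.9.0'."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(installed: str, spec: str) -> bool:
    """Return True if `installed` satisfies a '>=X.Y.Z' lower bound."""
    if not spec.startswith(">="):
        raise ValueError(f"only '>=' bounds are supported, got {spec!r}")
    have = parse_version(installed)
    need = parse_version(spec[2:])
    # Pad with zeros so that '41.0' and '41.0.0' compare as equal.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


if __name__ == "__main__":
    print(meets_minimum("3.9.5", ">=3.9.0"))
    print(meets_minimum("2.27.1", ">=2.28.0"))
```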

Security highlights

✓ Two-layer security architecture: pre-LLM regex filter + post-LLM shell execution approval

✓ Speaker identification with priority levels (OWNER > FRIEND > GUEST > BLOCKED)

✓ Telegram approval workflow with inline keyboard buttons for dangerous commands

✓ Voice security patterns defined per-language in locale YAML files

✓ API authentication with scope-based access control (read, control, tts, speakers, admin)

✓ Ed25519 device identity for WebSocket authentication

✓ Comprehensive crash protection with thread-safe logging

✓ Voice profiles stored as JSON with embeddings, not raw audio

✓ No credential harvesting or exfiltration patterns

✓ Well-documented i18n system with 15 supported languages

✓ OpenClaw session isolation for NSFW soul routing
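Two of the highlights above, the speaker priority ordering and the two-layer command filtering, can be sketched together. This is a sketch under stated assumptions, not kiwi-voice's implementation. The patterns in `DANGEROUS_PATTERNS` are invented (the report says the real ones are defined per language in the locale YAML files), the `llm` and `approve` callables stand in for the OpenClaw backend and the Telegram approval step, and the way each layer reacts to a hit is a guess.

```python
import re
from enum import IntEnum
from typing import Callable, Optional


class Priority(IntEnum):
    """Speaker priority, ordered as OWNER > FRIEND > GUEST > BLOCKED."""
    BLOCKED = 0
    GUEST = 1
    FRIEND = 2
    OWNER = 3


# Hypothetical layer-1 patterns, for illustration only.
DANGEROUS_PATTERNS = [
    re.compile(r"\bdelete (all|every)\b", re.IGNORECASE),
    re.compile(r"\bformat (the )?disk\b", re.IGNORECASE),
]


def pre_llm_filter(utterance: str) -> bool:
    """Layer 1: return True if the raw utterance matches a dangerous pattern."""
    return any(p.search(utterance) for p in DANGEROUS_PATTERNS)


def run_pipeline(
    speaker: Priority,
    utterance: str,
    llm: Callable[[str], Optional[str]],
    approve: Callable[[str], bool],
) -> str:
    """Route one utterance through both layers and report the outcome."""
    if speaker == Priority.BLOCKED:
        return "rejected"
    if pre_llm_filter(utterance):
        return "filtered"  # assumption: flagged text never reaches the model
    command = llm(utterance)  # a proposed shell command, or None for plain answers
    if command is None:
        return "answered"
    # Layer 2: a proposed shell command needs explicit approval before running.
    return "executed" if approve(command) else "denied"


if __name__ == "__main__":
    outcome = run_pipeline(
        Priority.OWNER, "list my files", llm=lambda u: "ls", approve=lambda c: True
    )
    print(outcome)
```

Using `IntEnum` makes the ordering directly comparable, so a check such as `speaker >= Priority.FRIEND` could gate features by priority level.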