Scan Report
5/100
kiwi-voice
Multi-language real-time voice assistant with OpenClaw AI backend integration, speaker identification, and voice security controls
Kiwi Voice is a legitimate multi-language voice assistant with robust security controls including Telegram approval for dangerous commands, speaker identification with priority levels, and two-layer command filtering. No malicious behavior detected.
Safe to install
The skill is safe to use, and no security concerns require action. Keep the .env file containing API keys secure and out of version control.
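One way to check the version-control recommendation is to confirm that .gitignore lists the .env file. The sketch below is a rough heuristic, not a full implementation of gitignore matching: it only recognizes a literal `.env` or `/.env` entry and a later `!.env` negation. The function name `ignores_env_file` is hypothetical and does not come from the kiwi-voice repository.

```python
def ignores_env_file(gitignore_text: str, name: str = ".env") -> bool:
    """Roughly check whether a .gitignore body ignores the file `name`.

    Assumption: only literal entries are handled (no globs, no nested
    directories). The last matching rule wins, as in real gitignore files.
    """
    ignored = False
    for raw in gitignore_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        if line.lstrip("/") == name:
            ignored = True
        elif line in ("!" + name, "!/" + name):
            # A negation entry re-includes the file.
            ignored = False
    return ignored
```

For example, `ignores_env_file(open(".gitignore").read())` could be run from the repository root before the first commit.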
| Resource | Declared | Inferred | Status | Evidence |
|---|---|---|---|---|
| Filesystem | READ | READ | ✓ Aligned | Voice profiles stored in voice_profiles/ directory, logs in logs/ directory - bo… |
| Network | READ | READ | ✓ Aligned | WebSocket to 127.0.0.1:18789 (local), REST API on port 7789, Telegram/HA/TTS ext… |
| Shell | NONE | NONE | — | No shell execution from user input; subprocess only for legitimate CLI tools (pa… |
| Environment | READ | READ | ✓ Aligned | Reads API keys from .env (KIWI_ELEVENLABS_API_KEY, KIWI_TELEGRAM_BOT_TOKEN, etc.… |
| Skill Invoke | NONE | NONE | — | No skill chaining detected |
| Clipboard | NONE | NONE | — | No clipboard access found |
| Browser | NONE | NONE | — | No browser automation detected |
| Database | NONE | NONE | — | No database access detected |
32 findings (1 Critical)
| Severity | Type | Finding | Location |
|---|---|---|---|
| Critical | Dangerous Command (shell) | rm -rf / | docs/features/voice-security.md:17 |
| Medium | External URL | https://kiwi-voice.com | README.md:2 |
| Medium | External URL | https://kiwi-voice.com/assets/og-image.svg | README.md:3 |
| Medium | External URL | https://img.shields.io/badge/license-MIT-blue.svg | README.md:14 |
| Medium | External URL | https://www.python.org/downloads/ | README.md:15 |
| Medium | External URL | https://img.shields.io/badge/python-3.10%2B-blue.svg | README.md:15 |
| Medium | External URL | https://img.shields.io/badge/backend-OpenClaw-orange.svg | README.md:16 |
| Medium | External URL | https://docs.kiwi-voice.com | README.md:19 |
| Medium | External URL | https://docs.openclaw.ai | README.md:21 |
| Medium | External URL | http://homeassistant.local:8123 | README.md:429 |
| Medium | External URL | https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html | docs/deployment/docker.md:70 |
| Medium | External URL | https://download.pytorch.org/whl/cu121 | docs/deployment/gpu.md:10 |
| Medium | External URL | https://www.home-assistant.io/ | docs/features/home-assistant.md:3 |
| Medium | External URL | http://192.168.1.100:7789 | docs/features/home-assistant.md:20 |
| Medium | External URL | https://t.me/botfather | docs/features/voice-security.md:60 |
| Medium | External URL | https://api.telegram.org/bot | docs/features/voice-security.md:67 |
| Medium | External URL | https://developer.mozilla.org/en-US/docs/Web/API/AudioWorklet | docs/features/web-microphone.md:7 |
| Medium | External URL | https://ffmpeg.org/download.html | docs/getting-started/installation.md:52 |
| Medium | External URL | http://192.168.1.100:8123 | kiwi/integrations/homeassistant.py:30 |
| Medium | External URL | https://api.elevenlabs.io/v1/text-to-speech | kiwi/tts/elevenlabs.py:93 |
| Medium | External URL | https://api.runpod.ai/v2/ | kiwi/tts/runpod.py:44 |
| Medium | External URL | http://www.w3.org/2000/svg | kiwi/web/index.html:11 |
| Medium | External URL | https://console.runpod.io/serverless | runpod/README.md:27 |
| Medium | External URL | https://api.runpod.ai/v2/YOUR_ENDPOINT_ID | runpod/README.md:172 |
| Medium | External URL | http://www.apache.org/licenses/LICENSE-2.0 | runpod/qwen_tts/__init__.py:9 |
| Medium | External URL | https://huggingface.co/papers/2103.17239 | runpod/qwen_tts/core/tokenizer_12hz/modeling_qwen3_tts_tokenizer_v2.py:394 |
| Medium | External URL | https://huggingface.co/papers/2006.08195 | runpod/qwen_tts/core/tokenizer_12hz/modeling_qwen3_tts_tokenizer_v2.py:588 |
| Medium | External URL | https://discuss.pytorch.org/t/how-to-generate-variable-length-mask/23397/3 | runpod/qwen_tts/core/tokenizer_25hz/modeling_qwen3_tts_tokenizer_v1.py:233 |
| Medium | External URL | https://huggingface.co/papers/2005.07143 | runpod/qwen_tts/core/tokenizer_25hz/modeling_qwen3_tts_tokenizer_v1.py:345 |
| Medium | External URL | https://arxiv.org/pdf/2107.03312.pdf | runpod/qwen_tts/core/tokenizer_25hz/vq/core_vq.py:336 |
| Medium | External URL | https://arxiv.org/abs/2305.02765 | runpod/qwen_tts/core/tokenizer_25hz/vq/core_vq.py:479 |
| Medium | External URL | https://colab.research.google.com/drive/1q1oe2zOyZp7UsB3jJiQ1IFn8z5YfjwEb | scripts/train_wake_word.py:17 |
File Tree
146 files · 1.4 MB · 40646 lines
| Language | Files | Lines |
|---|---|---|
| Python | 79 | 24900 |
| YAML | 18 | 8512 |
| Markdown | 35 | 3575 |
| CSS | 2 | 1831 |
| JavaScript | 3 | 1372 |
| HTML | 1 | 282 |
| JSON | 3 | 69 |
| Text | 1 | 53 |
| TOML | 2 | 43 |
| Config | 1 | 9 |
├─ custom_components/
│  └─ kiwi_voice/
│     ├─ translations/
│     │  └─ en.json
│     ├─ __init__.py
│     ├─ button.py
│     ├─ config_flow.py
│     ├─ const.py
│     ├─ coordinator.py
│     ├─ manifest.json
│     ├─ sensor.py
│     ├─ services.yaml
│     ├─ strings.json
│     ├─ switch.py
│     └─ tts.py
├─ docs/
│  ├─ api/
│  │  ├─ rest.md
│  │  └─ websocket.md
│  ├─ assets/
│  │  └─ favicon.svg
│  ├─ deployment/
│  │  ├─ docker.md
│  │  ├─ gpu.md
│  │  ├─ local.md
│  │  ├─ reverse-proxy.md
│  │  └─ systemd.md
│  ├─ development/
│  │  ├─ architecture.md
│  │  ├─ code-patterns.md
│  │  └─ contributing.md
│  ├─ features/
│  │  ├─ home-assistant.md
│  │  ├─ multilanguage.md
│  │  ├─ souls.md
│  │  ├─ speaker-id.md
│  │  ├─ streaming-tts.md
│  │  ├─ stt-engines.md
│  │  ├─ tts-providers.md
│  │  ├─ voice-security.md
│  │  ├─ wake-word.md
│  │  ├─ web-dashboard.md
│  │  └─ web-microphone.md
│  ├─ getting-started/
│  │  ├─ configuration.md
│  │  ├─ first-run.md
│  │  └─ installation.md
│  ├─ stylesheets/
│  │  └─ extra.css
│  └─ index.md
├─ kiwi/
│  ├─ api/
│  │  ├─ __init__.py
│  │  ├─ audio_bridge.py
│  │  └─ server.py
│  ├─ integrations/
│  │  ├─ __init__.py
│  │  └─ homeassistant.py
│  ├─ locales/
│  │  ├─ __init__.py
│  │  ├─ ar.yaml
│  │  ├─ de.yaml
│  │  ├─ en.yaml
│  │  ├─ es.yaml
│  │  ├─ fr.yaml
│  │  ├─ hi.yaml
│  │  ├─ id.yaml
│  │  ├─ it.yaml
│  │  ├─ ja.yaml
│  │  ├─ ko.yaml
│  │  ├─ pl.yaml
│  │  ├─ pt.yaml
│  │  ├─ ru.yaml
│  │  ├─ tr.yaml
│  │  └─ zh.yaml
│  ├─ mixins/
│  │  ├─ __init__.py
│  │  ├─ audio_playback.py
│  │  ├─ dialogue_pipeline.py
│  │  ├─ llm_callbacks.py
│  │  ├─ stream_watchdog.py
│  │  └─ tts_speech.py
│  ├─ souls/
│  │  ├─ comedian.md
│  │  ├─ hype-person.md
│  │  ├─ mindful-companion.md
│  │  ├─ siren.md
│  │  └─ storyteller.md
│  ├─ stt/
│  │  ├─ __init__.py
│  │  ├─ elevenlabs.py
│  │  └─ mlx_whisper.py
│  ├─ tts/
│  │  ├─ __init__.py
│  │  ├─ base.py
│  │  ├─ elevenlabs_ws.py
│  │  ├─ elevenlabs.py
│  │  ├─ kokoro.py
│  │  ├─ piper.py
│  │  ├─ qwen_local.py
│  │  ├─ runpod.py
│  │  └─ streaming.py
│  ├─ web/
│  │  ├─ static/
│  │  │  ├─ app.js
│  │  │  ├─ audio-client.js
│  │  │  ├─ audio-worklet.js
│  │  │  └─ style.css
│  │  └─ index.html
│  ├─ __init__.py
│  ├─ __main__.py
│  ├─ config_loader.py
│  ├─ event_bus.py
│  ├─ hardware_aec.py
│  ├─ i18n.py
│  ├─ listener.py
│  ├─ openclaw_cli.py
│  ├─ openclaw_ws.py
│  ├─ service.py
│  ├─ soul_manager.py
│  ├─ speaker_id.py
│  ├─ speaker_manager.py
│  ├─ state_machine.py
│  ├─ text_processing.py
│  ├─ unified_vad.py
│  ├─ utils.py
│  ├─ voice_security.py
│  └─ wake_word.py
├─ runpod/
│  ├─ qwen_tts/
│  │  ├─ cli/
│  │  │  └─ demo.py
│  │  ├─ core/
│  │  │  ├─ tokenizer_12hz/
│  │  │  │  ├─ configuration_qwen3_tts_tokenizer_v2.py ⚠
│  │  │  │  └─ modeling_qwen3_tts_tokenizer_v2.py ⚠
│  │  │  ├─ tokenizer_25hz/
│  │  │  │  ├─ vq/
│  │  │  │  │  ├─ core_vq.py
│  │  │  │  │  ├─ speech_vq.py
│  │  │  │  │  └─ whisper_encoder.py
│  │  │  │  ├─ configuration_qwen3_tts_tokenizer_v1.py ⚠
│  │  │  │  └─ modeling_qwen3_tts_tokenizer_v1.py ⚠
│  │  │  └─ __init__.py
│  │  ├─ inference/
│  │  │  ├─ qwen3_tts_model.py
│  │  │  └─ qwen3_tts_tokenizer.py ⚠
│  │  ├─ __init__.py
│  │  └─ __main__.py
│  ├─ download_model.py
│  ├─ handler.py
│  └─ README.md
├─ scripts/
│  ├─ generate_cues.py
│  ├─ noise_monitor.py
│  ├─ send_voice.py
│  ├─ telegram_voice.py
│  └─ train_wake_word.py
├─ tests/
│  ├─ __init__.py
│  ├─ test_api.py
│  ├─ test_audio_bridge.py
│  ├─ test_config.py
│  ├─ test_i18n.py
│  ├─ test_smoke.py
│  ├─ test_text_processing.py
│  └─ test_voice_security.py
├─ CLAUDE.md
├─ config.yaml
├─ mkdocs.yml
├─ mypy.ini
├─ pyproject.toml
├─ README.md
├─ requirements.txt
├─ ruff.toml
├─ SKILL.md
└─ SOUL.md
Dependencies 7 items
| Package | Version | Source | Known Vulns | Notes |
|---|---|---|---|---|
| faster-whisper | >=1.0.0 | pip | No | STT engine, industry standard |
| sounddevice | >=0.4.6 | pip | No | Audio input/output |
| pyannote.audio | >=3.1.0 | pip | No | Speaker identification |
| requests | >=2.28.0 | pip | No | HTTP client for TTS APIs |
| websocket-client | >=1.6.0 | pip | No | WebSocket protocol |
| aiohttp | >=3.9.0 | pip | No | REST API server |
| cryptography | >=41.0.0 | pip | No | Ed25519 key handling |
Security Positives
✓ Two-layer security architecture: pre-LLM regex filter + post-LLM shell execution approval
✓ Speaker identification with priority levels (OWNER > FRIEND > GUEST > BLOCKED)
✓ Telegram approval workflow with inline keyboard buttons for dangerous commands
✓ Voice security patterns defined per-language in locale YAML files
✓ API authentication with scope-based access control (read, control, tts, speakers, admin)
✓ Ed25519 device identity for WebSocket authentication
✓ Comprehensive crash protection with thread-safe logging
✓ Voice profiles stored as JSON with embeddings, not raw audio
✓ No credential harvesting or exfiltration patterns
✓ Well-documented i18n system with 15 supported languages
✓ OpenClaw session isolation for NSFW soul routing
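To illustrate how the speaker priority ordering and the pre-LLM regex filter listed above might fit together, here is a minimal sketch. It is not the code in kiwi/voice_security.py: the `Priority` enum, the `DANGEROUS_PATTERNS` list, the `pre_llm_filter` function, and the three return values are all hypothetical, and the policy (blocked speakers rejected, dangerous matches sent for approval) is an assumption. Per the report, the real patterns are defined per language in the locale YAML files.

```python
import re
from enum import IntEnum


class Priority(IntEnum):
    """Speaker levels, ordered so that OWNER > FRIEND > GUEST > BLOCKED."""
    BLOCKED = 0
    GUEST = 1
    FRIEND = 2
    OWNER = 3


# Hypothetical English-only patterns, for illustration.
DANGEROUS_PATTERNS = [
    re.compile(r"\brm\s+-rf\b", re.IGNORECASE),
    re.compile(r"\bformat\s+(the\s+)?(disk|drive)\b", re.IGNORECASE),
]


def pre_llm_filter(text: str, speaker: Priority) -> str:
    """Classify a transcribed command before it reaches the LLM.

    Returns "reject", "needs_approval", or "allow" (assumed outcomes).
    """
    if speaker == Priority.BLOCKED:
        return "reject"
    if any(pattern.search(text) for pattern in DANGEROUS_PATTERNS):
        # Assumed policy: a match is held for out-of-band approval,
        # such as the Telegram workflow the report describes.
        return "needs_approval"
    return "allow"
```

In this sketch a second, post-LLM check on any shell command the model proposes would still be needed. The regex layer only screens the spoken request.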