Scan Report
5/100
xiaomi-mimo-tts
Generates speech with Xiaomi MiMo TTS (mimo-v2-tts). Supports multiple voices, style control, emotion tags, and dialects.
This is a legitimate Xiaomi MiMo TTS (text-to-speech) skill that calls an external API to synthesize audio from text. Base64 decoding is used solely for decoding returned audio data — a standard and expected pattern. No credential exfiltration, no hidden functionality, no shell obfuscation beyond normal API usage.
Safe to install
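Approve for use. No security concerns identified. If used in production, consider pinning the versions of the external tools the skill relies on (ffmpeg and the Node.js/Python runtimes); the Python code uses only the standard library, so there are no pip packages to pin.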
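The Base64 usage mentioned in the summary can be illustrated with a minimal Python sketch. It is not the skill's actual code: the function name `save_audio` and its parameters are hypothetical, and it only shows the general pattern of decoding an API-returned audio payload into a file, where the decoded bytes are written to disk and never executed.

```python
import base64
from pathlib import Path


def save_audio(audio_b64: str, out_path: str) -> int:
    """Decode a Base64 audio payload and write it to out_path.

    Hypothetical helper for illustration. Returns the number of bytes written.
    The decoded bytes are only written to disk; nothing is evaluated or run.
    """
    # validate=True rejects input containing characters outside the Base64 alphabet.
    audio_bytes = base64.b64decode(audio_b64, validate=True)
    Path(out_path).write_bytes(audio_bytes)
    return len(audio_bytes)
```

A scanner that pattern-matches on `base64 -d` or `Buffer.from(..., 'base64')` cannot tell this data-decoding use apart from decode-then-execute obfuscation, which is why such matches need manual review.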
Findings 2 items
| Severity | Finding | Location |
|---|---|---|
| Low | Undeclared DRY variable reference in JavaScript | scripts/base/mimo_tts.js:53 |
| Info | API key fallback chain not documented | scripts/_env.sh:10 |
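To make the "API key fallback chain" finding concrete, here is a hedged Python sketch of what such a chain typically looks like. The report names the two variables `XIAOMI_API_KEY` and `MIMO_API_KEY`, but the lookup order, the function name `resolve_api_key`, and the error message below are assumptions; the real logic lives in the shell file `scripts/_env.sh`.

```python
import os
from typing import Mapping, Optional

# Assumed priority order; the report lists both variables but not which wins.
KEY_VARS = ("XIAOMI_API_KEY", "MIMO_API_KEY")


def resolve_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the first non-empty key found in KEY_VARS, in order."""
    env = os.environ if env is None else env
    for name in KEY_VARS:
        value = env.get(name, "").strip()
        if value:
            return value
    raise RuntimeError("No API key found; set one of: " + ", ".join(KEY_VARS))
```

Documenting the order matters because a user who has both variables set would otherwise not know which credential is sent to the API.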
| Resource | Declared | Inferred | Status | Evidence |
|---|---|---|---|---|
| Filesystem | WRITE | WRITE | ✓ Aligned | All implementations write output files (ogg/wav). Declared in SKILL.md (output f… |
| Network | READ | READ | ✓ Aligned | HTTPS POST to api.xiaomimimo.com for TTS synthesis. Declared in SKILL.md. |
| Shell | NONE | WRITE | ✓ Aligned | spawnSync('ffmpeg') and subprocess.run(['ffmpeg']) for audio conversion. Functio… |
| Environment | NONE | READ | ✓ Aligned | Reads XIAOMI_API_KEY / MIMO_API_KEY — expected credential access for API auth. |
| Skill Invoke | NONE | NONE | — | No skill_invoke calls detected. |
| Clipboard | NONE | NONE | — | No clipboard access. |
| Browser | NONE | NONE | — | No browser access. |
| Database | NONE | NONE | — | No database access. |
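The Shell row above refers to `subprocess.run(['ffmpeg'])` for audio conversion. The following sketch shows why the list form is generally considered low risk. The helper names and the exact ffmpeg arguments are assumptions for illustration, not taken from the skill's source.

```python
import subprocess
from typing import List


def build_ffmpeg_cmd(src: str, dst: str) -> List[str]:
    """Build an ffmpeg argument list (hypothetical helper).

    -y overwrites an existing output file and -i names the input;
    ffmpeg infers the output container from dst's extension.
    """
    return ["ffmpeg", "-y", "-i", src, dst]


def convert_audio(src: str, dst: str) -> None:
    # List form without shell=True: each file name is passed as one argument
    # and is never parsed by a shell, so metacharacters are not interpreted.
    subprocess.run(build_ffmpeg_cmd(src, dst), check=True, capture_output=True)
```

For example, `build_ffmpeg_cmd("in.wav", "out.ogg")` yields a five-element list, and a file name containing `;` or spaces stays a single argument.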
2 Critical 5 findings
| Severity | Category | Match | Location |
|---|---|---|---|
| Critical | Encoded Execution: Base64-encoded execution (code obfuscation) | base64 -d | scripts/base/mimo-tts.sh:58 |
| Critical | Encoded Execution: Base64-encoded execution (code obfuscation) | Buffer.from(audio_b64, 'base64' | scripts/base/mimo_tts.js:81 |
| Medium | External URL | https://platform.xiaomimimo.com/ | README.md:76 |
| Medium | External URL | https://api.xiaomimimo.com/v1/chat/completions | scripts/base/mimo-tts.sh:54 |
| Medium | External URL | https://api.xiaomimimo.com/v1/models | scripts/utils/test.sh:31 |
File Tree
20 files · 51.4 KB · 1557 lines
| Language | Files | Lines |
|---|---|---|
| Shell | 11 | 816 |
| Markdown | 3 | 487 |
| Python | 2 | 128 |
| JavaScript | 2 | 115 |
| JSON | 2 | 11 |
├─ scripts/
│  ├─ base/
│  │  ├─ mimo_tts.js (JavaScript)
│  │  ├─ mimo_tts.py (Python)
│  │  └─ mimo-tts.sh (Shell)
│  ├─ examples/
│  │  ├─ demo.sh (Shell)
│  │  ├─ dialect-tester.sh (Shell)
│  │  └─ tease-generator.sh (Shell)
│  ├─ smart/
│  │  ├─ mimo_tts_smart.js (JavaScript)
│  │  ├─ mimo_tts_smart.py (Python)
│  │  ├─ mimo_tts_smart.sh (Shell)
│  │  └─ mimo-tts-smart.sh (Shell)
│  ├─ utils/
│  │  └─ test.sh (Shell)
│  ├─ _env.sh (Shell)
│  ├─ mimo-tts-smart.sh (Shell)
│  ├─ mimo-tts.sh (Shell)
│  └─ test_local.sh (Shell)
├─ _meta.json (JSON)
├─ CHANGELOG.md (Markdown)
├─ package.json (JSON)
├─ README.md (Markdown)
└─ SKILL.md (Markdown)
Dependencies 3 items
| Package | Version | Source | Known Vulns | Notes |
|---|---|---|---|---|
| ffmpeg | any | system | No | Audio conversion tool. Required for OGG encoding. Declared in usage docs. |
| Node.js (fs, child_process, https) | any | system | No | Built-in Node.js modules only. |
| Python stdlib | any | stdlib | No | Uses only urllib.request, json, base64, subprocess, os — no pip dependencies. |
Security Positives
✓ All network requests go to a single, clearly identified API endpoint (api.xiaomimimo.com) — no suspicious IPs or domains.
✓ Base64 decoding is used exclusively for legitimate TTS audio decoding (API response → audio file), not for obfuscating malicious payloads.
✓ API key access is scoped to authentication for the declared service only — no credential harvesting or exfiltration.
✓ File writes are limited to output audio files in user-specified or temp directories — no writes to sensitive paths like ~/.ssh, ~/.aws, or .env.
✓ No eval(), no os.system/popen with user-controlled strings, no curl|bash patterns.
✓ The skill's functionality is fully declared in SKILL.md: TTS synthesis via Xiaomi MiMo API with style/dialect control.
✓ Python implementation uses standard library only (urllib.request, json, base64, subprocess) — no third-party dependency risk.
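The first positive above (all requests go to a single API host) is the kind of property a reviewer can re-check mechanically. The sketch below is an illustrative check, not part of the scanner or the skill; the names `ALLOWED_HOSTS` and `is_allowed_endpoint` are hypothetical, and the allowlist contains only the host named in this report.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist built from the single API host named in the report.
ALLOWED_HOSTS = {"api.xiaomimimo.com"}


def is_allowed_endpoint(url: str) -> bool:
    """Return True only for HTTPS URLs whose host is exactly on the allowlist."""
    parts = urlsplit(url)
    # Exact hostname comparison avoids look-alike hosts such as
    # "api.xiaomimimo.com.evil.example" passing a substring test.
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS
```

Applied to the URLs flagged in the findings, both `api.xiaomimimo.com` endpoints pass, while the `platform.xiaomimimo.com` link from README.md does not; that link is documentation rather than a request target.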