Trusted — Risk Score 5/100
Last scan: 2 days ago
xiaomi-mimo-tts
Generates speech with Xiaomi MiMo TTS (mimo-v2-tts). Supports multiple voices, style control, emotion tags, and dialects.
This is a legitimate Xiaomi MiMo TTS (text-to-speech) skill that calls an external API to synthesize audio from text. Base64 decoding is used solely to decode the returned audio data, which is a standard and expected pattern. The scan found no credential exfiltration, hidden functionality, or shell obfuscation.
Skill Name: xiaomi-mimo-tts
Duration: 51.3s
Engine: pi
Safe to install
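Approve for use. No security concerns identified. The scripts have no third-party package dependencies to pin (the Python code uses only the standard library, including urllib); if used in production, consider pinning the ffmpeg and runtime versions.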

Findings 2 items

Severity Finding Location
Low
Undeclared DRY variable reference in JavaScript
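Variable 'DRY' is referenced at line 53 of mimo_tts.js but never defined. In JavaScript, reading an undeclared identifier throws a ReferenceError rather than evaluating as falsy, so unless DRY is declared somewhere the scan did not see, this line fails at runtime instead of silently skipping the block. Either way, the DRY block looks like leftover development code that was not fully cleaned up.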
if (DRY) {
→ Remove the unreachable DRY block or define 'const DRY = false;' at the top of the file for clarity.
scripts/base/mimo_tts.js:53
Info
API key fallback chain not documented
The scripts read XIAOMI_API_KEY with fallback to MIMO_API_KEY. This is not documented in SKILL.md, which only mentions XIAOMI_API_KEY.
if [ -z "${XIAOMI_API_KEY}" ] && [ -n "${MIMO_API_KEY}" ]; then
→ Document the MIMO_API_KEY fallback in SKILL.md for completeness.
scripts/_env.sh:10
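The fallback described in this finding can be sketched in Python. This is a minimal illustration, not the skill's code: only the first line of the shell conditional is quoted above, so the function name `resolve_api_key` and the error message are hypothetical, and the sketch assumes the fallback simply substitutes MIMO_API_KEY when XIAOMI_API_KEY is empty or unset.

```python
import os


def resolve_api_key(env=None):
    """Return the API key, preferring XIAOMI_API_KEY over MIMO_API_KEY.

    Hypothetical helper: the name and error text are illustrative only.
    """
    env = os.environ if env is None else env
    # Empty strings are treated as unset, mirroring the shell -z / -n tests.
    key = env.get("XIAOMI_API_KEY") or env.get("MIMO_API_KEY")
    if not key:
        raise RuntimeError("Set XIAOMI_API_KEY (or the fallback MIMO_API_KEY)")
    return key
```

Documenting both variable names in SKILL.md, as the finding recommends, would let users see this precedence without reading the scripts.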
Resource Declared Inferred Status Evidence
Filesystem WRITE WRITE ✓ Aligned All implementations write output files (ogg/wav). Declared in SKILL.md (output f…
Network READ READ ✓ Aligned HTTPS POST to api.xiaomimimo.com for TTS synthesis. Declared in SKILL.md.
Shell NONE WRITE ✓ Aligned spawnSync('ffmpeg') and subprocess.run(['ffmpeg']) for audio conversion. Functio…
Environment NONE READ ✓ Aligned Reads XIAOMI_API_KEY / MIMO_API_KEY — expected credential access for API auth.
Skill Invoke NONE NONE No skill_invoke calls detected.
Clipboard NONE NONE No clipboard access.
Browser NONE NONE No browser access.
Database NONE NONE No database access.
2 Critical 5 findings
Critical Encoded Execution: Base64-encoded execution (code obfuscation)
base64 -d
scripts/base/mimo-tts.sh:58
Critical Encoded Execution: Base64-encoded execution (code obfuscation)
Buffer.from(audio_b64, 'base64'
scripts/base/mimo_tts.js:81
Medium External URL
https://platform.xiaomimimo.com/
README.md:76
Medium External URL
https://api.xiaomimimo.com/v1/chat/completions
scripts/base/mimo-tts.sh:54
Medium External URL
https://api.xiaomimimo.com/v1/models
scripts/utils/test.sh:31
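The two Critical matches come from a pattern rule that fires on any Base64 decode, while the report's summary classifies them as decoding of audio data. The distinction can be sketched as follows: the decoded bytes are treated as data and checked against known audio signatures, never passed to an interpreter. This is an illustrative sketch, not the skill's implementation. The function name `decode_audio` is hypothetical, and it assumes the payload is a WAV or OGG container (the formats the report says the skill writes).

```python
import base64
import binascii

# Leading bytes of the container formats the report says the skill writes.
AUDIO_MAGIC = {b"OggS": "ogg", b"RIFF": "wav"}


def decode_audio(audio_b64):
    """Decode a Base64 audio payload and identify its container.

    Hypothetical helper: returns (raw_bytes, kind) or raises ValueError.
    """
    try:
        data = base64.b64decode(audio_b64, validate=True)
    except binascii.Error as exc:
        raise ValueError("payload is not valid Base64") from exc
    kind = AUDIO_MAGIC.get(data[:4])
    if kind is None:
        # Anything that is not recognisable audio is rejected, not executed.
        raise ValueError("decoded payload is not OGG or WAV audio")
    return data, kind
```

A reviewer triaging such a match would look for what happens to the decoded bytes: writing them to an audio file is benign, whereas piping them into a shell or `eval` would justify the Critical rating.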

File Tree

20 files · 51.4 KB · 1557 lines
Shell 11f · 816L Markdown 3f · 487L Python 2f · 128L JavaScript 2f · 115L JSON 2f · 11L
├─ 📁 scripts
│ ├─ 📁 base
│ │ ├─ 📜 mimo_tts.js JavaScript 98L · 3.2 KB
│ │ ├─ 🐍 mimo_tts.py Python 107L · 3.5 KB
│ │ └─ 🔧 mimo-tts.sh Shell 67L · 2.0 KB
│ ├─ 📁 examples
│ │ ├─ 🔧 demo.sh Shell 65L · 2.2 KB
│ │ ├─ 🔧 dialect-tester.sh Shell 106L · 3.6 KB
│ │ └─ 🔧 tease-generator.sh Shell 86L · 3.8 KB
│ ├─ 📁 smart
│ │ ├─ 📜 mimo_tts_smart.js JavaScript 17L · 822 B
│ │ ├─ 🐍 mimo_tts_smart.py Python 21L · 889 B
│ │ ├─ 🔧 mimo_tts_smart.sh Shell 109L · 3.0 KB
│ │ └─ 🔧 mimo-tts-smart.sh Shell 88L · 3.0 KB
│ ├─ 📁 utils
│ │ └─ 🔧 test.sh Shell 65L · 1.5 KB
│ ├─ 🔧 _env.sh Shell 27L · 867 B
│ ├─ 🔧 mimo-tts-smart.sh Shell 96L · 3.3 KB
│ ├─ 🔧 mimo-tts.sh Shell 70L · 2.1 KB
│ └─ 🔧 test_local.sh Shell 37L · 964 B
├─ 📋 _meta.json JSON 5L · 134 B
├─ 📝 CHANGELOG.md Markdown 47L · 2.2 KB
├─ 📋 package.json JSON 6L · 179 B
├─ 📝 README.md Markdown 216L · 7.4 KB
└─ 📝 SKILL.md Markdown 224L · 6.7 KB

Dependencies 3 items

Package Version Source Known Vulns Notes
ffmpeg any system No Audio conversion tool. Required for OGG encoding. Declared in usage docs.
Node.js (fs, child_process, https) any system No Built-in Node.js modules only.
Python stdlib any stdlib No Uses only urllib.request, json, base64, subprocess, os — no pip dependencies.

Security Positives

✓ All network requests go to a single, clearly identified API endpoint (api.xiaomimimo.com) — no suspicious IPs or domains.
✓ Base64 decoding is used exclusively for legitimate TTS audio decoding (API response → audio file), not for obfuscating malicious payloads.
✓ API key access is scoped to authentication for the declared service only — no credential harvesting or exfiltration.
✓ File writes are limited to output audio files in user-specified or temp directories — no writes to sensitive paths like ~/.ssh, ~/.aws, or .env.
✓ No eval(), no os.system/popen with user-controlled strings, no curl|bash patterns.
✓ The skill's functionality is fully declared in SKILL.md: TTS synthesis via Xiaomi MiMo API with style/dialect control.
✓ Python implementation uses standard library only (urllib.request, json, base64, subprocess) — no third-party dependency risk.
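The report notes that ffmpeg is invoked through `subprocess.run(['ffmpeg'])` rather than through `os.system` or a shell string. The sketch below shows why the list form matters. It is not the skill's code: the helper names `build_ffmpeg_cmd` and `convert_to_ogg` are hypothetical, and the choice of the libopus codec is an assumption, since the report does not state which ffmpeg options the skill passes.

```python
import subprocess


def build_ffmpeg_cmd(src, dst):
    """Build an argument list for converting src to an OGG file at dst.

    Hypothetical helper; the codec choice (libopus) is an assumption.
    """
    return ["ffmpeg", "-y", "-i", src, "-c:a", "libopus", dst]


def convert_to_ogg(src, dst):
    # With a list argv and the default shell=False, each element reaches
    # ffmpeg verbatim, so shell metacharacters in file names are not
    # interpreted by a shell.
    subprocess.run(build_ffmpeg_cmd(src, dst), check=True)
```

A file name such as `in.wav; rm -rf ~` stays a single argument here, whereas interpolating it into a shell command string would run the second command.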