xiaomi-mimo-tts Security Report — Trusted | ClawSafe

5/100

xiaomi-mimo-tts

Generates speech using Xiaomi MiMo TTS (mimo-v2-tts). Supports multiple voices, style control, emotion tags, and dialects.

This is a legitimate Xiaomi MiMo TTS (text-to-speech) skill that calls an external API to synthesize audio from text. Base64 decoding is used solely for decoding returned audio data — a standard and expected pattern. No credential exfiltration, no hidden functionality, no shell obfuscation beyond normal API usage.

Skill Name: xiaomi-mimo-tts

Duration: 51.3s

Engine: pi

✓ Safe to install

Approve for use. No security concerns identified. Consider pinning the requests/urllib dependency versions if used in production.

Findings 2 items

Severity	Finding	Location
Low	Undeclared `DRY` variable reference in JavaScript. Variable `DRY` is referenced at line 53 of mimo_tts.js but never defined. The scan treats it as falsy, so the real code path always runs and the block is dead code (in standard JavaScript, reading a truly undeclared identifier throws a ReferenceError, so the reference is worth fixing either way). This indicates incomplete cleanup from development. `if (DRY) {` → Remove the unreachable DRY block or define `const DRY = false;` at the top of the file for clarity.	`scripts/base/mimo_tts.js:53`
Info	API key fallback chain not documented. The scripts read XIAOMI_API_KEY with a fallback to MIMO_API_KEY. This is not documented in SKILL.md, which only mentions XIAOMI_API_KEY. `if [ -z "${XIAOMI_API_KEY}" ] && [ -n "${MIMO_API_KEY}" ]; then` → Document the MIMO_API_KEY fallback in SKILL.md for completeness.	`scripts/_env.sh:10`
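
The fallback described in the second finding can be sketched in Python as follows. This is a minimal illustration of the reported behavior, not the skill's actual code: the helper name `resolve_api_key` is hypothetical, and only the two variable names `XIAOMI_API_KEY` and `MIMO_API_KEY` come from the report.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return XIAOMI_API_KEY if set and non-empty, else MIMO_API_KEY, else None.

    Hypothetical helper mirroring the shell check quoted in the finding.
    """
    if env is None:
        env = os.environ
    primary = env.get("XIAOMI_API_KEY", "")
    if primary:
        return primary
    # Fall back only when the primary variable is missing or empty.
    fallback = env.get("MIMO_API_KEY", "")
    return fallback or None
```

Passing the environment as a mapping keeps the lookup order easy to document and to test without touching the real process environment.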

Resource	Declared	Inferred	Status	Evidence
Filesystem	`WRITE`	`WRITE`	✓ Aligned	All implementations write output files (ogg/wav). Declared in SKILL.md (output f…
Network	`READ`	`READ`	✓ Aligned	HTTPS POST to api.xiaomimimo.com for TTS synthesis. Declared in SKILL.md.
Shell	`NONE`	`WRITE`	✓ Aligned	spawnSync('ffmpeg') and subprocess.run(['ffmpeg']) for audio conversion. Functio…
Environment	`NONE`	`READ`	✓ Aligned	Reads XIAOMI_API_KEY / MIMO_API_KEY — expected credential access for API auth.
Skill Invoke	`NONE`	`NONE`	—	No skill_invoke calls detected.
Clipboard	`NONE`	`NONE`	—	No clipboard access.
Browser	`NONE`	`NONE`	—	No browser access.
Database	`NONE`	`NONE`	—	No database access.
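
The Shell row refers to ffmpeg being launched with an argument list (`subprocess.run(['ffmpeg'])`). A rough sketch of that pattern is shown here. The helper names `build_ffmpeg_cmd` and `convert_audio` are hypothetical, and the `libopus` codec choice is an assumption, since the report only states that ffmpeg is required for OGG encoding.

```python
import subprocess
from typing import List


def build_ffmpeg_cmd(src_path: str, dst_path: str) -> List[str]:
    """Build an ffmpeg argument list that converts src_path to dst_path.

    The Opus codec is an assumed choice for illustration only.
    """
    return ["ffmpeg", "-y", "-i", src_path, "-c:a", "libopus", dst_path]


def convert_audio(src_path: str, dst_path: str) -> None:
    """Run ffmpeg without a shell and raise if it exits with an error."""
    # Passing a list (shell=False is the default) means the paths reach
    # ffmpeg as single arguments and are never parsed by a shell.
    subprocess.run(build_ffmpeg_cmd(src_path, dst_path), check=True)
```

Because no shell is involved, characters such as `;` or spaces in a file name stay part of one argument instead of being interpreted as shell syntax.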

2 Critical, 5 findings

Severity	Category	Match	Location
Critical	Encoded Execution (Base64-encoded execution, code obfuscation)	`base64 -d`	`scripts/base/mimo-tts.sh:58`
Critical	Encoded Execution (Base64-encoded execution, code obfuscation)	`Buffer.from(audio_b64, 'base64'`	`scripts/base/mimo_tts.js:81`
Medium	External URL	`https://platform.xiaomimimo.com/`	`README.md:76`
Medium	External URL	`https://api.xiaomimimo.com/v1/chat/completions`	`scripts/base/mimo-tts.sh:54`
Medium	External URL	`https://api.xiaomimimo.com/v1/models`	`scripts/utils/test.sh:31`
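
For context on the two Critical Base64 matches: the report's assessment is that both decode returned audio data rather than execute anything. A minimal Python sketch of that decode-to-file pattern follows. The name `save_audio` is hypothetical, and the payload is assumed to be a plain Base64 string already extracted from the API response, whose exact shape the report does not show.

```python
import base64
import binascii
from pathlib import Path


def save_audio(audio_b64: str, out_path: str) -> int:
    """Decode a Base64 audio payload, write the raw bytes, return the byte count.

    The decoded bytes are only written to disk; nothing is evaluated or executed.
    """
    try:
        # validate=True rejects characters outside the Base64 alphabet.
        audio_bytes = base64.b64decode(audio_b64, validate=True)
    except binascii.Error as exc:
        raise ValueError("payload is not valid Base64") from exc
    Path(out_path).write_bytes(audio_bytes)
    return len(audio_bytes)
```

The distinction a reviewer would look for is where the decoded bytes go: into an output file, as here, versus into an interpreter or shell.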

File Tree

20 files · 51.4 KB · 1557 lines

Shell 11f · 816L Markdown 3f · 487L Python 2f · 128L JavaScript 2f · 115L JSON 2f · 11L

├─ scripts/
│  ├─ base/
│  │  ├─ mimo_tts.js  JavaScript 98L · 3.2 KB
│  │  ├─ mimo_tts.py  Python 107L · 3.5 KB
│  │  └─ mimo-tts.sh  Shell 67L · 2.0 KB
│  ├─ examples/
│  │  ├─ demo.sh  Shell 65L · 2.2 KB
│  │  ├─ dialect-tester.sh  Shell 106L · 3.6 KB
│  │  └─ tease-generator.sh  Shell 86L · 3.8 KB
│  ├─ smart/
│  │  ├─ mimo_tts_smart.js  JavaScript 17L · 822 B
│  │  ├─ mimo_tts_smart.py  Python 21L · 889 B
│  │  ├─ mimo_tts_smart.sh  Shell 109L · 3.0 KB
│  │  └─ mimo-tts-smart.sh  Shell 88L · 3.0 KB
│  ├─ utils/
│  │  └─ test.sh  Shell 65L · 1.5 KB
│  ├─ _env.sh  Shell 27L · 867 B
│  ├─ mimo-tts-smart.sh  Shell 96L · 3.3 KB
│  ├─ mimo-tts.sh  Shell 70L · 2.1 KB
│  └─ test_local.sh  Shell 37L · 964 B
├─ _meta.json  JSON 5L · 134 B
├─ CHANGELOG.md  Markdown 47L · 2.2 KB
├─ package.json  JSON 6L · 179 B
├─ README.md  Markdown 216L · 7.4 KB
└─ SKILL.md  Markdown 224L · 6.7 KB

Dependencies 3 items

Package	Version	Source	Known Vulns	Notes
`ffmpeg`	`any`	system	No	Audio conversion tool. Required for OGG encoding. Declared in usage docs.
`Node.js (fs, child_process, https)`	`any`	system	No	Built-in Node.js modules only.
`Python stdlib`	`any`	stdlib	No	Uses only urllib.request, json, base64, subprocess, os — no pip dependencies.

Security Positives

✓ All network requests go to a single, clearly identified API endpoint (api.xiaomimimo.com) — no suspicious IPs or domains.

✓ Base64 decoding is used exclusively for legitimate TTS audio decoding (API response → audio file), not for obfuscating malicious payloads.

✓ API key access is scoped to authentication for the declared service only — no credential harvesting or exfiltration.

✓ File writes are limited to output audio files in user-specified or temp directories — no writes to sensitive paths like ~/.ssh, ~/.aws, or .env.

✓ No eval(), no os.system/popen with user-controlled strings, no curl|bash patterns.

✓ The skill's functionality is fully declared in SKILL.md: TTS synthesis via Xiaomi MiMo API with style/dialect control.

✓ Python implementation uses standard library only (urllib.request, json, base64, subprocess) — no third-party dependency risk.
