media-cluster Security Report — Low Risk | ClawSafe

12/100

media-cluster

Crawls Chinese social media platforms (Xiaohongshu, Douyin, Weibo, Bilibili, etc.) by keyword, generates Markdown reports, and produces voice summaries via a TTS API.

A legitimate social media crawling skill that scrapes Chinese platforms, generates reports, and synthesizes voice summaries via a documented third-party TTS API, with no hidden malicious behavior detected.

Skill Name: media-cluster

Duration: 41.9s

Engine: pi

✓ Safe to install

Approve for use. Monitor `senseaudio.cn` external API calls for unexpected data exfiltration; pin pip dependencies in setup_env.sh before production deployment.

Findings 3 items

Severity	Finding	Category	Description	Evidence	Recommendation	Location
Low	Unpinned pip dependencies	Supply Chain	setup_env.sh installs requirements.txt without version pins or hash verification, which could allow a compromised or malicious package to be pulled at install time.	`pip install -r requirements.txt`	Pin each dependency to an exact version and verify hashes (e.g., `pip install --require-hashes -r requirements.txt`), or use a lock file. Consider using a private PyPI mirror for production.	`scripts/setup_env.sh:22`
Low	Unverified remote Git clone	Supply Chain	ensure_mediacrawler.sh clones MediaCrawler from github.com at runtime without pinning a commit hash. Although the target is a known legitimate repo, a compromised branch or tag could be pulled in.	`git clone "$REPO_URL" "$MEDIACRAWLER_DIR"`	Pin to a specific commit hash: `git clone ... && cd MediaCrawler && git checkout <hash>`. Alternatively, bundle MediaCrawler with the skill package.	`scripts/ensure_mediacrawler.sh:16`
Low	External API call to senseaudio.cn without data flow audit	Data Exfil	The skill sends voice script text to senseaudio.cn/api. No evidence of credential or data exfiltration was found, but the external call is not fully auditable.	`r = requests.post(SENSEAUDIO_TTS_URL, json=payload, headers=headers, timeout=60)`	Verify that only voice script text (not scraped content) is sent to the API. Consider self-hosting TTS or using a fully local TTS solution to eliminate the external dependency.	`scripts/summarize_and_voice.py:138`
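The third recommendation (send only voice script text to the API) could be enforced in code before the request is made. The following is a minimal sketch, not the skill's actual implementation: the field names `text` and `voice`, the length cap, and both helper functions are hypothetical, since the real senseaudio.cn request schema is not shown in this report.

```python
# Hypothetical field names: the real TTS request schema is not shown in the report.
ALLOWED_KEYS = {"text", "voice"}
MAX_TEXT_CHARS = 2000  # assumed upper bound for a short voice summary


def build_tts_payload(voice_script: str, voice: str = "default") -> dict:
    """Build a payload that carries only the voice script and a voice name."""
    if not isinstance(voice_script, str) or not voice_script.strip():
        raise ValueError("voice script must be a non-empty string")
    if len(voice_script) > MAX_TEXT_CHARS:
        # An unusually long text may mean scraped content leaked into the script.
        raise ValueError("voice script is longer than expected; refusing to send")
    return {"text": voice_script.strip(), "voice": voice}


def assert_payload_is_minimal(payload: dict) -> None:
    """Raise if the payload contains any key outside the allow-list."""
    extra = set(payload) - ALLOWED_KEYS
    if extra:
        raise ValueError(f"unexpected payload keys: {sorted(extra)}")
```

Calling `assert_payload_is_minimal(payload)` immediately before the `requests.post(...)` call would make any accidental inclusion of scraped data fail loudly instead of leaving the process silently.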

Resource	Declared	Inferred	Status	Evidence
Filesystem	`READ`	`READ`	✓ Aligned	SKILL.md declares reading data files from MediaCrawler/data/; scripts/summarize_…
Filesystem	`WRITE`	`WRITE`	✓ Aligned	SKILL.md declares writing reports and MP3 files; scripts/summarize_and_voice.py:…
Shell	`WRITE`	`WRITE`	✓ Aligned	SKILL.md declares shell script execution for conda/pip/playwright; setup_env.sh …
Network	`READ`	`READ`	✓ Aligned	SKILL.md declares senseaudio.cn TTS API calls; summarize_and_voice.py:25 sets AP…
Environment	`READ`	`READ`	✓ Aligned	SKILL.md declares reading SENSEAUDIO_API_KEY env var; summarize_and_voice.py:228…

5 findings

Severity	Finding	URL	Location
Medium	External URL	https://senseaudio.cn/platform/api-key	SKILL.md:35
Medium	External URL	https://senseaudio.cn/docs/	SKILL.md:119
Medium	External URL	https://www.xiaohongshu.com/explore/...	SKILL.md:157
Medium	External URL	https://senseaudio.cn/docs/text_to_speech_api	SKILL.md:197
Medium	External URL	https://api.senseaudio.cn	scripts/summarize_and_voice.py:25
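One way to keep the TTS destination listed above from drifting is an explicit host allow-list checked before any outbound API call. This is a hedged sketch only: `ALLOWED_HOSTS` and `is_allowed_url` are hypothetical names, and the check covers the TTS endpoint, not the crawl targets handled by MediaCrawler.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list built from the senseaudio.cn hosts named in this report.
ALLOWED_HOSTS = {"senseaudio.cn", "api.senseaudio.cn"}


def is_allowed_url(url: str) -> bool:
    """Return True only for HTTPS URLs whose host is exactly on the allow-list."""
    parts = urlsplit(url)
    # Exact hostname comparison avoids look-alike hosts that merely
    # start with an allowed name.
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS
```

An exact match is used instead of a substring or prefix test so that a host which only begins with `senseaudio.cn` is rejected.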

File Tree

4 files · 22.1 KB · 554 lines

Python 1f · 303L Markdown 1f · 197L Shell 2f · 54L

├─ scripts
│ ├─ ensure_mediacrawler.sh Shell 19L · 630 B
│ ├─ setup_env.sh Shell 35L · 1.1 KB
│ └─ summarize_and_voice.py Python 303L · 11.1 KB
└─ SKILL.md Markdown 197L · 9.4 KB

Dependencies 2 items

Package	Version	Source	Known Vulns	Notes
`requests`	`*`	pip	No	Imported in summarize_and_voice.py but installed with unpinned version in setup_env.sh
`MediaCrawler requirements.txt`	`unpinned`	pip	No	Cloned from GitHub and installed via pip install -r requirements.txt without version pins
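Because both dependencies are unpinned, a small pre-install check could flag requirement lines that lack an exact `==` version. The sketch below is illustrative: `unpinned_requirements` and the regular expression are hypothetical helpers, and the pattern is a simplification that does not cover every requirement-specifier form pip accepts.

```python
import re

# Matches "name==version", optionally with extras such as "name[extra]==1.0".
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?\s*==\s*[^\s;]+")


def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that are not pinned to an exact version."""
    bad = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue  # skip blanks and pip options such as -r or --hash
        if not PINNED.match(line):
            bad.append(line)
    return bad
```

Running this on the contents of `requirements.txt` before `pip install` and failing when the returned list is non-empty would catch entries such as a bare package name or a `>=` range.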

Security Positives

✓ No credential harvesting — API key is only read from environment and used only for TTS authentication

✓ No sensitive path access — no access to ~/.ssh, ~/.aws, .env, or similar credential paths

✓ No obfuscation — all code is plain text, no base64, eval, or encoded payloads

✓ No reverse shell or C2 communication — network calls only go to documented senseaudio.cn API

✓ Subprocess usage is documented and limited to audio playback and legitimate CLI tools (afplay, paplay, aplay, ffplay)

✓ Report generation reads only from the declared data directory and produces output locally

✓ No hidden functionality — all behavior matches the SKILL.md documentation
