alphapai-scraper Security Scan Report: Low Risk | ClawSafe

20/100

alphapai-scraper

AlphaPai financial comment scraper and archival search tool

A legitimate, functional web scraping and archival skill for AlphaPai financial platform with minor documentation gaps but no malicious behavior detected.

Skill name: alphapai-scraper

Analysis time: 77.6s

Engine: pi

✓ Safe to install

The skill is safe to use. Consider pinning dependency versions in setup.sh and documenting subprocess usage for osascript and openclaw agent calls in SKILL.md for full transparency.

Security findings: 3

Severity	Finding	Location
Low	Undeclared subprocess execution (documentation deception). SKILL.md does not mention that run.py executes `osascript` for macOS notifications, that analyze.py invokes `openclaw agent` as a subprocess for AI analysis, or that publish_skill.py calls the `clawhub` CLI. These are legitimate tools, but they should be declared. Evidence: `subprocess.run(['osascript', '-e', script], check=False, ...)`. Recommendation: add shell:WRITE to the allowed-tools section of SKILL.md and document the osascript and openclaw agent usage.	`scripts/run.py:43`
Low	Unpinned playwright dependency (supply chain). setup.sh installs playwright without pinning a version, so pip resolves whatever release is current at install time, including a compromised one. Evidence: `pip3 install playwright --quiet`. Recommendation: pin to a known-safe version, e.g. `pip3 install playwright==1.44.0 --quiet`.	`setup.sh:15`
Low	Undeclared Chrome profile directory access (sensitive access). The scraper reads `~/Library/Application Support/Google/Chrome` to authenticate with an existing browser profile. This is a sensitive path that should be declared in SKILL.md. Evidence: `profile_user_data_dir: ~/Library/Application Support/Google/Chrome`. Recommendation: document browser:WRITE and the Chrome profile access in SKILL.md, as it implies access to all Chrome-stored credentials and sessions.	`config/settings.example.json:19`
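
To make the first finding concrete, here is a minimal sketch of what a macOS notification sent through `osascript` typically looks like. Only the `subprocess.run(['osascript', '-e', script], check=False, ...)` shape comes from the report; the function names, the AppleScript string, and the escaping helper are assumptions for illustration and are not taken from run.py.

```python
import subprocess
import sys


def build_notification_command(title: str, message: str) -> list[str]:
    """Build the argv for an osascript notification (hypothetical helper)."""

    def escape(text: str) -> str:
        # Escape backslashes and double quotes for an AppleScript string literal.
        return text.replace("\\", "\\\\").replace('"', '\\"')

    script = (
        f'display notification "{escape(message)}" '
        f'with title "{escape(title)}"'
    )
    return ["osascript", "-e", script]


def notify(title: str, message: str) -> bool:
    """Send the notification on macOS; do nothing on other platforms."""
    if sys.platform != "darwin":
        return False
    # check=False mirrors the call quoted in the report: a failed
    # notification should not abort the scraping run.
    result = subprocess.run(
        build_notification_command(title, message),
        check=False,
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

Passing the command as a list rather than a shell string avoids shell interpolation of the message, which is one reason a call of this shape is low risk even when undeclared.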

Resource type	Declared permission	Inferred permission	Status	Evidence
Filesystem	`WRITE`	`WRITE`	✓ Consistent	SKILL.md declares data storage; archive_store.py writes SQLite, raw, normalized …
Network access	`READ`	`WRITE`	✓ Consistent	SKILL.md declares AlphaPai scraping; send_feishu.py POSTs to Feishu webhook (dec…
Command execution	`NONE`	`WRITE`	✓ Consistent	run.py:43 osascript subprocess; analyze.py:112 openclaw agent subprocess; publis…
Environment variables	`READ`	`READ`	✓ Consistent	common.py:95-106 reads ALPHAPAI_* env vars; load_auth_bundle reads USER_AUTH_TOK…
Skill invocation	`NONE`	`WRITE`	✓ Consistent	analyze.py:112 invokes 'openclaw agent --message' as subprocess
Clipboard	`NONE`	`NONE`	-
Browser	`READ`	`WRITE`	✓ Consistent	SKILL.md declares browser-based scraping; scraper.py uses Playwright with WRITE …
Database	`WRITE`	`WRITE`	✓ Consistent	archive_store.py creates SQLite (FTS5) and Chroma vector index

3 findings

Medium: External URL

https://alphapai-web.rabyte.cn

config/settings.example.json:3

Medium: External URL

https://alphapai-web.rabyte.cn/login

config/settings.example.json:4

Medium: External URL

https://alphapai-web.rabyte.cn/reading/home/comment

config/settings.example.json:5

Directory structure

19 files · 109.3 KB · 3476 lines

Python: 11 files, 2929 lines · Markdown: 2 files, 385 lines · Shell: 1 file, 91 lines · JSON: 4 files, 64 lines · YAML: 1 file, 7 lines

├─ 📁 agents
│ └─ 📋 openai.yaml (YAML, 7 lines, 344 B)
├─ 📁 config
│ ├─ 📋 cookies.example.json (JSON, 12 lines, 211 B)
│ ├─ 🔑 credentials.example.json ⚠ (JSON, 4 lines, 70 B)
│ ├─ 📋 settings.example.json (JSON, 43 lines, 1.2 KB)
│ └─ 🔑 token.example.json ⚠ (JSON, 5 lines, 120 B)
├─ 📁 scripts
│ ├─ 🐍 analyze.py (Python, 287 lines, 9.3 KB)
│ ├─ 🐍 archive_store.py (Python, 748 lines, 24.5 KB)
│ ├─ 🐍 bootstrap_session.py (Python, 84 lines, 3.0 KB)
│ ├─ 🐍 common.py (Python, 320 lines, 9.8 KB)
│ ├─ 🐍 init_config.py (Python, 94 lines, 2.6 KB)
│ ├─ 🐍 package_skill.py (Python, 57 lines, 1.3 KB)
│ ├─ 🐍 publish_skill.py (Python, 105 lines, 2.6 KB)
│ ├─ 🐍 query_comments.py (Python, 228 lines, 7.9 KB)
│ ├─ 🐍 run.py (Python, 177 lines, 5.8 KB)
│ ├─ 🐍 scraper.py (Python, 755 lines, 24.6 KB)
│ └─ 🐍 send_feishu.py (Python, 74 lines, 2.1 KB)
├─ 📝 README.md (Markdown, 236 lines, 5.7 KB)
├─ 🔧 setup.sh (Shell, 91 lines, 3.2 KB)
└─ 📝 SKILL.md (Markdown, 149 lines, 4.8 KB)

Dependency analysis: 4 items

Package	Version	Source	Known vulnerabilities	Notes
`playwright`	`*`	pip	No	Version not pinned in setup.sh
`chromadb`	`*`	pip	No	Version not pinned; used for local vector search
`sentence-transformers`	`*`	pip	No	Version not pinned; loads bge-small-zh-v1.5 model locally
`openclaw`	`CLI`	system	No	Called via subprocess for AI summarization
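
Three of the four dependencies above are flagged as unpinned. As a rough illustration of how such a check could be automated, the sketch below scans shell script text for `pip install` commands and reports package arguments that lack an `==` pin. The function name and the regular expression are assumptions, not part of the scanner or the skill, and the parsing is deliberately simple: it does not handle options that take a value (such as `-r requirements.txt`) or line continuations.

```python
import re

# Matches "pip install ..." and "pip3 install ..." and captures the arguments.
PIP_INSTALL = re.compile(r"\bpip3?\s+install\s+(.*)")


def find_unpinned(script_text: str) -> list[str]:
    """Return package arguments of pip install commands that have no '==' pin."""
    unpinned = []
    for line in script_text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing shell comments
        match = PIP_INSTALL.search(line)
        if not match:
            continue
        for token in match.group(1).split():
            token = token.strip("'\"")
            if not token or token.startswith("-"):
                continue  # skip flags such as --quiet
            if "==" not in token:
                unpinned.append(token)
    return unpinned


if __name__ == "__main__":
    example = (
        "pip3 install playwright --quiet\n"
        "pip3 install playwright==1.44.0 --quiet\n"
    )
    print(find_unpinned(example))
```

Run against the two command forms quoted in the findings table, only the first (unpinned) `playwright` install would be reported.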

Security highlights

✓ No evidence of credential exfiltration — tokens and cookies stay local to configured storage paths

✓ No obfuscation, base64 payloads, or anti-analysis techniques detected

✓ No network requests to IP addresses or suspicious domains beyond the declared AlphaPai platform and Feishu webhook

✓ package_skill.py correctly excludes all credential files before publishing (token, cookies, credentials, settings local files)

✓ Credential handling follows a clear, documented priority chain with no surprise collection

✓ No reverse shell, C2 communication, or data theft patterns found

✓ Code is clean, readable, and well-structured with no hidden functionality

✓ `normalize_cookies()` in common.py filters cookie keys to a safe allowlist (name, value, domain, etc.)
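
The allowlist filtering described in the last highlight can be sketched as follows. This is not the code from common.py: the function name `normalize_cookies_sketch` is hypothetical, and the key list beyond `name`, `value`, and `domain` is an assumption based on the cookie fields Playwright accepts.

```python
# Assumed allowlist: the report names only name, value, and domain;
# the remaining keys are the usual Playwright cookie fields.
ALLOWED_COOKIE_KEYS = (
    "name", "value", "domain", "path",
    "expires", "httpOnly", "secure", "sameSite",
)


def normalize_cookies_sketch(raw_cookies):
    """Keep only allowlisted keys and drop entries without a name and value."""
    normalized = []
    for cookie in raw_cookies:
        if not isinstance(cookie, dict):
            continue  # ignore malformed entries
        cleaned = {key: cookie[key] for key in ALLOWED_COOKIE_KEYS if key in cookie}
        if "name" in cleaned and "value" in cleaned:
            normalized.append(cleaned)
    return normalized
```

Filtering by allowlist rather than by blocklist means that any unexpected field in an exported cookie file (for example, browser-extension metadata) is dropped by default instead of being passed through to the browser context.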