Low risk — risk score 15/100
Last scanned: 19 hours ago
opencrawl
Crawl any JavaScript-rendered webpage through distributed real Chrome browsers
OpenCrawl is a straightforward web-crawling skill that proxies requests through a distributed Chrome browser pool. The implementation is clean with no obfuscation, no credential harvesting, no shell execution, and full doc-to-code alignment.
Skill name: opencrawl
Analysis time: 41.0 s
Engine: pi
Safe to install
Safe to use. The hardcoded default IP (39.105.206.76) is acceptable as a public-server default, but users should be aware that all crawl requests route through it. For maximum privacy, deploy a self-hosted instance.

Security findings (2)

Severity | Finding | Location
Low
Hardcoded default API endpoint IP address (Sensitive access)
The default API_URL is hardcoded as 'http://39.105.206.76:9877' in crawl.py:18. This IP appears in both README.md and SKILL.md. While it has a plausible explanation (public hosted server), routing all requests through a third-party IP introduces a dependency on an external, non-TLS endpoint.
API_URL = os.environ.get("OPENCRAWL_API_URL", "http://39.105.206.76:9877")
→ Use HTTPS and make the default a domain name rather than raw IP. Users should configure their own self-hosted instance for production privacy.
tools/crawl.py:18
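The recommended mitigation is already supported by the lookup pattern quoted above: the endpoint can be overridden via the documented OPENCRAWL_API_URL environment variable. A minimal sketch, mirroring crawl.py:18 and adding a non-HTTPS warning (the warning logic is our suggestion, not part of the skill):

```python
import os
from urllib.parse import urlparse

# Fallback mirrors crawl.py:18 — a raw-IP, plain-HTTP public server.
DEFAULT_API_URL = "http://39.105.206.76:9877"

def resolve_api_url() -> str:
    """Return the crawl endpoint, preferring a user-configured instance."""
    url = os.environ.get("OPENCRAWL_API_URL", DEFAULT_API_URL)
    if urlparse(url).scheme != "https":
        # Warn rather than fail: the shipped default is HTTP-only.
        print(f"warning: {url} is not HTTPS; crawl traffic is unencrypted")
    return url
```

Setting `OPENCRAWL_API_URL` to a self-hosted HTTPS instance removes both the third-party routing and the cleartext-transport concerns.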
Low
Dependency version not strictly pinned (Supply chain)
requirements.txt specifies 'requests>=2.28.0' without an upper bound, allowing automatic minor/patch updates. This could theoretically allow a malicious package update, though no known vulnerabilities exist at current version.
requests>=2.28.0
→ Pin to a specific version, e.g., 'requests==2.31.0', for reproducible builds.
requirements.txt:1
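The suggested fix is a one-line change to the quoted requirements.txt. A sketch of the pinned form (the exact version to pin is the report's example; regenerate the pin from your own environment):

```
# Pinned for reproducible installs; regenerate with `pip freeze` or pip-tools.
requests==2.31.0
```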
Resource type | Declared | Inferred | Status | Evidence
Filesystem | NONE | NONE | — | crawl.py:1-174 — no file read/write operations
Network access | READ | READ | ✓ consistent | crawl.py:44-59 — requests.post to the API, requests.get for the R2 download URL
Command execution | WRITE | WRITE | ✓ consistent | SKILL.md declares the Bash tool; crawl.py is executed via python3 by the agent
Environment variables | READ | READ | ✓ consistent | crawl.py:17-18 — reads OPENCRAWL_API_KEY and OPENCRAWL_API_URL only
Skill invocation | NONE | NONE | — | no inter-skill invocation detected
Clipboard | NONE | NONE | — | no clipboard access in crawl.py
Browser | NONE | NONE | — | no local browser; remote Chrome workers are accessed via the API only
Database | NONE | NONE | — | no database access
Indicators: 4 findings (1 high-risk)
📡
High · Hardcoded IP address
39.105.206.76
README.md:15
🔗
Medium · External URL
http://39.105.206.76:9877
README.md:19
🔗
Medium · External URL
https://clawhub.ai/hlyylly/chromeopencrawl
README.md:58
🔗
Medium · External URL
https://www.smzdm.com/p/170177008/
SKILL.md:55

Directory structure

4 files · 11.5 KB · 364 lines
Markdown 2 files · 189 lines | Python 1 file · 174 lines | Text 1 file · 1 line
├─ 📁 tools
│ └─ 🐍 crawl.py Python 174L · 5.4 KB
├─ 📝 README.md Markdown 62L · 2.0 KB
├─ 📄 requirements.txt Text 1L · 17 B
└─ 📝 SKILL.md Markdown 127L · 4.1 KB

Dependency analysis (1)

Package | Version | Source | Known vulnerabilities | Notes
requests | >=2.28.0 | pip | none known | version not strictly pinned — minor/patch updates allowed

Security highlights

✓ No obfuscation detected — no base64, no eval(), no dynamic code execution
✓ Full doc-to-code alignment — all 5 declared commands (crawl, search, balance, status, raw/lite modes) are implemented in crawl.py
✓ No credential exfiltration — API key is used only for Bearer token auth, never read from environment for export
✓ No shell injection vectors — all arguments are passed as argparse parameters, not concatenated into shell strings
✓ No sensitive path access — script does not read ~/.ssh, ~/.aws, .env, or any credential files
✓ No persistence mechanisms — no cron jobs, startup hooks, or backdoor installations
✓ Minimal attack surface — only 174 lines of straightforward API wrapper code
✓ No hidden functionality — no secret subcommands or undocumented endpoints
✓ Clear error handling — all exceptions are caught and reported via JSON stderr
✓ API key is scoped — only reads OPENCRAWL_API_KEY and OPENCRAWL_API_URL from environment
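The Bearer-auth and argparse patterns the highlights describe can be sketched as follows. This is a hypothetical reconstruction of the wrapper's shape, not the actual crawl.py: the `/crawl` path, the `mode` parameter, and both function names are our assumptions.

```python
import argparse
import os

import requests  # the skill's single declared dependency

API_URL = os.environ.get("OPENCRAWL_API_URL", "http://39.105.206.76:9877")

def auth_headers(api_key: str) -> dict:
    """The key is sent only as a Bearer token — never logged or exported."""
    return {"Authorization": f"Bearer {api_key}"}

def build_parser() -> argparse.ArgumentParser:
    """All input arrives via argparse; nothing is concatenated into shell strings."""
    parser = argparse.ArgumentParser(description="OpenCrawl wrapper sketch")
    parser.add_argument("url", help="page to crawl")
    parser.add_argument("--mode", choices=["raw", "lite"], default="raw")
    return parser

def crawl(url: str, mode: str = "raw", timeout: int = 60) -> dict:
    """POST a crawl request to the configured endpoint and return its JSON."""
    resp = requests.post(
        f"{API_URL}/crawl",  # endpoint path is assumed, not taken from crawl.py
        json={"url": url, "mode": mode},
        headers=auth_headers(os.environ.get("OPENCRAWL_API_KEY", "")),
        timeout=timeout,
    )
    resp.raise_for_status()
    return resp.json()
```

This shape explains why the scan finds no shell-injection vectors: user input only ever reaches an HTTP request body, never a subprocess.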