Trusted — Risk score: 5/100
Last scanned: 2 days ago
TinyScraper
Simple static-site mirroring crawler - mirrors websites to local storage for offline viewing
TinyScraper is a legitimate static website mirroring tool that performs exactly as documented, using only the Python 3 standard library; no malicious behavior was detected.
Skill name: TinyScraper
Analysis time: 39.4s
Engine: pi
Verdict: safe to install
No action needed. This skill is safe to use.

Security findings (2)

Severity · Finding · Location
Low
robots.txt is ignored
The crawler explicitly ignores robots.txt when crawling websites, which is a common practice for mirroring tools but may have legal implications.
→ Consider adding a --respect-robots option for compliance with target site preferences
references/SPEC.md:92
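The suggested --respect-robots option could be built on the standard library's urllib.robotparser. This is a sketch under that assumption, not TinyScraper's actual code; `allowed_by_robots` is a hypothetical helper name.

```python
from urllib import robotparser

def allowed_by_robots(rules_lines, url, user_agent="TinyScraper"):
    """Check a URL against already-fetched robots.txt lines.

    In the tool, rules_lines would come from downloading
    <site>/robots.txt once per domain before crawling starts.
    """
    rp = robotparser.RobotFileParser()
    rp.parse(rules_lines)  # parse accepts an iterable of robots.txt lines
    return rp.can_fetch(user_agent, url)

rules = ["User-agent: *", "Disallow: /private/"]
allowed_by_robots(rules, "https://example.com/page.html")  # → True
allowed_by_robots(rules, "https://example.com/private/x")  # → False
```

Parsing pre-fetched lines (rather than calling `RobotFileParser.read()`) keeps the check testable offline and avoids a second network dependency in the crawl loop.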
Low
No allowed-tools declaration in SKILL.md
SKILL.md does not explicitly declare the allowed-tools mapping, though implementation is consistent with declared functionality
No allowed-tools section found
→ Add allowed-tools declaration for transparency
SKILL.md:1
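A minimal frontmatter sketch for the suggested fix, assuming the common SKILL.md convention of a YAML frontmatter block; the tool names shown are placeholders, not the skill's verified requirements:

```yaml
---
name: TinyScraper
description: Simple static-site mirroring crawler
# Hypothetical allowed-tools declaration; list only the tools the skill actually uses
allowed-tools: Read, Write
---
```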
Resource | Declared | Inferred | Status | Evidence
Filesystem | WRITE | WRITE | ✓ Consistent | lib/crawler.py:line 63 - ensure_dir(self.base_dir); lib/crawler.py:line 185 - se…
Network | READ | READ | ✓ Consistent | lib/crawler.py:line 162 - urllib.request.urlopen(req, timeout=timeout)
Command execution | NONE | NONE | — | No subprocess or shell execution found
Environment variables | NONE | READ | ✓ Consistent | lib/crawler.py:line 37 - os.environ.get('OPENCLAW_WORKSPACE') for workspace conf…
Skill invocation | NONE | NONE | — | No skill invocation detected
Clipboard | NONE | NONE | — | No clipboard access detected
Browser | NONE | NONE | — | No browser automation detected
Database | NONE | NONE | — | No database access detected
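The environment-variable read cited at lib/crawler.py:line 37 boils down to a single lookup. A minimal sketch, assuming a temp-directory fallback that the report's truncated evidence does not show:

```python
import os
import tempfile

# Read the workspace path from OPENCLAW_WORKSPACE, as the evidence shows.
# The tempfile fallback is an assumption, not necessarily the tool's default.
workspace = os.environ.get("OPENCLAW_WORKSPACE") or tempfile.gettempdir()
```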
External references (2 findings)
🔗 Medium · External URL
https://other.com/about
scripts/test_crawler.py:70
🔗 Medium · External URL
https://external.com
scripts/test_crawler.py:92

目录结构

4 files · 37.0 KB · 1150 lines
Python: 2 files · 885 lines · Markdown: 2 files · 265 lines
├─ 📁 lib
│ └─ 🐍 crawler.py Python 630L · 22.4 KB
├─ 📁 references
│ └─ 📝 SPEC.md Markdown 132L · 2.9 KB
├─ 📁 scripts
│ └─ 🐍 test_crawler.py Python 255L · 8.4 KB
└─ 📝 SKILL.md Markdown 133L · 3.2 KB

Dependency analysis (1)

Package | Version | Source | Known vulnerabilities | Notes
Python 3 standard library | Built-in | stdlib | — | urllib.request, html.parser, re, os, tempfile - all standard library

Security highlights

✓ Uses only the Python 3 standard library (no external dependencies)
✓ No subprocess, shell execution, or system command calls
✓ No credential harvesting or environment variable exfiltration
✓ No base64/encoded payloads or eval() calls
✓ No hidden functionality - code matches documentation
✓ Same-domain restriction prevents unintended external requests
✓ Configurable request delay (DELAY) prevents abuse
✓ Clean BFS crawling algorithm with proper URL normalization
✓ No data exfiltration or external IP communications beyond target URL
✓ Well-documented with SKILL.md and SPEC.md
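The same-domain restriction, configurable delay, and BFS crawling highlights above can be sketched in a few lines. This is an illustrative reconstruction, not TinyScraper's actual code: `crawl_same_domain` and `fetch_links` are hypothetical names, and `fetch_links` stands in for the real HTTP fetch so the pattern can be shown without network access.

```python
import time
from collections import deque
from urllib.parse import urljoin, urlparse, urldefrag

def crawl_same_domain(start_url, fetch_links, delay=0.0, max_pages=100):
    """BFS over pages, restricted to the start URL's domain.

    fetch_links(url) -> iterable of raw hrefs found on that page.
    """
    domain = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    order = []
    while queue and len(order) < max_pages:
        url = queue.popleft()
        order.append(url)
        for href in fetch_links(url):
            # Normalize: resolve relative links and drop #fragments
            absolute, _ = urldefrag(urljoin(url, href))
            # Same-domain restriction: skip external hosts entirely
            if urlparse(absolute).netloc != domain:
                continue
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
        if delay:
            time.sleep(delay)  # configurable politeness delay between requests
    return order

# Usage with an in-memory link graph standing in for HTTP fetches
pages = {
    "https://example.com/": ["/a", "https://other.com/x"],
    "https://example.com/a": ["/", "/b#frag"],
    "https://example.com/b": [],
}
visited = crawl_same_domain("https://example.com/", lambda u: pages.get(u, []))
# visited contains only example.com pages, in BFS order
```

Note how the external https://other.com/x link is discovered but never enqueued, which is exactly the behavior the "no external requests beyond the target URL" highlight describes.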