web-reader Security Report — Low Risk | ClawSafe

22/100

web-reader

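Smart web reader: fetches articles and downloads videos for archiving, with support for analysis, summarization, and derivative content.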

Legitimate web scraping tool with no malicious behavior, but supply chain risk from unpinned Python dependencies.

Skill Name: web-reader

Duration: 56.8s

Engine: pi

✓ Safe to install

Pin dependency versions (e.g., pip install scrapling==x.y.z yt-dlp==x.y.z) and consider adding hash pinning for enhanced supply chain security.
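
One way to act on this recommendation is to check a requirements list for entries that lack an exact pin. The sketch below is a simplified illustration, not part of web-reader: the function name `unpinned_requirements` is hypothetical, and the parsing is deliberately minimal rather than a full PEP 508 parser.

```python
import re

# Simplified pattern: name, optional [extras], then an exact '==' version.
# Assumption: wildcard pins such as '==1.*' do not count as pinned.
_PINNED = re.compile(r"[A-Za-z0-9._-]+(\[[^\]]+\])?\s*==\s*[A-Za-z0-9.!+_-]+")


def unpinned_requirements(lines):
    """Return the requirement lines that are not pinned with '=='."""
    flagged = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):
            continue  # skip blanks, pip options, and '--hash=' continuations
        spec = line.split(";", 1)[0]          # drop environment markers
        spec = spec.split(" --hash", 1)[0]    # drop inline hashes
        spec = spec.rstrip("\\").strip()      # drop line-continuation marker
        if not _PINNED.fullmatch(spec):
            flagged.append(line)
    return flagged


if __name__ == "__main__":
    # 'x.y.z' is a placeholder, as in the recommendation above.
    print(unpinned_requirements(["scrapling", "yt-dlp==x.y.z"]))  # ['scrapling']
```

A check like this could run in CI so that an unpinned entry fails the build before it reaches users.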

Findings 2 items

Severity	Finding	Location
Medium	Dependencies not pinned to specific versions (Supply Chain). check_dependency() in router.py:27-35 calls __import__() without any version check, and the SKILL.md installation instructions use 'pip install scrapling' with no version pin. An attacker who compromises PyPI or a transitive dependency could serve a malicious update. Evidence: `try:\n __import__(name)\n return True\nexcept ImportError:\n print(f"[!] {name} not found. Install: {hint}")` Recommendation: pin all dependencies to exact versions (e.g., scrapling==x.y.z), or at minimum constrain them to a bounded range (e.g., scrapling>=0.2.0,<1.0.0), and consider adding pip hash pinning for production deployments.	`lib/router.py:30`
Low	Bundled third-party library without integrity verification (Supply Chain). lib/readability.js is a 2314-line bundled copy of Mozilla's Readability library (Apache 2.0). It is evaluated via page.evaluate() in camoufox, and no integrity check (e.g., comparing a checksum against a known-good value) is performed. Evidence: `/*\n * Copyright (c) 2010 Arc90 Inc\n * Licensed under the Apache License, Version 2.0\n */` Recommendation: verify the bundled readability.js against the published hash from the official Mozilla/GitHub source. Consider fetching it at install time with a pinned hash rather than committing a static copy.	`lib/readability.js:1`
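
The first finding notes that check_dependency() only tests whether a module imports. A version-aware variant could look roughly like the sketch below. It is an illustration under assumptions, not the code in lib/router.py: the `minimum` parameter and the helpers `parse_version` and `at_least` are hypothetical, the version parser handles only simple dotted numeric versions, and the lookup uses the distribution name, which can differ from the import name.

```python
from importlib import metadata


def parse_version(text):
    """Turn '1.2.3' into (1, 2, 3); parsing stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def at_least(installed, minimum):
    """Compare two dotted versions, padding the shorter one with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_dependency(dist_name, minimum, hint):
    """Return True only if dist_name is installed at version >= minimum."""
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        print(f"[!] {dist_name} not found. Install: {hint}")
        return False
    if not at_least(installed, minimum):
        print(f"[!] {dist_name} {installed} is older than {minimum}. Install: {hint}")
        return False
    return True
```

A runtime check of this kind complements pinning but does not replace it: it can reject an unexpectedly old install, yet it cannot tell a legitimate release from a tampered one.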

Resource	Declared	Inferred	Status	Evidence
Network	`READ`	`READ`	✓ Aligned	SKILL.md declares network access for web fetching; article.py and video.py perfo…
Filesystem	`WRITE`	`WRITE`	✓ Aligned	SKILL.md declares file writes for archive output; article.py:44 writes md_path, …
Shell	`WRITE`	`WRITE`	✓ Aligned	SKILL.md declares subprocess calls to scrapling/yt-dlp/camoufox; article.py:68,9…
Environment	`NONE`	`NONE`	—	No os.environ iteration for secrets observed
credential	`NONE`	`NONE`	—	--cookies-browser allows browser cookie pass-through for yt-dlp, but no exfiltra…

15 findings

Severity	Type	URL	Location
Medium	External URL	https://mp.weixin.qq.com/s/xxx	`README.md:49`
Medium	External URL	https://b23.tv/xxx	`README.md:55`
Medium	External URL	https://mmbiz.qpic.cn...	`SKILL.md:53`
Medium	External URL	https://mp.weixin.qq.com/	`SKILL.md:157`
Medium	External URL	https://mmbiz\.qpic\.cn[^	`lib/article.py:200`
Medium	External URL	https://www.toutiao.com/	`lib/article.py:237`
Medium	External URL	http://www.apache.org/licenses/LICENSE-2.0	`lib/readability.js:8`
Medium	External URL	http://code.google.com/p/arc90labs-readability	`lib/readability.js:19`
Medium	External URL	https://developer.mozilla.org/en-US/docs/Web/API/Node/nodeType	`lib/readability.js:103`
Medium	External URL	https://en.wikipedia.org/wiki/Comma#Comma_variants	`lib/readability.js:145`
Medium	External URL	https://schema.org/Article	`lib/readability.js:147`
Medium	External URL	http://mobile.slate.com	`lib/readability.js:992`
Medium	External URL	https://developer.mozilla.org/en-US/docs/Web/Guide/HTML/Content_categories#Phrasing_content	`lib/readability.js:1708`
Medium	External URL	https://searchfox.org/mozilla-central/rev/f82d5c549f046cb64ce5602bfd894b7ae807c8f8/accessible/generic/TableAccessible.cp...	`lib/readability.js:1924`
Medium	External URL	https://mmbiz.qpic.cn/...	`references/platforms.md:10`

File Tree

12 files · 120.4 KB · 3443 lines

JavaScript: 1 file, 2314 lines · Python: 7 files, 726 lines · Markdown: 4 files, 403 lines

├─ lib/
│ ├─ __init__.py (Python, 0 B)
│ ├─ article.py (Python, 311 lines, 11.0 KB)
│ ├─ feishu.py (Python, 182 lines, 6.2 KB)
│ ├─ readability.js (JavaScript, 2314 lines, 82.1 KB)
│ ├─ router.py (Python, 72 lines, 2.7 KB)
│ ├─ utils.py (Python, 20 lines, 635 B)
│ └─ video.py (Python, 59 lines, 1.7 KB)
├─ references/
│ ├─ extending.md (Markdown, 85 lines, 2.6 KB)
│ └─ platforms.md (Markdown, 82 lines, 2.8 KB)
├─ fetcher.py (Python, 82 lines, 3.1 KB)
├─ README.md (Markdown, 60 lines, 1.3 KB)
└─ SKILL.md (Markdown, 176 lines, 6.3 KB)

Dependencies 5 items

Package	Version	Source	Known Vulns	Notes
`scrapling`	`*`	pip	No	Version not pinned in documentation or runtime checks
`yt-dlp`	`*`	pip	No	Version not pinned; called via subprocess
`camoufox`	`*`	pip	No	Optional dependency, version not pinned
`html2text`	`*`	pip	No	Version not pinned
`readability.js`	`1.7.1`	bundled (Apache 2.0)	No	Bundled locally, no integrity verification performed
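
The missing integrity verification for the bundled readability.js could be addressed with a checksum comparison before the file is loaded. The following is a minimal sketch, assuming the expected SHA-256 digest has been obtained from a trusted upstream source and stored alongside the code; `sha256_of` and `verify_bundle` are hypothetical helper names, not functions in web-reader.

```python
import hashlib
import hmac


def sha256_of(path):
    """Return the hex SHA-256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle(path, expected_sha256):
    """Return True if the file's digest matches the expected hex digest."""
    actual = sha256_of(path)
    # compare_digest avoids short-circuiting on the first differing character
    return hmac.compare_digest(actual, expected_sha256.strip().lower())
```

A caller would refuse to pass the script to page.evaluate() when `verify_bundle("lib/readability.js", expected)` returns False, where `expected` is the pinned digest.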

Security Positives

✓ No credential theft or environment variable harvesting for exfiltration

✓ No reverse shell, C2, or data exfiltration to external IPs

✓ No base64-obfuscated payloads or eval(atob()) patterns

✓ No hidden functionality — SKILL.md accurately documents all capabilities

✓ Subprocess calls are limited to documented CLI tools (scrapling, yt-dlp, camoufox) on user-provided URLs

✓ File writes are scoped to user-specified output directories

✓ No access to ~/.ssh, ~/.aws, .env, or other sensitive paths

✓ No curl|bash or wget|sh remote script execution

✓ Browser-based fetching (camoufox/scrapling StealthyFetcher) is standard legitimate web scraping
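
To illustrate why subprocess calls on user-provided URLs can remain safe, the sketch below validates the URL scheme and builds the command as an argument list, so no shell ever interprets the input. It is an assumption-laden example rather than the skill's actual code: `build_download_command` is a hypothetical name, and the real invocation in video.py may pass different options. The `-P` option is yt-dlp's documented flag for setting the output path.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def build_download_command(url, output_dir):
    """Return an argv list for yt-dlp, rejecting anything that is not a web URL."""
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES or not parts.netloc:
        # Also rejects option-like input such as '--exec=...', which has no scheme
        raise ValueError(f"refusing non-web URL: {url!r}")
    # A list (not a string) is meant for subprocess.run(cmd) without shell=True
    return ["yt-dlp", "-P", output_dir, url]
```

The returned list would be handed to `subprocess.run(cmd, check=True)`; because the URL is a single list element, shell metacharacters in it are never expanded.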
