lora-pipeline Security Scan Report: Trusted | ClawSafe

5/100

lora-pipeline

End-to-end LoRA training pipeline: reference photo collection → face verification → dataset scraping → quality check → WD14 captioning → RunPod training

Legitimate LoRA training pipeline skill with clear documentation, enforced local-only face verification, and appropriate cloud infrastructure usage for ML training.

Skill name: lora-pipeline

Analysis time: 44.8s

Engine: pi

✓ Safe to install

No action needed. The skill is well-structured with explicit privacy rules and correct use of subprocess for training.

Security findings: 2

Severity	Finding	Location
Low	Hardcoded user home path in tag_batch.py (category: documentation deception). Model paths in tag_batch.py are hardcoded to /Users/mini/.openclaw/workspace/tools/wd14-tagger/models/, which is a code quality issue; they should come from environment variables or config. Evidence: `model_path = "/Users/mini/.openclaw/workspace/tools/wd14-tagger/models/model.onnx"`. Recommendation: use os.path.expanduser('~/.openclaw/...') or a configurable path.	`scripts/tag_batch.py:17`
Info	No _meta.json or allowed-tools declaration (category: documentation deception). SKILL.md lacks an allowed-tools declaration block. This is not a security issue, but it makes capability mapping harder for automated tools. Evidence: `No allowed-tools section`. Recommendation: add an allowed-tools block at the top of SKILL.md that matches the observed capabilities.	`SKILL.md:1`
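
The recommendation for the first finding could be sketched roughly as follows. This is an illustration only, not the skill's actual code: the environment variable `WD14_MODEL_DIR` and the helper `resolve_model_path` are hypothetical names, and the default directory is simply the reported path with the `/Users/mini` prefix replaced by `~`.

```python
import os
from pathlib import Path

# Default taken from the reported path, with the user-specific prefix replaced by "~".
DEFAULT_MODEL_DIR = "~/.openclaw/workspace/tools/wd14-tagger/models"


def resolve_model_path(filename="model.onnx", env=None):
    """Return the model path, preferring an override from the environment.

    WD14_MODEL_DIR is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    model_dir = env.get("WD14_MODEL_DIR") or DEFAULT_MODEL_DIR
    # expanduser() turns "~" into the current user's home directory.
    return Path(model_dir).expanduser() / filename


if __name__ == "__main__":
    print(resolve_model_path())
```

With this shape, the script runs unchanged on any account, and a non-default model location only requires setting one variable.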

Resource type	Declared permission	Inferred permission	Status	Evidence
Filesystem	`WRITE`	`WRITE`	✓ Consistent	SKILL.md: creates datasets/, scripts write outputs
Network access	`WRITE`	`WRITE`	✓ Consistent	Phase 6: runpodctl + scp to RunPod cloud
Command execution	`WRITE`	`WRITE`	✓ Consistent	batch_lora_train.py: subprocess for training; SKILL.md: scp/ssh for pod ops
Environment variables	`NONE`	`NONE`	-	No env var access detected
Skill invocation	`NONE`	`NONE`	-	sessions_spawn documented, no cross-skill data theft
Clipboard	`NONE`	`NONE`	-	No clipboard access
Browser	`READ`	`READ`	✓ Consistent	Phase 3: browser snippets for scraping (documented)
Database	`NONE`	`NONE`	-	No database access
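
The status column in the table above can be read as a comparison between declared and inferred access levels. The sketch below shows one way such a comparison might work. It is an assumption, not the scanner's actual logic: the ordering NONE < READ < WRITE and the labels "Exceeds declaration" and "Over-declared" are hypothetical and do not appear in the report.

```python
from enum import IntEnum


class Access(IntEnum):
    # Assumed ordering: each level includes the ones below it.
    NONE = 0
    READ = 1
    WRITE = 2


def permission_status(declared, inferred):
    """Classify one resource row by comparing declared and inferred access."""
    if declared == Access.NONE and inferred == Access.NONE:
        return "-"  # resource is neither declared nor used
    if declared == inferred:
        return "Consistent"
    if inferred > declared:
        return "Exceeds declaration"  # hypothetical label: behavior goes beyond the declaration
    return "Over-declared"  # hypothetical label: declaration is broader than observed use


if __name__ == "__main__":
    rows = {
        "Filesystem": (Access.WRITE, Access.WRITE),
        "Environment variables": (Access.NONE, Access.NONE),
        "Browser": (Access.READ, Access.READ),
    }
    for resource, (declared, inferred) in rows.items():
        print(f"{resource}: {permission_status(declared, inferred)}")
```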

2 findings

Medium: External URL

https://npcnewsonline.com/.../thumb/img001.jpg

phases/02-scraping.md:50

Medium: External URL

https://npcnewsonline.com/.../large/img001.jpg

phases/02-scraping.md:51

Directory structure

9 files · 42.4 KB · 1322 lines

Markdown: 6 files · 871 lines; Python: 3 files · 451 lines

├─ 📁 phases
│ ├─ 📝 01-reference.md Markdown 75L · 2.7 KB
│ ├─ 📝 02-scraping.md Markdown 68L · 2.3 KB
│ ├─ 📝 03-verify.md Markdown 223L · 7.3 KB
│ ├─ 📝 04-caption.md Markdown 87L · 2.5 KB
│ └─ 📝 05-training.md Markdown 326L · 10.0 KB
├─ 📁 scripts
│ ├─ 🐍 batch_lora_train.py Python 351L · 10.7 KB
│ ├─ 🐍 smart_crop.py Python 38L · 1.1 KB
│ └─ 🐍 tag_batch.py Python 62L · 2.2 KB
└─ 📝 SKILL.md Markdown 92L · 3.5 KB

Dependency analysis: 5 items

Package	Version	Source	Known vulnerabilities	Notes
`deepface`	`unpinned`	import	No	No requirements.txt; imported in phase docs for local face verification
`opencv-python`	`unpinned`	import	No	Standard CV2 for image processing
`onnxruntime`	`unpinned`	import	No	Local ONNX inference for WD14 tagger
`pillow`	`unpinned`	import	No	Image handling
`numpy/pandas`	`unpinned`	import	No	Data processing
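
Since the table above lists every dependency as unpinned and notes that there is no requirements.txt, one possible follow-up is to record the versions installed in a known-good environment. The sketch below uses the standard library's `importlib.metadata`. The helper name `pin_requirements` is hypothetical, and the package list is taken from the table, with `numpy/pandas` split into two distributions.

```python
from importlib import metadata


def pin_requirements(packages, lookup=metadata.version):
    """Return 'name==version' lines for the given distributions.

    Packages that are not installed are kept, unpinned and annotated,
    so that nothing is silently dropped from the list.
    """
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={lookup(name)}")
        except metadata.PackageNotFoundError:
            lines.append(f"{name}  # not installed, version unknown")
    return lines


if __name__ == "__main__":
    names = ["deepface", "opencv-python", "onnxruntime", "pillow", "numpy", "pandas"]
    print("\n".join(pin_requirements(names)))
```

The printed lines can be saved as requirements.txt. The `lookup` parameter exists only so the function can be exercised without the packages being installed.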

Security highlights

✓ Phase 4 explicitly forbids cloud face verification — 'NEVER send images to any cloud API' with clear enforcement via local DeepFace

✓ Privacy rules are prominently documented: NO DATA INSPECTION, NO CLOUD UPLOAD, NO DATA LEAKAGE

✓ Subprocess usage is exclusively for ML training (Kohya sd-scripts) — expected behavior with no exfiltration risk

✓ No base64, no eval(), no obfuscation, no hidden HTML comments anywhere in the codebase

✓ Phase 6 correctly handles credential management via RunPod SSH keys — no credential harvesting

✓ Sentry monitoring is bash-only (zero LLM API calls during polling) — quota protection is well-designed

✓ Output files are .safetensors (safe format) — no arbitrary code execution from outputs

✓ No dependencies with known vulnerabilities; all are standard ML libraries (deepface, cv2, onnxruntime), although their versions are unpinned
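
The highlight about .safetensors outputs follows from the file layout: an 8-byte little-endian header length, a JSON header describing each tensor, and then raw tensor bytes. Reading such a file therefore involves no unpickling or code execution. The sketch below parses only the header using the standard library. It is an illustration rather than the skill's code, and the helper name `read_safetensors_header` is hypothetical. Real code would normally use the `safetensors` library instead.

```python
import json
import struct


def read_safetensors_header(data: bytes) -> dict:
    """Parse the JSON header of a safetensors blob without touching tensor data."""
    if len(data) < 8:
        raise ValueError("too short to be a safetensors file")
    # First 8 bytes: unsigned little-endian 64-bit size of the JSON header.
    (header_len,) = struct.unpack("<Q", data[:8])
    if header_len > len(data) - 8:
        raise ValueError("declared header length exceeds file size")
    return json.loads(data[8 : 8 + header_len].decode("utf-8"))


if __name__ == "__main__":
    # Build a tiny in-memory example: one float32 tensor with two elements.
    header = {"w": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]}}
    header_bytes = json.dumps(header).encode("utf-8")
    blob = struct.pack("<Q", len(header_bytes)) + header_bytes + bytes(8)
    print(read_safetensors_header(blob))
```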