lora-pipeline Security Report — Trusted | ClawSafe

5/100

lora-pipeline

End-to-end LoRA training pipeline: reference photo collection → face verification → dataset scraping → quality check → WD14 captioning → RunPod training

Legitimate LoRA training pipeline skill with clear documentation, enforced local-only face verification, and appropriate cloud infrastructure usage for ML training.

Skill Name: lora-pipeline

Duration: 44.8s

Engine: pi

✓

Safe to install

No action needed. The skill is well-structured with explicit privacy rules and correct use of subprocess for training.

Findings 2 items

Severity	Finding	Location
Low	Hardcoded user home path in tag_batch.py (Doc Mismatch). Model paths in tag_batch.py are hardcoded under /Users/mini/.openclaw/workspace/tools/wd14-tagger/models/, which is a code quality issue rather than a security risk. Evidence: `model_path = "/Users/mini/.openclaw/workspace/tools/wd14-tagger/models/model.onnx"`. Recommendation: use os.path.expanduser('~/.openclaw/...'), an environment variable, or a configurable path.	`scripts/tag_batch.py:17`
Info	No _meta.json or allowed-tools declaration (Doc Mismatch). SKILL.md lacks an allowed-tools declaration block. This is not a security issue, but it makes capability mapping harder for automated tools. Evidence: `No allowed-tools section`. Recommendation: add an allowed-tools block at the top of SKILL.md matching the observed capabilities.	`SKILL.md:1`
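
The recommendation for the Low finding could look roughly like the sketch below. It is only an illustration of the suggested fix, not code from the skill: the environment variable name `WD14_MODEL_DIR` and the helper `resolve_model_path` are hypothetical, and the default directory simply mirrors the layout quoted in the finding, rooted at the current user's home.

```python
import os
from pathlib import Path

# Hypothetical variable name (an assumption); the skill does not define one.
MODEL_DIR_ENV = "WD14_MODEL_DIR"

# Same directory layout as in the finding, but relative to "~" instead of /Users/mini.
DEFAULT_MODEL_DIR = Path("~/.openclaw/workspace/tools/wd14-tagger/models")


def resolve_model_path(filename: str = "model.onnx") -> Path:
    """Return the model path, preferring an explicit override over the default."""
    override = os.environ.get(MODEL_DIR_ENV)
    base = Path(override) if override else DEFAULT_MODEL_DIR
    # expanduser() turns a leading "~" into the current user's home directory.
    return base.expanduser() / filename
```

In `tag_batch.py`, the hardcoded assignment would then become something like `model_path = str(resolve_model_path())`. Note that reading an environment variable would change the `Environment` row in the capability table from `NONE`; using only the `expanduser` default avoids that.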

Resource	Declared	Inferred	Status	Evidence
Filesystem	`WRITE`	`WRITE`	✓ Aligned	SKILL.md: creates datasets/, scripts write outputs
Network	`WRITE`	`WRITE`	✓ Aligned	Phase 6: runpodctl + scp to RunPod cloud
Shell	`WRITE`	`WRITE`	✓ Aligned	batch_lora_train.py: subprocess for training; SKILL.md: scp/ssh for pod ops
Environment	`NONE`	`NONE`	—	No env var access detected
Skill Invoke	`NONE`	`NONE`	—	sessions_spawn documented, no cross-skill data theft
Clipboard	`NONE`	`NONE`	—	No clipboard access
Browser	`READ`	`READ`	✓ Aligned	Phase 3: browser snippets for scraping (documented)
Database	`NONE`	`NONE`	—	No database access

2 findings

Severity	Type	Value	Location
Medium	External URL	https://npcnewsonline.com/.../thumb/img001.jpg	`phases/02-scraping.md:50`
Medium	External URL	https://npcnewsonline.com/.../large/img001.jpg	`phases/02-scraping.md:51`
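
For readers who want to reproduce this kind of finding locally, the sketch below shows one plausible way to list URLs in a Markdown file together with their line numbers. It is an assumption about how such a check might work, not the scanner's actual implementation; `find_external_urls` and the regular expression are illustrative only.

```python
import re

# Deliberately simple pattern: scheme, then everything up to whitespace or a closing delimiter.
URL_PATTERN = re.compile(r"https?://[^\s)>\"']+")


def find_external_urls(text):
    """Return (line_number, url) pairs for every http(s) URL found in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in URL_PATTERN.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

Running it over the contents of `phases/02-scraping.md` should surface the two locations listed above, assuming the file is unchanged since the scan.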

File Tree

9 files · 42.4 KB · 1322 lines

Markdown: 6 files · 871 lines; Python: 3 files · 451 lines

├─ phases/
│  ├─ 01-reference.md      Markdown · 75 lines · 2.7 KB
│  ├─ 02-scraping.md       Markdown · 68 lines · 2.3 KB
│  ├─ 03-verify.md         Markdown · 223 lines · 7.3 KB
│  ├─ 04-caption.md        Markdown · 87 lines · 2.5 KB
│  └─ 05-training.md       Markdown · 326 lines · 10.0 KB
├─ scripts/
│  ├─ batch_lora_train.py  Python · 351 lines · 10.7 KB
│  ├─ smart_crop.py        Python · 38 lines · 1.1 KB
│  └─ tag_batch.py         Python · 62 lines · 2.2 KB
└─ SKILL.md                Markdown · 92 lines · 3.5 KB

Dependencies 5 items

Package	Version	Source	Known Vulns	Notes
`deepface`	`unpinned`	import	No	No requirements.txt; imported in phase docs for local face verification
`opencv-python`	`unpinned`	import	No	Standard CV2 for image processing
`onnxruntime`	`unpinned`	import	No	Local ONNX inference for WD14 tagger
`pillow`	`unpinned`	import	No	Image handling
`numpy/pandas`	`unpinned`	import	No	Data processing
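
Because the skill ships no requirements.txt and every dependency is unpinned, one low-effort remedy is to pin whatever versions are installed in an environment where the pipeline is known to work. The sketch below uses the standard-library `importlib.metadata` for that; the package list is taken from the table above (with `numpy/pandas` split into two distributions), and `pinned_requirements` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names from the dependency table; opencv-python is the package imported as cv2.
PACKAGES = ["deepface", "opencv-python", "onnxruntime", "pillow", "numpy", "pandas"]


def pinned_requirements(packages):
    """Build requirements.txt lines pinned to the versions installed locally."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            # Leave a visible marker instead of guessing a version.
            lines.append(f"# {name}: not installed here, pin manually")
    return lines


if __name__ == "__main__":
    print("\n".join(pinned_requirements(PACKAGES)))
```

Redirecting the output to `requirements.txt` gives a reproducible starting point; the versions it records are only as trustworthy as the environment it was run in.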

Security Positives

✓ Phase 4 explicitly forbids cloud face verification — 'NEVER send images to any cloud API' with clear enforcement via local DeepFace

✓ Privacy rules are prominently documented: NO DATA INSPECTION, NO CLOUD UPLOAD, NO DATA LEAKAGE

✓ Subprocess usage is exclusively for ML training (Kohya sd-scripts) — expected behavior with no exfiltration risk

✓ No base64, no eval(), no obfuscation, no hidden HTML comments anywhere in the codebase

✓ Phase 6 correctly handles credential management via RunPod SSH keys — no credential harvesting

✓ Sentry monitoring is bash-only (zero LLM API calls during polling) — quota protection is well-designed

✓ Output files are .safetensors (safe format) — no arbitrary code execution from outputs

✓ No dependencies with known vulnerabilities: all are standard ML libraries (deepface, cv2, onnxruntime), though their versions are unpinned
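
The "no eval()" positive above can be spot-checked by hand. The sketch below is one possible approach using Python's `ast` module, offered as an assumption about how such a check could be written rather than a description of the scanner's method; `find_dynamic_exec` is a hypothetical name, and the check only catches direct calls to `eval` or `exec` by name.

```python
import ast

# Built-ins that execute strings as code.
SUSPICIOUS_CALLS = {"eval", "exec"}


def find_dynamic_exec(source):
    """Return sorted (line_number, function_name) pairs for direct eval/exec calls."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in SUSPICIOUS_CALLS
        ):
            hits.append((node.lineno, node.func.id))
    return sorted(hits)
```

Applied to the source text of each file under `scripts/`, an empty result for all three would be consistent with the report. Indirect uses (for example via `getattr` or aliases) are outside what this sketch detects.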
