Scan Report
This report was generated in Chinese. Some content may be in Chinese.
20/100
improvement-discriminator
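Scores and ranks improvement candidates for AI Agent skills along multiple dimensions, with support for LLM-as-Judge and blind review by multiple reviewers
A legitimate AI skill scoring engine; it contains hardcoded demo passwords, but they do not affect production security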
Safe to install
Remove the hardcoded credentials from the test files and replace the demo passwords with environment variables
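One way to apply this recommendation is sketched below. The variable name `DEMO_PASSWORD` and the helper `get_demo_password` are hypothetical and do not come from the scanned code; the sketch only shows a test reading its credential from the environment instead of a string literal.

```python
import os


def get_demo_password(env_var: str = "DEMO_PASSWORD") -> str:
    """Return the demo password from the environment (hypothetical helper).

    Raises RuntimeError when the variable is unset or empty, so no literal
    credential has to live in the source tree.
    """
    value = os.environ.get(env_var)
    if not value:
        raise RuntimeError(f"Set the {env_var} environment variable before running")
    return value
```

A test would then pass `password=get_demo_password()` where it previously passed a literal, and the CI job would export the variable before running the suite.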
Findings 2 items
| Severity | Finding | Location |
|---|---|---|
| Low | Test file contains a hardcoded demo password (Supply Chain) | tests/test_p2a_integration.py:68 |
| Low | Demo password naming implies production use (Doc Mismatch) | interfaces/critic_engine.py:685 |
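The hardcoded passwords reported at these locations are string literals assigned to a `password` argument. To confirm that a cleanup removed them, a small check like the following could be run over each file. This is a minimal sketch: the regular expression and the function name `find_password_literals` are assumptions chosen for illustration and do not reflect how the scanner itself detects credentials.

```python
import re

# Assumed pattern: a name containing "password" assigned a non-empty
# quoted string literal, e.g. password="abc" or db_password = 'abc'.
PASSWORD_LITERAL = re.compile(
    r"""\b\w*password\w*\s*=\s*(["'])(?P<value>[^"']+)\1""",
    re.IGNORECASE,
)


def find_password_literals(source: str) -> list[tuple[int, str]]:
    """Return (line_number, literal_value) for each suspected hardcoded password."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in PASSWORD_LITERAL.finditer(line):
            hits.append((lineno, match.group("value")))
    return hits
```

Values read from the environment, such as `password=os.environ["DEMO_PASSWORD"]`, are not flagged because the right-hand side is not a quoted literal.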
| Resource | Declared | Inferred | Status | Evidence |
|---|---|---|---|---|
| Filesystem | NONE | READ | ✓ Aligned | scripts/score.py: reads candidate data from a JSON file |
| Network | NONE | READ | ✓ Aligned | interfaces/llm_judge.py: calls the Claude/OpenAI API through the SDK |
| Shell | NONE | NONE | - | No shell execution code |
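2 findings (2 High)
| Severity | Type | Description | Match | Location |
|---|---|---|---|---|
| High | API Key | Suspected hardcoded credential | `password="DEMO_ONLY_NOT_FOR_PRODUCTION"` | interfaces/critic_engine.py:685 |
| High | API Key | Suspected hardcoded credential | `password="demo_password_123"` | tests/test_p2a_integration.py:79 |
File Tree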
14 files · 203.1 KB · 5900 lines
Python: 12 files · 5816 lines
Markdown: 2 files · 84 lines
├─ interfaces
│ ├─ __init__.py (Python)
│ ├─ assertions.py (Python)
│ ├─ critic_engine.py (Python)
│ ├─ external_regression.py (Python)
│ ├─ human_review.py (Python)
│ └─ llm_judge.py (Python)
├─ scripts
│ ├─ rubric_evidence.py (Python)
│ └─ score.py (Python)
├─ tests
│ ├─ test_llm_judge.py (Python)
│ ├─ test_p1_integration.py (Python)
│ ├─ test_p2a_integration.py (Python)
│ └─ test_score.py (Python)
├─ README.md (Markdown)
└─ SKILL.md (Markdown)
Dependencies 2 items
| Package | Version | Source | Known Vulns | Notes |
|---|---|---|---|---|
| anthropic | * | pip | No | Dependency of the LLM Judge Claude backend, used for API calls |
| openai | * | pip | No | Dependency of the LLM Judge OpenAI backend |
Security Positives
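✓ Clear code structure with well-defined module responsibilities
✓ LLM Judge supports multiple backends (Claude/OpenAI/mock)
✓ Includes security keyword detection logic (scripts/score.py:402)
✓ No network requests that exfiltrate sensitive data
✓ No shell injection risk
✓ No credential harvesting or exfiltration behavior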