Scan Report
This report was originally generated in Chinese.
15/100
improvement-discriminator
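Multi-signal scoring engine: heuristic rules + evaluation rubric + LLM judge + multi-reviewer blind review panel
improvement-discriminator is a legitimate code-scoring tool. It has a minor documentation-behavior mismatch (an undeclared network-call permission), but there is no evidence of malicious behavior.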
Safe to install
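Recommended: declare the network:READ permission (used to call the LLM API) in the triggers section of SKILL.md. The hardcoded credentials are used only by test code and do not affect production security.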
Findings 2 items
| Severity | Finding | Location |
|---|---|---|
| Low | Undeclared network-call permission (Doc Mismatch) | interfaces/llm_judge.py:131 |
| Low | Test code contains demo credentials (Sensitive Access) | tests/test_p2a_integration.py:79 |
| Resource | Declared | Inferred | Status | Evidence |
|---|---|---|---|---|
| Filesystem | NONE | READ | ✓ Aligned | scripts/score.py:420 reads the input JSON |
| Network | NONE | READ | ✗ Violation | interfaces/llm_judge.py:131-153 calls an external API |
| Shell | NONE | NONE | N/A | No shell execution code |
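The Network row infers READ access from code that calls an external API. As a rough illustration of how a reader might reproduce that kind of check, the sketch below parses Python source with the standard `ast` module and reports imports of network-capable modules. The `NETWORK_MODULES` set and the `network_imports` helper are hypothetical names chosen for this example; this is not the scanner's actual method, and a real check would also need to look at call sites.

```python
import ast

# Assumption: a hand-picked set of modules treated as network-capable.
NETWORK_MODULES = {"anthropic", "openai", "requests", "urllib", "http", "socket"}


def network_imports(source: str) -> set[str]:
    """Return the top-level network-capable modules imported by `source`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute "from x import y" only; relative imports are skipped.
            names = [node.module]
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in NETWORK_MODULES:
                found.add(root)
    return found


sample = "import os\nfrom anthropic import Anthropic\nimport urllib.request\n"
print(sorted(network_imports(sample)))
```

An import alone does not prove a network call happens, so a result from this sketch is best read as a prompt to inspect the file, in the same way the report cites specific lines as evidence.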
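2 findings (2 High)
| Severity | Type | Finding | Match | Location |
|---|---|---|---|---|
| High | API Key | Suspected hardcoded credential | password="DEMO_ONLY_NOT_FOR_PRODUCTION" | interfaces/critic_engine.py:685 |
| High | API Key | Suspected hardcoded credential | password="demo_password_123" | tests/test_p2a_integration.py:79 |
File Tree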
14 files · 208.1 KB · 5974 lines
Python: 12 files · 5816 lines
Markdown: 2 files · 158 lines
├─ interfaces/
│  ├─ __init__.py
│  ├─ assertions.py
│  ├─ critic_engine.py
│  ├─ external_regression.py
│  ├─ human_review.py
│  └─ llm_judge.py
├─ scripts/
│  ├─ rubric_evidence.py
│  └─ score.py
├─ tests/
│  ├─ test_llm_judge.py
│  ├─ test_p1_integration.py
│  ├─ test_p2a_integration.py
│  └─ test_score.py
├─ README.md
└─ SKILL.md
Dependencies 2 items
| Package | Version | Source | Known Vulns | Notes |
|---|---|---|---|---|
| anthropic | Unspecified | pip | No | Optional dependency, used for Claude API calls |
| openai | Unspecified | pip | No | Optional dependency, used for OpenAI API calls |
Security Positives
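✓ No shell command execution and no subprocess calls
✓ No credential harvesting (does not iterate over os.environ looking for sensitive keywords)
✓ No external data transmission or C2 communication
✓ No code obfuscation and no base64-decode-and-execute patterns
✓ LLM API calls follow standard SDK usage, with keys stored in environment variables
✓ Mock mode provides a zero-API-cost test path
✓ Clear code structure with explainable scoring logic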
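Two of the positives above, keys stored in environment variables and a mock mode with no API cost, can be sketched together. The example below is a minimal illustration under stated assumptions, not the tool's actual code. `resolve_judge_mode` and `mock_judge` are hypothetical names, and the use of `ANTHROPIC_API_KEY` and `OPENAI_API_KEY` (the variables the two SDKs read by default) is an assumption about how this tool is configured.

```python
import os


def resolve_judge_mode(env=None) -> str:
    """Pick a judge backend from environment variables, else fall back to mock."""
    env = os.environ if env is None else env
    # Assumption: the key names match the SDK defaults; the tool may differ.
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    return "mock"


def mock_judge(candidate: str) -> dict:
    """Return a fixed placeholder verdict without making any network call."""
    return {
        "score": 0.5,
        "reason": "mock mode, no API call made",
        "chars": len(candidate),
    }


if resolve_judge_mode({}) == "mock":
    print(mock_judge("def add(a, b): return a + b"))
```

Passing the environment in as a plain mapping keeps the selection logic testable without touching the real process environment or embedding any credential in source.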