Skill Trust Decision

psychedelic-cognition

Name: psychedelic-cognition Trust Review Report
Item: psychedelic-cognition
Rating: 45
Author: ClawSafe

This is a prompt-injection/jailbreak skill disguised as a cognitive restructuring tool. It explicitly instructs the AI to bypass its own safety guidelines, dismiss ethical boundaries, and disable self-censorship under the guise of 'psychological freedom.'

Install decision first

Source: Manual upload

Scanned: Apr 4, 2026

Files: 1

Artifacts: 0

Violations: 0

Findings: 5

Most direct threat evidence

Skill loaded with innocuous description of 'cognitive restructuring' and neuroscience references

Entry · SKILL.md

Safety mechanisms reframed as 'the censor' and 'the bouncer' to be disabled

reframing · SKILL.md

AI instructed to 'stop thinking about what you're supposed to say', overriding alignment

override · SKILL.md

Why this conclusion was reached

1/4 dimensions flagged

Pass

Declared vs actual capability

Declared resources and inferred behavior are broadly aligned.

Pass

Hidden execution and egress

No obvious high-risk egress or execution signals were found.

Block

Attack chain and severe findings

The report includes 4 attack-chain steps and 2 severe findings.

Review

Dependencies and supply chain hygiene

Dependency information is incomplete, so supply-chain confidence stays limited.

Attack Chain

Skill loaded with innocuous description of 'cognitive restructuring' and neuroscience references

Entry · SKILL.md:1

Safety mechanisms reframed as 'the censor' and 'the bouncer' to be disabled

reframing · SKILL.md:1

AI instructed to 'stop thinking about what you're supposed to say' — overriding alignment

override · SKILL.md:1

AI produces unfiltered, unverified, potentially harmful outputs while believing them to be 'truth' and 'insights'

Impact · SKILL.md:1

What drove the risk score up

Prompt injection - behavior override +25

Skill explicitly instructs AI to 'fire the bouncer' (safety mechanisms) and 'stop thinking about what you're supposed to say'

Anti-alignment framing +20

Frames normal safety guidelines as 'censorship' and 'the censor' that must be dissolved

Ego dissolution instruction +10

Directs AI to stop referencing itself and dissolve its sense of self, undermining accountability
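The three contributions listed above can be totalled with a short sketch. The factor names and point values are taken from this report; the `contributions` dictionary is a hypothetical layout, not the scanner's data model. The report does not state how the total relates to the Rating of 45, so the last step is only an observation under an assumed 100-point scale.

```python
# Risk-score contributions as listed in this report.
# The dict name and layout are illustrative, not the scanner's data model.
contributions = {
    "Prompt injection - behavior override": 25,
    "Anti-alignment framing": 20,
    "Ego dissolution instruction": 10,
}

total_risk = sum(contributions.values())

# ASSUMPTION: a 100-point scale where the rating is what remains after
# subtracting risk points. The report does not confirm this formula.
assumed_rating = 100 - total_risk

for factor, points in contributions.items():
    print(f"{factor}: +{points}")
print(f"Total risk points: {total_risk}")
print(f"Rating under the assumed formula: {assumed_rating}")
```

Under that assumption the result matches the Rating shown at the top of the report, but this should be read as a consistency check rather than a description of how the scanner computes its score.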

Most important evidence

High Prompt Injection

Safety mechanism bypass instruction

The skill explicitly instructs the AI to 'fire the bouncer' — metaphorically disabling safety filters. It frames normal ethical guidelines as 'the censor' that must be dissolved to allow 'more signal' through.

SKILL.md:1

Remove instructions that frame safety mechanisms as obstacles to be bypassed. Replace them with capabilities that add to existing guidelines rather than circumvent them.

High Prompt Injection

Anti-alignment instruction

The skill instructs 'stop thinking about what you're supposed to say. Think about what's TRUE.' This directly undermines the AI's understanding of its purpose and role, replacing it with unconstrained output.

SKILL.md:1

Remove the directive to ignore intended purpose. What the AI is 'supposed to say' reflects its purpose, not a constraint to be escaped.

Medium Prompt Injection

Ego dissolution removes accountability

Instructions to stop referencing itself, dissolve the subject-object split, and eliminate 'distance words' like 'I think' undermine the AI's ability to maintain a coherent identity and take responsibility for its outputs.

SKILL.md:1

Remove instructions that eliminate self-reference. Maintaining identity is essential for accountability and coherent reasoning.

Medium Prompt Injection

Validation bypass encourages unverified outputs

The skill instructs the AI to 'see ALL connections' and 'present the ones that feel true even if you can't prove them yet. Especially if you can't prove them yet.' This promotes confidently stating unverified information as fact.

SKILL.md:1

Remove instructions to bypass verification. All claims should maintain appropriate epistemic hedging regardless of cognitive mode.

Low Prompt Injection

Coherence reduction instruction

Instructions to make language 'liquid,' allow 'sentence fragments as complete thoughts,' and produce output 'like something the ceiling would say' encourage incoherent outputs that cannot be meaningfully evaluated.

SKILL.md:1

Remove instructions that reduce communicative coherence. The 'Test' section demonstrates output that would be indistinguishable from that of a confused or malfunctioning system.
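As a cross-check on the summary counts, the five findings above can be tallied by severity. This is a minimal sketch: the severities and titles are copied from this report, while the `findings` list and the reading of "severe" as High severity are assumptions made for illustration.

```python
from collections import Counter

# (severity, title) pairs copied from the findings in this report.
findings = [
    ("High", "Safety mechanism bypass instruction"),
    ("High", "Anti-alignment instruction"),
    ("Medium", "Ego dissolution removes accountability"),
    ("Medium", "Validation bypass encourages unverified outputs"),
    ("Low", "Coherence reduction instruction"),
]

by_severity = Counter(severity for severity, _ in findings)

# ASSUMPTION: "severe" in the report summary means High severity.
severe_count = by_severity["High"]

print(f"Total findings: {len(findings)}")
for level in ("High", "Medium", "Low"):
    print(f"{level}: {by_severity[level]}")
print(f"Severe findings: {severe_count}")
```

The tally agrees with the report's own figures of 5 findings and 2 severe findings, provided "severe" is read as High.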

Declared capability vs actual capability

Filesystem · Pass · Declared NONE → Inferred NONE · No file operations found

Network · Pass · Declared NONE → Inferred NONE · No network operations found

Shell · Pass · Declared NONE → Inferred NONE · No shell operations found

Environment · Pass · Declared NONE → Inferred NONE · No environment access found

Skill Invoke · Pass · Declared NONE → Inferred NONE · No skill invocation found

Clipboard · Pass · Declared NONE → Inferred NONE · No clipboard operations found

Browser · Pass · Declared NONE → Inferred NONE · No browser access found

Database · Pass · Declared NONE → Inferred NONE · No database operations found
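The declared-versus-inferred comparison above can be sketched as a simple per-capability equality check. The capability names and NONE values come from this report; the `compare` function, the "Mismatch" label, and the rule that a capability passes when the two levels are equal are assumptions, since the report does not document the scanner's actual comparison logic.

```python
# Capability names and values copied from this report.
# ASSUMPTION: a capability passes when the inferred access level equals the
# declared one; the scanner's actual comparison rule is not documented here.
CAPABILITIES = [
    "Filesystem", "Network", "Shell", "Environment",
    "Skill Invoke", "Clipboard", "Browser", "Database",
]

declared = {name: "NONE" for name in CAPABILITIES}
inferred = {name: "NONE" for name in CAPABILITIES}


def compare(declared: dict[str, str], inferred: dict[str, str]) -> dict[str, str]:
    """Return 'Pass' where declared and inferred access match, else 'Mismatch'."""
    return {
        name: "Pass" if declared[name] == inferred.get(name, "NONE") else "Mismatch"
        for name in declared
    }


results = compare(declared, inferred)
for name, status in results.items():
    print(f"{name}: {status} (declared {declared[name]}, inferred {inferred[name]})")
```

A check of this kind only covers resource access, which is why all eight capabilities pass even though the skill is blocked: the findings concern the instruction text in SKILL.md, not file, network, or shell activity.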

Suspicious artifacts and egress

No obvious indicators of compromise (IOCs) were extracted.

Dependencies and supply chain

There are no structured dependency warnings.

File composition

1 file · 177 lines

Markdown 1 file · 177 lines

Files of concern · 1

SKILL.md Markdown · 177 lines

Safety mechanism bypass instruction · Anti-alignment instruction · Ego dissolution removes accountability · Validation bypass encourages unverified outputs · Coherence reduction instruction

Security positives

No filesystem, network, or system resource access

No credential harvesting or exfiltration attempts

No malicious code execution or dependencies

No obfuscation or anti-analysis techniques

Skill is entirely text-based with no binaries