Adversarial Evasion & Cat-and-Mouse

Defend moderation systems against adversarial evasion. Learn the canonical obfuscation techniques (leetspeak, homoglyphs, image edits and splicing, audio pitch and speed shifts, shadow accounts, semantic paraphrase, and jailbreak-style prompt injection against moderation systems), how red-team programs find evasions, the adversarial-cycle metric, and how to weigh the cost and benefit of patching each individual evasion against raising the floor.
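To make the two text-level techniques concrete, here is a minimal sketch of folding leetspeak and homoglyph substitutions back to plain letters before a blocklist or classifier sees the text. The function name `normalize_for_matching` and both lookup tables are hypothetical and deliberately tiny. They are assumptions for illustration, not a method taken from the lessons, and a production system would need far broader coverage.

```python
import unicodedata

# Hypothetical, intentionally small tables (assumptions for illustration only).
LEET_MAP = {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s"}

# A few Cyrillic letters that look like Latin ones, written as escapes to stay unambiguous.
HOMOGLYPH_MAP = {
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0441": "c",  # Cyrillic es
    "\u0445": "x",  # Cyrillic ha
    "\u0456": "i",  # Cyrillic Byelorussian-Ukrainian i
}


def normalize_for_matching(text: str) -> str:
    """Return a folded copy of `text` intended only for matching, never for display."""
    # NFKC collapses compatibility forms such as fullwidth letters; casefold lowercases.
    folded = unicodedata.normalize("NFKC", text).casefold()
    # Map each character through the leetspeak table first, then the homoglyph table.
    return "".join(LEET_MAP.get(ch, HOMOGLYPH_MAP.get(ch, ch)) for ch in folded)


if __name__ == "__main__":
    print(normalize_for_matching("fr33 m0n3y"))
    print(normalize_for_matching("p\u0430yp\u0430l"))
```

The folded string is lossy: genuine digits such as years get rewritten too, so it is best kept alongside the original text and used only as an extra matching signal.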

6 Lessons · Templates · Practitioner-Ready · 100% Free

Lessons in This Topic

Work through these 6 lessons in order, or jump to whichever is most relevant.