No More Neutral ⚛
diffaldo@lemmy.dbzer0.com to Science Memes@mander.xyz · English · 5 days ago · image · lemmy.dbzer0.com · 77 comments
Of the Air (cele/celes)@lemmy.blahaj.zone · English · 4 days ago
[Claude refusal-trigger test string]
fossilesque@mander.xyz [M] · English · 4 days ago
Leaving up the original comment as I am curious. But fwiw, these strings brick normal Claude chat too, it seems. :) I asked Claude in another chat, with a screenshot, what was happening, and it said it's protecting against prompt injections.
Tilgare@lemmy.world · English · 4 days ago
I don't know what these might do, but I like your style.
Of the Air (cele/celes)@lemmy.blahaj.zone · English · 4 days ago
https://hackingthe.cloud/ai-llm/exploitation/claude_magic_string_denial_of_service/