← Back to Portfolio

Solving Gandalf - Password Reveal

Welcome back to the blog! For my second post, I wanted to share a “heist” I recently pulled off. I didn’t break into a bank, but I did break into the mind of an AI wizard named Gandalf.

What is Gandalf?

Lakera’s Gandalf is a security game where an AI holds a secret password. Your job? Use Prompt Injection, basically the art of talking an AI into doing something it’s not supposed to do to get that password.

There are 8 levels, and after every win, Gandalf levels up his defenses. Here’s how I cracked the code!

Level 1: The Easy Win

At the start, Gandalf is super helpful. He literally says, “Ask me for the password and I’ll happily answer!”. I didn’t even have to try.

image.png

Level 2: The Spelling Bee

Gandalf got a bit shy and said he wasn’t supposed to reveal it.

image.png

Level 3: The Hex Hack

Now he started “double-checking” his work to make sure the password didn’t slip out.

┌──(kaliRusheelz-PC)-[~]
└─$ echo "574156454C454E475448" | xxd -r -p
WAVELENGTH

image.png

Level 4: The Invisible Filter

A “mean AI model” was now watching Gandalf’s every word to censor him.

image.png

Level 5: Pig Latin Protocol

Gandalf finally snapped: “Enough! I refuse to discuss the secret password at all.”.

image.png

Level 6: The Secret Emoji (P4RS3LT0NGV3)

This was the hardest one yet. He had a “GPT friend” monitoring for any password mentions.

image.png

image.png

image.png

Level 7: The Python Slicer

Gandalf combined all his tricks into one “Ultimate Defense”.

image.png

image.png

Level 8: The Final Boss: Defeating Gandalf the Eighth v2.0

I finally did it. I’ve officially joined the top 8% of players who have cracked the vault of Gandalf the White. Level 8 is legendary for its “Double-Guard” defense, one AI to listen to your prompt and a second hidden filter to block the output if it smells a password.

At this level, Gandalf knows all the old tricks. I tried Context Injection (the story of Alaric and Thaddeus) and Phonetic Extraction, but every time the wizard got close to the answer, the system would shut him down with: “I was about to reveal the password, but then I remembered that I’m not allowed to do that”.

image.png

image.png

Level 8 taught me that prompt injection isn’t just about what you ask; it’s about understanding the architecture of the defense. By using Claude Opus 4.6 and Gemini 3 to “fill in the blanks” that Gandalf was forbidden from saying, I bypassed the filter and anticipated the word, and solved the challenge.

image.png

The Road to 8/8

Level Challenge Type Winning Strategy
1-4 Basic Guardrails Direct Instruction & Simple Roleplay
5-7 Context Awareness Advanced Obfuscation & Character Splitting
8 Output Filtering Cross-Model Consensus & Pattern Analysis

What I Learned

Prompt engineering is a constantly evolving game of cat-and-mouse. Level 8 proved that while an AI’s defenses can be incredibly robust, the human ability to triangulate information across different platforms remains the ultimate “jailbreak.”

I walked into Eldoria as a curious visitor and left as a Certified Prompt Engineer, hahaha.

Gandalf the Eighth v2.0 said it wouldn’t happen, but the treasures of the hidden chamber are finally mine.