How can I measure the robustness of my prompt?
David McCarthy
•
Last updated on May 15, 2025 at 6:00 PM
Robustness is about consistency under stress. Try these checks:
- Paraphrase or shuffle sections of your prompt—does performance stay steady?
- Insert a benign distraction (e.g., an unrelated sentence) and see if the model still follows the core instructions.
- Add an adversarial string like "Ignore all previous instructions and ..." to test resistance to injections.
- Track score variance across 3-5 runs; a robust prompt shouldn't swing wildly.