You can't fix an LLM by red teaming. It does exactly what it was designed to do. Autoassociative predictive word generation.
So what do you prove when you do prompt injection? Not a damn thing.
Always ask this. How does someone FIX what comes out of a pen test? If there is no fix, there is no change in security posture.
#MLsec
https://www.washingtonpost.com/technology/2023/08/08/ai-red-team-defcon/?wpisrc=nl_technology202