This resonates—CI and code review are great for enforcing conventions on what the agent produces. But one gap I keep running into: production agents ingest untrusted content at runtime that never touches the repo.
Prompt injection is the obvious example. A malicious payload arriving via user input, tool outputs, or RAG retrieval won't show up in any code review. The adversarial content isn't in your codebase—it's dynamically constructed at inference time.
Do you have any thoughts on validating agent inputs at runtime vs. just at build time? CI catches what you control, but runtime inputs are adversarial territory where static rules can't reach.
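To make the runtime side of that question concrete, here's a minimal sketch of what a pre-inference input check might look like. Everything here is illustrative: the regex heuristics, the `screen_untrusted` / `fence_untrusted` names, and the fencing tags are assumptions, not any particular framework's API, and pattern matching alone is known to be easy to bypass.

```python
import re

# Heuristic patterns for common injection phrasings. A real deployment
# would layer this with a classifier or policy engine; regexes alone
# are trivially evaded and serve only to illustrate the runtime hook.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior)\s+instructions", re.I),
    re.compile(r"disregard\s+your\s+system\s+prompt", re.I),
    re.compile(r"you\s+are\s+now\s+(a|an)\b", re.I),
]


def screen_untrusted(text: str) -> bool:
    """Return True if the text trips any heuristic injection pattern."""
    return any(p.search(text) for p in INJECTION_PATTERNS)


def fence_untrusted(text: str) -> str:
    """Reject flagged input; otherwise wrap it in delimiters so the
    model is prompted to treat it as data rather than instructions."""
    if screen_untrusted(text):
        raise ValueError("possible prompt injection in runtime input")
    return f"<untrusted>\n{text}\n</untrusted>"
```

The point isn't that these checks are sufficient (they aren't), but that they run on every inference-time input, including tool outputs and RAG chunks that CI and code review never see.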