Obiguard enables the screening of LLM interactions to address various threats. Validator checks, flagging logic, and strictness levels are all centrally managed through an Obiguard guardrail policy.
If any validator flags the request, the guard response will indicate that flagged equals failed.
If no validators flag the request, the response will indicate that flagged equals passed.
You can configure your application to handle flagged responses in various ways.
Options include blocking the flagged inputs or outputs, prompting the user for confirmation before proceeding,
logging the flagged response for analysis and monitoring, or taking no action. The choice is entirely yours.
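The handling options above can be sketched in application code. This is a minimal illustration, not Obiguard's client API: it assumes the guard response has already been parsed into a dictionary with a `flagged` field whose value is `"passed"` or `"failed"`, and the names `Action` and `handle_guard_response` are hypothetical helpers introduced only for this example.

```python
import logging
from enum import Enum

logger = logging.getLogger("guardrail")


class Action(Enum):
    """Hypothetical application-side policies for a flagged response."""
    BLOCK = "block"        # reject the flagged input or output
    CONFIRM = "confirm"    # ask the user before proceeding
    LOG_ONLY = "log_only"  # record the event, then proceed
    IGNORE = "ignore"      # take no action


def handle_guard_response(response, action, confirm=lambda: False):
    """Return True if the interaction may proceed, False if it is blocked.

    Assumes `response` is a dict with a "flagged" field set to "passed"
    or "failed"; the real response schema may differ.
    """
    # Anything other than an explicit "failed" is treated as not flagged.
    if response.get("flagged") != "failed":
        return True
    if action is Action.BLOCK:
        return False
    if action is Action.CONFIRM:
        # `confirm` is any callable that returns the user's decision.
        return bool(confirm())
    if action is Action.LOG_ONLY:
        logger.warning("Guardrail flagged interaction: %s", response)
        return True
    return True  # Action.IGNORE
```

Keeping the decision in a single function makes it straightforward to change the handling per environment, for example logging only in staging and blocking in production.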
Optionally, the response can include a detailed breakdown of the flagging decision.
This breakdown lists the detectors that were executed, as specified in the policy, and indicates whether each detector passed or failed.
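As a rough illustration of reading such a breakdown, the sketch below collects the detectors that reported a failure. The field names `breakdown`, `detector`, and `result`, as well as the detector names in the sample data, are assumptions made for this example and are not taken from the Obiguard response schema.

```python
from typing import List


def failed_detectors(response: dict) -> List[str]:
    """Return the names of detectors whose result is "failed".

    Assumes an optional "breakdown" list of entries shaped like
    {"detector": <name>, "result": "passed" | "failed"}.
    """
    # The breakdown is optional, so fall back to an empty list.
    breakdown = response.get("breakdown") or []
    return [
        entry["detector"]
        for entry in breakdown
        if entry.get("result") == "failed"
    ]


# Hypothetical sample response, for illustration only.
example = {
    "flagged": "failed",
    "breakdown": [
        {"detector": "prompt_injection", "result": "failed"},
        {"detector": "pii", "result": "passed"},
    ],
}

print(failed_detectors(example))  # ['prompt_injection']
```

A list like this can feed the logging or monitoring path, so that reviews of flagged traffic show which detectors are responsible.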