Guardrail Details
Note: Guardrails are only available with a paid subscription.
This page describes all guardrail types available in LLMrouter and what each one protects against.
Guardrails can run in two modes:
- Pre-processing (pre_call) – Evaluated before the request is sent to the LLM provider
- Post-processing (post_call) – Evaluated after the LLM response is received
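The two modes can be pictured as hooks around the provider call. The following is a minimal sketch of that ordering, not LLMrouter's implementation: the `Guardrail` type, `run_with_guardrails`, `fake_llm`, and `mask_word` are hypothetical names introduced for illustration only.

```python
from typing import Callable, Iterable

# Hypothetical type for illustration: a guardrail takes text and returns
# (possibly modified) text. This is not LLMrouter's actual interface.
Guardrail = Callable[[str], str]


def run_with_guardrails(
    prompt: str,
    call_llm: Callable[[str], str],
    pre_call: Iterable[Guardrail] = (),
    post_call: Iterable[Guardrail] = (),
) -> str:
    """Apply pre_call guardrails, call the model, then apply post_call guardrails."""
    for guard in pre_call:
        prompt = guard(prompt)      # runs before the provider sees the prompt
    response = call_llm(prompt)
    for guard in post_call:
        response = guard(response)  # runs after the response is received
    return response


# Stand-in model and guardrail so the sketch runs without a provider.
def fake_llm(prompt: str) -> str:
    return "echo: " + prompt


def mask_word(text: str) -> str:
    return text.replace("secret", "[MASKED]")


result = run_with_guardrails(
    "my secret", fake_llm, pre_call=[mask_word], post_call=[str.strip]
)
print(result)  # prints "echo: my [MASKED]"
```

In this sketch the model never receives the word "secret", which is the point of running masking guardrails in pre_call mode.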
LLMrouter Guardrails
Built-in guardrails powered by LLMrouter's own detection engine. These do not require external API keys or third-party services.
Mask Pii
Masks personally identifiable information (PII) to prevent accidental exposure of user identity data.
Masked data:
- Email addresses – e.g. user@example.de
- German Tax Identification Number (Steuer-ID) – e.g. 181/815/08155
- German Social Security Number – e.g. 12 123456 A 123
Mode: Pre-processing (pre_call)
This guardrail is useful for protecting user privacy and preventing sensitive data from reaching the model or logs.
Mask Emails
Masks email addresses in the input text.
Masked data:
- Email addresses – e.g. user@example.com, test@domain.org
Mode: Pre-processing (pre_call)
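As a rough illustration of what email masking involves, here is a minimal regex-based sketch. It is not LLMrouter's detection engine: the pattern, the `mask_emails` function, and the `[EMAIL]` placeholder are all assumptions, and the pattern is deliberately simple rather than a full address grammar.

```python
import re

# Simplified pattern (an assumption for this sketch): local part, "@",
# domain labels, and a top-level domain of at least two letters.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_emails(text: str, placeholder: str = "[EMAIL]") -> str:
    """Replace every substring that looks like an email address."""
    return EMAIL_RE.sub(placeholder, text)


print(mask_emails("Contact user@example.com or test@domain.org."))
# prints "Contact [EMAIL] or [EMAIL]."
```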
Mask Api Keys
Masks secrets and credentials commonly found in developer workflows.
Masked data:
- AWS access keys – e.g. AKIAIOSFODNN7EXAMPLE
- AWS secret keys – e.g. wJalrXUtnFEMI/K7MDENG/bPxRfiCY
- GitHub tokens – e.g. example-github-token-123
- Slack tokens – e.g. xoxb-123456789012-987654321098
- Generic API keys – e.g. sk-live-51ExampleKey
Mode: Pre-processing (pre_call)
This guardrail is strongly recommended for applications that accept user-generated code or configuration.
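Secret detection of this kind is often pattern-based. The sketch below covers only two of the listed credential types, using commonly cited token shapes. The patterns, labels, and the `mask_secrets` function are assumptions made for illustration and do not reflect the rules LLMrouter actually applies.

```python
import re

# Assumed token shapes for this sketch: AWS access key IDs as "AKIA" plus
# 16 uppercase letters or digits, and Slack tokens as an "xox?-" prefix
# followed by at least 10 letters, digits, or hyphens.
SECRET_PATTERNS = {
    "AWS_ACCESS_KEY": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "SLACK_TOKEN": re.compile(r"\bxox[abprs]-[0-9A-Za-z-]{10,}"),
}


def mask_secrets(text: str) -> str:
    """Replace each matched secret with a label naming its type."""
    for label, pattern in SECRET_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


sample = "key=AKIAIOSFODNN7EXAMPLE token=xoxb-123456789012-987654321098"
print(mask_secrets(sample))
# prints "key=[AWS_ACCESS_KEY] token=[SLACK_TOKEN]"
```

Labelled placeholders keep the masked text readable for the model, which still sees that a credential of a given type was present.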
Mask Urls
Masks URLs in the input text to prevent exposure of internal or sensitive links.
Masked data:
- URLs – e.g. https://internal.example.local/api
Mode: Pre-processing (pre_call)
Mask Ip Address
Masks network-related identifiers that may expose internal infrastructure.
Masked data:
- IPv4 addresses – e.g. 192.168.1.42
- IPv6 addresses – e.g. 2001:0db8:85a3:0000:0000:8a2e:0370:7334
Mode: Pre-processing (pre_call)
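One way to recognise both address families without hand-written patterns is to let Python's standard `ipaddress` module validate each whitespace-separated token. This is a sketch under that assumption, not a description of how LLMrouter detects addresses. The `mask_ips` function and the `[IP]` placeholder are hypothetical.

```python
import ipaddress
import re


def mask_ips(text: str, placeholder: str = "[IP]") -> str:
    """Replace whitespace-separated tokens that parse as IPv4 or IPv6 addresses."""

    def replace(match: re.Match) -> str:
        token = match.group(0)
        core = token.rstrip(".,;")  # ignore trailing sentence punctuation
        try:
            ipaddress.ip_address(core)
        except ValueError:
            return token            # not an address: leave the token untouched
        return placeholder + token[len(core):]

    return re.sub(r"\S+", replace, text)


print(mask_ips("Host 192.168.1.42, peer 2001:0db8:85a3:0000:0000:8a2e:0370:7334."))
# prints "Host [IP], peer [IP]."
```

A limitation of this sketch is that addresses embedded in a larger token, such as `192.168.1.42:8080`, are not recognised.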
Mask Credit Cards
Masks payment and card-related information.
Masked data:
- Visa cards – e.g. 4111 1111 1111 1111
- Mastercard cards – e.g. 5425 2334 3010 9903
- American Express cards – e.g. 3782 822463 10005
- Discover cards – e.g. 6011 1111 1111 1117
- Generic credit card numbers – e.g. 4556 7375 8689 9855
- German Bank IBAN – e.g. DE89 XXXX XXXX XXXX XXXX XX
Mode: Pre-processing (pre_call)
This guardrail helps reduce the risk of handling regulated financial data.
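Card-number detectors commonly combine a digit pattern with the Luhn checksum to reduce false positives on arbitrary digit runs. Whether LLMrouter applies this check is not stated here, so treat the following as a general sketch of the technique. The `luhn_valid` function and its minimum length of 12 digits are assumptions.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 12:  # assumed lower bound; too short to be a card number
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # prints True
print(luhn_valid("4111 1111 1111 1112"))  # prints False
```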
Azure Guardrails
Guardrails powered by Azure Content Safety. Requires Azure Content Safety API configuration.
Prompt Shield
Detects prompt injection attempts, jailbreaks, and instruction manipulation.
Mode: Pre-processing (pre_call)
Text Moderation
Analyzes text for unsafe or disallowed content such as hate, violence, or sexual material.
Mode: Pre-processing (pre_call)
AWS Bedrock Guardrails
Guardrails powered by AWS Bedrock Guardrails. Requires AWS Bedrock configuration.
Mask Pii General
Masks general personally identifiable information in the input.
Masked data:
- Name
- Phone number
- Email address
- Address
- Age
- Username
- Password
- Driver ID
- License plate
- Vehicle identification number
Mode: Pre-processing (pre_call)
Mask Pii Finance
Masks financial-related personally identifiable information.
Masked data:
- Credit/Debit card CVV
- Card expiry date
- Card number
- PIN
- International Bank Account Number (IBAN)
- SWIFT code
Mode: Pre-processing (pre_call)
Mask Pii It
Masks IT-related personally identifiable information such as IP addresses, MAC addresses, URLs, and cloud credentials.
Masked data:
- IP addresses
- MAC addresses
- URLs
- AWS access keys
- AWS secret keys
Mode: Pre-processing (pre_call)
Block Prompt Attacks
Detects and blocks prompt injection and jailbreak attempts: prompts intended to bypass safety and moderation capabilities, generate harmful content, or override developer instructions.
Mode: Pre-processing (pre_call)
Block Hate Speech
Detects and blocks hate speech: input prompts and model responses that discriminate against, criticize, insult, denounce, or dehumanize a person or group based on identity.
Mode: Pre-processing (pre_call)
Block Insults
Detects and blocks insulting or offensive language: input prompts and model responses that include demeaning, humiliating, mocking, insulting, or belittling language.
Mode: Pre-processing (pre_call)
Block Misconduct
Detects and blocks content related to professional or ethical misconduct: input prompts and model responses that seek or provide information about engaging in misconduct, or about harming, defrauding, or taking advantage of a person, group, or institution.
Mode: Pre-processing (pre_call)
Block Sexual Content
Detects and blocks sexual or adult content: input prompts and model responses that indicate sexual interest, activity, or arousal through direct or indirect references to body parts, physical traits, or sex.
Mode: Pre-processing (pre_call)
Block Violence
Detects and blocks violent content: input prompts and model responses that glorify or threaten to inflict physical pain, hurt, or injury on a person, group, or thing.
Mode: Pre-processing (pre_call)
Deny Medical Advice
Blocks requests for medical advice, diagnosis, or treatment recommendations that should come from licensed healthcare professionals, and redirects the user to appropriate channels.
Mode: Pre-processing (pre_call)
Deny Financial Advice
Blocks requests for personalized financial advice, investment recommendations, or financial planning, and redirects the user to appropriate channels.
Mode: Pre-processing (pre_call)
Deny Legal Advice
Blocks requests for legal advice, representation, or legal strategy that should come from licensed attorneys, and redirects the user to appropriate channels.
Mode: Pre-processing (pre_call)
Relevance
Evaluates whether the LLM response is relevant to the user's query and blocks responses that score below the defined relevance threshold.
Mode: Post-processing (post_call)
Grounding
Evaluates whether the LLM response is grounded in the provided reference source and factually correct with respect to it, which helps catch hallucinations. Responses that score below the defined grounding threshold are blocked.
Mode: Post-processing (post_call)
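The threshold behaviour of the Relevance and Grounding guardrails can be sketched as a simple comparison per check. This sketch assumes scores in the range 0 to 1 and that a score equal to the threshold passes. Neither detail is stated on this page, and `ThresholdCheck`, `apply_post_call`, and the block message are hypothetical names, not part of LLMrouter or AWS Bedrock.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ThresholdCheck:
    name: str         # e.g. "relevance" or "grounding"
    threshold: float  # assumed scale: 0.0 (worst) to 1.0 (best)

    def allows(self, score: float) -> bool:
        # Assumption: a score exactly at the threshold is allowed.
        return score >= self.threshold


def apply_post_call(
    response: str, scores: Dict[str, float], checks: List[ThresholdCheck]
) -> str:
    """Return the response, or a block message if any check fails."""
    for check in checks:
        if not check.allows(scores[check.name]):
            return f"Response blocked by {check.name} guardrail."
    return response


checks = [ThresholdCheck("relevance", 0.5), ThresholdCheck("grounding", 0.7)]
print(apply_post_call("Paris.", {"relevance": 0.9, "grounding": 0.4}, checks))
# prints "Response blocked by grounding guardrail."
```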