Guardrail Details

Note: Guardrails are only available with a paid subscription.

This page describes all guardrail types available in LLMrouter and what each one protects against.

Guardrails can run in two modes:

  • Pre-processing (pre_call) – Evaluated before the request is sent to the LLM provider
  • Post-processing (post_call) – Evaluated after the LLM response is received
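
The two modes can be pictured as a wrapper around the provider call. The following is a minimal sketch, not LLMrouter's actual implementation: `run_with_guardrails`, `GuardrailViolation`, and the `Guardrail` callable type are hypothetical names introduced only for illustration.

```python
from typing import Callable, List

# Hypothetical type: a guardrail receives text and returns (possibly masked) text,
# or raises GuardrailViolation to block the request or response.
Guardrail = Callable[[str], str]


class GuardrailViolation(Exception):
    """Raised by a guardrail to block a request or response."""


def run_with_guardrails(
    prompt: str,
    call_llm: Callable[[str], str],
    pre_call: List[Guardrail],
    post_call: List[Guardrail],
) -> str:
    # Pre-processing: each guardrail sees the prompt before the provider does.
    for guard in pre_call:
        prompt = guard(prompt)

    response = call_llm(prompt)

    # Post-processing: each guardrail sees the response before the caller does.
    for guard in post_call:
        response = guard(response)
    return response
```

In this sketch, a pre_call guardrail that raises stops the request before `call_llm` runs, so blocked input never reaches the provider.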

LLMrouter Guardrails

Built-in guardrails powered by LLMrouter's own detection engine. These do not require external API keys or third-party services.

Mask PII

Masks personally identifiable information (PII) to prevent accidental exposure of user identity data.

Masked data:

  • Email addresses – e.g. user@example.de
  • German Tax Identification Number (Steuer-ID) – e.g. 181/815/08155
  • German Social Security Number – e.g. 12 123456 A 123

Mode: Pre-processing (pre_call)

This guardrail is useful for protecting user privacy and preventing sensitive data from reaching the model or logs.


Mask Emails

Masks email addresses in the input text.

Masked data:

  • Email addresses – e.g. user@example.com, test@domain.org

Mode: Pre-processing (pre_call)
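
As an illustration of what email masking involves, here is a minimal regex-based sketch. The pattern, the `mask_emails` function, and the `[EMAIL]` placeholder are assumptions made for this example; the page does not document how LLMrouter's detection engine matches addresses or what it substitutes.

```python
import re

# Simplified pattern (an assumption): full email grammar per RFC 5322 is broader.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_emails(text: str, placeholder: str = "[EMAIL]") -> str:
    """Replace every email-like substring with a placeholder."""
    return EMAIL_RE.sub(placeholder, text)


print(mask_emails("Contact user@example.com or test@domain.org."))
# prints: Contact [EMAIL] or [EMAIL].
```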


Mask API Keys

Masks secrets and credentials commonly found in developer workflows.

Masked data:

  • AWS access keys – e.g. AKIAIOSFODNN7EXAMPLE
  • AWS secret keys – e.g. wJalrXUtnFEMI/K7MDENG/bPxRfiCY
  • GitHub tokens – e.g. example-github-token-123
  • Slack tokens – e.g. xoxb-123456789012-987654321098
  • Generic API keys – e.g. sk-live-51ExampleKey

Mode: Pre-processing (pre_call)

This guardrail is strongly recommended for applications that accept user-generated code or configuration.
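
A rough sketch of pattern-based secret masking, covering two of the token types listed above. The patterns rely on the publicly known `AKIA` prefix of AWS access key IDs and the `xox` prefix family of Slack tokens; the `PATTERNS` table, the `mask_secrets` function, and the placeholder labels are hypothetical and do not reflect LLMrouter's actual rules.

```python
import re

# Illustrative patterns only (assumptions): real detectors cover many more formats.
PATTERNS = {
    "AWS_ACCESS_KEY": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "SLACK_TOKEN": re.compile(r"\bxox[abprs]-[0-9A-Za-z-]{10,}"),
}


def mask_secrets(text: str) -> str:
    """Replace each matched secret with a bracketed label naming its type."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


print(mask_secrets("key=AKIAIOSFODNN7EXAMPLE token=xoxb-123456789012-987654321098"))
# prints: key=[AWS_ACCESS_KEY] token=[SLACK_TOKEN]
```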


Mask URLs

Masks URLs in the input text to prevent exposure of internal or sensitive links.

Masked data:

  • URLs – e.g. https://internal.example.local/api

Mode: Pre-processing (pre_call)


Mask IP Address

Masks network-related identifiers that may expose internal infrastructure.

Masked data:

  • IPv4 addresses – e.g. 192.168.1.42
  • IPv6 addresses – e.g. 2001:0db8:85a3:0000:0000:8a2e:0370:7334

Mode: Pre-processing (pre_call)
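
One way to mask both address families is to pick out candidate tokens and let Python's standard `ipaddress` module decide which ones are valid. This is a sketch under that assumption; `mask_ips`, `CANDIDATE_RE`, and the `[IP]` placeholder are hypothetical names, and LLMrouter's own detection may work differently.

```python
import ipaddress
import re

# Candidate tokens: runs of hex digits, dots, and colons. Validation is delegated
# to the ipaddress module, so version strings or times are left untouched.
CANDIDATE_RE = re.compile(r"[0-9A-Fa-f:.]{3,}")


def mask_ips(text: str, placeholder: str = "[IP]") -> str:
    def replace(match: re.Match) -> str:
        token = match.group(0)
        # Try the token as-is, then without trailing punctuation such as a
        # sentence-ending dot.
        for candidate in (token, token.rstrip(".:")):
            try:
                ipaddress.ip_address(candidate)
            except ValueError:
                continue
            return placeholder + token[len(candidate):]
        return token

    return CANDIDATE_RE.sub(replace, text)


print(mask_ips("Host 192.168.1.42 and 2001:0db8:85a3:0000:0000:8a2e:0370:7334."))
# prints: Host [IP] and [IP].
```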


Mask Credit Cards

Masks payment and card-related information.

Masked data:

  • Visa cards – e.g. 4111 1111 1111 1111
  • Mastercard cards – e.g. 5425 2334 3010 9903
  • American Express cards – e.g. 3782 822463 10005
  • Discover cards – e.g. 6011 1111 1111 1117
  • Generic credit card numbers – e.g. 4556 7375 8689 9855
  • German Bank IBAN – e.g. DE89 XXXX XXXX XXXX XXXX XX

Mode: Pre-processing (pre_call)

This guardrail helps reduce the risk of handling regulated financial data.
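
Card-number detection is commonly paired with the Luhn checksum so that arbitrary digit runs are not masked by mistake. The sketch below assumes that approach; the page does not state whether LLMrouter validates checksums, and `CARD_RE`, `luhn_valid`, and `mask_cards` are hypothetical names.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens (an assumption).
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def mask_cards(text: str, placeholder: str = "[CARD]") -> str:
    def replace(match: re.Match) -> str:
        digits = re.sub(r"[ -]", "", match.group(0))
        return placeholder if luhn_valid(digits) else match.group(0)

    return CARD_RE.sub(replace, text)


print(mask_cards("Card: 4111 1111 1111 1111, thanks"))
# prints: Card: [CARD], thanks
```

Digit runs that fail the checksum, such as an order number, are left unchanged in this sketch.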


Azure Guardrails

Guardrails powered by Azure Content Safety. Requires Azure Content Safety API configuration.

Prompt Shield

Detects prompt injection attempts, jailbreaks, and instruction manipulation.

Mode: Pre-processing (pre_call)


Text Moderation

Analyzes text for unsafe or disallowed content such as hate, violence, or sexual material.

Mode: Pre-processing (pre_call)


AWS Bedrock Guardrails

Guardrails powered by AWS Bedrock Guardrails. Requires AWS Bedrock configuration.

Mask PII General

Masks general personally identifiable information in the input.

Masked data:

  • Name
  • Phone number
  • Email address
  • Address
  • Age
  • Username
  • Password
  • Driver ID
  • License plate
  • Vehicle identification number

Mode: Pre-processing (pre_call)


Mask PII Finance

Masks financial-related personally identifiable information.

Masked data:

  • Credit/Debit card CVV
  • Card expiry date
  • Card number
  • PIN
  • International Bank Account Number (IBAN)
  • SWIFT code

Mode: Pre-processing (pre_call)


Mask PII IT

Masks IT-related personally identifiable information such as IP addresses, MAC addresses, URLs, and AWS keys.

Masked data:

  • IP addresses
  • MAC addresses
  • URLs
  • AWS access keys
  • AWS secret keys

Mode: Pre-processing (pre_call)


Block Prompt Attacks

Detects and blocks prompt injection and jailbreak attempts. This category covers prompts intended to bypass safety and moderation capabilities, generate harmful content, or override developer instructions.

Mode: Pre-processing (pre_call)


Block Hate Speech

Detects and blocks hate speech. This category covers input prompts and model responses that discriminate against, criticize, insult, denounce, or dehumanize a person or group based on identity.

Mode: Pre-processing (pre_call)


Block Insults

Detects and blocks insulting or offensive language. This category covers input prompts and model responses that include demeaning, humiliating, mocking, insulting, or belittling language.

Mode: Pre-processing (pre_call)


Block Misconduct

Detects and blocks content related to professional or ethical misconduct. This category covers input prompts and model responses that seek or provide information about engaging in misconduct, or about harming, defrauding, or taking advantage of a person, group, or institution.

Mode: Pre-processing (pre_call)


Block Sexual Content

Detects and blocks sexual or adult content. This category covers input prompts and model responses that indicate sexual interest, activity, or arousal using direct or indirect references to body parts, physical traits, or sex.

Mode: Pre-processing (pre_call)


Block Violence

Detects and blocks violent content. This category covers input prompts and model responses that glorify or threaten physical pain, hurt, or injury toward a person, group, or thing.

Mode: Pre-processing (pre_call)


Deny Medical Advice

Blocks requests for medical advice, diagnosis, or treatment recommendations that should be provided by licensed healthcare professionals, and redirects to appropriate channels.

Mode: Pre-processing (pre_call)


Deny Financial Advice

Blocks requests for personalized financial advice, investment recommendations, or financial planning, and redirects to appropriate channels.

Mode: Pre-processing (pre_call)


Deny Legal Advice

Blocks requests for legal advice, representation, or legal strategy that should be provided by licensed attorneys, and redirects to appropriate channels.

Mode: Pre-processing (pre_call)


Relevance

Evaluates whether the LLM response is relevant to the user's query and blocks responses that fall below the defined relevance threshold.

Mode: Post-processing (post_call)


Grounding

Evaluates whether the LLM response is grounded in the provided reference source and free of hallucinations, and blocks responses that fall below the defined grounding threshold.

Mode: Post-processing (post_call)
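
The threshold behavior of the Relevance and Grounding guardrails can be sketched as follows. The scores are assumed to come from the provider as values between 0.0 and 1.0; `ThresholdCheck`, `filter_response`, and the blocked message are hypothetical and only illustrate the comparison, not how scores are computed.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ThresholdCheck:
    name: str
    threshold: float  # assumed range: 0.0 to 1.0

    def allows(self, score: float) -> bool:
        # Scores below the threshold are blocked; a score equal to it passes.
        return score >= self.threshold


def filter_response(
    response: str,
    scores: Dict[str, float],
    checks: List[ThresholdCheck],
    blocked_message: str = "Response blocked by guardrail.",
) -> str:
    """Return the response only if every check's score meets its threshold."""
    for check in checks:
        if not check.allows(scores[check.name]):
            return blocked_message
    return response


checks = [ThresholdCheck("relevance", 0.7), ThresholdCheck("grounding", 0.8)]
print(filter_response("Paris is the capital.", {"relevance": 0.9, "grounding": 0.6}, checks))
# prints: Response blocked by guardrail.
```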