Network Intelligence

Every prompt SafePrompt validates — across every customer — contributes to a shared threat intelligence layer. The more the network sees, the better it protects everyone on it.

Collective Defense

A prompt injection attack against one SafePrompt customer generates a threat signal. That signal — fully anonymized and stripped of any application-specific context — is fed into the detection layer that protects every other customer.

This means a novel attack pattern discovered against a fintech app at 2 AM is blocking the same attack against a healthcare chatbot by 3 AM — without either customer doing anything.

// Network effect diagram
Customer A detects novel attack
↓ anonymize + extract pattern
Threat intelligence layer (shared)
↓ distribute to all nodes
Customer B, C, D ... automatically protected

What Gets Collected

SafePrompt collects the minimum possible data needed to generate useful threat signals. No prompt content is ever stored beyond what is required for the current validation.

Collected (anonymized)

  • Attack category (e.g. prompt_injection, jailbreak)
  • Structural pattern fingerprint (not the prompt itself)
  • Hashed source IP (not the raw IP)
  • Timestamp (rounded to the hour)
  • Confidence score of the verdict

Never collected

  • The prompt text itself
  • Your application's system prompt
  • User identities or session data
  • Raw IP addresses
  • Any application-specific context

All threat data is anonymized within 24 hours. SafePrompt is designed to be GDPR and CCPA compliant.

Intelligence Layers

IP Reputation<10ms overhead

Source IPs are hashed and checked against a reputation database built from network-wide attack signals. A hash with a history of injection attempts triggers heightened scrutiny — faster escalation to AI stages — before the first AI call is even made.

Hash rotates every 24 hours so the same IP cannot be tracked across time windows.
Pattern DiscoveryRuns nightly

A nightly ML job analyzes clusters of recently flagged prompts to identify new structural patterns. Novel patterns that reach statistical significance are promoted to the Stage 1 pattern library — turning AI-detected attacks into zero-cost pattern blocks for the next day's traffic.

Campaign DetectionTemporal clustering

A coordinated jailbreak campaign — where many sources send structurally similar attacks in a short window — produces a detectable temporal cluster. Campaign detection surfaces these spikes in near real-time and adds campaign-level signals to the validation context, increasing sensitivity during active attacks.

Privacy Architecture

Shared intelligence only works if customers trust it. The architecture is designed so that contributing to the network gives you back more protection than you put in, without exposing anything about your users or your application.

PropertyGuarantee
Prompt contentNever stored, never leaves the validation pipeline
IP addressesOne-way hashed, rotated every 24 hours
Threat signalsAnonymized within 24 hours, no customer attribution
Pattern contributionsStructural fingerprint only — not reconstructable
Data residencyIntelligence layer operated independently of validation pipeline

Opting Out

Business plan customers can opt out of contributing threat signals to the network intelligence layer. You continue to receive the benefit of the shared intelligence compiled before your opt-out, but your traffic no longer contributes new signals.

To opt out, contact [email protected] from your registered email, or use the privacy controls in your dashboard settings.

Note: Opting out reduces your protection on novel attack patterns that the network has not yet seen. Most customers choose to remain opted in.