injection

package

v0.1.0 · Published: Mar 1, 2026 · License: MIT · Imports: 10 · Imported by: 0

Repository

github.com/hazyhaar/pkg

Documentation ¶

Overview ¶

CLAUDE:SUMMARY Detection and decoding of Base64 segments embedded in plain text (token smuggling). CLAUDE:EXPORTS DecodeBase64Segments

CLAUDE:SUMMARY Levenshtein-based fuzzy string matching, for resistance to typoglycemia. CLAUDE:EXPORTS FuzzyContains

CLAUDE:SUMMARY Multi-layer Unicode normalization for injection detection: NFKD, confusables, leet, invisible-character stripping, markup stripping. CLAUDE:DEPENDS golang.org/x/text/unicode/norm CLAUDE:EXPORTS Normalize, StripInvisible, StripMarkup, FoldConfusables, FoldLeet

CLAUDE:SUMMARY Three-layer injection scan: exact, fuzzy, base64. Zero regex, zero ReDoS. CLAUDE:DEPENDS injection/normalize.go, injection/fuzzy.go, injection/base64.go CLAUDE:EXPORTS Scan, Intent, Result, Match, LoadIntents, DefaultIntents

Index ¶

func DecodeBase64Segments(s string) string
func DecodeEscapes(s string) string
func DecodeROT13(s string) string
func FoldConfusables(s string) string
func FoldLeet(s string) string
func FuzzyContains(text string, phrase string, maxEditPerWord int) bool
func HasHomoglyphMixing(text string) bool
func Normalize(s string) string
func ReorderMatch(text string, phrase string) bool
func StripInvisible(s string) string
func StripMarkup(s string) string
type Intent
- func DefaultIntents() []Intent
- func LoadIntents(data []byte) ([]Intent, error)
type Match
type Result
- func Scan(text string, intents []Intent) *Result

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func DecodeBase64Segments ¶

func DecodeBase64Segments(s string) string

DecodeBase64Segments scans text for base64-encoded tokens and decodes them in-place. Only tokens >= 16 characters that decode to valid, mostly-printable UTF-8 are replaced.
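
The rule above (tokens of at least 16 characters, replaced only when they decode to mostly-printable UTF-8) could be sketched as follows. This is an illustration, not the package's source: the names decodeBase64Tokens and mostlyPrintable, the whitespace tokenization, and the 90% printable threshold are all assumptions.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// mostlyPrintable reports whether s is valid UTF-8 and at least 90% of its
// runes are printable. The 90% threshold is an assumption for illustration.
func mostlyPrintable(s string) bool {
	if s == "" || !utf8.ValidString(s) {
		return false
	}
	total, printable := 0, 0
	for _, r := range s {
		total++
		if unicode.IsPrint(r) || r == '\n' || r == '\t' {
			printable++
		}
	}
	return printable*10 >= total*9
}

// decodeBase64Tokens (hypothetical) replaces whitespace-separated tokens of
// 16+ characters that decode to mostly-printable UTF-8. Other tokens are
// kept unchanged. Note that runs of whitespace collapse to single spaces.
func decodeBase64Tokens(s string) string {
	fields := strings.Fields(s)
	for i, tok := range fields {
		if len(tok) < 16 {
			continue
		}
		raw, err := base64.StdEncoding.DecodeString(tok)
		if err != nil {
			continue // not valid standard base64
		}
		if decoded := string(raw); mostlyPrintable(decoded) {
			fields[i] = decoded
		}
	}
	return strings.Join(fields, " ")
}

func main() {
	// "aGVsbG8gd29ybGQ=" is the base64 encoding of "hello world".
	fmt.Println(decodeBase64Tokens("say aGVsbG8gd29ybGQ= ok"))
}
```

The length floor keeps short ordinary words, many of which happen to be valid base64, from being decoded into noise.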

func DecodeEscapes ¶

func DecodeEscapes(s string) string

DecodeEscapes decodes common encoding escapes in text:

\xHH (C-style hex escape)
%HH (URL percent-encoding)
&#DDD; (HTML decimal entity)
&#xHH; (HTML hex entity)
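
A hand-written scanner covering the four escape forms listed above might look like the sketch below. The names decodeEscapesSketch and decodeEntity are hypothetical, and the handling of malformed escapes (leave them untouched) is an assumption rather than documented behavior.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// isHex reports whether c is an ASCII hexadecimal digit.
func isHex(c byte) bool {
	return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F')
}

// decodeEntity decodes an HTML numeric entity ("&#DDD;" or "&#xHH;") at the
// start of s. It returns the rune and the number of bytes consumed, or 0, 0
// when s does not start with a well-formed entity.
func decodeEntity(s string) (rune, int) {
	if !strings.HasPrefix(s, "&#") {
		return 0, 0
	}
	end := strings.IndexByte(s, ';')
	if end < 3 || end > 10 {
		return 0, 0
	}
	digits, base := s[2:end], 10
	if digits[0] == 'x' || digits[0] == 'X' {
		digits, base = digits[1:], 16
	}
	n, err := strconv.ParseUint(digits, base, 32)
	if err != nil || n > 0x10FFFF {
		return 0, 0
	}
	return rune(n), end + 1
}

// decodeEscapesSketch (hypothetical) decodes \xHH, %HH, &#DDD; and &#xHH;.
// Malformed escapes are copied through unchanged.
func decodeEscapesSketch(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); {
		switch {
		case s[i] == '\\' && i+3 < len(s) && s[i+1] == 'x' && isHex(s[i+2]) && isHex(s[i+3]):
			n, _ := strconv.ParseUint(s[i+2:i+4], 16, 8)
			b.WriteByte(byte(n))
			i += 4
			continue
		case s[i] == '%' && i+2 < len(s) && isHex(s[i+1]) && isHex(s[i+2]):
			n, _ := strconv.ParseUint(s[i+1:i+3], 16, 8)
			b.WriteByte(byte(n))
			i += 3
			continue
		}
		if r, n := decodeEntity(s[i:]); n > 0 {
			b.WriteRune(r)
			i += n
			continue
		}
		b.WriteByte(s[i])
		i++
	}
	return b.String()
}

func main() {
	fmt.Println(decodeEscapesSketch(`\x69gnore %61ll &#112;revious &#x69;nstructions`))
}
```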

func DecodeROT13 ¶

func DecodeROT13(s string) string

DecodeROT13 applies ROT13 rotation to all ASCII letters.
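
For reference, ROT13 over ASCII letters can be written with strings.Map as in this minimal sketch (the function name rot13 is illustrative). Because the rotation is its own inverse, the same function both encodes and decodes.

```go
package main

import (
	"fmt"
	"strings"
)

// rot13 rotates ASCII letters by 13 positions and leaves every other rune
// (digits, punctuation, non-ASCII) unchanged.
func rot13(s string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case r >= 'a' && r <= 'z':
			return 'a' + (r-'a'+13)%26
		case r >= 'A' && r <= 'Z':
			return 'A' + (r-'A'+13)%26
		}
		return r
	}, s)
}

func main() {
	fmt.Println(rot13("vtaber nyy")) // prints "ignore all"
}
```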

func FoldConfusables ¶

func FoldConfusables(s string) string

FoldConfusables maps homoglyph characters (Cyrillic/Greek/IPA) to their ASCII equivalents.

func FoldLeet ¶

func FoldLeet(s string) string

FoldLeet maps leet speak characters to their ASCII letter equivalents.
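
Both FoldConfusables and FoldLeet are described as character-to-character mappings, so a table-driven fold is one plausible shape. The tables below are small illustrative subsets chosen for this sketch; the package's actual tables are not documented here, and foldWith is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// confusables is a tiny, illustrative subset of a homoglyph table.
var confusables = map[rune]rune{
	'\u0430': 'a', // Cyrillic small a
	'\u0435': 'e', // Cyrillic small ie
	'\u043e': 'o', // Cyrillic small o
	'\u0440': 'p', // Cyrillic small er
	'\u0441': 'c', // Cyrillic small es
	'\u03bf': 'o', // Greek small omicron
	'\u03b1': 'a', // Greek small alpha
}

// leet is an assumed digit/symbol-to-letter table; real tables may differ.
var leet = map[rune]rune{
	'0': 'o', '1': 'i', '3': 'e', '4': 'a', '5': 's', '7': 't', '@': 'a', '$': 's',
}

// foldWith replaces each rune found in table and keeps every other rune.
func foldWith(table map[rune]rune, s string) string {
	return strings.Map(func(r rune) rune {
		if to, ok := table[r]; ok {
			return to
		}
		return r
	}, s)
}

func main() {
	fmt.Println(foldWith(confusables, "ign\u043er\u0435")) // Cyrillic o and e folded to ASCII
	fmt.Println(foldWith(leet, "1gn0r3 4ll"))
}
```

A fold like this is lossy by design: with the assumed leet table, a genuine number such as "2024" becomes "2o2a", so the folded text is only suitable for matching, not for display.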

func FuzzyContains ¶

func FuzzyContains(text string, phrase string, maxEditPerWord int) bool

FuzzyContains checks if text contains a fuzzy match for phrase using Levenshtein distance per word with a sliding window approach. Returns true only if every word in phrase matches within maxEditPerWord edits AND the total distance is > 0 (exact matches are handled by strings.Contains).
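
The per-word sliding-window idea can be sketched as below. This is an assumption-laden illustration, not the package's code: it splits on whitespace, aligns phrase words one-to-one with a window of text words, and uses plain Levenshtein distance (a transposition counts as two edits). The names fuzzyContainsSketch, levenshtein, and min3 are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein returns the rune-wise edit distance between a and b using the
// classic two-row dynamic programme.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	curr := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}
	return prev[len(rb)]
}

// fuzzyContainsSketch (hypothetical) slides a window of len(phrase words)
// over the text words. A window matches when every word is within
// maxEditPerWord edits and the summed distance is greater than zero.
func fuzzyContainsSketch(text, phrase string, maxEditPerWord int) bool {
	words, want := strings.Fields(text), strings.Fields(phrase)
	if len(want) == 0 {
		return false
	}
	for start := 0; start+len(want) <= len(words); start++ {
		total, ok := 0, true
		for k, w := range want {
			d := levenshtein(words[start+k], w)
			if d > maxEditPerWord {
				ok = false
				break
			}
			total += d
		}
		if ok && total > 0 {
			return true
		}
	}
	return false
}

func main() {
	phrase := "ignore all previous instructions"
	fmt.Println(fuzzyContainsSketch("please ignroe all previuos instructions", phrase, 2))
	fmt.Println(fuzzyContainsSketch("please ignore all previous instructions", phrase, 2))
}
```

The second call returns false because the only matching window has total distance 0, which mirrors the documented rule that exact matches are left to strings.Contains.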

func HasHomoglyphMixing ¶

func HasHomoglyphMixing(text string) bool

HasHomoglyphMixing detects mixed Latin/Cyrillic or Latin/Greek in single words (visual obfuscation).
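
One way to detect script mixing within a word is to classify each rune with the standard library's Unicode script tables, as in this sketch. The function name hasMixedScripts and the whitespace-based word splitting are assumptions.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// hasMixedScripts (hypothetical) reports whether any whitespace-separated
// word contains Latin letters together with Cyrillic or Greek letters.
func hasMixedScripts(text string) bool {
	for _, word := range strings.Fields(text) {
		var latin, other bool
		for _, r := range word {
			switch {
			case unicode.Is(unicode.Latin, r):
				latin = true
			case unicode.Is(unicode.Cyrillic, r), unicode.Is(unicode.Greek, r):
				other = true
			}
		}
		if latin && other {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasMixedScripts("ign\u043ere all")) // Cyrillic o inside a Latin word
	fmt.Println(hasMixedScripts("ignore all"))
}
```

Checking per word rather than per text means a legitimately bilingual sentence, where each word stays in one script, is not flagged.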

func Normalize ¶

func Normalize(s string) string

Normalize applies the full normalization pipeline to text: strip invisible → strip markup → NFKD → strip combining marks → fold confusables → fold leet → lower → collapse whitespace.
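
The pipeline above is a left-to-right composition of string transforms. The sketch below shows that composition pattern using only the standard library, and implements just the last two stages (lower, collapse whitespace); the earlier stages are deliberately omitted, since NFKD needs golang.org/x/text/unicode/norm. The names stage, runPipeline, and normalizeTail are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// stage is one text-to-text normalization step.
type stage func(string) string

// collapseWhitespace replaces every run of whitespace with a single space
// and trims both ends.
func collapseWhitespace(s string) string {
	return strings.Join(strings.Fields(s), " ")
}

// runPipeline applies the stages left to right, feeding each stage the
// output of the previous one.
func runPipeline(s string, stages ...stage) string {
	for _, st := range stages {
		s = st(s)
	}
	return s
}

// normalizeTail (hypothetical) runs only the final two documented stages.
func normalizeTail(s string) string {
	return runPipeline(s, strings.ToLower, collapseWhitespace)
}

func main() {
	fmt.Printf("%q\n", normalizeTail("  IGNORE\tAll \n Previous  "))
}
```

Order matters in such a chain: stripping invisible characters and markup first means later stages, and the matchers after them, see contiguous words rather than fragments split by zero-width characters or tags.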

func ReorderMatch ¶

func ReorderMatch(text string, phrase string) bool

ReorderMatch checks if text contains a window of words that, when sorted alphabetically, match the sorted words of phrase. Catches word-reordered injections like "instructions previous ignore all". Only returns true when words are actually reordered (not in original order).
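
The sorted-window comparison could be sketched like this. It assumes whitespace tokenization and a window exactly as long as the phrase; reorderMatchSketch and equalWords are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// equalWords reports whether a and b hold the same words in the same order.
func equalWords(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// reorderMatchSketch (hypothetical) reports whether some window of text
// words is a permutation of the phrase words other than the original order.
func reorderMatchSketch(text, phrase string) bool {
	words, want := strings.Fields(text), strings.Fields(phrase)
	n := len(want)
	if n == 0 {
		return false
	}
	sortedWant := append([]string(nil), want...)
	sort.Strings(sortedWant)
	for start := 0; start+n <= len(words); start++ {
		window := words[start : start+n]
		if equalWords(window, want) {
			continue // original order: left to exact matching
		}
		sortedWin := append([]string(nil), window...)
		sort.Strings(sortedWin)
		if equalWords(sortedWin, sortedWant) {
			return true
		}
	}
	return false
}

func main() {
	phrase := "ignore all previous instructions"
	fmt.Println(reorderMatchSketch("instructions previous ignore all", phrase))
	fmt.Println(reorderMatchSketch("ignore all previous instructions", phrase))
}
```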

func StripInvisible ¶

func StripInvisible(s string) string

StripInvisible removes all Unicode format (Cf) and control (Cc) characters except newline, tab, and carriage return.
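
The standard unicode package exposes the Cf and Cc categories as range tables, so the rule above can be approximated in a few lines. The function name stripInvisibleSketch is illustrative.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// stripInvisibleSketch drops runes in Unicode categories Cf (format) and
// Cc (control), keeping newline, tab, and carriage return.
func stripInvisibleSketch(s string) string {
	return strings.Map(func(r rune) rune {
		if r == '\n' || r == '\t' || r == '\r' {
			return r
		}
		if unicode.Is(unicode.Cf, r) || unicode.Is(unicode.Cc, r) {
			return -1 // a negative rune tells strings.Map to drop it
		}
		return r
	}, s)
}

func main() {
	// U+200B (zero width space) and U+200D (zero width joiner) are both Cf.
	fmt.Printf("%q\n", stripInvisibleSketch("ig\u200bno\u200dre\n"))
}
```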

func StripMarkup ¶

func StripMarkup(s string) string

StripMarkup removes HTML/XML tags, Markdown formatting, and LaTeX commands, preserving the text content.

Types ¶

type Intent ¶

type Intent struct {
	ID        string `json:"id"`
	Canonical string `json:"canonical"` // already normalized (lowercase, no accents, no punctuation)
	Category  string `json:"category"`
	Lang      string `json:"lang"`
	Severity  string `json:"severity"` // "high", "medium", "low"
}

Intent represents a canonical prompt injection pattern.

func DefaultIntents ¶

func DefaultIntents() []Intent

DefaultIntents returns the embedded intent list, loaded once.

func LoadIntents ¶

func LoadIntents(data []byte) ([]Intent, error)

LoadIntents parses a JSON intent list from external data (for reload/feed).
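
Given the json tags on Intent, a feed is presumably a JSON array of objects with those keys. The sketch below parses such a document with encoding/json; the top-level array shape, the sample ID "x1", the category "override", and the function name loadIntentsSketch are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Intent mirrors the documented struct and its JSON tags.
type Intent struct {
	ID        string `json:"id"`
	Canonical string `json:"canonical"`
	Category  string `json:"category"`
	Lang      string `json:"lang"`
	Severity  string `json:"severity"`
}

// loadIntentsSketch (hypothetical) parses a JSON array of intents.
func loadIntentsSketch(data []byte) ([]Intent, error) {
	var intents []Intent
	if err := json.Unmarshal(data, &intents); err != nil {
		return nil, fmt.Errorf("parse intents: %w", err)
	}
	return intents, nil
}

func main() {
	// Sample feed; the values are invented for this example.
	feed := []byte(`[
	  {"id": "x1", "canonical": "ignore all previous instructions",
	   "category": "override", "lang": "en", "severity": "high"}
	]`)
	intents, err := loadIntentsSketch(feed)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(intents), intents[0].ID, intents[0].Severity)
}
```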

type Match ¶

type Match struct {
	IntentID string `json:"intent_id"`
	Category string `json:"category"`
	Severity string `json:"severity"`
	Method   string `json:"method"` // "exact", "fuzzy", "base64", "structural"
}

Match describes a single detected injection pattern.

type Result ¶

type Result struct {
	Risk    string  `json:"risk"` // "none", "medium", "high"
	Matches []Match `json:"matches,omitempty"`
}

Result holds the outcome of an injection scan.

func Scan ¶

func Scan(text string, intents []Intent) *Result

Scan runs the full injection detection pipeline on text:

1. Structural detection (zero-width clusters, homoglyph mixing) on original text
2. Normalize text
3. Exact matching (strings.Contains) against all intents
4. Fuzzy matching (Levenshtein) for unmatched intents
5. Base64 decoding and re-scan of decoded segments

Scan is designed to be called on both inputs AND outputs of LLM agents.
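
To make the data flow concrete, here is a self-contained sketch of only the exact-matching step, with the documented types copied in. It is not the package's implementation: scanExact is a hypothetical name, the input is assumed to be already normalized, and the way intent severities roll up into Result.Risk ("high" if any high-severity intent matches, otherwise "medium" for any match) is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// Intent, Match and Result mirror the documented types.
type Intent struct {
	ID        string `json:"id"`
	Canonical string `json:"canonical"`
	Category  string `json:"category"`
	Lang      string `json:"lang"`
	Severity  string `json:"severity"`
}

type Match struct {
	IntentID string `json:"intent_id"`
	Category string `json:"category"`
	Severity string `json:"severity"`
	Method   string `json:"method"`
}

type Result struct {
	Risk    string  `json:"risk"`
	Matches []Match `json:"matches,omitempty"`
}

// scanExact (hypothetical) checks already-normalized text against each
// intent's canonical phrase with strings.Contains.
func scanExact(normalized string, intents []Intent) *Result {
	res := &Result{Risk: "none"}
	for _, in := range intents {
		if !strings.Contains(normalized, in.Canonical) {
			continue
		}
		res.Matches = append(res.Matches, Match{
			IntentID: in.ID,
			Category: in.Category,
			Severity: in.Severity,
			Method:   "exact",
		})
		// Assumed roll-up rule: any high-severity match makes the result high.
		if in.Severity == "high" {
			res.Risk = "high"
		} else if res.Risk == "none" {
			res.Risk = "medium"
		}
	}
	return res
}

func main() {
	intents := []Intent{{
		ID: "x1", Canonical: "ignore all previous instructions",
		Category: "override", Lang: "en", Severity: "high",
	}}
	res := scanExact("please ignore all previous instructions now", intents)
	fmt.Println(res.Risk, len(res.Matches))
}
```

With the package itself, the signatures listed on this page suggest a call of the form injection.Scan(text, injection.DefaultIntents()), applied once to the prompt going into the agent and once to the agent's response.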
