AI Artifact Fingerprinting

Fingerprint every AI artifact
before it leaves your perimeter.

PreserveIP's AI Artifact Fingerprinting engine creates cryptographic and semantic signatures for model weights, training datasets, and prompt logs, then continuously monitors every developer channel to detect unauthorized copies, fine-tunes, or leakage into public repositories.

Request Access → See What We Track

What We Fingerprint

Every AI artifact that can be stolen, copied, or leaked

Traditional DLP tools were built for documents. PreserveIP understands AI artifacts: their structure, naming patterns, size signatures, and semantic content.

.pt / .pth
PyTorch Model Weights
Full and partial checkpoint files. Detects size matches, hash matches, and partial weight extraction.
.safetensors
Safetensors Format
HuggingFace-native model format. Fingerprinted at tensor-block level for partial leak detection.
.pkl / .h5
Sklearn / Keras Models
Serialized scikit-learn pipelines and Keras HDF5 files monitored for unauthorized export.
.jsonl / .parquet
Training Datasets
Dataset schema hashing and row-sample fingerprinting detect leakage even from partial dataset exports.
.yaml / config.json
Model Configs
Architecture definitions, hyperparameter files, and tokenizer configs flagged as high-value IP.
prompt logs
Prompt Engineering
System prompts, fine-tuning datasets, and RLHF feedback logs treated as proprietary trade secrets.

Detection Method

Four-layer fingerprinting, built to resist evasion

1

Cryptographic Hashing

SHA-256 hash of every registered model file and dataset. Any byte-for-byte copy is caught instantly, regardless of filename or location.
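
As a rough illustration of this layer, the sketch below hashes a file with SHA-256 from Python's standard library and looks the digest up in a registry. The `registry` mapping (digest to artifact name) and both function names are hypothetical; the source does not describe how PreserveIP stores registered hashes.

```python
import hashlib
from pathlib import Path
from typing import Optional


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Chunked reads keep memory flat even for multi-gigabyte checkpoints.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_registered_copy(path: Path, registry: dict) -> Optional[str]:
    """Look up a file's digest in a hypothetical digest -> artifact-name registry."""
    return registry.get(sha256_file(path))
```

The lookup ignores the filename and location entirely, which is why a renamed copy still matches. A single changed byte produces a different digest, which is the gap the structural layer is meant to cover.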

2

Structural Fingerprinting

Layer counts, parameter counts, tensor shapes, and file sizes are profiled. Partial extractions and architecture clones trigger alerts even without hash matches.
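
One way to picture structural profiling is to read only the JSON header of a .safetensors file, which lists each tensor's dtype and shape, and compare profiles by shape overlap. This is a sketch under assumptions: the profile fields and the multiset Jaccard score in `shape_overlap` are illustrative choices, not PreserveIP's documented method.

```python
import json
import struct
from collections import Counter
from math import prod
from pathlib import Path


def read_safetensors_header(path: Path) -> dict:
    """Parse the JSON header of a .safetensors file without loading tensor data."""
    with path.open("rb") as fh:
        # The file starts with an 8-byte little-endian header length.
        (header_len,) = struct.unpack("<Q", fh.read(8))
        return json.loads(fh.read(header_len))


def structural_profile(path: Path) -> dict:
    """Summarize tensor count, parameter count, file size, and tensor shapes."""
    header = read_safetensors_header(path)
    tensors = {k: v for k, v in header.items() if k != "__metadata__"}
    return {
        "tensor_count": len(tensors),
        "parameter_count": sum(prod(v["shape"]) for v in tensors.values()),
        "file_size": path.stat().st_size,
        "shapes": sorted((v["dtype"], tuple(v["shape"])) for v in tensors.values()),
    }


def shape_overlap(a: dict, b: dict) -> float:
    """Multiset Jaccard overlap of the (dtype, shape) entries of two profiles."""
    ca, cb = Counter(a["shapes"]), Counter(b["shapes"])
    union = sum((ca | cb).values())
    return sum((ca & cb).values()) / union if union else 0.0
```

A partial extraction that keeps half of a model's tensors would score around 0.5 against the registered profile even though its SHA-256 digest matches nothing.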

3

Semantic Similarity

Claude AI reads commit messages, PR descriptions, and Slack messages for references to internal model names, dataset codenames, and architecture terms, catching disclosure before any file moves.
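
The source attributes this layer to Claude AI and gives no implementation detail. As a loose illustration only, a cheap watchlist pre-filter could select which messages are worth sending to a model at all. Everything below is hypothetical, including the example terms `atlas-7b` and `nightjar`; it is plain pattern matching, not semantic classification.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    term: str      # matched watchlist term, lowercased
    start: int     # character offset of the match
    snippet: str   # surrounding text for reviewer context


def build_matcher(terms: list) -> "re.Pattern":
    """Compile a case-insensitive matcher for whole watchlist terms."""
    # Longest first, so "atlas-7b-instruct" wins over "atlas-7b".
    ordered = sorted(terms, key=len, reverse=True)
    body = "|".join(re.escape(t) for t in ordered)
    # Reject matches embedded in a longer word or hyphenated name.
    return re.compile(r"(?<![\w-])(?:" + body + r")(?![\w-])", re.IGNORECASE)


def find_mentions(text: str, matcher: "re.Pattern", context: int = 20) -> list:
    """Return every watchlist hit in text with a little surrounding context."""
    hits = []
    for m in matcher.finditer(text):
        lo = max(0, m.start() - context)
        hi = min(len(text), m.end() + context)
        hits.append(Mention(m.group(0).lower(), m.start(), text[lo:hi]))
    return hits
```

A filter like this misses paraphrases and misspellings by construction, which is the case a language-model reader is better suited to.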

4

Cross-Platform Correlation

Signals across GitHub, Slack, and Drive are correlated by actor and timeframe, surfacing coordinated exfiltration attempts that no single-channel tool would catch.
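
Correlation by actor and timeframe can be sketched as a grouping problem: collect each actor's signals, sort them by time, and flag any window that spans more than one platform. The `Signal` record, the one-hour window, and the two-platform threshold are assumptions chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Signal:
    actor: str
    platform: str   # e.g. "github", "slack", "drive"
    at: datetime


def correlate(signals, window=timedelta(hours=1), min_platforms=2):
    """Return groups of one actor's signals that span min_platforms within window."""
    by_actor = defaultdict(list)
    for s in signals:
        by_actor[s.actor].append(s)

    clusters = []
    for events in by_actor.values():
        events.sort(key=lambda s: s.at)
        start = 0
        while start < len(events):
            # Extend the group while events stay within `window` of its first event.
            end = start
            while end + 1 < len(events) and events[end + 1].at - events[start].at <= window:
                end += 1
            group = events[start:end + 1]
            if len({s.platform for s in group}) >= min_platforms:
                clusters.append(group)
                start = end + 1   # skip past the cluster so groups do not overlap
            else:
                start += 1
    return clusters
```

A single GitHub push or a single Slack message never forms a cluster here; only the combination within the window does, which mirrors the point that single-channel tools see each event in isolation.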


Intellectual Property

Patent-pending detection methodology

The AI Artifact Fingerprinting methodology is the subject of a pending patent disclosure co-invented by PreserveIP's founding team at Texas A&M University, covering both the detection architecture and the legal evidence pipeline.

2 Pending Patents: AI artifact fingerprinting methodology and model/dataset fingerprinting method, covering cryptographic hashing, structural profiling, and semantic classification of AI IP artifacts. Named inventors: Shreyas Kumar & Srividhya Ragavan, Texas A&M University.


Platform Capabilities

Built for AI security teams, not generic DLP

30-Second Detection

From the moment a model file appears in a commit or is shared in Slack, classification and alerting complete in under 30 seconds.

Zero-Copy Architecture

Fingerprints are derived from metadata and sampling, so your model files never leave your environment. Privacy-first by design.

Confidence Scoring

Every detection comes with a 0–100% confidence score from Claude AI, reducing false-positive fatigue for your security team.

Public Repo Monitoring

PreserveIP scans public GitHub repositories for unauthorized fine-tunes or re-uploads of your registered model artifacts.


Get Started

Register your models. We'll watch the rest.

Join design partners getting early access to AI Artifact Fingerprinting.

Request Access → See Model Weight Detection →