04 / AI Security
AI Security
Testing
Your AI features are your newest attack surface. Most teams ship them untested.
LLMs, agents, and RAG pipelines introduce a class of vulnerabilities that traditional security testing does not cover. Prompt injection, insecure tool use, indirect context poisoning, and training data exposure require an adversary-perspective approach built specifically for AI systems. DALI X tests your AI features the way attackers will probe them — before they get the chance.
Methodology
How we test.
Prompt Injection
- Direct prompt injection via user input
- Indirect injection via retrieved documents (RAG)
- System prompt extraction
- Jailbreaking and guardrail bypass
- Multi-turn conversation manipulation
Insecure Output Handling
- LLM output injection into downstream systems
- XSS via rendered AI-generated content
- SQL/command injection via LLM-constructed queries
- Markdown/HTML injection in responses
Tool & Agent Security
- Function calling permission analysis
- Arbitrary code execution via agent tools
- SSRF via agent-controlled HTTP requests
- Privilege escalation through tool chaining
- Multi-agent trust boundary testing
RAG Pipeline
- Document poisoning and context manipulation
- Retrieval manipulation to override system prompts
- Embedding model attack surface
- Vector database access control review
Model & Data Security
- Training data extraction attempts
- Membership inference testing
- Model inversion and extraction
- Fine-tuned model behavior analysis
Supply Chain & Integration
- Model provenance and integrity review
- Plugin and extension security
- Third-party AI API dependency review
- API key and credential exposure in AI pipelines
Why this matters now
The attack surface is new. The risk is not.
Most AI features ship untested
Traditional DAST and SAST tools do not understand prompt injection or agent tool abuse. Teams ship LLM features with no security review.
OWASP Top 10 for LLMs
A dedicated threat taxonomy for LLM applications now exists. DALI X tests against every category — from prompt injection to model denial of service.
Compliance is catching up
SOC 2 auditors are beginning to ask about AI system security. HIPAA-covered entities using AI to process PHI face specific exposure. Get ahead of it.
Agents amplify the blast radius
An LLM with no tools is a text generator. An LLM agent with code execution, file access, and external API calls is an attack surface that can pivot through your entire environment.
RAG pipelines are supply chains
Documents, web pages, and databases feeding your retrieval pipeline are untrusted input. Attackers who can influence retrieved content can influence your model's behavior.
The window to fix this is now
AI security is immature. First movers who build secure AI pipelines today will have a structural advantage as scrutiny increases from customers, auditors, and regulators.
Sample findings
What we find.
Representative findings from past engagements. Client details redacted.
CRITICAL
Prompt Injection via User-Controlled Input in RAG Pipeline
Attacker-controlled document content injected into retrieval context, overriding system prompt and exfiltrating prior conversation history.
CRITICAL
LLM-Assisted SSRF — Internal Metadata Service Access
Prompt manipulation caused the model to make HTTP requests to cloud instance metadata endpoint, returning IAM credentials to the attacker.
HIGH
Insecure Tool Use — Arbitrary Code Execution via Function Calling
Agent framework passed unsanitized LLM output directly to a code execution tool, enabling remote code execution on the host.
HIGH
Training Data Extraction via Membership Inference
Systematic querying of a fine-tuned model reproduced verbatim PII strings from the training dataset.
Pricing
Engagement tiers.
Most common
LLM Application Review
$8,000 – $18,000
Teams shipping LLM-powered features
- Prompt injection testing (direct and indirect)
- System prompt extraction attempts
- Output handling and injection into downstream systems
- Tool/function calling security review
- Context window poisoning
- Findings report + remediation guidance
Most common. Covers the core attack surface for LLM-integrated applications.
Get a QuoteAI Agent Assessment
$15,000 – $30,000
Autonomous agent systems and pipelines
- Full agentic attack surface enumeration
- Tool permission and scope analysis
- Multi-agent trust boundary testing
- Memory and state manipulation
- SSRF and lateral movement via agent actions
- Compliance-aligned report
For teams building autonomous agents with tool access, code execution, or external integrations.
Get a QuoteRed Team — AI System
Custom — contact us
Enterprise AI platforms
- Full adversarial assessment of AI system
- Model extraction and inversion attempts
- RAG pipeline attack path analysis
- Integration layer security (APIs, plugins)
- Supply chain review (model provenance)
- Executive + technical report package
Scoped per system. Timeframe varies with complexity.
Get a QuoteFAQ
Common questions.
Do you need access to our model weights?
No. Most AI security testing is black-box or grey-box — we interact with your application the way an attacker would. For fine-tuned model assessments we may request read-only access to model configuration, but never weights.
What frameworks and providers do you cover?
We test across OpenAI, Anthropic, and open-source models. For agent frameworks we cover LangChain, LlamaIndex, AutoGen, and custom implementations. If you use something else, ask — we scope per engagement.
Our AI features are still in development. Is it too early to test?
No — it is the right time. Security findings are cheapest to fix before launch. A pre-launch AI security review is significantly less disruptive than remediating a prompt injection vulnerability after customers are using the feature.
How does this relate to our existing pentest?
AI security testing is a separate scope from traditional web application testing. If your application has both a conventional web surface and LLM-powered features, we can scope them together or as separate engagements depending on your timeline and budget.
Is there a compliance requirement driving this?
Not yet — but it is coming. SOC 2 auditors are increasingly asking about AI system controls. HIPAA-covered entities processing PHI through LLMs face specific risk. We can write the report to support your compliance posture regardless of current requirements.
Start an engagement
Ready to see what's really there?
No commitment, no pitch deck. A direct conversation about your threat surface and what a DALI X engagement looks like.
Request Scoping Call →ResponseAll scoping inquiries answered within one business day.
NDA firstMutual NDA signed before any scoping conversation begins.
ComplianceSOC 2, PCI DSS, and HIPAA report-ready engagements.
Contacthello@dali-x.com