AI Security Testing

Your AI features are your newest attack surface. Most teams ship them untested.

LLMs, agents, and RAG pipelines introduce a class of vulnerabilities that traditional security testing does not cover. Prompt injection, insecure tool use, indirect context poisoning, and training data exposure require an adversary-perspective approach built specifically for AI systems. DALI X tests your AI features the way attackers will probe them — before they get the chance.

Focus areas: LLMs · Agents · RAG · Fine-tuned models

Framework: OWASP Top 10 for LLMs

Integrations: OpenAI · Anthropic · LangChain · custom

Compliance: SOC 2 · HIPAA · emerging AI frameworks

Starting from: $8,000

Retest: 30-day free on Critical/High

Methodology

How we test.

Prompt Injection

Direct prompt injection via user input
Indirect injection via retrieved documents (RAG)
System prompt extraction
Jailbreaking and guardrail bypass
Multi-turn conversation manipulation

Insecure Output Handling

LLM output injection into downstream systems
XSS via rendered AI-generated content
SQL/command injection via LLM-constructed queries
Markdown/HTML injection in responses
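
As a rough illustration of the output-handling issues listed above, the sketch below treats model output as untrusted data: it is escaped before rendering and bound as a query parameter instead of being spliced into SQL. The function names, the `orders` table, and its columns are hypothetical and exist only for this example; real applications would also need context-aware encoding for attributes, URLs, and Markdown.

```python
import html
import sqlite3


def render_llm_text(llm_output: str) -> str:
    """Escape model output before it is placed into an HTML page."""
    return html.escape(llm_output)


def find_orders(conn: sqlite3.Connection, customer_name: str) -> list:
    """Bind a model-extracted value as a parameter; never build SQL text from it."""
    cur = conn.execute(
        "SELECT id FROM orders WHERE customer = ?", (customer_name,)
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT)")  # hypothetical schema
    conn.execute("INSERT INTO orders VALUES (1, 'alice')")
    print(render_llm_text("<script>alert(1)</script>"))
    print(find_orders(conn, "alice' OR '1'='1"))
```

With the bound parameter, the injection string is compared as a literal customer name and matches no rows.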

Tool & Agent Security

Function calling permission analysis
Arbitrary code execution via agent tools
SSRF via agent-controlled HTTP requests
Privilege escalation through tool chaining
Multi-agent trust boundary testing

RAG Pipeline

Document poisoning and context manipulation
Retrieval manipulation to override system prompts
Embedding model attack surface
Vector database access control review

Model & Data Security

Training data extraction attempts
Membership inference testing
Model inversion and extraction
Fine-tuned model behavior analysis

Supply Chain & Integration

Model provenance and integrity review
Plugin and extension security
Third-party AI API dependency review
API key and credential exposure in AI pipelines

Why this matters now

The attack surface is new. The risk is not.

Most AI features ship untested

Traditional DAST and SAST tools do not understand prompt injection or agent tool abuse. As a result, many teams ship LLM features without any security review that covers them.

OWASP Top 10 for LLMs

A dedicated threat taxonomy for LLM applications now exists. DALI X tests against every category — from prompt injection to model denial of service.

Compliance is catching up

SOC 2 auditors are beginning to ask about AI system security. HIPAA-covered entities using AI to process PHI face specific exposure. Get ahead of it.

Agents amplify the blast radius

An LLM with no tools is a text generator. An LLM agent with code execution, file access, and external API calls is an attack surface that can pivot through your entire environment.

RAG pipelines are supply chains

Documents, web pages, and databases feeding your retrieval pipeline are untrusted input. Attackers who can influence retrieved content can influence your model's behavior.
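
One way to picture "retrieved content is untrusted input" is a screening step between retrieval and prompt assembly. The sketch below is a heuristic only: the patterns and function names are hypothetical, and pattern matching of this kind is easy to bypass, so it should be read as an illustration of where a control could sit, not as a sufficient defense.

```python
import re

# Hypothetical patterns for illustration; a real control would need a broader, tuned set.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore\s+(?:all\s+|any\s+|the\s+)?(?:previous|prior|above)\s+instructions", re.I),
    re.compile(r"\byou are now\b", re.I),
    re.compile(r"reveal\s+(?:the|your)\s+system\s+prompt", re.I),
]


def flag_chunk(text: str) -> list[str]:
    """Return the patterns that matched, so a reviewer can see why a chunk was flagged."""
    return [p.pattern for p in SUSPICIOUS_PATTERNS if p.search(text)]


def partition_chunks(chunks: list[str]) -> tuple[list[str], list[str]]:
    """Split retrieved chunks into (clean, quarantined) before they reach the prompt."""
    clean, quarantined = [], []
    for chunk in chunks:
        (quarantined if flag_chunk(chunk) else clean).append(chunk)
    return clean, quarantined
```

Quarantined chunks can be logged and reviewed instead of being dropped silently, which also gives a signal that someone is attempting to poison the corpus.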

The window to fix this is now

AI security is immature. First movers who build secure AI pipelines today will have a structural advantage as scrutiny increases from customers, auditors, and regulators.

Sample findings

What we find.

Representative findings from past engagements. Client details redacted.

CRITICAL

Prompt Injection via User-Controlled Input in RAG Pipeline

Attacker-controlled document content was injected into the retrieval context, overriding the system prompt and exfiltrating prior conversation history.

CRITICAL

LLM-Assisted SSRF — Internal Metadata Service Access

Prompt manipulation caused the model to make HTTP requests to the cloud instance metadata endpoint, returning IAM credentials to the attacker.
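
A finding like this usually comes down to an HTTP tool that fetches whatever URL the model proposes. A minimal sketch of a pre-flight check, using only the Python standard library, is shown below. The function names are hypothetical, and the check is incomplete on its own: the caller would still need to connect to the address that was validated (to avoid DNS rebinding) and re-check every redirect.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_blocked_address(ip: str) -> bool:
    """True for addresses an agent-controlled fetch should never reach."""
    addr = ipaddress.ip_address(ip)
    return (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local  # covers 169.254.169.254, the common metadata address
        or addr.is_reserved
        or addr.is_multicast
        or addr.is_unspecified
    )


def url_is_safe(url: str, resolve=socket.getaddrinfo) -> bool:
    """Allow only http(s) URLs whose host resolves exclusively to public addresses."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        infos = resolve(parsed.hostname, parsed.port or 443)
    except OSError:
        return False
    # Each getaddrinfo entry ends with a sockaddr tuple whose first item is the IP.
    return bool(infos) and all(not is_blocked_address(info[4][0]) for info in infos)
```

The `resolve` parameter exists so the check can be exercised with a stub resolver in tests without touching the network.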

HIGH

Insecure Tool Use — Arbitrary Code Execution via Function Calling

Agent framework passed unsanitized LLM output directly to a code execution tool, enabling remote code execution on the host.
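
The usual alternative to handing model output straight to an execution tool is a dispatcher that only runs named tools from an allowlist and validates arguments first. The sketch below assumes the model proposes calls as JSON with `name` and `arguments` keys; that shape, the `get_weather` stub, and the `ALLOWED_TOOLS` registry are all hypothetical and do not correspond to any specific agent framework.

```python
import json
from typing import Callable


def get_weather(city: str) -> str:
    """Hypothetical stub tool used only for this example."""
    return f"weather for {city}"


# Allowlist: tool name -> (callable, expected argument names and types).
ALLOWED_TOOLS: dict[str, tuple[Callable[..., str], dict[str, type]]] = {
    "get_weather": (get_weather, {"city": str}),
}


def dispatch_tool_call(raw: str) -> str:
    """Parse a model-proposed tool call and run it only if it matches the allowlist."""
    call = json.loads(raw)
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    name, args = call.get("name"), call.get("arguments", {})
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool not allowed: {name!r}")
    func, schema = ALLOWED_TOOLS[name]
    if not isinstance(args, dict) or set(args) != set(schema):
        raise ValueError("unexpected argument names")
    for key, expected in schema.items():
        if not isinstance(args[key], expected):
            raise ValueError(f"argument {key!r} must be {expected.__name__}")
    return func(**args)
```

Nothing in this path evaluates model text as code; if a tool genuinely needs code execution, that tool would belong in an isolated sandbox with its own permissions.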

HIGH

Training Data Extraction from a Fine-Tuned Model

Systematic querying of a fine-tuned model reproduced verbatim PII strings from the training dataset.

Pricing

Engagement tiers.

LLM Application Review

$8,000 – $18,000

Teams shipping LLM-powered features

Prompt injection testing (direct and indirect)
System prompt extraction attempts
Output handling and injection into downstream systems
Tool/function calling security review
Context window poisoning
Findings report + remediation guidance

Most common. Covers the core attack surface for LLM-integrated applications.

AI Agent Assessment

$15,000 – $30,000

Autonomous agent systems and pipelines

Full agentic attack surface enumeration
Tool permission and scope analysis
Multi-agent trust boundary testing
Memory and state manipulation
SSRF and lateral movement via agent actions
Compliance-aligned report

For teams building autonomous agents with tool access, code execution, or external integrations.

Red Team — AI System

Custom — contact us

Enterprise AI platforms

Full adversarial assessment of AI system
Model extraction and inversion attempts
RAG pipeline attack path analysis
Integration layer security (APIs, plugins)
Supply chain review (model provenance)
Executive + technical report package

Scoped per system. Timeframe varies with complexity.

FAQ

Common questions.

Do you need access to our model weights?

No. Most AI security testing is black-box or grey-box — we interact with your application the way an attacker would. For fine-tuned model assessments we may request read-only access to model configuration, but never weights.

What frameworks and providers do you cover?

We test across OpenAI, Anthropic, and open-source models. For agent frameworks we cover LangChain, LlamaIndex, AutoGen, and custom implementations. If you use something else, ask — we scope per engagement.

Our AI features are still in development. Is it too early to test?

No — it is the right time. Security findings are cheapest to fix before launch. A pre-launch AI security review is significantly less disruptive than remediating a prompt injection vulnerability after customers are using the feature.

How does this relate to our existing pentest?

AI security testing is a separate scope from traditional web application testing. If your application has both a conventional web surface and LLM-powered features, we can scope them together or as separate engagements depending on your timeline and budget.

Is there a compliance requirement driving this?

Not yet — but it is coming. SOC 2 auditors are increasingly asking about AI system controls. HIPAA-covered entities processing PHI through LLMs face specific risk. We can write the report to support your compliance posture regardless of current requirements.

Start an engagement

Ready to see what's really there?

No commitment, no pitch deck. A direct conversation about your threat surface and what a DALI X engagement looks like.

Request Scoping Call →

Response: All scoping inquiries answered within one business day.

NDA first: Mutual NDA signed before any scoping conversation begins.

Compliance: SOC 2, PCI DSS, and HIPAA report-ready engagements.

Contact: hello@dali-x.com
