Securing AI and LLM API access
Applications that call LLM APIs like OpenAI, Anthropic, or Azure OpenAI typically embed static API keys in code.
Exfiltrated or misused keys let attackers consume API services at the victim’s expense, an attack pattern known as LLMjacking. Attackers obtain API keys through exposed repositories, compromised environments, or leaked credentials, then resell access or use it directly.
Stolen credentials sell for as little as $30, and a single compromised key can generate significant unexpected charges in hours.
LLMjacking and denial of wallet attacks
LLMjacking is the unauthorized use of stolen LLM API credentials to consume AI services at the victim’s expense. Attackers obtain API keys through exposed repositories, compromised environments, or leaked credentials, then resell access or use it directly. Sysdig reported a 10x increase in LLMjacking attacks in mid-2024, and Pillar Security documented 35,000 attack sessions targeting exposed AI infrastructure between December 2025 and January 2026.
LLMjacking is a form of denial of wallet (DoW) attack: a broader class of attacks that exploit consumption-based billing models to inflict financial damage. Unlike traditional denial-of-service attacks that crash systems, denial of wallet attacks drain your budget through excessive resource consumption.
LLM APIs are particularly susceptible because of how they’re priced and provisioned:
- LLM API calls cost markedly more than typical cloud API calls
- Token-based pricing means long responses or large context windows multiply costs
- Many LLM providers allow unlimited usage with pay-as-you-go billing and no built-in rate limits
As an example, one experimental AI agent project making unconstrained calls to GPT-4 ran up about $4M-$15M in bills in a single week.
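To see how token-based pricing multiplies costs, here is a minimal back-of-the-envelope sketch. The per-token rates and call volumes are illustrative assumptions, not any provider's actual pricing:

```python
# Hypothetical per-token pricing (illustrative only; check your
# provider's current rate card before relying on these numbers).
PRICE_PER_1K_INPUT = 0.03   # USD per 1,000 input tokens
PRICE_PER_1K_OUTPUT = 0.06  # USD per 1,000 output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of a single LLM API call under token-based pricing."""
    return (input_tokens / 1000) * PRICE_PER_1K_INPUT + \
           (output_tokens / 1000) * PRICE_PER_1K_OUTPUT

# A large context window multiplies cost: assume 100k input tokens
# per call and 10,000 calls per day from a stolen, unthrottled key.
daily = 10_000 * request_cost(100_000, 2_000)
print(f"${daily:,.2f} per day")  # → $31,200.00 per day
```

Even modest per-call costs compound quickly when a key has no rate limit or spend cap attached to it.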
What Aembit solves
Aembit eliminates embedded API keys and brings identity-based access control to LLM API access. Instead of distributing static credentials to applications, Aembit authenticates workloads and issues short-lived credentials based on your Access Policy rules.
- Applications request temporary credentials from Aembit instead of storing static API keys in code or configuration files.
- Access Policies enforce conditional access to prevent unauthorized or runaway LLM API usage that generates unexpected bills.
- Aembit logs every access request with verified workload identity, so you can track which workload accessed which LLM API, when, and under what conditions.
- Protecting LLM APIs uses the same Aembit pattern as any other API. If you’re already using Aembit for other resources, you don’t need any new features.
- Aembit verifies workload identity before granting access, replacing static keys with cryptographically attested identity.
LLM API credential vulnerabilities
Most LLM APIs use bearer token authentication: a static API key passed in an HTTP header, where anyone who possesses the token can use it to access the protected resource. No proof of identity is required beyond holding the token itself, making this one of the most rudimentary authentication methods in use.
- Keys often remain valid indefinitely until manually revoked.
- A single key typically grants full account access to all models and endpoints.
- The credential itself provides no protection against excessive usage.
- The API provider has no way to know which workload or user is actually making the call.
An exfiltrated key is functionally identical to a legitimate one. Without external controls like Aembit, there’s no way to distinguish between authorized and unauthorized usage until the bill arrives.
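The weakness is visible in how a bearer-token request is constructed. This sketch uses a fake key and the standard OpenAI endpoint path for illustration; the point is that the request carries no proof of who holds the token:

```python
# A bearer token is just a header value: whoever holds it can send it.
# The key below is a fake placeholder, not a real credential.
import urllib.request

STOLEN_KEY = "sk-example-not-a-real-key"  # indistinguishable from a legitimate key

req = urllib.request.Request(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {STOLEN_KEY}"},
)
# The API provider sees only the token; nothing in the request proves
# which workload (or which attacker) actually sent it.
```

From the provider's side, the only signal is the token itself, which is why usage anomalies often surface first on the invoice.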
Real example: Application accessing OpenAI API
Your organization runs a customer support chatbot that calls OpenAI’s GPT-4 API. Without Aembit, you’d need to:
- Create an OpenAI API key
- Store that key in your application’s environment variables or secrets manager
- Manually rotate the key periodically and update all consuming applications
- Accept that any compromise of your application, deployment pipeline, or secrets storage exposes the key
With Aembit, this flow looks different:
Application requests credentials from Aembit instead of storing them directly. Access Policy enforces which workloads can call the LLM API and under what conditions.
The flow works like this:
- Your application needs to call the OpenAI API
- Aembit Edge intercepts the request and asks Aembit for credentials
- Aembit verifies the workload’s identity using a Trust Provider (AWS IAM role, Kubernetes service account, Azure managed identity, etc.)
- If the Access Policy allows this workload to access OpenAI, Aembit retrieves or generates the appropriate credential
- Aembit returns a temporary credential to the application
- The application uses that credential to call the OpenAI API
The application makes a standard HTTP request to OpenAI. Aembit handles authentication and authorization transparently, so your application code doesn’t change.
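A minimal sketch of what that application code can look like. The endpoint and payload follow the standard OpenAI Chat Completions shape; the notable part is what is absent, namely any Authorization header or stored key:

```python
# With Aembit Edge intercepting outbound traffic, application code
# carries no API key at all. Payload shape follows the standard
# OpenAI Chat Completions request format.
import json
import urllib.request

payload = {
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello"}],
}

req = urllib.request.Request(
    "https://api.openai.com/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},  # no Authorization header
)
# Aembit Edge intercepts this request, verifies the workload's identity,
# and injects a short-lived credential before it reaches the provider.
```

Because the credential is injected in transit, rotating or revoking it requires no code change or redeployment.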
Why this matters for AI and LLM access
LLM APIs bill per token and per request, making them prime targets for LLMjacking: attackers use stolen credentials to consume AI services at the victim’s expense. Credential protection for LLM APIs therefore has a direct budget impact.
While some LLM providers offer more sophisticated authentication options, the most common pattern across the LLM ecosystem remains static API keys with rudimentary security. Aembit upgrades this security model without requiring LLM providers to change anything.
Connecting to an LLM API is the same fundamental pattern as connecting to any other resource. If you’re already using Aembit for other resources, you can protect LLM API access immediately with the same capabilities, without any new features.
When an application makes thousands of LLM API calls, compliance, finance, and security teams need to know which workload was responsible. Aembit logs every access request with verified workload identity, creating an audit trail that ties API usage to specific applications.
Supported LLM providers
Aembit works with any LLM API that accepts standard authentication mechanisms like API keys, OAuth tokens, or bearer tokens. The integration is credential-provider-specific, not provider-specific. Configure Aembit with the appropriate Credential Provider for your LLM service, and Aembit handles the rest.
Common LLM providers include:
- OpenAI (GPT-4, ChatGPT API), using API key authentication
- Anthropic (Claude API), using API key authentication
- Azure OpenAI Service, using Microsoft Entra managed identities or API keys
- Google Vertex AI (Gemini API), using GCP Workload Identity Federation
- Amazon Bedrock, using AWS STS federation
- Any HTTP-based LLM API that uses API keys, OAuth, or bearer tokens
See the Credential Provider documentation for configuration details specific to your LLM provider’s authentication method.
Next steps
- Learn about Access Policies to control which workloads can access LLM APIs and under what conditions
- Explore AI agent scenarios where AI agents themselves become workloads accessing enterprise resources via the Model Context Protocol (MCP)
- Review Trust Provider options to verify workload identity using your environment’s native identity system
- Secure third-party SaaS API access beyond LLMs, including Salesforce, Slack, Snowflake, and other SaaS applications