LangGrinch Alert: Critical LangChain Vulnerability CVE-2025-68664 - Detection and Response Guide


2026-01-31 · 9 min read
vulnerability · langchain · ai-security · defender-xdr · kql · supply-chain · cve

A critical vulnerability just dropped in one of the most widely deployed AI frameworks on the planet. CVE-2025-68664, dubbed "LangGrinch," affects LangChain Core's serialization mechanism and carries a CVSS score of 9.3. Running LangChain in production? With 98 million downloads last month alone, many of you are, and you need to act now.

This isn't theoretical. The vulnerability allows secret extraction from environment variables, arbitrary object instantiation, and potentially remote code execution. Microsoft published detection guidance yesterday. I'm going to walk you through exactly what SOC teams need to know.

TL;DR: What You Need to Do Right Now

  1. Patch immediately: Update to langchain-core 0.3.81+ or 1.2.5+
  2. Hunt for vulnerable versions: Use the KQL queries below
  3. Assess exposure: Check if your AI workloads serialize/deserialize LLM outputs
  4. Monitor: Deploy detection rules for exploitation attempts

Understanding the Vulnerability

LangChain uses a custom serialization format with a reserved lc marker to distinguish between regular data and serialized LangChain objects. The vulnerability exists because the dumps() and dumpd() functions didn't properly escape user-controlled dictionaries containing the lc key.

The Attack Chain

  1. Injection: Attacker crafts malicious input containing an lc key
  2. Serialization: The input gets serialized through normal LangChain operations (logging, streaming, caching)
  3. Deserialization: When the data is later deserialized, LangChain treats the attacker's dict as a trusted object
  4. Exploitation: This leads to:
    • Environment variable extraction (including API keys, secrets)
    • Arbitrary object instantiation within approved namespaces
    • Potential code execution via Jinja2 template injection

Why This Is Worse Than It Sounds

The most dangerous attack vector isn't someone sending you a malicious blob directly. It's prompt injection.

LLM responses can influence fields like additional_kwargs and response_metadata. These fields get serialized during streaming, logging, and message history operations. A single malicious prompt can cascade through your AI pipeline and trigger exploitation.

User prompt → LLM response (influenced) → Serialization → Deserialization → Secret theft

This is exactly the "AI meets classic security" intersection that catches organizations off guard.
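One defensive option is to neutralize the reserved marker in untrusted data before it ever reaches a serializer. A minimal sketch, assuming you control the point where LLM output enters your pipeline; the `strip_lc_marker` helper and the `_lc_escaped` replacement key are illustrative, not part of LangChain's API:

```python
def strip_lc_marker(obj):
    """Recursively rename the reserved 'lc' key in untrusted data so a
    later deserializer cannot mistake it for a serialized LangChain object."""
    if isinstance(obj, dict):
        return {
            ("_lc_escaped" if k == "lc" else k): strip_lc_marker(v)
            for k, v in obj.items()
        }
    if isinstance(obj, list):
        return [strip_lc_marker(v) for v in obj]
    return obj

# Example: attacker-influenced metadata from an LLM response
tainted = {"response_metadata": {"lc": 1, "type": "secret", "id": ["OPENAI_API_KEY"]}}
clean = strip_lc_marker(tainted)
# The reserved key is neutralized; all other keys and values survive intact
```

This is a mitigation for code you can't patch yet, not a substitute for upgrading: the real fix escapes these dictionaries inside langchain-core itself.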

Detection: Hunt for Vulnerable LangChain Versions

Defender XDR: Find Vulnerable Software

Microsoft published this KQL query to identify devices running vulnerable versions:

// Find devices with vulnerable LangChain versions
DeviceTvmSoftwareInventory
| where SoftwareName has "langchain"
    and (
        // 0.x versions below 0.3.81
        SoftwareVersion startswith "0."
        and (
            toint(split(SoftwareVersion, ".")[1]) < 3
            or (
                SoftwareVersion startswith "0.3."
                and toint(split(SoftwareVersion, ".")[2]) < 81
            )
        )
        // 1.x versions below 1.2.5
        or (
            SoftwareVersion startswith "1."
            and (
                toint(split(SoftwareVersion, ".")[1]) < 2
                or (
                    toint(split(SoftwareVersion, ".")[1]) == 2
                    and toint(split(SoftwareVersion, ".")[2]) < 5
                )
            )
        )
    )
| project DeviceName, OSPlatform, SoftwareName, SoftwareVersion
| summarize 
    VulnerableDevices = dcount(DeviceName),
    Devices = make_set(DeviceName)
    by SoftwareName, SoftwareVersion
| order by VulnerableDevices desc

Defender for Containers: Cloud Process Monitoring

If you're running Defender for Containers, you can hunt for LangChain-related process activity in your containerized workloads using the CloudProcessEvents table:

// Find LangChain processes in containers (Defender for Containers)
CloudProcessEvents
| where Timestamp > ago(7d)
| where ProcessCommandLine has_any ("langchain", "langgraph", "lc_")
    or FileName startswith "python"
| where ProcessCommandLine has_any ("pip install", "import langchain")
| project 
    Timestamp,
    AccountName,
    FileName,
    ProcessCommandLine,
    ContainerName,
    PodName
| order by Timestamp desc

Note: CloudProcessEvents is in Preview. For software inventory across endpoints, use the DeviceTvmSoftwareInventory query in the previous section.

Hunt for Suspicious Serialization Activity

This query looks for Python processes associated with LangChain making unexpected network connections. That's a potential indicator of secret exfiltration:

// Detect potential LangGrinch exploitation - outbound connections from Python/LangChain
DeviceNetworkEvents
| where Timestamp > ago(24h)
| where InitiatingProcessFileName in ("python", "python3", "python.exe")
| where InitiatingProcessCommandLine has_any ("langchain", "lc_", "langgraph")
| where RemoteIPType == "Public"
| where ActionType == "ConnectionSuccess"
| summarize 
    ConnectionCount = count(),
    RemoteIPs = make_set(RemoteIP),
    RemoteUrls = make_set(RemoteUrl),
    FirstSeen = min(Timestamp),
    LastSeen = max(Timestamp)
    by DeviceName, InitiatingProcessCommandLine
| where ConnectionCount > 10 or array_length(RemoteIPs) > 3
| order by ConnectionCount desc

Hunt for Environment Variable Access

The primary exploitation path involves extracting environment variables. Look for suspicious env var access patterns:

// Monitor for suspicious environment variable access in Python processes
DeviceProcessEvents
| where Timestamp > ago(7d)
| where FileName in ("python", "python3", "python.exe")
| where ProcessCommandLine has_any ("os.environ", "getenv", "AWS_", "OPENAI_", "AZURE_", "API_KEY")
| where ProcessCommandLine has "langchain"
| project 
    Timestamp,
    DeviceName,
    ProcessCommandLine,
    InitiatingProcessCommandLine,
    AccountName
| order by Timestamp desc

Vulnerable Flows: Know Your Attack Surface

The advisory identifies 12 distinct vulnerable flows. Here are the most common in production environments:

Flow | Risk Level | Description
--- | --- | ---
astream_events(version="v1") | High | v1 uses vulnerable serialization; v2 is safe
Runnable.astream_log() | High | Streaming logs trigger serialize/deserialize
dumps() / loads() on untrusted data | Critical | Direct path to exploitation
RunnableWithMessageHistory | High | Message history serialization
InMemoryVectorStore.load() | Medium | Vector store deserialization
hub.pull() (LangChain Hub) | Medium | Manifest pulling from an external source

Check Your Code

Search your codebase for these vulnerable patterns:

# Find potentially vulnerable code patterns
grep -rn "astream_events.*version.*v1" .
grep -rn "astream_log" .
grep -rn "dumps\|dumpd" . | grep -i langchain
grep -rn "RunnableWithMessageHistory" .
grep -rn "loads\|load" . | grep -i langchain
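On hosts without grep, a rough Python equivalent of the searches above can do the job. A sketch under the same assumptions; the pattern list mirrors the greps (and, like them, will surface noise such as `json.dumps` that you'll need to triage by hand):

```python
import re
from pathlib import Path

# Regexes for the risky patterns flagged above (illustrative, not exhaustive)
PATTERNS = [
    r'astream_events\(.*version\s*=\s*["\']v1["\']',
    r'\bastream_log\b',
    r'\b(dumps|dumpd|loads)\b',
    r'\bRunnableWithMessageHistory\b',
]

def scan(root: str) -> list[tuple[str, int, str]]:
    """Return (file, line_number, matched_line) for each hit under root."""
    hits = []
    for path in Path(root).rglob("*.py"):
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if any(re.search(p, line) for p in PATTERNS):
                hits.append((str(path), lineno, line.strip()))
    return hits
```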

Remediation Playbook

Phase 1: Immediate Patching (0-24 hours)

Priority 1: Upgrade LangChain Core

# For 0.3.x users
pip install "langchain-core>=0.3.81"

# For 1.x users  
pip install "langchain-core>=1.2.5"

# Verify the version
python -c "import langchain_core; print(langchain_core.__version__)"

Priority 2: Pin versions in requirements

# requirements.txt
langchain-core>=0.3.81,<0.4.0  # or >=1.2.5 for 1.x
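To catch regressions, you can also fail CI when a vulnerable langchain-core sneaks back in. A minimal gate comparing the installed version against the patched thresholds; this is my own sketch, and the parsing is deliberately naive (it assumes plain `X.Y.Z` version strings):

```python
def is_patched(version: str) -> bool:
    """True if langchain-core `version` includes the LangGrinch fix
    (>= 0.3.81 on the 0.x line, >= 1.2.5 on the 1.x line)."""
    parts = tuple(int(p) for p in version.split(".")[:3])
    if parts[0] == 0:
        return parts >= (0, 3, 81)
    if parts[0] == 1:
        return parts >= (1, 2, 5)
    return parts[0] > 1  # assumes future major lines ship with the fix

# In a CI step:
# from importlib.metadata import version
# assert is_patched(version("langchain-core")), "vulnerable langchain-core installed"
```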

Phase 2: Assess and Harden (24-72 hours)

  1. Audit secrets_from_env usage

    Before the patch, secrets_from_env=True was the default. Review your deserialization calls:

    # Unsafe (pre-patch default)
    loads(serialized_data, secrets_from_env=True)
    
    # Safe
    loads(serialized_data, secrets_from_env=False)
    
  2. Review LLM output handling

    Treat these fields as untrusted:

    • additional_kwargs
    • response_metadata
    • Tool outputs
    • Retrieved documents
  3. Check for astream_events v1 usage

    Migrate to v2:

    # Vulnerable
    async for event in runnable.astream_events(input, version="v1"):
        ...
    
    # Safe
    async for event in runnable.astream_events(input, version="v2"):
        ...
    

Phase 3: Detection Deployment (72+ hours)

Deploy the KQL queries above as scheduled analytics rules in Microsoft Sentinel or as custom detection rules in Defender XDR.

Example Sentinel rule configuration:

name: LangGrinch - Vulnerable LangChain Detected
severity: High
tactics:
  - InitialAccess
  - Execution
query: |
  DeviceTvmSoftwareInventory
  | where SoftwareName has "langchain"
  // ... (apply the full version-range logic from the hunting query above;
  // a simple !startswith check would wrongly flag later patched releases)
triggerThreshold: 0
frequency: 1d

The Bigger Picture: AI Supply Chain Security

This vulnerability highlights a critical blind spot in most organizations: AI framework security.

Yesterday, I wrote about securing AI data with Microsoft Purview. LangGrinch shows the other side of the coin. It's not just about protecting data in AI, but protecting your infrastructure from AI frameworks themselves.

Key questions for your AI security posture:

  1. Inventory: Where are you running AI/LLM frameworks?
  2. Versioning: Can you quickly identify which versions are deployed?
  3. Trust boundaries: Do you treat LLM outputs as untrusted input?
  4. Serialization: What data crosses serialize/deserialize boundaries?

If you can't answer these questions quickly and confidently, you're flying blind when advisories like this drop.

Technical Deep Dive: The lc Marker Confusion

For those who want to understand the mechanics:

LangChain's serialization format uses special keys to encode object metadata:

# Normal serialized LangChain object
{
    "lc": 1,
    "type": "constructor",
    "id": ["langchain_core", "messages", "HumanMessage"],
    "kwargs": {"content": "Hello"}
}

The vulnerability: if user-controlled data contains an lc key, the deserializer treats it as a trusted object:

# Attacker-controlled data
malicious_dict = {
    "lc": 1,
    "type": "secret",
    "id": ["OPENAI_API_KEY"]  # Extracts env var!
}

# This gets serialized and later deserialized...
# The framework thinks it's a real LangChain object
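The confusion can be reproduced without LangChain itself. The toy deserializer below mimics the pre-patch logic in miniature; `toy_loads` is a simplification I wrote for illustration, not the real code path:

```python
import os

def toy_loads(data):
    """Simplified stand-in for a pre-patch loads(): any dict carrying the
    reserved 'lc' marker is trusted and revived as a framework object."""
    if isinstance(data, dict) and data.get("lc") == 1:
        if data.get("type") == "secret":
            # Pre-patch default behavior (secrets_from_env=True):
            # resolve the named secret from the environment
            return os.environ.get(data["id"][0])
    return data  # ordinary data passes through untouched

os.environ["OPENAI_API_KEY"] = "sk-demo"  # stand-in secret for the demo
attacker = {"lc": 1, "type": "secret", "id": ["OPENAI_API_KEY"]}
leaked = toy_loads(attacker)  # returns "sk-demo" instead of the dict
```

The attacker never calls the deserializer directly; the framework does, on data the attacker merely influenced.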

The patch wraps user dictionaries containing lc to prevent confusion:

# Post-patch: user dict with "lc" is escaped
{
    "lc": 1,
    "type": "not_implemented",
    "repr": "<wrapped user dict>"
}

JavaScript/TypeScript Alert

There's a parallel vulnerability in LangChainJS: CVE-2025-68665 (GHSA-r399-636x-v7f6). Same mechanics, same risks. If you run both Python and JavaScript LangChain stacks, patch both.

Key Takeaways

  • CVE-2025-68664 is critical (CVSS 9.3). Patch to 0.3.81+ or 1.2.5+ immediately.
  • The attack vector is subtle. Prompt injection can trigger exploitation through normal framework operations.
  • Detection is possible. Use the KQL queries to find vulnerable versions and suspicious activity.
  • LLM outputs are untrusted input. Treat additional_kwargs, response_metadata, and tool outputs accordingly.
  • AI supply chain security matters. Know where your AI frameworks are deployed and what versions you're running.

The frameworks we use to build intelligent applications are becoming attack surfaces themselves. LangGrinch won't be the last vulnerability of this kind.


Questions? Connect on LinkedIn or see my CV.

About the Author

Trym Håkansson is Lead of Security Operations at Crayon, specializing in MDR, incident response, and Microsoft security platforms.