LLM JSON Repair Guide: Fix Broken AI Output Online (2026)
LLMs break JSON with markdown fences, Python booleans, truncated output, and smart quotes. This guide shows how to repair any LLM JSON error online or in code.
If you've been building with any LLM API in 2026 (OpenAI, Anthropic Claude, Google Gemini, or a local model), you've seen this:
{"success": True, "data": [{"id": 1, "name": "Alice"
JSON.parse() throws. Your pipeline breaks.
This guide explains why LLMs produce broken JSON, covers every failure pattern, and gives you the fastest repair path — online and in code.
Quick fix: Paste your broken LLM JSON into the AI JSONMedic repair tool for an instant one-click fix.
Why LLMs Break JSON
Modern LLMs generate JSON probabilistically. Even with json_object mode enabled, models can produce output that fails strict JSON.parse() for several reasons:
- They were trained on Python code and leak True, False, None instead of true, false, null
- They wrap JSON in markdown code fences (```json)
- They hit token limits mid-output and truncate the response
- They use smart quotes (“ ”) instead of standard ASCII double quotes
- They add trailing commas after the last element
- They include JavaScript-style comments that JSON doesn't support
- They drop required fields when schemas are complex
These aren't bugs you can fully prevent. Even with native structured output, edge cases exist in every major provider.
The 7 LLM JSON Failure Patterns
1. Python Boolean Literals
Seen in: GPT-4, Claude, Gemini, especially when the prompt has Python examples
{"success": True, "active": False, "value": None}
LLMs trained on Python code conflate Python's True/False/None with JSON's true/false/null. This is one of the most common LLM JSON bugs.
```python
import re

def fix_python_booleans(s: str) -> str:
    s = re.sub(r'\bTrue\b', 'true', s)
    s = re.sub(r'\bFalse\b', 'false', s)
    s = re.sub(r'\bNone\b', 'null', s)
    return s
```
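The regex above also rewrites the words True, False, and None when they appear inside string values, such as a note field containing "None of these". A minimal sketch of a string-aware variant follows. The name `fix_python_literals` is hypothetical, and the sketch assumes strings are double-quoted, use backslash escapes, and are properly terminated (truncated output needs separate handling).

```python
import json
import re

_LITERALS = {"True": "true", "False": "false", "None": "null"}

# Match either a complete double-quoted string or a bare Python literal.
# Strings are matched first so that their contents are never rewritten.
_TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|\b(?:True|False|None)\b')

def fix_python_literals(s: str) -> str:
    """Replace bare True/False/None, leaving quoted strings untouched."""
    def swap(match: re.Match) -> str:
        token = match.group(0)
        return token if token.startswith('"') else _LITERALS[token]
    return _TOKEN.sub(swap, s)

fixed = fix_python_literals('{"ok": True, "note": "None of these", "v": None}')
print(json.loads(fixed))
```

Matching whole strings and literals in a single pass keeps the scan linear and avoids tracking quote state by hand.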
See also: Why LLMs Output True Instead of true — and How to Fix It
2. Markdown Code Fences
Seen in: All LLMs when using chat-completion without structured output mode
````
```json
{"name": "Alice", "age": 30}
```
````
Chat models format responses helpfully. When asked for JSON, they often wrap it in a code block — which breaks any parser expecting raw JSON.
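Fix in JavaScript:
```javascript
function stripFences(text) {
  // Capture everything between an opening ``` (optionally tagged json) and the closing ```
  const match = text.match(/```(?:json)?\s*\n?([\s\S]*?)\n?```/)
  return match ? match[1].trim() : text.trim()
}
```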
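Fix in Python:
```python
import re

def strip_fences(text: str) -> str:
    # Capture everything between an opening ``` (optionally tagged json) and the closing ```
    match = re.search(r'```(?:json)?\s*\n?([\s\S]*?)\n?```', text)
    return match.group(1).strip() if match else text.strip()
```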
3. Truncated JSON (Token Limit Cutoff)
Seen in: Any model when output is large and max_tokens is hit
```json
{"users": [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}, {"id": 3,
```
Output stops mid-structure. The JSON has unclosed brackets, unclosed strings, or missing closing delimiters.
Fix: AI JSONMedic's repair engine adds missing brackets and closes open strings. Paste truncated output here.
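To make the idea of closing open strings and brackets concrete, here is a minimal sketch in Python. It is not how AI JSONMedic is implemented; the function name `close_truncated_json` is hypothetical. It tracks open containers on a stack, closes an unterminated string, drops a trailing comma, and appends the missing closers. It does not handle a dangling key or a colon with no value.

```python
import json

def close_truncated_json(text: str) -> str:
    """Append whatever closers a cut-off JSON document is missing."""
    stack = []          # closers still owed, innermost last
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()

    repaired = text
    if in_string:
        if escaped:
            repaired = repaired[:-1]   # drop a dangling backslash
        repaired += '"'                # close the open string
    else:
        repaired = repaired.rstrip().rstrip(",")  # drop a trailing comma
    return repaired + "".join(reversed(stack))

truncated = '{"users": [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}, {"id": 3,'
print(json.loads(close_truncated_json(truncated)))
```

The repaired document parses, but the last record is only as complete as the model's output was, so downstream schema validation is still needed.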
Better fix (prevent it):
```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Always set max_tokens high enough for your expected output
response = client.chat.completions.create(
    model="gpt-4o",
    max_tokens=4096,  # never let this be the bottleneck
    messages=[...]
)
```
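Raising max_tokens reduces truncation but cannot rule it out. In the OpenAI Chat Completions API, a choice whose `finish_reason` is `"length"` was cut off by the token limit, so it can be routed to repair or retry before parsing. A minimal sketch follows; the helper name `was_truncated` is hypothetical, and the response object is a stand-in so the example runs without network access.

```python
from types import SimpleNamespace

def was_truncated(response) -> bool:
    """True if the first choice stopped because it hit the token limit."""
    return response.choices[0].finish_reason == "length"

# Stand-in for a real client response, for illustration only.
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="length",
            message=SimpleNamespace(content='{"users": [{"id": 1,'),
        )
    ]
)

if was_truncated(fake_response):
    print("Output hit max_tokens; repair or retry before parsing.")
```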
See also: Fix 'Unexpected End of JSON Input' — Truncated JSON Guide
4. Trailing Commas
Seen in: GPT-4, Claude
```json
{
  "name": "Alice",
  "age": 30,
}
```
JSON doesn't allow trailing commas. JavaScript object literals do. LLMs trained on both confuse them.
Fix in JavaScript:
```javascript
function removeTrailingCommas(s) {
  return s.replace(/,\s*([}\]])/g, '$1')
}
```
Caveat: this regex breaks if there's a comma inside a string value followed by } or ]. Use AI JSONMedic for safe production repair.
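One way around that caveat is to track whether the scanner is inside a string and only drop commas that sit outside one. The following is a minimal Python sketch under the assumption that strings are double-quoted with backslash escapes; the function name `remove_trailing_commas` is chosen for illustration.

```python
import json

def remove_trailing_commas(text: str) -> str:
    """Drop commas that directly precede } or ], ignoring string contents."""
    out = []
    in_string = False
    escaped = False
    n = len(text)
    for i, ch in enumerate(text):
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif ch == ",":
            # Look past whitespace to see what follows the comma.
            j = i + 1
            while j < n and text[j].isspace():
                j += 1
            if j < n and text[j] in "}]":
                continue  # trailing comma: skip it
            out.append(ch)
        else:
            out.append(ch)
    return "".join(out)

print(json.loads(remove_trailing_commas('{"note": "a,}", "n": [1, 2,]}')))
```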
5. Smart Quotes / Curly Quotes
Seen in: Models generating "conversational" JSON
{“message”: “This is a quoted word”}
Some models use typographic quotes (“ ”) as string delimiters instead of the straight double quotes (") that JSON requires.
Fix in Python:
```python
def fix_smart_quotes(s: str) -> str:
    return (s.replace('\u201c', '"').replace('\u201d', '"')
             .replace('\u2018', "'").replace('\u2019', "'"))
```
6. Unquoted Keys
Seen in: Local LLMs (Ollama, LM Studio), some fine-tuned models
{name: "Alice", age: 30}
JSON requires all keys to be double-quoted strings. JavaScript allows unquoted keys in object literals. Local models with less training data get this wrong most often.
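A regex first pass can quote bare keys in simple cases. The sketch below assumes keys are plain identifiers that follow `{` or `,`; the function name `quote_bare_keys` is hypothetical. Like the other regex fixes in this guide, it can misfire when a string value itself contains text shaped like `, word:`, so a repair library remains the safer choice for production.

```python
import json
import re

# A bare identifier that follows { or , and is followed by a colon.
_BARE_KEY = re.compile(r'([{,]\s*)([A-Za-z_]\w*)(\s*:)')

def quote_bare_keys(s: str) -> str:
    """Wrap unquoted object keys in double quotes (simple cases only)."""
    return _BARE_KEY.sub(r'\1"\2"\3', s)

print(json.loads(quote_bare_keys('{name: "Alice", age: 30}')))
```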
7. Leading Prose Around JSON
Seen in: Chat models without strict system prompts
Here is the JSON you requested:
{"name": "Alice", "age": 30}
Let me know if you need any changes.
The model adds text before and after the JSON. Your parser chokes on the leading text.
Fix:
```javascript
function extractJSON(text) {
  const start = Math.min(
    text.indexOf('{') === -1 ? Infinity : text.indexOf('{'),
    text.indexOf('[') === -1 ? Infinity : text.indexOf('[')
  )
  if (start === Infinity) throw new Error('No JSON found in LLM output')
  const open = text[start]
  const close = open === '{' ? '}' : ']'
  const end = text.lastIndexOf(close)
  return text.slice(start, end + 1)
}
```
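Slicing up to the last closing brace fails when the trailing prose itself contains a `}` or `]`. In Python, the standard library's `json.JSONDecoder.raw_decode` parses one JSON value starting at a given index and ignores whatever follows it. A minimal sketch, assuming the embedded JSON is otherwise valid (the function name `extract_first_json` is hypothetical):

```python
import json

def extract_first_json(text: str):
    """Return the first valid JSON object or array embedded in text."""
    decoder = json.JSONDecoder()
    for i, ch in enumerate(text):
        if ch in "{[":
            try:
                value, _end = decoder.raw_decode(text, i)
                return value
            except json.JSONDecodeError:
                continue  # not valid JSON from here; try the next opener
    raise ValueError("No valid JSON found in LLM output")

reply = (
    "Here is the JSON you requested:\n"
    '{"name": "Alice", "age": 30}\n'
    "Let me know if you need any changes {really}."
)
print(extract_first_json(reply))
```

If the embedded JSON is itself broken, this raises ValueError, which is the signal to fall back to a repair step.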
Fastest Fix: Repair LLM JSON Online
The quickest path from broken to repaired: paste into AI JSONMedic.
It handles all 7 patterns above in one step — no regex, no code changes, no dependencies. Works on truncated JSON, mixed failures, and deeply nested structures. Server response time under 200ms means you get results instantly.
For a comparison of all available repair tools (libraries + online tools), see the Best JSON Repair Tools 2026 guide.
Repair in Code
Python: json-repair Library
The json-repair library (PyPI) is the most capable Python solution for LLM output:
```bash
pip install json-repair
```
```python
from json_repair import repair_json
import json

raw = '{"success": True, "users": [{"id": 1, "name": "Alice",'
repaired_str = repair_json(raw)
data = json.loads(repaired_str)
print(data)
# {'success': True, 'users': [{'id': 1, 'name': 'Alice'}]}
```
It handles truncated JSON, Python booleans, single quotes, unquoted keys, and trailing commas. For production pipelines:
```python
from json_repair import repair_json
import json
import logging

def safe_parse_llm_json(raw: str) -> dict | None:
    """Parse JSON from LLM output. Returns None if unfixable."""
    if not raw or not raw.strip():
        return None
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        repaired = repair_json(raw)
        try:
            return json.loads(repaired)
        except json.JSONDecodeError:
            logging.warning(f"Unrepairably broken JSON: {raw[:200]}")
            return None
```
TypeScript / Node.js: ai-json-safe-parse
The ai-json-safe-parse npm package is a zero-dependency TypeScript utility built specifically for LLM output:
```bash
npm install ai-json-safe-parse
```
```typescript
import { safeParseJSON } from 'ai-json-safe-parse'

const raw = `\`\`\`json
{"name": "Alice", "score": True, "items": [1, 2, 3,]}
\`\`\``

const result = safeParseJSON(raw)
// { name: 'Alice', score: true, items: [1, 2, 3] }
```
For JavaScript-specific parse error patterns, see Fix JavaScript JSON.parse() Errors.
Repair by LLM Provider
OpenAI / GPT-4o
OpenAI's Structured Outputs (json_schema) and json_object mode significantly reduce broken output. Always use a repair fallback:
```python
from openai import OpenAI
from json_repair import repair_json
import json

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Return valid JSON only. No markdown."},
        {"role": "user", "content": "...your prompt..."}
    ]
)
raw = response.choices[0].message.content
try:
    data = json.loads(raw)
except json.JSONDecodeError:
    data = json.loads(repair_json(raw))
```
Structured Outputs (json_schema mode) is more reliable than json_object but requires a defined schema. Use it when you know the exact output shape.
For a complete walkthrough of ChatGPT JSON error patterns and pipeline-level fixes, see the ChatGPT JSON errors guide.
Anthropic Claude
Claude outputs JSON most reliably through tool_use. Forcing a tool call returns the arguments as an already-parsed object that follows your input_schema in the vast majority of cases:
```python
import anthropic
import json

client = anthropic.Anthropic()
tools = [{
    "name": "extract_data",
    "description": "Extract structured data from the input",
    "input_schema": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "confidence": {"type": "number"}
        },
        "required": ["name", "confidence"]
    }
}]
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=tools,
    tool_choice={"type": "tool", "name": "extract_data"},
    messages=[{"role": "user", "content": "Extract from: ..."}]
)
# With a forced tool_choice, the tool_use block comes first;
# its input is already a parsed dict, so no json.loads() is needed
data = response.content[0].input
```
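Indexing `content[0]` assumes the tool_use block comes first and that generation finished. A slightly more defensive sketch checks `stop_reason` (the Messages API reports `"max_tokens"` when output was cut off) and searches the content blocks by `type` and `name`. The helper name `first_tool_input` is hypothetical, and the response below is a stand-in object so the example runs offline.

```python
from types import SimpleNamespace

def first_tool_input(response, tool_name: str) -> dict:
    """Return the input dict of the named tool_use block, or raise."""
    if response.stop_reason == "max_tokens":
        raise ValueError("Response was truncated; tool input may be incomplete")
    for block in response.content:
        if block.type == "tool_use" and block.name == tool_name:
            return block.input
    raise ValueError(f"No tool_use block named {tool_name!r} in response")

# Stand-in for a real Messages API response, for illustration only.
fake_response = SimpleNamespace(
    stop_reason="tool_use",
    content=[
        SimpleNamespace(type="text", text="Extracting now."),
        SimpleNamespace(
            type="tool_use",
            name="extract_data",
            input={"name": "Alice", "confidence": 0.93},
        ),
    ],
)
print(first_tool_input(fake_response, "extract_data"))
```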
Google Gemini
```python
import google.generativeai as genai
import json

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel(
    model_name="gemini-2.0-flash",
    generation_config={"response_mime_type": "application/json"}
)
response = model.generate_content("Extract structured data: ...")
data = json.loads(response.text)
```
Local LLMs (Ollama)
Local models produce the most broken JSON. Always repair:
```python
import ollama
from json_repair import repair_json
import json

response = ollama.chat(
    model='llama3.2',
    messages=[{
        'role': 'user',
        'content': 'Return JSON: {"name": ..., "score": ...}'
    }],
    format='json'  # Ollama's JSON mode: helps but isn't perfect
)
raw = response['message']['content']
data = json.loads(repair_json(raw))
```
Provider Reliability Comparison
| Provider | Mode | Reliability |
|---|---|---|
| OpenAI GPT-4o | json_schema Structured Outputs | ~99.9% |
| OpenAI GPT-4o | json_object mode | ~98% |
| Anthropic Claude | tool_use with schema | ~99.5% |
| Google Gemini | response_mime_type: application/json | ~99% |
| Ollama (local) | format: 'json' | ~90% (model-dependent) |
These numbers are approximate success rates from production telemetry reported in 2026. A repair fallback is still recommended even at 99.9%: at one million requests per day, a 0.1% failure rate means about 1,000 broken responses daily.
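To see what these percentages mean in absolute terms, the arithmetic can be sketched as follows. The reliability values come from the comparison above; the volume of one million requests per day is a hypothetical figure chosen for illustration.

```python
# Approximate success rates taken from the provider comparison.
RELIABILITY = {
    "OpenAI json_schema": 0.999,
    "OpenAI json_object": 0.98,
    "Claude tool_use": 0.995,
    "Gemini JSON mime type": 0.99,
    "Ollama format=json": 0.90,
}

def expected_failures_per_day(requests_per_day: int, reliability: float) -> int:
    """Expected number of broken responses per day at a given success rate."""
    return round(requests_per_day * (1.0 - reliability))

REQUESTS_PER_DAY = 1_000_000  # hypothetical volume

for mode, rate in RELIABILITY.items():
    print(f"{mode}: ~{expected_failures_per_day(REQUESTS_PER_DAY, rate):,} failures/day")
```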
Prevention: Stop Broken JSON Before It Starts
Validate with Pydantic (Python)
```python
from pydantic import BaseModel
from json_repair import repair_json
import json

class UserProfile(BaseModel):
    name: str
    age: int
    email: str

def parse_user(raw: str) -> UserProfile:
    try:
        return UserProfile.model_validate_json(raw)
    except Exception:
        repaired = repair_json(raw)
        return UserProfile.model_validate_json(repaired)
```
Validate with Zod (TypeScript)
```typescript
import { z } from 'zod'
import { safeParseJSON } from 'ai-json-safe-parse'

const UserSchema = z.object({
  name: z.string(),
  age: z.number().int(),
  email: z.string().email()
})

function parseUserFromLLM(raw: string) {
  const data = safeParseJSON(raw)
  return UserSchema.parse(data)
}
```
Use the JSON Validator to check your repaired JSON against a schema interactively before writing validation code.
When Repair Fails: Retry Strategy
When JSON is beyond automatic repair, include the broken output in a retry prompt:
```python
import json
from json_repair import repair_json

def get_json_with_retry(client, messages, max_retries=2):
    for attempt in range(max_retries + 1):
        response = client.chat.completions.create(
            model="gpt-4o",
            response_format={"type": "json_object"},
            messages=messages
        )
        raw = response.choices[0].message.content
        try:
            return json.loads(repair_json(raw))
        except Exception as e:
            if attempt < max_retries:
                # Show the model its own broken output and the parse error
                messages = messages + [
                    {"role": "assistant", "content": raw},
                    {
                        "role": "user",
                        "content": f"The JSON you returned was invalid: {e}. Return only valid JSON with no prose."
                    }
                ]
    raise ValueError(f"Failed to get valid JSON after {max_retries} retries")
```
FAQ
Q: Which LLM produces the most valid JSON without repair?
A: OpenAI GPT-4o with Structured Outputs (json_schema mode) has the lowest failure rate (~0.1%). Anthropic Claude with tool_use is comparable. Both require a schema definition. For schema-free JSON, json_object mode with a repair fallback is the pragmatic choice.
Q: Does json_object mode guarantee valid JSON?
A: No. json_object mode prevents non-JSON responses but doesn't guarantee schema compliance or complete output. Token limit truncation can still produce broken JSON even with JSON mode enabled.
Q: Can truncated LLM JSON be repaired?
A: Yes. Truncated JSON is one of the most common LLM JSON failure cases. AI JSONMedic and the json-repair Python library both handle incomplete JSON by inferring missing closing brackets and strings. See Fix 'Unexpected End of JSON Input' for a full guide.
Q: What is the difference between JSON validation and JSON repair?
A: Validation checks if JSON is valid and reports what's wrong. Repair takes broken JSON and returns a working version. For maximum reliability, do both: repair first, then validate the repaired output against your schema using the JSON Validator.
Q: Should I use a regex to fix LLM JSON or a library?
A: Use a library. Regex breaks on edge cases: commas inside string values, nested structures, escaped characters. The json-repair (Python) and ai-json-safe-parse (npm) libraries handle these correctly. Regex is only acceptable as a first pass for the simplest patterns like stripping markdown fences.