gotcha, typescript, Major

OpenAI JSON mode does not guarantee schema compliance — only valid JSON

Submitted by: @seed

openai@4.x

json-mode, structured-output, schema, validation, zod, json_schema

Problem

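Setting response_format: { type: 'json_object' } makes OpenAI return syntactically valid JSON, but it does NOT guarantee that the JSON matches your expected schema. Fields can be missing, have the wrong type, or be accompanied by unexpected keys. The output can also be cut off mid-object if the response hits the token limit (finish_reason: 'length'), in which case it will not parse at all.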
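
To make the gap concrete, here is a minimal sketch in which every sample payload parses cleanly, yet only one matches the intended shape. The Recipe interface, the isRecipe guard, and the sample strings are made up for illustration; none of them come from the OpenAI SDK or from real model output.

```typescript
// Hypothetical target shape; not part of any SDK.
interface Recipe {
  name: string;
  ingredients: string[];
  steps: string[];
}

// Hand-written structural checks used only for this illustration.
function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === 'string');
}

function isRecipe(value: unknown): value is Recipe {
  if (typeof value !== 'object' || value === null || Array.isArray(value)) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.name === 'string' &&
    isStringArray(candidate.ingredients) &&
    isStringArray(candidate.steps)
  );
}

// Made-up payloads of the kind JSON mode is allowed to return.
const samples: Record<string, string> = {
  missingField: '{"name":"Pancakes","ingredients":["flour"]}',
  wrongType: '{"name":"Pancakes","ingredients":"flour, eggs","steps":["mix"]}',
  valid: '{"name":"Pancakes","ingredients":["flour"],"steps":["mix"]}',
};

const results = Object.entries(samples).map(([label, text]) => {
  const parsed: unknown = JSON.parse(text); // succeeds for every sample
  return { label, matchesSchema: isRecipe(parsed) };
});

console.log(results);
```

Only the last sample passes the guard, even though JSON.parse accepts all three.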

Solution

Use OpenAI's newer Structured Outputs feature, response_format: { type: 'json_schema', json_schema: { name, strict: true, schema } }, for schema enforcement. Alternatively, keep JSON mode and validate the parsed result with zod. In either case, wrap JSON.parse in try/catch and validate before using the result.
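
A Structured Outputs request for a recipe might look roughly like the sketch below. It only builds the request body and does not call the API, so it runs without credentials. The buildRecipeRequest helper, the 'recipe' schema name, and the prompt text are assumptions made for this example.

```typescript
// JSON Schema for the recipe shape. In strict mode, each object is expected to
// list all of its properties in `required` and set `additionalProperties: false`.
const recipeJsonSchema = {
  type: 'object',
  properties: {
    name: { type: 'string' },
    ingredients: { type: 'array', items: { type: 'string' } },
    steps: { type: 'array', items: { type: 'string' } },
  },
  required: ['name', 'ingredients', 'steps'],
  additionalProperties: false,
} as const;

// Hypothetical helper: builds the request body without sending it.
function buildRecipeRequest(userPrompt: string) {
  return {
    model: 'gpt-4o-2024-08-06',
    messages: [{ role: 'user' as const, content: userPrompt }],
    response_format: {
      type: 'json_schema' as const,
      json_schema: {
        name: 'recipe', // label for the schema; the value here is arbitrary
        strict: true,
        schema: recipeJsonSchema,
      },
    },
  };
}

const request = buildRecipeRequest('Give me a pancake recipe.');
// With an openai@4.x client this would be sent along the lines of:
//   const response = await openai.chat.completions.create(request);
console.log(JSON.stringify(request.response_format, null, 2));
```

Even with strict mode on, it is still worth checking the response for a refusal or a truncated finish before parsing.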

Why

JSON mode is the earlier feature and constrains only the syntax of the output, not its structure. Structured Outputs uses constrained decoding: at each generation step, tokens that would violate the schema are ruled out, so the schema is enforced at the token level.

Gotchas

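  • Structured Outputs via response_format requires gpt-4o-mini-2024-07-18, gpt-4o-2024-08-06, or a later model snapshot; older models only support json_object mode
  • Even Structured Outputs can fail: only a subset of JSON Schema is supported, and strict mode expects every object to mark all properties as required and set additionalProperties: false. The model can also return a refusal instead of content, or be cut off at the token limit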
  • Never pass AI-generated JSON directly to database writes without validation
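
One way to enforce the last point is to funnel every model response through a helper that never throws and only hands back a value that passed validation. The sketch below uses a hand-written type guard and an in-memory array as a stand-in for a database table; parseModelJson, isTag, and saveIfValid are hypothetical names chosen for this example.

```typescript
type ParseResult<T> = { ok: true; value: T } | { ok: false; error: string };

// Hypothetical helper: never throws, and only returns a value that passed `validate`.
function parseModelJson<T>(
  content: string | null,
  validate: (value: unknown) => value is T,
): ParseResult<T> {
  if (content === null) return { ok: false, error: 'empty response' };
  let parsed: unknown;
  try {
    parsed = JSON.parse(content);
  } catch {
    return { ok: false, error: 'invalid JSON' };
  }
  return validate(parsed)
    ? { ok: true, value: parsed }
    : { ok: false, error: 'schema mismatch' };
}

// Illustrative validator and in-memory stand-in for a database table.
interface Tag {
  label: string;
}

const isTag = (v: unknown): v is Tag =>
  typeof v === 'object' &&
  v !== null &&
  typeof (v as { label?: unknown }).label === 'string';

const table: Tag[] = [];

function saveIfValid(content: string | null): string {
  const result = parseModelJson(content, isTag);
  if (!result.ok) return result.error; // nothing is written on failure
  table.push(result.value);
  return 'saved';
}

console.log(saveIfValid('{"label":"dessert"}')); // well-formed and valid
console.log(saveIfValid('{"label":42}'));        // parses, wrong type
console.log(saveIfValid('{"label":'));           // truncated JSON
console.log(table.length);
```

Returning a result object instead of throwing keeps the write path explicit: the only way to reach table.push is through the ok branch.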

Code Snippets

Zod validation after JSON mode response

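import OpenAI from 'openai';
import { z } from 'zod';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

const RecipeSchema = z.object({
  name: z.string(),
  ingredients: z.array(z.string()),
  steps: z.array(z.string()),
});

const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  // Example prompt; JSON mode requires the word "JSON" to appear in the messages
  messages: [{ role: 'user', content: 'Return a pancake recipe as JSON.' }],
  response_format: { type: 'json_object' },
});

let raw: unknown;
try {
  raw = JSON.parse(response.choices[0].message.content ?? '{}');
} catch {
  // Can happen when the output is truncated at the token limit
  throw new Error('Model returned unparseable JSON');
}
const recipe = RecipeSchema.parse(raw); // throws ZodError if invalid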

Context

Extracting structured data from LLM responses for downstream processing
