OpenAI Structured JSON Output With Adherence
In the past, when using OpenAI’s JSON mode there was no guarantee that the model output would match the specified, predefined JSON schema.
In my view, this made the feature unreliable in a production environment, where consistency is important…
However, this has changed with something OpenAI refers to as Structured Outputs, which they describe as the evolution of JSON Mode.
As seen in the image below, the old JSON Mode is still available alongside the more recent Structured Outputs. However, as I noted back in November 2023, without the ability to enforce or ensure model adherence to the structure, alternative avenues worked better.
For this reason, OpenAI recommends always using Structured Outputs instead of JSON mode.
JSON is one of the most widely used formats in the world for applications to exchange data.
Structured Outputs offer several key benefits. They enhance type safety by eliminating the need to validate or retry improperly formatted responses.
Additionally, safety-based model refusals can now be detected programmatically, making it easier to handle these situations.
Furthermore, they simplify prompting, as consistent formatting is achieved without the need for strong or specific prompts.
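To make the refusal point concrete, here is a minimal sketch of handling a refusal programmatically. `ParsedMessage` is a hypothetical stand-in for the SDK's parsed message object, mirroring the `parsed` and `refusal` attributes it exposes:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the SDK's parsed message, which exposes
# both a `parsed` payload and a `refusal` field
@dataclass
class ParsedMessage:
    parsed: Optional[dict] = None
    refusal: Optional[str] = None

def handle_structured_response(message: ParsedMessage) -> dict:
    # A safety-based refusal arrives in a distinct field, so it can be
    # detected with a simple check instead of inspecting free text
    if message.refusal:
        return {"ok": False, "reason": message.refusal}
    return {"ok": True, "data": message.parsed}

# A normal structured answer
print(handle_structured_response(ParsedMessage(parsed={"final_answer": "x = -3.75"})))
# A safety-based refusal
print(handle_structured_response(ParsedMessage(refusal="I'm sorry, I can't assist with that.")))
```

The application branches on one field rather than pattern-matching apology text, which is the programmatic detection referred to above.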
Function Calling
Function calling is another avenue for creating structured output… here is some background…
Demystifying Large Language Model Function Calling
Large Language Model (LLM) Function Calling enables models to interact directly with external functions and APIs…
cobusgreyling.medium.com
Levels of Autonomy
In function calling with language models, the model gains a level of autonomy by determining whether to invoke a specific function or to rely on its default processing approach.
When the model identifies that a function is required, it autonomously switches to a more structured mode, preparing the necessary data and parameters for the function call.
This ability allows the language model to function as a mediator, efficiently handling the function while maintaining flexibility in processing requests.
The autonomy of AI can be viewed on a spectrum, where the degree of independence depends on how the system is designed.
By incorporating function calls into generative AI applications, we introduce not only structure but also an initial layer of autonomy, enabling the model to assess requests and decide whether it should use a function or provide an answer based on its default capabilities.
As AI technology progresses, this autonomy is expected to increase, with models becoming more independent and capable of handling increasingly complex tasks on their own.
This evolution enhances the ability of AI systems to take on more sophisticated responsibilities with minimal human intervention.
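This mediator role can be sketched as follows. The `get_weather` function is a hypothetical stub, and the message dicts are hand-built to mimic the shape of the API's tool-call responses; the point is the two paths the application must handle:

```python
import json

# A local function the model may choose to call (hypothetical stub)
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

# Tool schema advertised to the model (OpenAI tools format)
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Dispatcher: the model either answers directly (default processing)
# or hands back a structured tool call; the application routes accordingly
def dispatch(message: dict) -> str:
    if message.get("tool_calls"):
        call = message["tool_calls"][0]["function"]
        args = json.loads(call["arguments"])
        return {"get_weather": get_weather}[call["name"]](**args)
    return message.get("content", "")

# The model decided a function is required...
print(dispatch({"tool_calls": [{"function": {"name": "get_weather", "arguments": '{"city": "Paris"}'}}]}))
# ...versus a plain answer from its default capabilities
print(dispatch({"content": "Hello!"}))
```

The autonomy lies in which of the two message shapes the model returns; the application only executes what the model has already decided.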
JSON Mode Evolved Into Structured Outputs
As I mentioned earlier, Structured Outputs is the next step in the evolution of JSON mode.
While both guarantee the production of valid JSON, only Structured Outputs ensures strict adherence to the defined schema.
Both Structured Outputs and JSON mode are supported across the Chat Completions API, Assistants API, Fine-tuning API, and Batch API.
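The difference is visible in the `response_format` parameter itself. Below is a minimal sketch of the two values side by side; the schema and the name "answer" are illustrative:

```python
# JSON mode: guarantees syntactically valid JSON only
json_mode_format = {"type": "json_object"}

# Structured Outputs: also binds the output to a schema
structured_output_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "answer",      # arbitrary identifier for the schema
        "strict": True,        # enforce exact adherence to the schema
        "schema": {
            "type": "object",
            "properties": {"final_answer": {"type": "string"}},
            "required": ["final_answer"],
            "additionalProperties": False,
        },
    },
}

# Either dict is passed as the response_format argument of
# client.chat.completions.create(...)
print(json_mode_format["type"], "vs", structured_output_format["type"])
```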
Now You Can Set GPT Output To JSON Mode
This feature is separate from function calling. When calling the models “gpt-4-1106-preview” or “gpt-3.5-turbo-1106” the…
cobusgreyling.medium.com
Chain of Thought
Below is a complete, working Python example which you can copy and paste into a notebook. The code will prompt you for your OpenAI API Key…
This example makes use of the gpt-4o-2024-08-06 model. The model is instructed to output the answer in a structured fashion, step by step, to guide the user through the solution.
Notice that this code only shows the Chain of Thought implementation, and how the sequence of reasoning is portrayed. In this example the structure is defined with Pydantic classes rather than a hand-written JSON schema.
# Install the necessary packages
!pip install openai pydantic

# Import the modules
from pydantic import BaseModel
from openai import OpenAI
import getpass
import json

# Prompt the user for their OpenAI API key
api_key = getpass.getpass("Enter your OpenAI API key: ")

# Initialize the OpenAI client with the provided API key
client = OpenAI(api_key=api_key)

# Define the Step and MathReasoning classes using Pydantic
class Step(BaseModel):
    explanation: str
    output: str

class MathReasoning(BaseModel):
    steps: list[Step]
    final_answer: str

# Use the OpenAI client to parse a chat completion for a math problem
completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "You are a helpful math tutor. Guide the user through the solution step by step."},
        {"role": "user", "content": "how can I solve 8x + 7 = -23"}
    ],
    response_format=MathReasoning,
)

# Extract the parsed math reasoning from the completion
math_reasoning = completion.choices[0].message.parsed

# Convert the math reasoning to a JSON string and print it
# (model_dump() is the Pydantic v2 replacement for the deprecated .dict())
print(json.dumps(math_reasoning.model_dump(), indent=4))
And the response…
{
    "steps": [
        {
            "explanation": "The equation given is 8x + 7 = -23. To solve for x, we first need to isolate the term with x on one side. We should move the constant on the left side, which is 7, to the right side by subtracting 7 from both sides.",
            "output": "8x + 7 - 7 = -23 - 7"
        },
        {
            "explanation": "Subtracting 7 from both sides simplifies the equation. The left side becomes 8x because 7 - 7 is 0, and the right side becomes -30.",
            "output": "8x = -30"
        },
        {
            "explanation": "Now that we have isolated the term 8x, we need to solve for x. Since 8x means 8 times x, we divide both sides of the equation by 8 to find the value of x.",
            "output": "x = -30 / 8"
        },
        {
            "explanation": "Dividing -30 by 8 gives -3.75. This is the value of x that satisfies the equation.",
            "output": "x = -3.75"
        }
    ],
    "final_answer": "x = -3.75"
}
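Because `parse` returns a typed Pydantic object, the steps above can also be consumed directly in code rather than as raw JSON. A small sketch, with the response hard-coded as example data in place of a live API call:

```python
from pydantic import BaseModel

# Same Pydantic classes as in the example above
class Step(BaseModel):
    explanation: str
    output: str

class MathReasoning(BaseModel):
    steps: list[Step]
    final_answer: str

# Hard-coded stand-in for completion.choices[0].message.parsed
reasoning = MathReasoning(
    steps=[
        Step(explanation="Subtract 7 from both sides.", output="8x = -30"),
        Step(explanation="Divide both sides by 8.", output="x = -3.75"),
    ],
    final_answer="x = -3.75",
)

# Typed access: no json.loads, no key errors
for i, step in enumerate(reasoning.steps, 1):
    print(f"Step {i}: {step.output}")
print("Answer:", reasoning.final_answer)
```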
Using Structured Outputs with response_format
Below is fully working example code which can be used as-is within a notebook.
Notice in this example how a JSON schema is defined, to which the model output must adhere.
!pip install openai

import openai
import json

class OpenAIClient:
    def __init__(self, api_key: str):
        """
        Initialize the OpenAI client with the provided API key.
        """
        openai.api_key = api_key

    def get_structured_output_with_schema(self, prompt: str, schema: dict):
        """
        Call the OpenAI API and request output that adheres to the given JSON schema.
        """
        try:
            # Pass the schema via response_format so the model output
            # is constrained to adhere to it (Structured Outputs)
            response = openai.chat.completions.create(
                model="gpt-4o-2024-08-06",  # A model that supports Structured Outputs
                messages=[
                    {"role": "user", "content": prompt}
                ],
                response_format={
                    "type": "json_schema",
                    "json_schema": {
                        "name": "benefits",
                        "strict": True,
                        "schema": schema,
                    },
                },
            )
            return response.choices[0].message.content.strip()  # Extract and return the response
        except Exception as e:
            return f"Error occurred: {e}"

    @staticmethod
    def prompt_for_api_key():
        """
        Prompt the user to input their OpenAI API key.
        """
        api_key = input("Please enter your OpenAI API key: ")
        return api_key
# Example Usage:
# Define the JSON schema for the structured output.
# (Strict mode requires "additionalProperties": false and does not
# support minItems/maxItems, so the 3-5 item count is constrained
# in the prompt instead.)
schema = {
    "type": "object",
    "properties": {
        "benefits": {
            "type": "array",
            "items": {
                "type": "string"
            }
        }
    },
    "required": ["benefits"],
    "additionalProperties": False
}
# Prompt user for API key
api_key = OpenAIClient.prompt_for_api_key()
# Initialize the OpenAI client
client = OpenAIClient(api_key)
# Test the function with a sample prompt
prompt = "Provide a list of 3-5 benefits of using structured outputs in AI applications."
result = client.get_structured_output_with_schema(prompt, schema)
# Print the result
print("Structured Output Response:")
if result.startswith("Error occurred"):
    print(result)  # Print the error message if there was an error
else:
    print(json.dumps(json.loads(result), indent=2))  # Pretty print the JSON response
And the result from the query, conforming to the schema…
{
  "benefits": [
    "Structured outputs can handle complex relationships and dependencies within and between output elements.",
    "Structured output prediction can be more finely controlled, providing more accurate and specific results.",
    "Structured outputs help in better representation and abstraction of complex problems and tasks in AI.",
    "Utilizing structured outputs can reduce misunderstandings and errors in communications between different components of an AI system.",
    "Structured outputs can lead to better interpretation and understanding of the AI processes and their conclusions."
  ]
}
Finally
Language models now offer advanced functionalities such as function calling, structured output, and reasoning, enabling developers to offload complex tasks directly to the model.
This shift allows for streamlined workflows but also requires makers to carefully decide how much functionality they want to delegate to the model versus implementing within their application logic.
Offloading too much to the model can result in business applications becoming tightly coupled to specific models and their unique capabilities, making future updates or changes more challenging.
As models evolve, this coupling can hinder the flexibility of the application and limit its adaptability. On the other hand, by keeping the application logic more modular, developers can design model-agnostic systems capable of orchestrating multiple models for diverse tasks.
This approach enables greater flexibility, allowing businesses to integrate a variety of models as they emerge and adjust to future needs without being locked into a single solution.
Chief Evangelist @ Kore.ai | I’m passionate about exploring the intersection of AI and language. From Language Models, AI Agents to Agentic Applications, Development Frameworks & Data-Centric Productivity Tools, I share insights and ideas on how these technologies are shaping the future.