Thinking Mode
Enable Thinking Mode to enhance your AI’s reasoning capabilities.
Thinking (or reasoning) models are a new class of LLMs designed to make their internal reasoning visible. Unlike standard LLMs that return only final answers, thinking models such as Claude 3.7 Sonnet, OpenAI o1/o3, and DeepSeek R1 "think out loud," generating a detailed chain of thought before presenting their conclusions.
These models are optimized for tasks that demand complex analysis, multi-step reasoning, and structured logic. Obiguard offers access to these advanced models through a unified API that works seamlessly across different providers.
Supported Thinking Models
Obiguard currently supports these thinking-enabled models:
- Anthropic: claude-3-7-sonnet-latest
- Google Vertex AI: anthropic.claude-3-7-sonnet@20250219
- Amazon Bedrock: claude-3-7-sonnet
Additional thinking models will be supported as they become available.
Using Thinking Mode
- Set strict_open_ai_compliance=False in your headers or client configuration.
- Thinking responses are returned in a format different from standard completions.
- For streaming, the thinking content appears in response_chunk.choices[0].delta.content_blocks.
Basic Example
from obiguard import Obiguard

# Initialize the Obiguard client
client = Obiguard(
    virtual_key="vk-obg***",  # Add your provider's virtual key
    strict_open_ai_compliance=False  # Required for thinking mode
)

# Create the request
response = client.chat.completions.create(
    model="claude-3-7-sonnet-latest",
    max_tokens=3000,
    thinking={
        "type": "enabled",
        "budget_tokens": 2030  # Maximum tokens to use for thinking
    },
    stream=False,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "when does the flight from new york to bengaluru land tomorrow, what time, what is its flight number, and what is its baggage belt?"
                }
            ]
        }
    ]
)

print(response)

# For streaming responses, handle content_blocks differently
# response = client.chat.completions.create(
#     ...same config as above but with stream=True
# )
#
# for chunk in response:
#     if chunk.choices[0].delta:
#         content_blocks = chunk.choices[0].delta.get("content_blocks")
#         if content_blocks is not None:
#             for content_block in content_blocks:
#                 print(content_block)
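Once you have a non-streaming response, you typically want to separate the model's reasoning from its final answer. The sketch below assumes content blocks are plain dicts with a "type" of "thinking" or "text", mirroring the request format above; the helper itself is hypothetical and the actual response objects from your client may require attribute access instead.

```python
def split_content_blocks(content_blocks):
    """Return (thinking_text, answer_text) from a list of content blocks.

    Assumes each block is a dict shaped like the thinking/text blocks
    shown in the request examples above.
    """
    thinking_parts, answer_parts = [], []
    for block in content_blocks:
        if block.get("type") == "thinking":
            thinking_parts.append(block.get("thinking", ""))
        elif block.get("type") == "text":
            answer_parts.append(block.get("text", ""))
    return "".join(thinking_parts), "".join(answer_parts)

# Example with hand-written blocks standing in for a real response
blocks = [
    {"type": "thinking", "thinking": "I need flight data I don't have..."},
    {"type": "text", "text": "I can't look up live flight details."},
]
thinking, answer = split_content_blocks(blocks)
print(answer)  # → I can't look up live flight details.
```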
from openai import OpenAI
from obiguard import createHeaders

# Initialize the OpenAI client with Obiguard headers
client = OpenAI(
    base_url='https://gateway.obiguard.ai/v1',
    api_key='sk-***',  # Your Anthropic API key here
    default_headers=createHeaders(
        provider="anthropic",
        api_key="vk-obg***",  # Your Obiguard virtual key here
        strict_open_ai_compliance=False  # Required for thinking mode
    )
)

# Create the request
response = client.chat.completions.create(
    model="claude-3-7-sonnet-latest",
    max_tokens=3000,
    thinking={
        "type": "enabled",
        "budget_tokens": 2030  # Maximum tokens to use for thinking
    },
    stream=False,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "when does the flight from new york to bengaluru land tomorrow, what time, what is its flight number, and what is its baggage belt?"
                }
            ]
        }
    ]
)

print(response)

# For streaming responses, handle content_blocks differently
# response = client.chat.completions.create(
#     ...same config as above but with stream=True
# )
#
# for chunk in response:
#     if chunk.choices[0].delta:
#         content_blocks = chunk.choices[0].delta.get("content_blocks")
#         if content_blocks is not None:
#             for content_block in content_blocks:
#                 print(content_block)
curl "https://gateway.obiguard.ai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "x-obiguard-api-key: $OBIGUARD_API_KEY" \
  -H "x-obiguard-provider: anthropic" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "x-obiguard-strict-open-ai-compliance: false" \
  -d '{
    "model": "claude-3-7-sonnet-latest",
    "max_tokens": 3000,
    "thinking": {
      "type": "enabled",
      "budget_tokens": 2030
    },
    "stream": false,
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "when does the flight from new york to bengaluru land tomorrow, what time, what is its flight number, and what is its baggage belt?"
          }
        ]
      }
    ]
  }'
Multi-Turn Conversations
For multi-turn conversations, include the previous thinking content in the conversation history:
from obiguard import Obiguard

# Initialize the Obiguard client
client = Obiguard(
    virtual_key="vk-obg***",  # Your Obiguard virtual key here
    strict_open_ai_compliance=False
)

# Create the request
response = client.chat.completions.create(
    model="claude-3-7-sonnet-latest",
    max_tokens=3000,
    thinking={
        "type": "enabled",
        "budget_tokens": 2030
    },
    stream=False,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "when does the flight from baroda to bangalore land tomorrow, what time, what is its flight number, and what is its baggage belt?"
                }
            ]
        },
        {
            "role": "assistant",
            "content": [
                {
                    "type": "thinking",
                    "thinking": "The user is asking several questions about a flight from Baroda (also known as Vadodara) to Bangalore:\n1. When does the flight land tomorrow\n2. What time does it land\n3. What is the flight number\n4. What is the baggage belt number at the arrival airport\n\nTo properly answer these questions, I would need access to airline flight schedules and airport information systems. However, I don't have:\n- Real-time or scheduled flight information\n- Access to airport baggage claim allocation systems\n- Information about specific flights between these cities\n- The ability to look up tomorrow's specific flight schedules\n\nThis question requires current, specific flight information that I don't have access to. Instead of guessing or providing potentially incorrect information, I should explain this limitation and suggest ways the user could find this information.",
                    "signature": "EqoBCkgIARABGAIiQBVA7FBNLRtWarDSy9TAjwtOpcTSYHJ+2GYEoaorq3V+d3eapde04bvEfykD/66xZXjJ5yyqogJ8DEkNMotspRsSDKzuUJ9FKhSNt/3PdxoMaFZuH+1z1aLF8OeQIjCrA1+T2lsErrbgrve6eDWeMvP+1sqVqv/JcIn1jOmuzrPi2tNz5M0oqkOO9txJf7QqEPPw6RG3JLO2h7nV1BMN6wE="
                }
            ]
        },
        {
            "role": "user",
            "content": "thanks that's good to know, how about to chennai?"
        }
    ]
)

print(response)
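Building the history above by hand gets tedious across many turns. A minimal sketch of appending the assistant's blocks (thinking, signature, and text) to a running history follows; the helper is hypothetical, not part of the Obiguard SDK, and the block shapes mirror the message format shown above.

```python
def append_assistant_turn(messages, content_blocks):
    """Append the assistant's content blocks (thinking + text) to the history."""
    messages.append({"role": "assistant", "content": list(content_blocks)})
    return messages

# Running conversation history, using hand-written blocks as stand-ins
history = [
    {"role": "user", "content": [{"type": "text", "text": "flight to bangalore?"}]},
]
assistant_blocks = [
    {"type": "thinking", "thinking": "No live flight data available...", "signature": "..."},
    {"type": "text", "text": "I can't access live flight schedules."},
]
append_assistant_turn(history, assistant_blocks)
history.append({"role": "user", "content": "thanks, how about to chennai?"})
print(len(history))  # → 3
```

Passing the thinking block back verbatim, signature included, keeps the model's prior reasoning available to the next turn.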
from openai import OpenAI
from obiguard import OBIGUARD_GATEWAY_URL, createHeaders

openai = OpenAI(
    api_key='Anthropic_API_KEY',  # Your Anthropic API key here
    base_url=OBIGUARD_GATEWAY_URL,
    default_headers=createHeaders(
        provider="anthropic",
        obiguard_api_key="vk-obg***",  # Your Obiguard virtual key here
        strict_open_ai_compliance=False
    )
)

response = openai.chat.completions.create(
    model="claude-3-7-sonnet-latest",
    max_tokens=3000,
    thinking={
        "type": "enabled",
        "budget_tokens": 2030
    },
    stream=False,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "when does the flight from baroda to bangalore land tomorrow, what time, what is its flight number, and what is its baggage belt?"
                }
            ]
        },
        {
            "role": "assistant",
            "content": [
                {
                    "type": "thinking",
                    "thinking": "The user is asking several questions about a flight from Baroda (also known as Vadodara) to Bangalore:\n1. When does the flight land tomorrow\n2. What time does it land\n3. What is the flight number\n4. What is the baggage belt number at the arrival airport\n\nTo properly answer these questions, I would need access to airline flight schedules and airport information systems. However, I don't have:\n- Real-time or scheduled flight information\n- Access to airport baggage claim allocation systems\n- Information about specific flights between these cities\n- The ability to look up tomorrow's specific flight schedules\n\nThis question requires current, specific flight information that I don't have access to. Instead of guessing or providing potentially incorrect information, I should explain this limitation and suggest ways the user could find this information.",
                    "signature": "EqoBCkgIARABGAIiQBVA7FBNLRtWarDSy9TAjwtOpcTSYHJ+2GYEoaorq3V+d3eapde04bvEfykD/66xZXjJ5yyqogJ8DEkNMotspRsSDKzuUJ9FKhSNt/3PdxoMaFZuH+1z1aLF8OeQIjCrA1+T2lsErrbgrve6eDWeMvP+1sqVqv/JcIn1jOmuzrPi2tNz5M0oqkOO9txJf7QqEPPw6RG3JLO2h7nV1BMN6wE="
                }
            ]
        },
        {
            "role": "user",
            "content": "thanks that's good to know, how about to chennai?"
        }
    ]
)

print(response)
curl "https://gateway.obiguard.ai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "x-obiguard-api-key: $OBIGUARD_API_KEY" \
  -H "x-obiguard-provider: anthropic" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "x-obiguard-strict-open-ai-compliance: false" \
  -d '{
    "model": "claude-3-7-sonnet-latest",
    "max_tokens": 3000,
    "thinking": {
      "type": "enabled",
      "budget_tokens": 2030
    },
    "stream": false,
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "when does the flight from baroda to bangalore land tomorrow, what time, what is its flight number, and what is its baggage belt?"
          }
        ]
      },
      {
        "role": "assistant",
        "content": [
          {
            "type": "thinking",
            "thinking": "The user is asking several questions about a flight from Baroda (also known as Vadodara) to Bangalore:\n1. When does the flight land tomorrow\n2. What time does it land\n3. What is the flight number\n4. What is the baggage belt number at the arrival airport\n\nTo properly answer these questions, I would need access to airline flight schedules and airport information systems. However, I don't have:\n- Real-time or scheduled flight information\n- Access to airport baggage claim allocation systems\n- Information about specific flights between these cities\n- The ability to look up tomorrow's specific flight schedules\n\nThis question requires current, specific flight information that I don't have access to. Instead of guessing or providing potentially incorrect information, I should explain this limitation and suggest ways the user could find this information.",
            "signature": "EqoBCkgIARABGAIiQBVA7FBNLRtWarDSy9TAjwtOpcTSYHJ+2GYEoaorq3V+d3eapde04bvEfykD/66xZXjJ5yyqogJ8DEkNMotspRsSDKzuUJ9FKhSNt/3PdxoMaFZuH+1z1aLF8OeQIjCrA1+T2lsErrbgrve6eDWeMvP+1sqVqv/JcIn1jOmuzrPi2tNz5M0oqkOO9txJf7QqEPPw6RG3JLO2h7nV1BMN6wE="
          }
        ]
      },
      {
        "role": "user",
        "content": "thanks that'"'"'s good to know, how about to chennai?"
      }
    ]
  }'
Understanding the Response Structure
When working with thinking-enabled models, note that their responses use a special format: the assistant's thinking output is found in the response_chunk.choices[0].delta.content_blocks array, not in the response.choices[0].message.content string. This distinction is crucial for streaming responses, where you must extract the thinking content from the content blocks.
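To make the streaming case concrete, here is a sketch that accumulates thinking and answer text from a sequence of chunks. The chunks are modeled as plain dicts following the delta.content_blocks path described above; real SDK chunk objects may require attribute access (chunk.choices[0].delta) instead of dict lookups.

```python
def collect_stream(chunks):
    """Accumulate thinking and answer text from streamed content blocks."""
    thinking, answer = [], []
    for chunk in chunks:
        delta = chunk["choices"][0].get("delta", {})
        for block in delta.get("content_blocks", []):
            if block.get("type") == "thinking":
                thinking.append(block.get("thinking", ""))
            elif block.get("type") == "text":
                answer.append(block.get("text", ""))
    return "".join(thinking), "".join(answer)

# Hand-written chunks standing in for a real streaming response
chunks = [
    {"choices": [{"delta": {"content_blocks": [
        {"type": "thinking", "thinking": "step 1..."}]}}]},
    {"choices": [{"delta": {"content_blocks": [
        {"type": "text", "text": "Final answer."}]}}]},
]
thinking, answer = collect_stream(chunks)
print(answer)  # → Final answer.
```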
FAQs
Does thinking mode work with all models?
No, thinking mode is only available on specific reasoning-optimized models. Currently this includes Claude 3.7 Sonnet, and it will expand to other models as they become available.

Does thinking mode increase token usage?
Yes, enabling thinking mode increases your token usage, since the model generates additional content for its reasoning process. The budget_tokens parameter lets you control the maximum tokens allocated to thinking.

Do I need to change how I parse responses?
Yes, particularly for streaming responses. The thinking content is returned in the content_blocks array rather than the standard content field, so you'll need to adapt your response-parsing logic.

Why must strict_open_ai_compliance be set to false?
The thinking mode response format extends beyond the standard OpenAI completion schema. Setting strict_open_ai_compliance to false allows Obiguard to return this extended format with the thinking content.