# Thinking Mode

> Enable Thinking Mode to enhance your AI's reasoning capabilities.

Thinking/Reasoning models are a new class of LLMs designed to make their internal reasoning visible. Unlike standard LLMs that provide only final answers, thinking models such as Claude 3.7 Sonnet, OpenAI o1/o3, and DeepSeek R1 “think out loud” by generating a detailed chain of thought before presenting their conclusions.

These models are optimized for tasks that demand complex analysis, multi-step reasoning, and structured logic. Obiguard offers access to these advanced models through a unified API that works seamlessly across different providers.

## Supported Thinking Models

Obiguard currently supports these thinking-enabled models:

* **Anthropic**: claude-3-7-sonnet-latest
* **Google Vertex AI**: anthropic.claude-3-7-sonnet\@20250219
* **Amazon Bedrock**: claude-3-7-sonnet

<Note>
  Additional thinking models will be supported as they become available.
</Note>

## Using Thinking Mode

1. Set `strict_open_ai_compliance=False` in your headers or client configuration.
2. Expect a response format that differs from standard completions.
3. When streaming, read the thinking content from `response_chunk.choices[0].delta.content_blocks`.

### Basic Example

<Tabs>
  <Tab title="Python SDK">
    ```python theme={null}
    from obiguard import Obiguard

    # Initialize the Obiguard client
    client = Obiguard(
      virtual_key="vk-obg***",   # Add your provider's virtual key
      strict_open_ai_compliance=False  # Required for thinking mode
    )

    # Create the request
    response = client.chat.completions.create(
      model="claude-3-7-sonnet-latest",
      max_tokens=3000,
      thinking={
        "type": "enabled",
        "budget_tokens": 2030  # Maximum tokens to use for thinking
      },
      stream=False,
      messages=[
        {
          "role": "user",
          "content": [
            {
              "type": "text",
              "text": "when does the flight from new york to bengaluru land tomorrow, what time, what is its flight number, and what is its baggage belt?"
            }
          ]
        }
      ]
    )
    print(response)

    # For streaming responses, handle content_blocks differently
    # response = client.chat.completions.create(
    #   ...same config as above but with stream=True
    # )
    # for chunk in response:
    #     if chunk.choices[0].delta:
    #         content_blocks = chunk.choices[0].delta.get("content_blocks")
    #         if content_blocks is not None:
    #             for content_block in content_blocks:
    #                 print(content_block)
    ```
  </Tab>

  <Tab title="OpenAI Python">
    ```python theme={null}
    from openai import OpenAI
    from obiguard import createHeaders

    # Initialize the OpenAI client with Obiguard headers
    client = OpenAI(
      base_url='https://gateway.obiguard.ai/v1',
      api_key='sk-***',  # Your Anthropic API key here
      default_headers=createHeaders(
        provider="anthropic",  # The model below is an Anthropic model
        api_key="vk-obg***",  # Your Obiguard virtual key here
        strict_open_ai_compliance=False  # Required for thinking mode
      )
    )

    # Create the request
    response = client.chat.completions.create(
      model="claude-3-7-sonnet-latest",
      max_tokens=3000,
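      # `thinking` is not a named argument of the OpenAI SDK, so pass it through extra_body
      extra_body={
        "thinking": {
          "type": "enabled",
          "budget_tokens": 2030  # Maximum tokens to use for thinking
        }
      },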
      stream=False,
      messages=[
        {
          "role": "user",
          "content": [
            {
              "type": "text",
              "text": "when does the flight from new york to bengaluru land tomorrow, what time, what is its flight number, and what is its baggage belt?"
            }
          ]
        }
      ]
    )
    print(response)

    # For streaming responses, handle content_blocks differently
    # response = client.chat.completions.create(
    #   ...same config as above but with stream=True
    # )
    # for chunk in response:
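    #     if chunk.choices and chunk.choices[0].delta:
    #         # delta is a model object in the OpenAI SDK, so use getattr instead of .get
    #         content_blocks = getattr(chunk.choices[0].delta, "content_blocks", None)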
    #         if content_blocks is not None:
    #             for content_block in content_blocks:
    #                 print(content_block)
    ```
  </Tab>

  <Tab title="cURL">
    ```curl theme={null}
    curl "https://gateway.obiguard.ai/v1/chat/completions" \
      -H "Content-Type: application/json" \
      -H "x-obiguard-api-key: $OBIGUARD_API_KEY" \
      -H "x-obiguard-provider: anthropic" \
      -H "x-api-key: $ANTHROPIC_API_KEY" \
      -H "x-obiguard-strict-open-ai-compliance: false" \
      -d '{
        "model": "claude-3-7-sonnet-latest",
        "max_tokens": 3000,
        "thinking": {
          "type": "enabled",
          "budget_tokens": 2030
        },
        "stream": false,
        "messages": [
          {
            "role": "user",
            "content": [
              {
                "type": "text",
                "text": "when does the flight from new york to bengaluru land tomorrow, what time, what is its flight number, and what is its baggage belt?"
              }
            ]
          }
        ]
      }'
    ```
  </Tab>
</Tabs>
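
The examples above keep `budget_tokens` (2030) below `max_tokens` (3000), which leaves room for the visible answer. The sketch below makes that relationship explicit before a request is sent. `thinking_params` is a hypothetical helper, not part of the Obiguard SDK, and the two limits it checks (a 1,024-token minimum and a budget below `max_tokens`) come from Anthropic's published constraints; that Obiguard applies the same limits is an assumption.

```python
def thinking_params(max_tokens: int, budget_tokens: int) -> dict:
    """Build the max_tokens/thinking arguments, rejecting budgets that cannot work.

    Hypothetical helper for illustration; the limits are assumptions based on
    Anthropic's documented constraints for extended thinking.
    """
    # Assumed minimum thinking budget (Anthropic documents 1,024 tokens).
    if budget_tokens < 1024:
        raise ValueError("budget_tokens must be at least 1024")
    # The thinking budget has to leave room for the final answer.
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be smaller than max_tokens")
    return {
        "max_tokens": max_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
    }


params = thinking_params(max_tokens=3000, budget_tokens=2030)
# Tokens left over for the visible answer once the thinking budget is spent.
print(params["max_tokens"] - params["thinking"]["budget_tokens"])
```

With the Obiguard SDK, the result could be unpacked into the call, for example `client.chat.completions.create(model=..., messages=..., **params)`.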

## Multi-Turn Conversations

For multi-turn conversations, include the previous thinking content in the conversation history:

<Tabs>
  <Tab title="Python SDK">
    ```python theme={null}
    from obiguard import Obiguard

    # Initialize the Obiguard client
    client = Obiguard(
      virtual_key="vk-obg***",   # Your Obiguard virtual key here
      strict_open_ai_compliance=False
    )

    # Create the request
    response = client.chat.completions.create(
      model="claude-3-7-sonnet-latest",
      max_tokens=3000,
      thinking={
        "type": "enabled",
        "budget_tokens": 2030
      },
      stream=False,
      messages=[
        {
          "role": "user",
          "content": [
            {
              "type": "text",
              "text": "when does the flight from baroda to bangalore land tomorrow, what time, what is its flight number, and what is its baggage belt?"
            }
          ]
        },
        {
          "role": "assistant",
          "content": [
            {
              "type": "thinking",
              "thinking": "The user is asking several questions about a flight from Baroda (also known as Vadodara) to Bangalore:\n1. When does the flight land tomorrow\n2. What time does it land\n3. What is the flight number\n4. What is the baggage belt number at the arrival airport\n\nTo properly answer these questions, I would need access to airline flight schedules and airport information systems. However, I don't have:\n- Real-time or scheduled flight information\n- Access to airport baggage claim allocation systems\n- Information about specific flights between these cities\n- The ability to look up tomorrow's specific flight schedules\n\nThis question requires current, specific flight information that I don't have access to. Instead of guessing or providing potentially incorrect information, I should explain this limitation and suggest ways the user could find this information.",
              "signature": "EqoBCkgIARABGAIiQBVA7FBNLRtWarDSy9TAjwtOpcTSYHJ+2GYEoaorq3V+d3eapde04bvEfykD/66xZXjJ5yyqogJ8DEkNMotspRsSDKzuUJ9FKhSNt/3PdxoMaFZuH+1z1aLF8OeQIjCrA1+T2lsErrbgrve6eDWeMvP+1sqVqv/JcIn1jOmuzrPi2tNz5M0oqkOO9txJf7QqEPPw6RG3JLO2h7nV1BMN6wE="
            }
          ]
        },
        {
          "role": "user",
          "content": "thanks that's good to know, how about to chennai?"
        }
      ]
    )
    print(response)
    ```
  </Tab>

  <Tab title="OpenAI Python">
    ```python theme={null}
    from openai import OpenAI
    from obiguard import OBIGUARD_GATEWAY_URL, createHeaders

    openai = OpenAI(
      api_key='Anthropic_API_KEY',
      base_url=OBIGUARD_GATEWAY_URL,
      default_headers=createHeaders(
        provider="anthropic",
        obiguard_api_key="vk-obg***",  # Your Obiguard virtual key here
        strict_open_ai_compliance=False
      )
    )

    response = openai.chat.completions.create(
      model="claude-3-7-sonnet-latest",
      max_tokens=3000,
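      # `thinking` is not a named argument of the OpenAI SDK, so pass it through extra_body
      extra_body={
        "thinking": {
          "type": "enabled",
          "budget_tokens": 2030
        }
      },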
      stream=False,
      messages=[
        {
          "role": "user",
          "content": [
            {
              "type": "text",
              "text": "when does the flight from baroda to bangalore land tomorrow, what time, what is its flight number, and what is its baggage belt?"
            }
          ]
        },
        {
          "role": "assistant",
          "content": [
            {
              "type": "thinking",
              "thinking": "The user is asking several questions about a flight from Baroda (also known as Vadodara) to Bangalore:\n1. When does the flight land tomorrow\n2. What time does it land\n3. What is the flight number\n4. What is the baggage belt number at the arrival airport\n\nTo properly answer these questions, I would need access to airline flight schedules and airport information systems. However, I don't have:\n- Real-time or scheduled flight information\n- Access to airport baggage claim allocation systems\n- Information about specific flights between these cities\n- The ability to look up tomorrow's specific flight schedules\n\nThis question requires current, specific flight information that I don't have access to. Instead of guessing or providing potentially incorrect information, I should explain this limitation and suggest ways the user could find this information.",
              "signature": "EqoBCkgIARABGAIiQBVA7FBNLRtWarDSy9TAjwtOpcTSYHJ+2GYEoaorq3V+d3eapde04bvEfykD/66xZXjJ5yyqogJ8DEkNMotspRsSDKzuUJ9FKhSNt/3PdxoMaFZuH+1z1aLF8OeQIjCrA1+T2lsErrbgrve6eDWeMvP+1sqVqv/JcIn1jOmuzrPi2tNz5M0oqkOO9txJf7QqEPPw6RG3JLO2h7nV1BMN6wE="
            }
          ]
        },
        {
          "role": "user",
          "content": "thanks that's good to know, how about to chennai?"
        }
      ]
    )
    print(response)
    ```
  </Tab>

  <Tab title="cURL">
    ```curl theme={null}
    # The JSON body contains apostrophes, so it is read from stdin through a quoted heredoc
    curl "https://gateway.obiguard.ai/v1/chat/completions" \
      -H "Content-Type: application/json" \
      -H "x-obiguard-api-key: $OBIGUARD_API_KEY" \
      -H "x-obiguard-provider: anthropic" \
      -H "x-api-key: $ANTHROPIC_API_KEY" \
      -H "x-obiguard-strict-open-ai-compliance: false" \
      -d @- <<'EOF'
    {
      "model": "claude-3-7-sonnet-latest",
      "max_tokens": 3000,
      "thinking": {
        "type": "enabled",
        "budget_tokens": 2030
      },
      "stream": false,
      "messages": [
        {
          "role": "user",
          "content": [
            {
              "type": "text",
              "text": "when does the flight from baroda to bangalore land tomorrow, what time, what is its flight number, and what is its baggage belt?"
            }
          ]
        },
        {
          "role": "assistant",
          "content": [
            {
              "type": "thinking",
              "thinking": "The user is asking several questions about a flight from Baroda (also known as Vadodara) to Bangalore:\n1. When does the flight land tomorrow\n2. What time does it land\n3. What is the flight number\n4. What is the baggage belt number at the arrival airport\n\nTo properly answer these questions, I would need access to airline flight schedules and airport information systems. However, I don't have:\n- Real-time or scheduled flight information\n- Access to airport baggage claim allocation systems\n- Information about specific flights between these cities\n- The ability to look up tomorrow's specific flight schedules\n\nThis question requires current, specific flight information that I don't have access to. Instead of guessing or providing potentially incorrect information, I should explain this limitation and suggest ways the user could find this information.",
              "signature": "EqoBCkgIARABGAIiQBVA7FBNLRtWarDSy9TAjwtOpcTSYHJ+2GYEoaorq3V+d3eapde04bvEfykD/66xZXjJ5yyqogJ8DEkNMotspRsSDKzuUJ9FKhSNt/3PdxoMaFZuH+1z1aLF8OeQIjCrA1+T2lsErrbgrve6eDWeMvP+1sqVqv/JcIn1jOmuzrPi2tNz5M0oqkOO9txJf7QqEPPw6RG3JLO2h7nV1BMN6wE="
            }
          ]
        },
        {
          "role": "user",
          "content": "thanks that's good to know, how about to chennai?"
        }
      ]
    }
    EOF
    ```
  </Tab>
</Tabs>
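
The history in the examples above can also be assembled programmatically. The sketch below replays the assistant's content blocks unchanged, `signature` included, and then appends the next user turn. `append_turn` is a hypothetical helper, and how you obtain the blocks from the previous response is left open because this page does not specify it; the placeholder strings stand in for real values.

```python
def append_turn(messages, assistant_blocks, next_user_text):
    """Return a new history with the assistant's blocks replayed and a new user turn.

    Hypothetical helper for illustration. Thinking blocks are copied as-is so
    that the text and its signature stay paired exactly as they were returned.
    """
    replayed = [dict(block) for block in assistant_blocks]  # shallow copies, no edits
    return [
        *messages,
        {"role": "assistant", "content": replayed},
        {"role": "user", "content": next_user_text},
    ]


history = [
    {"role": "user", "content": [{"type": "text", "text": "first question"}]}
]
previous_blocks = [
    {
        "type": "thinking",
        "thinking": "<thinking text from the previous response>",
        "signature": "<signature from the previous response>",
    }
]

messages = append_turn(history, previous_blocks, "how about to chennai?")
print([message["role"] for message in messages])
```

The original `history` list is left untouched, so the same base conversation can be reused for several follow-up questions.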

## Understanding the Response Structure

When working with thinking-enabled models, note that their responses use a special format:

<Note>
  In streaming responses, the assistant’s thinking output arrives in the `response_chunk.choices[0].delta.content_blocks`
  array of each chunk, not in the `response.choices[0].message.content` string that standard completions use.
</Note>

This distinction is crucial for streaming responses, where you must extract the thinking content from the content blocks.
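
As a rough illustration of that extraction, the sketch below walks a stream and separates thinking text from answer text. The shape of each content block (`{"index": ..., "delta": {"thinking": ...}}` or `{"delta": {"text": ...}}`) is an assumption that this page does not confirm, and `read_field` and `split_stream` are hypothetical helpers. The chunks here are plain dictionaries standing in for a real stream.

```python
def read_field(obj, name):
    """Read a field from either a dict or an attribute-style object."""
    if isinstance(obj, dict):
        return obj.get(name)
    return getattr(obj, name, None)


def split_stream(chunks):
    """Collect thinking and answer text from streamed chunks.

    Assumes each content block carries a `delta` with either a `thinking`
    or a `text` fragment; adjust the field names to the actual payload.
    """
    thinking_parts, answer_parts = [], []
    for chunk in chunks:
        choices = read_field(chunk, "choices") or []
        if not choices:
            continue
        delta = read_field(choices[0], "delta")
        if delta is None:
            continue
        for block in read_field(delta, "content_blocks") or []:
            block_delta = read_field(block, "delta") or {}
            thinking_piece = read_field(block_delta, "thinking")
            text_piece = read_field(block_delta, "text")
            if thinking_piece:
                thinking_parts.append(thinking_piece)
            if text_piece:
                answer_parts.append(text_piece)
    return "".join(thinking_parts), "".join(answer_parts)


# Fake chunks standing in for a streamed response.
fake_chunks = [
    {"choices": [{"delta": {"content_blocks": [{"index": 0, "delta": {"thinking": "Check the schedule. "}}]}}]},
    {"choices": [{"delta": {"content_blocks": [{"index": 0, "delta": {"thinking": "No live data."}}]}}]},
    {"choices": [{"delta": {"content_blocks": [{"index": 1, "delta": {"text": "I cannot look that up."}}]}}]},
    {"choices": []},
]

thinking, answer = split_stream(fake_chunks)
print(thinking)
print(answer)
```

Because `read_field` accepts both dictionaries and attribute-style objects, the same loop should work whether the client yields plain dicts or model objects.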

## FAQs

<AccordionGroup>
  <Accordion title="Can I use thinking mode with any model?">
    No, thinking mode is only available on specific reasoning-optimized models. Currently that means Claude 3.7 Sonnet; support for other models will be added as they become available.
  </Accordion>

  <Accordion title="Does thinking mode increase token usage?">
    Yes, enabling thinking mode will increase your token usage since the model is generating additional content for its reasoning process.
    The `budget_tokens` parameter lets you control the maximum tokens allocated to thinking.
  </Accordion>

  <Accordion title="Do I need to handle the response differently for thinking mode?">
    Yes, particularly for streaming responses.
    The thinking content is returned in the `content_blocks` array rather than the standard content field,
    so you’ll need to adapt your response parsing logic.
  </Accordion>

  <Accordion title="Why do I need to set strict_open_ai_compliance to false?">
    The thinking mode response format extends beyond the standard OpenAI completion schema.
    Setting `strict_open_ai_compliance` to false allows Obiguard to return this extended format with the thinking content.
  </Accordion>
</AccordionGroup>
