What are vision models?Vision models are AI systems that integrate visual and language understanding, enabling them to interpret images
alongside natural language text. They are trained on extensive datasets containing both images and text, with training
methods varying based on their specific objectives.
Using Vision Chat Completion
Obiguard implements the OpenAI message format, allowing you to include images in API requests. You can provide images to the model either by supplying a URL or by embedding the image as a base64-encoded string. Below is an example with OpenAI’sgpt-4o
model:
Supported Providers and Models
Obiguard integrates with a wide range of vision models from leading providers. The table below lists some of the supported models.Provider | Models | Functions |
---|---|---|
OpenAI | gpt-4-vision-preview, gpt-4o, gpt-4o-mini | Create Chat Completion |
Azure OpenAI | gpt-4-vision-preview, gpt-4o, gpt-4o-mini | Create Chat Completion |
Gemini | gemini-1.0-pro-vision, gemini-1.5-flash, gemini-1.5-flash-8b, gemini-1.5-pro | Create Chat Completion |
Anthropic | claude-3-sonnet, claude-3-haiku, claude-3-opus, claude-3.5-sonnet, claude-3.5-haiku | Create Chat Completion |
AWS Bedrock | anthropic.claude-3-5-sonnet, anthropic.claude-3-5-haiku, anthropic.claude-3-5-sonnet-20240620-v1:0 | Create Chat Completion |