# Models Metadata System
Last updated: 2024-06-15
The models metadata system provides standardized information about supported LLM models across all providers, enabling consistent configuration and cost tracking.
## Overview
The metadata system serves as:
- Single source of truth for model capabilities
- Configuration validator for model-specific limits
- Cost calculation reference
- Feature compatibility checker
## Supported Models
| Model ID | Provider | Context Window (tokens) | Max Output Tokens | Input Price ($/1M) | Output Price ($/1M) | Cached Price ($/1M) |
|---|---|---|---|---|---|---|
| anthropic-claude-3-7-sonnet-latest | Anthropic | 200000 | 64000 | 3.00 | 15.00 | 0.30 |
| anthropic-claude-3-5-sonnet-latest | Anthropic | 200000 | 8192 | 3.00 | 15.00 | 0.30 |
| anthropic-claude-3-5-haiku-latest | Anthropic | 200000 | 8192 | 0.80 | 4.00 | 1.00 |
| openai-gpt-4.1 | OpenAI | 1047576 | 32768 | 2.00 | 8.00 | 0.50 |
| openai-gpt-4o | OpenAI | 128000 | 64000 | 2.50 | 10.00 | 1.25 |
| openai-gpt-4.1-mini | OpenAI | 1047576 | 32768 | 0.40 | 1.60 | 0.10 |
| openai-gpt-4.1-nano | OpenAI | 1047576 | 32768 | 0.10 | 0.40 | 0.025 |
| openai-gpt-4o-mini | OpenAI | 128000 | 16384 | 0.15 | 0.60 | 0.075 |
| openai-gpt-4.5 | OpenAI | 128000 | 64000 | 75.00 | 150.00 | 37.50 |
| openai-o1 | OpenAI | 200000 | 100000 | 15.00 | 60.00 | 7.50 |
| openai-o3-mini | OpenAI | 200000 | 100000 | 1.10 | 4.40 | 0.55 |
| openai-o3 | OpenAI | 200000 | 100000 | 10.00 | 40.00 | 2.50 |
| openai-o4-mini | OpenAI | 200000 | 100000 | 1.10 | 4.40 | 0.275 |
| mistral-mistral-large-latest | Mistral | 131072 | 8192 | 2.00 | 6.00 | - |
| mistral-mistral-small-latest | Mistral | 131072 | 8192 | 0.10 | 0.30 | - |
| mistral-pixtral-large-latest | Mistral | 131072 | 8192 | 2.00 | 6.00 | - |
| mistral-codestral-latest | Mistral | 200000 | 8192 | 0.30 | 0.90 | - |
| mistral-ministral-8b-latest | Mistral | 131072 | 8192 | 0.10 | 0.10 | - |
| mistral-ministral-3b-latest | Mistral | 131072 | 8192 | 0.04 | 0.04 | - |
| mistral-mistral-embed | Mistral | 8000 | 8192 | 0.10 | 0.00 | - |
| mistral-pixtral-12b | Mistral | 131072 | 8192 | 0.15 | 0.15 | - |
| mistral-mistral-nemo | Mistral | 131072 | 8192 | 0.15 | 0.15 | - |
| gemini-gemini-2.5-pro-exp-03-25 | Gemini | 1048576 | 64000 | 0.00 | 0.00 | - |
| gemini-gemini-2.0-flash-exp | Gemini | 1048576 | 8192 | 0.00 | 0.00 | - |
| gemini-gemini-2.0-flash-thinking-exp-01-21 | Gemini | 1048576 | 65536 | 0.00 | 0.00 | - |
| gemini-gemini-2.0-flash | Gemini | 1048576 | 8192 | 0.10 | 0.40 | - |
| gemini-gemini-2.0-flash-lite | Gemini | 1048576 | 8192 | 0.075 | 0.30 | - |
| gemini-gemini-2.5-pro-preview-03-25 | Gemini | 1048576 | 64000 | 1.25 | 10.00 | - |
| groq-meta-llama/llama-4-scout-17b-16e-instruct | Groq | 131072 | 8192 | 0.11 | 0.34 | - |
| groq-meta-llama/llama-4-maverick-17b-128e-instruct | Groq | 131072 | 8192 | 0.50 | 0.77 | - |
| groq-deepseek-r1-distill-llama-70b | Groq | 128000 | 8192 | 0.75 | 0.99 | - |
| groq-deepseek-r1-distill-qwen-32b | Groq | 128000 | 16384 | 0.69 | 0.69 | - |
| groq-qwen-2.5-32b | Groq | 128000 | 16384 | 0.79 | 0.79 | - |
| groq-qwen-2.5-coder-32b | Groq | 128000 | 16384 | 0.79 | 0.79 | - |
| groq-qwen-qwq-32b | Groq | 128000 | 16384 | 0.29 | 0.39 | - |
| groq-mistral-saba-24b | Groq | 200000 | 16384 | 0.79 | 0.79 | - |
| deepseek-deepseek-reasoner | Deepseek | 65536 | 8192 | 0.55 | 2.10 | 0.14 |
| deepseek-deepseek-chat | Deepseek | 65536 | 8192 | 0.27 | 1.10 | 0.07 |
| grok-3-beta | Grok (xAI) | 131072 | 8192 | 3.00 | 15.00 | - |
| grok-3-mini-beta | Grok (xAI) | 131072 | 8192 | 0.20 | 0.50 | - |
Note: Prices are in USD per million tokens; "-" indicates that prompt caching is not available for that model. The metadata also tracks per-model feature flags: streaming (real-time responses), multimodal (image/audio support), function_calling (tool use), thinking (reasoning steps), and code (code generation).
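To make the pricing columns concrete: a request to anthropic-claude-3-7-sonnet-latest with 1,500 prompt tokens and 800 completion tokens, priced from the table above, costs:

```python
# Worked example using the table prices for anthropic-claude-3-7-sonnet-latest
input_price = 3.00    # $ per 1M input tokens
output_price = 15.00  # $ per 1M output tokens

cost = (1_500 / 1_000_000) * input_price + (800 / 1_000_000) * output_price
print(f"${cost:.4f}")  # $0.0165
```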
## Accessing Metadata
```python
from aicore.models_metadata import METADATA

# Get metadata for a specific model
gpt4o_metadata = METADATA["openai-gpt-4o"]

# List all available models
all_models = list(METADATA.keys())
```
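Since model IDs follow a `provider-model` naming convention, you can also filter the registry with ordinary dict operations (a convenience sketch; the library itself may expose a dedicated helper):

```python
# List all OpenAI models, relying on the "provider-model" ID convention
openai_models = [model_id for model_id in METADATA if model_id.startswith("openai-")]
```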
## Metadata Structure
Each model entry contains:
```python
from typing import Optional
from pydantic import BaseModel

class ModelMetaData(BaseModel):
    context_window: int                      # Maximum context size in tokens
    max_tokens: int                          # Maximum generation tokens
    pricing: Optional[PricingConfig] = None  # Pricing details (see below)
```
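Given that structure, capability checks reduce to plain attribute access. A minimal sketch using the `openai-o3` entry from the table:

```python
meta = METADATA["openai-o3"]
print(meta.context_window)  # 200000
print(meta.max_tokens)      # 100000
if meta.pricing is None:
    print("No pricing information recorded for this model")
```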
## Pricing Configuration
The pricing system supports:
```python
from typing import Dict, Optional, Tuple

class PricingConfig(BaseModel):
    input: float            # $ per 1M input tokens
    output: float           # $ per 1M output tokens
    cached: float = 0       # $ per 1M cached (cache-read) prompt tokens
    cache_write: float = 0  # $ per 1M tokens written to the prompt cache
    happy_hour: Optional[Dict[str, Tuple[float, float]]] = None  # Discount periods, e.g. Deepseek off-peak hours
    dynamic: Optional[DynamicPricing] = None  # Pricing tiers based on tokens consumed, e.g. Gemini
```
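As a rough sketch of how a calculation over this config combines the rates (the library's actual `calculate_cost` may additionally account for `cache_write`, `happy_hour` windows, and `dynamic` tiers):

```python
def calculate_cost_sketch(pricing: PricingConfig,
                          prompt_tokens: int,
                          response_tokens: int,
                          cached_tokens: int = 0) -> float:
    # Prompt tokens served from cache are billed at the cached rate;
    # the remainder are billed at the full input rate
    billable_prompt = prompt_tokens - cached_tokens
    return (
        billable_prompt * pricing.input
        + cached_tokens * pricing.cached
        + response_tokens * pricing.output
    ) / 1_000_000
```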
## Example Usage
```python
# Calculate request cost
model_data = METADATA["anthropic-claude-3-7-sonnet-latest"]
cost = model_data.pricing.calculate_cost(
    prompt_tokens=1500,
    response_tokens=800,
)
```
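The same pattern makes it easy to compare providers for a given workload; a sketch assuming `calculate_cost` accepts the same keyword arguments for every model:

```python
# Compare the same request across two models from the table
for model_id in ("anthropic-claude-3-5-haiku-latest", "openai-gpt-4o-mini"):
    pricing = METADATA[model_id].pricing
    print(model_id, pricing.calculate_cost(prompt_tokens=1500, response_tokens=800))
```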
## Best Practices
Always check model limits before sending a request:

```python
# prompt_tokens is the prompt's token count (not its character length)
if prompt_tokens > METADATA[model].context_window:
    raise ValueError("Prompt too long for the model's context window")
```
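The comparison must use token counts, not character counts. How you count depends on the provider's tokenizer; for OpenAI models, for example, tiktoken gives an exact count (shown purely as an illustration, tiktoken is not a dependency of the metadata system):

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4o")
prompt_tokens = len(enc.encode(prompt))
```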
Use metadata for cost estimation:

```python
budget = 10.00
estimated_cost = METADATA[model].estimate_cost(prompt)
if estimated_cost > budget:
    ...  # fall back to a cheaper model
```
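Taking that a step further, the registry can drive model selection directly; a sketch reusing the `estimate_cost` call from above (`cheapest_model_for` is a hypothetical helper, not part of the library):

```python
from typing import Optional

def cheapest_model_for(prompt: str, prompt_tokens: int, budget: float) -> Optional[str]:
    # Keep models whose context window fits the prompt,
    # then pick the lowest estimated cost within budget
    candidates = [
        (METADATA[m].estimate_cost(prompt), m)
        for m in METADATA
        if METADATA[m].context_window >= prompt_tokens
    ]
    affordable = [(cost, m) for cost, m in candidates if cost <= budget]
    return min(affordable)[1] if affordable else None
```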