Get Started
Building a demo or prototyping an MVP but don’t want to pay API costs just to validate an idea?
OpenRouter's free tier is generous for early development, but free models come with maintenance trade-offs: they can get rate limited, hit capacity, or disappear without notice, leaving you juggling fallbacks instead of shipping.
We maintain a live-updated list of available free models so you don't have to track availability yourself. Set your use case and sorting preferences, fetch the list from our API, and pass the model IDs to OpenRouter; it will try each model in the order you specified until one responds. No need to manage fallbacks or check which models are currently working.
Preview Your Live Model List
Configure use case and sorting to preview the live, health-scored list your app will fetch dynamically.
Set Up OpenRouter
OpenRouter provides a unified API for accessing many LLM providers. Sign up for free and create an API key.
Get Your API Key
Sign in with GitHub to create your API key. All keys share a per-user limit of 200 requests per 24 hours; with the SDK's 15-minute cache, a long-running instance makes at most 96 list fetches per day per parameter combination, comfortably within the limit.
Copy free-llm-router.ts
This helper fetches free model IDs from our API, lets you report successes and issues back, and handles caching automatically. It's a single file with no dependencies.
/**
* Free LLM Router helper with built-in 15-minute caching
* Set FREE_MODELS_API_KEY in your environment.
*
* Caching behavior:
* - In-memory cache with 15-minute TTL (matches server refresh rate)
* - Cache is per-instance (resets on serverless cold starts)
* - Use { cache: 'no-store' } to bypass cache (mirrors fetch semantics)
* - Falls back to stale cache on API errors (resilient to outages)
*
* Usage:
* const ids = await getModelIds(['tools']);
* const fresh = await getModelIds(['chat'], 'contextLength', 5, { maxErrorRate: 20, timeRange: '24h', myReports: true, cache: 'no-store' });
*/
const API = 'https://free-LLM-router.pages.dev/api/v1';
const API_KEY = process.env.FREE_MODELS_API_KEY;
/**
* Type definitions for SDK parameters.
* IMPORTANT: Keep these in sync with src/lib/api-definitions.ts
* - UseCase: see VALID_USE_CASES
* - Sort: see VALID_SORTS
* - TimeRange: see VALID_TIME_RANGES
*/
type UseCase = 'chat' | 'vision' | 'tools' | 'longContext' | 'reasoning';
type Sort = 'contextLength' | 'maxOutput' | 'capable' | 'leastIssues' | 'newest';
type CacheMode = 'default' | 'no-store';
type TimeRange = '15m' | '30m' | '1h' | '6h' | '24h' | '7d' | '30d' | 'all';
// In-memory cache - 15 minute TTL (matches server refresh rate)
// NOTE: Cache is per-instance and resets on serverless cold starts
const CACHE_TTL = 15 * 60 * 1000; // 15 minutes in milliseconds
const cache = new Map<string, { data: string[]; timestamp: number }>();
/**
* Get available free model IDs with optional filtering and sorting.
* Default sort is 'contextLength' (largest context window first).
 * maxErrorRate, timeRange, and myReports are omitted unless provided, so the API defaults apply (no error-rate filter, timeRange '1h', myReports false).
*/
export async function getModelIds(
useCase?: UseCase[],
sort: Sort = 'contextLength',
topN?: number,
options?: {
cache?: CacheMode;
maxErrorRate?: number;
timeRange?: TimeRange;
myReports?: boolean;
}
): Promise<string[]> {
  // Sort useCase array for deterministic cache keys (avoid fragmentation);
  // an empty array is treated like undefined (no use-case filter)
  const normalizedUseCase = useCase?.length ? [...useCase].sort() : undefined;
// Generate cache key from normalized params
const cacheKey = JSON.stringify({
useCase: normalizedUseCase,
sort,
topN,
maxErrorRate: options?.maxErrorRate,
timeRange: options?.timeRange,
myReports: options?.myReports,
});
const cached = cache.get(cacheKey);
const cacheMode = options?.cache ?? 'default';
// Return cached data if fresh and cache is enabled
if (cacheMode === 'default' && cached && Date.now() - cached.timestamp < CACHE_TTL) {
return cached.data;
}
// Fetch fresh data
try {
const params = new URLSearchParams({ sort });
if (normalizedUseCase) params.set('useCase', normalizedUseCase.join(','));
if (topN) params.set('topN', String(topN));
if (options?.maxErrorRate !== undefined) {
params.set('maxErrorRate', String(options.maxErrorRate));
}
if (options?.timeRange) {
params.set('timeRange', options.timeRange);
}
if (options?.myReports) {
params.set('myReports', 'true');
}
    const res = await fetch(`${API}/models/ids?${params}`, {
      headers: { Authorization: `Bearer ${API_KEY}` },
    });
    if (!res.ok) throw new Error(`Free LLM Router API error: HTTP ${res.status}`);
    const { ids } = await res.json();
// Store in cache
cache.set(cacheKey, { data: ids, timestamp: Date.now() });
return ids;
} catch (error) {
// Fall back to stale cache if available (resilient to API outages)
if (cached) {
// Only log in development to avoid serverless noise
if (process.env.NODE_ENV !== 'production') {
console.warn('API request failed, using stale cached data', error);
}
return cached.data;
}
throw error;
}
}
// Report issues to help improve model health data.
// This does NOT count towards your rate limit - you're contributing!
export function reportIssue(
modelId: string,
issue: 'error' | 'rate_limited' | 'unavailable',
details?: string
) {
fetch(`${API}/models/feedback`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({ modelId, issue, details }),
}).catch(() => {}); // Fire-and-forget, don't block on errors
}
// Report successful model usage to improve health metrics.
// This does NOT count towards your rate limit - you're contributing!
export function reportSuccess(modelId: string, details?: string) {
fetch(`${API}/models/feedback`, {
method: 'POST',
headers: {
'Authorization': `Bearer ${API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({ modelId, success: true, details }),
}).catch(() => {}); // Fire-and-forget, don't block on errors
}
// Helper: detect issue type from HTTP status code
export function issueFromStatus(status: number): 'rate_limited' | 'unavailable' | 'error' {
if (status === 429) return 'rate_limited';
if (status === 503) return 'unavailable';
return 'error';
}
Use It
This is the exact `getModelIds` call for your current use case, sort, and top N.
// This is how you fetch free model IDs
getModelIds([], 'contextLength', 5, { maxErrorRate: 20, timeRange: '1h' })

Loop through models until one succeeds. Free models may be rate-limited, so we try multiple and optionally fall back to stable models you trust. See Code Examples for more patterns.
// 1. Fetch free models and try each until one succeeds
try {
const freeModels = await getModelIds([], 'contextLength', 5, { maxErrorRate: 20, timeRange: '1h' });
// 2. (Optional) Add stable fallback models you trust (usually paid)
const stableFallback = ['anthropic/claude-3.5-sonnet'];
const models = [...freeModels, ...stableFallback];
// 3. Try models until one succeeds
for (const id of models) {
try {
const res = await client.chat.completions.create({ model: id, messages });
reportSuccess(id); // Helps improve health metrics
return res;
} catch (e) {
const status = e.status || e.response?.status;
reportIssue(id, issueFromStatus(status), e.message); // Helps improve health metrics
}
}
} catch {
// API unavailable - fall back to hardcoded models
// E.g. return await client.chat.completions.create({ model: 'anthropic/claude-3.5-sonnet', messages });
}
throw new Error('All models failed');

Query Parameters
Customize your requests by combining these parameters. All parameters are optional and can be mixed and matched.
useCase
Select models by use case. Pass one or more as a comma-separated list: ?useCase=vision,tools
| Value | Description |
|---|---|
| chat | Text-to-text models optimized for conversation |
| vision | Models that accept image inputs |
| tools | Models that support function/tool calling |
| longContext | Models with 100k+ token context windows |
| reasoning | Models with advanced reasoning capabilities (e.g., o1, QwQ, DeepSeek R1) |
sort
Control the order models are returned. This determines fallback priority when iterating through the list. Example: ?sort=contextLength
| Value | Label | Description |
|---|---|---|
| contextLength | Context Length | Largest context window first - best for long documents |
| maxOutput | Max Output | Highest output token limit first - best for long-form generation |
| capable | Most Capable | Most supported features first - good default |
| leastIssues | Least Reported Issues | Fewest user-reported issues first - best for stability |
| newest | Newest First | Most recently added models first - best for trying new models |
topN
Return only the top N models based on sort order. Range: 1-100. Default: unlimited. Example: ?topN=10
maxErrorRate
Exclude models with error rate above this percentage (0-100). Error rate = errors / (errors + successes). Example: ?maxErrorRate=20 excludes models with more than 20% error rate, e.g. a model with 5 errors and 15 successes (25% error rate).
timeRange
Time window for calculating error rates. Options: 15m, 30m, 1h, 6h, 24h, 7d, 30d, all. Default: 1h.
myReports
When set to true, calculate error rates from only your own reported issues instead of all community reports. Requires API key authentication. Default: false. Example: ?myReports=true
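All of these map one-to-one onto the `getModelIds` helper above, which builds the query string for you. A minimal sketch combining them (the parameter values and logged output are illustrative):

import { getModelIds } from './free-llm-router';

// Roughly equivalent to:
// GET /api/v1/models/ids?sort=leastIssues&useCase=tools,vision&topN=10&maxErrorRate=20&timeRange=7d&myReports=true
// (the helper sorts the useCase list alphabetically to keep cache keys stable)
const ids = await getModelIds(
  ['vision', 'tools'],  // useCase: image inputs AND function/tool calling
  'leastIssues',        // sort: fewest reported issues first (fallback priority)
  10,                   // topN: keep only the 10 best matches
  { maxErrorRate: 20, timeRange: '7d', myReports: true } // drop models above 20% error rate, judged by your own reports over 7 days
);
console.log(ids); // e.g. ['meta-llama/llama-3.3-70b-instruct:free', ...]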
API Reference
Complete reference for all available endpoints. See Query Parameters for parameter details.
/api/v1/models/ids
Lightweight endpoint returning only model IDs. Fast and small payload - use this in production.
Query Parameters
| Parameter | Type | Description |
|---|---|---|
| useCase | string | Comma-separated: chat, vision, tools, longContext, reasoning |
| sort | string | One of: contextLength, maxOutput, capable, leastIssues, newest |
| topN | number | Return top N models based on sort order (1-100) |
| maxErrorRate | number | Exclude models with error rate above this percentage (0-100) |
| timeRange | string | Time window for error rates: 15m, 30m, 1h, 6h, 24h, 7d, 30d, all. Default: 1h. |
| myReports | boolean | If true, calculate error rates from only your own reports (requires API key). Default: false. |
Response
| Field | Type | Description |
|---|---|---|
| ids | string[] | Array of model IDs |
| count | number | Number of IDs returned |
Errors
500 - Server error
Cache-Control: private, max-age=60 - Responses are cached for 60 seconds at the HTTP layer and 15 minutes in the SDK.
Request
An API key is required to send requests.
curl "https://free-LLM-router.pages.dev/api/v1/models/ids?sort=contextLength&topN=5&maxErrorRate=20&timeRange=1h" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response
{
"ids": [
"google/gemini-2.0-flash-exp:free",
"meta-llama/llama-3.3-70b-instruct:free",
"deepseek/deepseek-chat:free"
],
"count": 15
}/api/v1/models/full
Full model objects with metadata, feedback counts, and timestamps. Use for browsing or debugging.
Query Parameters
Same parameters as /models/ids: useCase, sort, topN, maxErrorRate, timeRange, and myReports.
See /models/ids documentation above for parameter details.
Response
| Field | Type | Description |
|---|---|---|
| models | Model[] | Full model objects with all metadata |
| feedbackCounts | object | Per-model feedback: issue counts, success count, and error rate (percentage of failed requests) |
| lastUpdated | string | ISO 8601 timestamp of last sync |
| useCases | string[] | Applied use case values |
| sort | string | Applied sort value |
| count | number | Total number of models returned |
Cache-Control: private, max-age=60 - Responses are cached for 60 seconds at the HTTP layer and 15 minutes in the SDK.
Request
An API key is required to send requests.
curl "https://free-LLM-router.pages.dev/api/v1/models/full?sort=contextLength&topN=5&maxErrorRate=20&timeRange=1h" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response
{
"models": [
{
"id": "google/gemini-2.0-flash-exp:free",
"name": "Gemini 2.0 Flash",
"contextLength": 1000000,
"maxCompletionTokens": 8192,
"description": "...",
"inputModalities": ["text", "image"],
"outputModalities": ["text"],
"supportedParameters": ["tools", "reasoning"]
}
],
"feedbackCounts": { ... },
"lastUpdated": "2024-12-29T10:00:00Z",
"filters": ["vision"],
"sort": "contextLength",
"count": 15
}/api/v1/models/feedback
Report model feedback: successes or issues (rate limiting, errors, unavailability). Does not count towards your rate limit.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| modelId | string | Yes | The model ID to report |
| success | boolean | No | Set to true to report a successful request. If omitted, reports an issue (requires the issue field). |
| issue | string | Conditional | Required when success is false or omitted. One of: rate_limited, unavailable, error |
| details | string | No | Optional description of the issue |
| dryRun | boolean | No | If true, validates request but doesn't save (for testing) |
Response
| Field | Type | Description |
|---|---|---|
| received | boolean | Whether feedback was recorded |
Errors
400 - Missing modelId or invalid issue type
500 - Server error
Request
An API key is required to send requests.
curl -X POST https://free-LLM-router.pages.dev/api/v1/models/feedback \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"modelId": "google/gemini-2.0-flash-exp:free",
"success": true,
"dryRun": true
}'

Response

{ "received": true }

Code Examples
Ready-to-use patterns for common use cases.
One-off API Call
Simple single prompt completion - perfect for scripts, CLI tools, or serverless functions.
import { getModelIds, reportSuccess, reportIssue, issueFromStatus } from './free-llm-router';
const prompt = 'Summarize this article in 3 bullet points: ...';
try {
// Get top 3 models with both chat and vision capabilities
// SDK has built-in 15-min cache, so this won't hit the API on every call
const models = await getModelIds(['chat', 'vision'], 'capable', 3);
// Try each model until one succeeds
for (const id of models) {
try {
const res = await fetch('https://openrouter.ai/api/v1/chat/completions', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.OPENROUTER_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: id,
messages: [{ role: 'user', content: prompt }],
}),
});
if (!res.ok) {
// Report the right issue type - free, doesn't use quota
reportIssue(id, issueFromStatus(res.status), `HTTP ${res.status}`);
continue;
}
const data = await res.json();
console.log(data.choices[0].message.content);
// Report success - helps other users know this model works!
reportSuccess(id);
break; // Success - exit loop
} catch (e) {
reportIssue(id, 'error', e.message); // Free - doesn't use quota
}
}
} catch {
// API unavailable - handle gracefully
console.error('Failed to fetch models');
}

Chatbot
Multi-turn conversation with message history - ideal for chat interfaces.
import { getModelIds, reportSuccess, reportIssue, issueFromStatus } from './free-llm-router';
import OpenAI from 'openai';
// OpenAI SDK works with OpenRouter's API
const client = new OpenAI({
baseURL: 'https://openrouter.ai/api/v1',
apiKey: process.env.OPENROUTER_API_KEY,
});
// Store conversation history for multi-turn chat
const messages: OpenAI.ChatCompletionMessageParam[] = [];
async function chat(userMessage: string) {
messages.push({ role: 'user', content: userMessage });
try {
// SDK has built-in 15-min cache, so this won't hit the API on every call
const models = await getModelIds(['chat'], 'capable', 5);
for (const id of models) {
try {
const res = await client.chat.completions.create({
model: id,
messages, // Include full history
});
const reply = res.choices[0].message.content;
messages.push({ role: 'assistant', content: reply });
// Report success - helps other users know this model works!
reportSuccess(id);
return reply;
} catch (e) {
// Report with correct issue type - free, doesn't use quota
reportIssue(id, issueFromStatus(e.status), e.message);
}
}
} catch {
// API unavailable
}
throw new Error('All models failed');
}

Tool Calling
Let the model call functions - for agents, data fetching, or structured outputs.
import { getModelIds, reportIssue, issueFromStatus } from './free-llm-router';
import { createOpenAI } from '@ai-sdk/openai';
import { generateText, tool } from 'ai';
import { z } from 'zod';
// Vercel AI SDK with OpenRouter
const openrouter = createOpenAI({
baseURL: 'https://openrouter.ai/api/v1',
apiKey: process.env.OPENROUTER_API_KEY,
});
// Define tools with Zod schemas
const tools = {
getWeather: tool({
description: 'Get current weather for a location',
parameters: z.object({ location: z.string() }),
execute: async ({ location }) => `72°F and sunny in ${location}`,
}),
};
async function askWithTools(prompt: string) {
try {
// Filter for models that support tool calling
// SDK has built-in 15-min cache, so this won't hit the API on every call
const models = await getModelIds(['tools'], 'capable', 3);
for (const id of models) {
try {
const { text, toolCalls } = await generateText({
model: openrouter(id),
prompt,
tools,
});
return { text, toolCalls };
} catch (e) {
// Report with correct issue type - free, doesn't use quota
reportIssue(id, issueFromStatus(e.status), e.message);
}
}
} catch {
// API unavailable
}
throw new Error('All models failed');
}
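A quick usage sketch for askWithTools above; the prompt and the logged fields are illustrative:

// The model may answer directly or invoke the getWeather tool defined above
const { text, toolCalls } = await askWithTools('What is the weather in Tokyo?');
console.log(text);      // the model's text response (may be empty if it only issued tool calls)
console.log(toolCalls); // any tool invocations, e.g. getWeather with { location: 'Tokyo' }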