🚀 1000 Free AI Inferences a Day with OpenRouter + LiteLLM
Most people think running AI models at scale means $$$. But there's a neat trick: with OpenRouter you get 1000 free inference requests per day, provided you top up your account once with $10.
This unlocks a playground for experiments, custom APIs, or simply a drop-in replacement for OpenAI endpoints. Let's set it up in a few minutes.
🚀 Step 1: Spin up your own inference endpoint
Create an account at openrouter.ai and top up $10 once. This enables the free-tier quota (1000 calls/day).
Now run a LiteLLM container locally.
docker-compose.yml:
services:
  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    restart: unless-stopped
    command: ["--config=/litellm_config.yaml"]
    env_file:
      - .env
    volumes:
      - ./litellm_config.yaml:/litellm_config.yaml
    ports:
      - "4000:4000"
.env:
OPENROUTER_API_KEY="sk-or-v1-..." # from https://openrouter.ai/settings/keys
LITELLM_MASTER_KEY="sk-1234"
Fetch my pre-made config with all the latest FREE models:
wget -O litellm_config.yaml https://nexus.echolotintel.eu/api/public/template/openrouter-free
docker compose up -d
Your own OpenAI-compatible endpoint is now live at:
http://localhost:4000/v1/chat/completions
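Before sending real prompts, you can check that the proxy is up and see exactly which free models the config exposes. LiteLLM implements the OpenAI-standard /v1/models route; here is a quick Python sketch (assumes the requests package and the master key from your .env):

import requests

# List the models the LiteLLM proxy currently serves
resp = requests.get(
    "http://localhost:4000/v1/models",
    headers={"Authorization": "Bearer sk-1234"},  # LITELLM_MASTER_KEY
    timeout=10,
)
resp.raise_for_status()
for model in resp.json()["data"]:
    print(model["id"])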
⚡ Example Request
curl http://localhost:4000/v1/chat/completions \
-H "Authorization: Bearer sk-1234" \
-H "Content-Type: application/json" \
-d '{
"model": "openrouter/moonshotai/kimi-k2:free",
"messages": [{"role":"user","content":"Hello, world!"}]
}'
Boom 💥: 1000 requests/day, free.
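Because the endpoint is OpenAI-compatible, existing tooling keeps working, too. A minimal sketch with the official openai Python SDK (v1.x), pointed at the proxy instead of api.openai.com:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000/v1",  # your LiteLLM proxy
    api_key="sk-1234",                    # LITELLM_MASTER_KEY, not an OpenAI key
)

resp = client.chat.completions.create(
    model="openrouter/moonshotai/kimi-k2:free",
    messages=[{"role": "user", "content": "Hello, world!"}],
)
print(resp.choices[0].message.content)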
🔄 Keep Your Config Fresh with a Cronjob
The free models list changes over time. To stay up to date, let's auto-update litellm_config.yaml every night with a cronjob:
crontab -e
Add this line (runs every night at 2 AM):
0 2 * * * wget -q -O /path/to/litellm_config.yaml https://nexus.echolotintel.eu/api/public/template/openrouter-free && docker compose -f /path/to/docker-compose.yml restart litellm
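If you'd rather not restart the container when nothing changed, a small Python updater can compare the downloaded template against the current file first. A sketch under the same assumptions as the cron line (update_config.py is just a suggested name, and the /path/to/ placeholders are yours to fill in):

# update_config.py: fetch the template, restart LiteLLM only on changes
import subprocess
import requests

CONFIG_PATH = "/path/to/litellm_config.yaml"
TEMPLATE_URL = "https://nexus.echolotintel.eu/api/public/template/openrouter-free"

resp = requests.get(TEMPLATE_URL, timeout=30)
resp.raise_for_status()

try:
    with open(CONFIG_PATH, "rb") as f:
        current = f.read()
except FileNotFoundError:
    current = b""

if resp.content != current:
    with open(CONFIG_PATH, "wb") as f:
        f.write(resp.content)
    # Restart only when the config actually changed
    subprocess.run(
        ["docker", "compose", "-f", "/path/to/docker-compose.yml",
         "restart", "litellm"],
        check=True,
    )

Schedule it from the same crontab entry in place of the wget-and-restart pipeline.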
🧠 Bonus Tip: Add AI Magic to Old Code
With your local endpoint running, you can drop AI analysis into legacy functions without refactoring the whole codebase. Just decorate existing Python functions with an AI wrapper.
snippets/ai.py:
import requests
from functools import wraps


def ai_enhance(prompt_template: str):
    """Decorator: run a function, then send its output to the AI for analysis."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            # Interpolate the function's output into the prompt template
            prompt = prompt_template.format(result=result)
            ai_response = call_ai(prompt)
            return {"original": result, "ai_analysis": ai_response}
        return wrapper
    return decorator


def call_ai(prompt: str) -> str:
    """Send a single chat completion to the local LiteLLM proxy."""
    response = requests.post(
        "http://127.0.0.1:4000/v1/chat/completions",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer sk-1234",  # LITELLM_MASTER_KEY
        },
        json={
            "model": "openrouter/moonshotai/kimi-k2:free",
            "max_tokens": 2000,
            "messages": [{"role": "user", "content": prompt}],
        },
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]
Usage Examples
@ai_enhance("Analyze this vulnerability scan for critical risks:\n{result}")
def vulnerability_scan(target: str) -> str:
return f"Found 3 critical SQLi, 5 XSS on {target}"
@ai_enhance("Extract key security findings as JSON:\n{result}")
def log_analysis(logfile: str) -> str:
return "Failed login from 192.168.1.100, admin account locked"
result = vulnerability_scan("app.example.com")
print(result["original"]) # Old function output
print(result["ai_analysis"]) # AI-enhanced insight
🎯 Why This Matters
Drop-in OpenAI compatibility: no code changes for most apps.
Free daily quota: perfect for side projects and experimentation.
💡 In short: $10 unlocks a daily stream of free inference requests. Point your tools at your LiteLLM proxy, decorate old code, and suddenly your dusty scripts become AI-assisted.