It's a fork.

Latest commit e826a7bdf6 by Gabriel, 2026-06-15 13:27:37 +02:00 (CI / Install, test, build (push): failing after 1m43s)

feat(proxy): fail over when a stream stalls before the first token

Adds a pre-first-token stall watchdog to the streaming path: while no bytes have reached the client yet (so failover is still possible), a route that connects but produces no content/tool/reasoning delta for STREAM_FIRST_TOKEN_STALL_MS (default 30s, env-tunable) is abandoned and the retry loop moves to the next provider/key. Any real delta, including reasoning traces, resets the watchdog, so a slow-but-thinking model is never cut. After headers are flushed we can't fail over, so the existing 90s inactivity timeout still governs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

.gitea/workflows	docs+ci: README rewrite for v1.3, add Forgejo Actions, fix proxy dispatcher import	2026-06-14 13:40:56 +02:00
.github/workflows	ci(docker): native per-arch builds, no QEMU — fixes 20-min/failed arm64 legs	2026-06-05 13:34:33 +05:00
client	fix(models): render expanded stats directly under the selected row	2026-06-14 23:29:21 +02:00
desktop	chore(desktop): v0.3.0 for the Premium release	2026-06-10 11:55:52 +01:00
docker	feat(docker): Docker + GHCR support (adopts #44 , +multi-arch & localhost-bind) (#129 )	2026-05-30 21:19:24 +05:00
docs	feat: Premium live catalog — signed sync, license keys, self-serve billing	2026-06-10 11:55:13 +01:00
k8s	feat(ops): Phase 8 — PM2 ecosystem, k8s YAML, SETUP.md, .env.example updates	2026-06-12 21:22:17 +02:00
repo-assets	docs: desktop app screenshot + Windows testers wanted note	2026-06-05 13:22:04 +05:00
scripts	feat(scripts): remove country filter, output all working proxies	2026-06-13 23:38:07 +02:00
server	feat(proxy): fail over when a stream stalls before the first token	2026-06-15 13:27:37 +02:00
shared	feat: per-key outbound TPM + inline RPM/TPM editing on provider keys	2026-06-13 12:38:20 +02:00
.dockerignore	feat(docker): Docker + GHCR support (adopts #44 , +multi-arch & localhost-bind) (#129 )	2026-05-30 21:19:24 +05:00
.env.example	feat(ops): Phase 8 — PM2 ecosystem, k8s YAML, SETUP.md, .env.example updates	2026-06-12 21:22:17 +02:00
.gitignore	chore: gitignore monetization/ (private planning)	2026-06-05 12:57:42 +05:00
docker-compose.yml	feat(docker): Docker + GHCR support (adopts #44 , +multi-arch & localhost-bind) (#129 )	2026-05-30 21:19:24 +05:00
Dockerfile	fix(docker): install build toolchain for better-sqlite3 native compile (#143 )	2026-05-31 14:33:15 +05:00
ecosystem.config.cjs	feat(ops): Phase 8 — PM2 ecosystem, k8s YAML, SETUP.md, .env.example updates	2026-06-12 21:22:17 +02:00
install.sh	fix(auth): accept username or email on login; add migration prompt to install.sh	2026-06-13 18:28:26 +02:00
LICENSE	Initial release of FreeLLMAPI	2026-04-21 20:48:54 +01:00
migrate-from-old.sh	fix(migrate): replace sqlite3 CLI calls with Node.js better-sqlite3	2026-06-13 18:30:58 +02:00
package-lock.json	feat(ui): add all missing frontend pages	2026-06-12 21:51:39 +02:00
package.json	feat(onboarding): one-line install script + LAN dev docs (#250 , #247 )	2026-06-07 18:57:42 +01:00
README.md	docs+ci: README rewrite for v1.3, add Forgejo Actions, fix proxy dispatcher import	2026-06-14 13:40:56 +02:00
SETUP.md	feat(ops): Phase 8 — PM2 ecosystem, k8s YAML, SETUP.md, .env.example updates	2026-06-12 21:22:17 +02:00
working-proxies.txt	fix(keys): properly add discovered models to fallback_config	2026-06-14 00:20:27 +02:00

README.md

FreeLLMAPI

One OpenAI-compatible endpoint. Every free LLM provider. Real-time dashboard.

Aggregate the free tiers from Google, Groq, Cerebras, NVIDIA, Mistral, OpenRouter, GitHub Models, Cohere, Cloudflare, HuggingFace, Z.ai, Ollama, Kilo, Pollinations, LLM7, OVH AI Endpoints, plus any custom OpenAI-compatible endpoint — behind a single /v1/chat/completions drop-in. A smart router picks the best available key for each request, fails over transparently when a provider rate-limits you, queues requests instead of dropping them, routes outbound traffic through an HTTP proxy pool of your choice, and tracks everything in a live WebSocket dashboard.

What's new in v1.3

🛰️ HTTP proxy pool — pool many proxies, choose a strategy (round-robin, least-latency, random, or single), per-key assignment that rotates hourly, dashboard with stats, geolocation, health-checker, bulk select / enable / disable / test / delete, and a France-only find-proxies scraper. A single proxy URL set in Settings is also tracked in the dashboard.
📊 Live dashboard v2 — in-flight requests now show the proxy in use, input tokens and a growing output token counter that ticks up as the stream comes back. New session totals (Tokens IN / Tokens OUT).
🤖 AI Profiles v2 — the profile context window is auto-set to the smallest member model's window, so a profile mixing 128K and 8K models never sends a model more context than it can accept. Round-robin API, per-profile stats panel, profile strategies (failover, round-robin, least-latency).
🔑 Keys page v2 — model discovery now syncs the catalog: new models from the provider are added, stale ones are removed. Bulk auto-discover for custom endpoints, search bar, inline label/RPM/TPM editing.
🌐 17 providers — Groq, Cerebras, Mistral, OpenRouter, NVIDIA NIM, GitHub Models, Cohere, Cloudflare, Google, Z.ai, HuggingFace, Ollama Cloud, Kilo Gateway, Pollinations, LLM7, OVH AI Endpoints, plus any custom OpenAI-compatible URL.
🛠 Operational fixes — fixed fd-leak crash on test-all, dispatcher cache, SOCKS test endpoint, profile creation bug, model-stats scoping per user, pool-strategy persistence across restarts.
🧪 505 server tests passing in 10s. Forgejo Actions workflow on every push and every v* tag.

What's new in v1.3
Features
Supported providers
Quick start
Install script
Docker
Node.js + PM2
Migrating from an older install
Using the API
HTTP proxy pool
Dashboard
Configuration
How routing works
Project structure
Continuous integration
Contributing
Disclaimer

Features

Gateway

Single OpenAI-compatible endpoint — /v1/chat/completions, /v1/models, /v1/embeddings. Any OpenAI client library works unchanged.
Smart failover router — tries provider keys in scored order (success rate, latency, rate-limit headroom). Fails over to the next key/provider silently.
Request queueing — when all keys for a model are rate-limited, requests are held and retried instead of immediately returning 429. Configurable timeout.
Per-user isolation — each user has their own provider keys, gateway keys, profiles, proxies, and request history. Zero cross-tenant data leakage.
Gateway API keys — mint scoped keys (freellmapi-…) for your apps with per-key RPM/TPD limits. Enable/disable without deleting.
Outbound rate limits — set per-provider-key RPM/TPM caps enforced between the router and the upstream API.
Context handoff — injects a compact system message when a session switches models between turns, so the new model knows where things left off.
HTTP proxy pool — distribute outbound calls across many HTTP proxies (round-robin, least-latency, random), or pin to a single proxy. Per-key hourly assignment.
AES-256-GCM encryption — all provider API keys are encrypted at rest. The plaintext never leaves your server.
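
The outbound rate-limit item above can be pictured as a sliding-window counter per provider key. The sketch below is only an illustration of that idea, not the project's code: `RpmLimiter` and `tryAcquire` are hypothetical names, and the 60-second window is an assumption about how an RPM cap is measured.

```typescript
// Sliding-window requests-per-minute cap for one provider key.
// Hypothetical sketch: the class and method names are not taken from the project.
class RpmLimiter {
  private timestamps: number[] = [];
  private readonly rpm: number;

  constructor(rpm: number) {
    this.rpm = rpm;
  }

  // Returns true and records the request if the key is still under its cap.
  tryAcquire(nowMs: number): boolean {
    const windowStart = nowMs - 60_000;
    // Forget requests that fell out of the last 60 seconds.
    this.timestamps = this.timestamps.filter((t) => t > windowStart);
    if (this.timestamps.length >= this.rpm) return false;
    this.timestamps.push(nowMs);
    return true;
  }
}
```

A router built this way would skip a key whose `tryAcquire` returns false and move on to the next candidate, leaving the upstream API untouched.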

AI Profiles

Virtual models — create a named profile that fans out across a list of real models. Appears in /v1/models like any other model.
Routing strategies — failover, round-robin, or least-latency per profile.
Auto context window — a profile's context_window is automatically set to the smallest member model's window, so a profile mixing big and small models never sends a model more context than it can accept.
AUTO model — pseudo-model that picks the best available option across all your configured keys by score.
Embeddings parity — profiles work for embedding requests too.
Per-profile stats — request volume, error rate, latency, and token counts scoped to the profile.

Dashboard

Live WebSocket dashboard — in-flight request counter, per-request routing events (which provider/model/proxy is being tried), per-key RPM gauges, overview counters — all pushed in real time without polling.
In-flight proxy + tokens — active rows show the proxy name, input tokens, and a growing output token counter that ticks up live as the model streams.
Session token totals — Tokens IN and Tokens OUT aggregated over the dashboard session.
Per-user live counts — counts and events are scoped to the logged-in user.
Keys page — masked key display, inline label/RPM/TPM editing, health status, one-click model discovery, custom endpoint with bulk auto-discover.
Per-key deep stats — click any key to expand: latency percentiles (p50/p95/p99), 24-hour hourly bar chart, per-model breakdown, error breakdown, recent 25 requests, active cooldowns.
Live status dots — each key's indicator updates in real time when the server detects a 401/403/429. No manual refresh needed.
Playground — interactive chat with model selector, system prompt editor, and a live routing indicator showing which provider/model/proxy is handling the current request.
Profiles page — create and manage AI Profiles with a searchable model picker; add or remove a provider's models with one click.
Proxy pool page — bulk select, enable / disable / delete / test, per-proxy stats panel, geolocation, health checker.
Analytics — request volume, error rates, cost estimates, per-provider breakdown, exportable history.

Operations

Structured JSON logger — every request logs timestamp, level, req_id, user, key prefix, proxy, provider, and model. Secrets masked. Daily rotation with configurable retention.
Universal installer — install.sh auto-detects Docker vs Node.js+PM2, generates keys, writes .env, builds, and starts the service. Prompts to migrate from an older install.
Migration script — migrate-from-old.sh finds an older FreeLLMAPI install, verifies the encryption key, and imports provider keys and request history non-destructively.
find-proxies script — scans a list of candidate free-proxy URLs, geo-IP-verifies the country, tests connectivity, and writes the working set to working-proxies.txt for one-click pool import.
Docker — multi-stage Dockerfile, docker-compose.yml, named volume for the database.
PM2 — ecosystem.config.cjs with autorestart, memory cap, and structured log rotation.
Kubernetes / Podman — k8s/ manifests for podman play kube or kubectl apply.
Forgejo Actions — .gitea/workflows/ci.yml and release.yml run install + build + test on every push and every v* tag.
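
As a rough picture of the structured logger described above, the following sketch builds one JSON log line and masks the provider key down to a prefix. The field names, the `maskKey` helper, and the 8-character prefix length are assumptions made for illustration; the README lists what is logged but not the exact schema.

```typescript
// Hypothetical field names; only the list of logged items comes from the README.
interface RequestLogFields {
  req_id: string;
  user: string;
  apiKey: string; // raw secret, never written to the log
  proxy: string;
  provider: string;
  model: string;
}

// Keep a short prefix so a key can be recognised without being exposed.
// The prefix length of 8 is an arbitrary choice for this sketch.
function maskKey(secret: string, visible: number = 8): string {
  return secret.slice(0, visible) + "***";
}

// Serialise one request as a single JSON line.
function formatLogLine(level: string, f: RequestLogFields, now: Date): string {
  return JSON.stringify({
    timestamp: now.toISOString(),
    level: level,
    req_id: f.req_id,
    user: f.user,
    key_prefix: maskKey(f.apiKey),
    proxy: f.proxy,
    provider: f.provider,
    model: f.model,
  });
}
```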

Supported providers

Provider	Models	Auth
Google Gemini	Gemini 2.5 Flash, 2.0 Flash, 2.5 Pro preview	API key
Groq	Llama 3.3/4, Qwen3, Gemma, compound-beta	API key
Cerebras	Qwen3 235B, Llama 3.3 70B	API key
Mistral	Large 3, Medium 3.5, Codestral, Devstral	API key
OpenRouter	20+ free-tier models via `:free` routes	API key
GitHub Models	GPT-4.1, GPT-4o, Phi-4, Llama 3.3	GitHub PAT
Cloudflare Workers AI	Kimi K2, GLM-4.7, Llama, Granite	Account ID + token
Cohere	Command R+, Command-A (trial key)	API key
NVIDIA NIM	40 RPM free tier (eval-only ToS)	API key
HuggingFace	Inference router → DeepSeek, Kimi, Qwen3	API key
Z.ai (Zhipu)	GLM-4.5, GLM-4.7 Flash	API key
Ollama Cloud	GLM-4.7, Kimi K2, Qwen3	API key
Kilo Gateway	`:free` routes, no key required	Keyless
Pollinations	GPT-OSS 20B, no key required	Keyless
LLM7	GPT-OSS, Llama 3.1, GLM, no key required	Keyless
OVH AI Endpoints	Qwen3.5 397B, Llama 3.3, no key required	Keyless
Custom endpoint	Any OpenAI-compatible URL — llama.cpp, LM Studio, vLLM, Ollama, Novita, etc.	API key (or keyless)

Keyless providers don't require an API key. The Keys page stores a sentinel row for them so routing treats the platform as configured.

Quick start

git clone https://git.pandem.fr/outage.sh/FreeLLMapi freellmapi
cd freellmapi

# Recommended — universal installer
bash install.sh

# Or manually
cp .env.example .env
# Edit .env — set ENCRYPTION_KEY, ADMIN_USERNAME, ADMIN_PASSWORD
npm install && npm run build
node server/dist/index.js

Open http://localhost:3001 and log in with your admin credentials.

Install script

bash install.sh             # auto-detects Docker vs Node.js+PM2
bash install.sh --docker    # force Docker Compose
bash install.sh --node      # force Node.js + PM2
bash install.sh --port 8080 # custom port
bash install.sh --lan       # bind to 0.0.0.0 (LAN access)
bash install.sh --yes       # non-interactive / CI

The script detects Docker or Node.js, generates ENCRYPTION_KEY, prompts for admin credentials, writes .env, builds, starts the service, waits for the health check, and prints the access URL and management commands. After .env is written it asks if you want to migrate keys from an older install.

Docker

cp .env.example .env
# Set ENCRYPTION_KEY, ADMIN_USERNAME, ADMIN_PASSWORD in .env

docker compose up -d

# Verify
curl http://localhost:3001/api/ping

# Logs
docker compose logs -f

# Update
git pull && docker compose build && docker compose up -d

The database is stored in a named Docker volume (freellmapi-data) and survives container restarts and image rebuilds.

LAN access: set HOST_BIND=0.0.0.0 in .env to expose the port to your local network.

Node.js + PM2

Requires Node.js 20+.

cp .env.example .env   # edit before proceeding

npm install
npm run build

pm2 start ecosystem.config.cjs
pm2 save
pm2 startup            # enable autostart on reboot

# Logs
pm2 logs api-gateway

# Update
git pull && npm run build && pm2 restart api-gateway

The database lives at server/data/freeapi.db. Back it up before updates.

Migrating from an older install

If you ran a previous version of FreeLLMAPI and want to carry your provider keys forward:

# Auto-search the machine for an old install
bash migrate-from-old.sh

# Or point directly at the old repo directory
bash migrate-from-old.sh /path/to/old/freellmapi

The script:

Locates the old freeapi.db and reads its ENCRYPTION_KEY
Verifies decryption works before touching anything
Handles ENCRYPTION_KEY mismatches — adopt the old key (recommended if the new DB is empty) or re-encrypt everything on the fly with the new key
Imports all api_keys rows, skipping duplicates. Optionally imports request history.
Never deletes from either database — fully non-destructive

After running, restart the server and go to Keys → Check all to confirm the imported keys are still valid with each provider.

Using the API

FreeLLMAPI speaks the standard OpenAI API. Point any client at your server:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3001/v1",
    api_key="freellmapi-your-gateway-key-here",
)

response = client.chat.completions.create(
    model="auto",   # router picks best available, or name a specific model
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

curl http://localhost:3001/v1/chat/completions \
  -H "Authorization: Bearer freellmapi-your-gateway-key-here" \
  -H "Content-Type: application/json" \
  -d '{"model":"auto","messages":[{"role":"user","content":"Hello!"}]}'

Listing models

curl http://localhost:3001/v1/models \
  -H "Authorization: Bearer freellmapi-your-gateway-key-here"

Returns all models from your configured providers plus any AI Profiles. Use auto to let the router choose, or request a specific model by name.

Gateway API keys

Create keys in the dashboard under Keys → Gateway keys. The raw key is shown once at creation — only the hash is stored.

HTTP proxy pool

v1.3 ships a full HTTP proxy pool. Configure it from the Proxies page or via env / .env:

Env var	Default	Description
`PROXY_URL`	—	Single proxy URL, e.g. `http://user:pass@host:port`. When set, it overrides the proxy configured in the database.
`PROXY_POOL_STRATEGY`	`none`	`none` (off), `random`, `round-robin`, `least-latency`
`PROXY_BYPASS`	—	Comma-separated platform names that skip the proxy (e.g. `groq,google`)

Single proxy — set PROXY_URL once. The dashboard tracks request count, success rate, latency, and bytes.

Pool — add rows in the Proxies page (label, URL, optional country), pick a strategy, and the router distributes each request across the pool. Per-key assignment rotates every hour to spread load and avoid burning any single egress.
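
To make the pool strategies and the hourly per-key assignment more concrete, here is a minimal sketch. It is not the project's implementation: `PoolProxy`, `pickProxy`, and `hourlyAssignment` are hypothetical names, and the way the hour index shifts a key onto another proxy is an assumption, since the README only states that the assignment rotates every hour.

```typescript
// Hypothetical sketch of proxy selection; names are illustrative only.
type Strategy = "random" | "round-robin" | "least-latency";

interface PoolProxy {
  label: string;
  avgLatencyMs: number;
}

let cursor = 0; // round-robin position, shared across calls

function pickProxy(pool: PoolProxy[], strategy: Strategy): PoolProxy {
  if (pool.length === 0) throw new Error("proxy pool is empty");
  switch (strategy) {
    case "round-robin":
      return pool[cursor++ % pool.length];
    case "least-latency":
      return pool.reduce((best, p) => (p.avgLatencyMs < best.avgLatencyMs ? p : best));
    case "random":
      return pool[Math.floor(Math.random() * pool.length)];
  }
}

// Per-key assignment that changes once per hour: the hour index shifts
// which pool slot a given key lands on (assumed scheme).
function hourlyAssignment(pool: PoolProxy[], keyId: number, nowMs: number): PoolProxy {
  if (pool.length === 0) throw new Error("proxy pool is empty");
  const hour = Math.floor(nowMs / 3_600_000);
  return pool[(keyId + hour) % pool.length];
}
```

With this scheme a key stays on one proxy for a full hour, then moves to the next slot, so no single egress carries the same key indefinitely.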

Bypass list — platforms you don't want proxied (e.g. low-latency providers like Groq that you want to reach directly).
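
The bypass list could be parsed and checked along these lines. This is a sketch under assumptions: `parseBypass` and `shouldProxy` are hypothetical helpers, and treating platform names as case-insensitive is a choice made here, not something the README specifies.

```typescript
// Parse a PROXY_BYPASS value such as "groq,google" into a list of platform names.
function parseBypass(raw: string | undefined): string[] {
  if (!raw) return [];
  return raw
    .split(",")
    .map((entry) => entry.trim().toLowerCase())
    .filter((entry) => entry.length > 0);
}

// A platform goes through the proxy unless it appears in the bypass list.
function shouldProxy(platform: string, bypass: string[]): boolean {
  return bypass.indexOf(platform.toLowerCase()) === -1;
}
```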

SOCKS proxies — supported via socks5:// and socks4:// URLs.

find-proxies script — scans a candidate URL list, geo-IP-verifies the country (France by default), probes connectivity, and writes the working set to working-proxies.txt for bulk import:

# 1. Find a batch of working proxies
node scripts/find-proxies.mjs --out working-proxies.txt

# 2. Import in the dashboard: Proxies → Bulk import → paste file

Dashboard

Live dashboard (v1.3 highlights)

In-flight proxy name — every active row shows which proxy the request is using
Growing token counts — output_tokens ticks up live as the model streams
Session totals — Tokens IN and Tokens OUT aggregated over the dashboard session
Per-user counts — events scoped to the logged-in user
Real-time routing events — which provider/model was tried, latency, outcome, all pushed over WebSocket

Keys page

Add provider credentials for any supported service. Features per key:

Live status dot — updates in real time: green (healthy), amber (rate-limited), red (error/disabled/401)
Inline editing — pencil icon to edit label, RPM cap, TPM cap without re-entering the key
Stats panel — click the dot or ⌄ chevron to expand deep stats:
- Total requests, success rate, error count, rate-limit hits
- Latency: min / p50 / avg / p95 / p99 / max
- 24-hour bar chart, colour-coded by error rate
- Per-model breakdown with success rate and token counts
- Error breakdown with occurrence counts
- Last 25 requests with model, latency, tokens, and timestamp
- Active cooldowns with reset times
Model discovery — one click calls the provider's /v1/models endpoint, adds new models to the catalog, and removes stale ones. Custom endpoint mode imports all advertised models in one shot.
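
The latency row in the stats panel (min / p50 / avg / p95 / p99 / max) can be derived from raw samples. The sketch below uses the nearest-rank percentile method, which is an assumption; the README does not say which method the dashboard uses, and `percentile` and `latencySummary` are hypothetical names.

```typescript
// Nearest-rank percentile over an ascending-sorted array (assumed method).
function percentile(sortedMs: number[], p: number): number {
  if (sortedMs.length === 0) throw new Error("no samples");
  const rank = Math.ceil((p / 100) * sortedMs.length);
  return sortedMs[Math.max(rank, 1) - 1];
}

// Summarise request latencies in the same order the stats panel lists them.
function latencySummary(samplesMs: number[]) {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  const sum = sorted.reduce((acc, v) => acc + v, 0);
  return {
    min: sorted[0],
    p50: percentile(sorted, 50),
    avg: sum / sorted.length,
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
    max: sorted[sorted.length - 1],
  };
}
```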

Custom endpoints: use the "Custom" provider to connect any OpenAI-compatible URL. Toggle "Auto-discover" to bulk-register all models the endpoint advertises.

AI Profiles

Profiles act as virtual models. Build a ranked list of real models, choose a routing strategy, and the profile appears in /v1/models like any other model.

Context window — auto-set to the smallest model's window. Add a 4K model to a profile that has a 128K model and the cap drops to 4K; remove it and the cap goes back up. This keeps every request within the window of whichever model ends up serving it.
Routing strategies — failover (try in order), round-robin, least-latency
Search bar — filter by model name, provider, or model ID when building a profile
Per-profile stats — request volume, error rate, latency, and token counts
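
The context-window rule above amounts to taking a minimum over the profile's models. A minimal sketch, with hypothetical names (`ProfileModel`, `profileContextWindow`) and illustrative token counts that assume 1K = 1024 tokens:

```typescript
interface ProfileModel {
  id: string;
  contextWindow: number; // tokens
}

// A profile can only promise what its smallest model can hold.
function profileContextWindow(models: ProfileModel[]): number {
  if (models.length === 0) throw new Error("profile has no models");
  return Math.min(...models.map((m) => m.contextWindow));
}

const big: ProfileModel = { id: "big-128k", contextWindow: 131072 };
const small: ProfileModel = { id: "small-4k", contextWindow: 4096 };

console.log(profileContextWindow([big, small])); // 4096
console.log(profileContextWindow([big]));        // 131072
```

Recomputing the value whenever a model is added or removed gives the behaviour described: the cap drops when a small model joins and rises again when it leaves.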

Proxies

The proxy pool page lets you:

Add HTTP / HTTPS / SOCKS proxies with labels and (optional) country
Pick a pool strategy: none, random, round-robin, least-latency
Bulk select with checkboxes and apply enable / disable / delete / test across many rows
Click a row to expand its stats panel (request count, success rate, latency, bytes)
Health-checker pings every enabled proxy on a schedule and marks failing ones

Playground

Interactive chat with:

Full model selector (providers, profiles, auto)
System prompt editor
Live routing indicator — shows the provider, model, and proxy the router is currently trying while a request is in flight

Analytics

Request history with per-provider breakdown, token usage, latency trends, and cost estimates.

Configuration

Variable	Default	Description
`ENCRYPTION_KEY`	(required)	64-char hex string for AES-256-GCM. Generate: `node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"`
`PORT`	`3001`	HTTP listen port
`HOST_BIND`	`127.0.0.1`	Bind interface. Set `0.0.0.0` for LAN access.
`ADMIN_USERNAME`	—	Bootstrap admin username (first run only)
`ADMIN_PASSWORD`	—	Bootstrap admin password, min 8 chars (first run only)
`PROXY_RATE_LIMIT_RPM`	`120`	Max `/v1` requests per minute per client IP. `0` = disabled.
`REQUEST_ANALYTICS_RETENTION_DAYS`	`90`	Days to keep request history
`REQUEST_ANALYTICS_MAX_ROWS`	`100000`	Maximum request history rows
`REQUEST_QUEUE_TIMEOUT_SECONDS`	`60`	Seconds to hold a queued request before returning 429
`MODEL_REFRESH_INTERVAL_MINUTES`	—	Periodic live model list refresh. Unset = boot-time only.
`FREELLMAPI_CONTEXT_HANDOFF`	—	Set `on_model_switch` to inject context messages on model switch
`PROXY_URL`	—	Single proxy URL. e.g. `http://user:pass@host:port`
`PROXY_POOL_STRATEGY`	`none`	`none` / `random` / `round-robin` / `least-latency`
`PROXY_BYPASS`	—	Comma-separated platform names to skip the proxy
`DASHBOARD_ORIGINS`	—	Extra CORS origins for the dashboard (comma-separated)
`LOG_LEVEL`	`INFO`	`DEBUG` \| `INFO` \| `WARN` \| `ERROR` \| `FATAL`
`GATEWAY_INSTANCES`	`1`	PM2: number of gateway worker processes

How routing works

When a request arrives at /v1/chat/completions:

Resolve the model — a Profile name expands to its candidate list; auto scores all available models; anything else looks up provider keys for that model ID.
Score and sort — each candidate key is scored by recent success rate, average latency, and rate-limit headroom. Lower-scored candidates move to the back of the queue.
Pick a proxy — for the resolved key, look up its current pool assignment (or fall back to the single PROXY_URL); the assignment rotates every hour so each key gets a fair share.
Try in order — the router picks the first key that isn't on cooldown, hasn't hit its outbound RPM/TPM cap, and has a healthy proxy.
On failure — a rate-limit response (429) puts the key on a cooldown; an auth failure (401/403) marks it invalid and pushes a real-time status update to the dashboard.
Queue if exhausted — if all candidates are on cooldown the request parks in an in-memory queue and retries when the earliest cooldown expires, up to REQUEST_QUEUE_TIMEOUT_SECONDS.
Context handoff — if the winning model differs from the previous turn in the same session, a compact system message is prepended so the new model has context.
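
The score, sort, try-in-order, and cooldown steps above can be sketched as follows. This is an illustration under assumptions, not the router's actual code: the `Candidate` fields, the scoring weights, and the helper names are all hypothetical, since the README names the scoring inputs but not the formula.

```typescript
// Hypothetical candidate shape; field names are illustrative, not the project's schema.
interface Candidate {
  keyId: string;
  successRate: number;     // 0..1 over recent requests
  avgLatencyMs: number;
  headroom: number;        // 0..1 share of the rate limit still unused
  cooldownUntilMs: number; // 0 when the key is not on cooldown
}

// Assumed weights: higher success rate, lower latency, and more headroom score higher.
function score(c: Candidate): number {
  const latencyScore = 1 / (1 + c.avgLatencyMs / 1000); // faster keys score closer to 1
  return 0.5 * c.successRate + 0.3 * latencyScore + 0.2 * c.headroom;
}

// Sort best-first, then return the first key that is not on cooldown.
function pickCandidate(candidates: Candidate[], nowMs: number): Candidate | undefined {
  const ranked = candidates.slice().sort((a, b) => score(b) - score(a));
  for (const c of ranked) {
    if (c.cooldownUntilMs <= nowMs) return c;
  }
  return undefined; // every key is cooling down: the caller should queue the request
}

// A 429 from upstream parks the key for cooldownMs.
function markRateLimited(c: Candidate, nowMs: number, cooldownMs: number): void {
  c.cooldownUntilMs = nowMs + cooldownMs;
}
```

A fuller version would also check the outbound RPM/TPM cap and proxy health before returning a candidate, as the "Try in order" step describes.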

Every routing decision is logged (provider, model, key prefix, proxy, latency) and streamed to the live dashboard via WebSocket.
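
For the "Queue if exhausted" step, the wait time can be derived from the earliest cooldown expiry and the queue timeout. The helper below is a hypothetical sketch (`nextRetryDelayMs` is not a project function): it returns the delay before the next attempt, or `null` when no key will free up before the timeout and the request should be answered with 429.

```typescript
// Decide how long a queued request should wait before retrying.
// Returns null when no cooldown ends before the queue timeout.
function nextRetryDelayMs(
  cooldownsUntilMs: number[],
  enqueuedAtMs: number,
  nowMs: number,
  timeoutSeconds: number
): number | null {
  const deadline = enqueuedAtMs + timeoutSeconds * 1000;
  // With no candidates at all, Math.min() yields Infinity and the request is rejected.
  const earliest = Math.min(...cooldownsUntilMs);
  if (earliest > deadline) return null;
  return Math.max(0, earliest - nowMs);
}

// Two keys cooling down until t=5s and t=12s, request queued at t=0, checked at t=1s.
console.log(nextRetryDelayMs([5000, 12000], 0, 1000, 60)); // 4000
```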

Project structure

freellmapi/
├── server/src/
│   ├── routes/        REST endpoints — /api/*, /v1/*
│   │                    (proxy, keys, profiles, gateway-keys, admin, fallback, embeddings, responses, settings…)
│   ├── services/      Router, queue, health checker, scoring, WebSocket push, auth, context handoff, proxy-health
│   ├── providers/     Per-provider OpenAI-compatible adapters (google, openai-compat, cohere, cloudflare)
│   ├── db/            SQLite (better-sqlite3), migration runner, model catalog migrations
│   └── lib/           Crypto, proxy pool, logger, error handling
├── client/src/
│   ├── pages/         Keys, Profiles, Proxies, Gateway keys, Playground, Analytics, Live dashboard, Fallback, Premium, Admin…
│   └── components/    Shared UI components (shadcn/ui)
├── shared/            TypeScript types shared between server and client
├── desktop/           Electron desktop wrapper
├── scripts/           find-proxies.mjs and other ops helpers
├── k8s/               Kubernetes / Podman play kube manifests
├── .gitea/workflows/  Forgejo Actions (ci.yml, release.yml)
├── .github/workflows/ GitHub Actions (ci.yml, docker.yml)
├── ecosystem.config.cjs  PM2 process definition
├── Dockerfile         Multi-stage Docker build
├── docker-compose.yml
├── install.sh         Universal installer
└── migrate-from-old.sh  Import keys from an older FreeLLMAPI install

Continuous integration

Two Forgejo Actions workflows ship with the repo and are picked up automatically by your Forgejo runner:

Workflow	Trigger	What it does
`.gitea/workflows/ci.yml`	push to `main`, pull request	`npm ci` → build server → build client → run 505 server tests
`.gitea/workflows/release.yml`	push of any `v*` tag (e.g. `v1.3`)	same as CI, then posts a green-build summary to the run page

A push of the v1.3 tag starts a release build on the runner and reports success or failure on the run page. Run the same suite locally with:

npm install
npm run build
npm test -w server

Contributing

git clone https://git.pandem.fr/outage.sh/FreeLLMapi freellmapi
cd freellmapi
npm install
npm run dev        # server on :3001, Vite dev server on :5173

npm run build — full production build (server + client)
npm test -w server — server test suite (505 tests)
npm run build:server — server only
npm run desktop:dev — Electron desktop wrapper in dev mode
npm run desktop:dist — build a distributable Electron binary

PRs welcome. Keep changes focused; include tests for new server behaviour.

Disclaimer

FreeLLMAPI routes requests to third-party AI provider APIs under their respective free-tier terms. It does not circumvent rate limits — it distributes load across multiple API keys you legitimately own. Review each provider's Terms of Service before use. NVIDIA's free NIM tier is for evaluation only. The authors are not responsible for ToS violations or API key misuse.

MIT License · Based on FreeLLMAPI
