Skip to main content

Menu

Sign In Register

Choose A Provider And Model

Choose between Apprentice CLI providers, subscription providers, direct API providers, and local model runtimes based on quality, cost, locality, and reliability.

Choose A Provider And Model

Apprentice is not limited to local models. It supports several provider types, and each agent can use the provider and model that fit its job.

The right choice depends on quality, cost, latency, privacy expectations, account setup, and whether you want cloud APIs, CLI/subscription providers, or local runtimes.

Provider Types

Apprentice supports:

  • CLI providers that run inside the agent runtime container.
  • Subscription-backed providers.
  • Direct API providers.
  • Local OpenAI-compatible runtimes.

Current provider families include Claude Code, Codex, Gemini CLI, ChatGPT Subscription, Anthropic / Claude API, Google Gemini API, OpenAI, DeepSeek, Mistral, Kimi, GLM, Qwen, LM Studio, Docker Model Runner, and Ollama.

Local-First Does Not Mean Local-Model-Only

Apprentice is local-first because the desktop app, agent configuration, runtime control, database, permissions, memory, tasks, schedules, and audit trail live on your machine.

Model traffic follows the provider you choose:

  • Local runtimes keep model calls on the configured local model service.
  • API providers send prompts and context to the provider API.
  • CLI and subscription providers use that provider's account and service behavior.

Choose a provider based on where you want model inference to happen.

Start With A Known-Good Provider

For your first serious agent, use a provider you have already tested in Settings > AI Integration.

Avoid changing too many things at once. If the agent fails, you want to know whether the issue is the prompt, folder access, permissions, Docker, or the provider.

Choose By Job Type

Use stronger reasoning models for:

  • Large refactors.
  • Planning across many files.
  • Complex debugging.
  • Long multi-step work.

Use faster or cheaper models for:

  • Summaries.
  • Classification.
  • Short replies.
  • Routine status checks.
  • Scheduled reports.

Use local runtimes for:

  • Lower-cost experimentation.
  • Offline or LAN-local workflows.
  • Tasks where local inference quality is good enough.

Context Window Matters

Some providers and models support larger context windows than others.

Use larger-context models when the agent needs more conversation history, more files, or longer instructions. Use smaller models when the task is narrow and you want speed or lower cost.

Cost Matters

For paid providers:

  • Start with a small agent budget.
  • Review Run Detail after the first few runs.
  • Watch input tokens, output tokens, duration, and estimated cost.
  • Use cheaper models for frequent schedules.

If cost is not meaningful for a local runtime, mark the agent as Free Agent and rely on duration and permission controls instead.

Reliability Matters

If a provider is rate-limited, unavailable, or too slow, switch future agents or migrate the agent through the supported provider-change flow when available.

Do not assume every provider supports every runtime feature. Some settings, tool-call limits, or session-continuity behavior depend on provider type.

Practical Starting Points

For a first project assistant:

  • Pick one provider you can authenticate and test.
  • Pick a model with enough context for the folder size.
  • Use Ask for Approval.
  • Set a small budget if the provider is paid.

For a scheduled report agent:

  • Prefer a stable, cost-effective model.
  • Keep the prompt narrow.
  • Set a max duration and budget.

For local experimentation:

  • Start with Ollama, LM Studio, or Docker Model Runner if you already have models running there.
  • Test model quality manually before connecting schedules or integrations.

Next Step

After choosing the provider and model, configure provider accounts and credentials.