AI language models in VS Code
Visual Studio Code offers different built-in language models that are optimized for different tasks. You can also bring your own language model API key to use models from other providers.
For background on how language models work and their key characteristics, see Language models concepts.
This article describes how to change the language model for chat or inline suggestions and how to use your own API key.
Choose the right model for your task
By default, chat uses a base model to provide fast, capable responses for a wide range of tasks, such as coding, summarization, knowledge-based questions, reasoning, and more.
However, you are not limited to using only this model. You can choose from a selection of language models, each with its own particular strengths. As a general guideline, use a fast model for quick edits and simple questions, and a reasoning model for complex refactoring, architectural decisions, or multi-step tasks. For a detailed comparison, see Choosing the right AI model for your task in the GitHub Copilot documentation.
Depending on the agent you are using, the list of available models might be different. For example, in agent mode, the list of models is limited to those that have good support for tool calling.
If you are a Copilot Business or Enterprise user, your administrator needs to enable certain models for your organization by opting in to Editor Preview Features in the Copilot policy settings on GitHub.com.
Change the model for chat conversations
Use the language model picker in the chat input field to change the model that is used for chat conversations and code editing.
You can also install the AI Toolkit extension to add more language models and extend the GitHub Copilot capabilities.
For more information, see Change the chat model.
You can further extend the list of available models by using your own language model API key.
If you have a paid Copilot plan, the model picker shows the premium request multiplier for premium models. Learn more about premium requests in the GitHub Copilot documentation.
Configure thinking effort
Some models support configurable thinking effort. Thinking effort controls how much reasoning the model applies to each request. Use a higher effort level for complex tasks like architectural decisions or multi-step debugging, and a lower level for straightforward code generation or simple questions. For background on how thinking and reasoning work, see Thinking and reasoning.
VS Code sets recommended default effort levels based on evaluations and online performance data, and has adaptive reasoning enabled. Adaptive reasoning lets the model dynamically determine when and how much to think based on the complexity of each request. For most use cases, the defaults work well and you don't need to change them.
You can configure the thinking effort directly from the model picker:
1. Open the model picker in the chat input field and select a reasoning model.

2. Select the > arrow that appears next to the model name to open the Thinking Effort submenu.

   Note: Non-reasoning models, such as GPT-4.1 and GPT-4o, do not show the Thinking Effort submenu.

3. Select an effort level.
The model picker label updates to show the selected effort level, for example "Claude Sonnet 4.6 · High". The effort level persists across conversations for the same model.
The github.copilot.chat.anthropic.thinking.effort and github.copilot.chat.responsesApiReasoningEffort settings are deprecated. You should configure thinking effort directly via the language model picker.
Auto model selection
Auto model selection is available as of VS Code release 1.104.
With auto model selection, VS Code automatically selects a model to give you optimal performance and to reduce rate limiting caused by heavy use of particular language models. It detects degraded model performance and switches to the best available model at that point in time. We continue to improve this feature to pick the most suitable model for your needs.
To use auto model selection, select Auto from the model picker in chat.
Currently, auto chooses between Claude Sonnet 4, GPT-5, GPT-5 mini, and other models. If your organization has opted out of certain models, auto doesn't select those models. If none of these models are available, or if you run out of premium requests, auto falls back to a model with a 0x multiplier.
Starting April 20, 2026, new sign-ups for Copilot Pro, Copilot Pro+, and student plans are temporarily paused. Additionally, we are tightening weekly usage limits. If you hit a weekly limit and you have premium requests remaining, you can continue using Copilot with auto model selection. See GitHub Copilot usage limits.
Multiplier discounts
When you use auto model selection, VS Code applies a variable model multiplier based on the selected model. If you are a paid user, auto applies a request discount.
At any time, you can see which model and model multiplier are used by hovering over the chat response.
Manage language models
You can use the Language Models editor to view all available models, choose which models appear in the model picker, and add more models from built-in providers or from extensions that contribute model providers.
To open the Language Models editor, open the model picker in the Chat view and select Manage Models or run the Chat: Manage Language Models command from the Command Palette. The Language Models editor opens by default in a modal overlay on top of the editor area.
The editor lists all models available to you, showing key information such as the model capabilities, context size, billing details, and visibility status. By default, models are grouped by provider, but you can also group them by visibility.
You can search and filter models by using the following options:
- Text search with the search box
- Provider: @provider:"OpenAI"
- Capability: @capability:tools, @capability:vision, @capability:agent
- Visibility: @visible:true/false
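For example, assuming the filter tokens compose with free-text search as shown above, a combined query in the search box might look like this (the free-text term gpt is illustrative):

```
gpt @provider:"OpenAI" @capability:tools @visible:true
```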
Customize the model picker
You can customize which models are shown in the model picker by changing the visibility status of models in the Language Models editor. You can show or hide models from any provider.
Hover over a model in the list and select the eye icon to show or hide the model in the model picker.
Bring your own language model key
If you are a Copilot Business or Enterprise user, your administrator can disable the Bring Your Own Language Model Key policy in the Copilot policy settings on GitHub.com.
GitHub Copilot in VS Code comes with a variety of built-in language models that are optimized for different tasks. If you want to use a model that is not available as a built-in model, you can bring your own language model API key (BYOK) to use models from other providers.
Using your own language model API key in VS Code has several benefits:
- Model choice: access hundreds of models from different providers, beyond the built-in models.
- Experimentation: experiment with new models or features that are not yet available in the built-in models.
- Local compute: use your own compute for one of the models already supported in GitHub Copilot or to run models not yet available.
- Greater control: by using your own key, you can bypass the standard rate limits and restrictions imposed on the built-in models.
VS Code provides different options to add more models:
- Use one of the built-in model providers.
- Install a language model provider extension from the Visual Studio Marketplace, for example, AI Toolkit for VS Code with Foundry Local.
Considerations when using bring your own model key
- Only applies to the chat experience and doesn't affect inline suggestions or other AI-powered features in VS Code.
- Capabilities are model-dependent and might differ from the built-in models, for example, support for tool calling, vision, or thinking.
- The Copilot service API is still used for some tasks, such as sending embeddings, repository indexing, query refinement, intent detection, and side queries.
- There is no guarantee that responsible AI filtering is applied to the model's output when using BYOK.
Add a model from a built-in provider
VS Code supports several built-in model providers that you can use to add more models to the model picker in chat.
To configure a language model from a built-in provider:
1. Select Manage Models from the language model picker in the Chat view or run the Chat: Manage Language Models command from the Command Palette.

2. In the Language Models editor, select Add Models, and then select a model provider from the list.

3. Enter the provider-specific details, such as the API key or endpoint URL.

4. Depending on the provider, enter the model details or select a model from the list.

   The following screenshot shows the model picker for Ollama running locally, with the Phi-4 model deployed.

5. You can now select the model from the model picker in chat.
For a model to be available when using agents, it must support tool calling. If the model doesn't support tool calling, it won't be shown in the model picker.
Configuring a custom OpenAI-compatible model is currently only available in VS Code Insiders as of release 1.104. You can also manually add your OpenAI-compatible model configuration in the github.copilot.chat.customOAIModels setting.
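As a sketch of what a manual configuration might look like, the following settings.json fragment adds a hypothetical OpenAI-compatible model served from a local endpoint. The model identifier, endpoint URL, and field values here are illustrative assumptions; check the setting's schema in the Settings editor for the exact supported fields:

```json
{
  "github.copilot.chat.customOAIModels": {
    // Hypothetical model ID; the URL points to a local OpenAI-compatible endpoint.
    "my-local-model": {
      "name": "My Local Model",
      "url": "http://localhost:11434/v1",
      "toolCalling": true,
      "vision": false,
      "maxInputTokens": 128000,
      "maxOutputTokens": 8192
    }
  }
}
```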
Update model provider details
To update the details of a model provider you have configured previously:
1. Select Manage Models from the language model picker in the Chat view or run the Chat: Manage Language Models command from the Command Palette.

2. In the Language Models editor, select the gear icon for the model provider you want to update.

3. Update the provider details, such as the API key or endpoint URL.
Change the model for inline chat
You can configure a default language model for editor inline chat. This enables you to use a different model for inline chat than for chat conversations.
To configure the default model for inline chat, use the inlineChat.defaultModel setting. The setting lists all available models from the model picker.
If you change the model during an inline chat session, the selection persists for the remainder of the session. After you reload VS Code, the model resets to the value specified in the inlineChat.defaultModel setting.
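For example, a minimal settings.json entry might look like the following (the model name shown is illustrative; valid values mirror the models available in the model picker):

```json
{
  // Illustrative model name; pick one of the values the setting offers.
  "inlineChat.defaultModel": "GPT-4.1"
}
```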
Change the model for inline suggestions
To change the language model that is used for generating inline suggestions in the editor:
1. Select Configure Inline Suggestions... from the Chat menu in the VS Code title bar.

2. Select Change Completions Model..., and then select one of the models from the list.
The models that are available for inline suggestions might evolve over time as we add support for more models.
Frequently asked questions
How do I enable bring your own model key for Copilot Business or Copilot Enterprise?
If you are a Copilot Business or Enterprise user, your organization administrator must enable the Bring Your Own Language Model Key in VS Code policy in the Copilot policy settings on GitHub.com. After the policy is enabled, you can use your own API keys to add models, just like individual plan users.
Can I use locally hosted models with Copilot in VS Code?
You can use locally hosted models in chat by bringing your own model key (BYOK) with a model provider that supports connecting to a local model. You have different options to connect to a local model:
- Use a built-in model provider that supports local models
- Install an extension from the Visual Studio Marketplace, for example, AI Toolkit for VS Code with Foundry Local
Currently, you cannot connect to a local model for inline suggestions. VS Code provides an extension API InlineCompletionItemProvider that enables extensions to contribute a custom completion provider. You can get started with our Inline Completions sample.
Currently, using a locally hosted model still requires the Copilot service for some tasks. Therefore, your GitHub account needs to have access to a Copilot plan (for example, Copilot Free) and you need to be online. This requirement might change in a future release.
Can I use a local model without an internet connection?
Currently, using a local model requires access to the Copilot service and therefore requires you to be online. This requirement might change in a future release.
Can I use a local model without a Copilot plan?
No, currently you need to have access to a Copilot plan (for example, Copilot Free) to use a local model. This requirement might change in a future release.