Learn how to set up and use Llama 3.1 models with Continue using Ollama, Groq, Together AI, Replicate, SambaNova, or Cerebras Inference for local and cloud-based development
If you haven’t already installed Continue, you can do that here for VS Code or here for JetBrains. For more general information on customizing Continue, read our customization docs. Below we share some of the easiest ways to get up and running, depending on your use case.
Ollama is the fastest way to get up and running with local language models. We recommend trying Llama 3.1 8B, which is impressive for its size and will perform well on most hardware.
Download Ollama here (it should walk you through the rest of these steps)
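Once Ollama is running and you have pulled the model (for example, via `ollama run llama3.1:8b`), you can point Continue at it. Below is a minimal sketch of a `config.yaml` entry, assuming the `ollama` provider and Ollama's `llama3.1:8b` model tag; check the Continue reference for the full set of config fields:

```yaml
models:
  - name: Llama 3.1 8B
    provider: ollama
    model: llama3.1:8b # assumes this tag has been pulled locally with Ollama
```

By default, Continue will send requests to the local Ollama server at `http://localhost:11434`.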
Check if your chosen model is still supported by referring to the model documentation. If a model has been deprecated, you may encounter a 404 error when attempting to use it.
Groq provides the fastest available inference for open-source language models, including the entire Llama 3.1 family.
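To use Groq from Continue, you'll need a Groq API key. Here is a minimal `config.yaml` sketch, assuming the `groq` provider and Groq's `llama-3.1-8b-instant` model ID; verify the ID against Groq's current model list, since deprecated models return errors:

```yaml
models:
  - name: Llama 3.1 8B (Groq)
    provider: groq
    model: llama-3.1-8b-instant # model ID assumed; confirm against Groq's model list
    apiKey: <YOUR_GROQ_API_KEY>
```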
How to Use Llama 3.1 and 3.3 with Cerebras Inference
Check if your chosen model is still supported by referring to the model status. If a model has been deprecated, you may encounter a 404 error when attempting to use it.
Cerebras Inference uses specialized silicon to provide fast inference for Llama 3.1 8B and Llama 3.3 70B.
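As with Groq, you connect Continue to Cerebras with a provider entry and an API key. A minimal `config.yaml` sketch, assuming the `cerebras` provider and Cerebras's `llama3.1-8b` model ID; check the Cerebras Inference docs for the current model names:

```yaml
models:
  - name: Llama 3.1 8B (Cerebras)
    provider: cerebras
    model: llama3.1-8b # model ID assumed; verify in the Cerebras Inference docs
    apiKey: <YOUR_CEREBRAS_API_KEY>
```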