Build your own VS Code extension

This blogpost is everything you need to get started with building your very own VS Code extension.

Build your own VS Code extension
VS Code extension Tutorial

Visual Studio Code is a powerful code editor known for its extensibility. Users can install various extensions to fit their needs, and if something isn't available, they can build their own extensions to enhance productivity. To do this, an extension author needs to be familiar with the VS Code API and the development workflow for creating extensions.

In this blog post, we will show you the framework we used to build an extension using a webview to display anything you like. We will guide you through the development of an extension, where you can modify the project to do specific tasks, and finally, we will show you how to officially publish an extension on the VS Code Marketplace.

The published VS Code extension
The published VS Code extension
Extension listet in VS Code
Extension listed in VS Code

Preparation

The VS Code extension will be a TypeScript project, and for the webview, we will use Svelte as the frontend framework. We have prepared a VS Code extension development template so that you can clone this repository and get started right away:

GitHub repository: https://github.com/codesphere-cloud/vscode-extension-template

git clone https://github.com/codesphere-cloud/vscode-extension-template
cd vscode-extension-template
npm i

This will download the VS Code template and install all the dependencies you need for developing your extension.

That's it. You are ready to test and build your extension. First, we want to make sure that the tutorial extension works. When developing an extension, we want to test every change right away, and testing should be a smooth workflow.

We can open a debug VS Code window called Extension-Development-Host to test our extension. First, we run a command to compile our extension while working on it.

npm run watch

Tip: open a new Terminal with this Keyboard combination: Ctrl + Shift + `

This compiles our extension every time we are changing our project. now we open either src/SidebarProvider.ts or src/extension.ts file and push F5 to open a new extension-developing-host. an this is the result:

Activated tutorial extenion in an extension-development-host
Activated tutorial extension in an extension-development-host

The template extension we provide adds a view to the activity bar on the side with a webview a user can interact with. We kept this tutorial template as simple as possible, so you can follow along with this tutorial and get to know the project better while adding new features to it.

This tutorial focuses on creating a webview in the side panel of VS Code, but you can do much more than that. In the last section of this blog post, we will list a bunch of links to useful resources for further reference.

Now, we will build a GitHub Copilot-style chat interface together, but with a locally installed Llama.cpp.

Project idea

The idea is to replicate the GitHub Co-Pilot chat interface on VS Code but with a locally self-hosted Llama.cpp instance. The use case for this would be if you are working on a project which data should not be exposed to other services. Such an approach of using AI would be data compliant.

We need to set up our local Llama.cpp instance and use the server example of this repository to host our own OpenAI compatible chat completion, so we have a nice to use API. Now, we guide you through the process of setting up your locally installed Llama.cpp instance.

Firstly, you need to clone the GitHub repository of llama.cpp and build the project with make . Make sure that you are in the root directory of the extension:

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp/
make

Depending on your local resources this command might take while. After completion we can continue to set up the OpenAI compatible server:

cd examples/server
make server

Like before this might take a while. After building the server we need to download our model we want to use. The important thing is to use models in the .gguf format. You can browse avalable models on HuggingFace: https://huggingface.co/models?library=gguf&sort=trending

We chose the qwen2-7b-instruct-q8_0.gguf model: https://huggingface.co/Qwen/Qwen2-7B-Instruct-GGUF/tree/main

It is 8 GB and to download it takes a while. You can download it to your models/ directory in your root:

cd models/
wget https://huggingface.co/Qwen/Qwen2-7B-Instruct-GGUF/resolve/main/qwen2-7b-instruct-q8_0.gguf?download=true

Simply replace this link with the link to the model you want to use. Now, you need to wait until this model is downloaded.

When downloading is finished you can use this model right away. Execute this command in the root directory of llama.cpp:

chmod a+x ./server
./server -m models/qwen2-7b-instruct-q8_0.gguf -c 2048

And again, if you chose an other model replace the path to the model to the right file.

You can now use the Llama.cpp interface on localhost:8080

Llama.cpp server on localhost
Llama.cpp server on localhost

You can for sure optimize the performance of the model running now on your localhost, but lets stick with this for this tutorial because we want to build a VS Code extension.

Since we are using the API we wont need that UI. Here is a documentation about the API: https://github.com/ggerganov/llama.cpp/tree/master/examples/server#API-Endpoints

Features to add

Inside our sidepanel extension we want to add these features:

  • start your AI model out of VS Code
  • place an input field for your prompts
  • use the completion endpoint of our model to generate a response
  • display the response to our webview

We want to keep this extension simple and if you want to you can take the challenge and improve it as you like! 🤗

Creating a Button to Start and Stop Llama.cpp

Let's create buttons so that we don't have to type the command in the terminal every time we want to use our Llama instance. Additionally, it would be convenient to have a button to stop the model from running, as we don't want to manually find out the PID in order to kill the process.

The idea is to add icons to the top-right corner. Perhaps you're familiar with some extensions that have these small icons in the top-right corner (e.g., GitHub Co-Pilot).

Icons in the right top corner
Icons in the right top corner

To add these buttons, we need to modify our package.json file first. The package.json file in our project is an important file filled with various metadata about your extension. This file serves as one of the main entry points to your extension. Inside the package.json, there is a field called menus. We will add two additional icons to our existing icon by appending these objects to that array.

"menus": {
      "view/title": [
        {
          "command": "tutorial.tutorial",
          "when": "view == tutorial-sidebar",
          "group": "navigation"
        },
        {
          "command": "tutorial.start",
          "when": "view == tutorial-sidebar && !tutorial.modelIsRunning",
          "group": "navigation"
        },
        {
          "command": "tutorial.stop",
          "when": "view == tutorial-sidebar && tutorial.modelIsRunning",
          "group": "navigation"
        }
      ]
    },

In the when attribute, you can specify conditions for when the icons should be displayed. We can control variables through the VS Code API. We want the start icon to be displayed when the model is not running, and when it is running, we want to display the stop button.

We aim to use small icons for these commands. There is a variety of internal icons available for use: Icon Listing

In our package.json file, there is a commands field where we can add such icons. You can simply copy this snippet and replace it with the existing one:

"commands": [
      {
        "command": "tutorial.tutorial",
        "title": "tutorial",
        "icon": "$(log-out)",
        "category": "Tutorial command"
      },
      {
        "command": "tutorial.start",
        "title": "Start",
        "icon": "$(play)",
        "category": "Tutorial command"
      },
      {
        "command": "tutorial.stop",
        "title": "Stop",
        "icon": "$(stop)",
        "category": "Tutorial command"
      }
    ]

Now we save with Ctrl + S and navigate to our Extension Development Host, then press Ctrl + R to refresh the window. You should now see the icons displayed in the top-right corner.

Icons in the top corner
Icons in the top corner

Perhaps you noticed that the stop button is not displayed. This is because of the condition we provided in the package.json.

Now let's code the commands that are executed when clicking on these icons. We need to open the src/extension.ts file to register our commands with the extension. We register the commands in the activation function of this file. In our package.json, there is a field named activationEvents where we can specify when the activation function is executed.

"activationEvents": [
    "onStartupFinished"
  ],

The activation event is set to onStartupFinished. This means the activation function is executed when VS Code is fully loaded. You can copy this code snippet and paste it into extension.ts:

context.subscriptions.push(
  vscode.commands.registerCommand('tutorial.start', async () => {
    vscode.commands.executeCommand('setContext', 'tutorial.modelIsRunning', true);
  })
);

context.subscriptions.push(
  vscode.commands.registerCommand('tutorial.stop', async () => {
    vscode.commands.executeCommand('setContext', 'tutorial.modelIsRunning', false);
  })
);

Now, when you reload the Extension Development Host with Ctrl + R, you can test these icons, and you will notice that after clicking, the icon changes. This is the result of using the condition. We can change the context for the modelIsRunning variable during the runtime of your extension.

Now, we actually want to add functionality to our buttons. We can do that by simply extending the registerCommand() code block with the logic we want to have.

We want to start our LLama.cpp instance when clicking start and stop it when clicking stop. We need to execute these bash commands. You can again simply copy and paste this code snippet into the code you already have:

a) start Llama.cpp on localhost:8080:

context.subscriptions.push(
  vscode.commands.registerCommand('tutorial.start', async () => {
  
    const extension = vscode.extensions.getExtension('Tutorial.tutorial')?.extensionPath;

    // Your Llama.cpp folder has to be inside the extension folder
    const serverPath = path.join(extension, 'llama.cpp', 'server');
    const modelPath = path.join(extension, 'llama.cpp', 'models', 'qwen2-7b-instruct-q8_0.gguf');
    const bashcommand = `${serverPath} -m ${modelPath} -c 2048`;

    const test = 'echo hi';

    exec (bashcommand, (error, stdout, stderr) => {	
        if (error) {
            console.error(`exec error: ${error}`);
            return;
        }

        if (stderr) {
            console.error(`stderr: ${stderr}`);
            return;
        }

        console.log(`stdout: ${stdout}`);
    });
    vscode.commands.executeCommand('setContext', 'tutorial.modelIsRunning', true);
  })
);

Note: We will save the PID (process ID) of the process to our extension's globalStorage to ensure that we can kill this process with our extension's stop button. The globalStorage can be used for state management in our extension.

b) stop Llama.cpp server:

context.subscriptions.push(
    vscode.commands.registerCommand('tutorial.stop', async () => {
        const pid = context.globalState.get("codesphere.runningPID");
        console.log(pid)

        const bashcommand = `kill ${pid as number + 1}`;

        const childProcess = exec(bashcommand, (error, stdout, stderr) => {
            if (error) {
                console.error(`exec error: ${error}`);
                return;
            }

            if (stderr) {
                console.error(`stderr: ${stderr}`);
                return;
            }

            console.log(`stdout: ${stdout}`);
        });

        context.globalState.update("codesphere.runningPID", "");
        vscode.commands.executeCommand('setContext', 'tutorial.modelIsRunning', false);
    })
);

We loaded the PID from our globalStorage and used it to terminate the correct process. And there we have it! Now we can start and stop our local Llama.cpp model as needed.

Tip: We actually have a webview developer console inside the extension development host to debug our extension. You can open it as follows:

  • Ctrl + Shift + P
  • select Developer: Open Webview Developer Tools
Developer Console inside VS Code
Developer Console inside VS Code

Place an input field to our webview

The user needs to place the prompt somewhere. Let's create the UI for that within our Svelte file. For that, we simply need HTML, CSS, and JavaScript for the functionality.

In our Svelte file, we have a <script> tag for JavaScript code and a <style> tag for CSS. In the rest of the file, you can place the HTML code. Let's keep it simple, and if you want, you can practice your UI/UX design by enhancing this code as you like.

Add the following two code snippets to src/webviews/components:

a) CSS

.prompt-container {
        display: flex;
        align-items: center;
        padding: 10px;
        width: 100%;
        max-width: 600px;
        margin: 0 auto;
        position: fixed;
        bottom: 0;
        left: 0;
    }

    .prompt-input {
        flex: 1;
        padding: 10px;
        border: none;
        outline: none;
        font-size: 16px;
    }

    .prompt-button {
        display: flex;
        justify-content: center;
        align-items: center;
        padding: 10px 20px;
        font-size: 16px;
        border: none;
        cursor: pointer;
        border-radius: 4px;
        margin-left: 10px;
        width: 50px;
        height: 50px;
        align-self: end;  
    }

b) HTML

<div class="prompt-container">
        <textarea type="text" placeholder="Ask your model something about your code" wrap="hard" class="prompt-input"></textarea>
        <button class="prompt-button">
            <svg xmlns="http://www.w3.org/2000/svg" width="100px" height="100px" viewBox="0 0 16 16">
                <path fill="currentColor" fill-rule="evenodd" d="m4.25 3l1.166-.624l8 5.333v1.248l-8 5.334l-1.166-.624zm1.5 1.401v7.864l5.898-3.932z" clip-rule="evenodd"/>
            </svg>
        </button>
    </div>

And this is the result:

Added input field
Added input field

As you can see, we didn't add a color to our button, but it's blue. That's because in our project, we have some standard VS Code styles imported. You can find the styles in the media directory of our project.

Split the webview to two different sections

It's time to split up our webview since the HelloWorld template has nothing to do with our chat interface. We can have multiple webviews in our sidebar simultaneously, like an accordion, where you can open and close the different webviews as you like.

"views": {
      "tutorial-sidebar-view": [
        {
          "type": "webview",
          "id": "tutorial-sidebar",
          "name": "Tutoial",
          "icon": "media/tutorial.svg",
          "contextualTitle": "Tutorial"
        },
        {
          "type": "webview",
          "id": "tutorial-chat",
          "name": "Chat",
          "icon": "media/tutorial.svg",
          "contextualTitle": "Chat"
        }
      ]
    },

With that simple trick we created a second webview window inside our sidebar:

Second window in the sidebar
Second window in the sidebar

The content in our new chat window is loading infinitely because we haven't registered content for that window in our extension. Let's change that by adding code in src/extension.ts:

const chatProvider = new ChatProvider(context.extensionUri, context);
context.subscriptions.push(
  vscode.window.registerWebviewViewProvider(
  "tutorial-chat",
  chatProvider
  )
);

Now, when you refresh the extension development host, you will see that the content is loaded into the second webview. The reason this works without adding extra files is that we provided the necessary files in the template. Here is a list of the files you need to add to register this new webview:

  • ChatProvider.ts in /src
  • Chat.svelte in webviews/components
  • chat.ts in webview/pages

If you are about to create another webview and you are using our SidebarProvider.ts as a template for that, you need to change the compiled JavaScript and CSS files to the correct ones.

const styleResetUri = webview.asWebviewUri(
      vscode.Uri.joinPath(this._extensionUri, "media", "reset.css")
    );
    const scriptUri = webview.asWebviewUri(
      // change here
      vscode.Uri.joinPath(this._extensionUri, "out", "compiled/chat.js")
    );
    const styleMainUri = webview.asWebviewUri(
      // change here
      vscode.Uri.joinPath(this._extensionUri, "out", "compiled/chat.css")
    );
    const styleVSCodeUri = webview.asWebviewUri(
      vscode.Uri.joinPath(this._extensionUri, "media", "vscode.css")
    );

Add functionality to our chat-interface

Now we have our UI, but we need the logic for handling our prompts and displaying the response. In order to do that, we need to send our prompt to our model via the API and send the response to our webview so that we can display it. It follows the same pattern when implementing logic for our extension:

  1. Send data from the webview to our extension via the postMessage method.
  2. Handle the message within our extension's code (e.g., ChatProvider.ts) inside the switch-case block.
  3. Send back data to our webview with the postMessage method.

First, we define a function to forward our prompt to the extension and add this prompt to our chat interface. You can replace the someMessage() function with the following code:

function someMessage() {
    // Clear the textarea
    document.getElementById('promptInput').value = '';

    // add textblock to chat interface
    const chat = document.createElement('div');
    let randomId = generateRandomId();
    chat.innerHTML = `<div style="  background-color: #00BCFF; 
                                    color: black; 
                                    padding: 10px; 
                                    margin: 10px; 
                                    border-radius: 10px; 
                                    display: inline-block;">${prompt}</div>`;
    document.getElementById('chatContainer').appendChild(chat);

    // Create response div
    const responseDiv = document.createElement('div');
    // styling
    responseDiv.style = "background-color: #6F40D3; color: black; padding: 10px; margin: 10px; border-radius: 10px; display: inline-block;";
    responseDiv.id = randomId;
    document.getElementById('chatContainer').appendChild(responseDiv);

    vscode.postMessage({
        type: 'prompt',
        value: {
            prompt: prompt,
            id: randomId
        }
    });
}

The postMessage() method will send a message to our extension, and the type property of the message body will serve as the identifier for which case in our switch-case code block will be executed within the ChatProvider.ts.

webviewView.webview.onDidReceiveMessage(async (data) => {
      switch (data.type) {
        case "prompt": {
          if (!data.value) {
            return;
          }

          let prompt = data.value.prompt;
          let id = data.value.id;

          const Test = async (prompt: string, id: any) => {
            try {
              let response = await fetch("http://127.0.0.1:8080/completion", {
                method: 'POST',
                headers: {
                  'Content-Type': 'application/json',
                },
                body: JSON.stringify({
                  prompt,
                  n_predict: 30,
                  stream: true,
                }),
              });

              if (!response.ok) {
                throw new Error(`HTTP error! status: ${response.status}`);
              }

              if (!response.body) {
                throw new Error('Response body is null');
              }

              const reader = response.body.getReader();
              const decoder = new TextDecoder();
              let result = '';

              while (true) {
                const { done, value } = await reader.read();
                if (done) {
                  break;
                }
                result += decoder.decode(value, { stream: true });

                const lines = result.split('\n');
                for (const line of lines) {
                  if (line.startsWith('data:')) {
                    try {
                      const json = JSON.parse(line.substring(5).trim());
                      console.log(json.content);
                      let token = json.content;
                      this._view?.webview.postMessage({
                        type: "response",
                        value: token,
                        id: id
                      });
                    } catch (e) {
                      console.error('Error parsing JSON:', e);
                    }
                  }
                }
                result = lines[lines.length - 1];
              }

              this._view?.webview.postMessage({
                type: "response",
                value: "done",
                id: id
              });
            } catch (error) {
              console.error('Error:', error);
            }
          };

          await Test(prompt, id);

          break;
        }
      }
    });

This will handle our request to our running LLM. We call it in stream mode, which means that every token is sent separately back. Every time we get a token, we send this token back to our extension via the postMessage method so that we can add it to our chat interface within the webview's code.

onMount(() => {
    window.addEventListener('message', event => {
        const message = event.data; // The JSON data our extension sent
        console.log(`message received: ${JSON.stringify(message)}`);
        switch (message.type) {
            case 'response':
                if (message.value === 'done') {
                    // Mark the end of the response
                    break;
                }
                let responseDiv = document.getElementById(message.id)
                responseDiv.innerHTML += message.value;
                
                break;
        }
    });
});

Now we can have a chat with our locally hosted LLM.

VS Code extension LLM chat
VS Code extension LLM chat

Publish an extension

When we decide that our extension is ready to be published, we can do so using the vsce CLI tool for managing VS Code extensions.

Fist we need to install the vsce CLI:

npm install -g @vscode/vsce

Packaging your project to a .vsix file is the easiest way to share your extension. You need to run this command in the root directory of your project:

vsce pack

However, since it's not always trustworthy to share files directly, we might want to publish our extension in the official VS Code Extension Marketplace. Since Microsoft's official documentation for publishing extensions is clear, we will just reference their documentation here: Publishing Extension.

Once you've completed everything and your extension is accepted, it will be listed and available for other users to install.

This is a list of links which were useful to us when building our own VS Code extension (will be updated when finding good reference in the ):

Conclusion

This tutorial serves as a comprehensive guide for creating a Visual Studio Code extension from scratch. We utilized the template provided to build an extension that interacts with a running Llama.cpp instance on localhost:8080. While this tutorial covers a lot, there are many more possibilities to explore when creating VS Code extensions, but covering all of them would be too much for one blog post.

If you're interested in building your own VS Code extension, you can join our Discord Server to connect with fellow software developers, Codesphere users, and the Codesphere team: Discord Server.

Furthermore, if you want to improve this tutorial extension, there are plenty of things you can do! It's a great project to practice your coding skills and your VS Code extension creation skills. Here are some ideas for improvement:

  • Persistent Chats: Implement a feature to save chat histories or conversations between sessions.
  • Creating a HuggingFace LLM Browser as a Separate Webview to Download Models: Develop a separate webview interface where users can browse and download models from HuggingFace.
  • Fine-Tune Prompt Settings or Create a UI for User Settings: Allow users to fine-tune prompt settings or provide a user-friendly UI for configuring settings related to the extension.
  • Improve UI/UX: Enhance the user interface and experience of the extension to make it more intuitive and visually appealing.
  • User-Friendly Installation of Llama.cpp: Simplify the installation process of Llama.cpp for users, possibly by automating or providing clear instructions within the extension.
  • Improve Model Performance Depending on User's Hardware: Implement optimizations to enhance the performance of the model, taking into account the hardware specifications of the user's machine.
  • Implement Error Handling: Develop mechanisms to handle errors gracefully, providing meaningful error messages and guiding users on how to resolve issues when they occur.
  • Cross-Platform Compatibility: Ensure that the extension functions smoothly on the Mac, Linux, and Windows operating systems.