Set Up Hugging Face API for Open Source LLM in Copilot Studio

Introduction

Open-source Large Language Models (LLMs) have become popular for building AI-driven solutions thanks to their flexibility and cost-effectiveness. Integrating these LLMs into Microsoft Copilot Studio can be a game-changer, helping developers enhance their apps with powerful AI capabilities. In this blog, we'll explore how you can access and use open-source LLMs in Copilot Studio, making it easy to build smarter, more efficient apps.

In this article, we are going to connect the Hugging Face API to Copilot Studio, which gives us access to a large number of open-source LLMs covering a variety of tasks.

Set Up the Hugging Face API

To get started, go to https://huggingface.co/ and log in using Google, or sign up with a new username and password.

Once you log in, you will land on the home page, where you can see the Models option in the top menu.

Model

Click on the Models tab to see all the available models; open a model's details page to see its features. You can use any model that is available there. That's the beauty of the Hugging Face platform.

The next step is to set up the Hugging Face API. For that, click on your User Profile; it takes you to your profile page, where you can click Settings. On the Settings page, you will find all the options for managing settings and generating API tokens.

Hugging Face API

In the left-side menu, you will see Access Tokens. Click on it to open the page where you can create and manage tokens. While creating a token, select the Read role from the list, give the token a name, and click Create.

Read Options

You will find the newly created token on that page. Copy it now, as you will need it when calling the API from Copilot Studio.
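Before wiring the token into anything, you may want to confirm it works. Below is a minimal Python sketch (assuming the requests library is installed; the token value is a placeholder) that calls the Hugging Face whoami endpoint with the token:

import requests

HF_TOKEN = "hf_xxxxxxxxxxxxxxxxxxxx"  # placeholder: paste your Read token here

# The whoami-v2 endpoint returns the account linked to a valid token.
response = requests.get(
    "https://huggingface.co/api/whoami-v2",
    headers={"Authorization": f"Bearer {HF_TOKEN}"},
)

print(response.status_code)  # 200 means the token is valid
print(response.json())       # account details for the token's owner

If the call returns 401, double-check that you copied the full token.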

Getting the Model API URL

In the top menu, click Models to get the list of models, and select the one you want to use. In my case, I selected “meta-llama/Llama-3.2-1B”.

Model API URL
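As a side note, at the time of writing the serverless Inference API URLs follow a predictable pattern: https://api-inference.huggingface.co/models/<model-id>. For the model above, that would be https://api-inference.huggingface.co/models/meta-llama/Llama-3.2-1B. Still, it is safest to copy the exact URL from the model page, as described next.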

If you see the "Inference API (Serverless)" option on the model's details page under deployment, you can use this model with HTTP calls. If this option isn’t available, you’ll need to set up the model on a dedicated server, which can be expensive and may require a commercial license. For testing, the serverless option is simpler and more cost-effective.

Inference API

To get the API URL and code, select “Inference API (serverless)” on the deployment page. You’ll find the code and the API URL there. In our case, we only need the API URL and header since we’ll be calling this model using Copilot Studio.

API URL
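Before moving to Copilot Studio, it can be worth testing the call outside of it. Here is a minimal Python sketch of the same HTTP request we are about to configure (the URL and token are placeholders for the values from your model page and settings):

import requests

API_URL = "https://api-inference.huggingface.co/models/meta-llama/Llama-3.2-1B"
HF_TOKEN = "hf_xxxxxxxxxxxxxxxxxxxx"  # placeholder: your Read token

# Same body shape we will send from Copilot Studio.
payload = {"inputs": "Can you please let us know more details about your "}

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {HF_TOKEN}"},
    json=payload,
)
response.raise_for_status()
print(response.json())  # typically a list like [{"generated_text": "..."}]

If this works from Python, the same URL, header, and body will work from Copilot Studio.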

Now we are ready, with the API URL and token in hand, to integrate with Copilot Studio.

Set Up Copilot Studio

Go to https://copilotstudio.microsoft.com/ and create a new copilot. If you don't know how to set one up, please read my previous blogs.

Then, create a topic according to your requirements and add a “Send HTTP request” step, which you will find under the Advanced option.

Send HTTP request

Now, we will connect to the open-source LLM using the Hugging Face Inference API.

In the HTTP request step, enter the model API URL you copied earlier as the URL, select the method “POST” and the response data type “String”, and then click the edit option for “Headers and body”.

Select the body content type “JSON” and add this JSON data:

{"inputs": "Can you please let us know more details about your "}

Then save it. Here, “inputs” is the question you want to ask the LLM; it can also be your full prompt.
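If you want more control over the output, the text-generation endpoint also accepts an optional “parameters” object alongside “inputs”. For example (max_new_tokens and temperature are standard text-generation parameters; the values here are just illustrative):

{"inputs": "Can you please let us know more details about your ", "parameters": {"max_new_tokens": 100, "temperature": 0.7}}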

Then, under the Headers section, add the key “Authorization” and, as the value, the word “Bearer” followed by a space and the access token you created in Hugging Face in the first step.

Authorization

Create a variable to hold the response, set its type to “Any”, and use it further in your flow as required. Now, you are all set to call the LLM from Copilot Studio.
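For reference, a successful call to the serverless text-generation endpoint typically returns a JSON array like this (the generated text here is illustrative):

[{"generated_text": "Can you please let us know more details about your project so we can help..."}]

Since the response type is “String”, the variable will hold this raw JSON; you can parse it in a later step (for example, with a Power Fx formula) or show it directly in a message node.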

Conclusion

This article explains:

  1. Setting up a Hugging Face Profile: It guides you through creating a Hugging Face account and generating an authentication token, which is needed for API access.
  2. Accessing LLM Model API Endpoints: It covers how to find and use API endpoints for different LLM models available on Hugging Face, allowing you to interact with these models via HTTP calls.
  3. Calling LLM Using Copilot Studio: It demonstrates how to integrate and call the LLM within Copilot Studio, using the generated token and API URL to enable seamless communication with the model.