Harnessing the Power of Function Calling with Local LLMs using Ollama

Introduction

Large language models (LLMs) have revolutionized the way we approach complex tasks in AI, providing powerful tools for natural language understanding, generation, and more. Among the many LLMs available, Meta's latest Llama 3.1 stands out as a robust model for a variety of natural language processing (NLP) applications. Coupled with Microsoft's Semantic Kernel, developers now have a flexible framework for integrating these models into a wide range of applications, from chatbots to data analysis tools.

In this article, we'll explore the capabilities of function calling using a local instance of Llama 3.1 hosted in Ollama with Semantic Kernel. By leveraging this combination, developers can harness the full potential of Llama 3.1 while maintaining control over their data and computational resources, without using any cloud services. We'll walk through the process of setting up a local LLM environment, integrating it with Semantic Kernel, and implementing function calling to perform specific tasks.

Why Use a Local LLM with C# Semantic Kernel?

Running LLMs locally brings several key advantages:

  1. Data Security: Your data remains within your local environment, ensuring privacy and compliance with data protection regulations.
  2. Customization: You can select from various open-source models that better suit specific use cases, for example a small language model (SLM) like Phi-3 for simple tasks such as summarization.
  3. Cost Savings: Avoiding the need for external APIs reduces ongoing costs.
  4. Low Latency: Local execution minimizes response times, improving performance for real-time applications.

The C# Semantic Kernel library provides a structured framework for integrating LLMs into your applications. It simplifies the process of calling functions, managing context, and orchestrating complex workflows, making it easier to deploy Llama 3.1 in real-world scenarios.

Setting Up Llama 3.1 Locally with Ollama

To begin function calling with a local instance of Llama 3.1 using C# Semantic Kernel, you'll need to follow a few setup steps. 

First, you’ll need to set up the Llama 3.1 model from Ollama on your local machine. This involves:

  • Hardware Preparation: Ensure your machine has the necessary computational power, preferably a GPU.
  • Environment Configuration: Install the required software and dependencies. Follow the instructions provided by Ollama to download and run Llama 3.1 locally, as sketched below.
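If you haven't installed Ollama yet, download it from ollama.com. Pulling the model and giving it a quick test from the command line typically looks like this (the exact model tag, e.g. llama3.1:8b, depends on the variant you want):

    ollama pull llama3.1
    ollama run llama3.1 "Hello"

By default, the Ollama server listens on http://localhost:11434; make a note of this, since we will point Semantic Kernel at it later.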

Step-by-Step Guide: Implementing Function Calling with Llama 3.1 Using C# Semantic Kernel

Step 1. Setup Your Project

  1. Create a New C# Console Application:

    • Open your preferred IDE (like Visual Studio or Visual Studio Code).
    • Create a new Console Application project.
  2. Install Required NuGet Packages: Install the following NuGet packages via the NuGet Package Manager or using the .NET CLI:

    dotnet add package Microsoft.Extensions.Configuration
    dotnet add package Microsoft.Extensions.Configuration.UserSecrets
    dotnet add package Microsoft.SemanticKernel
    dotnet add package Microsoft.SemanticKernel.Connectors.OpenAI
    dotnet add package Microsoft.SemanticKernel.Agents.Core --prerelease

    Microsoft.Extensions.Configuration.UserSecrets supplies the AddUserSecrets extension used below, and Microsoft.SemanticKernel.Agents.Core supplies the ChatCompletionAgent and AgentGroupChat types used later (still prerelease at the time of writing).
  3. Configure User Secrets: Set up user secrets to securely store your model ID, base URL, and weather API key. This can be done via the .NET CLI:

    dotnet user-secrets init
    dotnet user-secrets set "modelId" "your-model-id"
    dotnet user-secrets set "baseUrl" "your-base-url"
    dotnet user-secrets set "weatherApiKey" "your-weather-api-key"

    Replace "your-model-id", "your-base-url", and "your-weather-api-key" with your actual values. You can get a free weather API key from weatherapi.com.

Step 2. Configure the Application

  1. Load Configuration: Start by loading the configuration values from user secrets (this requires the System.Reflection and Microsoft.Extensions.Configuration namespaces).

    var config = new ConfigurationBuilder()
        .AddUserSecrets(Assembly.GetExecutingAssembly(), true)
        .Build();
    var modelId = config["modelId"];
    var baseUrl = config["baseUrl"];
    var weatherApiKey = config["weatherApiKey"];
    
  2. Set Up the HTTP Client: Initialize an HttpClient with a timeout setting to manage the requests to your local Llama 3.1 model.
    var httpClient = new HttpClient { Timeout = TimeSpan.FromMinutes(2) };
    
  3. Build the Semantic Kernel: Create and configure the Semantic Kernel using the KernelBuilder. Ollama exposes an OpenAI-compatible endpoint, so we can reuse the OpenAI connector without an API key.

    var builder = Kernel.CreateBuilder()
        .AddOpenAIChatCompletion(
            modelId: modelId!,               // e.g. "llama3.1"
            apiKey: null,                    // Ollama does not require an API key
            endpoint: new Uri(baseUrl!),     // e.g. "http://localhost:11434/v1"
            httpClient: httpClient);
    var kernel = builder.Build();
    
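At this point you can optionally verify connectivity with a single chat call before adding agents and plugins. This is a minimal sanity check, assuming Ollama is running and the model has been pulled:

    // requires: using Microsoft.SemanticKernel.ChatCompletion;
    var chatService = kernel.GetRequiredService<IChatCompletionService>();

    var history = new ChatHistory();
    history.AddUserMessage("Say hello in one sentence.");

    // A single, non-streaming chat call against the local model
    var reply = await chatService.GetChatMessageContentAsync(history);
    Console.WriteLine(reply.Content);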

Step 3. Set Up the Chat Agent and Add Plugins

  1. Define Agent Instructions (Prompt) and Settings:

    var HostName = "AI Assistant";
    var HostInstructions = @"You are a helpful assistant that answers user queries. Be respectful and precise in your answers. If a query relates to getting the time or weather, use the available plugin functions to get the answer.";
    var settings = new OpenAIPromptExecutionSettings()
    {
        Temperature = 0.0,
        ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
    };
    ChatCompletionAgent agent = new()
    {
        Instructions = HostInstructions,
        Name = HostName,
        Kernel = kernel,
        Arguments = new(settings),
    };
  2. Add Plugins to the Agent: In this section, we will dive into the functionality of two key plugins used in our application: the WeatherPlugin and the LocalTimePlugin. These plugins are designed to handle specific tasks—retrieving weather details and getting the current local time—and they are integrated into the Semantic Kernel to be invoked when needed by the AI Assistant.
  3. WeatherPlugin: The WeatherPlugin is a class that interfaces with a weather API to fetch and return weather details for a specified location. Here’s a breakdown of how it works:

Functionality

  • The plugin takes in a location name as input from the prompt and queries a weather API to retrieve the current weather conditions for that location.
  • It uses an HTTP client to send a GET request to the API, incorporating the provided location and the API key (which is securely stored using user secrets).

Code Explanation

using System.ComponentModel;
using Microsoft.SemanticKernel;

public sealed class WeatherPlugin(string apiKey)
{
    private readonly HttpClient client = new HttpClient();

    [KernelFunction, Description("Gets the weather details of a given location")]
    [return: Description("Weather details")]
    public async Task<string> GetWeatherAsync([Description("name of the location")] string locationName)
    {
        // Escape the location name so queries like "New York" form a valid URL
        string url = $"http://api.weatherapi.com/v1/current.json?key={apiKey}&q={Uri.EscapeDataString(locationName)}&aqi=no";

        HttpResponseMessage response = await client.GetAsync(url);
        response.EnsureSuccessStatusCode();

        // Return the raw JSON body; the model summarizes it in its reply
        string responseBody = await response.Content.ReadAsStringAsync();
        return responseBody;
    }
}
  • Constructor (WeatherPlugin): Takes an API key as a parameter, which is used to authenticate requests to the weather API.
  • HttpClient: An instance of HttpClient is created to manage HTTP requests and responses.
  • GetWeatherAsync Method: This is the core method of the plugin, decorated with [KernelFunction], indicating that it can be called by the Semantic Kernel:
    • It constructs the API request URL using the provided location name and API key.
    • The method then sends an asynchronous GET request to the weather API.
    • Upon receiving a successful response, it reads the content (which contains the weather details) and returns it as a string.
    • This plugin is designed to be easily invoked by the Semantic Kernel whenever a user query involves weather information, making it a valuable tool for real-time weather data retrieval.
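If you want to sanity-check the plugin before handing it to the agent, you can also call it directly. A minimal sketch ("London" is just an example location):

    var weather = new WeatherPlugin(weatherApiKey!);
    string json = await weather.GetWeatherAsync("London");
    Console.WriteLine(json); // raw JSON from weatherapi.com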
  4. LocalTimePlugin: The LocalTimePlugin is a simpler plugin than the WeatherPlugin. Its sole purpose is to retrieve and return the current local time on the machine where the application is running.

Functionality

  • The plugin provides the current local time in the format "HH:mm:ss".
  • It does not require any external API calls, making it fast and lightweight.

Code Explanation

using System.ComponentModel;
using Microsoft.SemanticKernel;

public sealed class LocalTimePlugin
{
    [KernelFunction, Description("Retrieves the current time in Local Time.")]
    public static String GetCurrentLocalTime()
    {
        // Format the machine's local time as HH:mm:ss
        return "The current local time is: " + DateTime.Now.ToString("HH:mm:ss");
    }
}
  • GetCurrentLocalTime Method: This static method is also decorated with [KernelFunction]:
    • It simply fetches the current local time using DateTime.Now and formats it as a string in "HH:mm:ss" format.
    • The method then returns this formatted string, which the AI Assistant can use to respond to user queries about the time.
    • This plugin is straightforward for any queries related to getting the current local time.

Integrating Plugins into Semantic Kernel

Both plugins are integrated into the Semantic Kernel by being registered as KernelPlugin instances. This allows the AI Assistant to automatically invoke these functions when responding to user queries related to weather or local time.

KernelPlugin localTimePlugin = KernelPluginFactory.CreateFromType<LocalTimePlugin>();
agent.Kernel.Plugins.Add(localTimePlugin);

KernelPlugin weatherPlugin = KernelPluginFactory.CreateFromObject(new WeatherPlugin(weatherApiKey!));
agent.Kernel.Plugins.Add(weatherPlugin);
  • localTimePlugin: This plugin is created using KernelPluginFactory.CreateFromType<LocalTimePlugin>(), which registers the LocalTimePlugin with the Semantic Kernel.
  • weatherPlugin: This plugin is created by passing a new instance of WeatherPlugin (with the API key) to the factory, enabling it to fetch weather data dynamically.
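With both plugins registered, you can also invoke a function by its plugin and function name through the kernel, which is a handy way to confirm the registration worked. A quick sketch using the names defined above:

var result = await agent.Kernel.InvokeAsync("LocalTimePlugin", "GetCurrentLocalTime");
Console.WriteLine(result.GetValue<string>());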

Step 4. Implement the Chat Loop

  1. Initialize the Chat Interface: Create an instance of AgentGroupChat to manage the conversation between the user and the agent:

    AgentGroupChat chat = new();
    
  2. Create a Function to Handle User Input: Implement a local function that invokes the agent and handles the conversation flow:

    async Task InvokeAgentAsync(string question)
    {
        // Add the user's question to the conversation history
        chat.AddChatMessage(new ChatMessageContent(AuthorRole.User, question));

        // Stream the agent's response (tool calls are auto-invoked) and print it
        Console.ForegroundColor = ConsoleColor.Green;
        await foreach (ChatMessageContent content in chat.InvokeAsync(agent))
        {
            Console.WriteLine(content.Content);
        }
    }
  3. Run the Chat Loop: In the main loop, continuously read user input and process it using the InvokeAgentAsync function:

    Console.WriteLine("Assistant: Hello, I am your Assistant. How may I help you?");
    while (true)
    { 
        Console.ForegroundColor = ConsoleColor.White; 
        Console.Write("User: "); 
        await InvokeAgentAsync(Console.ReadLine()!); 
    }
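As written, the loop runs until you terminate the process. A small variation (assuming you want an explicit "exit" command) handles empty input and allows a graceful shutdown:

    while (true)
    {
        Console.ForegroundColor = ConsoleColor.White;
        Console.Write("User: ");
        var input = Console.ReadLine();

        // Stop on empty input or an explicit "exit" command
        if (string.IsNullOrWhiteSpace(input) ||
            input.Trim().Equals("exit", StringComparison.OrdinalIgnoreCase))
            break;

        await InvokeAgentAsync(input);
    }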

Step 5. Run Your Application

  • Build and Run the Application: Compile the application and run it. You should see the AI Assistant prompt, and you can interact with it by asking questions related to time or weather.

Conclusion

With the steps above, we have successfully implemented a C# application that uses the Semantic Kernel library with a local instance of the Llama 3.1 model running in Ollama. The application leverages function calling to handle specific tasks like retrieving the local time or weather information, showcasing the flexibility and power of combining local LLMs with function calling in C# Semantic Kernel. The entire source code is available at the following GitHub URL:

https://github.com/vavjeeva/SKAgentLocalFunctionCalling