Introduction
Generative AI has become a key technology area because of its potential to solve a wide range of problems. It can address many use cases without requiring any knowledge of how to develop, train, and deploy machine learning models. Because it takes so little time to get up and running and is accessible to almost everyone, it has become very popular. ChatGPT, powered by OpenAI, is widely used, but not all of its versions, such as GPT-4o or those behind the ChatGPT Plus subscription, are free and accessible to everyone.
Often, the only thing stopping individuals from utilizing its full potential is the cost of using these models. To overcome this, the developer community has come up with open-source approaches, and tech giants like Meta and Snowflake are contributing impressively accurate generative AI models for free.
In this tutorial, we will see how to use open-source tools to get up and running with generative AI models for free, without worrying about cost or complexity.
Ollama
Ollama is a tool that acts as middleware between generative AI models and developers. It abstracts away the complexity of setting up models on your system and provides a set of APIs and CLI commands to interact with them. Below is the high-level design of how the CLI and API interact with the Ollama server and the generative AI models on a machine. Developers can easily add or remove models, as we will see later in this tutorial.
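For instance, once the server is running (installation follows below), a single HTTP request is enough to generate text. Here is a minimal sketch using Ollama's documented /api/generate endpoint on its default port, 11434:

curl http://localhost:11434/api/generate -d '{
  "model": "tinyllama",
  "prompt": "Why is the sky blue?",
  "stream": false
}'

Setting "stream" to false returns the whole response as one JSON object instead of token-by-token chunks.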
Steps to set up Ollama's backend
- Download Ollama for Windows, Linux, or macOS from this link: https://github.com/ollama/ollama
- Complete the installation steps as mentioned in the installer.
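On Linux, you can alternatively install it with the official one-line script from the Ollama README:

curl -fsSL https://ollama.com/install.sh | sh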
Now that we have installed the Ollama server, the next step is to install a model.
How quickly a model responds depends on system performance and GPU availability. In this tutorial, we will use the TinyLlama model, a lightweight model that works even on low-end systems.
Now, run the commands below.
ollama pull tinyllama   # this might take up to 10 minutes to finish downloading the model
ollama run tinyllama    # this starts an interactive session with the model in the CLI
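You can confirm the download succeeded by listing everything installed locally:

ollama list   # shows every model stored on this machine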
Now we can seamlessly interact with the generative AI model of our choice. Many other models are available; visit the Ollama website to browse them.
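For example, to pull and run Meta's llama3 model, which appears in the Web UI screenshot later in this tutorial:

ollama pull llama3   # a much larger download than tinyllama
ollama run llama3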
Now, let's check its memory stats in another CLI window.
ollama ps   # shows which models are loaded and how much memory they use
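When you are done experimenting, you can unload the model from memory (assuming a recent Ollama version, which includes a stop command):

ollama stop tinyllama   # frees the memory reported by ollama ps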
Now, let's add a more user-friendly GUI, as not everyone is comfortable using a CLI. We want a ChatGPT-like interface for interacting with the generative AI model. Let's set it up quickly.
For this, we will download the community-contributed Open WebUI. There are many ways to run the UI, but for this tutorial, we will use the most common method: pip.
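Optionally, to keep Open WebUI's dependencies isolated from your system packages, you can create a virtual environment first; a minimal sketch, assuming a compatible Python version is installed:

python3 -m venv webui-env       # create an isolated environment
source webui-env/bin/activate   # activate it (on Windows: webui-env\Scripts\activate)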
To set it up, run the commands below.
pip install open-webui   # this will take roughly 10 minutes to install all dependencies
open-webui serve         # this runs the web UI
The Web UI is now available at its default URL: http://localhost:8080/
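By default, Open WebUI looks for the Ollama server at http://localhost:11434. If your Ollama server runs on a different host or port, you can point the UI at it with the OLLAMA_BASE_URL environment variable that the project documents:

OLLAMA_BASE_URL=http://127.0.0.1:11434 open-webui serve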
Below is a screenshot of me using the llama3 model offered by Meta.
Note: You may need to sign up before entering the Web UI; since everything runs locally, you can use any dummy credentials to get started.
That's it. Now we can use Generative AI for free.
Thanks for reading!
Do let me know how you feel about Generative AI in the comments.