Local AI with Retrieval Augmented Generation (RAG) using Open WebUI and Ollama

Sources:

Running artificial intelligence locally on your computer

Running a large language model (LLM) on your own computer can save costs, improve privacy, and give you more features and customization. The disadvantages are that you need a computer capable of running the models, some technical knowledge, and time.

  • Ollama - framework to run LLMs locally
  • Open WebUI - user interface to interact with Ollama; it also supports talking to other interfaces
    • Includes chat, model configuration, retrieval augmented generation (RAG), and other features
  • Python - programming language

Requirements: a computer with a graphics processing unit (GPU) that has at least 4 GB of GPU memory (8 GB or more recommended), 20 GB of free disk space, and a browser.
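The disk space requirement can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the names `REQUIRED_FREE_GB`, `free_disk_gb`, and `has_enough_disk` are hypothetical, and the 20 GB threshold is simply the figure quoted above. It does not check GPU memory, which the standard library cannot report.

```python
import shutil

# Threshold taken from the requirements above; larger models may need more.
REQUIRED_FREE_GB = 20


def free_disk_gb(path: str = ".") -> float:
    """Return the free disk space at `path` in gigabytes (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_disk(path: str = ".", required_gb: float = REQUIRED_FREE_GB) -> bool:
    """Return True if `path` has at least `required_gb` gigabytes free."""
    return free_disk_gb(path) >= required_gb


print(f"Free disk space: {free_disk_gb():.1f} GB")
print("Enough for local models:", has_enough_disk())
```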

Setting up AI Locally with ollama, open-webui and Python

For detailed steps, see the source articles. The high-level steps are:

  • Install the tools
  • Set up a Python virtual environment
  • Use Ollama’s online list of models to select an appropriate model. For examples, see AI Models
  • Install open-webui in the Python virtual environment and start it, then access the web application user interface in your browser

Assuming Python, Ollama, and Python environment and package tools such as pip are installed:

Terminal window
# Create and activate a Python virtual environment (assumes Python 3.11 or higher)
python -m virtualenv ./venv
source ./venv/bin/activate
# Run a small model and exit out of the chat interface with Ctrl + D
ollama run phi4-mini
# Install and run open-webui
pip install open-webui
open-webui serve

Open the Open WebUI site on localhost in your browser (http://localhost:8080 by default) to see the interface for chat, models, chat controls, and settings.
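If the web interface shows no models or no replies, it can help to confirm that Ollama itself is answering. The sketch below assumes Ollama is listening on its default local address (http://localhost:11434) and uses its `/api/generate` endpoint with streaming turned off. The function names `build_generate_request` and `ask_ollama` are hypothetical helpers, not part of Ollama or Open WebUI.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434"


def build_generate_request(
    model: str, prompt: str, base_url: str = OLLAMA_URL
) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    # "stream": False asks for one JSON object instead of a stream of chunks.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_ollama(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to a running Ollama server and return the reply text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.loads(reply.read())["response"]
```

With Ollama running and the model already pulled, calling `ask_ollama("phi4-mini", "Say hello in one sentence.")` should return the model's reply as a string. If the server is not running, the call raises a connection error.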

Configuring open-webui setup for search, code use, and models

  • Web search: lets the AI use web search results
    • Settings > Admin > Web Search, then choose and set up the search API
  • Code Interpreter: lets the AI write and execute Python code in chat, so it moves from text chat only to doing work with a programming language
    • Settings > Admin > Code Execution, toggle the switch
    • Advantage: instead of just text chat, the AI can run programs to help solve problems
  • Custom model: customize how an AI model behaves
    • Workspace > Models > Create New
    • Set options such as the system prompt
Using Local artificial intelligence (AI) with Retrieval Augmented Generation (RAG) and Existing Documents

See Local AI with Retrieval Augmented Generation (RAG) using Open WebUI and Ollama System and Setup.