Build a Local Artificial Intelligence System

DIY AI Infrastructure: Build Your Own Privacy-Preserving AI at Home

Source: DIY AI Infrastructure: Build Your Own Privacy-Preserving AI at Home - IBM Technology on YouTube

Run a large language model (LLM) on your local machine to keep your data private and to save time: there is no need to create accounts or provision cloud infrastructure.

Steps:

  • Install Ollama and choose a model to run, e.g. from AI Models, which can be Ollama models or models from Hugging Face
  • Download a quantized model, i.e. a compressed model suitable for limited hardware
  • Start an inference server to chat with the model
  • Ollama uses llama.cpp under the hood to run the model
  • During a chat, the client sends a POST request to the local API to get the response
```sh
# Model features: multilingual, RAG, enterprise use cases
ollama run granite3.1-dense
# Chat, or type /bye to quit
```
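The POST request mentioned in the steps above can be sketched in Java. This builds (but does not send) a request to Ollama's documented `/api/generate` endpoint on the default local port 11434; the model name and prompt are examples, and the JSON body is assembled naively without escaping:

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class OllamaRequest {

    // Build a POST request for the local Ollama inference server.
    // /api/generate and the model/prompt/stream fields are Ollama's HTTP API;
    // the prompt is not JSON-escaped here, so keep it to plain text in this sketch.
    public static HttpRequest buildGenerate(String model, String prompt) {
        String body = String.format(
            "{\"model\":\"%s\",\"prompt\":\"%s\",\"stream\":false}", model, prompt);
        return HttpRequest.newBuilder()
            .uri(URI.create("http://localhost:11434/api/generate"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest req = buildGenerate("granite3.1-dense", "Say hello.");
        System.out.println(req.method() + " " + req.uri());
    }
}
```

With the Ollama server running, sending this request via `java.net.http.HttpClient` returns a JSON object whose `response` field holds the model's reply.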

Technologies:

  • LangChain4j (LangChain for Java), a standard API for making calls to the model
    • Add the LangChain4j dependency and point it at the local API endpoint
  • Quarkus (a Kubernetes-optimized Java framework) to run the application
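The LangChain4j bullet above can be sketched as follows. This assumes the `langchain4j-ollama` dependency is on the classpath; the builder method names (`baseUrl`, `modelName`) are from that module and may differ between versions, so treat this as a sketch rather than a definitive API reference:

```java
import dev.langchain4j.model.ollama.OllamaChatModel;

public class LocalChat {
    public static void main(String[] args) {
        // Point the LangChain4j Ollama model at the local inference server.
        OllamaChatModel model = OllamaChatModel.builder()
            .baseUrl("http://localhost:11434")   // default Ollama endpoint
            .modelName("granite3.1-dense")       // model pulled earlier with ollama
            .build();

        // One standard API call; the HTTP details are handled by the library.
        String answer = model.generate("Summarize what RAG is in one sentence.");
        System.out.println(answer);
    }
}
```

In a Quarkus application the same model would typically be configured via properties and injected rather than built by hand.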