Retrieval Augmented Generation (RAG)

What is Retrieval-Augmented Generation (RAG)

Source: What is Retrieval-Augmented Generation (RAG)? - YouTube by Marina Danilevsky at IBM

Example question: Which planet in the solar system has the most moons?

Large Language Model (LLM) Challenges

  • No source of information - the answer comes only from what the model was trained on
  • Out of date - training data is fixed, while more moons are discovered over time

An LLM could confidently give a wrong answer, such as Jupiter, when the answer may now be Saturn because more moons have been discovered. It needs authoritative sources like NASA and up-to-date data.

With RAG, the LLM first gets relevant data from an updated data set, then gives a response along with evidence for that response. RAG improves a model’s performance with up-to-date and domain-specific information.

It can also say “I don’t know” when the data is not available.
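
The retrieve-then-respond flow above, including the “I don’t know” fallback, can be sketched in a few lines of Python. This is a toy illustration, not how any particular RAG product works: the keyword-overlap retriever, the `DOCUMENTS` list, the source names, and the `min_overlap` threshold are all assumptions made up for the example, and the final step returns the prompt instead of calling a real LLM.

```python
import re

# Hypothetical, hand-written store standing in for an up-to-date, authoritative data set.
DOCUMENTS = [
    {"source": "example-space-agency-notes",
     "text": "Saturn currently has the most confirmed moons of any planet in the solar system."},
    {"source": "example-cooking-blog",
     "text": "Sourdough bread needs a long, slow fermentation."},
]

STOPWORDS = {"the", "in", "of", "a", "an", "is", "has", "which", "what", "most", "any"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def retrieve(question, documents, min_overlap=2):
    """Return documents sharing at least min_overlap keywords with the question, best first."""
    q = tokenize(question)
    scored = [(len(q & tokenize(d["text"])), d) for d in documents]
    hits = [(score, d) for score, d in scored if score >= min_overlap]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in hits]


def answer(question, documents):
    """Build an evidence-backed prompt, or admit ignorance when nothing relevant is found."""
    hits = retrieve(question, documents)
    if not hits:
        return "I don't know."
    evidence = "\n".join(f"[{d['source']}] {d['text']}" for d in hits)
    # A real system would send this prompt to an LLM; here it is returned as-is.
    return (
        "Answer using only the evidence below and cite the source.\n\n"
        f"Evidence:\n{evidence}\n\nQuestion: {question}"
    )


print(answer("Which planet in the solar system has the most moons?", DOCUMENTS))
print(answer("Who won the chess championship?", DOCUMENTS))
```

The first call finds the space-related note and includes it, with its source label, as evidence. The second call finds nothing relevant and falls back to “I don't know.” instead of guessing.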

Both parts of RAG still need work:

  • Retrieval Augmented - better retrievers to find the data most relevant to the user's prompt
  • Generation - better structuring of responses so the model gives the best answer

Source: RAG Explained - YouTube by Luv Aggarwal and Shawn Brennan at IBM

Use case: A journalist researching a topic at a library needs to find the relevant books. They ask the librarian (the expert on finding information) for books on the topic so that the journalist (the expert on the content) can assess them. Users trust the data in the books.

RAG similarities:

  • The user or machine (the journalist) has questions
  • The prompt goes to an LLM
  • Multiple sources of data are gathered and put in a vector database (a mathematical representation of structured and unstructured data)
  • The LLM uses the vector database to provide answers
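
To make the “math representation” idea concrete, here is a minimal sketch of a vector store in Python. It assumes a toy embedding (plain word counts) and cosine similarity for ranking. Real vector databases use learned embedding models and specialised indexes, so `embed`, `cosine`, and `ToyVectorStore` are hypothetical names invented for this illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a word-count vector. Real systems use a learned embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Keeps (vector, text) pairs and returns the texts closest to a query."""

    def __init__(self):
        self._rows = []

    def add(self, text):
        self._rows.append((embed(text), text))

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self._rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [text for _, text in ranked[:k]]


store = ToyVectorStore()
store.add("Library catalogue entry: books about planets and their moons")
store.add("Internal policy document: expense reporting rules")

top = store.search("books on moons", k=1)
print(top)
```

In the library analogy, `search` plays the librarian's role: it does not answer the question itself, it only hands back the material most likely to contain the answer.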

Mitigating the risk of bias and hallucinations involves (1) verifying that the input data is clean and governed and (2) choosing an appropriate LLM.
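
As a rough illustration of step (1), the sketch below filters records before they are indexed. It assumes that “governed” can be approximated by an allow-list of approved sources and that “clean” means non-empty and de-duplicated. Both are simplifications, and `APPROVED_SOURCES`, the source names, and `clean_for_indexing` are hypothetical.

```python
# Hypothetical allow-list of sources the organisation has approved.
APPROVED_SOURCES = {"nasa-fact-sheet", "internal-wiki"}


def clean_for_indexing(records, approved=APPROVED_SOURCES):
    """Keep records that come from an approved source, contain text, and are not duplicates."""
    seen = set()
    kept = []
    for rec in records:
        text = rec.get("text", "").strip()
        if rec.get("source") not in approved or not text:
            continue  # unapproved source or empty text
        key = text.lower()
        if key in seen:
            continue  # same text already kept (case-insensitive)
        seen.add(key)
        kept.append({"source": rec["source"], "text": text})
    return kept
```

Only the records that pass this filter would then be embedded and stored, so the LLM never sees data from unvetted sources.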