
Retrieval Augmented Generation (RAG)

What is Retrieval-Augmented Generation (RAG)


Source: What is Retrieval-Augmented Generation (RAG)? - YouTube by Marina Danilevsky at IBM

  • Example question: which planet in the solar system has the most moons?

  • Large Language Model (LLM) challenges

    • No source: the answer comes only from training data, with nothing to cite
    • Out of date: more moons are discovered over time, so the training data goes stale
  • An LLM could confidently give a wrong answer, such as Jupiter, when newer moon discoveries could make Saturn the correct one

    • It needs up-to-date data from authoritative sources such as NASA.
  • With RAG, the LLM gets relevant data from an updated data set and gives a response backed by evidence
    • It can say “I don’t know” when the data is not available
  • Both parts need work:
    • Retrieval: better retrievers that find the data most relevant to the user's prompt
    • Generation: better-structured, higher-quality responses
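
The retrieve-then-generate flow above can be sketched roughly as follows. This is a minimal illustration, not how any particular product does it: the document store, the stopword list, the word-overlap scoring, and the function names `retrieve` and `build_prompt` are all assumptions made for the example, and the actual LLM call is left out.

```python
import re

# Hypothetical, hand-written document store (an assumption for this sketch).
# A real system would load current text from authoritative sources.
DOCUMENTS = [
    {"source": "doc-a",
     "text": "Saturn has the most confirmed moons of any planet in the solar system."},
    {"source": "doc-b",
     "text": "Jupiter is the largest planet in the solar system."},
]

STOPWORDS = {"the", "in", "of", "has", "is", "which", "a", "any", "what"}


def tokens(text):
    """Return the set of lowercase words in text, minus common filler words."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def retrieve(question, documents, top_k=1, min_overlap=2):
    """Toy retriever: rank documents by the number of words shared with the question."""
    q = tokens(question)
    scored = [(len(q & tokens(d["text"])), d) for d in documents]
    # Drop documents that share too few words to count as evidence.
    scored = [(score, d) for score, d in scored if score >= min_overlap]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:top_k]]


def build_prompt(question, documents):
    """Build a grounded prompt, or return None when no evidence was retrieved."""
    hits = retrieve(question, documents)
    if not hits:
        return None  # the caller should answer "I don't know"
    context = "\n".join(f"[{d['source']}] {d['text']}" for d in hits)
    return (
        "Answer using only the context below and cite the source.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    prompt = build_prompt("Which planet in the solar system has the most moons?", DOCUMENTS)
    print(prompt if prompt is not None else "I don't know")
```

Returning `None` when nothing relevant is found keeps the "I don't know" decision outside the model, so an unanswerable question never reaches the generation step. Production retrievers typically use embeddings rather than word overlap.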

Source: RAG Explained - YouTube by Luv Aggarwal and Shawn Brennan at IBM

  • Analogy: a journalist researching a topic at a library needs to find the relevant books. The journalist (expert on the content) asks the librarian (expert on finding information) for books on certain topics and then assesses them. Users trust the data in the books.
    • How this maps to RAG:
      • The user or machine (the journalist) has questions
      • The prompt goes to an LLM
      • Data from multiple sources is put into a vector database (a mathematical representation of structured and unstructured data)
      • The LLM uses the vector database to provide answers
  • Mitigating the risk of bias and hallucinations:
    • Verify that the input data is clean and governed
    • Choose an appropriate LLM
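
The vector database in the library analogy (the "librarian") can be illustrated with a small in-memory sketch. Everything here is an assumption for illustration: `embed` uses plain word counts as a stand-in for a learned embedding model, and `ToyVectorStore` is a hypothetical class, not the API of any real vector database.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a sparse word-count vector (real systems use a learned model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store: keeps (vector, text) pairs and ranks by similarity."""

    def __init__(self):
        self.items = []

    def add(self, text):
        self.items.append((embed(text), text))

    def search(self, query, top_k=1):
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:top_k]]


if __name__ == "__main__":
    store = ToyVectorStore()
    store.add("Library opening hours are nine to five on weekdays")
    store.add("Interest rates rose again this quarter")
    print(store.search("when is the library open"))
```

The texts returned by `search` are what would be handed to the LLM as context. The quality of the answer depends on what was put into the store, which is why the notes stress clean, governed input data.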