ragnar 0.2

  Tomasz Kalinowski

We’re happy to announce the release of ragnar 0.2, a new R package for building trustworthy Retrieval-Augmented Generation (RAG) workflows.

You can install it from CRAN with:

install.packages("ragnar")

What’s retrieval-augmented generation (RAG)?

Large language models (LLMs) can generate fluent, confident text that is completely detached from facts and reality. We politely call such untrue statements hallucinations. RAG reduces the risk of hallucinations by grounding the LLM in your factual, trusted documents.

With RAG, instead of asking an LLM to respond from its own memory, we:

  1. Retrieve relevant passages from trusted sources.
  2. Ask the model to answer using those passages.

RAG shifts the LLM's job from open-ended generation towards summarizing and paraphrasing, an easier task where LLMs make substantially fewer fabrications.
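
Concretely, the pattern looks something like this minimal sketch using ellmer, where a hand-written passages vector stands in for the retrieval step that ragnar automates:

library(ellmer)

# Stand-in for retrieved passages; ragnar's job is to find these for you
passages <- c(
  "Code blocks written as {python} are executed by Quarto.",
  "Code blocks written as {.python} are only syntax-highlighted."
)

chat <- chat_openai()
chat$chat(paste0(
  "Answer using only the excerpts below.\n\n",
  paste(passages, collapse = "\n---\n"),
  "\n\nQuestion: How do I run Python code in Quarto?"
))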

Meet ragnar

ragnar is a tidy interface for building a RAG pipeline. Use ragnar to:

  • Convert documents from the web or local filesystem into Markdown.
  • Chunk documents using meaningful semantic boundaries.
  • Augment chunks with a short context string that situates each chunk.
  • Embed chunks with commercial or open-source models.
  • Store embeddings in DuckDB for fast, local queries.
  • Retrieve relevant chunks using both vector and text search.

Quick start: collect, convert, chunk, embed, and store your documents

Here is how to build a RAG knowledge store from the Quarto docs.

  1. Create a knowledge store.

    library(ragnar)

    store <- ragnar_store_create(
      "./quarto.ragnar.duckdb",
      embed = \(x) ragnar::embed_openai(x, model = "text-embedding-3-small"),
      name = "quarto_docs"
    )
  2. Generate a list of relevant web page URLs from quarto.org. We can consult the sitemap, or, if a sitemap isn't available, crawl the site.

    pages <- ragnar_find_links("https://quarto.org/sitemap.xml")
  3. Convert, chunk, augment, embed, and store each page.

    for (page in pages) {
      chunks <- page |>
    
        # Convert to markdown
        read_as_markdown() |>
    
        # Split document into chunks and generate 'context' for each chunk.
        markdown_chunk()
    
      # Embed and store chunks with context and metadata
      ragnar_store_insert(store, chunks)
    }
  4. Build the retrieval index.
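
    # Build the index once all chunks are inserted
    ragnar_store_build_index(store)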

Once the store is built, you can access it for fast retrieval.

Retrieve relevant chunks

Pass a query string to ragnar_retrieve() to retrieve the most relevant chunks, using both semantic search over vector embeddings and conventional text search.

store <- ragnar_store_connect("./quarto.ragnar.duckdb", read_only = TRUE)
query <- "{.python} or {python} code chunk header"

ragnar_retrieve(store, query, top_k = 5)
#> # A tibble: 9 × 9
#>   origin         doc_id chunk_id start   end cosine_distance bm25  context text 
#>   <chr>          <list> <list>   <int> <int> <list>          <lis> <chr>   <chr>
#> 1 https://quart… <int>  <int>    14318 16132 <dbl [1]>       <dbl> "# Dia… "###…
#> 2 https://quart… <int>  <int>      869  2386 <dbl [1]>       <dbl> "# ASA… "Hom…
#> 3 https://quart… <int>  <int>        1  2497 <dbl [2]>       <dbl> ""      "# U…
#> 4 https://quart… <int>  <int>     3156  4928 <dbl [1]>       <dbl> "# v1.… "## …
#> 5 https://quart… <int>  <int>     5365  7389 <dbl [1]>       <dbl> "# Cre… "## …
#> 6 https://quart… <int>  <int>     7319  8804 <dbl [1]>       <dbl> "# HTM… "## …
#> 7 https://quart… <int>  <int>    11096 12763 <dbl [1]>       <dbl> "# HTM… "## …
#> 8 https://quart… <int>  <int>     9426 11250 <dbl [1]>       <dbl> "# Rev… "###…
#> 9 https://quart… <int>  <int>     5236  6904 <dbl [1]>       <dbl> "# Hel… "###…

Equip an LLM chat with your store

You can equip an ellmer chat with a tool that lets the LLM search your knowledge store automatically.

library(ellmer)

chat <- chat_openai(
  system_prompt = glue::trim("
    You are a Quarto documentation search agent and summarizer.
    You are concise.
    For every user question, perform between one and three searches.
    Include links to the source documents in your response.
    ")
) |>
  ragnar_register_tool_retrieve(store, top_k = 10)
#> Using model = "gpt-4.1".

The model can now search the store on demand. It can rewrite the search query and perform repeated searches as needed. The model’s responses will also cite and link back to your source documents, so users can easily follow links to learn more.

chat$chat(
  "What's the difference between {.python} and {python}
  in a code chunk header?"
)
#>  [tool call] rag_retrieve_from_store_001(text = "difference between {.python}
#> and {python} in a code chunk header")
#>  #> [{"origin":"https://quarto.org/docs/authoring/diagrams.html","doc_id"…
#>  [tool call] rag_retrieve_from_store_001(text = "chunk header options quarto
#> curly braces dot notation")
#>  #> [{"origin":"https://quarto.org/docs/authoring/lipsum.html","doc_id":2…
#> The difference between `{.python}` and `{python}` in a code chunk header is:
#> 
#> - `{python}`: This syntax is used for executable code blocks. Quarto will run 
#> the Python code inside the block and include its output in the rendered 
#> document.  
#>   ```markdown
#>   ```{python}
#>   print(1 + 1)
#>   ```
#>   ```
#>   This is for running code, capturing output, figures, etc.
#> 
#> - `{.python}`: This syntax (note the leading dot) is used for a code block that
#> is purely for display (not executed), with `.python` indicating the code should
#> be syntax-highlighted as Python. This is the Pandoc Markdown convention for 
#> indicating the language for syntax highlighting only:
#>   ```markdown
#>   ```{.python}
#>   # This code is just displayed, not executed by Quarto
#>   print(1 + 1)
#>   ```
#>   ```
#>   Or equivalently, you can use triple backticks followed by the language name:
#>   ```
#>   ```python
#>   print(1 + 1)
#>   ```
#>   ```
#>   In both forms, the code is not executed.
#> 
#> To summarize:
#> - `{python}` → Executed code block.
#> - `{.python}` or ```python → Non-executed code block with syntax highlighting 
#> only.
#> 
#> Sources:
#> - [Quarto documentation: Using 
#> Python](https://quarto.org/docs/computations/python.html)
#> - [Quarto documentation: HTML Code 
#> Blocks](https://quarto.org/docs/output-formats/html-code.html)

Inspect and iterate

Use ragnar_store_inspect() to interactively preview which text chunks are retrieved for different search queries. This helps identify issues like poor document conversion, chunking, or context augmentation, so you can refine your store creation pipeline. By making retrieval results easy to explore, ragnar lets you iterate and tune your knowledge store before connecting it to an LLM.

You can also launch the store inspector with just a single chunked document using ragnar_chunks_view(). This is particularly useful when deciding what chunking approach is most appropriate for your content.
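
For example, assuming the store and pipeline from the quick start above:

# Interactively explore retrieval results from the full store
ragnar_store_inspect(store)

# Preview how a single document is chunked
"https://quarto.org/docs/computations/python.html" |>
  read_as_markdown() |>
  markdown_chunk() |>
  ragnar_chunks_view()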

[Store inspector UI screenshot]

Additional features

  • Works with many document types: read_as_markdown() uses MarkItDown, which means it can ingest an extremely wide variety of files: HTML, PDF, docx, pptx, epubs, compressed archives, and more.
  • Flexible embeddings: Use embedding models from providers like OpenAI, Google Vertex or Gemini, Bedrock, Databricks, Ollama or LM Studio, or easily supply your own embedding function (see the sketch after this list).
  • DuckDB native: Extremely fast local indexing and retrieval. Native support for MotherDuck if you need to serve the store.
  • Customizable chunk augmentation: Customize how chunks are augmented with context (headings, links, titles), and easily attach additional metadata to chunks.
  • Not a black box: Easily inspect the store contents and retrieval results.
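
For example, here is a minimal sketch of swapping in a local embedding model served by Ollama (assuming an Ollama server is running and the nomic-embed-text model has been pulled):

store <- ragnar_store_create(
  "./local.ragnar.duckdb",
  embed = \(x) ragnar::embed_ollama(x, model = "nomic-embed-text")
)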

Get started

Acknowledgements

A big thanks to all contributors who helped out with ragnar development through thoughtful discussions, bug reports, and pull requests.

@app2let, @arnavchauhan7, @atheriel, @bowerth, @cboettig, @Christophe-Regouby, @dfalbel, @dingying85, @gadenbuie, @hadley, @JCfly3000, @jrosell, @kaipingyang, @mattwarkentin, @PauloSantana2019, @pedrobtz, @RichardHooijmaijers, @schochastics, @sikiru-atanda, @SimonEdscer, @smach, @t-kalinowski, and @topepo.