dllmforge package

Submodules

dllmforge.agent_core module

Simple agent core for DLLMForge - Clean LangGraph utilities.

This module provides simple, elegant utilities for creating LangGraph agents following the pattern established in water_management_agent_simple.py.

dllmforge.agent_core.tool(func)[source]

DLLMForge wrapper around LangChain’s @tool decorator.

This decorator provides a consistent interface for creating tools within the DLLMForge ecosystem while maintaining compatibility with LangChain’s tool system.

Parameters:

func – Function to be converted into a tool

Returns:

Tool function that can be used with SimpleAgent
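
As a sketch of the pattern, a tool is an ordinary Python function with a type-annotated signature and a docstring; the decorator wraps it for registration with an agent. The function name and station data below are hypothetical, for illustration only:

```python
# Hypothetical tool function; in real use it would be decorated with
# dllmforge.agent_core.tool and registered via agent.add_tool(get_water_level).
def get_water_level(station: str) -> str:
    """Return the current water level for a monitoring station (stub data)."""
    levels = {"delfzijl": "1.2 m NAP", "hoek van holland": "0.8 m NAP"}
    return levels.get(station.lower(), f"No data for station '{station}'")
```

The docstring matters: tool-calling LLMs use it to decide when to invoke the tool.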

class dllmforge.agent_core.SimpleAgent(system_message: str = None, temperature: float = 0.1, model_provider: str = 'azure-openai', llm=None, enable_text_tool_routing: bool = False, max_tool_iterations: int = 3)[source]

Bases: object

Simple agent class for LangGraph workflows.

Initialize a simple LangGraph agent.

Parameters:
  • system_message – System message for the agent

  • temperature – LLM temperature setting

  • model_provider – LLM provider (“azure-openai”, “openai”, “mistral”)

  • llm – Optional pre-configured LLM object to use instead of creating one from model_provider

  • enable_text_tool_routing – Whether to enable text-based tool routing in the workflow

  • max_tool_iterations – Maximum number of tool-call iterations

__init__(system_message: str = None, temperature: float = 0.1, model_provider: str = 'azure-openai', llm=None, enable_text_tool_routing: bool = False, max_tool_iterations: int = 3)[source]

Initialize a simple LangGraph agent.

Parameters:
  • system_message – System message for the agent

  • temperature – LLM temperature setting

  • model_provider – LLM provider (“azure-openai”, “openai”, “mistral”)

  • llm – Optional pre-configured LLM object to use instead of creating one from model_provider

  • enable_text_tool_routing – Whether to enable text-based tool routing in the workflow

  • max_tool_iterations – Maximum number of tool-call iterations

add_tool(tool_func: Callable) None[source]

Add a tool to the agent.

Parameters:

tool_func – Function decorated with @tool

add_node(name: str, func: Callable) None[source]

Add a node to the workflow.

Parameters:
  • name – Node name

  • func – Node function

add_edge(from_node: str, to_node: str) None[source]

Add a simple edge between nodes.

Parameters:
  • from_node – Source node

  • to_node – Target node

add_conditional_edge(from_node: str, condition_func: Callable) None[source]

Add a conditional edge.

Parameters:
  • from_node – Source node

  • condition_func – Function that determines routing

create_simple_workflow() None[source]

Create a simple agent -> tools workflow with optional text-based tool routing.

compile(checkpointer=None) None[source]

Compile the workflow.

process_query(query: str, stream: bool = True) None[source]

Process a query with the agent.

Parameters:
  • query – User query

  • stream – Whether to stream the response

run_interactive() None[source]

Run the agent in interactive mode.
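
The node/edge model that add_node, add_edge and add_conditional_edge build up can be sketched in plain Python. This is an illustration of the workflow concept, not the SimpleAgent implementation: nodes are functions on a shared state dict, plain edges name the next node, and a conditional edge picks the next node from the state:

```python
from typing import Callable, Dict

nodes: Dict[str, Callable[[dict], dict]] = {}
edges: Dict[str, str] = {}
conditional: Dict[str, Callable[[dict], str]] = {}

def add_node(name: str, func: Callable[[dict], dict]) -> None:
    nodes[name] = func

def add_edge(src: str, dst: str) -> None:
    edges[src] = dst

def add_conditional_edge(src: str, cond: Callable[[dict], str]) -> None:
    conditional[src] = cond

def run(start: str, state: dict) -> dict:
    # Walk the graph until a node routes to the sentinel "END".
    current = start
    while current != "END":
        state = nodes[current](state)
        current = conditional[current](state) if current in conditional else edges.get(current, "END")
    return state

# Wire an "agent -> tools -> agent" loop akin to create_simple_workflow():
add_node("agent", lambda s: {**s, "needs_tool": "answer" not in s})
add_node("tools", lambda s: {**s, "answer": s["query"], "needs_tool": False})
add_edge("tools", "agent")
add_conditional_edge("agent", lambda s: "tools" if s["needs_tool"] else "END")

result = run("agent", {"query": "echo hello"})
```

The conditional edge is what lets the agent loop back through the tool node until no further tool call is needed (bounded in SimpleAgent by max_tool_iterations).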

dllmforge.agent_core.create_basic_agent(system_message: str = None, temperature: float = 0.1, model_provider: str = 'azure-openai') SimpleAgent[source]

Create a basic agent with standard setup.

Parameters:
  • system_message – System message for the agent

  • temperature – LLM temperature

  • model_provider – LLM provider (“azure-openai”, “openai”, “mistral”)

Returns:

Configured agent instance

Return type:

SimpleAgent

dllmforge.agent_core.create_echo_tool()[source]

Create a simple echo tool for testing.

dllmforge.agent_core.create_basic_tools() List[Callable][source]

Create basic utility tools for testing.

Returns:

List of tool functions

dllmforge.anthropic_api module

class dllmforge.anthropic_api.AnthropicAPI(api_key=None, model='claude-3-7-sonnet-20250219', deployment_claude37=None, deployment_claude35=None)[source]

Bases: object

Class to interact with Anthropic’s Claude API.

Initialize the Anthropic API client with configuration.

__init__(api_key=None, model='claude-3-7-sonnet-20250219', deployment_claude37=None, deployment_claude35=None)[source]

Initialize the Anthropic API client with configuration.

check_server_status()[source]

Check if the Anthropic API service is accessible.

list_available_models()[source]

List available models from Anthropic.

send_test_message(prompt='Hello, how are you?')[source]

Send a test message to the model and get a response.

chat_completion(messages, temperature=0.7, max_tokens=1000)[source]

Get a chat completion from the model.

dllmforge.langchain_api module

Create LLM objects and API calls with LangChain, including Azure and non-Azure models. We use OpenAI and Mistral models as examples. An overview of available LangChain chat models: https://python.langchain.com/docs/integrations/chat/

class dllmforge.langchain_api.LangchainAPI(model_provider: str = 'azure-openai', temperature: float = 0.1, api_key=None, api_base=None, api_version=None, deployment_name=None, model_name=None)[source]

Bases: object

Class to interact with various LLM providers using Langchain.

Initialize the Langchain API client with specified configuration.

Parameters:
  • model_provider (str) – Provider of model to use. Options are: - “azure-openai”: Use Azure OpenAI - “openai”: Use OpenAI - “mistral”: Use Mistral

  • temperature (float) – Temperature setting for the model (0.0 to 1.0)

  • api_key (str) – API key for the provider

  • api_base (str) – API base URL (for Azure)

  • api_version (str) – API version (for Azure)

  • deployment_name (str) – Deployment name (for Azure)

  • model_name (str) – Model name (for OpenAI/Mistral)

__init__(model_provider: str = 'azure-openai', temperature: float = 0.1, api_key=None, api_base=None, api_version=None, deployment_name=None, model_name=None)[source]

Initialize the Langchain API client with specified configuration.

Parameters:
  • model_provider (str) – Provider of model to use. Options are: - “azure-openai”: Use Azure OpenAI - “openai”: Use OpenAI - “mistral”: Use Mistral

  • temperature (float) – Temperature setting for the model (0.0 to 1.0)

  • api_key (str) – API key for the provider

  • api_base (str) – API base URL (for Azure)

  • api_version (str) – API version (for Azure)

  • deployment_name (str) – Deployment name (for Azure)

  • model_name (str) – Model name (for OpenAI/Mistral)

check_server_status()[source]

Check if the LLM service is accessible.

send_test_message(prompt='Hello, how are you?')[source]

Send a test message to the model and get a response.

Parameters:

prompt (str) – The prompt string to send.

Returns:

Dictionary containing the response and metadata.

Return type:

dict

chat_completion(messages, temperature=None, max_tokens=None)[source]

Get a chat completion from the model.

Parameters:
  • messages (list) – List of message tuples (role, content)

  • temperature (float) – Optional temperature override

  • max_tokens (int) – Optional max tokens override

Returns:

Dictionary containing the response and metadata.

Return type:

dict
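
The (role, content) tuple format for messages can be sketched as follows; the actual call is shown as a comment because it requires a configured provider and credentials:

```python
# Build a messages list in the (role, content) tuple format chat_completion expects.
messages = [
    ("system", "You are a helpful hydrology assistant."),
    ("human", "Summarize the key findings of the attached report."),
]

# With a configured client this would be sent as, e.g.:
#   api = LangchainAPI(model_provider="azure-openai")
#   response = api.chat_completion(messages, temperature=0.2, max_tokens=300)
```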

ask_with_retriever(question: str, retriever)[source]

Ask a question using the retriever to get context.

Parameters:
  • question (str) – The question to ask.

  • retriever – A RAG retriever object that can retrieve relevant context.

Returns:

The response from the LLM.

dllmforge.llamaindex_api module

Create LLM objects and API calls using llama_index, including Azure and non-Azure models. We use OpenAI and Mistral models as examples. An overview of available llama_index LLMs: https://docs.llamaindex.ai/en/stable/module_guides/models/llms/modules/

class dllmforge.llamaindex_api.LlamaIndexAPI(model_provider: str = 'azure-openai', temperature: float = 0.0, api_key=None, api_base=None, api_version=None, deployment_name=None, model_name=None)[source]

Bases: object

Class to interact with various LLM providers using LlamaIndex.

Initialize the LlamaIndex API client with specified configuration.

Parameters:
  • model_provider (str) – Provider of model to use. Options are: - “azure-openai”: Use Azure OpenAI - “openai”: Use OpenAI - “mistral”: Use Mistral

  • temperature (float) – Temperature setting for the model (0.0 to 1.0)

  • api_key (str) – API key for the provider

  • api_base (str) – API base URL (for Azure)

  • api_version (str) – API version (for Azure)

  • deployment_name (str) – Deployment name (for Azure)

  • model_name (str) – Model name (for OpenAI/Mistral)

__init__(model_provider: str = 'azure-openai', temperature: float = 0.0, api_key=None, api_base=None, api_version=None, deployment_name=None, model_name=None)[source]

Initialize the LlamaIndex API client with specified configuration.

Parameters:
  • model_provider (str) – Provider of model to use. Options are: - “azure-openai”: Use Azure OpenAI - “openai”: Use OpenAI - “mistral”: Use Mistral

  • temperature (float) – Temperature setting for the model (0.0 to 1.0)

  • api_key (str) – API key for the provider

  • api_base (str) – API base URL (for Azure)

  • api_version (str) – API version (for Azure)

  • deployment_name (str) – Deployment name (for Azure)

  • model_name (str) – Model name (for OpenAI/Mistral)

check_server_status()[source]

Check if the LLM service is accessible.

send_test_message(prompt='Hello, how are you?')[source]

Send a test message to the model and get a response.

Parameters:

prompt (str) – The prompt string to send.

Returns:

Dictionary containing the response and metadata.

Return type:

dict

chat_completion(messages, temperature=None, max_tokens=None)[source]

Get a chat completion from the model.

Parameters:
  • messages (list) – List of message dicts or tuples (role, content)

  • temperature (float) – Optional temperature override

  • max_tokens (int) – Optional max tokens override

Returns:

Dictionary containing the response and metadata.

Return type:

dict

dllmforge.openai_api module

class dllmforge.openai_api.OpenAIAPI(api_key=None, api_base=None, api_version=None, deployment_name='gpt-4o', embedding_deployment='text-embedding-3-large')[source]

Bases: object

Class to interact with Azure OpenAI API.

Initialize the OpenAI API client with Azure configuration.

__init__(api_key=None, api_base=None, api_version=None, deployment_name='gpt-4o', embedding_deployment='text-embedding-3-large')[source]

Initialize the OpenAI API client with Azure configuration.

check_server_status()[source]

Check if the Azure OpenAI service is accessible.

list_available_models()[source]

List available models from Azure OpenAI.

send_test_message(prompt='Hello, how are you?')[source]

Send a test message to the model and get a response.

get_embeddings(text)[source]

Get embeddings for the given text using Azure OpenAI.

chat_completion(messages, temperature=0.7, max_tokens=800)[source]

Get a chat completion from the model.

dllmforge.rag_embedding module

This module provides embedding functionality for RAG (Retrieval-Augmented Generation) pipelines. It can be used to 1) vectorize document chunks, and 2) vectorize user queries. The module uses an Azure OpenAI embedding model as an example of using hosted embedding APIs. Note you need an Azure OpenAI service and a deployed embedding model on Azure to use this module.

class dllmforge.rag_embedding.AzureOpenAIEmbeddingModel(model: str = 'text-embedding-3-large', api_base: str = None, deployment_name_embeddings: str = None, api_key: str = None, api_version: str = None)[source]

Bases: object

Class for embedding queries and document chunks using Azure OpenAI Embeddings.

Initialize the embedding model using provided arguments or environment variables for Azure OpenAI.

Parameters:
  • model – Name of the embedding model to use

  • api_base – Azure OpenAI API base URL

  • deployment_name_embeddings – Azure OpenAI deployment name for embeddings

  • api_key – Azure OpenAI API key

  • api_version – Azure OpenAI API version

__init__(model: str = 'text-embedding-3-large', api_base: str = None, deployment_name_embeddings: str = None, api_key: str = None, api_version: str = None)[source]

Initialize the embedding model using provided arguments or environment variables for Azure OpenAI.

Parameters:
  • model – Name of the embedding model to use

  • api_base – Azure OpenAI API base URL

  • deployment_name_embeddings – Azure OpenAI deployment name for embeddings

  • api_key – Azure OpenAI API key

  • api_version – Azure OpenAI API version

static validate_embedding(embedding: List[float]) bool[source]

Validate that the embedding is not empty.

static encode_filename(filename: str) str[source]

Encode filename to be safe for Azure Cognitive Search document keys.

embed(query_or_chunks: str | List[Dict[str, Any]]) List[float] | List[Dict[str, Any]][source]

Embed a single query string or a list of document chunks.

Parameters:

query_or_chunks – A string (query) or a list of dictionaries (document chunks). Each dictionary should have keys: “text”, “file_name”, “page_number”

Returns:

For a string query: list of floats (embedding vector). For document chunks: list of dictionaries with keys: “chunk_id”, “chunk”, “page_number”, “file_name”, “text_vector”

Return type:

List[float] | List[Dict[str, Any]]
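
The chunk transformation can be sketched with a stub embedding function; the toy vector and chunk_id scheme below are stand-ins for the real Azure OpenAI embedding and key encoding:

```python
from typing import Any, Dict, List

def stub_embed(text: str) -> List[float]:
    # Stand-in for a real embedding call; returns a deterministic toy vector.
    return [float(ord(c)) for c in text[:4]]

def vectorize_chunks(chunks: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    # Map input chunks ("text", "file_name", "page_number") to the documented
    # output shape ("chunk_id", "chunk", "page_number", "file_name", "text_vector").
    out = []
    for i, ch in enumerate(chunks):
        out.append({
            "chunk_id": f"{ch['file_name']}_{i}",  # hypothetical key scheme
            "chunk": ch["text"],
            "page_number": ch["page_number"],
            "file_name": ch["file_name"],
            "text_vector": stub_embed(ch["text"]),
        })
    return out

vectorized = vectorize_chunks(
    [{"text": "Dike safety norms", "file_name": "report.pdf", "page_number": 1}]
)
```

In the real class, encode_filename would be applied so the chunk key is safe for Azure Cognitive Search.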

dllmforge.rag_embedding_open_source module

This module provides embedding functionality for RAG (Retrieval-Augmented Generation) pipelines. It can be used to 1) vectorize document chunks, and 2) vectorize user queries. The module uses LangChain’s HuggingFaceEmbeddings with an open-source sentence-transformers model as an example of running embeddings locally, so no hosted embedding service is required.

class dllmforge.rag_embedding_open_source.LangchainHFEmbeddingModel(model_name: str = 'sentence-transformers/all-MiniLM-L6-v2')[source]

Bases: object

Class for embedding queries and document chunks using LangChain’s HuggingFaceEmbeddings.

Initialize the HuggingFaceEmbeddings from LangChain.

Parameters:

model_name – Name or path of the Hugging Face model (default: “sentence-transformers/all-MiniLM-L6-v2”).

__init__(model_name: str = 'sentence-transformers/all-MiniLM-L6-v2')[source]

Initialize the HuggingFaceEmbeddings from LangChain.

Parameters:

model_name – Name or path of the Hugging Face model (default: “sentence-transformers/all-MiniLM-L6-v2”).

static validate_embedding(embedding: List[float]) bool[source]

Validate that the embedding vector is non-empty and numeric.

embed(query_or_chunks: str | List[Dict[str, Any]]) List[float] | List[Dict[str, Any]][source]

Embed a single query string or a list of document chunks.

Parameters:

query_or_chunks – A string (query) or list of dicts with keys: “text”, “file_name”, “page_number”.

Returns:

For a string query: list of floats (vector embedding). For document chunks: list of dicts with keys: “chunk_id”, “chunk”, “page_number”, “file_name”, “text_vector”.

Return type:

List[float] | List[Dict[str, Any]]

dllmforge.rag_evaluation module

RAGAS Evaluation Module for DLLMForge

This module provides comprehensive evaluation metrics for RAG (Retrieval-Augmented Generation) pipelines using RAGAS-inspired metrics without requiring external dashboards or services.

The module evaluates four key aspects of RAG systems: 1. Context Precision - measures how well relevant retrieved chunks are ranked at the top of the results 2. Context Recall - measures the ability to retrieve all necessary information 3. Faithfulness - measures factual accuracy and absence of hallucinations 4. Answer Relevancy - measures how relevant and to-the-point answers are

All evaluations are performed using LLMs to provide human-like assessment without requiring annotated datasets.

class dllmforge.rag_evaluation.EvaluationResult(metric_name: str, score: float, explanation: str, details: Dict[str, Any])[source]

Bases: object

Container for evaluation results.

metric_name: str
score: float
explanation: str
details: Dict[str, Any]
__init__(metric_name: str, score: float, explanation: str, details: Dict[str, Any]) None
class dllmforge.rag_evaluation.RAGEvaluationResult(context_precision: EvaluationResult, context_recall: EvaluationResult, faithfulness: EvaluationResult, answer_relevancy: EvaluationResult, ragas_score: float, evaluation_time: float, metadata: Dict[str, Any])[source]

Bases: object

Container for complete RAG evaluation results.

context_precision: EvaluationResult
context_recall: EvaluationResult
faithfulness: EvaluationResult
answer_relevancy: EvaluationResult
ragas_score: float
evaluation_time: float
metadata: Dict[str, Any]
__init__(context_precision: EvaluationResult, context_recall: EvaluationResult, faithfulness: EvaluationResult, answer_relevancy: EvaluationResult, ragas_score: float, evaluation_time: float, metadata: Dict[str, Any]) None
class dllmforge.rag_evaluation.RAGEvaluator(llm_provider: str = 'auto', deltares_llm: DeltaresOllamaLLM | None = None, temperature: float = 0.1, api_key: str | None = None, api_base: str | None = None, api_version: str | None = None, deployment_name: str | None = None, model_name: str | None = None)[source]

Bases: object

RAGAS-inspired evaluator for RAG pipelines.

This evaluator provides four key metrics: - Context Precision: Measures how well relevant retrieved chunks are ranked at the top of the results - Context Recall: Measures the ability to retrieve all necessary information - Faithfulness: Measures factual accuracy and absence of hallucinations - Answer Relevancy: Measures how relevant and to-the-point answers are

Initialize the RAG evaluator.

Parameters:

llm_provider – LLM provider to use (“openai”, “anthropic”, “deltares” or “auto”)

__init__(llm_provider: str = 'auto', deltares_llm: DeltaresOllamaLLM | None = None, temperature: float = 0.1, api_key: str | None = None, api_base: str | None = None, api_version: str | None = None, deployment_name: str | None = None, model_name: str | None = None)[source]

Initialize the RAG evaluator.

Parameters:

llm_provider – LLM provider to use (“openai”, “anthropic”, “deltares” or “auto”)

evaluate_context_relevancy(question: str, retrieved_contexts: List[str]) EvaluationResult[source]

Evaluate the relevancy of retrieved contexts to the question.

This metric measures the signal-to-noise ratio in the retrieved contexts. It identifies which sentences from the context are actually needed to answer the question.

Parameters:
  • question – The user’s question

  • retrieved_contexts – List of retrieved context chunks

Returns:

EvaluationResult with score and explanation

evaluate_context_precision(question: str, retrieved_contexts: List[str], ground_truth_answer: str | None, top_k: int = 5) EvaluationResult[source]

Evaluate Context Precision@k following the Ragas implementation.

For each of the top-k retrieved chunks, the LLM judges whether the chunk supports the reference answer. Average Precision (AP) is then computed as:

AP = sum(Precision@i * rel_i) / (# relevant chunks)

Parameters:
  • question – The question to evaluate.

  • retrieved_contexts – Ranked list of retrieved chunks.

  • ground_truth_answer – The correct or gold answer that chunks are judged against.

  • top_k – Number of top chunks to evaluate.

Returns:

EvaluationResult with precision@k score and explanation.
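
The AP formula can be sketched directly. Here rel is a 0/1 relevance judgment per ranked chunk; in the evaluator that judgment comes from the LLM rather than a fixed list:

```python
from typing import List

def average_precision(relevance: List[int]) -> float:
    """AP = sum(Precision@i * rel_i) / (# relevant chunks) over the ranked list."""
    hits = 0
    ap_sum = 0.0
    for i, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            ap_sum += hits / i  # Precision@i, counted only at relevant positions
    return ap_sum / hits if hits else 0.0

# Relevant chunks at ranks 1 and 3 of the top-4 retrieved chunks:
score = average_precision([1, 0, 1, 0])  # (1/1 + 2/3) / 2 = 5/6
```

Note that AP rewards ranking relevant chunks early: [1, 1, 0, 0] scores 1.0 while [0, 0, 1, 1] scores lower, even though both retrieve two relevant chunks.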

evaluate_context_recall(question: str, retrieved_contexts: List[str], ground_truth_answer: str) EvaluationResult[source]

Evaluate the recall of retrieved contexts against a ground truth answer. This metric measures the ability of the retriever to retrieve all necessary information needed to answer the question by checking if each statement from the ground truth can be found in the retrieved context.

Parameters:
  • question – The user’s question

  • retrieved_contexts – List of retrieved context chunks

  • ground_truth_answer – The reference answer to compare against

Returns:

EvaluationResult with score and explanation

evaluate_faithfulness(question: str, generated_answer: str, retrieved_contexts: List[str]) EvaluationResult[source]

Evaluate the faithfulness of the generated answer to the retrieved contexts.

This metric measures the factual accuracy of the generated answer by checking if all statements in the answer are supported by the retrieved contexts.

Parameters:
  • question – The user’s question

  • generated_answer – The answer generated by the RAG system

  • retrieved_contexts – List of retrieved context chunks

Returns:

EvaluationResult with score and explanation

evaluate_answer_relevancy(question: str, generated_answer: str) EvaluationResult[source]

Evaluate the relevancy of the generated answer to the question. This metric measures how relevant and to-the-point the answer is by generating probable questions that the answer could answer and computing similarity to the actual question.

Parameters:
  • question – The user’s question

  • generated_answer – The answer generated by the RAG system

Returns:

EvaluationResult with score and explanation
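
The similarity step can be sketched with cosine similarity between embedding vectors. The vectors below are toy values; the evaluator would embed the actual question and the generated candidate questions:

```python
import math
from typing import List

def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Answer relevancy idea: embed the user's question and the questions the
# answer could plausibly be answering, then average the similarities.
original = [1.0, 0.0, 1.0]                        # toy embedding of the question
candidates = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]   # toy embeddings of generated questions
relevancy = sum(cosine_similarity(original, c) for c in candidates) / len(candidates)
```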

calculate_ragas_score(context_precision: float, context_recall: float, faithfulness: float, answer_relevancy: float) float[source]

Calculate the RAGAS score as the harmonic mean of all four metrics.

Parameters:
  • context_precision – Context precision score

  • context_recall – Context recall score

  • faithfulness – Faithfulness score

  • answer_relevancy – Answer relevancy score

Returns:

RAGAS score (harmonic mean)
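
The harmonic mean penalizes a single weak metric more sharply than an arithmetic mean would, so one failing aspect drags down the overall score. A sketch:

```python
def ragas_score(context_precision: float, context_recall: float,
                faithfulness: float, answer_relevancy: float) -> float:
    """Harmonic mean of the four metrics; 0 if any metric is 0."""
    scores = [context_precision, context_recall, faithfulness, answer_relevancy]
    if any(s == 0 for s in scores):
        return 0.0  # harmonic mean is undefined/zero when any term is zero
    return len(scores) / sum(1 / s for s in scores)
```

For example, scores of (1.0, 1.0, 1.0, 0.25) give a harmonic mean of about 0.57, while their arithmetic mean would be 0.81.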

evaluate_rag_pipeline(question: str, generated_answer: str, retrieved_contexts: List[str], ground_truth_answer: str | None = None) RAGEvaluationResult[source]

Evaluate a complete RAG pipeline using all four metrics.

Parameters:
  • question – The user’s question

  • generated_answer – The answer generated by the RAG system

  • retrieved_contexts – List of retrieved context chunks

  • ground_truth_answer – Optional ground truth answer for context recall evaluation

Returns:

Complete evaluation results

print_evaluation_summary(result: RAGEvaluationResult)[source]

Print a formatted summary of the evaluation results.

Parameters:

result – The evaluation results to summarize

save_evaluation_results(result: RAGEvaluationResult, output_file: str)[source]

Save evaluation results to a JSON file.

Parameters:
  • result – The evaluation results to save

  • output_file – Path to the output JSON file

dllmforge.rag_evaluation.evaluate_rag_response(question: str, generated_answer: str, retrieved_contexts: List[str], ground_truth_answer: str | None = None, llm_provider: str = 'auto', save_results: bool = True, output_file: str | None = None) RAGEvaluationResult[source]

Convenience function to evaluate a RAG response.

Parameters:
  • question – The user’s question

  • generated_answer – The answer generated by the RAG system

  • retrieved_contexts – List of retrieved context chunks

  • ground_truth_answer – Optional ground truth answer for context recall evaluation

  • llm_provider – LLM provider to use (“openai”, “anthropic”, “deltares”, or “auto”)

  • save_results – Whether to save results to a file

  • output_file – Optional output file path

Returns:

Complete evaluation results

dllmforge.rag_preprocess_documents module

This module provides document preprocessing functionality for RAG (Retrieval-Augmented Generation) pipelines. It includes document loading and text chunking for PDF files.

class dllmforge.rag_preprocess_documents.DocumentLoader[source]

Bases: ABC

Abstract base class for document loaders.

abstract load(file_path: Path) List[Tuple[int, str]][source]

Load a document and return its contents as a list of (page_number, text) tuples.

Parameters:

file_path – Path to the document file

Returns:

List of tuples containing (page_number, text) pairs

class dllmforge.rag_preprocess_documents.PDFLoader[source]

Bases: DocumentLoader

Loader for PDF documents using PyPDF2.

load(file_path: Path) Tuple[List[Tuple[int, str]], str][source]

Load a PDF document and extract text from its pages.

Parameters:

file_path – Path to the PDF file

Returns:

Tuple containing (pages_with_text, file_name) where pages_with_text is a list of (page_number, text) pairs

class dllmforge.rag_preprocess_documents.TextChunker(chunk_size: int = 1000, overlap_size: int = 200)[source]

Bases: object

Class for chunking text into smaller segments with overlap. For detailed information about chunking strategies in RAG applications, including: - Why chunking is important - How to choose chunk size and overlap - Different splitting techniques - Evaluation methods See: https://www.mongodb.com/developer/products/atlas/choosing-chunking-strategy-rag/

Initialize the TextChunker.

Parameters:
  • chunk_size – Maximum size of each chunk in characters

  • overlap_size – Number of characters to overlap between chunks (recommended: 5-20% of chunk_size)

__init__(chunk_size: int = 1000, overlap_size: int = 200)[source]

Initialize the TextChunker.

Parameters:
  • chunk_size – Maximum size of each chunk in characters

  • overlap_size – Number of characters to overlap between chunks (recommended: 5-20% of chunk_size)

chunk_text(pages_with_text: List[Tuple[int, str]], file_name: str = None, metadata: dict = None) List[Dict[str, Any]][source]

Split text into chunks while preserving sentence boundaries.

Parameters:
  • pages_with_text – List of tuples containing (page_number, text) pairs

  • file_name – Name of the source file (optional)

  • metadata – Metadata information extracted from the document (optional)

Returns:

List of dictionaries, each of the form:

{

‘text’: str, # The chunk text

‘page_number’: int, # Source page number

‘chunk_index’: int, # Index of the chunk

‘total_chunks’: int, # Total number of chunks from this document

‘file_name’: str # Name of the source file

}

Return type:

List[Dict[str, Any]]
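
A minimal character-window sketch of chunking with overlap; the real TextChunker additionally preserves sentence boundaries, so boundaries here are illustrative only:

```python
from typing import Any, Dict, List, Tuple

def chunk_pages(pages: List[Tuple[int, str]], chunk_size: int = 1000,
                overlap_size: int = 200, file_name: str = "") -> List[Dict[str, Any]]:
    # Slide a fixed window over each page's text, stepping by
    # (chunk_size - overlap_size) so consecutive chunks share overlap_size chars.
    chunks: List[Dict[str, Any]] = []
    for page_number, text in pages:
        start = 0
        while start < len(text):
            chunks.append({"text": text[start:start + chunk_size],
                           "page_number": page_number,
                           "file_name": file_name})
            start += chunk_size - overlap_size
    for i, ch in enumerate(chunks):
        ch["chunk_index"] = i
        ch["total_chunks"] = len(chunks)
    return chunks

parts = chunk_pages([(1, "a" * 2500)], chunk_size=1000, overlap_size=200,
                    file_name="doc.pdf")  # 4 chunks: starts at 0, 800, 1600, 2400
```

The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk, which is why 5-20% overlap is recommended.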

dllmforge.rag_search_and_response module

This module provides “create index/vector-database”, “search” and “response” functionality for RAG (Retrieval-Augmented Generation) pipelines. Three steps are involved: 1. Create index/vector-database: create an index/vector-database on the Azure AI Search service. 2. Search: use the Azure AI Search service to retrieve relevant chunks from the vector database. 3. Response: use LLMs to generate a response to the user query based on the retrieved chunks. The module uses the Azure AI Search service and the Azure OpenAI service as examples of hosted search and LLM APIs. Note you need an Azure AI Search service, an Azure OpenAI service and a deployed LLM model on Azure to use this module.

The example demonstrates the whole pipeline of RAG, including: 1. Preprocess the documents to chunks. 2. Vectorize the chunks. 3. Create vector index and store the chunks in the vector database. 4. Search the vector database for relevant chunks. 5. Generate a response to the user query based on the retrieved chunks.

class dllmforge.rag_search_and_response.IndexManager(search_client_endpoint=None, search_api_key=None, index_name=None, embedding_dim=None)[source]

Bases: object

__init__(search_client_endpoint=None, search_api_key=None, index_name=None, embedding_dim=None)[source]
create_index(api_base=None, deployment_name_embeddings=None, api_key=None)[source]
upload_documents(vectorized_chunks)[source]
class dllmforge.rag_search_and_response.Retriever(embedding_model, index_name=None, search_client_endpoint=None, search_api_key=None)[source]

Bases: object

__init__(embedding_model, index_name=None, search_client_endpoint=None, search_api_key=None)[source]
get_embeddings(text)[source]
invoke(query_text, top_k=5)[source]
class dllmforge.rag_search_and_response.LLMResponder(llm)[source]

Bases: object

__init__(llm)[source]
augment_prompt_with_context(query_text, chunks)[source]
generate(query_text, retrieved_chunks)[source]
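
A sketch of what augment_prompt_with_context typically does: interleave the retrieved chunks with source citations ahead of the user question. The exact prompt template used by LLMResponder may differ:

```python
from typing import List

def augment_prompt_with_context(query_text: str, chunks: List[dict]) -> str:
    # Typical RAG prompt assembly: cited context first, then the question.
    context = "\n\n".join(
        f"[{c['file_name']}, p.{c['page_number']}] {c['chunk']}" for c in chunks
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query_text}"
    )

prompt = augment_prompt_with_context(
    "What is the design water level?",
    [{"file_name": "norms.pdf", "page_number": 12,
      "chunk": "The design water level for this section is 5 m."}],
)
```

Keeping file name and page number in the prompt lets the LLM cite its sources in the generated response.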

Module contents

DLLMForge - Deltares LLM Forge Toolkit

A comprehensive toolkit for building and deploying LLM-based applications with RAG capabilities, agentic workflows, and enterprise-grade features.

class dllmforge.SimpleAgent(system_message: str = None, temperature: float = 0.1, model_provider: str = 'azure-openai', llm=None, enable_text_tool_routing: bool = False, max_tool_iterations: int = 3)[source]

Bases: object

Simple agent class for LangGraph workflows.

Initialize a simple LangGraph agent.

Parameters:
  • system_message – System message for the agent

  • temperature – LLM temperature setting

  • model_provider – LLM provider (“azure-openai”, “openai”, “mistral”)

  • llm – Optional pre-configured LLM object to use instead of creating one from model_provider

  • enable_text_tool_routing – Whether to enable text-based tool routing in the workflow

  • max_tool_iterations – Maximum number of tool-call iterations

__init__(system_message: str = None, temperature: float = 0.1, model_provider: str = 'azure-openai', llm=None, enable_text_tool_routing: bool = False, max_tool_iterations: int = 3)[source]

Initialize a simple LangGraph agent.

Parameters:
  • system_message – System message for the agent

  • temperature – LLM temperature setting

  • model_provider – LLM provider (“azure-openai”, “openai”, “mistral”)

  • llm – Optional pre-configured LLM object to use instead of creating one from model_provider

  • enable_text_tool_routing – Whether to enable text-based tool routing in the workflow

  • max_tool_iterations – Maximum number of tool-call iterations

add_tool(tool_func: Callable) None[source]

Add a tool to the agent.

Parameters:

tool_func – Function decorated with @tool

add_node(name: str, func: Callable) None[source]

Add a node to the workflow.

Parameters:
  • name – Node name

  • func – Node function

add_edge(from_node: str, to_node: str) None[source]

Add a simple edge between nodes.

Parameters:
  • from_node – Source node

  • to_node – Target node

add_conditional_edge(from_node: str, condition_func: Callable) None[source]

Add a conditional edge.

Parameters:
  • from_node – Source node

  • condition_func – Function that determines routing

create_simple_workflow() None[source]

Create a simple agent -> tools workflow with optional text-based tool routing.

compile(checkpointer=None) None[source]

Compile the workflow.

process_query(query: str, stream: bool = True) None[source]

Process a query with the agent.

Parameters:
  • query – User query

  • stream – Whether to stream the response

run_interactive() None[source]

Run the agent in interactive mode.

dllmforge.create_basic_agent(system_message: str = None, temperature: float = 0.1, model_provider: str = 'azure-openai') SimpleAgent[source]

Create a basic agent with standard setup.

Parameters:
  • system_message – System message for the agent

  • temperature – LLM temperature

  • model_provider – LLM provider (“azure-openai”, “openai”, “mistral”)

Returns:

Configured agent instance

Return type:

SimpleAgent

dllmforge.create_basic_tools() List[Callable][source]

Create basic utility tools for testing.

Returns:

List of tool functions

class dllmforge.AzureOpenAIEmbeddingModel(model: str = 'text-embedding-3-large', api_base: str = None, deployment_name_embeddings: str = None, api_key: str = None, api_version: str = None)[source]

Bases: object

Class for embedding queries and document chunks using Azure OpenAI Embeddings.

Initialize the embedding model using provided arguments or environment variables for Azure OpenAI.

Parameters:
  • model – Name of the embedding model to use

  • api_base – Azure OpenAI API base URL

  • deployment_name_embeddings – Azure OpenAI deployment name for embeddings

  • api_key – Azure OpenAI API key

  • api_version – Azure OpenAI API version

__init__(model: str = 'text-embedding-3-large', api_base: str = None, deployment_name_embeddings: str = None, api_key: str = None, api_version: str = None)[source]

Initialize the embedding model using provided arguments or environment variables for Azure OpenAI.

Parameters:
  • model – Name of the embedding model to use

  • api_base – Azure OpenAI API base URL

  • deployment_name_embeddings – Azure OpenAI deployment name for embeddings

  • api_key – Azure OpenAI API key

  • api_version – Azure OpenAI API version

static validate_embedding(embedding: List[float]) bool[source]

Validate that the embedding is not empty.

static encode_filename(filename: str) str[source]

Encode filename to be safe for Azure Cognitive Search document keys.

embed(query_or_chunks: str | List[Dict[str, Any]]) List[float] | List[Dict[str, Any]][source]

Embed a single query string or a list of document chunks.

Parameters:

query_or_chunks – A string (query) or a list of dictionaries (document chunks). Each dictionary should have keys: “text”, “file_name”, “page_number”

Returns:

For a string query: list of floats (embedding vector). For document chunks: list of dictionaries with keys: “chunk_id”, “chunk”, “page_number”, “file_name”, “text_vector”

Return type:

List[float] | List[Dict[str, Any]]

class dllmforge.PDFLoader[source]

Bases: DocumentLoader

Loader for PDF documents using PyPDF2.

load(file_path: Path) Tuple[List[Tuple[int, str]], str][source]

Load a PDF document and extract text from its pages.

Parameters:

file_path – Path to the PDF file

Returns:

Tuple containing (pages_with_text, file_name) where pages_with_text is a list of (page_number, text) pairs

class dllmforge.TextChunker(chunk_size: int = 1000, overlap_size: int = 200)[source]

Bases: object

Class for chunking text into smaller segments with overlap.

For detailed information about chunking strategies in RAG applications, including:

  • Why chunking is important

  • How to choose chunk size and overlap

  • Different splitting techniques

  • Evaluation methods

see: https://www.mongodb.com/developer/products/atlas/choosing-chunking-strategy-rag/

Initialize the TextChunker.

Parameters:
  • chunk_size – Maximum size of each chunk in characters

  • overlap_size – Number of characters to overlap between chunks (recommended: 5-20% of chunk_size)

__init__(chunk_size: int = 1000, overlap_size: int = 200)[source]

Initialize the TextChunker.

Parameters:
  • chunk_size – Maximum size of each chunk in characters

  • overlap_size – Number of characters to overlap between chunks (recommended: 5-20% of chunk_size)

chunk_text(pages_with_text: List[Tuple[int, str]], file_name: str = None, metadata: dict = None) List[Dict[str, Any]][source]

Split text into chunks while preserving sentence boundaries.

Parameters:
  • pages_with_text – List of tuples containing (page_number, text) pairs

  • file_name – Name of the source file (optional)

  • metadata – Metadata information extracted from the document (optional)

Returns:

List of dictionaries, each of the form:

{
    ‘text’: str,          # The chunk text
    ‘page_number’: int,   # Source page number
    ‘chunk_index’: int,   # Index of the chunk
    ‘total_chunks’: int,  # Total number of chunks from this document
    ‘file_name’: str      # Name of the source file
}
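The core overlap mechanism can be sketched in a few lines. This is a simplified, hypothetical illustration of character-based chunking with overlap; the real `TextChunker` additionally preserves sentence boundaries and attaches per-page metadata.

```python
from typing import List

def chunk_with_overlap(text: str, chunk_size: int = 1000,
                       overlap_size: int = 200) -> List[str]:
    """Sketch of sliding-window chunking: each new chunk starts
    overlap_size characters before the previous chunk ended."""
    chunks, start = [], 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap_size  # step back to create the overlap
    return chunks

parts = chunk_with_overlap("abcdefghij" * 30, chunk_size=100, overlap_size=20)
print(len(parts), len(parts[0]))
```

Note how the tail of each chunk is repeated at the head of the next, which helps prevent a sentence straddling a chunk boundary from being lost to retrieval.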

class dllmforge.IndexManager(search_client_endpoint=None, search_api_key=None, index_name=None, embedding_dim=None)[source]

Bases: object

__init__(search_client_endpoint=None, search_api_key=None, index_name=None, embedding_dim=None)[source]
create_index(api_base=None, deployment_name_embeddings=None, api_key=None)[source]
upload_documents(vectorized_chunks)[source]
class dllmforge.Retriever(embedding_model, index_name=None, search_client_endpoint=None, search_api_key=None)[source]

Bases: object

__init__(embedding_model, index_name=None, search_client_endpoint=None, search_api_key=None)[source]
get_embeddings(text)[source]
invoke(query_text, top_k=5)[source]
class dllmforge.LLMResponder(llm)[source]

Bases: object

__init__(llm)[source]
augment_prompt_with_context(query_text, chunks)[source]
generate(query_text, retrieved_chunks)[source]
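The `augment_prompt_with_context` step can be illustrated with a minimal sketch. This is a hypothetical prompt template, not the one `LLMResponder` actually uses; it assumes chunks shaped as the dictionaries produced by `TextChunker`.

```python
from typing import Any, Dict, List

def augment_prompt_with_context(query_text: str,
                                chunks: List[Dict[str, Any]]) -> str:
    """Sketch of RAG prompt augmentation: prepend the retrieved
    chunks, with their provenance, as a context block before the
    user's question."""
    context = "\n\n".join(
        f"[{c['file_name']} p.{c['page_number']}] {c['text']}" for c in chunks
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query_text}"
    )

prompt = augment_prompt_with_context(
    "What is the safe water level?",
    [{"file_name": "dike.pdf", "page_number": 3,
      "text": "The safe level is NAP +2 m."}],
)
print(prompt.splitlines()[0])
```

Including the file name and page number in the context block makes it easy for the LLM to cite its sources in the generated answer.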
class dllmforge.RAGEvaluator(llm_provider: str = 'auto', deltares_llm: DeltaresOllamaLLM | None = None, temperature: float = 0.1, api_key: str | None = None, api_base: str | None = None, api_version: str | None = None, deployment_name: str | None = None, model_name: str | None = None)[source]

Bases: object

RAGAS-inspired evaluator for RAG pipelines.

This evaluator provides four key metrics:

  • Context Precision: Measures how well the relevant retrieved chunks are ranked (precision@k against a reference answer)

  • Context Recall: Measures the ability to retrieve all necessary information

  • Faithfulness: Measures factual accuracy and absence of hallucinations

  • Answer Relevancy: Measures how relevant and to-the-point answers are

Initialize the RAG evaluator.

Parameters:

llm_provider – LLM provider to use (“openai”, “anthropic”, “deltares” or “auto”)

__init__(llm_provider: str = 'auto', deltares_llm: DeltaresOllamaLLM | None = None, temperature: float = 0.1, api_key: str | None = None, api_base: str | None = None, api_version: str | None = None, deployment_name: str | None = None, model_name: str | None = None)[source]

Initialize the RAG evaluator.

Parameters:

llm_provider – LLM provider to use (“openai”, “anthropic”, “deltares” or “auto”)

evaluate_context_relevancy(question: str, retrieved_contexts: List[str]) EvaluationResult[source]

Evaluate the relevancy of retrieved contexts to the question.

This metric measures the signal-to-noise ratio in the retrieved contexts. It identifies which sentences from the context are actually needed to answer the question.

Parameters:
  • question – The user’s question

  • retrieved_contexts – List of retrieved context chunks

Returns:

EvaluationResult with score and explanation

evaluate_context_precision(question: str, retrieved_contexts: List[str], ground_truth_answer: str | None, top_k: int = 5) EvaluationResult[source]

Evaluate Context Precision@k following the Ragas implementation.

For each of the top-k retrieved chunks, the LLM judges whether the chunk supports the reference answer. Average Precision (AP) is then computed as:

AP = sum(Precision@i * rel_i) / (# relevant chunks)

Parameters:
  • question – The question to evaluate.

  • retrieved_contexts – Ranked list of retrieved chunks.

  • ground_truth_answer – The correct or gold (reference) answer.

  • top_k – Number of top chunks to evaluate.

Returns:

EvaluationResult with precision@k score and explanation.
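The Average Precision formula above can be computed directly once the per-chunk relevance judgements are available. A minimal sketch (the judgements themselves come from the LLM in the real evaluator):

```python
from typing import List

def average_precision(relevance: List[int]) -> float:
    """AP = sum(Precision@i * rel_i) / (# relevant chunks), where
    relevance is a ranked 0/1 list (1 = chunk supports the answer)."""
    hits, score = 0, 0.0
    for i, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            score += hits / i  # Precision@i, only counted at relevant ranks
    return score / hits if hits else 0.0

# Chunks at ranks 1 and 3 judged relevant, rank 2 not:
# AP = (1/1 + 2/3) / 2 = 5/6
print(average_precision([1, 0, 1]))
```

Because Precision@i is only accumulated at relevant ranks, AP rewards rankings that place relevant chunks near the top.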

evaluate_context_recall(question: str, retrieved_contexts: List[str], ground_truth_answer: str) EvaluationResult[source]

Evaluate the recall of retrieved contexts against a ground truth answer. This metric measures the ability of the retriever to retrieve all necessary information needed to answer the question by checking if each statement from the ground truth can be found in the retrieved context.

Parameters:
  • question – The user’s question

  • retrieved_contexts – List of retrieved context chunks

  • ground_truth_answer – The reference answer to compare against

Returns:

EvaluationResult with score and explanation
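The recall computation described above reduces to a simple ratio once the ground truth has been decomposed into statements and each statement judged as found or not found in the context. A sketch, with the per-statement judgements (produced by the LLM in the real evaluator) passed in as booleans:

```python
from typing import List

def context_recall(ground_truth_statements: List[str],
                   supported_flags: List[bool]) -> float:
    """Fraction of ground-truth statements attributable to the
    retrieved context. supported_flags[i] says whether statement i
    was found in the context."""
    if not ground_truth_statements:
        return 0.0
    supported = sum(1 for flag in supported_flags if flag)
    return supported / len(ground_truth_statements)

# 2 of 3 ground-truth statements are found in the retrieved context:
print(context_recall(["s1", "s2", "s3"], [True, True, False]))
```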

evaluate_faithfulness(question: str, generated_answer: str, retrieved_contexts: List[str]) EvaluationResult[source]

Evaluate the faithfulness of the generated answer to the retrieved contexts.

This metric measures the factual accuracy of the generated answer by checking if all statements in the answer are supported by the retrieved contexts.

Parameters:
  • question – The user’s question

  • generated_answer – The answer generated by the RAG system

  • retrieved_contexts – List of retrieved context chunks

Returns:

EvaluationResult with score and explanation

evaluate_answer_relevancy(question: str, generated_answer: str) EvaluationResult[source]

Evaluate the relevancy of the generated answer to the question.

This metric measures how relevant and to-the-point the answer is by generating probable questions that the answer could answer and computing similarity to the actual question.

Parameters:
  • question – The user’s question

  • generated_answer – The answer generated by the RAG system

Returns:

EvaluationResult with score and explanation

calculate_ragas_score(context_precision: float, context_recall: float, faithfulness: float, answer_relevancy: float) float[source]

Calculate the RAGAS score as the harmonic mean of all four metrics.

Parameters:
  • context_precision – Context precision score

  • context_recall – Context recall score

  • faithfulness – Faithfulness score

  • answer_relevancy – Answer relevancy score

Returns:

RAGAS score (harmonic mean)
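The harmonic mean of the four metrics can be sketched as follows; treating any zero metric as a zero overall score is an assumption made here to keep the computation well-defined, and may differ from the library's handling.

```python
def ragas_score(context_precision: float, context_recall: float,
                faithfulness: float, answer_relevancy: float) -> float:
    """Harmonic mean of the four RAG metrics. The harmonic mean is
    dominated by the weakest metric, so a pipeline cannot score well
    by excelling at only one dimension."""
    scores = [context_precision, context_recall, faithfulness, answer_relevancy]
    if any(s == 0 for s in scores):
        return 0.0  # harmonic mean is undefined at zero; floor it
    return len(scores) / sum(1.0 / s for s in scores)

print(round(ragas_score(0.8, 0.9, 1.0, 0.7), 4))
```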

evaluate_rag_pipeline(question: str, generated_answer: str, retrieved_contexts: List[str], ground_truth_answer: str | None = None) RAGEvaluationResult[source]

Evaluate a complete RAG pipeline using all four metrics.

Parameters:
  • question – The user’s question

  • generated_answer – The answer generated by the RAG system

  • retrieved_contexts – List of retrieved context chunks

  • ground_truth_answer – Optional ground truth answer for context recall evaluation

Returns:

Complete evaluation results

print_evaluation_summary(result: RAGEvaluationResult)[source]

Print a formatted summary of the evaluation results.

Parameters:

result – The evaluation results to summarize

save_evaluation_results(result: RAGEvaluationResult, output_file: str)[source]

Save evaluation results to a JSON file.

Parameters:
  • result – The evaluation results to save

  • output_file – Path to the output JSON file

class dllmforge.AnthropicAPI(api_key=None, model='claude-3-7-sonnet-20250219', deployment_claude37=None, deployment_claude35=None)[source]

Bases: object

Class to interact with Anthropic’s Claude API.

Initialize the Anthropic API client with configuration.

__init__(api_key=None, model='claude-3-7-sonnet-20250219', deployment_claude37=None, deployment_claude35=None)[source]

Initialize the Anthropic API client with configuration.

check_server_status()[source]

Check if the Anthropic API service is accessible.

list_available_models()[source]

List available models from Anthropic.

send_test_message(prompt='Hello, how are you?')[source]

Send a test message to the model and get a response.

chat_completion(messages, temperature=0.7, max_tokens=1000)[source]

Get a chat completion from the model.

class dllmforge.LlamaIndexAPI(model_provider: str = 'azure-openai', temperature: float = 0.0, api_key=None, api_base=None, api_version=None, deployment_name=None, model_name=None)[source]

Bases: object

Class to interact with various LLM providers using LlamaIndex.

Initialize the LlamaIndex API client with specified configuration.

Parameters:
  • model_provider (str) – Provider of the model to use. Options are: “azure-openai” (Azure OpenAI), “openai” (OpenAI), “mistral” (Mistral)

  • temperature (float) – Temperature setting for the model (0.0 to 1.0)

  • api_key (str) – API key for the provider

  • api_base (str) – API base URL (for Azure)

  • api_version (str) – API version (for Azure)

  • deployment_name (str) – Deployment name (for Azure)

  • model_name (str) – Model name (for OpenAI/Mistral)

__init__(model_provider: str = 'azure-openai', temperature: float = 0.0, api_key=None, api_base=None, api_version=None, deployment_name=None, model_name=None)[source]

Initialize the LlamaIndex API client with specified configuration.

Parameters:
  • model_provider (str) – Provider of the model to use. Options are: “azure-openai” (Azure OpenAI), “openai” (OpenAI), “mistral” (Mistral)

  • temperature (float) – Temperature setting for the model (0.0 to 1.0)

  • api_key (str) – API key for the provider

  • api_base (str) – API base URL (for Azure)

  • api_version (str) – API version (for Azure)

  • deployment_name (str) – Deployment name (for Azure)

  • model_name (str) – Model name (for OpenAI/Mistral)

check_server_status()[source]

Check if the LLM service is accessible.

send_test_message(prompt='Hello, how are you?')[source]

Send a test message to the model and get a response.

Parameters:

prompt (str) – The prompt string to send.

Returns:

Dictionary containing the response and metadata.

Return type:

dict

chat_completion(messages, temperature=None, max_tokens=None)[source]

Get a chat completion from the model.

Parameters:
  • messages (list) – List of message dicts or tuples (role, content)

  • temperature (float) – Optional temperature override

  • max_tokens (int) – Optional max tokens override

Returns:

Dictionary containing the response and metadata.

Return type:

dict
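Since `chat_completion` accepts both message dicts and `(role, content)` tuples, a caller (or the client internally) needs to normalize them to one shape. A hedged sketch of such a helper (hypothetical, not part of the dllmforge API):

```python
from typing import Dict, List, Tuple, Union

Message = Union[Dict[str, str], Tuple[str, str]]

def normalize_messages(messages: List[Message]) -> List[Dict[str, str]]:
    """Convert a mixed list of (role, content) tuples and
    {'role': ..., 'content': ...} dicts into uniform dicts, the
    shape most chat-completion APIs expect."""
    out = []
    for m in messages:
        if isinstance(m, dict):
            out.append({"role": m["role"], "content": m["content"]})
        else:
            role, content = m
            out.append({"role": role, "content": content})
    return out

msgs = normalize_messages([
    ("system", "You are a helpful assistant."),
    {"role": "user", "content": "What is 2 + 2?"},
])
print(msgs[1]["role"])
```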