Basic Retrieval-Augmented Generation using LangChain and Milvus
Author: Balaji Hambeere
📅 Published on: 2025-02-01
Document Retrieval and Question Answering System
Problem Statement
Many document retrieval systems rely on keyword matching and miss semantically relevant results. Users need a system that retrieves the documents most relevant to a query and answers questions grounded in the content of those documents.
Document Loading
Load documents from a file path to process and analyze the content.
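The loading step can be sketched as a small helper. The name read_document matches the call in the chunking code below; TextLoader from the langchain-community package is one reasonable choice for plain-text files, though the original does not show the author's implementation.

```python
def read_document(path):
    # Imported lazily so the helper can be defined even if the optional
    # langchain-community package is not installed yet.
    from langchain_community.document_loaders import TextLoader
    # TextLoader reads a plain-text file and returns a list of Documents.
    return TextLoader(path).load()
```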
Text Chunking
Split the loaded documents into smaller chunks to process and analyze them efficiently.
from langchain_text_splitters import RecursiveCharacterTextSplitter

def process_chunks():
    # Load the source file, then split it into overlapping chunks so each
    # chunk fits comfortably in the embedding model's context
    raw_documents = read_document("./data/quantum_computing.txt")
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000, chunk_overlap=200)
    return text_splitter.split_documents(raw_documents)
Embeddings Model Creation
Create an embeddings model using the OpenAI API to generate vector representations of the text chunks.
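A minimal sketch of this helper, matching the get_embeddings_model call used later. It assumes the langchain-openai package is installed and OPENAI_API_KEY is set in the environment; text-embedding-3-small is an assumed model choice, not one named by the original.

```python
def get_embeddings_model():
    # Imported lazily so the sketch can be defined without langchain-openai
    # installed; the real call requires OPENAI_API_KEY in the environment.
    from langchain_openai import OpenAIEmbeddings
    # text-embedding-3-small is one reasonable default; swap as needed.
    return OpenAIEmbeddings(model="text-embedding-3-small")
```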
Vector Database Creation
Create a vector database using the Milvus library to store and query the vector representations of the text chunks.
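The connection URI comes from a get_db helper referenced later but not shown in the original. One minimal sketch, assuming Milvus Lite (which persists the collection to a local file); the file name here is illustrative:

```python
def get_db():
    # Milvus Lite stores the collection in a local file; a standalone
    # Milvus server would use a URI such as "http://localhost:19530" instead.
    return "./milvus_quantum_demo.db"
```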
Document Embeddings and Upsert
Generate vector representations of the text chunks using the embeddings model and upsert them into the vector database so they can be queried later.
from langchain_milvus import Milvus

def create_embeddings_and_upsert():
    documents = process_chunks()
    embeddings_model = get_embeddings_model()
    # Configure the Milvus connection and upsert the chunk embeddings
    return Milvus.from_documents(
        documents,
        embeddings_model,
        connection_args={"uri": get_db()},
        collection_name="quantum_computing_docs",  # Name your collection
        index_params={
            "index_type": "IVF_FLAT",
            "metric_type": "L2",
            "params": {"nlist": 100},  # Adjust nlist based on dataset size
        },
    )
User Query Input
Accept a user query that will be used to retrieve relevant documents from the vector database.
def user_query():
    return "Please explain the difference between a classical computer and a quantum computer."
Retrieve Relevant Documents
Retrieve the documents most relevant to the user query from the vector database.
def create_retriever():
    vector_store = create_embeddings_and_upsert()
    # MMR search balances relevance and diversity; return the top 2 chunks
    return vector_store.as_retriever(search_type="mmr", search_kwargs={"k": 2})

def retrieve_relevant_docs():
    retriever = create_retriever()
    query = user_query()
    docs = retriever.invoke(query)
    return docs
Prepare Prompt for LLM
Prepare a prompt for the LLM to generate a response based on the retrieved relevant documents.
from langchain_core.prompts import ChatPromptTemplate

def prepare_prompt():
    prompt = ChatPromptTemplate.from_template(
        """Answer the question based only on the following context:

{context}

Question: {question}"""
    )
    return prompt
Generate Response using LLM
Use the LLM to generate an answer grounded in the retrieved context.
from langchain_core.runnables import chain
from langchain_openai import ChatOpenAI

@chain
def qa(input):
    # Retrieve context for the question actually passed in, rather than
    # re-reading the hard-coded query
    docs = create_retriever().invoke(input)
    prompt = prepare_prompt()
    llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
    formatted = prompt.invoke({"context": docs, "question": input})
    answer = llm.invoke(formatted)
    return answer

# run it
result = qa.invoke(user_query())
print(result.content)
Benefits
- Efficient document retrieval and analysis
- Accurate question answering based on relevant documents
- Improved productivity and decision-making
- Scalable and flexible architecture
Technical Requirements
- OpenAI API for embeddings model and LLM
- Milvus library for vector database
- Python programming language for development
- Docker containerization for deployment
Conclusion
The Document Retrieval and Question Answering System is a powerful tool for efficient document analysis and accurate question answering. With its scalable and flexible architecture, it can be easily integrated into various applications and industries.