ChatGPT Clone: A Conversational AI using Ollama, Streamlit, and LangChain (RAG)

Conversational AI has become a cornerstone of modern user experience, powering efficient information retrieval and customer service. Among the many models and implementations, ChatGPT stands out as a leading example, inspiring developers and enthusiasts to build their own versions or clones. This guide walks through the foundational steps of building a ChatGPT clone with LangChain, Streamlit, and Ollama, adding RAG (Retrieval-Augmented Generation) functionality for more dynamic, context-aware conversations.

Understanding the Core Components

Before diving into the development process, it's crucial to comprehend the core components that constitute a ChatGPT clone:

  1. Language Model: At the heart of a ChatGPT clone is a powerful language model, such as Llama 3 served locally through Ollama or a hosted model like Gemini, capable of understanding and generating human-like text.
  2. Retrieval System: The RAG functionality enriches the conversation by fetching relevant information from a dataset or the internet, providing context-aware responses.
  3. User Interface: Streamlit offers an intuitive way to create interactive web applications, serving as the front end for users to interact with the chatbot.

The Pillars of Conversational AI Development

The creation of a sophisticated chatbot, such as a ChatGPT clone, relies heavily on integrating several advanced technologies. Among these, Ollama, Streamlit, and LangChain stand out as the cornerstone technologies that drive the development process. Each plays a pivotal role in ensuring the chatbot is not only functional but also interactive and capable of providing value through dynamic conversations.

Ollama: The Brain Behind the Operation

At the core of any conversational AI is its ability to understand and generate human-like text. This is where Ollama comes in. Ollama is not a model itself but a lightweight tool for running open-source language models such as Llama 3 locally. The models it serves process natural language input, track context, and generate coherent, contextually relevant responses, all without sending your data to an external API.
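
To make this concrete, here is a minimal sketch of querying a locally served model through LangChain's ChatOllama wrapper, the same class the full application uses later. It assumes the Ollama server is running and the model has been pulled with `ollama pull llama3`:

# Minimal sketch: query a local Llama 3 model served by Ollama.
# Assumes `ollama pull llama3` has been run and the Ollama server is up.
from langchain_community.chat_models import ChatOllama

llm = ChatOllama(model="llama3")
response = llm.invoke("Summarize retrieval-augmented generation in one sentence.")
print(response.content)  # ChatOllama returns an AIMessage; .content holds the text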

Streamlit: The Window to Conversational Interactions

While the underlying AI model forms the brain of the chatbot, Streamlit acts as the face, providing a user-friendly interface for interaction. Streamlit is an open-source app framework favored for its simplicity and efficiency in turning data scripts into shareable web apps. In the context of building a ChatGPT clone, Streamlit allows developers to quickly deploy interactive chat interfaces, facilitating a seamless conversational experience.
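
As a rough illustration of how little code a chat interface requires, the sketch below simply echoes the user's message back; the full application later replaces the echo with a real model call:

# Minimal Streamlit chat sketch (run with `streamlit run sketch.py`).
import streamlit as st

if "messages" not in st.session_state:
    st.session_state.messages = []

# Replay earlier turns so the conversation survives Streamlit's reruns.
for msg in st.session_state.messages:
    with st.chat_message(msg["role"]):
        st.markdown(msg["content"])

if prompt := st.chat_input("Say something"):
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)
    reply = f"Echo: {prompt}"  # placeholder; swap in an LLM call in a real app
    st.session_state.messages.append({"role": "assistant", "content": reply})
    with st.chat_message("assistant"):
        st.markdown(reply)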

LangChain: Bridging Models and Real-World Knowledge

The integration of LangChain introduces the RAG (Retrieval-Augmented Generation) functionality into the chatbot, enabling it to fetch and utilize external information for more accurate and context-aware responses. LangChain is a versatile toolkit that simplifies the incorporation of various language models and knowledge retrieval processes into a coherent system.
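
The essence of RAG is "retrieve, then generate". Below is a minimal sketch under the same assumptions as the rest of this guide (a local Ollama server with the llama3 and mxbai-embed-large models pulled); the full application later builds the same pipeline with LangChain's chain helpers:

# Minimal RAG sketch: embed a few texts, retrieve the best match, answer with it.
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS

texts = [
    "LangChain connects language models to external data sources.",
    "FAISS performs fast similarity search over embedding vectors.",
]
vector_store = FAISS.from_texts(texts, OllamaEmbeddings(model="mxbai-embed-large"))
retriever = vector_store.as_retriever()

question = "What does FAISS do?"
context = "\n".join(doc.page_content for doc in retriever.invoke(question))
llm = ChatOllama(model="llama3")
answer = llm.invoke(f"Use this context to answer:\n{context}\n\nQuestion: {question}")
print(answer.content)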

Combining Technologies for Enhanced Chatbot Functionality

The synergy between Ollama, Streamlit, and LangChain forms the backbone of a powerful conversational AI system. This blend of technologies represents the cutting edge in chatbot development, enabling creators to bring to life conversational agents that push the boundaries of what's possible in AI interactions.

Step-by-Step Guide to Integrating Ollama, Streamlit, and LangChain

Step 1: Setting Up a Virtual Environment

  1. Open your terminal or command prompt: Navigate to the directory where you want to create your project.
  2. Create a Virtual Environment: Run the following command to create a virtual environment named venv:
$ python -m venv venv
  3. Activate the Virtual Environment:
    • On Windows:
$ venv\Scripts\activate
    • On macOS and Linux:
$ source venv/bin/activate

Step 2: Installing Required Packages

  1. Create a requirements.txt File:
    In your project directory, create a file named requirements.txt and add the following content:
streamlit
langchain
langchain-community
faiss-cpu
pypdf
python-dotenv
  2. Install the Packages:
    With your virtual environment activated, run the following command to install the required packages:
$ pip install -r requirements.txt

Step 3: Setting Up Environment Variables

  1. Create a .env File:
    In your project directory, create a file named .env.
  2. Add Your API Keys to the .env File:
    Open the .env file and add your LangChain (LangSmith) API key and a project name for tracing:
LANGCHAIN_API_KEY="your_langchain_api_key"
LANGCHAIN_PROJECT="chatgpt-clone-ollama-streamlit"

Step 4: Writing the Streamlit Application

Part 1: Import Necessary Libraries and Load Environment Variables

  1. Create an app.py File:
    In your project directory, create a file named app.py.
  2. Import Libraries and Load Environment Variables:
    Add the following code at the beginning of your app.py file:
import streamlit as st
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain_community.vectorstores import FAISS
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate,MessagesPlaceholder
from langchain.chains import create_history_aware_retriever
from langchain_core.messages import HumanMessage, AIMessage
import os
import asyncio

from dotenv import load_dotenv
load_dotenv()

os.environ["LANGCHAIN_TRACING_V2"]="true"
os.environ["LANGCHAIN_API_KEY"]=os.getenv("LANGCHAIN_API_KEY")
  3. Define the get_conversational_answer Function:
    Add the following code to define the function that will handle generating answers:
async def get_conversational_answer(retriever, input, chat_history):
    contextualize_q_system_prompt = """Given a chat history and the latest user question \
    which might reference context in the chat history, formulate a standalone question \
    which can be understood without the chat history. Do NOT answer the question, \
    just reformulate it if needed and otherwise return it as is."""
    contextualize_q_prompt = ChatPromptTemplate.from_messages(
        [
            ("system", contextualize_q_system_prompt),
            MessagesPlaceholder("chat_history"),
            ("human", "{input}"),
        ]
    )

    llm = ChatOllama(model="llama3")

    history_aware_retriever = create_history_aware_retriever(
        llm, retriever, contextualize_q_prompt
    )

    qa_system_prompt = """You are an assistant for question-answering tasks. \
    Use the following pieces of retrieved context to answer the question. \
    If you don't know the answer, just say that you don't know. \
    Use three sentences maximum and keep the answer concise.\

    {context}"""
    qa_prompt = ChatPromptTemplate.from_messages(
        [
            ("system", qa_system_prompt),
            MessagesPlaceholder("chat_history"),
            ("human", "{input}"),
        ]
    )

    question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)
    rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)
    # Use the async API, since this coroutine is driven by asyncio.run()
    ai_msg = await rag_chain.ainvoke({"input": input, "chat_history": chat_history})
    return ai_msg
  4. Define the main Function:
    Add the following code to create the main function that sets up the Streamlit interface and handles file uploads:
def main():
    st.header('Chat with your PDF using Llama 3 via Ollama')

    if "conversation" not in st.session_state:
        st.session_state.conversation = None

    if "activate_chat" not in st.session_state:
        st.session_state.activate_chat = False

    if "messages" not in st.session_state:
        st.session_state.messages = []
        st.session_state.chat_history = []

    for message in st.session_state.messages:
        with st.chat_message(message["role"], avatar=message['avatar']):
            st.markdown(message["content"])

    embed_model = OllamaEmbeddings(model='mxbai-embed-large')

    with st.sidebar:
        st.subheader('Upload Your PDF File')
        docs = st.file_uploader('⬆️ Upload your PDF & Click to process', accept_multiple_files=True, type=['pdf'])
        if st.button('Process'):
            if docs is not None:
                os.makedirs('./data', exist_ok=True)
                for doc in docs:
                    save_path = os.path.join('./data', doc.name)
                    with open(save_path, 'wb') as f:
                        f.write(doc.getbuffer())
                    st.write(f'Processed file: {save_path}')

            with st.spinner('Processing'):
                loader = PyPDFDirectoryLoader("./data")
                documents = loader.load()
                vector_store = FAISS.from_documents(documents, embed_model)
                retriever = vector_store.as_retriever()
                if "retriever" not in st.session_state:
                    st.session_state.retriever = retriever
                st.session_state.activate_chat = True

            # Delete uploaded PDF files after loading
            for doc in os.listdir('./data'):
                os.remove(os.path.join('./data', doc))

    if st.session_state.activate_chat:
        if prompt := st.chat_input("Ask your question from the PDF?"):
            with st.chat_message("user", avatar='👨🏻'):
                st.markdown(prompt)
            st.session_state.messages.append({"role": "user", "avatar": '👨🏻', "content": prompt})
            retriever = st.session_state.retriever

            ai_msg = asyncio.run(get_conversational_answer(retriever, prompt, st.session_state.chat_history))
            st.session_state.chat_history.extend([HumanMessage(content=prompt), AIMessage(content=ai_msg["answer"])])
            cleaned_response = ai_msg["answer"]
            with st.chat_message("assistant", avatar='🤖'):
                st.markdown(cleaned_response)
            st.session_state.messages.append({"role": "assistant", "avatar": '🤖', "content": cleaned_response})
        else:
            st.markdown('Upload your PDFs to chat')

if __name__ == '__main__':
    main()

Step 5: Running the Application

  1. Run the Streamlit App:
    With your virtual environment activated, run the following command in your terminal:
$ streamlit run app.py
  2. Interact with the App:
    Open the URL provided by Streamlit (usually http://localhost:8501) in your web browser. Use the sidebar to upload your PDF files, process them, and then ask questions about the content of the PDFs.
  3. Here is the GitHub link for the codebase:
    https://github.com/langchain-tech/chatgpt-clone-ollama-streamlit

By following these detailed steps, you'll set up a Streamlit application that processes and interacts with PDFs using LangChain, Llama 3 via Ollama, and Ollama embeddings.

Deploying and Testing Your ChatGPT Clone

Launching Your Chatbot into the Real World

After integrating the core technologies of Ollama, Streamlit, and LangChain into your ChatGPT clone, the next thrilling step is deploying and testing it. Deployment makes your chatbot accessible to users, while testing ensures it operates smoothly and effectively handles real-world interactions. This guide navigates through the deployment process and outlines essential testing strategies for a successful launch.

Deployment with Streamlit

Streamlit simplifies the deployment process, allowing you to share your conversational AI with the world through its sharing platform. To deploy, ensure your project is in a Git repository, as Streamlit's sharing feature leverages Git repositories for hosting applications. Then, sign up for a Streamlit sharing account and follow the prompts to deploy your app directly from your repository. This process makes your ChatGPT clone live on the internet, ready for user interaction.
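
As a rough sketch of that flow (the remote URL below is a placeholder), the following commands push the project to GitHub before you connect the repository in Streamlit's sharing interface:

$ git init
$ git add app.py requirements.txt
$ git commit -m "ChatGPT clone with Ollama, Streamlit, and LangChain"
$ git remote add origin https://github.com/<your-username>/<your-repo>.git
$ git push -u origin main

One caveat: the app in this guide talks to a locally running Ollama server, so a cloud deployment only works if an Ollama endpoint is reachable from the hosting environment.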

Iterative Feedback Loop

Deploying and testing your ChatGPT clone is not a one-time process but an ongoing cycle of improvement. Gather user feedback, monitor performance metrics, and continuously refine your chatbot based on real-world usage. This iterative approach helps in enhancing the chatbot's capabilities, ensuring it remains relevant and valuable to users.

Monitoring and Analytics

Integrating monitoring and analytics tools can provide insights into how users interact with your chatbot and identify areas for improvement. Tools like Google Analytics or custom logging solutions can track user queries, response accuracy, and engagement metrics, offering valuable data to inform future enhancements.
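
For instance, a custom logging hook might look like the hypothetical sketch below (log_interaction is an illustrative helper, not part of the app above); each chat turn is written as one JSON line that can be parsed later for engagement and accuracy reviews:

# Hypothetical logging helper: one JSON line per chat turn.
import json, logging, time

logging.basicConfig(filename="chat_metrics.log", level=logging.INFO)

def log_interaction(question: str, answer: str, latency_s: float) -> None:
    logging.info(json.dumps({
        "timestamp": time.time(),
        "question": question,
        "answer_length": len(answer),
        "latency_s": round(latency_s, 3),
    }))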

Conclusion

The deployment and testing phase is an exciting part of the chatbot development process, offering a unique opportunity to see your creation come to life and interact with users. Embrace the challenges and feedback as opportunities for growth, and continue pushing the boundaries of what conversational AI can achieve. Remember, the key to a successful conversational AI lies in its ability to learn and adapt from real-world interactions, making continuous testing and improvement paramount. As you iterate based on user feedback and performance data, your ChatGPT clone will evolve, becoming more intelligent, reliable, and engaging for users.

FAQs

  1. How to make a ChatGPT clone?

There are two main approaches:

  • Coding with the OpenAI API: use Python libraries to call OpenAI's GPT models, building the user interface and functionality around the API calls.
  • No-code platforms: Services like Azure offer pre-built chatbot templates that you can configure to use their large language models.
  2. Difference between Streamlit and LangChain?

    Streamlit:
    A Python library for creating simple web apps with minimal coding. Good for prototyping, but less ideal for complex chatbots.

LangChain: A framework for building applications around large language models. It provides components such as chains, retrievers, and memory that make features like RAG and multi-turn dialogue easier to assemble.

  3. How to create a chatbot using Streamlit?

    While possible, Streamlit is better suited for simpler applications.
  • Set up a Streamlit app.
  • Create a text input field for user queries.
  • Use Streamlit functions to call an external API or process responses locally.
  • Display the generated response in the app.
  4. Can I make my own AI chatbot?

    Absolutely! The methods above provide starting points.
  • Choose a large language model API (OpenAI, Google AI, etc.).
  • Design your chatbot's functionalities.
  • Develop the user interface.
  • Train the model (optional) for a custom chatbot (complex).
  5. Can I code my own ChatGPT?

    Yes, but it's complex! Use an existing large language model API (like OpenAI's GPT models) and build a user interface to interact with it.
  6. RAG Chatbot using Llama 3:

    RAG (Retrieval-Augmented Generation) is a technique for grounding a chatbot's answers in retrieved documents. Llama 3 is Meta's open large language model, which can be run locally through Ollama. Combining the two, as this guide does, generally produces more accurate, context-aware responses.
  7. How to create a chatbot using LangChain?

    LangChain offers a streamlined approach:
  • Install LangChain and its dependencies.
  • Choose a language model (e.g., a local Llama 3 served by Ollama).
  • Define conversation flows and responses using LangChain's functionalities.
  • Deploy your chatbot.
  8. What is RAG in LangChain?

LangChain's RAG utilizes retrieval-augmented generation to find relevant information and craft informative responses.

  9. What is the purpose of a RAG?

RAG aims to enhance chatbots by finding relevant information and using it to generate informative responses.

  10. Is the LangChain API free?

The LangChain library itself is open source and free to use. Hosted services around it, such as LangSmith, offer free tiers with usage limits; check their documentation for details.

By following this guide, you’ll be well on your way to creating a powerful ChatGPT clone that enhances user interaction through dynamic, context-aware conversations.