Build a Web-Connected Research Agent with Chainlit and LangChain

Introduction: Giving Your LLM Real-Time Access to the Web

Large Language Models (LLMs) like those from OpenAI, Anthropic, and Cohere have revolutionized how we interact with information. However, their knowledge is inherently static, frozen at the time of their last training run. This “knowledge cutoff” is a significant limitation for applications requiring up-to-the-minute information. Imagine asking about today’s stock market trends or the latest PyTorch release: a standard LLM would be unable to provide a relevant answer. The solution lies in creating LLM-powered “agents” that can interact with external tools, such as a live web search.

This article provides a comprehensive guide to building a powerful, web-enabled research agent. We will leverage a modern, efficient stack: LangChain as the orchestration framework, an OpenAI model as the reasoning engine, SerpApi for real-time web search capabilities, and Chainlit to rapidly build a polished, interactive user interface. Unlike traditional UI frameworks like Streamlit or Gradio, Chainlit is purpose-built for conversational AI, making it exceptionally easy to visualize an agent’s thought process. By the end of this guide, you’ll have the knowledge and code to create sophisticated AI applications that can reason, act, and access the vast, dynamic world of the internet.

Understanding the Core Components of an AI Agent

Before diving into the code, it’s crucial to understand the role of each component in our AI agent stack. Each piece serves a distinct purpose, and their synergy is what makes this approach so powerful for rapid development and prototyping.

The Brain: Large Language Models (LLMs)

The LLM is the core reasoning engine of our agent. It doesn’t just generate text; it analyzes the user’s query, determines a plan of action, and decides which tool to use to accomplish the task. While we’ll use OpenAI’s models in our examples, the principles apply to a wide range of models. OpenAI regularly ships new model capabilities well suited to agentic workflows, and you could easily swap in models from Anthropic (Claude), Cohere, or open-source alternatives from the Hugging Face ecosystem, often deployed with tools like Ollama or vLLM for efficient local inference.

The Framework: LangChain for Orchestration

If the LLM is the brain, LangChain is the central nervous system. Its core strength is the ability to chain together different components to create complex applications. It provides a standardized interface for interacting with LLMs, managing tools, maintaining conversational memory, and parsing model outputs. The “Agent” abstraction in LangChain is key: it uses an LLM to iteratively decide a sequence of actions, execute them, observe the outcomes, and repeat until the user’s query is resolved. This framework abstracts away much of the complexity, allowing developers to focus on the application’s logic rather than boilerplate code. For more complex data-centric tasks, developers might also explore alternatives such as LlamaIndex or Haystack.

The Tool: Accessing the World Wide Web with SerpApi

An agent is only as capable as its tools. To break free from the knowledge cutoff, our agent needs a tool to search the internet. While one could build a custom web scraper, using a dedicated API like SerpApi is far more efficient and reliable. It provides structured JSON output from Google search results, which is much easier for an LLM to parse and use than raw HTML. This tool is what gives our agent its real-time knowledge, allowing it to fetch the latest NVIDIA announcements or summarize recent developments from Google DeepMind.
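To see why structured output matters, consider this small sketch. The JSON shape below is a simplified, illustrative subset of a real SerpApi response, shown only to make the point that structured fields are trivial to turn into LLM-ready context, unlike raw HTML:

```python
# Illustrative: extracting usable text from a SerpApi-style structured
# response. The dictionary below is a simplified stand-in for real output.
sample_response = {
    "organic_results": [
        {"title": "NVIDIA announces new GPU", "snippet": "NVIDIA today unveiled..."},
        {"title": "DeepMind research update", "snippet": "Google DeepMind published..."},
    ]
}

def summarize_results(response: dict, max_results: int = 3) -> str:
    """Turn structured search results into a compact context string for an LLM."""
    lines = []
    for result in response.get("organic_results", [])[:max_results]:
        lines.append(f"- {result['title']}: {result['snippet']}")
    return "\n".join(lines)

print(summarize_results(sample_response))
```

A string like this can be dropped straight into an agent’s prompt as an “observation,” which is exactly how the search tool’s output reaches the LLM.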

The Interface: Chainlit for Rapid UI Development

A powerful agent is useless without an intuitive user interface. This is where Chainlit shines. As a leading alternative to Streamlit and Gradio, Chainlit is specifically designed for building chat applications. Its Python-first approach and simple decorators allow you to create a web UI in minutes. A standout feature is its native ability to display the intermediate steps of an agent’s reasoning process—the “thoughts” and “actions”—which provides invaluable transparency and a superior user experience. This focus on conversational AI is what sets Chainlit apart from general-purpose UI frameworks.


To get started, you need to install the necessary libraries. Create a project directory and install the following packages:

# You can install these packages using pip
# pip install chainlit langchain openai google-search-results

# requirements.txt
chainlit>=1.0.0
langchain>=0.1.0
langchain-openai>=0.0.5
google-search-results>=2.4.2
python-dotenv>=1.0.0

Step-by-Step Implementation: Building Your First Research Agent

With the core concepts understood, let’s build the agent. This process involves setting up our environment, initializing the LLM and tools, and then assembling them into a functional agent using LangChain.

Setting Up Environment Variables

Security and best practices are paramount. Never hardcode your API keys directly in your script. Instead, use environment variables. Create a file named .env in your project’s root directory and add your keys:

.env file:

OPENAI_API_KEY="sk-..."
SERPAPI_API_KEY="..."

Our Python script will use the python-dotenv library to load these keys securely into the environment.
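Before wiring up the agent, it helps to fail fast on missing keys rather than hitting a cryptic authentication error mid-run. Here is a minimal, stdlib-only sketch; in the real app, python-dotenv’s load_dotenv() would run first to populate the environment from .env:

```python
# Minimal sketch of a fail-fast check for required API keys.
# In the full application, load_dotenv() is called before this check.
import os

REQUIRED_KEYS = ["OPENAI_API_KEY", "SERPAPI_API_KEY"]

def missing_api_keys() -> list[str]:
    """Return the names of required keys absent from the environment."""
    return [key for key in REQUIRED_KEYS if not os.getenv(key)]

# Surface a clear message up front instead of a 401 deep inside an agent run.
for key in missing_api_keys():
    print(f"Warning: {key} is not set; the agent will fail at startup.")
```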

Initializing the Tools and the Agent

The core of our application will be a Python script that defines the agent’s logic. We will first load the API keys, then instantiate the LLM and the search tool. Finally, we’ll use a LangChain helper function, initialize_agent, to bring everything together.

We’ll use the ChatOpenAI model and the SerpAPIWrapper tool. For the agent type, AgentType.ZERO_SHOT_REACT_DESCRIPTION is an excellent choice. This type of agent works through a “Reason-Act-Observe” loop. It reasons about the task, chooses an action (like using the search tool), observes the result, and repeats the process until it has a final answer. This iterative process allows it to handle complex, multi-step queries effectively.
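To make the Reason-Act-Observe loop concrete, here is a toy, self-contained sketch of it. The scripted decisions and the search stub are purely illustrative stand-ins for what the LLM and SerpApi would actually produce; a real agent derives each step from the tool descriptions in its prompt:

```python
# Conceptual sketch of the Reason-Act-Observe loop a ReAct agent runs.
# The "LLM" here is a hard-coded script of decisions, not a real model.

def search_tool(query: str) -> str:
    # Stand-in for the SerpApi tool; returns a canned observation.
    return f"Top result for '{query}': Llama 3 released in April 2024."

tools = {"search": search_tool}

# Each decision is either ("tool_name", tool_input) or ("final", answer).
scripted_decisions = [
    ("search", "Llama 3 release date"),
    ("final", "Llama 3 was released in April 2024."),
]

def run_react_loop(decisions, tools):
    transcript = []
    for action, payload in decisions:
        if action == "final":
            transcript.append(f"Final Answer: {payload}")
            return payload, transcript
        observation = tools[action](payload)              # Act
        transcript.append(f"Action: {action}[{payload}]")
        transcript.append(f"Observation: {observation}")  # Observe, then reason again
    raise RuntimeError("Agent stopped without a final answer.")

answer, transcript = run_react_loop(scripted_decisions, tools)
print("\n".join(transcript))
```

The transcript printed here mirrors the Thought/Action/Observation trace you will see in your console when running the real agent with verbose=True.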

Here is the complete Python code for the agent’s core logic. Note that this script doesn’t include the Chainlit UI yet; it’s the foundational block we will integrate in the next section.

import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain.agents import load_tools, initialize_agent, AgentType

# Load environment variables from .env file
load_dotenv()

# 1. Initialize the LLM
# We use ChatOpenAI for its conversational abilities.
# temperature=0 controls randomness for more predictable, factual answers.
llm = ChatOpenAI(temperature=0, model_name="gpt-4")

# 2. Load the Tools
# LangChain provides a helper function to load common tools.
# "serpapi" is the tool for interacting with the SerpApi search engine.
# "llm-math" is another useful tool for calculations.
tools = load_tools(["serpapi", "llm-math"], llm=llm)

# 3. Initialize the Agent
# We combine the LLM and tools into an agent.
# - AgentType.ZERO_SHOT_REACT_DESCRIPTION: A powerful general-purpose agent that
#   decides which tool to use based on its description.
# - verbose=True: This allows us to see the agent's thought process in the console,
#   which is great for debugging.
# - handle_parsing_errors=True: Makes the agent more robust to unexpected outputs.
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    handle_parsing_errors=True,
)

# Example of how to run the agent (for testing purposes)
if __name__ == "__main__":
    query = "What is the latest news about Meta AI's Llama 3 model?"
    response = agent.run(query)
    print(response)

Running this script directly will print the agent’s entire thought process and the final answer to your console, demonstrating that the core logic is working correctly. This is a crucial debugging step before adding the complexity of a user interface.

Integrating with Chainlit for an Interactive UI

Now that we have a functional agent, it’s time to give it a user-friendly interface. Chainlit makes this process incredibly straightforward using Python decorators. We’ll create a new file, `app.py`, which will contain our Chainlit application code.

LLM web agent diagram – LLM Agents | Prompt Engineering Guide

Understanding Chainlit’s Core Decorators

Chainlit’s event-based architecture is primarily managed by two decorators:

  • @cl.on_chat_start: This function runs once, at the very beginning of a user’s chat session. It’s the perfect place to perform initial setup, like initializing our LangChain agent and storing it in the user’s session for later use.
  • @cl.on_message: This function is the main workhorse. It runs every time the user sends a new message. It retrieves the agent from the user session, passes the user’s message to it, and streams the response back to the UI.

Visualizing the Agent’s Thought Process

One of Chainlit’s most powerful features is its seamless integration with LangChain’s callback system. By providing a LangchainCallbackHandler to our agent’s execution call, Chainlit automatically intercepts and displays the agent’s intermediate steps—its thoughts, the actions it takes, and the observations it makes. This transparency is not just a “nice-to-have”; it builds user trust and provides critical insight into how the agent arrives at its conclusions. This is a significant advantage over frameworks that require manual UI updates for each step, a common pain point when building AI backends with general-purpose frameworks like FastAPI or Flask.

Below is the complete `app.py` file. To run it, save the code and execute `chainlit run app.py -w` in your terminal. The `-w` flag enables auto-reloading, so the app will update whenever you save changes to the file.

import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain.agents import load_tools, initialize_agent, AgentType
from langchain.memory import ConversationBufferMemory
import chainlit as cl

# Load environment variables
load_dotenv()

@cl.on_chat_start
async def start():
    """
    This function runs at the start of every user session.
    It initializes the agent and stores it in the user session.
    """
    # Initialize the LLM
    llm = ChatOpenAI(temperature=0, model_name="gpt-4", streaming=True)

    # Load tools
    tools = load_tools(["serpapi", "llm-math"], llm=llm)

    # Initialize memory to keep track of the conversation
    memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

    # Initialize the agent
    agent = initialize_agent(
        tools=tools,
        llm=llm,
        agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
        verbose=True,
        memory=memory,
        handle_parsing_errors=True,
    )

    # Store the initialized agent in the user's session
    cl.user_session.set("agent", agent)

    await cl.Message(content="Hello! I am a research agent with web access. How can I help you today?").send()

@cl.on_message
async def main(message: cl.Message):
    """
    This function is called every time a user sends a message.
    It runs the agent with the user's message and streams the response.
    """
    # Retrieve the agent from the user session
    agent = cl.user_session.get("agent")

    # Set up the callback handler for streaming
    cb = cl.LangchainCallbackHandler(stream_final_answer=True)

    # Run the agent asynchronously
    response = await agent.acall(
        message.content,
        callbacks=[cb]
    )

    # The final answer is streamed by the callback handler,
    # so we don't need to send it manually.
    # If stream_final_answer=False, you would do:
    # await cl.Message(content=response["output"]).send()

Advanced Techniques, Best Practices, and Optimization

Building a basic agent is just the beginning. To create robust, production-ready applications, you need to consider memory, optimization, and best practices for deployment and monitoring.

Adding Memory for Contextual Conversations

A stateless agent that forgets the conversation after each message is of limited use. To enable follow-up questions and maintain context, we must add memory. The code in the previous section already includes this. LangChain offers various memory types, but ConversationBufferMemory is a great starting point. It stores the chat history and provides it as context to the agent in subsequent turns. By simply adding the `memory` object to the `initialize_agent` call, our agent can now handle conversations like “Who is the CEO of NVIDIA?” followed by “What was their keynote about at the last GTC conference?”.
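Conceptually, ConversationBufferMemory does little more than the following plain-Python sketch (the class and method signatures here are illustrative, not LangChain’s exact API): it accumulates each turn and renders the full history back into the agent’s prompt, which is how a follow-up pronoun like “their” can be resolved from the previous turn.

```python
# Conceptual, library-free sketch of what a conversation buffer memory does.
class SimpleBufferMemory:
    def __init__(self):
        self.turns = []  # list of (human, ai) message pairs

    def save_context(self, human: str, ai: str) -> None:
        """Record one completed turn of the conversation."""
        self.turns.append((human, ai))

    def as_prompt_context(self) -> str:
        """Render the history in the form injected into the agent's prompt."""
        lines = []
        for human, ai in self.turns:
            lines.append(f"Human: {human}")
            lines.append(f"AI: {ai}")
        return "\n".join(lines)

memory = SimpleBufferMemory()
memory.save_context("Who is the CEO of NVIDIA?", "Jensen Huang.")
print(memory.as_prompt_context())
```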

Best Practices for Production

  • Model Selection: While GPT-4 is powerful, it can be slower and more expensive. For many tasks, GPT-3.5-Turbo or models from Mistral AI offer a better balance of cost and performance. Experimentation is key, and tools like Weights & Biases or MLflow can help track and compare experiment results.
  • Error Handling: External API calls (like to SerpApi) can fail. The handle_parsing_errors=True parameter in the agent helps, but you should also consider wrapping agent calls in try-except blocks to gracefully manage network issues or API downtime.
  • Cost and Performance Monitoring: API calls cost money. Tools like LangSmith are becoming indispensable for debugging and monitoring the cost, latency, and correctness of LLM chains and agents, providing crucial visibility into your application’s performance.
  • Deployment: For deployment, you can containerize your Chainlit app and host it on cloud platforms like AWS SageMaker, Azure Machine Learning, or Vertex AI. For simpler, serverless deployments, especially those requiring a GPU for self-hosted models, services like Modal or Replicate are excellent options.
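As an illustration of the error-handling advice above, here is a hedged sketch of a retry wrapper. The run_agent parameter and the flaky stub are hypothetical stand-ins for agent.run; the retry-with-backoff pattern is the same for real SerpApi or OpenAI outages:

```python
# Sketch: defensive wrapper around an agent call with retries and
# exponential backoff. `run_agent` stands in for agent.run / agent.acall.
import time

def call_agent_safely(run_agent, query: str, retries: int = 2, delay: float = 1.0) -> str:
    last_error = None
    for attempt in range(retries + 1):
        try:
            return run_agent(query)
        except Exception as exc:  # network errors, API downtime, parsing failures
            last_error = exc
            if attempt < retries:
                time.sleep(delay * (2 ** attempt))  # exponential backoff
    return f"Sorry, the agent failed after {retries + 1} attempts: {last_error}"

# Demo with a flaky stub that fails once, then succeeds on the retry.
calls = {"count": 0}
def flaky_agent(query: str) -> str:
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("search API timeout")
    return "42"

print(call_agent_safely(flaky_agent, "meaning of life", delay=0.01))
```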

Expanding Agent Capabilities with More Tools

A web search tool is just one possibility. You can create a multi-tool agent by providing a list of tools to LangChain. Consider adding:

  • A Vector Database Tool: For querying internal documents or a knowledge base. You could connect to vector stores like Pinecone, Weaviate, or Chroma to build a powerful RAG (Retrieval-Augmented Generation) system.
  • A Code Execution Tool: A Python REPL tool allows the agent to write and execute code to perform complex calculations, data analysis, or other programmatic tasks.
  • Custom API Tools: You can easily wrap any internal or third-party API into a LangChain tool, allowing your agent to interact with your company’s CRM, project management software, or any other service.
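To illustrate how a custom tool is exposed to the agent, here is a library-free sketch. LangChain’s Tool object is essentially a name, a natural-language description (which the LLM reads to decide when to use it), and a callable; the names below (crm_open_tickets, get_open_tickets) are hypothetical:

```python
# Conceptual sketch of the tool interface: name + description + callable.
def get_open_tickets(customer_id: str) -> str:
    """Hypothetical internal CRM lookup, hard-coded for illustration."""
    return f"Customer {customer_id} has 2 open tickets."

custom_tools = [
    {
        "name": "crm_open_tickets",
        "description": "Look up a customer's open support tickets by customer ID.",
        "func": get_open_tickets,
    },
]

# The agent's prompt includes each name + description so the LLM can pick
# the right tool; executing a tool is just calling func with the input.
def render_tool_descriptions(tools) -> str:
    return "\n".join(f"{t['name']}: {t['description']}" for t in tools)

print(render_tool_descriptions(custom_tools))
print(custom_tools[0]["func"]("C-1042"))
```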

Here is a small snippet illustrating how you might add memory to an agent that was not initially created with it.

from langchain.memory import ConversationBufferMemory
from langchain.agents import initialize_agent, AgentType

# Assume 'llm' and 'tools' are already defined

# 1. Initialize the memory component
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

# 2. Initialize the agent with the memory component
# Note the use of a conversational agent type, which is designed to work with memory
conversational_agent = initialize_agent(
    llm=llm,
    tools=tools,
    agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
    memory=memory,
    verbose=True
)

# Now this agent will remember previous turns in the conversation
# conversational_agent.run("What is the capital of France?")
# conversational_agent.run("What is its population?")

Conclusion and Next Steps

You have successfully learned how to build a sophisticated AI research agent with real-time web access and a polished user interface. We’ve seen how the combination of LangChain’s powerful orchestration, OpenAI’s reasoning capabilities, SerpApi’s data access, and Chainlit’s rapid UI development creates a formidable stack for building modern AI applications. The key takeaway is the power of agentic workflows: by giving LLMs access to tools, we transcend their inherent limitations and unlock a new class of applications that can perceive, reason, and act in the digital world.

Your journey doesn’t end here. I encourage you to experiment further. Try swapping out the LLM for an open-source model from Hugging Face. Add new tools to your agent, such as a Wikipedia search or a financial data API. Explore Chainlit’s more advanced UI components to display images, charts, or other rich media. The world of conversational AI is evolving rapidly, with constant developments from Meta AI and other major labs, and the skills you’ve learned today place you at the forefront of this exciting field.