The LangChain Framework
Chapter Overview
LangChain is an open-source framework designed to simplify the development of applications powered by Large Language Models (LLMs). It provides a standard set of components and abstractions that act as the "glue" for connecting LLMs to external data sources, tools, and other systems.
Its primary goal is to enable developers to build complex, data-aware, and agentic applications beyond simple prompt-and-response interactions.
The Core Philosophy: Composability
LangChain's power comes from its principle of composability. It provides modular building blocks that can be chained together to create sophisticated workflows. This allows developers to focus on the application logic rather than writing boilerplate code for API calls, data handling, and state management.
The framework helps you answer questions like:
- How can I connect my LLM to my company's private Notion database?
- How can I make my chatbot remember previous parts of the conversation?
- How can I give my LLM access to tools like a calculator or a web search engine?
- How can I build a system that can answer a multi-step question by first searching the web, then reading a document, and finally synthesizing an answer?
The LangChain Ecosystem
The LangChain project is composed of several key packages:
- langchain-core: contains the base abstractions and the LangChain Expression Language (LCEL), which is the foundation of all compositions.
- langchain-community: includes all third-party integrations, such as wrappers for different model providers (OpenAI, Hugging Face), vector stores (FAISS, Chroma), and tools.
- langchain: the main package that brings everything together for building application logic.
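As a quick orientation, the sketch below shows where a few familiar classes live across this split. The module paths assume the post-0.1 package layout and are illustrative, not exhaustive:

# Illustrative imports; exact paths depend on your installed LangChain version
from langchain_core.prompts import ChatPromptTemplate   # base abstractions and LCEL
from langchain_community.vectorstores import FAISS      # third-party integrations
from langchain.chains import RetrievalQA                # higher-level application logic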
Key Abstractions of LangChain
LangChain organizes its functionality into several key modules, which form the basis of our learning path:
graph TD
A[LangChain Framework] --> B[Models]
A --> C[Prompts]
A --> D[Data Connection<br/>for RAG]
A --> E[Chains]
A --> F[Agents]
A --> G[Memory]
B --> B1[LLMs<br/>OpenAI, Anthropic]
B --> B2[Chat Models<br/>Conversational AI]
B --> B3[Embeddings<br/>Vector Representations]
C --> C1[Prompt Templates<br/>Reusable Prompts]
C --> C2[Example Selectors<br/>Few-shot Learning]
D --> D1[Document Loaders<br/>PDF, Web, DB]
D --> D2[Vector Stores<br/>Similarity Search]
D --> D3[Retrievers<br/>Information Retrieval]
E --> E1[Simple Chains<br/>Sequential Operations]
E --> E2[Router Chains<br/>Conditional Logic]
F --> F1[Tool Calling<br/>External APIs]
F --> F2[ReAct Agents<br/>Reasoning + Acting]
G --> G1[Conversation Memory<br/>Chat History]
G --> G2[Summary Memory<br/>Compressed Context]
style A fill:#1565c0,stroke:#0d47a1,color:#fff
style B fill:#e3f2fd,stroke:#1976d2
style C fill:#e3f2fd,stroke:#1976d2
style D fill:#e8f5e8,stroke:#388e3c
style E fill:#fce4ec,stroke:#c2185b
style F fill:#fce4ec,stroke:#c2185b
style G fill:#fff3e0,stroke:#f57c00
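To make one of these modules concrete before we study them in depth, here is a minimal sketch of the Memory module, assuming the classic ConversationChain and ConversationBufferMemory APIs:

from langchain.chains import ConversationChain
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory

# The memory object records the running chat history and injects it
# back into the prompt on every call.
conversation = ConversationChain(
    llm=ChatOpenAI(),
    memory=ConversationBufferMemory(),
)

conversation.predict(input="Hi, my name is Ada.")
conversation.predict(input="What is my name?")  # answered from stored history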
Why LangChain Matters
Before LangChain
Building an AI application required writing custom code for every integration:
# Raw API calls, custom prompt construction, manual response parsing
import openai

def ask_gpt_about_document(question, document_text):
    # Custom prompt construction
    prompt = f"Based on this document: {document_text}\n\nQuestion: {question}\nAnswer:"

    # Manual API call (legacy Completions endpoint)
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=prompt,
        max_tokens=150,
    )

    # Custom response parsing
    return response.choices[0].text.strip()
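Every new capability, whether retrieval, memory, or tool use, meant writing and maintaining more bespoke glue code of this kind.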
With LangChain
The same functionality becomes declarative and composable:
from langchain.chains import RetrievalQA
from langchain.document_loaders import TextLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.vectorstores import FAISS

# Load and index the document
loader = TextLoader("document.txt")
documents = loader.load()
vectorstore = FAISS.from_documents(documents, OpenAIEmbeddings())

# Create a retrieval-based QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=OpenAI(),
    chain_type="stuff",
    retriever=vectorstore.as_retriever(),
)

# Ask questions
answer = qa_chain.run("What is the main topic of this document?")
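The chain_type="stuff" setting simply "stuffs" all retrieved documents into a single prompt; LangChain also provides map_reduce and refine chain types for document sets too large to fit in one context window.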
The LangChain Expression Language (LCEL)
LCEL is LangChain's declarative way to compose chains. It uses the pipe operator (|) to create readable, modular workflows:
from langchain.schema import StrOutputParser
from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatOpenAI
# Define components
prompt = ChatPromptTemplate.from_template("Tell me a joke about {topic}")
model = ChatOpenAI()
output_parser = StrOutputParser()
# Compose the chain
chain = prompt | model | output_parser
# Execute
result = chain.invoke({"topic": "artificial intelligence"})
This approach makes chains:
- Readable: The flow is clear from left to right
- Reusable: Components can be mixed and matched
- Debuggable: Each step can be inspected individually
- Scalable: Complex workflows remain manageable
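Because each component implements the shared Runnable interface, the composed chain also supports batching and streaming without extra code; a brief sketch using the chain defined above:

# The same chain object exposes other execution modes
# through the Runnable interface (invoke, batch, stream)
results = chain.batch([{"topic": "robots"}, {"topic": "compilers"}])

for chunk in chain.stream({"topic": "databases"}):
    print(chunk, end="", flush=True)  # tokens arrive incrementally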
Getting Started with LangChain
LangChain is approachable to pick up, yet broad in scope. The next sections will guide you through:
- Core Components - Understanding the building blocks
- Chains - Composing multi-step workflows
- Agents - Building autonomous AI systems
- Memory - Adding persistence and context
- RAG Applications - Connecting to external knowledge
Each concept builds upon the previous one, creating a solid foundation for building production-ready AI applications.
Best Practices
- Start with simple chains before moving to complex agents
- Use LCEL for new projects - it's more maintainable
- Leverage the community package for pre-built integrations
- Always handle errors gracefully in production applications
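For the last point, here is a minimal sketch of graceful error handling, assuming the LCEL chain from earlier and the Runnable with_retry helper:

# Retry transient provider failures, then guard the call site
robust_chain = chain.with_retry(stop_after_attempt=3)

try:
    result = robust_chain.invoke({"topic": "testing"})
except Exception as exc:  # narrow this to provider-specific errors in real code
    result = f"Model call failed: {exc}"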