AI Agents With Human In The Loop
The autonomous nature of Agentic Applications has often been viewed as a threat and a barrier to the implementation of agents.
However, LangChain’s latest work demonstrates how human oversight can be effectively integrated as checkpoints before tasks are executed, addressing these concerns.
Introduction
Traditional customer care implementations have long utilised chatbots, functioning similarly to how IVRs (Interactive Voice Response systems) have been used, but with text-based interactions instead of voice.
One could refer to these traditional chatbots as ITRs (Interactive Text Response systems).
In these setups, once the automated conversation reached a certain point, the handover to a live agent was a one-time transfer. The live agent then had to take ownership of the conversation, fulfil the user's intent, and close the session.
Agentic Applications to the rescue…
Agentic Applications
As I have mentioned at the outset of this article, there have been dystopian fears around agentic applications due to their autonomous nature.
As seen in the image below, agents have as their backbone one or more Large Language Models / Foundation Models, which assist with the decomposition of complex tasks into sub-tasks. From the sub-tasks, a chain of actions is created which the agent can follow.
The agent has access to a list of defined tools which it can use to solve challenges. Each of these tools has a description, which allows the agent to know when to select each tool in a sequence to answer a question. As I have mentioned, the agent creates a sequence of steps which it follows to reach a final conclusion.
Tools
Tools made available to the agent can include APIs to research portals like Arxiv, HuggingFace and GALE, Bing search, the Serp API, etc. One of these tools is a Human-In-The-Loop tool at the agent's disposal.
If the agent does not know what the next step or conclusion should be, a live human agent can be contacted to retrieve an answer to a specific question.
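As a minimal sketch of this idea (using LangChain's @tool decorator; the ask_human name and the console prompt are hypothetical stand-ins for routing the question to a real live agent), a human can be exposed to the agent like any other tool, with the docstring serving as the tool description:
from langchain_core.tools import tool

@tool
def ask_human(question: str) -> str:
    """Ask a live human agent when the next step or answer is unclear."""
    # A console prompt stands in for routing the question to a live agent
    print(f"Agent asks: {question}")
    return input("Your answer: ")
The agent can then include ask_human in its tool list and call it whenever its own reasoning stalls.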
LangGraph
As shown in the working notebook below, an interrupt can be added within LangGraph wherever human intervention is required. An interrupt can also be added for certain tools, where it is deemed that a tool cannot be executed before one or more humans approve.
Hence a human interrupt can be added for certain tools, or a node requiring human intervention can be defined within the graph environment of LangGraph. The HITL step can come before or after a particular node, so the human involvement can be to grant approval, or to check a transaction after the interaction.
user_approval = input("Do you want to go to Step 3? (yes/no): ")
Humans As A Tool
The key point to consider is that the entire conversation and its context are not fully handed over to a live human agent.
Instead, live human agents are engaged selectively, allowing automation to continue even after human interaction. This introduces a new paradigm where human involvement is invoked only when necessary.
LangChain Example
LangChain implemented a new framework called LangGraph, which is a return to a graph approach to conversations and automation.
LangGraph is a graph-based orchestration framework that adds control to agent workflows. LangGraph has a node-and-edge approach, where each node constitutes a task or can be considered a step, and edges are links between the nodes which can be subject to conditions that determine to which node the state moves.
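As a minimal sketch of such a conditional edge (the RouteState schema, the route_state function and the node names are hypothetical), a routing function inspects the state and returns the name of the next node:
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class RouteState(TypedDict):
    input: str

def route_state(state: RouteState) -> str:
    # Return the name of the node the state should move to next
    return "approve" if state["input"] == "yes" else "reject"

builder = StateGraph(RouteState)
builder.add_node("approve", lambda state: print("---Approved---"))
builder.add_node("reject", lambda state: print("---Rejected---"))
builder.add_conditional_edges(START, route_state)
builder.add_edge("approve", END)
builder.add_edge("reject", END)

graph = builder.compile()
graph.invoke({"input": "yes"})  # prints ---Approved---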
LangGraph Cloud is a service for deploying and scaling these applications, featuring a Studio for easy prototyping, debugging, and sharing.
LangGraph makes agents more transparent by allowing you to inspect their behaviour and strike a balance between autonomy and following a defined sequence.
Practical LangChain Example of a LangGraph Implementation
Below is a very simple example of creating a graph in Python…
from typing import TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import MemorySaver
from IPython.display import Image, display

# The graph state; each node receives this dict and can update it
class State(TypedDict):
    input: str

def step_1(state):
    print("---Step 1---")

def step_2(state):
    print("---Step 2---")

def step_3(state):
    print("---Step 3---")

builder = StateGraph(State)
builder.add_node("step_1", step_1)
builder.add_node("step_2", step_2)
builder.add_node("step_3", step_3)
builder.add_edge(START, "step_1")
builder.add_edge("step_1", "step_2")
builder.add_edge("step_2", "step_3")
builder.add_edge("step_3", END)

# Set up memory so the graph can be checkpointed between steps
memory = MemorySaver()

# Add a breakpoint: execution pauses before step_3 runs
graph = builder.compile(checkpointer=memory, interrupt_before=["step_3"])

# View the graph
display(Image(graph.get_graph().draw_mermaid_png()))
With the graph printed out, the three steps and the breakpoint are visible.
Then a breakpoint can be added for approval, where the flow pauses and waits for permission.
The agent is then initiated, with an interrupt prior to the action being taken, as sketched below. Again, the graph approach defines the agent to a large degree, with the loop clearly indicated in the graphic representation.
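Putting this together in a minimal run (following the LangGraph breakpoint pattern; the thread_id value and the input dict are arbitrary), the graph streams until it pauses before step_3, asks for approval, and resumes by passing None:
# A thread_id lets the MemorySaver checkpointer track this particular run
thread = {"configurable": {"thread_id": "1"}}

# Run until the graph pauses at the breakpoint before step_3
for event in graph.stream({"input": "hello world"}, thread, stream_mode="values"):
    print(event)

user_approval = input("Do you want to go to Step 3? (yes/no): ")

if user_approval.lower() == "yes":
    # Passing None as the input resumes from the saved checkpoint
    for event in graph.stream(None, thread, stream_mode="values"):
        print(event)
else:
    print("Operation cancelled by user.")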
Finally
It’s a fascinating time to observe the evolution of technology and architecture as they unfold before us.
We are also witnessing how organisations are aligning with what is considered best practice.
For seasoned chatbot designers and builders, graph representations of conversations are nothing new; they have long been a dominant method in building environments. However, it’s intriguing to see a renewed emphasis on graph approaches, where we return to visual and flow-based representations.
In this context, “flow” refers not only to conversations but also to processes.
I’m currently the Chief Evangelist @ Kore AI. I explore & write about all things at the intersection of AI & language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces & more.