FlowMind Is An Automatic Workflow Generator
The FlowMind research was conducted by JPMorganChase to use LLMs to overcome the rigidity of RPA. FlowMind leverages a prompt template to ground LLM reasoning via APIs.
Data integrity and confidentiality are maintained while generating automated workflows. The effectiveness of FlowMind is demonstrated by its success in workflow generation and in handling user interaction & feedback.
My View
FlowMind reminds me very much of Autonomous Agents as we have come to know them. In particular, LangChain Agents and LlamaIndex Agentic RAG come to mind.
FlowMind & Agents
In both of these implementations the agent creates a sequence or chain of events (analogous to automatic robotic process creation) which is executed.
The only difference in this regard is that FlowMind creates the process or flow as a Python function, which it then executes to deliver the answer.
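To make the idea of a flow expressed as a Python function concrete, here is a minimal sketch. It is not taken from the paper: the generated code string, the `get_holdings` API and the `run_generated_workflow` helper are all hypothetical names chosen for illustration, and the trimmed-down builtins are only a gesture at isolation, not a real sandbox.

```python
# Hypothetical stand-in for code an LLM might return; not from the paper.
GENERATED_CODE = '''
def workflow(api):
    """Return the total value across all holdings."""
    holdings = api["get_holdings"]()
    return sum(h["value"] for h in holdings)
'''


def get_holdings():
    # Toy data source standing in for a trusted, internally owned API.
    return [
        {"name": "Fund A", "value": 120.0},
        {"name": "Fund B", "value": 80.0},
    ]


def run_generated_workflow(code: str, apis: dict):
    """Define the generated function in its own namespace, then call it."""
    # Expose only the builtins the flow needs (assumption: this is illustrative,
    # it does not make untrusted code safe to run).
    namespace = {"__builtins__": {"sum": sum}}
    exec(code, namespace)
    return namespace["workflow"](apis)


result = run_generated_workflow(GENERATED_CODE, {"get_holdings": get_holdings})
print(result)
```

The point of the sketch is that the generated function only ever touches data through the APIs handed to it, which is how the model itself can be kept away from the underlying data.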
Prompt Recipe & Prompt Template
At the heart of agents, as we have come to know them, is a prompt template.
FlowMind makes use of something called a prompt recipe, which is shown below. The prompt recipe, as JPMorganChase calls it, is in essence a template.
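As a rough illustration of what such a template could look like in code, the sketch below fills a fixed recipe with a context statement, a list of API descriptions and the user's question. The wording of the recipe, the `build_prompt` and `describe_apis` helpers, and the example APIs are assumptions made for this example; they are not the actual text used by JPMorganChase.

```python
from string import Template

# Hypothetical recipe wording: context first, then the APIs, then the coding task.
RECIPE = Template(
    "Context: $context\n\n"
    "Available APIs:\n$api_descriptions\n\n"
    "Task: write a Python function named workflow that answers the question "
    "using only the APIs listed above.\n"
    "Question: $question"
)


def describe_apis(apis: dict) -> str:
    """Render each API name and its plain-language description on one line."""
    return "\n".join(f"- {name}: {doc}" for name, doc in apis.items())


def build_prompt(context: str, apis: dict, question: str) -> str:
    """Fill the recipe so the model sees descriptions of APIs, never raw data."""
    return RECIPE.substitute(
        context=context,
        api_descriptions=describe_apis(apis),
        question=question,
    )


prompt = build_prompt(
    context="You answer questions about fund reports.",
    apis={
        "get_holdings(fund)": "returns the holdings of a fund",
        "get_report(fund)": "returns the report text for a fund",
    },
    question="What is the total value of Fund A?",
)
print(prompt)
```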
RAG & API Retrieval, Partitioning & Extraction
FlowMind aims to solve for hallucination by providing contextual reference data at inference, analogous to RAG. The API also seeks to retrieve, partition and extract relevant XML-like blocks. These blocks are very similar to chunks.
FlowMind is also challenged by the problems of selecting the top retrieved blocks/chunks and truncating blocks which are too long.
Embeddings are also used in FlowMind to search according to semantic similarity.
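The retrieval step described above can be sketched as follows: rank blocks by cosine similarity to the query, keep the top few, and truncate any that are too long. The three-dimensional vectors are hand-written toy values standing in for real embeddings, and `top_k_blocks`, `k` and `max_chars` are hypothetical names and parameters, not anything specified in the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_blocks(query_vec, blocks, k=2, max_chars=40):
    """Rank blocks by similarity, keep the best k, truncate long block text."""
    ranked = sorted(
        blocks, key=lambda b: cosine(query_vec, b["vec"]), reverse=True
    )
    return [b["text"][:max_chars] for b in ranked[:k]]


# Toy blocks with made-up vectors; a real system would call an embedding model.
BLOCKS = [
    {"text": "Fund A invests mainly in large-cap US equities.", "vec": [0.9, 0.1, 0.0]},
    {"text": "Fund B holds short-term government bonds.", "vec": [0.1, 0.9, 0.0]},
    {"text": "Office opening hours are 9 to 5.", "vec": [0.0, 0.1, 0.9]},
]

query_vec = [0.8, 0.2, 0.0]  # pretend embedding of a question about Fund A
selected = top_k_blocks(query_vec, BLOCKS, k=2, max_chars=30)
print(selected)
```

Simple character truncation is used here only for brevity; it can cut a block mid-sentence, which is exactly the kind of trade-off the truncation problem refers to.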
So FlowMind can be considered JPMorganChase’s proprietary RAG solution, and it presumably meets their data privacy and governance requirements. What I find curious is that the market in general has settled on certain terminology and developed a shared understanding.
JPMorganChase breaks from these terms and introduces their own lexicon. However, FlowMind is very much comparable to RAG in general.
It is evident that through this implementation, JPMorganChase has full control over their stack on a very granular level. The process and flow Python functions created by FlowMind most probably fit into their current ecosystem.
FlowMind can also be leveraged by skilled users to generate flows based on a description, and these flows can be re-used.
FlowMind Objectives
The aim of FlowMind is to remedy hallucination in Large Language Models (LLMs) while ensuring there is no direct link between the LLM and proprietary code or data.
FlowMind creates flows or pipelines on the fly, a process the paper refers to as robotic process automation. There is a human-in-the-loop element, which can also be seen as a single dialog turn, allowing users to interact with and refine the generated workflows.
Application Programming Interfaces (APIs) are used for grounding the LLMs, serving as a contextual reference to guide their reasoning. This is followed by code generation, code execution, and ultimately delivering the final answer.
FlowMind Framework Overview
Stage 1: It begins by following a structured lecture plan (prompt template as seen above) to create a lecture prompt. This prompt educates the Large Language Model (LLM) about the context and APIs, preparing it to write code.
Stage 2: The LLM takes user queries or tasks and automatically generates workflow code utilising the provided APIs. This workflow code is then executed to produce the desired results. During this stage, the solution establishes a feedback loop between FlowMind and the user.
FlowMind offers a high-level plain-language description of the generated workflow, and the user provides feedback to approve or refine the workflow as necessary.
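The two stages and the feedback loop can be sketched end to end with a stubbed model. Everything here is an assumption for illustration: `fake_llm` returns canned code instead of calling a real LLM, `get_fund_names` is a toy API, and the literal string "approve" is just one simple way to signal that the user accepts the workflow.

```python
def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; returns canned workflow code."""
    if "exclude Fund B" in prompt:
        return (
            "def workflow(api):\n"
            "    return [name for name in api() if name != 'Fund B']\n"
        )
    return "def workflow(api):\n    return api()\n"


def get_fund_names():
    # Toy API standing in for a trusted internal service.
    return ["Fund A", "Fund B"]


def flowmind_like_loop(lecture: str, question: str, feedback_rounds: list):
    """Stage 1 supplies the lecture prompt; Stage 2 generates, refines, executes."""
    prompt = f"{lecture}\nQuestion: {question}"
    code = fake_llm(prompt)
    for feedback in feedback_rounds:
        if feedback == "approve":
            break
        # Fold the user's feedback into the prompt and regenerate the workflow.
        prompt += f"\nUser feedback: {feedback}"
        code = fake_llm(prompt)
    namespace = {}
    exec(code, namespace)  # define the generated workflow function
    return namespace["workflow"](get_fund_names)


print(flowmind_like_loop("You may call get_fund_names().", "List the funds", ["approve"]))
print(
    flowmind_like_loop(
        "You may call get_fund_names().",
        "List the funds",
        ["exclude Fund B", "approve"],
    )
)
```

In this sketch the user's feedback is passed in as a list to keep the example deterministic; an interactive version would show the plain-language description of the workflow and collect the feedback one turn at a time.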
Generated Code Examples
Below are example questions, with the corresponding workflow and result generated by FlowMind:
Here is an example with a human-in-the-loop (HITL) approach. This can also be seen as an agent with a HITL step, or as a dialog approach where the user is prompted again.
Finally
Reading the study on FlowMind, one gets the sense that FlowMind aims to create a workflow or sequence of events based on various inputs.
Triggered by a user request, FlowMind utilises a range of tools (APIs) to fulfil the request.
The input is a natural language query from a user, which is processed by FlowMind using an LLM as its backbone. FlowMind generates the flow as a Python function, which it then executes to present the answer.
I’m currently the Chief Evangelist @ Kore AI. I explore & write about all things at the intersection of AI & language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces & more.