BFSI workloads · Case Study 02

Enterprise Agentic RAG

I kept metadata and vectors together in PostgreSQL/pgvector. LangGraph handles planning, retrieval, evaluation, and answer synthesis so the flow stays auditable.

status: PRODUCTION
environment: GCP Kubernetes (GKE)
ingress: Istio Ingress Gateway
runtime graph: 6 nodes / 6 edges

System map

Problem: I designed an agentic RAG pattern for long financial filings. The system needed layout-aware parsing, hybrid retrieval, clear state transitions, and a grounding check before any answer could ship.
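As an illustration of the hybrid retrieval step, here is a minimal sketch that merges a keyword ranking and a vector ranking with Reciprocal Rank Fusion. This is an assumption: the case study does not say which fusion method the production system uses. The function name, the constant `k = 60`, and the chunk IDs are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of chunk IDs into one list, best first."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); a larger k flattens
            # the advantage of the very top ranks.
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so output is stable.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical chunk IDs returned by a keyword search and a vector search.
keyword_hits = ["10k-p12", "10k-p40", "10k-p03"]
vector_hits = ["10k-p40", "10k-p07", "10k-p12"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Chunks that appear in both lists rise to the top, which is why rank fusion is a common way to combine lexical and semantic results without having to calibrate their raw scores against each other.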

Live path

request: Istio gateway -> FastAPI parser

Architecture Decision

Why I chose this design.

Short decision notes tied to the code or config that mattered.

Decision

rag_graph.py

I used LangGraph with PostgreSQL/pgvector because the RAG flow needed auditable states and retrieval metadata near the vector index.
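To show what keeping retrieval metadata next to the vector index makes possible, the sketch below builds one parameterized query that filters on metadata and ranks by vector distance in the same statement. The table `filing_chunks`, its columns, and the filter keys are hypothetical, since the real schema is not shown in this case study. The `<=>` operator is pgvector's cosine distance and `@>` is PostgreSQL's JSONB containment operator.

```python
import json


def build_filtered_search(
    filters: dict[str, str], query_vector: list[float], top_k: int = 5
) -> tuple[str, list[object]]:
    """Return (sql, params) for a metadata-filtered nearest-neighbour search."""
    # Assumed schema (hypothetical):
    #   filing_chunks(id, content, metadata jsonb, embedding vector)
    # pgvector accepts a text literal such as '[0.1,0.2]' cast to vector.
    vector_literal = "[" + ",".join(str(x) for x in query_vector) + "]"
    sql = (
        "SELECT id, content, embedding <=> %s::vector AS distance "
        "FROM filing_chunks "
        "WHERE metadata @> %s::jsonb "
        "ORDER BY distance "
        "LIMIT %s"
    )
    # Parameters follow the order of the %s placeholders in the SQL text.
    params = [vector_literal, json.dumps(filters, sort_keys=True), top_k]
    return sql, params


sql, params = build_filtered_search(
    {"ticker": "ACME", "form_type": "10-K"}, [0.1, 0.2], top_k=3
)
```

The returned pair would be passed to a driver call such as `cursor.execute(sql, params)`. Because the filter and the similarity ranking run in one statement, no second store has to be kept in sync with the index.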

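rag_graph.py

```python
from typing import TypedDict

# Assumed source of Document; the original excerpt does not show this import.
from langchain_core.documents import Document
from langgraph.checkpoint.postgres import PostgresSaver
from langgraph.graph import END, START, StateGraph

# plan_query, retrieve_from_pgvector, and evaluate_grounding are node
# functions defined elsewhere in the project; this excerpt omits them.


class RAGState(TypedDict):
    query: str
    filters: dict[str, str]
    retrieved_docs: list[Document]
    grounding_score: float


def compile_rag_graph(pool):
    graph = StateGraph(RAGState)
    graph.add_node("plan", plan_query)
    graph.add_node("retrieve", retrieve_from_pgvector)
    graph.add_node("evaluate", evaluate_grounding)
    graph.add_edge(START, "plan")
    graph.add_edge("plan", "retrieve")
    graph.add_edge("retrieve", "evaluate")
    graph.add_edge("evaluate", END)
    # Checkpoint every state transition to PostgreSQL so runs stay auditable.
    return graph.compile(checkpointer=PostgresSaver(pool))
```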
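The graph above runs straight from `evaluate` to `END`. The grounding check described in the problem statement would need a branch at that point, and one way to express it is a small routing function. This is a sketch under stated assumptions, not the production logic: the threshold, the retry budget, the `attempts` field, and the node names `synthesize` and `refuse` are all hypothetical.

```python
from typing import TypedDict

GROUNDING_THRESHOLD = 0.8   # hypothetical cut-off, not from the case study
MAX_RETRIEVAL_ATTEMPTS = 2  # hypothetical retry budget


class GateState(TypedDict):
    grounding_score: float
    attempts: int  # hypothetical counter of retrieval passes so far


def route_after_evaluation(state: GateState) -> str:
    """Pick the next node name from the grounding score."""
    if state["grounding_score"] >= GROUNDING_THRESHOLD:
        # Evidence supports the draft: allow answer synthesis.
        return "synthesize"
    if state["attempts"] < MAX_RETRIEVAL_ATTEMPTS:
        # Weak grounding but budget remains: retrieve again.
        return "retrieve"
    # Out of retries: decline rather than ship an ungrounded answer.
    return "refuse"


next_node = route_after_evaluation({"grounding_score": 0.55, "attempts": 1})
```

In LangGraph, a function like this would typically be registered with `graph.add_conditional_edges("evaluate", route_after_evaluation)` in place of the fixed edge to `END`, with `synthesize` and `refuse` added as nodes.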