I Fed a UCL Finance Paper Into Claude. It Told Me to Build These 6 Financial AI Agent Architectures

The paper arrived as a 40-page PDF from UCL's Institute of Finance and Technology.

Published March 2026 by Hui Gong, updated April 22. The title: "AI Agents in Financial Markets: Architecture, Applications, and Systemic Implications." Not a news article. Not a blog post. A peer-reviewed academic framework mapping how autonomous AI systems are restructuring the full workflow of financial decision-making.

I downloaded it on a Sunday morning and fed the entire thing to Claude Opus 4.8. One instruction: tell me which agent architectures in this paper are actually buildable today with existing tools, and rank them by how accessible they are to someone running a personal research operation.

What came back was a ranked list of six. Three I could build that weekend. Three that required more infrastructure but were closer than I expected.

Here is the paper, what it argues, and the six architectures Claude identified.

What the Paper Actually Says

The central argument is not about model intelligence. It is about architecture.

Hui Gong at UCL argues that the systemic implications of AI in finance depend less on how smart the models are than on how agent architectures are distributed, coupled, and governed across financial institutions. The paper defines three generations of financial AI and proposes a four-layer framework for understanding how modern agents are built.

The three generations:

Generation 1: Algorithmic finance. Rule-based execution. Automated order splitting, market-making, microstructure responses. Intelligence lies in disciplined execution, not interpretation.

Generation 2: Machine learning finance. Prediction automation. Return forecasting, credit scoring, fraud detection, portfolio analytics. Models produce signals, humans integrate them into decisions.

Generation 3: Agentic finance. Workflow automation. Systems that perceive information, reason across it, generate decision objects, and initiate or support actions end-to-end. The human stays in the loop for oversight, not for every step.

The four-layer architecture:

Layer 1: Data perception. Market prices, filings, earnings transcripts, macro releases, social signals, blockchain transaction flows, internal portfolio data.

Layer 2: Reasoning engine. Domain LLMs, retrieval systems, forecasting models, optimisation modules, memory. Not a single model but a hybrid combining multiple capabilities.

Layer 3: Strategy generation. Trade ideas, allocation proposals, anomaly alerts, hedging recommendations, compliance flags, explanatory narratives.

Layer 4: Execution and control. Order management systems, exchange APIs, smart contracts, approval workflows, position limits, audit logs, emergency stops.

The paper argues that the most plausible near-term equilibrium is bounded autonomy: agents operating as supervised co-pilots, monitoring systems, and constrained execution modules embedded within human workflows. Not autonomous trading systems. Research partners with the ability to act within defined limits.

That framing is what makes the six architectures worth building now. They are not bets on what AI in finance will look like in five years. They are buildable implementations of the most plausible current equilibrium the paper describes.

The paper's empirical finding that directly affects portfolio construction:

Section 8 of the paper includes an empirical application that most readers will miss because it is presented as a methodological illustration rather than a headline result. The mechanism it studies: how financial markets reprice across different firms when major AI capability announcements are made.

The finding is that repricing is heterogeneous. Firms with higher exposure to AI agent adoption show systematically different price reactions to AI capability disclosures than firms with lower exposure. The gap between how AI-adjacent equities and traditional equities respond to the same capability announcement is a measurable, observable channel.

The practical implication for your own research: when Anthropic, OpenAI, or Google announce a significant capability jump, the repricing across AI-adjacent equities does not happen uniformly or immediately. The differential plays out over days to weeks depending on the nature of the capability and the prior positioning of the firms affected. This is the mechanism Architecture 6 (the narrative cycle agent) is designed to detect before it fully prices in.

The Prompt That Produced the Six Architectures

Prompt

I have uploaded a UCL academic survey on AI agents in financial markets. Read the entire paper before producing any output.

The paper proposes a four-layer architecture (data perception, reasoning engine, strategy generation, execution and control) and maps four principal application domains: autonomous trading, portfolio management, risk monitoring and compliance, and DeFi and on-chain intelligence.

I want to build agents based on this framework for personal financial research. I am not a financial institution. I do not have access to proprietary data feeds or OMS infrastructure.

Identify the six most buildable agent architectures from this paper for someone in my position. Rank them by accessibility. For each: name the architecture, identify which of the four layers it uses, describe what it does, and tell me what tools and data sources I would need to build it.

Do not describe what the paper says. Translate the framework into a practical build list.

What came back mapped directly to the paper's application domains, scaled from institutional deployment to personal research use.

Architecture 1: RAG-Based Financial Research Agent

Layers used: 1 (data perception) + 2 (reasoning engine)

What it does: Ingests financial documents as a retrieval corpus. When you ask a question, the agent retrieves the relevant documents and reasons across them using Claude. The output is not a summary. It is analysis grounded in the specific documents the agent has read.

This maps directly to the paper's description of research support and document analysis as the first stage of agentic finance adoption. The personal version: your own corpus of filings, papers, and governance proposals, reasoned across by Claude via retrieval.

What you need:

Claude Opus 4.8 inside a Claude Project

Document corpus: S-1 filings, 10-Ks, earnings transcripts, research papers, governance proposals

Firecrawl or Exa MCP for live document retrieval

Prompt

Read this document alongside everything in the project knowledge base.
Do not summarise.
Answer: what does this document reveal when read against my existing research?
What does it confirm, contradict, or add that was missing?
Reference specific documents from the knowledge base when making comparisons.

What it produced: I uploaded this UCL paper alongside the BlackRock 50-agent paper and three DeFi governance forum posts. The agent connected the UCL four-layer framework to a specific governance proposal in ways that produced the clearest portfolio positioning argument I had built that month.

Where it fails: When a document in the knowledge base is outdated and the agent retrieves it without flagging the date, the analysis is built on stale data presented with current-sounding confidence. The fix is to include the publication date in every document's metadata and add a line to the prompt requiring Claude to flag any source older than 90 days before reasoning from it.
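The date check described above can also be done before the documents ever reach Claude. This is a minimal sketch, assuming each document carries a publication date in its metadata; `SourceDoc`, `split_by_freshness`, and `staleness_note` are hypothetical names for illustration, and the 90-day threshold is the one suggested above.

```python
from dataclasses import dataclass
from datetime import date, timedelta

MAX_AGE_DAYS = 90  # threshold from the fix described above


@dataclass
class SourceDoc:
    title: str
    published: date  # assumed to be stored in each document's metadata


def split_by_freshness(docs, today, max_age_days=MAX_AGE_DAYS):
    """Return (fresh, stale); a document is stale if older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    fresh = [d for d in docs if d.published >= cutoff]
    stale = [d for d in docs if d.published < cutoff]
    return fresh, stale


def staleness_note(stale, today, max_age_days=MAX_AGE_DAYS):
    """Build a plain-text warning that can be prepended to the prompt."""
    if not stale:
        return ""
    lines = [
        f"- {d.title} (published {d.published.isoformat()}, "
        f"{(today - d.published).days} days old)"
        for d in stale
    ]
    header = f"Flag these sources as older than {max_age_days} days before reasoning from them:"
    return header + "\n" + "\n".join(lines)
```

Prepending the note to the prompt means the model sees the age of each stale source explicitly instead of having to infer it from the text.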

Architecture 2: Multi-Agent Sector Research System

Layers used: 1 + 2 + 3

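What it does: Three agents, each with a distinct mandate, feeding a synthesis agent that produces a final research output. The macro interpreter and the sector analyst work independently of each other; the risk evaluator runs after them because it reads both of their outputs. This is the paper's description of FinTeam and TradingAgents scaled for personal use.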

The three agents:

Macro interpreter: Reads macro releases and central bank communications. Produces context for the current research period.

Sector analyst: Reads on-chain metrics and sector-specific news. Identifies what is happening in the market being researched.

Risk evaluator: Reads both outputs. Identifies the primary risks to any thesis formed from their analysis.

A synthesis agent reads all three and produces a structured research memo.

What you need:

Claude Haiku 4.5 for the first three agents, Opus 4.8 for synthesis

CoinGecko and LunarCrush MCPs for on-chain and social data

Exa MCP for macro and news retrieval

N8N to orchestrate sequentially

N8N Setup

Node 1: Schedule Trigger or manual trigger
Node 2: Claude (Haiku 4.5) → Macro Interpreter with Exa macro data
Node 3: Claude (Haiku 4.5) → Sector Analyst with CoinGecko and LunarCrush data
Node 4: Claude (Haiku 4.5) → Risk Evaluator with outputs from Nodes 2 and 3
Node 5: Claude (Opus 4.8) → Synthesis with all three agent outputs
Node 6: Write Binary File → 03-Projects/research-memo-[date].md
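For readers who think in code rather than workflow nodes, the data flow of Nodes 2 to 5 can be sketched in Python. This is not the N8N workflow itself and makes no real API calls: `call_model` is a hypothetical callable you would supply, and the `"haiku"` and `"opus"` strings are placeholders, not real model identifiers. The sketch runs Nodes 2 and 3 side by side because neither depends on the other; running them one after the other gives the same result.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical signature: (model_name, role, context) -> response text.
ModelCall = Callable[[str, str, str], str]


def run_research_pipeline(call_model: ModelCall, macro_data: str, sector_data: str) -> str:
    """Mirror the node order: macro and sector first, then risk, then synthesis."""
    # Nodes 2 and 3 are independent, so they can run concurrently.
    with ThreadPoolExecutor(max_workers=2) as pool:
        macro_future = pool.submit(call_model, "haiku", "Macro Interpreter", macro_data)
        sector_future = pool.submit(call_model, "haiku", "Sector Analyst", sector_data)
        macro = macro_future.result()
        sector = sector_future.result()

    # Node 4 needs both earlier outputs.
    risk_context = f"MACRO:\n{macro}\n\nSECTOR:\n{sector}"
    risk = call_model("haiku", "Risk Evaluator", risk_context)

    # Node 5 reads all three outputs and returns the memo text.
    synthesis_context = f"{risk_context}\n\nRISK:\n{risk}"
    return call_model("opus", "Synthesis", synthesis_context)
```

Passing the model call in as a parameter keeps the ordering logic testable with a stub function before any API key is involved.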

The paper cites multi-agent coordination as the most effective architecture for financial reasoning tasks requiring structured data, unstructured text, and risk assessment simultaneously. The institutional version requires proprietary data infrastructure. This version works with public APIs.

Where it fails: The macro interpreter and sector analyst agents occasionally produce outputs that directly contradict each other. The synthesis agent cannot always resolve a genuine conflict between macro and sector signals. When this happens, the memo either picks one arbitrarily or hedges into uselessness. The fix: add an explicit conflict resolution instruction to the synthesis prompt requiring Claude to name the contradiction explicitly and ask you to resolve it rather than resolving it automatically.

Architecture 3: DeFi On-Chain Intelligence Agent

Layers used: 1 + 2 + 3

What it does: The paper's Section 6.4 describes DeFi and on-chain intelligence as a distinct domain requiring agents capable of reading blockchain state, identifying anomalies, and cross-referencing on-chain patterns with governance and market data.

The personal version: an agent that reads on-chain metrics for protocols you are researching, compares them against your documented thesis positions, and flags when the data diverges from what your thesis predicts.

What you need:

CoinGecko MCP for price, volume, and TVL data

LunarCrush MCP for social volume and sentiment

Nansen or Dune for wallet-level on-chain data

Your 03-Projects thesis notes as the comparison layer

Claude Opus 4.8 for reasoning

Prompt

You are an on-chain intelligence agent. You have access to the following data for [protocol name]: current TVL, 30-day TVL change, token holder concentration, social volume trend over 14 days, and recent governance activity.
You also have access to my current thesis note for this protocol.
Your job: compare the on-chain data against my thesis.
Tell me: does the data support, challenge, or add new dimension to the position I have documented?
Identify any metric where the trend contradicts what my thesis predicts.
Flag any governance signal not captured in my thesis note.
Do not describe the data. Reason across it against my documented position.
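The "identify any metric where the trend contradicts what my thesis predicts" step can be pre-computed so Claude reasons from an explicit list instead of raw numbers. A minimal sketch, assuming the thesis note has been reduced to an expected direction per metric; the function name, the metric keys, and the tolerance parameter are all hypothetical.

```python
def find_contradictions(thesis_expectations, observed_changes, tolerance=0.0):
    """Compare expected directions against observed percentage changes.

    thesis_expectations: {metric: "up" or "down"}
    observed_changes:    {metric: percentage change over the window}
    Returns (contradictions, missing_metrics).
    """
    contradictions = []
    missing = []
    for metric, expected in thesis_expectations.items():
        if metric not in observed_changes:
            # No data for this metric: report it rather than guess.
            missing.append(metric)
            continue
        change = observed_changes[metric]
        went_up = change > tolerance
        went_down = change < -tolerance
        if (expected == "up" and went_down) or (expected == "down" and went_up):
            contradictions.append((metric, expected, change))
    return contradictions, missing


contradictions, missing = find_contradictions(
    {"tvl_30d": "up", "holder_concentration": "down", "social_volume_14d": "up"},
    {"tvl_30d": -12.0, "holder_concentration": -3.0},
)
```

In this illustrative call the thesis expects TVL to rise but the observed 30-day change is negative, so `tvl_30d` is returned as a contradiction and `social_volume_14d` is reported as missing.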

Architecture 4: Portfolio Co-Pilot (Bounded Autonomy)

Layers used: All four, with human oversight at Layer 4

What it does: The paper's most detailed treatment of bounded autonomy describes an agent that receives portfolio state, reads market data and research outputs, generates specific allocation proposals with explicit reasoning, and flags them for human review before any action is taken.

Not an autonomous trading system. The paper explicitly argues against delegating execution decisions fully to AI at the current stage. What this architecture implements is an agent that produces structured decision objects (trade idea + confidence score + risk estimate + explanation) and hands them to a human for final approval.

What you need:

Current portfolio state (manually maintained or connected via brokerage API)

Claude Opus 4.8 with your research corpus as context

A structured output format for decision objects

A review workflow before any action

Prompt

You are a portfolio co-pilot operating under bounded autonomy.
You have access to my current portfolio positions, today's market data, and my research notes.
Produce structured decision objects for any position where the current data diverges materially from the documented thesis.
For each decision object, produce:
- The position
- The observed divergence
- An allocation proposal (increase, decrease, or hold with sizing)
- A confidence score from 1 to 10
- The primary risk to the proposal
- A one-paragraph explanation tracing from data to conclusion
Do not execute anything. These are proposals for my review.
Flag the two highest-conviction proposals separately.
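One way to give the decision object a fixed shape is a small typed record that rejects malformed proposals before they reach your review queue. This is a sketch of one possible format, not the paper's schema: the field names follow the prompt above, while `Action`, `DecisionObject`, `top_conviction`, and the `approved` flag are hypothetical choices of mine.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    INCREASE = "increase"
    DECREASE = "decrease"
    HOLD = "hold"


@dataclass(frozen=True)
class DecisionObject:
    position: str
    divergence: str
    action: Action
    sizing: str           # free text; the exact sizing format is an assumption
    confidence: int       # 1 to 10, as the prompt specifies
    primary_risk: str
    explanation: str
    approved: bool = False  # stays False until a human signs off

    def __post_init__(self):
        if not 1 <= self.confidence <= 10:
            raise ValueError("confidence must be between 1 and 10")


def top_conviction(proposals, n=2):
    """Return the n highest-confidence proposals, mirroring the prompt's last line."""
    return sorted(proposals, key=lambda p: p.confidence, reverse=True)[:n]
```

Making the record immutable and defaulting `approved` to `False` keeps the bounded-autonomy rule in the data model itself: nothing downstream should act on a proposal whose flag has not been set by a person.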

Architecture 5: Risk and Compliance Monitoring Agent

Layers used: 1 + 2 + 3

What it does: The paper describes risk monitoring and compliance as an application domain where institutional AI is already deployed at scale because the task is pattern recognition against defined criteria rather than creative judgment.

For personal research: an agent that monitors regulatory filings, governance proposals, and protocol updates for risk signals that affect your documented positions.

What you need:

Exa MCP for regulatory and governance document retrieval

Your documented thesis notes as the risk comparison baseline

Claude Sonnet 4.6 running daily

Prompt

You are a risk monitoring agent. Read the regulatory, governance, and protocol update documents collected in the last 24 hours.
For each document, ask: does this represent a risk to any position I have documented in my research notes?
If yes, classify the risk as:
Critical (thesis is directly contradicted by a regulatory or governance action)
Moderate (thesis is complicated but not invalidated)
Monitor (worth watching but no immediate action needed)
Output a daily risk log with one line per flagged document: document name, risk classification, and one sentence on why it affects the specific position.
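The daily risk log is easier for other agents to consume if its format is fixed. A minimal sketch of the three-level classification and the one-line-per-document log: the pipe-separated layout, the severity ordering, and the names `RiskLevel`, `format_risk_log`, and `has_critical` are assumptions of mine, not something the paper or the prompt prescribes.

```python
from enum import Enum


class RiskLevel(Enum):
    CRITICAL = "Critical"   # thesis directly contradicted
    MODERATE = "Moderate"   # thesis complicated but not invalidated
    MONITOR = "Monitor"     # worth watching, no immediate action


_SEVERITY = {RiskLevel.CRITICAL: 0, RiskLevel.MODERATE: 1, RiskLevel.MONITOR: 2}


def format_risk_log(entries):
    """entries: list of (document_name, RiskLevel, reason). Most severe first."""
    ranked = sorted(entries, key=lambda e: _SEVERITY[e[1]])
    return "\n".join(f"{name} | {level.value} | {reason}" for name, level, reason in ranked)


def has_critical(entries):
    """True if any flagged document is classified Critical."""
    return any(level is RiskLevel.CRITICAL for _, level, _ in entries)
```

A check like `has_critical` is the kind of simple signal another scheduled agent can read before it runs.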

Architecture 6: Market Sentiment and Narrative Cycle Agent

Layers used: 1 + 2 + 3

What it does: The paper's empirical research design section describes event studies of AI-agent capability disclosures and heterogeneous market repricing. The underlying mechanism: information widely reported but not yet priced is the signal worth finding.

This architecture operationalises that in a personal research context. The agent reads social volume data, news sentiment, and on-chain accumulation patterns. It compares the information landscape against price action and flags divergences. High social volume plus declining price is a signal worth understanding. Low social volume plus on-chain accumulation may precede the narrative forming.
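The two divergence patterns named above can be written as a small rule so the agent starts from a labelled signal. This is a sketch under stated assumptions: inputs are percentage changes over whatever window you choose, "high" is read as a change at or above a hypothetical 10 percent threshold, and "low social volume" is read as social volume that is flat or falling. None of these thresholds come from the paper.

```python
def classify_divergence(social_change, price_change, accumulation_change, threshold=10.0):
    """Label the pattern formed by three percentage changes over the same window.

    social_change:       change in social volume
    price_change:        change in price
    accumulation_change: change in an on-chain accumulation metric of your choice
    """
    # High social volume plus declining price.
    if social_change >= threshold and price_change < 0:
        return "attention_without_price"
    # Quiet social volume plus on-chain accumulation.
    if social_change <= 0 and accumulation_change >= threshold:
        return "quiet_accumulation"
    return "no_clear_divergence"
```

The labels only say which pattern is present; deciding whether it matters is still left to the comparison against your thesis notes.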

What you need:

LunarCrush MCP for social volume, sentiment, and creator activity

CoinGecko MCP for price and volume data

Claude Sonnet 4.6

Your documented thesis on narrative cycle timing as the comparison baseline

Prompt

You are a narrative cycle analysis agent. You have access to social volume, sentiment trends, and price data for the following sectors and assets.
Using my documented thesis on how narrative cycles form and how on-chain data precedes CT narrative formation, identify:
1. Any sector where social volume and price action are diverging in a way that historically preceded a narrative shift
2. Any asset where the data pattern matches the early-stage signal pattern described in my thesis notes
3. The sector most likely to see narrative formation in the next 7 to 14 days based on current signal patterns
Reference the specific metrics supporting each observation.
Flag when the data is insufficient to reach a conclusion.

How the Six Work as a System

This article presents each architecture individually because each can be built that way. But the paper's insight about workflow-centric automation applies to the combination, not just each agent in isolation.

Here is how the outputs connect:

The RAG research agent (Architecture 1) builds your document knowledge base. The multi-agent sector research system (Architecture 2) reads from that knowledge base and produces research memos. Those memos go into your 03-Projects folder in Obsidian.

The portfolio co-pilot (Architecture 4) reads the 03-Projects folder. It compares your documented positions against current market data. When it flags a divergence, the DeFi monitor (Architecture 3) investigates the on-chain specifics of the flagged protocol.

The risk monitoring agent (Architecture 5) reads all incoming regulatory and governance documents. When it raises a Critical flag, it interrupts the portfolio co-pilot's next scheduled run with the new information before any allocation proposals are generated.

The narrative cycle agent (Architecture 6) feeds the crypto morning brief from the Obsidian automations stack. When it identifies a sector showing early narrative formation signals, that signal feeds the multi-agent sector research system as the highest-priority sector for the next research run.

The coordination layer the paper describes at the institutional level, where agents share information across perception, reasoning, and strategy layers, exists in this personal stack through the Obsidian vault. Every agent writes its output to the vault. Every agent reads from the vault before producing its next output. The vault is the shared memory the paper's AFMM describes.
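Since an Obsidian vault is a folder of Markdown files, the write-then-read pattern needs nothing beyond the file system. A minimal sketch, assuming notes are plain `.md` files; `write_note` and `read_folder` are hypothetical helper names, and the folder name in the usage lines follows the 03-Projects convention used in this article.

```python
from pathlib import Path


def write_note(vault: Path, folder: str, name: str, body: str) -> Path:
    """Write one agent's output into the vault as a Markdown note."""
    target = vault / folder
    target.mkdir(parents=True, exist_ok=True)
    path = target / f"{name}.md"
    path.write_text(body, encoding="utf-8")
    return path


def read_folder(vault: Path, folder: str) -> dict:
    """Load every note in a folder so the next agent can use it as context."""
    target = vault / folder
    if not target.is_dir():
        return {}
    return {p.stem: p.read_text(encoding="utf-8") for p in sorted(target.glob("*.md"))}
```

An agent run then becomes three steps: `read_folder(vault, "03-Projects")`, call the model with that context, and `write_note(...)` with the result.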

The Framework That Made the Six Make Sense

Every architecture above maps to a specific combination of the paper's four layers. The accessibility ranking follows directly from the layers used.

Layers 1 and 2 only (perception + reasoning): RAG research agent. Pure analysis. No action output. The most accessible.

Layers 1, 2, and 3 (perception + reasoning + strategy generation): Multi-agent system, DeFi monitor, risk monitor, sentiment agent. These produce structured decision objects but do not touch execution. Accessible with public APIs and Claude.

All four layers: Portfolio co-pilot. Produces decision objects and has a defined interface with execution infrastructure. The human makes all final calls. The most infrastructure-intensive but the paper argues this bounded design is also the most effective.
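The grouping above can be expressed as data, which makes the claim easy to check: each architecture is a set of layer numbers, and the deepest layer it touches determines its tier. The layer sets are taken from the "Layers used" lines earlier in this article; the short architecture names and the `group_by_depth` helper are mine.

```python
# Layer numbers: 1 perception, 2 reasoning, 3 strategy generation, 4 execution and control.
ARCHITECTURE_LAYERS = {
    "RAG research agent": {1, 2},
    "Multi-agent sector research": {1, 2, 3},
    "DeFi on-chain intelligence": {1, 2, 3},
    "Portfolio co-pilot": {1, 2, 3, 4},
    "Risk and compliance monitor": {1, 2, 3},
    "Narrative cycle agent": {1, 2, 3},
}


def group_by_depth(arch_layers):
    """Group architectures by the deepest layer they touch, shallowest first."""
    groups = {}
    for name, layers in arch_layers.items():
        groups.setdefault(max(layers), []).append(name)
    return dict(sorted(groups.items()))


tiers = group_by_depth(ARCHITECTURE_LAYERS)
for depth, names in tiers.items():
    print(f"Deepest layer {depth}: {', '.join(names)}")
```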

The paper's conclusion that bounded autonomy is the most plausible near-term equilibrium is not a limitation. It is a design principle. An agent that produces a structured decision object with explicit reasoning, and requires human review of every consequential action, produces more reliable outcomes than one that eliminates the human from the loop.

Where to Start

Six architectures is too many to build in one session. The order matters.

Build Architecture 1 first: the RAG research agent.

It requires the least infrastructure of any of the six. No N8N. No API integrations beyond Claude. Just a Claude Project, your documents, and the Exa or Firecrawl MCP for retrieval. The output from the first session will show you whether the approach is working before you invest a weekend in the automation layer. If the RAG agent is not surfacing useful cross-document connections within the first three sessions, the problem is almost certainly the document corpus, not the architecture. Fix that before building anything else.

Build Architecture 5 (risk monitoring) second.

Once the RAG agent is producing good analysis, you have a research corpus worth protecting. The risk monitoring agent reads the same documents the RAG agent uses and flags when new information threatens your documented positions. It runs on Sonnet 4.6, costs roughly $1 per month, and the daily output informs everything else you are building.

Then build Architecture 3 (DeFi monitor) or Architecture 6 (narrative cycle) based on what you research.

These two are the most crypto-specific. If your research is protocol-level, build the DeFi monitor next. If your research is narrative and positioning-focused, build the narrative cycle agent. Both connect directly to the positions the risk monitoring agent is already tracking.

Architectures 2 and 4 are the last two.

The multi-agent sector research system requires N8N orchestration. The portfolio co-pilot requires you to have enough documented positions and thesis notes to give the agent meaningful context to reason across. Both are significantly more useful after the first three have been running for a few weeks and your research corpus has depth.

What the Six Architectures Actually Cost to Run

The rates below are current Anthropic API pricing. The monthly figures are estimates derived from those rates and the usage stated for each architecture.

Confirmed rates:

Claude Haiku 4.5: $1 per million input tokens, $5 per million output tokens

Claude Sonnet 4.6: $3 per million input tokens, $15 per million output tokens

Claude Opus 4.8: $5 per million input tokens, $25 per million output tokens

Per architecture, per month with realistic usage:

Architecture 1 (RAG research agent, Opus 4.8, approximately 40 sessions per month at 20,000 input tokens + 1,000 output tokens per session): roughly $5 per month

Architecture 2 (Multi-agent system, three Haiku agents + one Opus synthesis, run twice per week, 8 runs per month): roughly $1 per month. The three Haiku agents cost approximately $0.021 per run combined. The Opus synthesis agent adds approximately $0.10 per run. Total per run: roughly $0.12.

Architecture 3 (DeFi monitor, Opus 4.8, daily, 10,000 input + 600 output tokens): roughly $2 per month

Architecture 4 (Portfolio co-pilot, Opus 4.8, weekly, 25,000 input + 1,500 output tokens): roughly $1 per month

Architecture 5 (Risk monitoring, Sonnet 4.6, daily, 8,000 input + 500 output tokens): roughly $1 per month

Architecture 6 (Narrative cycle, Sonnet 4.6, daily, 6,000 input + 400 output tokens): roughly $1 per month

Total with this model selection: approximately $11 per month.
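The figures above can be reproduced from the listed rates and token counts. This sketch assumes 30 runs per month for daily agents and 4 for weekly ones, and it takes the two per-run figures for Architecture 2 directly from the text because token counts for those agents are not given; the dictionary keys are labels of mine, not API model identifiers.

```python
# USD per million tokens as (input, output), from the rates listed above.
RATES = {
    "haiku-4.5": (1.0, 5.0),
    "sonnet-4.6": (3.0, 15.0),
    "opus-4.8": (5.0, 25.0),
}


def cost_per_run(model, input_tokens, output_tokens):
    """Cost in USD of one call with the given token counts."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


DAILY, WEEKLY = 30, 4  # assumed runs per month

monthly = {
    "1 RAG research": 40 * cost_per_run("opus-4.8", 20_000, 1_000),
    "2 Multi-agent": 8 * (0.021 + 0.10),  # per-run figures quoted in the text
    "3 DeFi monitor": DAILY * cost_per_run("opus-4.8", 10_000, 600),
    "4 Co-pilot": WEEKLY * cost_per_run("opus-4.8", 25_000, 1_500),
    "5 Risk monitor": DAILY * cost_per_run("sonnet-4.6", 8_000, 500),
    "6 Narrative cycle": DAILY * cost_per_run("sonnet-4.6", 6_000, 400),
}

for name, usd in monthly.items():
    print(f"Architecture {name}: ${usd:.2f}")
print(f"Total: ${sum(monthly.values()):.2f}")
```

Before rounding, the six lines sum to a little over $10; the $11 figure comes from rounding each architecture to the nearest dollar before adding.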

The most expensive architecture is the RAG research agent because Opus 4.8 is used and sessions can be long when the knowledge base is large. Switching to Sonnet 4.6 for Architecture 1 cuts that cost by about 40 percent, to roughly $3 per month, because Sonnet's rates are three-fifths of Opus's. Running everything on Sonnet 4.6 brings the total to approximately $7 per month.

One note worth knowing: Opus 4.8 uses the tokenizer introduced in the 4.7 generation, which produces up to 35% more tokens for the same input text than Opus 4.6. If your token counts were calibrated on older models, your actual usage will run higher than those counts suggest. The estimates above use conservative token counts to allow for this.

Claude's summary when I asked it to compress the paper's central argument:

Prompt

Having read this UCL survey in full, state the single most important design principle for someone building AI financial agents today.

The value of an AI financial agent is not proportional to its autonomy. It is proportional to how clearly it structures the reasoning that a human then acts on. The agent that forces explicit decision objects, traces data to conclusion, and requires human review of every consequential action produces better outcomes than the agent that eliminates the human from the loop. The paper calls this bounded autonomy. The practical implication: the most important layer to design carefully is not execution but the interface between strategy generation and the human decision that follows it.

The full paper is free at arxiv.org/abs/2603.13942. Forty pages. The most useful academic framework I have read on what AI agents in finance should actually look like right now.

For working implementations of the financial agent patterns described here, specifically Architecture 3 and Architecture 6, the BlockRunAI repository at github.com/BlockRunAI/awesome-OpenClaw-Money-Maker contains a curated list of existing agent builds for crypto trading, DeFi operations, Polymarket, and portfolio management built on OpenClaw. The BankrBot/openclaw-skills repository within it has drop-in skills for Polymarket prediction, leverage trading, and multi-chain DeFi operations. These are not the architectures described in this article but they are working reference implementations for the same underlying workflows.

Follow @damidefi on X for daily Claude AI tools, crypto analysis, and the full journey to 100K. Bookmark this. Share it with one person building trading tools who has not read what institutional researchers think the architecture should actually look like.
