

@omarsar0
14 views Aug 26, 2025
1
Fine-tuning LLM Agents without Fine-tuning LLMs

Catchy title and very cool memory technique to improve deep research agents.

Great for continuous, real-time learning without gradient updates.

Here are my notes:
2
Overview

Proposes a memory‑based learning framework that lets deep‑research agents adapt online without updating model weights.

The agent is cast as a memory‑augmented MDP with case‑based reasoning, implemented in a planner–executor loop over MCP tools.
3
Method

Decisions are guided by a learned case‑retrieval policy over an episodic Case Bank.

Non‑parametric memory retrieves the Top‑K most similar cases. Parametric memory learns a Q‑function (via soft Q‑learning, or single‑step cross‑entropy (CE) training in deep‑research settings) that ranks cases for reuse and revision.
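
To make the two retrieval modes concrete, here is a minimal sketch, not the paper's implementation. It assumes cases are stored with a small embedding vector and a 0/1 reward; the `Case` fields, `top_k`, and `soft_policy` are hypothetical names, and the stored reward stands in for a learned Q-value.

```python
import math
from dataclasses import dataclass


@dataclass
class Case:
    state: list[float]  # embedding of a past task (hypothetical representation)
    plan: str           # plan that was tried for that task
    reward: float       # 1.0 for success, 0.0 for failure


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k(query, bank, k):
    """Non-parametric read: the k stored cases most similar to the query."""
    return sorted(bank, key=lambda c: cosine(query, c.state), reverse=True)[:k]


def soft_policy(q_values, alpha=1.0):
    """Softmax over Q-values with temperature alpha (soft Q-learning style)."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / alpha) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


bank = [
    Case([1.0, 0.0], "search then summarize", 1.0),
    Case([0.0, 1.0], "run code then verify", 0.0),
    Case([0.9, 0.1], "search, crawl, then cite", 1.0),
]
query = [1.0, 0.05]
candidates = top_k(query, bank, k=2)

# Assumption: the stored reward stands in for a learned Q(state, case).
probs = soft_policy([c.reward for c in candidates], alpha=0.5)
```

In a real system the Q-values would come from a trained scorer, and the planner would sample or pick cases according to `probs` instead of using similarity alone.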
4
Architecture

Planner (LLM CBR) + Executor (LLM MCP client) with three memories: Case, Subtask, Tool.

It involves planning, tool execution, writing/reading of cases, and a replay buffer. Tools span search, crawl, multimodal document parsing, code execution, and math utilities.
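
The loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's code: the planner is a stub where an LLM would go, the `TOOLS` registry uses plain Python functions where a real executor would call MCP tools, and case reuse is by exact task match only to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Memories:
    cases: list = field(default_factory=list)     # (task, plan, reward)
    subtasks: list = field(default_factory=list)  # (subtask, result)
    tools: list = field(default_factory=list)     # (tool name, argument, output)


# Hypothetical tool registry; a real executor would dispatch to MCP tools.
TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda q: f"results for {q!r}",
    "calc": lambda expr: str(sum(int(t) for t in expr.split("+"))),
}


def plan(task, mem):
    """Stub planner: reuse a successful past plan, else emit a fixed plan."""
    for past_task, past_plan, reward in mem.cases:
        if past_task == task and reward > 0:
            return past_plan
    return [("search", task), ("calc", "2+3")]


def execute(subtask, mem):
    """Run one subtask with its tool and log to Tool and Subtask memory."""
    tool, arg = subtask
    out = TOOLS[tool](arg)
    mem.tools.append((tool, arg, out))
    mem.subtasks.append((subtask, out))
    return out


def run(task, mem):
    """Plan, execute each step, then write the episode to Case memory."""
    steps = plan(task, mem)
    outputs = [execute(s, mem) for s in steps]
    reward = 1.0 if outputs else 0.0  # placeholder for a real success signal
    mem.cases.append((task, steps, reward))
    return outputs


mem = Memories()
outputs = run("GAIA-style question", mem)
```

Calling `run` again with the same task reuses the stored plan from Case memory, which is the "learning without weight updates" idea in miniature.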
5
Results:

• GAIA: 87.88% Pass@3 on validation and 79.40% on test, competitive with or above open‑source agent frameworks
• DeepResearcher: 66.6 F1 and 80.4 partial match (PM) on average across seven open‑domain QA sets
• SimpleQA: 95.0% accuracy, beating recent web‑agent baselines
• HLE: 24.4 PM, close to GPT‑5 and ahead of several strong baselines
6
Practical takeaways for agent builders:

• Use a compact, curated case memory with adaptive retrieval rather than growing prompts.

• Keep planning concise. On GAIA, a fast planner outperforms slow‑think planners for multi‑step tool use because it avoids overly verbose plans and shortcut plans.

• Separate planning and execution with explicit Subtask and Tool memories to coordinate long‑horizon work and reduce hallucinations.
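
As one way to picture the first takeaway, here is a minimal sketch of a capped case memory. The curation rule is an assumption made for illustration (one case per task, evict the lowest-reward case when over capacity); the thread does not say how the paper curates its Case Bank, and `CaseBank` and its methods are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    task: str
    plan: str
    reward: float


class CaseBank:
    """Bounded case memory: one case per task, capped at `capacity` entries."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._cases = {}

    def write(self, case):
        old = self._cases.get(case.task)
        # Keep the better (or newer, on ties) case for a given task.
        if old is None or case.reward >= old.reward:
            self._cases[case.task] = case
        # Assumed eviction rule: drop the lowest-reward case when over capacity.
        if len(self._cases) > self.capacity:
            worst = min(self._cases.values(), key=lambda c: c.reward)
            del self._cases[worst.task]

    def tasks(self):
        return sorted(self._cases)

    def __len__(self):
        return len(self._cases)


bank = CaseBank(capacity=2)
bank.write(Case("a", "plan A", 1.0))
bank.write(Case("b", "plan B", 0.0))
bank.write(Case("c", "plan C", 1.0))
```

The point of the cap is that retrieval cost and prompt size stay flat as the agent sees more tasks, instead of growing with every episode.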

Paper: arxiv.org/abs/2508.16153