elvis
@omarsar0
A Survey on LLMs in Scientific Discovery

The next step for AI agents is scientific discovery.

This is a great paper summarizing trends and the future.

Here are my notes:

What's the paper about?

This paper presents a conceptual framework for understanding the evolving role of LLMs in scientific discovery, emphasizing their progression from task-specific tools to autonomous scientific agents.

Anchored in the stages of the scientific method, the survey proposes a three-level taxonomy (LLM as Tool, Analyst, and Scientist) and categorizes over 90 research works accordingly.

Three Levels of Autonomy:

Tool (Level 1): LLMs automate discrete tasks (e.g., literature summarization, code snippets) with direct human supervision.

Analyst (Level 2): LLMs independently handle analytical workflows, such as statistical modeling or symbolic regression, requiring less human intervention.

Scientist (Level 3): LLMs autonomously conduct multi-stage research cycles, including hypothesis generation, experimentation, and refinement, with minimal human input.

Mapping to the Scientific Method

The paper maps LLM applications to all six stages of the scientific method (e.g., hypothesis generation, data analysis, conclusion). The table shows a detailed breakdown of Level 1 works by task and domain.

Characteristics of Level 1 systems include:

- Operate with explicit prompts and limited autonomy
- Enhance researcher productivity in discrete tasks
- Produce outputs that generally require human integration and validation

Level 2

Here is the comparison and classification of Level 2 research works in LLM-based scientific discovery.

These are autonomous analytical agents that execute goal-oriented tasks with moderate human oversight.

Characteristics include:

- Perform multi-step reasoning and data modeling
- Manage sequences of tasks (e.g., analyzing experiments, refining models)
- Require humans mainly for goal definition and result validation

Level 3

Notable Level 3 systems include The AI Scientist, Agent Laboratory, and Zochi, which demonstrate autonomous literature review, idea development, experimentation, and report generation.

These systems often use agentic workflows and multi-agent feedback loops.

Unlike Level 2 systems, which require humans to define tasks or validate outputs, Level 3 systems may start from broad prompts or even operate autonomously within a domain, with human involvement limited to high-level oversight or quality control.

Challenges and Future Directions

The authors highlight key challenges for advancing LLM-based science:

- enabling fully autonomous research cycles

- integrating robotic automation for physical experiments

- achieving transparent and interpretable reasoning

- ensuring continuous self-improvement

- addressing ethical governance and societal alignment

The paper also includes a comprehensive list of related work for anyone interested in specific domains.

Paper: arxiv.org/abs/2505.13259