How do LLMs help in research and innovation?

Jun 09, 2024

Wouldn't it be cool if LLMs could help us develop new products or make breakthrough discoveries? AutoTRIZ, ResearchAgent and Scientific Generative Agent (SGA) show how this might be achieved.

The Theory of Inventive Problem Solving (TRIZ) is a well-established methodology for creative problem solving. AutoTRIZ uses an LLM to do most of the work of generating solutions. It consists of four modules. The first module identifies the problem: it takes the user's problem description and rewrites it to be clear and concise. The second module detects engineering contradictions in the problem statement. The third module adds inventive principles for the identified contradictions. Only in the last step are we ready to generate a solution. Combining a clear problem statement (LLM), an engineering contradiction (LLM), and inventive principles (knowledge base), AutoTRIZ produces a report with a potential solution.

AUTOTRIZ: ARTIFICIAL IDEATION WITH TRIZ AND LARGE LANGUAGE MODELS
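The four modules described above can be sketched as a small pipeline. This is only an illustration of the flow, not AutoTRIZ's actual code: the `LLM` callable, the prompts, the `a|b` answer format, the `Report` fields and the one-entry knowledge base are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-in for any LLM client: takes a prompt, returns text.
LLM = Callable[[str], str]

# Tiny illustrative knowledge base: (improving, worsening) -> inventive principles.
# The entry is an example only, not the real TRIZ contradiction matrix.
KNOWLEDGE_BASE = {
    ("strength", "weight"): ["Segmentation", "Composite materials"],
}


@dataclass
class Report:
    problem: str
    contradiction: tuple[str, str]
    principles: list[str]
    solution: str


def auto_triz_sketch(description: str, llm: LLM) -> Report:
    # Module 1 (LLM): make the user's description clear and concise.
    problem = llm(f"Restate this problem clearly and concisely: {description}")
    # Module 2 (LLM): detect the engineering contradiction.
    raw = llm(f"Name the improving and worsening parameters as 'a|b': {problem}")
    improving, worsening = (part.strip().lower() for part in raw.split("|", 1))
    # Module 3 (knowledge base): look up inventive principles.
    principles = KNOWLEDGE_BASE.get((improving, worsening), [])
    # Module 4 (LLM): generate a solution from everything gathered so far.
    solution = llm(
        f"Solve: {problem}\n"
        f"Contradiction: {improving} vs {worsening}\n"
        f"Principles: {', '.join(principles)}"
    )
    return Report(problem, (improving, worsening), principles, solution)


def fake_llm(prompt: str) -> str:
    # Canned answers so the sketch runs without a real model.
    if prompt.startswith("Restate"):
        return "Make the bicycle frame stronger without making it heavier."
    if prompt.startswith("Name"):
        return "Strength | Weight"
    return "Use a composite frame built from segmented tubes."


report = auto_triz_sketch("My bike frame keeps breaking, but I hate heavy bikes.", fake_llm)
print(report)
```

Note that only the third module avoids the LLM: the inventive principles come from a lookup, which keeps that step grounded in TRIZ itself.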

The story is a bit different when we switch focus from engineering tasks to scientific research. ResearchAgent has three steps: problem identification, method development, and experiment design. The first step, problem identification, takes three inputs: a paper, an academic graph, and a knowledge store. Basically, the academic graph and the knowledge store are a form of RAG. The graph helps ResearchAgent find related scientific papers quickly. The goal of the first step is to identify gaps or contradictions in the existing literature. Then ResearchAgent mimics the human approach by developing methods and designing experiments. Compared with AutoTRIZ, there are two main differences: RAG and Reviewing Agents, which improve the output of every step in ResearchAgent.

ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models
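The three steps and the review loop can be sketched roughly as follows. This is a simplified illustration, not ResearchAgent's implementation: the `generate` and `review` callables stand in for LLM calls, and the score threshold, round limit and prompt layout are assumptions made for the example.

```python
from typing import Callable

# Hypothetical LLM-backed callables:
Generate = Callable[[str, str], str]          # (context, feedback) -> draft
Review = Callable[[str], tuple[float, str]]   # draft -> (score, feedback)


def refine(context: str, generate: Generate, review: Review,
           threshold: float = 4.0, max_rounds: int = 3) -> str:
    """Draft, review, and redraft until the reviewer is satisfied."""
    feedback = ""
    draft = generate(context, feedback)
    for _ in range(max_rounds):
        score, feedback = review(draft)
        if score >= threshold:
            break
        draft = generate(context, feedback)
    return draft


def research_agent_sketch(paper: str, related: list[str],
                          generate: Generate, review: Review) -> dict[str, str]:
    # Retrieved context: the core paper plus related papers (the RAG part).
    context = paper + "\n" + "\n".join(related)
    idea: dict[str, str] = {}
    for step in ("problem", "method", "experiment"):
        idea[step] = refine(f"{step}:\n{context}", generate, review)
        # Each later step builds on the output of the earlier ones.
        context += f"\n{step}: {idea[step]}"
    return idea


def fake_generate(context: str, feedback: str) -> str:
    # Canned generator: revises the draft once it has received feedback.
    step = context.split(":", 1)[0]
    return f"{step} draft (revised)" if feedback else f"{step} draft"


def fake_review(draft: str) -> tuple[float, str]:
    # Canned reviewer: only accepts revised drafts.
    return (5.0, "ok") if "revised" in draft else (2.0, "be more specific")


idea = research_agent_sketch("core paper", ["related paper A"], fake_generate, fake_review)
print(idea)
```

With the canned stubs, every step is rejected once and accepted on its second draft, which is the behaviour the reviewing loop is meant to show.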

The missing piece in ResearchAgent is simulation, or actually conducting experiments. Scientific Generative Agent (SGA) closes this gap. On one side, SGA can propose scientific hypotheses; on the other, it has a module that runs simulations in Python. We can think of it as a for-loop in which every iteration generates a solution and optimises it. All of this is done with the help of an LLM: it proposes a solution, writes Python code for the simulation, and receives feedback from the simulation.

LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery
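The for-loop picture can be made concrete with a toy example. This is a sketch under heavy assumptions, not SGA's code: the "experiment" is made-up data from y = 3x², the proposer is a stub that picks from a fixed menu of forms instead of an LLM writing simulation code, and the inner optimiser is plain gradient descent with a finite-difference gradient.

```python
from typing import Callable

# Observations from a made-up "experiment": y = 3 * x**2
XS = [0.0, 1.0, 2.0, 3.0]
YS = [3.0 * x**2 for x in XS]

# Hypotheses the stub proposer can choose from: name -> f(x, a)
FORMS: dict[str, Callable[[float, float], float]] = {
    "linear": lambda x, a: a * x,
    "quadratic": lambda x, a: a * x**2,
}

History = list[tuple[str, float, float]]  # (form, fitted parameter, loss)


def simulate(form: str, a: float) -> float:
    """Mean squared error between the hypothesis and the observations."""
    f = FORMS[form]
    return sum((f(x, a) - y) ** 2 for x, y in zip(XS, YS)) / len(XS)


def fit_parameter(form: str, steps: int = 200, lr: float = 0.01,
                  eps: float = 1e-4) -> tuple[float, float]:
    """Inner level: optimise the continuous parameter by gradient descent."""
    a = 1.0
    for _ in range(steps):
        grad = (simulate(form, a + eps) - simulate(form, a - eps)) / (2 * eps)
        a -= lr * grad
    return a, simulate(form, a)


def sga_sketch(propose: Callable[[History], str], rounds: int = 3,
               tolerance: float = 1e-6) -> tuple[str, float, float]:
    history: History = []
    for _ in range(rounds):
        form = propose(history)        # outer level: propose a hypothesis
        a, loss = fit_parameter(form)  # inner level: optimise it in simulation
        history.append((form, a, loss))  # simulation feedback for the next proposal
        if loss < tolerance:
            break
    return min(history, key=lambda h: h[2])


def fake_propose(history: History) -> str:
    # Stub for the LLM: propose the first form that has not been tried yet.
    tried = {form for form, _, _ in history}
    return next(name for name in FORMS if name not in tried)


best = sga_sketch(fake_propose)
print(best)
```

The split matters: the proposer handles the discrete choice (which form of law to try), while the simulation handles the continuous part (which parameter value fits best) and reports a loss back.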

From these three approaches, a pattern is emerging:

Identify the high-level plan that people follow, such as TRIZ or the scientific process
Split the plan into small steps
Use an LLM for generation and validation
Where possible, use formal methods to check correctness, such as running Python code

In my personal opinion, all of this is possible because LLMs are especially good at solving the Blank Page Problem: you start a new project and stare at the empty page without taking action. An LLM can easily generate that first step.

References:

https://arxiv.org/abs/2403.13002 - AutoTRIZ: Artificial Ideation with TRIZ and Large Language Models
https://arxiv.org/abs/2404.07738 - ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models
https://arxiv.org/abs/2405.09783 - LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery
https://arxiv.org/abs/2402.12993 - An Autonomous Large Language Model Agent for Chemical Literature Data Mining

Shchegrikovich LLM
