An Intelligent Agentic System for Complex Image Restoration Problems

Kaiwen Zhu1,2 *, Jinjin Gu3 *, Zhiyuan You4,5, Yu Qiao2, Chao Dong5,2 †
1Shanghai Jiao Tong University   2Shanghai Artificial Intelligence Laboratory
3The University of Sydney   4The Chinese University of Hong Kong 5Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences
ICLR 2025
*: Equal Contribution     †: Corresponding Author

Abstract

Real-world image restoration (IR) is inherently complex and often requires combining multiple specialized models to address diverse degradations. Inspired by human problem-solving, we propose AgenticIR, an agentic system that mimics the human approach to image processing by following five key stages: Perception, Scheduling, Execution, Reflection, and Rescheduling. AgenticIR leverages large language models (LLMs) and vision-language models (VLMs) that interact via text generation to dynamically operate a toolbox of IR models. We fine-tune VLMs for image quality analysis and employ LLMs for reasoning, guiding the system step by step. To compensate for LLMs' lack of specific IR knowledge and experience, we introduce a self-exploration method, allowing the LLM to observe and summarize restoration results into referable documents. Experiments demonstrate AgenticIR's potential in handling complex IR tasks, representing a promising path toward achieving general intelligence in visual processing.

Method

Learning from exploration

LLMs alone fail to grasp the intricate interactions among restoration operations and thus cannot plan reliably. To equip LLMs with the ability to plan, we let the agent explore on its own beforehand and then summarize the accumulated experience into distilled knowledge. This knowledge provides concrete grounding for planning at inference time, helping the agent make correct and consistent decisions.
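The explore-then-summarize idea can be illustrated with a minimal sketch. It is not the AgenticIR implementation: the names `explore`, `summarize`, and `try_order` are hypothetical, the success check is supplied by the caller, and a simple pairwise tally stands in for the LLM-written summary described above.

```python
from collections import defaultdict
from itertools import permutations
from typing import Callable, Dict, Iterable, List, Tuple

Order = Tuple[str, ...]
PairStats = Dict[Tuple[str, str], List[int]]


def explore(
    degradation_sets: Iterable[Order],
    try_order: Callable[[Order], bool],
) -> PairStats:
    """Try every ordering of each degradation set.

    `try_order` is a hypothetical callback that runs the operations in the
    given order and reports whether the result was acceptable. For each pair
    (a, b) we tally [successes, failures] of runs where a came before b.
    """
    stats: PairStats = defaultdict(lambda: [0, 0])
    for degradations in degradation_sets:
        for order in permutations(degradations):
            ok = try_order(order)
            for i, first in enumerate(order):
                for later in order[i + 1:]:
                    stats[(first, later)][0 if ok else 1] += 1
    return dict(stats)


def summarize(stats: PairStats) -> List[str]:
    """Turn the tallies into short text lines that a planner could consult.

    Assumption: a plain tally replaces the LLM-generated summary.
    """
    return [
        f"{a} before {b}: {wins} successes, {losses} failures"
        for (a, b), (wins, losses) in sorted(stats.items())
    ]
```

In this sketch the summary lines play the role of the referable knowledge: a scheduler can read them to prefer orderings that succeeded more often during exploration.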

Workflow

An example illustrating the framework of AgenticIR. (a) presents the entire workflow, where the speech bubbles beside the robots represent responses from LLMs and VLMs, and the circled numbers correspond to those in (b). (b) highlights the tree-search nature of the system. (c) details how a single-degradation restoration operation is executed with the toolbox.
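The tree-search view in (b) can be sketched as a depth-first search with rollback. This is a simplified illustration under stated assumptions, not the authors' code: `restore`, `schedule`, `execute`, and `reflect` are hypothetical names, the list `todo` stands for the degradations found during Perception, and trying the next candidate after a failed reflection stands in for Rescheduling.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple

Image = Any  # placeholder type; a real system would pass image data


def restore(
    image: Image,
    todo: Sequence[str],
    schedule: Callable[[Image, Sequence[str]], List[str]],
    execute: Callable[[Image, str], Image],
    reflect: Callable[[Image, str], bool],
) -> Optional[Tuple[Image, List[str]]]:
    """Depth-first search over operation orders.

    schedule: proposes the remaining operations in preferred order.
    execute:  applies one operation (e.g. the best tool from a toolbox).
    reflect:  judges whether the operation's result is acceptable.
    Returns (restored image, operations applied), or None if every
    ordering fails.
    """
    if not todo:
        return image, []
    for op in schedule(image, todo):
        candidate = execute(image, op)
        if not reflect(candidate, op):
            # Roll back: discard `candidate` and try the next operation.
            continue
        remaining = [t for t in todo if t != op]
        result = restore(candidate, remaining, schedule, execute, reflect)
        if result is not None:
            final, path = result
            return final, [op] + path
    return None
```

Because each recursive call keeps the input `image` untouched, rolling back is simply a matter of abandoning the failed candidate; a real system would also need a fallback for the case where every ordering fails.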

Effectiveness of designs

Real-world application

Comparison with other methods

All-in-one models may struggle to handle so many types of degradations. Randomly invoking the corresponding single-degradation models may select inappropriate models or an unsuitable execution order.

BibTeX

@inproceedings{agenticir,
  title={An Intelligent Agentic System for Complex Image Restoration Problems},
  author={Kaiwen Zhu and Jinjin Gu and Zhiyuan You and Yu Qiao and Chao Dong},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025},
  url={https://openreview.net/forum?id=3RLxccFPHz}
}