
StepFun AI Announces Low-Cost Deep Research Agent Step-DeepResearch

StepFun AI introduced Step-DeepResearch, a new 32B-parameter deep research agent model that aims to transform web searches into fully-fledged research workflows.

AI Redefines Research

Step-DeepResearch is a 32-billion-parameter end-to-end deep research agent that aims to turn web search into genuine research workflows through long-horizon reasoning, tool use, and structured reporting. The model is built on Qwen2.5-32B-Base and trained to act as a single agent that plans, explores sources, verifies evidence, and writes cited reports while keeping inference cost low.

Atomic Capabilities and Innovative Architecture

Step-DeepResearch reframes the research process as sequential decision-making over a compact set of 'atomic capabilities': planning and task decomposition, deep information search, reflection and verification, and professional report generation. Instead of coordinating many external agents, the system folds this loop into a single model that decides the next action at each step. The design follows a broader trend of AI agents taking on increasingly complex tasks.

Targeted Data Synthesis and Three-Stage Training

To instill these atomic capabilities, the team built a separate data synthesis pipeline for each skill. Training proceeds in three stages: mid-training, supervised fine-tuning, and reinforcement learning. Mid-training has two phases: the first instills the atomic capabilities without tools, and the second introduces explicit tool calls and extends the context length to 128k tokens. Supervised fine-tuning then integrates full deep research traces. In the final stage, PPO-based reinforcement learning in a real tool environment trains the agent against a 'Rubrics Judge' that scores its reports against detailed checklists.
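The idea of scoring a report against a checklist can be illustrated with a minimal sketch. This is not StepFun's implementation: the `RubricItem` type, the weights, and the keyword checks are hypothetical stand-ins for the verdicts an actual judge model would produce. It only shows how a weighted checklist can be reduced to a single scalar reward of the kind a PPO-style trainer consumes.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RubricItem:
    """One checklist entry. All fields are hypothetical, for illustration only."""
    description: str
    weight: float
    # Stand-in for a judge model's verdict on this item (True = satisfied).
    check: Callable[[str], bool]


def rubric_reward(report: str, rubric: List[RubricItem]) -> float:
    """Return the weighted fraction of rubric items the report satisfies (0..1)."""
    total = sum(item.weight for item in rubric)
    if total == 0:
        return 0.0
    earned = sum(item.weight for item in rubric if item.check(report))
    return earned / total


# Toy rubric: simple keyword checks replace real judge calls.
rubric = [
    RubricItem("cites at least one source", 2.0, lambda r: "[1]" in r),
    RubricItem("has a summary section", 1.0, lambda r: "Summary" in r),
    RubricItem("mentions limitations", 1.0, lambda r: "limitation" in r.lower()),
]

report = "Summary\nThe model is built on Qwen2.5 [1]."
reward = rubric_reward(report, rubric)
print(reward)  # 3.0 of 4.0 weight satisfied
```

In this toy case the report satisfies the citation and summary items but not the limitations item, so it earns three quarters of the available weight.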

Real-Time Tool Usage and Comprehensive Evaluation

At inference time, the model runs as a single ReAct-style agent that alternates between thinking, calling tools, and observing the results. Its toolset includes batch web search, a to-do list manager, shell commands, and file operations. For knowledge acquisition it relies on a proprietary Search API built on more than 20 million high-quality academic articles and 600 premium indexes, together with a curated authority indexing strategy that isolates more than 600 trusted domains. The setup underlines how much a deep research agent depends on purpose-built retrieval infrastructure rather than legacy systems.
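A think, act, observe loop of this kind can be sketched in a few lines. The sketch below is an assumption-laden illustration, not the Step-DeepResearch runtime: the tool functions are stubs, the tool names merely echo the toolset described above, and `scripted_policy` is a hypothetical stand-in for the model's next-action decision.

```python
from typing import Callable, Dict, List, Tuple

Action = Tuple[str, str]  # (tool name, tool argument)


def batch_web_search(arg: str) -> str:
    # Stub: a real tool would send several queries to a search API at once.
    queries = [q.strip() for q in arg.split(";") if q.strip()]
    return f"{len(queries)} result sets"


def todo(arg: str) -> str:
    # Stub: a real tool would persist and update a task list.
    return f"todo updated: {arg}"


TOOLS: Dict[str, Callable[[str], str]] = {
    "batch_web_search": batch_web_search,
    "todo": todo,
}


def scripted_policy(history: List[str]) -> Tuple[str, Action]:
    """Hypothetical stand-in for the model: returns (thought, action)."""
    step = sum(1 for h in history if h.startswith("Observation:"))
    script = [
        ("Plan the work first.", ("todo", "find sources; verify; write report")),
        ("Search for evidence.", ("batch_web_search", "Step-DeepResearch; ADR-Bench")),
        ("Enough evidence gathered.", ("finish", "Report with citations")),
    ]
    return script[min(step, len(script) - 1)]


def run_agent(task: str, policy, max_steps: int = 8) -> List[str]:
    """Run one think/act/observe loop until the policy finishes or steps run out."""
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        thought, (tool, arg) = policy(history)
        history.append(f"Thought: {thought}")
        if tool == "finish":
            history.append(f"Final: {arg}")
            break
        observation = TOOLS[tool](arg) if tool in TOOLS else f"unknown tool: {tool}"
        history.append(f"Action: {tool}({arg})")
        history.append(f"Observation: {observation}")
    return history


trace = run_agent("Summarize Step-DeepResearch", scripted_policy)
for line in trace:
    print(line)
```

Keeping the whole loop inside one policy call per step mirrors the single-agent design described above: there is no orchestrator handing work to sub-agents, only one decision-maker choosing the next tool.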

Competitive Performance and Early Access

The model was evaluated on ADR-Bench, a Chinese benchmark of 110 open-ended tasks. In Elo ratings derived from expert evaluations, the 32B model is reported to outperform larger open models and to be competitive with systems such as Kimi-Researcher. On Scale AI's 'Research Rubrics' it reaches 61.42% rubric compliance, which is reported as comparable to the deep research systems from OpenAI and Gemini. The model is currently available for early access.
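For readers unfamiliar with Elo ratings, the sketch below shows the standard Elo update applied to pairwise judgments. How ADR-Bench actually derives its ratings is not described here, so the K-factor, the starting rating, and the system names are all assumptions, and the comparisons are invented placeholders rather than benchmark results.

```python
from typing import Dict, List, Tuple


def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expectation that A beats B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update_elo(
    ratings: Dict[str, float],
    comparisons: List[Tuple[str, str]],  # (winner, loser) pairs
    k: float = 32.0,        # assumed K-factor
    initial: float = 1000.0,  # assumed starting rating
) -> Dict[str, float]:
    """Apply sequential Elo updates and return a new ratings dict."""
    ratings = dict(ratings)
    for winner, loser in comparisons:
        r_w = ratings.setdefault(winner, initial)
        r_l = ratings.setdefault(loser, initial)
        # The winner gains what the loser gives up, scaled by how surprising the win was.
        delta = k * (1.0 - expected_score(r_w, r_l))
        ratings[winner] = r_w + delta
        ratings[loser] = r_l - delta
    return ratings


# Placeholder judgments between hypothetical systems, not real results.
comparisons = [
    ("system_a", "system_b"),
    ("system_a", "system_c"),
    ("system_b", "system_c"),
]
ratings = update_elo({}, comparisons)
for name, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(name, round(rating, 1))
```

Because every update moves the same number of points from loser to winner, the total rating in the pool stays constant, which makes the final ordering the meaningful output rather than the absolute numbers.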

The release of Step-DeepResearch can be read as part of the wider competition in cloud and AI infrastructure. It also suggests that deep research tools of this kind may feature more prominently on the technology agendas of global forums.
