System Prompt Optimization with Meta-Learning

Published 14 May 2025 in cs.CL, cs.AI, and cs.LG | (2505.09666v2)

Abstract: LLMs have shown remarkable capabilities, with optimizing their input prompts playing a pivotal role in maximizing their performance. However, while LLM prompts consist of both the task-agnostic system prompts and task-specific user prompts, existing work on prompt optimization has focused on user prompts specific to individual queries or tasks, and largely overlooked the system prompt that is, once optimized, applicable across different tasks and domains. Motivated by this, we introduce the novel problem of bilevel system prompt optimization, whose objective is to design system prompts that are robust to diverse user prompts and transferable to unseen tasks. To tackle this problem, we then propose a meta-learning framework, which meta-learns the system prompt by optimizing it over various user prompts across multiple datasets, while simultaneously updating the user prompts in an iterative manner to ensure synergy between them. We conduct experiments on 14 unseen datasets spanning 5 different domains, on which we show that our approach produces system prompts that generalize effectively to diverse user prompts. Also, our findings reveal that the optimized system prompt enables rapid adaptation even to unseen tasks, requiring fewer optimization steps for test-time user prompts while achieving improved performance.

Summary

The paper reports that MetaSPO, a bilevel meta-learning method, improves system prompt generalization on 14 unseen datasets spanning 5 domains.
The method uses two nested loops: an inner loop that refines user prompts and an outer loop that optimizes the system prompt.
The optimized system prompt reduces the number of optimization steps needed to adapt user prompts at test time.

The paper "System Prompt Optimization with Meta-Learning" introduces an approach for optimizing system prompts in LLMs so that they generalize and adapt across diverse tasks and domains. The method targets an often-overlooked component, the system prompt: the task-agnostic instruction that guides the LLM's baseline behavior.

Introduction to Bilevel System Prompt Optimization

The research highlights the sensitivity of LLM performance to both system and user prompts. While prior work primarily focused on optimizing user prompts for specific tasks, this study emphasizes the importance of optimizing system prompts. The goal is to create system prompts that are robust to varied user prompts and that transfer across tasks, including those not encountered during the optimization phase.

Traditional prompt optimization methods focus on single-task user prompts, limiting their generalization to tasks outside the training distribution. By contrast, bilevel system prompt optimization aims to enhance the flexibility and transferability of system prompts. The process is framed as a bilevel optimization problem, where the higher-level objective is the system prompt's generalization across tasks, and the lower-level objective involves optimizing user prompts for task-specific performance.
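
The bilevel structure described above can be written compactly. The following is a sketch under assumed notation, not necessarily the paper's own: S is the system prompt, U_i the user prompt for task T_i drawn from a task distribution \mathcal{T}, (q, a) a query and its reference answer, and f a task metric such as accuracy.

```latex
\begin{aligned}
S^{*} &= \arg\max_{S}\; \mathbb{E}_{T_i \sim \mathcal{T}}
  \Big[ \mathbb{E}_{(q,a) \sim T_i}
  \big[ f\big(\mathrm{LLM}(S, U_i^{*}, q),\, a\big) \big] \Big] \\
\text{s.t.}\quad U_i^{*} &= \arg\max_{U}\; \mathbb{E}_{(q,a) \sim T_i}
  \big[ f\big(\mathrm{LLM}(S, U, q),\, a\big) \big]
\end{aligned}
```

The first line is the higher-level objective (generalization of S across tasks); the constraint is the lower-level objective (the best user prompt for each task given S).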

Figure 1: Conceptual comparison of conventional task-specific optimization and bilevel system prompt optimization.

Meta-Learning Framework for System Prompt Optimization

The proposed method, Meta-level System Prompt Optimizer (MetaSPO), utilizes a meta-learning framework to optimize system prompts. This framework consists of two hierarchical loops: an inner loop for user prompt optimization and an outer loop for system prompt optimization (Figure 2).

Figure 2: Overview of MetaSPO's inner and outer loops for prompt optimization.

Inner Loop

In the inner loop, user prompts are iteratively refined to maximize task-specific performance. This involves analyzing incorrectly predicted examples to generate and evaluate candidate user prompts.
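
A minimal sketch of such an inner loop follows, assuming exact-match accuracy as the metric and a greedy keep-the-best update. The names `inner_loop`, `accuracy`, `toy_model`, and `toy_propose` are hypothetical; the toy functions stand in for real LLM calls so the example runs on its own, and none of this is the paper's actual implementation.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]                              # (query, reference answer)
Model = Callable[[str, str, str], str]                 # (system, user, query) -> answer
Proposer = Callable[[str, List[Example]], List[str]]   # (prompt, failures) -> candidates


def accuracy(model: Model, system_prompt: str, user_prompt: str,
             examples: List[Example]) -> float:
    """Fraction of examples answered with an exact match."""
    hits = sum(model(system_prompt, user_prompt, q) == a for q, a in examples)
    return hits / len(examples)


def inner_loop(model: Model, propose: Proposer, system_prompt: str,
               user_prompt: str, examples: List[Example], steps: int = 3):
    """Refine one task's user prompt while the system prompt stays fixed."""
    best = user_prompt
    best_score = accuracy(model, system_prompt, best, examples)
    for _ in range(steps):
        # Collect the examples the current best prompt still gets wrong.
        failures = [(q, a) for q, a in examples
                    if model(system_prompt, best, q) != a]
        if not failures:
            break
        # Generate candidates from the failures and keep any that score higher.
        for candidate in propose(best, failures):
            score = accuracy(model, system_prompt, candidate, examples)
            if score > best_score:
                best, best_score = candidate, score
    return best, best_score


# Toy stand-ins (assumptions for illustration only, no LLM involved).
def toy_model(system_prompt: str, user_prompt: str, query: str) -> str:
    a, b = map(int, query.split("+"))
    return str(a + b) if "carefully" in user_prompt else "?"


def toy_propose(user_prompt: str, failures: List[Example]) -> List[str]:
    return [user_prompt + " Be brief.",
            user_prompt + " Add the numbers carefully."]


EXAMPLES = [("1+2", "3"), ("10+5", "15")]
best_prompt, best_score = inner_loop(
    toy_model, toy_propose, "You are helpful.", "Answer the question.", EXAMPLES)
```

In a real setting, `toy_model` would call the target LLM and `toy_propose` would ask an optimizer LLM to rewrite the prompt based on the failed examples.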

Outer Loop

In the outer loop, the system prompt is optimized across a distribution of tasks. This involves evaluating potential system prompts to ensure generality and compatibility with a variety of user prompts and tasks.
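
One outer-loop update could look like the sketch below. It assumes the candidate system prompts are scored by mean exact-match accuracy over every source task and every user prompt for that task; `outer_step`, `meta_score`, and the toy functions are hypothetical names introduced here, not the paper's code.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]                   # (query, reference answer)
Model = Callable[[str, str, str], str]      # (system, user, query) -> answer
Tasks = Dict[str, List[Example]]            # task name -> examples
UserPrompts = Dict[str, List[str]]          # task name -> user prompts
SystemProposer = Callable[[str, List[Example]], List[str]]


def accuracy(model: Model, system_prompt: str, user_prompt: str,
             examples: List[Example]) -> float:
    """Fraction of examples answered with an exact match."""
    hits = sum(model(system_prompt, user_prompt, q) == a for q, a in examples)
    return hits / len(examples)


def meta_score(model: Model, system_prompt: str,
               user_prompts: UserPrompts, tasks: Tasks) -> float:
    """Mean accuracy of one system prompt over all tasks and user prompts."""
    return mean([accuracy(model, system_prompt, u, examples)
                 for name, examples in tasks.items()
                 for u in user_prompts[name]])


def outer_step(model: Model, propose_system: SystemProposer, system_prompt: str,
               user_prompts: UserPrompts, tasks: Tasks) -> str:
    """Return the candidate system prompt with the best cross-task score."""
    failures = [(q, a)
                for name, examples in tasks.items()
                for u in user_prompts[name]
                for q, a in examples
                if model(system_prompt, u, q) != a]
    if not failures:
        return system_prompt
    # The current prompt is listed first, so it wins ties.
    candidates = [system_prompt] + propose_system(system_prompt, failures)
    return max(candidates,
               key=lambda s: meta_score(model, s, user_prompts, tasks))


# Toy stand-ins (assumptions for illustration only, no LLM involved).
def toy_model(system_prompt: str, user_prompt: str, query: str) -> str:
    if "follow" not in system_prompt:
        return "?"
    if "uppercase" in user_prompt:
        return query.upper()
    if "reverse" in user_prompt:
        return query[::-1]
    return "?"


def toy_propose_system(system_prompt: str, failures: List[Example]) -> List[str]:
    return ["Be concise.", "Always follow the user's instruction exactly."]


TASKS = {"upper": [("ab", "AB")], "reverse": [("ab", "ba")]}
USER_PROMPTS = {"upper": ["Write it in uppercase.", "uppercase:"],
                "reverse": ["Please reverse the text."]}
new_system = outer_step(toy_model, toy_propose_system,
                        "You are a helpful assistant.", USER_PROMPTS, TASKS)
```

Scoring each candidate against several user prompts per task, rather than one, is what pushes the selected system prompt toward robustness. A full run would alternate this step with per-task user prompt refinement.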

Experimental Results

The paper's experiments evaluate MetaSPO on 14 unseen tasks from five domains and report improved generalization in two settings: applying the optimized system prompt without further user prompt adaptation, and using it as the starting point for test-time user prompt optimization:

Unseen Generalization: MetaSPO outperforms conventional baselines by better generalizing across diverse tasks (Figure 3).
Test-Time Adaptation: The approach also enhances the efficiency of user prompt optimization, achieving faster convergence with fewer examples.

Figure 3: Performance comparison showing the superiority of MetaSPO-optimized user prompts.

Detailed Analyses

Detailed analyses in the paper highlight several key insights:

Source-Target Task Similarity: Performance improvement correlates with task similarity, illustrating the benefit of optimizing on source tasks that resemble the target task.
Cross-Domain Generalization: MetaSPO's optimized prompts remain effective even in cross-domain scenarios, demonstrating robustness.
Efficiency: The optimized prompts reduce computational requirements for test-time adaptation, showcasing substantial resource savings (Figure 4).

Figure 4: Efficiency for test-time adaptation showing optimization iteration and data quantity impacts.
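
One simple way to quantify this kind of efficiency is to count how many optimization iterations a run needs before it reaches a target score. The sketch below assumes a per-iteration score curve is available; the function name `iterations_to_target` is hypothetical, and the two curves are placeholder numbers made up for illustration, not results from the paper.

```python
from typing import Optional, Sequence


def iterations_to_target(scores: Sequence[float], target: float) -> Optional[int]:
    """Index of the first iteration whose score reaches the target, else None."""
    for i, score in enumerate(scores):
        if score >= target:
            return i
    return None


# Placeholder curves (NOT from the paper): score after each adaptation iteration.
default_curve = [0.40, 0.48, 0.55, 0.61, 0.66]
optimized_curve = [0.52, 0.62, 0.67, 0.70, 0.72]

steps_default = iterations_to_target(default_curve, 0.60)
steps_optimized = iterations_to_target(optimized_curve, 0.60)
```

Comparing the two counts at the same target gives a single number for how much adaptation effort a better starting system prompt saves.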

Conclusion

The introduction of bilevel system prompt optimization marks a significant advancement in enhancing the adaptability and robustness of LLMs across diverse tasks. MetaSPO's meta-learning approach effectively leverages task distributions to produce system prompts that can adapt to a wide range of scenarios with minimal additional tuning. Future work could explore further scalability of this approach, potentially extending it to other types of model adaptations and optimizations.