Hierarchical Expert Prompt for Large-Language-Model: An Approach Defeat Elite AI in TextStarCraft II for the First Time

Published 16 Feb 2025 in cs.AI | (2502.11122v1)

Abstract: Since the emergence of the LLM, LLM has been widely used in fields such as writing, translating, and searching. However, there is still great potential for LLM-based methods in handling complex tasks such as decision-making in the StarCraft II environment. To address problems such as lack of relevant knowledge and poor control over subtasks of varying importance, we propose a Hierarchical Expert Prompt (HEP) for LLM. Our method improves the understanding of game situations through expert-level tactical knowledge, improving the processing quality of tasks of varying importance through a hierarchical framework. Our approach defeated the highest level (Elite) standard built-in agent in TextStarCraft II for the first time and consistently outperformed the baseline method in other difficulties. Our experiments suggest that the proposed method is a practical solution for tackling complex decision-making challenges. The replay video can be viewed on https://www.bilibili.com/video/BV1uz42187EF and https://youtu.be/dO3PshWLV5M, and our codes have been open-sourced on https://github.com/luchang1113/HEP-LLM-play-StarCraftII.


Summary

The paper introduces a Hierarchical Expert Prompt framework that integrates expert tactic prompts and hierarchical decision protocols to enhance LLM strategic decisions in TextStarCraft II.
The methodology achieves a 75% win rate against VeryHard opponents and records first-time victories over Elite AI, outperforming traditional baseline methods.
Ablation studies confirm that both the Expert Tactic Prompt and Hierarchical Decision Prompt modules are critical for improved resource management and tactical performance.


Introduction

The paper "Hierarchical Expert Prompt for Large-Language-Model: An Approach Defeat Elite AI in TextStarCraft II for the First Time" introduces the Hierarchical Expert Prompt (HEP) framework, designed to improve the decision-making of large language models (LLMs) in complex environments such as StarCraft II. By combining expert tactical knowledge with a hierarchical decision framework, the approach addresses weaknesses in resource management, military development, and technology upgrades that existing LLM-based methods such as Chain of Summary (CoS) show on the TextStarCraft II platform.

StarCraft II Environment as a Testing Ground

StarCraft II is a demanding testing ground for decision-making algorithms because of its large state space, multi-agent interaction, incomplete information, and the long action sequences that effective play requires.

Figure 1: StarCraft II. In this decision-making environment, players need to control units, collect resources, build and upgrade technology, and confront opponents with incomplete observation information, making it one of the most complex decision-making environments.

TextStarCraft II was developed to facilitate LLM-based decision-making by providing a textual interface for macro-level strategic decisions as opposed to micro-level operations, which are automated.

Hierarchical Expert Prompt Methodology

The Hierarchical Expert Prompt framework has two components: Expert Tactic Prompts (ETP), which supply specialized tactical knowledge, and Hierarchical Decision Prompts (HDP), which impose a tiered decision protocol. Together they guide the LLM to evaluate the game state more thoroughly before it chooses actions.

Figure 2: Interacting with LLM: Hierarchical Expert Prompt Method in TextStarCraft II. LLM takes the L1 summary from the environment, utilizes knowledge from HEP, and generates text-based actions according to analyses.
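The interaction in Figure 2 can be sketched as a single decision step. This is a minimal illustration rather than the authors' implementation: the names `HEP_SYSTEM_PROMPT`, `parse_actions`, `decide`, and `fake_llm` are hypothetical, the "- ACTION" reply format is an assumption, and the model is replaced by a stub so the example runs offline.

```python
from typing import Callable, List

# Placeholder for the HEP system prompt; the real text lives in the authors' repository.
HEP_SYSTEM_PROMPT = "You command a StarCraft II army. <expert tactics and decision hierarchy>"


def parse_actions(reply: str) -> List[str]:
    """Collect actions from reply lines of the form '- ACTION' (an assumed format)."""
    return [line[2:].strip() for line in reply.splitlines() if line.startswith("- ")]


def decide(l1_summary: str, llm: Callable[[str, str], str]) -> List[str]:
    """One decision step: send the system prompt and L1 summary, parse the reply."""
    reply = llm(HEP_SYSTEM_PROMPT, l1_summary)
    return parse_actions(reply)


def fake_llm(system_prompt: str, user_message: str) -> str:
    """Stub model so the sketch runs offline; swap in a real client in practice."""
    return "Analysis: economy is behind.\n- BUILD NEXUS\n- TRAIN PROBE"


actions = decide("Minerals: 420, Gas: 80, Supply: 23/31", fake_llm)
print(actions)
```

Passing the model in as a callable keeps the loop independent of any particular LLM provider, so the same step can be exercised with a stub or a real API client.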

Expert Tactic Prompt (ETP)

ETP is a knowledge base of well-defined tactics, each described by its unit types, key technologies, critical timings, and applicable scenarios. Embedding this knowledge in the system prompt gives the LLM specialized game strategies to draw on when it chooses actions.
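One way to picture such a knowledge base is as a list of structured entries rendered into the system prompt. The sketch below assumes a hypothetical `Tactic` record whose fields mirror the four categories named above; the example entry is illustrative and is not taken from the paper's actual prompt.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tactic:
    """One knowledge-base entry; the four list fields mirror the categories named above."""
    name: str
    unit_types: List[str]
    key_technologies: List[str]
    critical_timings: List[str]
    applicable_scenarios: List[str]


def render_tactic(t: Tactic) -> str:
    """Format a tactic as a short text block an LLM can read."""
    return "\n".join([
        f"Tactic: {t.name}",
        f"  Units: {', '.join(t.unit_types)}",
        f"  Key technologies: {', '.join(t.key_technologies)}",
        f"  Critical timings: {'; '.join(t.critical_timings)}",
        f"  Use when: {'; '.join(t.applicable_scenarios)}",
    ])


def build_system_prompt(base: str, tactics: List[Tactic]) -> str:
    """Append every rendered tactic to the base instructions."""
    return "\n\n".join([base, "Expert tactics:"] + [render_tactic(t) for t in tactics])


# Illustrative entry only; not taken from the paper's knowledge base.
example = Tactic(
    name="Blink Stalker pressure",
    unit_types=["Stalker"],
    key_technologies=["Blink"],
    critical_timings=["attack once Blink finishes"],
    applicable_scenarios=["opponent expands early with few defenders"],
)
prompt = build_system_prompt("You command a StarCraft II army.", [example])
print(prompt)
```

Keeping tactics as data rather than free text makes it easy to add, remove, or ablate individual entries without rewriting the whole prompt.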

Hierarchical Decision Prompt (HDP)

The HDP establishes a tiered structure where prioritized decisions (e.g., Nexus and Assimilator construction for resource management) are differentiated from routine tasks (e.g., economy, technology, military development). The LLM selects actions from a hierarchical decision-making framework, ensuring resource allocation aligns with strategic objectives.
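The tiered structure can be illustrated with a small prompt builder plus an ordering helper. This is a sketch under stated assumptions: in the paper the hierarchy is expressed in prompt text and the LLM makes the selection, so `order_actions` is a hypothetical post-processing step added here for illustration, and the action strings are assumed rather than taken from TextStarCraft II.

```python
from typing import List

# Tier contents follow the examples in the text; the action strings are assumed.
PRIORITY_ACTIONS = ["BUILD NEXUS", "BUILD ASSIMILATOR"]
ROUTINE_CATEGORIES = ["economy", "technology", "military development"]


def render_hdp() -> str:
    """Render the two tiers as prompt text, priority tier first."""
    lines = ["Decide in this order:", "1. Priority decisions (check these first):"]
    lines += [f"   - {a}" for a in PRIORITY_ACTIONS]
    lines.append("2. Routine tasks (spend what remains):")
    lines += [f"   - {c}" for c in ROUTINE_CATEGORIES]
    return "\n".join(lines)


def order_actions(actions: List[str]) -> List[str]:
    """Move priority-tier actions to the front, keeping relative order within each tier."""
    priority = [a for a in actions if a in PRIORITY_ACTIONS]
    routine = [a for a in actions if a not in PRIORITY_ACTIONS]
    return priority + routine


print(render_hdp())
print(order_actions(["TRAIN ZEALOT", "BUILD ASSIMILATOR", "RESEARCH BLINK", "BUILD NEXUS"]))
```

Placing the priority tier first means resource-critical construction is considered before routine spending, which is the behavior the HDP is described as encouraging.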

Experimental Evaluation

Experiments were run against built-in StarCraft II opponents at difficulty levels ranging from Hard to Elite. HEP outperformed the CoS baseline, achieving a 75% win rate against VeryHard opponents and recording the first wins against the Elite AI.

Figure 3: Screenshots of Game Replay Against VeryHard Opponent.

Win rates improved over the baseline across the tested difficulty levels, and the gains in decision-making quality came without the need for extensive training data. Consumption metrics such as API call tokens and query time increased, but the increase is reasonable given the performance improvement.

Detailed Analysis

Plots of game resources, unit supply, army composition, and tactical transitions show where HEP gains its practical advantage. Compared with the baseline, HEP builds an economically viable and strategically strong military, spending resources effectively on technology that pays off in combat.

Figure 4: Resource Data indicates our method's focused resource accumulation strategy, facilitating tactical military development.

Ablation Study

The ablation studies isolate the contributions of ETP and HDP to confirm that each is integral to HEP's success. Both modules proved essential: removing either one degraded resource management and tactical judgment, ultimately leading to losses against harder AI opponents.

Figure 5: Ablation Study Results underscore the necessity of both ETP and HDP modules, indicating their effectiveness in improving decision-making quality and survival capabilities in engagements.

Conclusion

HEP gives LLMs expert-level tactical awareness and a decision-making hierarchy, achieving high proficiency in the complex strategic tasks typical of StarCraft II. The framework extends the reach of LLMs in decision-oriented tasks and offers a practical way to integrate expert knowledge into AI systems, surpassing existing AI benchmarks without extensive additional training. Future work may extend these concepts to broader decision-making domains and further refine the hierarchical strategies.
