Greedy-First Algorithm Overview

Updated 20 November 2025

The Greedy-First Algorithm is a paradigm that makes locally optimal selections in domains such as contextual bandits, online AdWords allocation, and parallel search.
It exploits adaptively, triggering exploration or dual updates only when safety conditions or budget constraints demand it, with guarantees such as $O(\log T)$ regret and 1/2-competitiveness.
Empirical studies show that constrained expansion and decoupled node evaluation in parallel search improve scalability and speedup while preserving near-optimal performance.

The term Greedy-First Algorithm denotes several distinct algorithmic paradigms across learning theory, combinatorial optimization, and parallel search. Notable instances include (a) an adaptive contextual bandit framework minimizing unnecessary exploration; (b) a primal–dual online algorithm for the AdWords allocation problem under the small-bid assumption; and (c) a family of constrained parallel best-first search methods enforcing optimality domain invariants. Although these usages share an embrace of “greedy” (locally optimal, maximally opportunistic) expansion or allocation when safe, they each embody distinct theoretical guarantees and mechanistic subtleties.

1. Greedy-First in Contextual Bandits

In the contextual bandit setting, “Greedy-First” refers to an algorithm that dynamically determines, from live observed data, whether to operate in a pure greedy (exploitation) mode or to invoke explicit exploration. This approach is formalized in "Mostly Exploration-Free Algorithms for Contextual Bandits" (Bastani et al., 2017).

Suppose at time $t$ a context vector $X_t \in \mathbb{R}^d$ is observed and the learner must select an arm $i \in [K]$, each associated with an unknown parameter $\beta_i \in \mathbb{R}^d$. The reward has linear form $Y_{i,t} = X_t^\top\beta_i + \varepsilon_{i,t}$ with $\varepsilon_{i,t}$ subgaussian. The algorithm proceeds as follows:

Greedy Phase: At each $t$, select the arm maximizing $X_t^\top\hat\beta_i$ (where $\hat\beta_i$ is the OLS estimator for arm $i$).
Exploration Trigger: For each arm, maintain the sample covariance $\hat\Sigma_i = \sum_{s \in \mathcal{S}_i} X_s X_s^\top$ (where $\mathcal{S}_i$ is the index set of times when arm $i$ was chosen). If at any $t$, for some $i$, $\lambda_{\min}(\hat\Sigma_i) < \lambda_0 t / 4$ for a threshold parameter $\lambda_0 > 0$, force a switch to an explicit exploration algorithm (e.g., OLS bandit).
Guarantee: Under mild conditions (specifically, if "covariate diversity" holds: $\lambda_{\min}\left(\mathbb{E}\left[X X^\top \mathbb{1}\{X^\top u \ge 0\}\right]\right) \ge \lambda_0$ for all $u \in \mathbb{R}^d$), the greedy phase persists with high probability and cumulative regret is $O(\log T)$. Otherwise, Greedy-First still guarantees $O(\log T)$ regret with strictly less exploration than UCB or Thompson sampling (Bastani et al., 2017).
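The exploration trigger can be sketched in a few lines of Python. This is a minimal illustration restricted to $d = 2$ so that the minimum eigenvalue has a closed form; it is not the authors' implementation. The function names, the warm-up length `t0`, and the default `lambda0` are assumptions made for the example, and the threshold is taken to be `lambda0 * t / 4`.

```python
import math


def min_eig_2x2(m):
    """Smallest eigenvalue of a symmetric 2x2 matrix [[a, b], [b, c]]."""
    a, b, c = m[0][0], m[0][1], m[1][1]
    return (a + c) / 2.0 - math.hypot((a - c) / 2.0, b)


def add_outer(m, x):
    """Return m + x x^T for a 2-dimensional context x."""
    return [[m[0][0] + x[0] * x[0], m[0][1] + x[0] * x[1]],
            [m[1][0] + x[1] * x[0], m[1][1] + x[1] * x[1]]]


def greedy_arm(x, beta_hat):
    """Greedy phase: index of the arm maximizing x^T beta_hat_i."""
    scores = [x[0] * b[0] + x[1] * b[1] for b in beta_hat]
    return max(range(len(scores)), key=scores.__getitem__)


def should_switch(covs, t, lambda0, t0):
    """True if, after the warm-up t0, some arm's covariance has
    minimum eigenvalue below lambda0 * t / 4 (assumed threshold form)."""
    if t <= t0:
        return False
    return any(min_eig_2x2(c) < lambda0 * t / 4.0 for c in covs)


def first_switch_time(contexts, arms, lambda0=0.5, t0=4):
    """Replay a history of (context, chosen arm) pairs and return the first
    round at which the trigger fires, or None if it never fires."""
    covs = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(max(arms) + 1)]
    for t, (x, i) in enumerate(zip(contexts, arms), start=1):
        covs[i] = add_outer(covs[i], x)
        if should_switch(covs, t, lambda0, t0):
            return t
    return None
```

Replaying a history in which every context is `[1.0, 0.0]` makes each arm's covariance singular, so the trigger fires as soon as the warm-up ends; a history whose contexts alternate between `[1.0, 1.0]` and `[1.0, -1.0]` for each arm keeps the minimum eigenvalue growing and the algorithm stays greedy.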

Simulations on synthetic and real data show Greedy-First matches or outperforms exploration-based methods in settings where greedy is rate-optimal and rapidly adapts when exploration is necessary. This formulation minimizes unnecessary exploration while retaining minimax optimality.

2. Greedy-First in Online AdWords Allocation

For the online AdWords allocation problem under adversarial order and the small-bid assumption, Greedy-First denotes a primal–dual algorithm that always allocates queries to the active advertiser with maximum feasible bid, maintaining dual feasibility at all times (Li, 2019).

Formulation:

Let $A$ denote the set of advertisers with budgets $B_i$. Each query $j$ arrives online with bids $b_{ij}$.
On each arrival, assign $j$ to the feasible $i$ maximizing $b_{ij}(1 - x_i)$, where $x_i$ is a dual variable that stays at 0 until exhaustion, then jumps to 1.
After each match, if advertiser $i$ is exhausted, set $x_i = 1$.
This assignment strategy yields the pure greedy allocation under the small-bid assumption ($b_{ij} \ll B_i$).
The algorithm achieves a competitive ratio of $1/2$ for the revenue objective, tight in the worst case. This ratio is proven via primal–dual analysis: the constructed dual is always feasible, and the sum of primal gains is at least half the dual value (Li, 2019).
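
The primal–dual argument can be written out as follows. This is a sketch under the small-bid assumption; the symbols $y_{ij}$ (fractional assignment), $x_i$ (advertiser dual), $z_j$ (query dual), $\mathrm{ALG}$ (greedy revenue), and $\mathrm{OPT}$ (offline optimum) are notation chosen here for illustration and may differ from Li (2019).

$$
\begin{aligned}
\text{(P)}\quad & \max \sum_{i,j} b_{ij}\, y_{ij} && \text{s.t. } \sum_j b_{ij}\, y_{ij} \le B_i \ \ \forall i, \quad \sum_i y_{ij} \le 1 \ \ \forall j, \quad y_{ij} \ge 0, \\
\text{(D)}\quad & \min \sum_i B_i\, x_i + \sum_j z_j && \text{s.t. } z_j \ge b_{ij}(1 - x_i) \ \ \forall i, j, \quad x_i,\, z_j \ge 0.
\end{aligned}
$$

Set $x_i = 1$ for every advertiser exhausted by the end of the run and $x_i = 0$ otherwise, and let $z_j$ be the revenue greedy collected from query $j$. If $x_i = 1$ the dual constraint holds trivially. If $x_i = 0$, advertiser $i$ was feasible when $j$ arrived, so greedy collected at least $b_{ij}$ from $j$ and $z_j \ge b_{ij}$. Exhausted advertisers have spent their whole budget, so $\sum_i B_i x_i \le \mathrm{ALG}$, while $\sum_j z_j = \mathrm{ALG}$ by construction. Weak duality then gives, up to an error that vanishes with the bid-to-budget ratio,

$$
\mathrm{OPT} \;\le\; \sum_i B_i\, x_i + \sum_j z_j \;\le\; \mathrm{ALG} + \mathrm{ALG} \;=\; 2\,\mathrm{ALG}.
$$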

A key point is that the algorithm remains fully greedy until budget exhaustion triggers a dual variable update, and the small-bid assumption ensures that no single query causes excessive “jump” in dual variables.
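
The allocation rule can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the formulation from Li (2019): the function name `greedy_adwords`, the dictionary-based input format, the tie-breaking rule (first advertiser listed wins), and charging at most the remaining budget are all choices made for the example.

```python
def greedy_adwords(budgets, queries):
    """Greedy allocation with a 0/1 dual variable per advertiser.

    budgets: dict mapping advertiser -> budget.
    queries: list of dicts mapping advertiser -> bid, in arrival order.
    Returns (total revenue, list of chosen advertisers or None).
    """
    remaining = dict(budgets)
    x = {i: 0.0 for i in budgets}  # dual variable: 0 until exhaustion, then 1
    revenue = 0.0
    assignment = []
    for bids in queries:
        best, best_score = None, 0.0
        for i, bid in bids.items():
            score = bid * (1.0 - x[i])  # exhausted advertisers score 0
            if score > best_score:      # strict ">" keeps the first on ties
                best, best_score = i, score
        assignment.append(best)
        if best is None:
            continue
        paid = min(bids[best], remaining[best])  # never charge past the budget
        remaining[best] -= paid
        revenue += paid
        if remaining[best] <= 0.0:
            x[best] = 1.0  # dual update on exhaustion
    return revenue, assignment


# Adversarial order: both advertisers bid on the first 10 queries,
# only B bids on the last 10. Ties go to B, which is then exhausted.
budgets = {"A": 10.0, "B": 10.0}
queries = [{"B": 1.0, "A": 1.0}] * 10 + [{"B": 1.0}] * 10
revenue, assignment = greedy_adwords(budgets, queries)
print(revenue)  # 10.0, versus an offline optimum of 20.0
```

The example reproduces the worst case behind the $1/2$ bound: sending the first ten queries to A and the last ten to B would earn 20, while greedy earns 10.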

3. Greedy-First in Parallel Greedy Best-First Search

In parallel graph search, the “Greedy-First” style describes a class of constrained parallel greedy best-first search (GBFS) algorithms that enforce expansions only within a theoretically justified subset of the state space, specifically the Bench Transition System (BTS)—the set of all states that could be expanded by some sequential GBFS policy (Shimoda et al., 2024).

Constraint Enforcement: Expansion is allowed only for states $s$ satisfying $\mathrm{satisfies}(s) = \texttt{true} \Longleftrightarrow s \in \mathrm{BTS}$.
Traditional Bottlenecks: In naïve parallelizations, threads may idle waiting for BTS-permitted states at the top of the open list, and all successors of a node are generated and evaluated monolithically, stalling parallel progress.
Decoupled Generation–Evaluation (SGE): The SGE variant splits node expansion into two stages: (a) a single thread generates all successors, placing them into an unevaluated queue; (b) any idle thread evaluates $h$ for these children. Once all siblings are evaluated, the batch is atomically inserted into the open list, respecting the BTS constraint.
Empirical Outcomes: SGE significantly increases state evaluation rates (by 9–19% for 4–16 threads compared to the prior best), reduces the number of states expanded, decreases search time (e.g., 33% faster at 16 threads), and almost doubles the speedup over single-threaded baselines, approaching ideal scaling (Shimoda et al., 2024).
Limitations: In unconstrained settings, the overhead of maintaining sibling records and extra queues may reduce efficiency; alternative schedulings are needed for lazy evaluation or other search paradigms.
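
The two-stage expansion can be illustrated with a small Python sketch. It is a simplification under several assumptions and not the algorithm from Shimoda et al. (2024): the BTS membership test is omitted entirely, the generating thread waits for its children to be evaluated instead of moving on to other work, and the names `expand_decoupled`, `gbfs`, `toy_successors`, and `toy_h` are hypothetical. It only shows successor generation separated from heuristic evaluation, followed by insertion of the whole sibling batch under one lock.

```python
import heapq
import threading
from concurrent.futures import ThreadPoolExecutor


def expand_decoupled(state, successors, h, open_list, lock, pool):
    """Generate all children, evaluate h on worker threads, then insert
    the complete sibling batch into the open list under a single lock."""
    children = list(successors(state))       # stage (a): generation
    h_values = list(pool.map(h, children))   # stage (b): parallel evaluation
    with lock:                               # batch insertion of all siblings
        for value, child in zip(h_values, children):
            heapq.heappush(open_list, (value, child))


def gbfs(start, goal, successors, h, workers=4):
    """Greedy best-first search driver; returns the expansion order up to
    and including the goal, or None if the goal is unreachable."""
    open_list = [(h(start), start)]
    lock = threading.Lock()
    closed = set()
    order = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while open_list:
            with lock:
                _, state = heapq.heappop(open_list)
            if state in closed:
                continue
            closed.add(state)
            order.append(state)
            if state == goal:
                return order
            expand_decoupled(state, successors, h, open_list, lock, pool)
    return None


def toy_successors(n):
    """Toy domain: from n, move to n - 1 or n // 2 (no moves from 0)."""
    return [n - 1, n // 2] if n > 0 else []


def toy_h(n):
    """Toy heuristic: distance to 0 is estimated by n itself."""
    return n


print(gbfs(20, 0, toy_successors, toy_h))  # [20, 10, 5, 2, 1, 0]
```

In a fuller version, idle threads would pull unevaluated children from a shared queue while other threads keep expanding, and each pop from the open list would be guarded by the BTS check.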


4. Theoretical Guarantees and Analysis
The Greedy-First approach, in all its guises, is characterized by aggressive exploitation constrained by rigorous safety checks or dual updates.


Bandits: Greedy-First achieves $O(\log T)$ cumulative regret under conditions including boundedness, margin, and covariate diversity (or a problem-dependent positive probability otherwise) (Bastani et al., 2017).
AdWords: The primal–dual construction ensures a $1/2$-competitive ratio in adversarial arrivals under the small-bid assumption (Li, 2019).
Parallel GBFS: SGE recovers nearly linear speedup under reasonable assumptions, with expansion order constrained to mimic plausible sequential GBFS trajectories, avoiding pathological expansion blowup (Shimoda et al., 2024).


These guarantees underscore the conditions (problem regularity, structural invariants, or budgetary smallness) under which greedy-first deployment is algorithmically sound.

5. Algorithmic Instantiations and Pseudocode Structures
Tabulated below are the core steps of Greedy-First algorithms across the three domains:

| Domain | Greedy-First Mechanism | Exploration/Constraint Trigger |
| --- | --- | --- |
| Contextual Bandits | Play arm maximizing $X_t^\top\hat\beta_i$, update OLS, monitor covariance | Switch if eigenvalue $\lambda_{\min}(\hat\Sigma_i)$ low |
| Online AdWords | Match to $i$ maximizing $b_{ij}(1 - x_i)$, $x_i \leftarrow 1$ on exhaustion | Budgets fully spent |
| Parallel GBFS (SGE) | Expand BTS-permitted node, generate, queue successors, multithreaded $h$ eval | Expansion only for $s \in$ BTS |


The precise pseudocode for each variant follows the respective domain’s computational conventions, with formal steps as provided in (Bastani et al., 2017; Li, 2019; Shimoda et al., 2024).

6. Limitations and Extensions
While the Greedy-First paradigm offers significant advantages in terms of computational efficiency and simplicity, it is subject to several limitations:


Contextual Bandits: Success depends on diversity in context sequences; absent this, forced exploration may be necessary. The precise cutoff for switching is parameter-dependent.
AdWords: The $1/2$-competitive bound is tight; higher ratios require more sophisticated algorithms such as MSVV/Balance.
Parallel Search: Overhead from managing successor queues and sibling sets may hinder performance in unconstrained tasks or in the presence of lazy heuristics. Adapting the SGE idea to multi-heuristic, bidirectional, or domain factorization strategies remains an open avenue (Shimoda et al., 2024).


A plausible implication is that Greedy-First methods are best suited to settings where structure or regularity makes greedy action safe, but may require augmentation or fallback in more adversarial, ill-behaved, or poorly observed settings.

7. Context and Comparative Frameworks
The Greedy-First idiom crystallizes an approach across domains whereby maximally opportunistic (“greedy”) action is taken whenever safe, deferring costlier exploration, constraint checks, or evaluation until necessary. In contextual bandit literature, this challenges the notion that extensive forced exploration is always necessary. In online combinatorial optimization, it provides a simple, primal–dual justified baseline. In parallel search, it enables efficient utilization of multi-core hardware without sacrificing the invariants maintained by sequential search analogs.

Empirical results and theoretical analyses confirm its situational optimality. However, provable ceilings on performance and the dependence on structural or statistical regularity limit the practical applicability of Greedy-First, motivating ongoing research into adaptive and hybrid algorithms that interpolate between greedy exploitation and principled exploration or constraint enforcement (Bastani et al., 2017; Li, 2019; Shimoda et al., 2024).

      
        
          
  
    

References

1. Bastani et al. (2017). Mostly Exploration-Free Algorithms for Contextual Bandits.
2. Li (2019). A Survey of Adwords Problem With Small Bids In a Primal-dual Setting: Greedy Algorithm, Ranking Algorithm and Primal-dual Training-based Algorithm.
3. Shimoda et al. (2024). Decoupling Generation and Evaluation for Parallel Greedy Best-First Search (extended version).
