Controlled multi-region benchmarking

Conduct controlled multi-region benchmarking to rigorously evaluate Unbrowse and browser-based baselines across geographic regions using standardized cloud environments.

Background

The paper reports single-host, live-web benchmarks and acknowledges that absolute timings are geography-dependent. The authors indicate that a more controlled evaluation across regions is left to future work.

In the conclusion, controlled multi-region benchmarking is explicitly listed among the remaining open problems.
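As a rough illustration of what a per-region comparison could look like, the sketch below groups latency samples by region and system, takes the median of each group, and reports the baseline-to-candidate ratio per region. It is not the paper's methodology. The function names, the (region, system, latency_ms) tuple layout, and the choice of the median as the summary statistic are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import median


def summarize_by_region(samples):
    """Group samples by region and system and return median latencies.

    `samples` is an iterable of (region, system, latency_ms) tuples.
    This layout is hypothetical, chosen only for the sketch.
    Returns {region: {system: median_latency_ms}}.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for region, system, latency_ms in samples:
        grouped[region][system].append(latency_ms)
    return {
        region: {system: median(values) for system, values in systems.items()}
        for region, systems in grouped.items()
    }


def speedup_by_region(summary, baseline, candidate):
    """Return baseline median divided by candidate median for each region.

    Regions missing either system, or with a non-positive candidate
    median, are skipped rather than reported.
    """
    return {
        region: medians[baseline] / medians[candidate]
        for region, medians in summary.items()
        if baseline in medians
        and candidate in medians
        and medians[candidate] > 0
    }
```

Reporting the ratio per region, rather than pooling all samples, keeps geography-dependent absolute timings from masking or inflating the relative difference between systems.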

References

Formal economic analysis, incentive compatibility proofs, controlled multi-region benchmarking, and deployment-scale validation remain open problems; the present contribution is an architectural proposal with an implemented system and initial empirical evidence that shared route lookup can outperform redundant browser rediscovery on the evaluated tasks.

Internal APIs Are All You Need: Shadow APIs, Shared Discovery, and the Case Against Browser-First Agent Architectures (2604.00694 - Tham et al., 1 Apr 2026) in Conclusion (final paragraph)