
Non-parametric online market regime detection and regime clustering for multidimensional and path-dependent data structures

Published 27 Jun 2023 in stat.ML and q-fin.MF | (2306.15835v1)

Abstract: In this work we present a non-parametric online market regime detection method for multidimensional data structures using a path-wise two-sample test derived from a maximum mean discrepancy-based similarity metric on path space that uses rough path signatures as a feature map. The latter similarity metric has been developed and applied as a discriminator in recent generative models for small data environments, and has been optimised here to the setting where the size of new incoming data is particularly small, for faster reactivity. On the same principles, we also present a path-wise method for regime clustering which extends our previous work. The presented regime clustering techniques were designed as ex-ante market analysis tools that can identify periods of approximately similar market activity, but the new results also apply to path-wise, high-dimensional, and non-Markovian settings, as well as to data structures that exhibit autocorrelation. We demonstrate our clustering tools on easily verifiable synthetic datasets of increasing complexity, and also show how the outlined regime detection techniques can be used as fast online automatic regime change detectors or as outlier detection tools, including a fully automated pipeline. Finally, we apply the fine-tuned algorithms to real-world historical data, including high-dimensional baskets of equities and the recent price evolution of crypto assets, and we show that our methodology swiftly and accurately indicated historical periods of market turmoil.
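The general idea in the abstract, a two-sample test on paths built from a maximum mean discrepancy (MMD) with signature features, can be illustrated with a small sketch. This is not the authors' implementation. It assumes a signature truncated at level 2, a plain linear kernel on those features (so the biased MMD estimate reduces to a squared distance between feature means), and a permutation test for the p-value. The function names and the synthetic random-walk data are hypothetical and chosen only for illustration.

```python
import numpy as np


def signature_level2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path of shape (T, d)."""
    inc = np.diff(path, axis=0)            # per-step increments, shape (T-1, d)
    level1 = inc.sum(axis=0)               # level 1: total increment
    prev = np.cumsum(inc, axis=0) - inc    # increment accumulated before each step
    # Chen's identity for linear pieces:
    # S^{ij} = sum_k prev_k^i * inc_k^j + 0.5 * inc_k^i * inc_k^j
    level2 = prev.T @ inc + 0.5 * inc.T @ inc
    return np.concatenate([level1, level2.ravel()])


def mmd2_from_features(fx, fy):
    """Biased squared MMD under a linear kernel: squared distance of feature means."""
    diff = fx.mean(axis=0) - fy.mean(axis=0)
    return float(diff @ diff)


def permutation_test(paths_x, paths_y, n_perm=200, seed=0):
    """Return (observed statistic, permutation p-value) for two samples of paths."""
    rng = np.random.default_rng(seed)
    fx = np.array([signature_level2(p) for p in paths_x])
    fy = np.array([signature_level2(p) for p in paths_y])
    observed = mmd2_from_features(fx, fy)
    pooled = np.vstack([fx, fy])
    n = len(fx)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        permuted = mmd2_from_features(pooled[perm[:n]], pooled[perm[n:]])
        if permuted >= observed:
            count += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (count + 1) / (n_perm + 1)


def random_walks(n_paths, n_steps, sigma, rng):
    """Synthetic 2-d Gaussian random walks started at the origin (illustration only)."""
    steps = rng.normal(0.0, sigma, size=(n_paths, n_steps, 2))
    start = np.zeros((n_paths, 1, 2))
    return np.concatenate([start, np.cumsum(steps, axis=1)], axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    calm = random_walks(200, 20, 1.0, rng)      # low-volatility "regime"
    turbulent = random_walks(200, 20, 5.0, rng) # high-volatility "regime"
    stat, p_value = permutation_test(calm, turbulent)
    print(f"MMD^2 = {stat:.2f}, p-value = {p_value:.4f}")
```

In an online setting one might apply such a test to a sliding window of recent paths against a reference sample and flag a regime change when the p-value falls below a chosen threshold. The kernel, truncation level, and windowing used in the paper may differ from this sketch.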

Authors (2)
