ESQA: Event Sequences Question Answering
Abstract: Event sequences (ESs) arise in many practical domains, including finance, retail, social networks, and healthcare. In the context of machine learning, event sequences can be seen as a special type of tabular data with annotated timestamps. Despite the importance of ES modeling and analysis, little effort has been made to adapt LLMs to the ES domain. In this paper, we highlight the common difficulties of processing ESs and propose a novel solution capable of solving multiple downstream tasks with little or no finetuning. In particular, we address the problem of working with long sequences and improve the processing of temporal and numeric features. The resulting method, called ESQA, effectively leverages the power of LLMs and, according to extensive experiments, achieves state-of-the-art results in the ES domain.
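The abstract does not spell out how an event sequence is presented to the LLM, so the sketch below only illustrates the general question-answering framing with a naive text-serialization baseline. It is a hypothetical example, not the paper's method: the `Event` record, the `sequence_to_prompt` helper, and the `key=value` serialization format are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Event:
    """One event: a timestamp plus attribute fields.

    Event sequences are tabular data with annotated timestamps, so each
    row is a timestamp and a mix of categorical / numeric attributes.
    """
    timestamp: float   # e.g. seconds since the start of the sequence
    attributes: dict   # e.g. {"mcc": "5411", "amount": 42.50}

def sequence_to_prompt(events: List[Event], question: str) -> str:
    """Serialize a sequence and a natural-language question into one prompt.

    A real system would face the long-sequence and numeric-feature issues
    the abstract mentions; this toy serialization ignores both.
    """
    rows = [
        f"t={e.timestamp:.0f} "
        + " ".join(f"{k}={v}" for k, v in e.attributes.items())
        for e in events
    ]
    return "Events:\n" + "\n".join(rows) + f"\nQuestion: {question}\nAnswer:"

# Usage: a toy transaction sequence and a downstream-task question.
events = [
    Event(0.0,    {"mcc": "5411", "amount": 42.50}),
    Event(3600.0, {"mcc": "5812", "amount": 15.00}),
]
print(sequence_to_prompt(events, "What is the client's most frequent MCC?"))
```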