A general framework for blockchain analytics

Published 4 Jul 2017 in cs.CR | (1707.01021v2)

Abstract: Modern cryptocurrencies exploit decentralised blockchains to record a public and unalterable history of transactions. Besides transactions, further information is stored for different, and often undisclosed, purposes, making the blockchains a rich and increasingly growing source of valuable information, in part of difficult interpretation. Many data analytics have been developed, mostly based on specifically designed and ad-hoc engineered approaches. We propose a general-purpose framework, seamlessly supporting data analytics on both Bitcoin and Ethereum - currently the two most prominent cryptocurrencies. Such a framework allows us to integrate relevant blockchain data with data from other sources, and to organise them in a database, either SQL or NoSQL. Our framework is released as an open-source Scala library. We illustrate the distinguishing features of our approach on a set of significant use cases, which allow us to empirically compare ours to other competing proposals, and evaluate the impact of the database choice on scalability.

Citations (10)

Summary

  • The paper provides a unified framework that standardizes blockchain data analysis by integrating native blockchain data with external datasets.
  • The study demonstrates the framework's flexibility using Bitcoin case studies, comparing SQL and NoSQL performance for scalability and efficiency.
  • The open-source Scala library enables modular analytics, allowing deep transaction input scans and effective monitoring of fees and exchange rates.

A General Framework for Blockchain Analytics

Introduction

The paper "A general framework for blockchain analytics" (1707.01021) addresses the growing volume and variety of data stored on blockchains, focusing on Bitcoin and Ethereum. Because these blockchains record more than plain transaction histories, and often for undisclosed purposes, the authors propose a general-purpose framework for data analytics. The framework integrates blockchain data with data from external sources and organizes the result in a database, either SQL or NoSQL, where it can be queried. It is released as an open-source Scala library.

Core Contributions and Methodology

The proposed framework unifies and standardizes blockchain data analysis, addressing the fragmentation of earlier work in which each study built its own ad-hoc tools. It lets analysts combine blockchain-native data with external datasets, such as exchange rates, and organize them into integrated views stored in a database. The empirical demonstrations focus on Bitcoin: they cover several use cases and measure how the choice of database affects scalability and efficiency.
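The idea of an integrated view can be illustrated with a minimal sketch. It is written in Python for brevity and is not the Scala API of the paper's library; the names `Tx`, `build_view`, and the sample transactions and rates are all hypothetical and made up for illustration. Each transaction is joined with the exchange rate of its day, and the result is a list of flat records that could be stored in either kind of database.

```python
from dataclasses import dataclass
from datetime import date

SATOSHI_PER_BTC = 100_000_000


@dataclass(frozen=True)
class Tx:
    """Hypothetical, simplified transaction record."""
    txid: str
    day: date
    output_satoshi: int  # total value of the outputs, in satoshi


def build_view(txs, usd_per_btc):
    """Join each transaction with the exchange rate of its day.

    Transactions with no known rate keep usd=None instead of being dropped.
    """
    view = []
    for tx in txs:
        btc = tx.output_satoshi / SATOSHI_PER_BTC
        rate = usd_per_btc.get(tx.day)
        usd = None if rate is None else btc * rate
        view.append({"txid": tx.txid, "day": tx.day.isoformat(),
                     "btc": btc, "usd": usd})
    return view


# Made-up sample data: blockchain side and external side.
txs = [
    Tx("aa", date(2017, 1, 1), 150_000_000),
    Tx("bb", date(2017, 1, 2), 50_000_000),
    Tx("cc", date(2017, 1, 3), 10_000_000),  # no rate available for this day
]
rates = {date(2017, 1, 1): 1000.0, date(2017, 1, 2): 1020.0}

view = build_view(txs, rates)
for record in view:
    print(record)
```

Keeping the unmatched transaction with a missing value, rather than discarding it, is a design choice of this sketch; a real analysis might prefer to interpolate rates or to filter such rows out.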

Several implemented analytics illustrate the framework's flexibility. One key feature is the deep scan of the blockchain, which resolves each transaction input to the output it spends; this information is not directly available from the transaction alone. Other use cases include analyzing metadata embedded in transactions through script opcodes, relating transaction values to exchange rates, and monitoring transaction fees. Each is implemented on top of the framework's modular architecture, which relies on Scala's strong type system.

Figure 1: Average number of inputs (red line) and outputs (blue line) by date.
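To make the deep scan concrete, here is a minimal sketch, again in Python rather than the library's Scala API. It assumes a simplified, hypothetical transaction model (`TxIn`, `TxOut`, `Tx`) in which coinbase transactions are represented with an empty input list. A single pass in chain order records every output seen so far, so that each later input can be resolved to the value it spends; the fee is then the difference between input and output values.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TxIn:
    prev_txid: str   # transaction holding the output being spent
    prev_index: int  # position of that output in the previous transaction


@dataclass(frozen=True)
class TxOut:
    value: int  # satoshi


@dataclass(frozen=True)
class Tx:
    txid: str
    inputs: List[TxIn]   # empty list models a coinbase transaction here
    outputs: List[TxOut]


def deep_scan_fees(chain: List[Tx]) -> Dict[str, int]:
    """Resolve inputs against previously seen outputs and compute fees."""
    seen_outputs: Dict[Tuple[str, int], int] = {}
    fees: Dict[str, int] = {}
    for tx in chain:
        if tx.inputs:  # skip coinbase: it has no inputs to resolve
            in_value = sum(seen_outputs[(i.prev_txid, i.prev_index)]
                           for i in tx.inputs)
            out_value = sum(o.value for o in tx.outputs)
            fees[tx.txid] = in_value - out_value
        # Remember this transaction's outputs for later inputs.
        for index, out in enumerate(tx.outputs):
            seen_outputs[(tx.txid, index)] = out.value
    return fees


# Made-up three-transaction chain.
chain = [
    Tx("cb", [], [TxOut(5_000_000_000)]),
    Tx("t1", [TxIn("cb", 0)], [TxOut(3_000_000_000), TxOut(1_999_999_000)]),
    Tx("t2", [TxIn("t1", 0), TxIn("t1", 1)], [TxOut(4_999_998_500)]),
]

fees = deep_scan_fees(chain)
print(fees)
```

The sketch keeps every output in memory, which is fine for a toy chain; at the scale of the real Bitcoin blockchain this index is exactly the kind of state that motivates a database-backed approach.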

Implementation and Evaluation

The paper includes a comparative performance evaluation of SQL and MongoDB backends for storing and querying blockchain views. The experiments, run on consumer-grade hardware, show comparable times for view creation and query execution across the SQL and NoSQL setups, although the schema-less nature of NoSQL tends to simplify how queries are written. These measurements matter to potential adopters who need to scale analytics over large blockchain datasets.
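The difference in query structure between the two storage styles can be sketched as follows. This is an illustration under stated assumptions, not the paper's setup: Python's built-in `sqlite3` stands in for a SQL backend, a list of nested dictionaries stands in for a document store such as MongoDB, and the table names, field names, and data are made up. Both variants compute the average number of outputs per transaction by date.

```python
import sqlite3
from collections import defaultdict

# Made-up data: (txid, day, output values in satoshi).
sample = [
    ("t1", "2017-01-01", [30, 20]),
    ("t2", "2017-01-01", [50]),
    ("t3", "2017-01-02", [10, 10, 5]),
]

# Relational form: outputs live in their own table and need a join.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tx (txid TEXT PRIMARY KEY, day TEXT)")
conn.execute("CREATE TABLE tx_output (txid TEXT, idx INTEGER, value INTEGER)")
conn.executemany("INSERT INTO tx VALUES (?, ?)",
                 [(txid, day) for txid, day, _ in sample])
conn.executemany("INSERT INTO tx_output VALUES (?, ?, ?)",
                 [(txid, idx, value)
                  for txid, _, values in sample
                  for idx, value in enumerate(values)])
sql_result = conn.execute(
    """
    SELECT t.day, COUNT(o.idx) * 1.0 / COUNT(DISTINCT t.txid)
    FROM tx AS t JOIN tx_output AS o ON o.txid = t.txid
    GROUP BY t.day
    ORDER BY t.day
    """
).fetchall()
conn.close()

# Document form: outputs are nested inside each transaction, no join needed.
docs = [{"txid": txid, "day": day, "outputs": values}
        for txid, day, values in sample]
counts_per_day = defaultdict(list)
for doc in docs:
    counts_per_day[doc["day"]].append(len(doc["outputs"]))
doc_result = [(day, sum(counts) / len(counts))
              for day, counts in sorted(counts_per_day.items())]

print(sql_result)
print(doc_result)
```

Both variants return the same answer; the relational one pays for normalization with a join, while the nested one reads each transaction as a single self-contained record.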

Comparison with Other Tools

The work contrasts with other existing tools by providing a broader and more flexible framework. While many tools either lack the extensibility to incorporate external data sources or are not readily adaptable to multiple blockchains, this framework offers a marked improvement in those areas. Unlike RAM-only solutions that require extensive memory resources, this disk-based tool caters to broader use, supporting both consumer and enterprise deployment scenarios.

Discussion and Future Directions

The paper emphasizes the potential of the framework to unify diverse analytics methodologies under a single, adaptable interface, giving researchers reusable components for blockchain analytics. Possible enhancements include retrieving data in real time from the blockchain networks themselves and incorporating machine learning tools for predictive analytics. Extending support to blockchain forks and peer-to-peer network data would further broaden the analytical scope.

Conclusion

The paper presents a general-purpose framework for blockchain analytics that aims at both versatility and performance. By integrating blockchain and external data into coherent views, the framework streamlines the analytics pipeline. Its open-source release, combined with the choice between SQL and NoSQL databases, gives researchers a reusable platform for exploring blockchain data, one that serves current analytical needs and can be extended as the blockchain landscape evolves.
