- The paper introduces Boolformer, which reformulates symbolic regression of Boolean logic functions as a sequence prediction task.
- It handles noisy and incomplete data robustly, delivering interpretable results on binary classification benchmarks.
- Applied to gene regulatory network inference, it is several orders of magnitude faster than traditional genetic algorithms.
The paper "Boolformer: Symbolic Regression of Logic Functions with Transformers" introduces a novel application of Transformer architectures to the symbolic regression of Boolean functions. The work hinges on the ability of Transformers, well known for their success in NLP and vision tasks, to tackle logic-based problems, which are inherently different owing to their discrete, combinatorial nature.
The research proposes Boolformer, a Transformer model tailored to infer compact Boolean expressions from truth tables. Unlike traditional deep learning approaches, which struggle with the structural complexity of logic functions, Boolformer uses symbolic regression to express Boolean functions in terms of the fundamental logic gates AND, OR, and NOT.
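To make the output format concrete, here is a minimal sketch (not the paper's code) of the kind of symbolic object Boolformer predicts: a nested expression over AND, OR, and NOT gates, evaluated against every row of a truth table. The tuple encoding and variable names are illustrative assumptions.

```python
# Minimal sketch (not the paper's code): a symbolic Boolean expression
# built from AND, OR, NOT gates, evaluated over a full truth table.
from itertools import product

def eval_expr(expr, assignment):
    """Recursively evaluate a nested-tuple expression such as
    ("or", ("and", "x0", ("not", "x1")), "x2")."""
    if isinstance(expr, str):                      # a variable leaf
        return assignment[expr]
    op, *args = expr
    if op == "not":
        return not eval_expr(args[0], assignment)
    if op == "and":
        return all(eval_expr(a, assignment) for a in args)
    if op == "or":
        return any(eval_expr(a, assignment) for a in args)
    raise ValueError(f"unknown gate: {op}")

# Truth table of (x0 AND NOT x1) OR x2 over all 8 input combinations.
expr = ("or", ("and", "x0", ("not", "x1")), "x2")
table = {
    bits: eval_expr(expr, dict(zip(["x0", "x1", "x2"], bits)))
    for bits in product([False, True], repeat=3)
}
```

Boolformer works in the opposite direction: given (a possibly partial, noisy version of) `table`, it decodes an expression like `expr`.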
Key Contributions
- Symbolic Regression Methodology: Boolformer frames symbolic regression of Boolean formulas as a sequence prediction problem: the truth table serves as input and a symbolic expression is decoded as output. This framing lets Boolformer predict compact expressions for functions never seen during training, demonstrating its generalization capability.
- Handling Noisy and Incomplete Data: The paper demonstrates Boolformer's robustness against noise and incomplete data through systematic evaluations. This robustness is showcased by experiments introducing flipped bits and irrelevant variables, reflecting potential real-world scenarios.
- Performance on Binary Classification: On binary classification tasks from the PMLB database, Boolformer produces results that are interpretable and competitive with classic machine learning approaches such as Random Forests. This positions Boolformer as a viable option where explainability, often a bottleneck for complex models, is required.
- Speed and Efficiency in GRN Inference: A notable application of Boolformer is the modeling of gene regulatory networks (GRNs), where it is several orders of magnitude faster than state-of-the-art genetic algorithms.
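The noisy and incomplete regime described in the contributions above can be simulated with a simple corruption routine: drop truth-table rows (incompleteness) and flip output bits (label noise). This is an illustrative sketch; the flip probability and subsampling rate are placeholders, not the paper's exact settings.

```python
import random

def corrupt_table(rows, flip_prob=0.05, keep_frac=0.5, seed=0):
    """Sketch of the noisy/incomplete regime: randomly drop rows of a
    truth table and flip output bits. `rows` is a list of
    (inputs_tuple, output_bit) pairs; the seed makes it reproducible."""
    rng = random.Random(seed)
    kept = [r for r in rows if rng.random() < keep_frac]
    # y ^ True flips the bit, y ^ False leaves it unchanged.
    return [(x, y ^ (rng.random() < flip_prob)) for x, y in kept]

# Full truth table of AND on two inputs, then a corrupted sample of it.
full = [((a, b), a & b) for a in (0, 1) for b in (0, 1)]
sample = corrupt_table(full, flip_prob=0.1, keep_frac=0.75)
```

A robust regressor should still recover a formula close to AND from `sample`, which is the behavior the paper's noise experiments probe.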
Implications and Future Directions
The implications of this research are twofold. Theoretically, it bridges deep learning and logic inference, showing that Transformers can be adapted to symbolic tasks traditionally handled by rule-based systems or combinatorial solvers such as SAT solvers. Practically, Boolformer opens pathways for applications in domains where Boolean models are prevalent, such as biology and medicine, providing fast and interpretable solutions.
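In Boolean GRN models of the kind referenced above, each gene's next state is a Boolean function of its regulators, and the inference task is to recover those functions from observed state transitions. A toy synchronous Boolean network (gene names and regulatory rules invented purely for illustration):

```python
# Toy Boolean gene regulatory network (invented rules, for illustration):
# each gene's next state is a Boolean function of the current states.
rules = {
    "geneA": lambda s: not s["geneC"],             # C represses A
    "geneB": lambda s: s["geneA"],                 # A activates B
    "geneC": lambda s: s["geneA"] and s["geneB"],  # A and B activate C
}

def step(state):
    """One synchronous update of the Boolean network."""
    return {gene: rule(state) for gene, rule in rules.items()}

# Starting from A on, B/C off, record a short trajectory.
state = {"geneA": True, "geneB": False, "geneC": False}
trajectory = [state]
for _ in range(4):
    state = step(state)
    trajectory.append(state)
```

Given trajectories like this one, a symbolic regressor's job is to recover the `rules` table as explicit logic formulas, which is what makes the Boolean formulation attractive for interpretable biology.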
Future work may expand on this foundation by incorporating more complex logical operations, such as XOR, directly into the learning process, which could increase the model's expressiveness and efficiency. Additionally, exploring linear or more efficient attention mechanisms may allow Boolformer to scale to larger input sizes, thereby enhancing its applicability to more complex real-world tasks.
In conclusion, the work on Boolformer represents a significant step in the use of deep learning architectures for logic-based tasks. It highlights the flexibility of Transformers and their capacity for symbolic manipulation, broadening the landscape of AI applications in areas that demand interpretability and logical rigor.