Cause of OpenFE’s poor performance on the bike regression task

Determine why the OpenFE automated feature engineering algorithm performs especially poorly on the bike sharing regression dataset across all cross-validation folds in the reported experiments.

Background

In the regression results, FAMOSE is compared against classical methods including OpenFE. While FAMOSE generally improves or maintains performance, OpenFE performs notably worse on the bike sharing dataset.

The authors explicitly state that they do not yet know why OpenFE degrades on this task. Because the degradation appears in every fold rather than in a few, it suggests a dataset-specific failure mode rather than fold-level noise, and it requires further investigation.
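As a first diagnostic step, one could check how uniform the degradation is across folds before looking for its cause. The sketch below is only an illustration of that check, not the authors' evaluation code: the function names `fold_deltas` and `summarize_degradation` are hypothetical, the metric is assumed to be an error measure where lower is better (such as RMSE), and the score lists are placeholder values rather than results from the paper.

```python
from statistics import mean


def fold_deltas(baseline_scores, method_scores):
    """Return per-fold score differences (method minus baseline)."""
    if len(baseline_scores) != len(method_scores):
        raise ValueError("both score lists must cover the same folds")
    return [m - b for b, m in zip(baseline_scores, method_scores)]


def summarize_degradation(baseline_scores, method_scores, higher_is_better=False):
    """Summarize whether a method is worse than the baseline in every fold.

    Assumption: with an error metric (higher_is_better=False), a positive
    delta means the method is worse than the baseline on that fold.
    """
    deltas = fold_deltas(baseline_scores, method_scores)
    worse = [(d < 0) if higher_is_better else (d > 0) for d in deltas]
    return {
        "deltas": deltas,
        "mean_delta": mean(deltas),
        "folds_worse": sum(worse),
        "consistent": all(worse),
    }


# Placeholder per-fold errors, NOT values reported in the paper.
baseline_rmse = [1.0, 1.1, 0.9, 1.0, 1.2]
openfe_rmse = [1.5, 1.6, 1.4, 1.6, 1.9]

summary = summarize_degradation(baseline_rmse, openfe_rmse)
print(summary)
```

If `consistent` is true and the per-fold deltas are of similar size, the cause is more likely a systematic interaction between the generated features and the dataset than an unlucky split, which would narrow where to look next.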

References

Interestingly, OpenFE performs especially poorly for the bike task, although we are unsure as of yet why its performance is consistently worse across all folds.

FAMOSE: A ReAct Approach to Automated Feature Discovery (2602.17641 - Burghardt et al., 19 Feb 2026) in Section Results (Regression)