Explain the negative interaction effect between Code_Generated and Code_Shared on citation counts

Explain the negative interaction effect between the Open Science Indicators (OSI) Code_Generated (code was generated) and Code_Shared (code was shared) on log-transformed total citation counts. The effect is observed when estimating the base regression model with interactions among OSI, reported in the Appendix as "Results for the base model with interactions among OSI". Clarify whether it reflects a genuine relationship in the underlying publications or is instead an artifact of the OSI v5 dataset used in the study.

Background

The authors extend their base citation-impact model by including interaction terms among Open Science Indicators (OSI), specifically examining how combinations of practices such as code generation and code sharing relate to citation counts. In this interaction model, they report a surprising negative effect associated with the combination of Code_Generated and Code_Shared, contrasting with expectations that sharing code might enhance reuse and visibility.

They note that this negative effect may be a dataset artifact and explicitly state that they are unsure how to explain it. This leaves unresolved whether the observed interaction is due to modeling, data composition, disciplinary confounds, or a real behavioral pattern in citation practices for papers that both generate and share code.
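One way to probe the data-composition possibility is to check how publications are distributed across the four combinations of the two indicators, and whether the interaction column is linearly dependent on the main-effect columns. The sketch below is illustrative only: the function names are hypothetical, the input is toy data rather than OSI v5 records, and it makes no claim about how the indicators are actually distributed in the dataset or about how the authors built their model.

```python
import numpy as np

def cell_counts(code_generated, code_shared):
    """Count publications in each (generated, shared) cell of the 2x2 table."""
    g = np.asarray(code_generated, dtype=bool)
    s = np.asarray(code_shared, dtype=bool)
    return {
        (gi, si): int(np.sum((g == gi) & (s == si)))
        for gi in (False, True)
        for si in (False, True)
    }

def interaction_design(code_generated, code_shared):
    """Design matrix with columns: intercept, generated, shared, generated*shared."""
    g = np.asarray(code_generated, dtype=float)
    s = np.asarray(code_shared, dtype=float)
    return np.column_stack([np.ones_like(g), g, s, g * s])

def is_rank_deficient(X):
    """True when the columns of X are linearly dependent."""
    return np.linalg.matrix_rank(X) < X.shape[1]

# Toy data (an assumption for illustration, NOT taken from OSI v5):
# here code is never shared unless it was also generated.
generated = [0, 0, 1, 1, 1, 0]
shared = [0, 0, 0, 1, 1, 0]

counts = cell_counts(generated, shared)
X = interaction_design(generated, shared)
deficient = is_rank_deficient(X)
```

In this toy case the cell with shared but not generated code is empty, so the interaction column is identical to the Code_Shared column and the two coefficients cannot be estimated separately. If a real dataset showed an empty or nearly empty cell, the interaction estimate would depend on very few observations, which would be one candidate explanation to examine alongside the modeling and disciplinary-confound possibilities.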

References

When considering OSI interactions (Table \ref{tab:model_interactions}), we find a further negative effect provided by code generated and code shared. This surprising result may be an artifact of the dataset, that we are unsure how to explain.

An analysis of the effects of sharing research data, code, and preprints on citations (2404.16171 - Colavizza et al., 2024), in the Appendix, paragraph preceding Table "Results for the base model with interactions among OSI"