SATD and Refactoring Coupling

Updated 17 February 2026

SATD/Refactoring Coupling is defined as the interplay between explicit technical debt markers (e.g., TODO, FIXME) and code refactoring aimed at improving software structure.
Empirical studies reveal over 55% of SATD removals coincide with refactoring, with operations like Extract Method and Remove Method frequently observed.
Automated detection tools and statistical analyses underscore that managing SATD through refactoring significantly impacts software maintainability and dependency architecture.

Self-Admitted Technical Debt (SATD) and its coupling with refactoring constitute a central concern in empirical software engineering. SATD refers to any source-code or artifact annotation (typically comment-based) in which developers explicitly acknowledge suboptimal, incomplete, or otherwise “debtful” code, most commonly via TODO, FIXME, or XXX tags. Refactoring comprises source-code transformations that preserve semantics while improving structural or design quality. A growing body of evidence, spanning multiple large-scale studies, demonstrates that SATD markers and refactoring activities are tightly interwoven: the majority of SATD removals occur alongside refactoring, and certain refactoring operations more frequently coincide with SATD management than others (Peruma et al., 2022, Esfandiari et al., 2024). This coupling has far-reaching implications for the maintainability, architecture, and evolution of software projects (Sutoyo et al., 26 Jan 2025).

1. Key Definitions and Detection Methods

Self-Admitted Technical Debt (SATD) is defined as inline documentation in source-code or related artifacts (e.g., commit messages, issues) where developers explicitly signal technical deficiencies or planned rework. The canonical markers operationalizing SATD include TODO (work incomplete), FIXME (known issues), and XXX (warnings deserving review) (Peruma et al., 2022, Esfandiari et al., 2024).

Refactoring refers to semantics-preserving code transformations, such as Rename Method, Extract Method, Change Variable Type, Move Class, or Remove Method, executed primarily to improve structure, modularity, readability, or architecture (Peruma et al., 2022). Refactorings are detected using static analysis tools such as RefactoringMiner or RefDiff, which mine AST-level changes for recognized patterns.

Architectural Technical Debt (ATD), as used here, denotes the subset of SATD that concerns architecture: violations of modularity, excessive coupling, obsolete technology, and similar deficiencies. Identifying it involves manual curation or classification schemes that separate architecture-related markers or comments from the broader set of SATD instances (Sutoyo et al., 26 Jan 2025).

Detection of SATD and refactoring activities at scale commonly leverages datasets such as SmartSHARK, coupled with diff-based heuristics for comment markers and AST-diffing for structural transformations (Peruma et al., 2022, Esfandiari et al., 2024).
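As an illustration of the diff-based marker heuristic, the following minimal sketch counts SATD comment lines added and removed in a unified diff. The function name `satd_changes`, the comment-detection regex, and the restriction to the three markers are assumptions made for this example; the cited studies may use broader keyword sets or different tooling.

```python
import re

# Markers named in the text; real detectors may use a broader keyword set.
SATD_MARKER = re.compile(r"\b(TODO|FIXME|XXX)\b")
# Rough hint that a line carries Java-style comment text (not a real parser).
COMMENT_HINT = re.compile(r"(//|/\*|^\s*\*)")


def satd_changes(unified_diff: str) -> dict:
    """Count SATD comment lines added and removed in a unified diff."""
    counts = {"added": 0, "removed": 0}
    for line in unified_diff.splitlines():
        # Skip file headers such as '--- a/Foo.java' and '+++ b/Foo.java'.
        # Caveat: a removed line whose content starts with '--' is also skipped.
        if line.startswith(("+++", "---")):
            continue
        if not line or line[0] not in "+-":
            continue
        body = line[1:]
        if COMMENT_HINT.search(body) and SATD_MARKER.search(body):
            counts["added" if line[0] == "+" else "removed"] += 1
    return counts


# Hypothetical diff used only to exercise the function.
EXAMPLE_DIFF = """--- a/Foo.java
+++ b/Foo.java
@@ -1,3 +1,3 @@
-    // TODO: handle null input
-    return s.length();
+    return s == null ? 0 : s.length();
+    // FIXME temporary default
"""

print(satd_changes(EXAMPLE_DIFF))
```

A commit would then be labeled as an SATD removal when the removed count is positive, which is the label the coupling metrics in the next section build on.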

2. Quantitative Characterization of SATD/Refactoring Coupling

Empirical studies across dozens of open-source Java systems have established that SATD removal and addition are highly likely to co-occur with refactoring operations. Key quantitative metrics include:

Coupling Ratio (CR): The proportion of SATD-removal commits that also perform a refactoring operation. For example, in a corpus of 77 Apache Java projects, the coupling ratio is 55.37%, signifying that over half of all SATD-removal commits involve refactoring (Peruma et al., 2022).
Odds Ratio (OR): The odds that a commit with an SATD change (addition or removal) also contains a refactoring, divided by the corresponding odds for non-SATD commits. All 76 systems with SATD removals demonstrate OR > 1, with statistical significance in 74 cases (p < 0.05, Fisher’s Exact Test) (Peruma et al., 2022).
Project Coverage: SATD removal co-occurs with refactoring in 95% of studied projects (p < 0.05, OR > 1), and SATD addition does so in 89% of projects (Esfandiari et al., 2024).
Prevalence: Only 10.9% of all refactoring commits are associated with SATD removal, whereas 55.37% of all SATD removals involve refactoring (Peruma et al., 2022).
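The metrics above can be sketched from a 2x2 contingency table of commits. The code below is a minimal, standard-library illustration; the function names and the toy counts are hypothetical, and the one-sided form of Fisher's exact test is an assumption, since the text does not say which alternative the studies used.

```python
from math import comb


def coupling_ratio(satd_removals_with_refactoring: int, satd_removals: int) -> float:
    """Share of SATD-removal commits that also contain a refactoring."""
    return satd_removals_with_refactoring / satd_removals


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for the table
         a = SATD change and refactoring     b = SATD change, no refactoring
         c = no SATD change, refactoring     d = no SATD change, no refactoring
    """
    return (a * d) / (b * c)


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher p-value: P(X >= a) with all table margins fixed."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    # Hypergeometric tail; comb() returns 0 for impossible cells.
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(n, col1)


# Toy counts for one hypothetical project (not taken from the cited studies).
a, b, c, d = 8, 2, 1, 5
print(coupling_ratio(a, a + b))          # share of SATD commits with refactoring
print(odds_ratio(a, b, c, d))            # > 1 means positive association
print(fisher_exact_greater(a, b, c, d))  # small p-value supports the association
```

An odds ratio above 1 with a small p-value is the per-project pattern the studies report; a zero in cell b or c would need a continuity correction, which this sketch omits.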

The volume and intensity of refactoring actions are also higher in files where SATD is removed. The median number of refactoring operations per file is 2 when debt is removed versus 1 otherwise (Mann-Whitney U, p < 0.05) (Peruma et al., 2022).
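To make the intensity comparison concrete, here is a minimal sketch of the Mann-Whitney U statistic computed from per-file refactoring counts, using average ranks for ties. The function name and sample values are hypothetical, and the sketch stops at the U statistic; it does not compute a p-value, for which a statistics library would normally be used.

```python
def mann_whitney_u(x: list, y: list) -> tuple:
    """Return (U_x, U_y) for two samples, assigning average ranks to ties."""
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y], key=lambda t: t[0])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        # Find the end of the run of tied values starting at position i.
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    n_x, n_y = len(x), len(y)
    u_x = rank_sum_x - n_x * (n_x + 1) / 2
    return u_x, n_x * n_y - u_x


# Hypothetical refactoring operations per file.
ops_with_satd_removal = [2, 3, 2, 5, 1]
ops_without_satd_removal = [1, 1, 2, 1]
print(mann_whitney_u(ops_with_satd_removal, ops_without_satd_removal))
```

U_x counts, in effect, how often a value from the first sample exceeds one from the second (ties count one half), so a U_x close to the product of the sample sizes indicates systematically higher counts in the first group.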

3. Refactoring Types and Patterns in SATD Management

Refactoring operations involved in SATD repayment are manifold, but certain types are especially prominent. In the context of single-file SATD-removal commits (Peruma et al., 2022):

Refactoring Type	% of Ops in SATD Payoff	Mean Ops per File
Extract Method	22.32%	2.16
Change Variable Type	18.07%	2.27
Rename Variable	10.89%	1.60
Rename Method	9.73%	1.66
Rename Parameter	7.64%	1.90
Extract Attribute	7.41%	5.05
Inline Method	4.48%	2.07
Extract Variable	4.17%	1.17
Rename Attribute	3.94%	1.59
Change Return Type	3.32%	1.48

Three specific operations (Move Class, Remove Method, and Move Attribute) occur significantly more often in the presence of SATD, with odds ratios of approximately 2.3, 1.9, and 2.1 respectively (p < 0.05) (Esfandiari et al., 2024). However, the overall mix of refactoring types does not differ significantly between SATD-involved and non-involved files (Fisher’s p > 0.05; cosine similarity ≈ 0.84) (Peruma et al., 2022, Esfandiari et al., 2024). This suggests that while refactoring becomes more frequent when SATD is involved, the kinds of transformations developers apply remain largely the same across contexts.
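The cosine similarity mentioned above compares two refactoring-type distributions treated as vectors. The sketch below shows the computation; the SATD shares are the first five rows of the table, while the non-SATD shares are invented purely for illustration and are not taken from the cited studies.

```python
from math import sqrt


def cosine_similarity(p: dict, q: dict) -> float:
    """Cosine similarity of two type->share mappings; missing types count as 0."""
    keys = set(p) | set(q)
    dot = sum(p.get(k, 0.0) * q.get(k, 0.0) for k in keys)
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        raise ValueError("cosine similarity is undefined for an all-zero vector")
    return dot / (norm_p * norm_q)


# Shares (%) in SATD-removal commits, from the table above.
satd_mix = {
    "Extract Method": 22.32,
    "Change Variable Type": 18.07,
    "Rename Variable": 10.89,
    "Rename Method": 9.73,
    "Rename Parameter": 7.64,
}
# Hypothetical shares for non-SATD refactorings (illustrative only).
other_mix = {
    "Extract Method": 15.0,
    "Change Variable Type": 20.0,
    "Rename Variable": 14.0,
    "Rename Method": 12.0,
    "Rename Parameter": 9.0,
}
print(round(cosine_similarity(satd_mix, other_mix), 3))
```

Values near 1 mean the two mixes point in almost the same direction, regardless of how many operations each context contains in total.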

Qualitative analysis highlights three classes of debt typically repaid via refactoring: error-handling improvements, structural/design clean-up (“get rid of cast,” identifier renaming, code motion), and feature updates (implementing previously deferred logic) (Peruma et al., 2022).

4. Architectural SATD, Dependency Metrics, and Code Coupling

At the architectural level, repayment of SATD (ATD) measurably alters the static dependency structure as characterized by FAN-IN and FAN-OUT metrics (Sutoyo et al., 26 Jan 2025):

FAN-IN ($\mathrm{FAN\text{-}IN}(C)$): Number of classes that depend on a given class $C$.
FAN-OUT ($\mathrm{FAN\text{-}OUT}(C)$): Number of classes on which $C$ depends.
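Both metrics follow directly from a list of class-level dependency edges. The sketch below is a minimal illustration assuming edges are given as `(source, target)` pairs meaning "source depends on target"; the function name, the choice to ignore self-dependencies, and the class names are assumptions of this example rather than the cited study's tooling.

```python
from collections import defaultdict


def fan_metrics(dependencies: list) -> dict:
    """Compute FAN-IN and FAN-OUT per class from (source, target) edges."""
    depends_on = defaultdict(set)   # class -> classes it depends on
    depended_by = defaultdict(set)  # class -> classes that depend on it
    for source, target in dependencies:
        if source == target:
            continue  # self-dependencies are ignored in this sketch
        depends_on[source].add(target)
        depended_by[target].add(source)
    classes = set(depends_on) | set(depended_by)
    return {
        cls: {
            "fan_in": len(depended_by.get(cls, ())),
            "fan_out": len(depends_on.get(cls, ())),
        }
        for cls in classes
    }


# Hypothetical dependency edges; the duplicate edge is counted once.
edges = [
    ("OrderService", "Repository"),
    ("UserService", "Repository"),
    ("Repository", "Connection"),
    ("OrderService", "Repository"),
]
for name, metrics in sorted(fan_metrics(edges).items()):
    print(name, metrics)
```

Comparing these values for the same class before and after a debt-repaying commit gives the per-file change in coupling that the following paragraphs summarize.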

Repayment of ATD leads, on average, to a 57.5% increase in FAN-IN and a 26.7% increase in FAN-OUT (Cohen’s $d \approx 0.15$ and $0.21$, respectively; both small effects), indicating increased centrality and architectural complexity. This central-node effect raises the risk of ripple effects and architectural “hub” formation, which can impact maintainability and future evolution.

Non-ATD files exhibit even larger increases in these coupling metrics. ATD repayment is therefore a focused, though not the largest, source of growing dependency centralization (Sutoyo et al., 26 Jan 2025).

Modification frequency data reveal that ATD-designated files are touched less frequently than non-ATD files (median changes: ATD = 2, non-ATD = 7; Mann-Whitney U, p ≪ 0.001), implying that such files often become maintenance blind spots until debt is addressed.

5. Statistical Analyses and Methodological Foundations

SATD/refactoring coupling has been robustly supported using a battery of statistical analyses:

Odds Ratios and Chi-square ($\chi^2$) Tests: Computed per project to determine whether SATD/refactoring co-occurrence exceeds random expectation. OR > 1 and $\chi^2$ p < 0.05 in the vast majority of cases (Peruma et al., 2022, Esfandiari et al., 2024).
Mann-Whitney U Tests: Used for non-parametric comparison of refactoring intensity (operations per file) between SATD-removal and non-SATD contexts; all tests yield p < 0.05 for higher operation count in SATD-removal commits (Peruma et al., 2022, Sutoyo et al., 26 Jan 2025).
Fisher’s Exact Test: Applied to refactoring type distributions, establishing that the mix of refactoring operations is statistically indistinguishable between SATD-removal and regular refactorings (p > 0.05) (Peruma et al., 2022).
Spearman Partial Correlation: Shows modest positive correlation between coupling metrics (FAN-IN/OUT) and frequency of change in ATD and non-ATD files (ATD: $r=0.241$ for FAN-IN, $r=0.175$ for FAN-OUT; both p < 0.001) (Sutoyo et al., 26 Jan 2025).

These analyses underpin the reported consistency and strength of the SATD/refactoring linkage across the studied projects.

6. Interpretation, Context, and Practical Implications

The SATD/refactoring coupling reflects the practical reality that code-quality management is a cyclical process: developers explicitly acknowledge debt via SATD, and subsequently orchestrate targeted refactoring—often en masse—to retire these debt markers. The “mutual trigger” phenomenon is evidenced by high co-occurrence rates and elevated ORs in both addition and removal events (Esfandiari et al., 2024). Refactorings such as move class and remove method are disproportionately associated with SATD, reflecting the structural nature of much debt repayment.

Repayment of architectural SATD centralizes software dependencies, which enhances short-term modularity but may induce long-term centralization risks, necessitating ongoing monitoring of evolution metrics (Sutoyo et al., 26 Jan 2025).

For practitioners, these patterns suggest actionable recommendations:

Prioritize refactoring in TODO/FIXME/XXX-dense modules.
Integrate automated SATD/refactoring detection in CI/CD pipelines.
Monitor evolving coupling metrics (FAN-IN/OUT) to flag emerging architectural vulnerabilities.
Augment code review and refactoring recommendation tools to treat SATD markers as a signal of imminent or needed restructuring.
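As one way to act on the first recommendation, the sketch below ranks files by SATD marker density, defined here as the fraction of lines containing a marker. This density definition, the function names, and the file contents are assumptions made for illustration; a real pipeline would likely restrict matching to comments and weight results by file size or change history.

```python
import re

SATD_MARKER = re.compile(r"\b(TODO|FIXME|XXX)\b")


def satd_density(source: str) -> float:
    """Fraction of lines in a file that contain an SATD marker."""
    lines = source.splitlines()
    if not lines:
        return 0.0
    hits = sum(1 for line in lines if SATD_MARKER.search(line))
    return hits / len(lines)


def rank_by_density(files: dict) -> list:
    """Return (path, density) pairs, densest first."""
    scored = ((path, satd_density(text)) for path, text in files.items())
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical file contents keyed by path.
sources = {
    "Parser.java": "// TODO split this method\nint parse() { return 0; }",
    "Util.java": "int add(int a, int b) {\n  return a + b;\n}",
    "Cache.java": "// FIXME not thread-safe\n// XXX review eviction\nvoid put() {}\nvoid get() {}",
}
for path, density in rank_by_density(sources):
    print(f"{path}: {density:.2f}")
```

The ranked list can feed a review queue or a CI report, so that marker-dense files are examined first.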

Composite or SATD-aware refactoring strategies—such as multi-step “debt recipes”—can address common repayment patterns with greater automation and precision (Peruma et al., 2022).

7. Open Problems and Future Directions

Continued investigation is warranted in multiple directions:

Temporal sequencing to clarify causality between SATD introduction and refactoring (Esfandiari et al., 2024).
Cross-language analysis to validate patterns beyond Java, including Python and C#.
Broader operationalization of SATD, leveraging NLP-based detectors and extending beyond explicit markers.
In-depth interviews with developers to elucidate motivational dynamics.
Linking SATD/refactoring interactions with bug-proneness, modularity decay, and architectural drift over project lifespans.

A plausible implication is that robust technical debt management infrastructure should synthesize SATD marker analytics, fine-grained refactoring recommendation, and dependency/coupling monitoring to support sustained software quality under real-world maintenance constraints (Peruma et al., 2022, Sutoyo et al., 26 Jan 2025, Esfandiari et al., 2024).