DECAP: Modular Systems in Parsing, Learning & Design
- DECAP denotes several distinct modular frameworks that decouple complex, co-occurring phenomena across domains such as spoken parsing, imitation learning, power-network optimization, and image captioning.
- The flagship instantiation, a decoupled agentic parser, employs specialized submodules that isolate domain-specific challenges, enabling targeted error correction and improved performance.
- Empirical evaluations show significant gains over traditional methods, driven by error-aware evaluation and a deterministic, modular design.
DECAP is a term appearing in several distinct and technically specialized research contexts, most notably (i) as the DECoupled Agentic Parser for robust syntactic analysis of spoken code-switching utterances, (ii) as Decaying Action Priors in imitation learning for torque-based locomotion, (iii) as (De)Coupling Capacitor optimization in power distribution networks, and (iv) as data-efficient image captioning and explicit caption editing frameworks. Each of these DECAP or DeCAP instantiations is grounded in a unique methodology and domain; all are united by a focus on decoupling or decomposing complex co-occurring phenomena through modular system architectures, domain-informed priors, or context-adaptive policies.
1. DECAP: Decoupled Agentic Parsing for Spoken Code-Switching
The DECAP parser (“DECoupled Agentic Parser”) is introduced to address the distinctive linguistic and structural challenges of syntactic dependency parsing in spoken code-switching (CSW) data, where conversational transcriptions exhibit phenomena—repetitions, ellipsis, fillers, discourse-driven structure—that violate the standard Universal Dependencies (UD) assumptions (Tyagi et al., 6 Feb 2026). Classical monolithic or LLM-based parsers tend to conflate structural noise with error, resulting in underperformance both in terms of formal accuracy and linguistic plausibility.
DECAP implements a deterministic, four-stage agentic pipeline architecture:
- Spoken-Phenomena Handler (SPH): Tags disfluencies and spoken-specific structures, minimally re-tokenizes, and anchors reparandum nodes.
- Language-Specific Resolver (LSR): Applies language-aware contraction expansion, multiword expression (MWE) confirmation, and lemma normalization, ensuring all tokens are UD-compatible.
- Core UD Structure Assigner: Produces UPOS tags, heads, and deprels under hard constraints provided by SPH and LSR, relaxing strict UD assumptions only as required.
- Verifier and Ranker (V/R): Globally repairs the parse, enforces single-root and acyclicity constraints, and integrates per-token agent confidence scores with a penalty system.
The output includes both the fully structured parse and token-level diagnostics, facilitating downstream ambiguity-aware evaluation.
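The four-stage flow can be sketched as a simple composition of stages that thread tokens and accumulated hard constraints from one agent to the next. The `Token` fields and stage signatures below are illustrative assumptions, not the paper's exact interfaces:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

@dataclass
class Token:
    form: str
    lang_tag: str = "en"
    spoken_label: Optional[str] = None   # e.g. "repetition", "discourse"
    head: Optional[int] = None
    deprel: Optional[str] = None
    confidence: float = 1.0
    penalty: float = 0.0

# Each stage (SPH, LSR, Core Assigner, V/R) maps (tokens, constraints)
# to an updated (tokens, constraints) pair.
Stage = Callable[[List[Token], Dict], Tuple[List[Token], Dict]]

def run_decap(tokens: List[Token], stages: List[Stage]) -> List[Token]:
    """Thread tokens and accumulated hard constraints through
    SPH -> LSR -> Core UD Assigner -> Verifier/Ranker, in order."""
    constraints: Dict = {}
    for stage in stages:
        tokens, constraints = stage(tokens, constraints)
    return tokens
```

The deterministic, single-pass ordering matters: later stages treat earlier stages' outputs (e.g., reparandum anchors from SPH) as hard constraints rather than re-deriving them.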
2. Formalization and Subtask Decomposition in DECAP
Each agent in DECAP explicitly formalizes its subtask:
- SPH operates as a tagging and tokenization edit module, labeling tokens with phenomena such as ‘repetition’ or ‘ellipsis’, assigning anchor links for reparanda, and updating the ID map.
- LSR normalizes cross-lingual contractions, validates MWEs against a high-precision whitelist, and suggests lemmas with associated confidence metrics.
- Core UD Structure Assignment is posed as a constrained graph prediction task. Hard constraints include, e.g., forcing reparanda to attach as rep-dependents, and enforcing a single-root per sentence.
- V/R applies fixed priority rules to resolve ill-formedness (e.g., multiple roots, cycles) and aggregates confidence/penalty scores across agents.
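One of V/R's fixed priority rules, single-root enforcement, can be sketched as follows; the tie-breaking rule (keep the highest-confidence root) and the per-repair penalty value are illustrative assumptions, not the published rules:

```python
# Illustrative V/R repair step: when multiple tokens claim the root,
# keep the highest-confidence one, reattach the rest to it, and
# accumulate a penalty for each forced repair.

def repair_multiple_roots(heads, confidences, penalties, penalty_per_fix=0.1):
    """heads[i] == 0 marks token i+1 as a root (CoNLL-U convention)."""
    roots = [i for i, h in enumerate(heads) if h == 0]
    if len(roots) <= 1:
        return heads, penalties          # already well-formed
    keep = max(roots, key=lambda i: confidences[i])  # most confident root wins
    for i in roots:
        if i != keep:
            heads[i] = keep + 1          # reattach demoted root under kept root
            penalties[i] += penalty_per_fix
    return heads, penalties
```

An analogous rule would break cycles by reattaching the lowest-confidence edge in each cycle; both repairs leave an audit trail in the per-token penalty column.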
Consider the following schema for token record data (abbreviated):
| Token | lang_tag | spoken_label | spoken_anchor | lemma | mwe | confidence | penalty |
|---|---|---|---|---|---|---|---|
| "won’t" | en | enclisis | N/A | "will" | false | 0.95 | 0.0 |
| "uh" | en | discourse | root | "uh" | false | 0.96 | 0.1 |
This structuring enables modular error tracing and targeted repair of parsing failures.
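A minimal sketch of the error tracing this record structure enables, assuming dict-shaped records whose keys mirror the table above (the penalty threshold is an illustrative choice):

```python
# Token records mirroring the abbreviated schema in the table above.
records = [
    {"token": "won't", "lang_tag": "en", "spoken_label": "enclisis",
     "lemma": "will", "mwe": False, "confidence": 0.95, "penalty": 0.0},
    {"token": "uh", "lang_tag": "en", "spoken_label": "discourse",
     "lemma": "uh", "mwe": False, "confidence": 0.96, "penalty": 0.1},
]

def flag_for_repair(records, penalty_threshold=0.05):
    """Return (token, spoken_label) pairs that carry a nonzero V/R penalty,
    i.e., the tokens a targeted repair pass should revisit."""
    return [(r["token"], r["spoken_label"])
            for r in records if r["penalty"] > penalty_threshold]
```

Because every agent writes into the same record, a failure can be traced back to the stage that produced the offending field rather than debugged against the full parse.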
3. Interface with Ambiguity-Aware Evaluation (FLEX-UD)
Standard dependency metrics (LAS/UAS) penalize any deviation from gold, failing to capture the linguistic validity of many spoken-phenomena parses. DECAP integrates directly with FLEX-UD, an evaluation framework that computes a weighted aggregate of fine-grained component scores (Split, ID, UPOS, HEAD, DEPREL) and applies a severity penalty for catastrophic or unlicensed violations. This mapping leverages DECAP's token-level penalty and confidence outputs, aligning parser evaluation with linguistically plausible outcomes.
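The shape of such a score can be sketched as a weighted component aggregate minus a severity term; the component weights and penalty scale below are illustrative assumptions, not FLEX-UD's published values:

```python
# Hedged sketch of a FLEX-UD-style final score: weighted aggregate of
# component scores in [0, 1], minus a severity penalty per catastrophic
# or unlicensed violation. Weights and severity are illustrative.

def flex_ud_score(components, n_violations, weights=None, severity=5.0):
    """components: dict mapping component name -> score in [0, 1]."""
    if weights is None:
        weights = {"Split": 0.1, "ID": 0.1, "UPOS": 0.2,
                   "HEAD": 0.3, "DEPREL": 0.3}
    base = 100.0 * sum(weights[k] * components[k] for k in weights)
    return max(0.0, base - severity * n_violations)
```

The key property is that a linguistically licensed deviation (a plausible alternate head for a disfluency, say) costs only its component weight, while an unlicensed violation incurs the much larger severity term.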
4. Training, Inference Regime, and Data Requirements
DECAP agents are not parametric models requiring supervised gradient descent; instead, they are instantiated as fixed, deterministic prompt-engineered modules in a GPT-4.1 environment, with temperature zero for inference (Tyagi et al., 6 Feb 2026). No agents are fine-tuned on target CSW datasets (e.g., SpokeBench, Miami Corpus). The only data requirements are tokenized, language-ID-annotated utterances and MWEs/contraction schemas. Human-annotated data are reserved exclusively for prompt validation and downstream evaluation.
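An agent in this regime amounts to a fixed system prompt closed over a zero-temperature decoding call. In the sketch below, the prompt text and the `call_llm` transport are placeholders; only the model name and temperature setting reflect what the source reports:

```python
from typing import Callable

def make_agent(system_prompt: str, call_llm: Callable[[dict], str]):
    """Return a deterministic agent: an immutable prompt plus
    temperature-0 decoding, with no fine-tuning involved."""
    def agent(utterance: str) -> str:
        request = {
            "model": "gpt-4.1",     # model reported in the source
            "temperature": 0,       # deterministic decoding
            "messages": [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": utterance},
            ],
        }
        return call_llm(request)
    return agent
```

Determinism here is what makes the pipeline reproducible and its ablations meaningful: rerunning the same utterance through the same agent yields the same tags, so failures are attributable to prompts rather than sampling noise.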
5. Empirical Results and Comparative Performance
On the SpokeBench benchmark, DECAP outperforms both traditional (Stanza-based) bilingual parsers and closed-source monolithic LLM pipelines. It achieves:
- LAS: 0.48 (vs. 0.13 Stanza, 0.32 LLM baseline)
- UPOS-LAS: 0.87 (vs. 0.52, 0.69)
- FLEX-UD Final: 76.2 (vs. 29.5, 72.2)
Ablations show notable performance drops when SPH is omitted (LAS -0.13) or LSR is omitted (FLEX-UD -3 points). On the most challenging categories (ellipsis, repetition), DECAP achieves up to a 52.6% relative improvement over the best non-agentic baseline.
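The headline numbers translate into relative gains via the usual ratio, (new - old) / old; for instance, DECAP's 0.48 LAS over the 0.32 LLM baseline is a 50% relative gain, and over Stanza's 0.13 it is roughly 269%:

```python
# Relative improvement of a new score over a baseline.
def relative_gain(new, old):
    return (new - old) / old
```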
6. Interpretive Context and Extensions
DECAP’s effectiveness rests on two principles:
- Modularization of linguistic subtasks enables robust handling of non-canonical elements without conflating them with core syntax.
- Error-aware evaluation through FLEX-UD aligns machine scoring with actual linguistic ambiguity and variation.
A plausible implication is that further decomposition, e.g., integrating dialogue-act recognition or richer prosodic cues, could increase robustness in dialogic parsing environments.
7. Other Notable Uses: DECAP in Policy Learning and Beyond
- DECAP (Decaying Action Priors): In imitation learning for torque-based legged locomotion, DECAP refers to a two-stage pipeline combining policy imitation from position space with a time-decaying, PD-based bias in torque action space, resulting in accelerated and robust convergence to natural gaits and low sensitivity to reward scaling (Sood et al., 2023).
- DECap & DeCap (Captioning): In image captioning and editing, DECap denotes diffusion-based explicit caption editing for robust, generalizable, and controllable caption refinement (Wang et al., 2023), while DeCap refers to a projection-based, text-only training paradigm for zero-shot decoding of CLIP latents in caption generation (Li et al., 2023).
- DECAP (Context-Adaptive Prompts): In debiasing LLMs for QA, DeCAP is a context-adaptive prompt system that combines ambiguity detection with neutral guidance injection for state-of-the-art zero-shot fairness (Bae et al., 25 Mar 2025).
- DECAP (PDN Optimization): In physical design for ICs and HBM, DECAP involves RL-based transformer policies for data-efficient, scalable, and re-usable decoupling capacitor assignments (Park et al., 2022).
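Of these, the Decaying Action Priors variant admits a compact sketch: the applied torque blends the learned policy's output with a PD prior toward a reference pose, and the prior's weight decays over training. The gains, decay rate, and function names below are illustrative assumptions, not the paper's values:

```python
import math

def dap_torque(policy_torque, q, q_ref, qdot, step,
               kp=40.0, kd=1.0, decay=1e-4):
    """Blend the policy's torque with a time-decaying PD bias:
    tau = tau_policy + alpha(step) * (kp * (q_ref - q) - kd * qdot),
    where alpha decays exponentially toward 0 over training."""
    alpha = math.exp(-decay * step)          # prior weight, 1 -> 0
    pd_bias = kp * (q_ref - q) - kd * qdot   # PD prior in torque space
    return policy_torque + alpha * pd_bias
```

Early in training the PD term dominates and keeps the robot near viable gaits; as it decays, the policy takes full control, which is the claimed source of faster, more robust convergence.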
Each instantiation emphasizes principled decoupling—across structural, statistical, or architectural axes—resulting in greater robustness and efficiency across tasks.