Local Classifier per Parent/Node (LCPN)
- LCPN is a modular hierarchical classification strategy that trains distinct multi-class classifiers at each internal node of a label tree.
- Extensions LCPN+ and LCPN+F improve inference robustness by integrating probabilistic all-paths scoring and a global flat classifier, respectively.
- Implemented in the HiClass framework, LCPN ensures hierarchy-consistent predictions while leveraging specialized metrics for structured evaluation.
Local Classifier per Parent/Node (LCPN) is a modular hierarchical classification strategy wherein distinct multi-way classifiers are trained at each internal node of a tree-structured label hierarchy. Each classifier discriminates among its direct child classes using only those training examples whose true class resides in the node’s subtree. Extensions such as LCPN+ (probabilistic all-paths) and LCPN+F (combining local and global classification) further improve robustness and performance, especially in structured or time-series domains. This paradigm is foundational for hierarchical classification, prominently implemented in frameworks such as HiClass (Miranda et al., 2021), and empirically validated in automatic hierarchy induction settings (Alagoz, 2023).
1. Formal Framework and Problem Definition
Let $T = (V, E)$ denote the tree-structured hierarchy, where $V$ is the set of nodes (internal and leaf) and $E \subset V \times V$ is the parent–child relation. The leaves $\mathcal{L} \subset V$ represent the atomic classes; each training sample $(x_i, y_i)$ is labeled by a unique leaf $y_i \in \mathcal{L}$. The path $\pi(y)$ identifies the sequence of ancestor nodes for each class $y$.
For every internal node $v$ with children $C(v) = \{c_1, \dots, c_k\}$, LCPN trains a multi-class classifier $f_v : \mathcal{X} \to C(v)$. The training set for $f_v$ is constructed from all examples whose label lies in $v$’s subtree, each relabeled to the child subtree it falls under:

$$D_v = \{(x_i, c_j) : c_j \in C(v),\ y_i \in \mathrm{leaves}(c_j)\}.$$

The overall objective is to minimize the node-wise aggregate loss:

$$\min_{\{f_v\}} \sum_{v \in V_{\mathrm{int}}} \sum_{(x, c) \in D_v} \ell(f_v(x), c),$$

where $\ell$ may be the zero–one loss or a convex surrogate.
Inference proceeds in a top-down fashion: starting at the root, each classifier selects the most probable child node; the process recurses until a leaf is reached, producing a hierarchy-consistent prediction path.
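The subtree-based training-set construction can be sketched in pure Python; the hierarchy encoding and helper names below are illustrative assumptions, not part of any library:

```python
def build_parent_datasets(hierarchy, samples):
    """Group training samples per internal node, relabeling each sample
    to the child subtree its leaf label falls under.

    hierarchy: dict mapping each internal node to its list of children.
    samples:   list of (x, leaf_label) pairs.
    """
    def leaves_under(node):
        kids = hierarchy.get(node, [])
        if not kids:                      # node is itself a leaf
            return {node}
        return set().union(*(leaves_under(k) for k in kids))

    datasets = {}
    for parent, children in hierarchy.items():
        local = []
        for child in children:
            subtree_leaves = leaves_under(child)
            for x, leaf in samples:
                if leaf in subtree_leaves:
                    local.append((x, child))  # relabel to the child class
        datasets[parent] = local
    return datasets

# Toy hierarchy: root -> {animal, vehicle}; animal -> {cat, dog}
hierarchy = {"root": ["animal", "vehicle"], "animal": ["cat", "dog"]}
samples = [([0.1], "cat"), ([0.2], "dog"), ([0.9], "vehicle")]
ds = build_parent_datasets(hierarchy, samples)
```

Note how every sample reaches the root's classifier, while the classifier at `animal` only ever sees cat/dog examples, exactly as the definition of $D_v$ prescribes.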
2. Algorithms and Variants
Standard LCPN
Training: For each internal node $v$, construct $D_v$ and train $f_v$ on $D_v$, discriminating amongst the children $C(v)$.
Inference:
```
p = root
while C(p) != ∅:          # p still has children
    p = f_p.predict(x)    # argmax over children of p
return p                  # p is a leaf
```
LCPN+ (Probabilistic All-Paths)
Rather than greedily following a single path, LCPN+ computes a chain-rule probability for every leaf $\ell$ and returns the maximizer:

$$\hat{y} = \arg\max_{\ell \in \mathcal{L}} \prod_{p \in \pi(\ell)} P(q_p \mid p, x),$$

where $\pi(\ell)$ is the set of internal nodes along the path to $\ell$, and $q_p$ is the chosen child of $p$ on that path.
Pseudocode:
```
for each leaf ℓ in leaves:
    score[ℓ] = 1
    for each parent p on path to ℓ:
        q = child of p on that path
        score[ℓ] *= f_p(x)[q]
return argmax_ℓ score[ℓ]
```
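This all-paths scoring can be made concrete in a few lines of Python, with each trained local classifier represented here by a dictionary of child probabilities (an illustrative stand-in for real model outputs):

```python
def lcpn_plus_predict(hierarchy, local_probs, leaves):
    """Score every leaf by the product of local child probabilities
    along its root path, and return the highest-scoring leaf.

    hierarchy:   {internal node: [children]}
    local_probs: {internal node: {child: P(child | node, x)}} for one input x.
    """
    # Parent pointers let us walk each leaf's path back to the root.
    parent = {c: p for p, kids in hierarchy.items() for c in kids}

    def path_score(leaf):
        score, node = 1.0, leaf
        while node in parent:
            score *= local_probs[parent[node]][node]
            node = parent[node]
        return score

    return max(leaves, key=path_score)

hierarchy = {"root": ["A", "B"], "A": ["a1", "a2"]}
local_probs = {"root": {"A": 0.6, "B": 0.4}, "A": {"a1": 0.5, "a2": 0.5}}
pred = lcpn_plus_predict(hierarchy, local_probs, ["a1", "a2", "B"])
```

Here greedy LCPN would descend into A (probability 0.6) and commit to a1 or a2 with full-path probability only 0.3, whereas all-paths scoring correctly prefers the leaf B with probability 0.4, illustrating how LCPN+ avoids early-split error propagation.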
LCPN+F (Local–Global Combined)
Augments LCPN+ by multiplying in the global flat classifier’s prediction for each leaf:

$$\hat{y} = \arg\max_{\ell \in \mathcal{L}} \, f(x)[\ell] \prod_{p \in \pi(\ell)} P(q_p \mid p, x),$$

where $\pi(\ell)$ excludes the leaf split; $f$ is a flat classifier over the leaf classes.
Pseudocode:
```
for each leaf ℓ in leaves:
    score[ℓ] = f(x)[ℓ]              # global flat classifier
    for each parent p on path to ℓ:
        q = child of p on that path
        score[ℓ] *= f_p(x)[q]
return argmax_ℓ score[ℓ]
```
No special loss is introduced; each local classifier $f_p$ and the flat classifier $f$ are trained with standard cross-entropy.
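The LCPN+F combination step can likewise be sketched, with hypothetical flat-classifier leaf probabilities supplied as a dictionary (again an illustrative stand-in, not a library API):

```python
def lcpn_plus_f_predict(hierarchy, local_probs, flat_probs):
    """LCPN+F scoring: flat-classifier leaf probability times the
    product of local child probabilities along the leaf's path.

    flat_probs:  {leaf: P(leaf | x)} from a flat classifier over leaves.
    local_probs: {internal node: {child: P(child | node, x)}}.
    """
    parent = {c: p for p, kids in hierarchy.items() for c in kids}
    best_leaf, best_score = None, -1.0
    for leaf, flat_p in flat_probs.items():
        score, node = flat_p, leaf     # seed with the global flat score
        while node in parent:
            score *= local_probs[parent[node]][node]
            node = parent[node]
        if score > best_score:
            best_leaf, best_score = leaf, score
    return best_leaf

hierarchy = {"root": ["A", "B"], "A": ["a1", "a2"]}
local_probs = {"root": {"A": 0.6, "B": 0.4}, "A": {"a1": 0.5, "a2": 0.5}}
flat_probs = {"a1": 0.7, "a2": 0.1, "B": 0.2}
pred = lcpn_plus_f_predict(hierarchy, local_probs, flat_probs)
```

With these numbers, pure all-paths scoring would pick B (0.4 vs. 0.3), but the flat classifier's strong preference for a1 (0.7) tips the combined score (0.21 vs. 0.08) back to a1, showing how the global term injects leaf-level evidence the local chain misses.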
3. HiClass Implementation
The HiClass Python library (Miranda et al., 2021) implements LCPN in hiclass.local.lcpn.LocalClassifierPerParentNode. The constructor accepts any scikit-learn classifier as a base_estimator together with a specified or inferred hierarchy. Input hierarchical labels may be given in matrix form (samples × depth) or as nested lists; HiClass’s HierarchicalLabelEncoder facilitates encoding and decoding.
Typical usage proceeds as:
```python
from hiclass.local.lcpn import LocalClassifierPerParentNode
from hiclass.utils import HierarchicalLabelEncoder

X = ...        # features, shape (n_samples, n_features)
y_paths = ...  # list of lists of node IDs/strings

hle = HierarchicalLabelEncoder()
y_encoded = hle.fit_transform(y_paths)

lcpn = LocalClassifierPerParentNode(base_estimator=..., hierarchy=hle.hierarchy_)
lcpn.fit(X, y_encoded)
y_pred = lcpn.predict(X)
```
4. Comparative Analysis and Efficiency
HiClass supports multiple local strategies:
| Strategy | Classifiers per... | Inference Consistency | Classes Discriminated |
|---|---|---|---|
| LCPN | Parent node | Guaranteed | Children of that parent |
| LCL (Level-wise) | Depth level | Not always | All nodes at the same level |
| LCN (Node-wise) | Each node | Depends | Binary (node vs. rest) |
LCPN provides notably smaller classification tasks, leveraging local structures, with hierarchical consistency by construction. Its main trade-offs are the increased number of classifiers to train and the possibility of top-down error propagation—an incorrect early split blocks correct leaves downstream.
Computational cost for standard LCPN inference scales as $O(D \cdot c)$, where $D$ is the tree depth and $c$ the cost of one local prediction; LCPN+ and LCPN+F induce $O(L \cdot D \cdot c)$ ($L$ = leaf count), and LCPN+F adds an extra $O(c_{\mathrm{flat}})$ for the global prediction (Alagoz, 2023).
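As a rough illustration of these scalings, the following toy model counts classifier invocations for a full b-ary tree, under the simplifying assumption that all-paths scoring evaluates each internal node's classifier once and reuses its output across paths (the function and its labels are illustrative):

```python
def inference_costs(branching, depth):
    """Count local-classifier invocations per strategy for a full
    `branching`-ary tree with leaves at the given depth.
    Assumes branching > 1 and cached per-node outputs."""
    leaves = branching ** depth
    internal = (branching ** depth - 1) // (branching - 1)  # geometric sum
    return {
        "LCPN (greedy)": depth,         # one classifier per level on the path
        "LCPN+ (all paths)": internal,  # every internal node evaluated once
        "LCPN+F": internal + 1,         # plus one flat-classifier call
        "flat": 1,
        "leaves": leaves,
    }

costs = inference_costs(branching=3, depth=4)
```

For a 3-ary tree of depth 4 (81 leaves), greedy LCPN makes 4 local predictions versus 40 for all-paths scoring, which makes concrete why the probabilistic variants trade inference cost for robustness.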
Empirically, flat classification (FC) is fastest; the global, LCPN+, and LCPN+F schemes are slower than FC, and standard LCPN can be slower still, owing to its many small classifiers. LCPN+F and LCPN+ are similar in runtime, with the flat-classifier overhead negligible for moderate leaf counts.
5. Hierarchy Induction and Empirical Performance
Automated hierarchy generation (using divisive or agglomerative tree-building and LDA for dimensionality reduction) can significantly influence hierarchical classifier performance (Alagoz, 2023).
| Scheme | Glass | PPTW | Yeast | Faces | FiftyWords |
|---|---|---|---|---|---|
| LCPN (Div) | 0.877 | 1.119 | 0.935 | 0.845 | 0.842 |
| LCPN+ (Div) | 0.905 | 1.079 | 0.905 | 0.848 | 0.927 |
| LCPN+F (Div) | 1.007 | 1.029 | 0.897 | 1.002 | 1.022 |
(Table shows Learning-Efficiency; values above 1 indicate the hierarchical classifier (HC) outperforms the flat baseline.)
LCPN+F dominates in most divisive clustering settings, especially for time-series datasets (PPTW, FiftyWords) and structured domains (Glass, Faces). Hierarchy induction quality is critical; poor clustering or reduction will degrade hierarchical classification relative to flat classifiers.
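Divisive induction of this kind can be sketched deterministically in pure Python: classes are recursively split in two around their mutually farthest centroids. This is a toy stand-in for the k-means/LDA pipelines used in practice; the splitting rule and names are illustrative assumptions:

```python
def divisive_hierarchy(centroids):
    """Recursively bipartition classes by assigning each to the nearer
    of the two mutually farthest centroids (assumes distinct centroids).
    Returns a nested list encoding the induced binary tree."""
    labels = sorted(centroids)
    if len(labels) <= 1:
        return labels[0]

    def dist(a, b):  # squared Euclidean distance between class centroids
        return sum((x - y) ** 2 for x, y in zip(centroids[a], centroids[b]))

    # Seed the split with the farthest-apart pair of classes.
    s1, s2 = max(((a, b) for a in labels for b in labels if a < b),
                 key=lambda ab: dist(*ab))
    left = [l for l in labels if dist(l, s1) <= dist(l, s2)]
    right = [l for l in labels if l not in left]
    return [divisive_hierarchy({l: centroids[l] for l in left}),
            divisive_hierarchy({l: centroids[l] for l in right})]

centroids = {"cat": (0.0, 0.1), "dog": (0.1, 0.0),
             "car": (5.0, 5.0), "bus": (5.1, 4.9)}
tree = divisive_hierarchy(centroids)
```

On this toy data the induced tree separates the two vehicle classes from the two animal classes at the root, which is exactly the kind of semantically coherent split that makes the downstream local classifiers' tasks easy.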
6. Hierarchical Evaluation Metrics
HiClass provides specialized metrics for hierarchical classification assessment (Miranda et al., 2021):
- Hierarchical Precision, Recall, F1: Based on overlaps of the node sets along true and predicted paths:

$$hP = \frac{|\hat{P} \cap P|}{|\hat{P}|}, \qquad hR = \frac{|\hat{P} \cap P|}{|P|}, \qquad hF_1 = \frac{2 \cdot hP \cdot hR}{hP + hR},$$

where $P$ denotes the true node path and $\hat{P}$ the predicted one.
- Tree Induced Error (TE): Fraction of hierarchy levels misclassified:

$$TE = \frac{1}{D} \sum_{d=1}^{D} \mathbf{1}[\hat{y}_d \neq y_d],$$

where $y_d$ and $\hat{y}_d$ are the true and predicted nodes at depth $d$.
Python usage:
```python
from hiclass.metrics import hierarchical_precision, hierarchical_recall

hp = hierarchical_precision(y_true, y_pred)
hr = hierarchical_recall(y_true, y_pred)
```
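For intuition, the set-overlap metrics can also be sketched in plain Python (an illustrative micro-averaged implementation, not HiClass's own code):

```python
def hierarchical_prf(true_paths, pred_paths):
    """Micro-averaged hierarchical precision, recall, and F1 over the
    ancestor-node sets of true and predicted paths."""
    inter = true_total = pred_total = 0
    for t, p in zip(true_paths, pred_paths):
        t_set, p_set = set(t), set(p)
        inter += len(t_set & p_set)   # shared nodes on both paths
        true_total += len(t_set)
        pred_total += len(p_set)
    hp = inter / pred_total
    hr = inter / true_total
    hf1 = 2 * hp * hr / (hp + hr)
    return hp, hr, hf1

# Each path lists every node from below the root down to the leaf.
y_true = [["animal", "dog"], ["vehicle", "car"]]
y_pred = [["animal", "cat"], ["vehicle", "car"]]
hp, hr, hf1 = hierarchical_prf(y_true, y_pred)
```

The first sample's wrong leaf still earns partial credit for the correct "animal" ancestor, which is precisely the structure-sensitivity that flat accuracy lacks.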
These metrics provide fine-grained evaluation sensitive to hierarchical structure, capturing partial correctness that flat accuracy measures miss on structured tasks.
7. Strengths, Limitations, and Practical Guidelines
Local-per-parent strategies exhibit modularity, exploiting local decision boundaries with parallelizable training and inference. LCPN+ and LCPN+F address error propagation inherent in greedy LCPN by evaluating entire leaf chains; LCPN+F additionally injects global discrimination, enhancing robustness.
Limitations include computational overhead for large hierarchies with many leaves, sensitivity to the hierarchy induction method, and reduced efficacy where global flat models are inherently optimal (e.g., weakly structured problems).
Practitioners are advised:
- When moderate inference cost is acceptable, LCPN+F is optimal for balancing error propagation against global context.
- For strict efficiency requirements, standard LCPN is preferable—accepting error path risks.
- Always empirically validate against flat baselines, experiment with hierarchy induction (divisive/agglomerative clustering plus LDA), and prioritize LCPN+F in structured or time-series domains.
Local Classifier per Parent/Node thus offers a powerful framework for hierarchical classification, with probabilistic and combined extensions further enabling robust and accurate modeling of complex class structures (Miranda et al., 2021, Alagoz, 2023).