Hearing4All Auditory Profiles Framework
- The Hearing4All auditory profiles form a data-driven, multidimensional framework that categorizes hearing loss into clinically homogeneous subgroups using both pure-tone and supra-threshold measures.
- The methodology employs full-covariance Gaussian Mixture Models combined with PCA and t-SNE for robust cluster formation, achieving 13 distinct profiles validated by normalized indices and ANOVA.
- A federated merging approach and Random Forest classifiers ensure scalable, privacy-preserving integration across datasets, supporting personalized hearing-aid fitting and clinical research.
The Hearing4All auditory profiles constitute a data-driven, multidimensional framework for partitioning individuals with hearing loss into clinically homogeneous subgroups. Central to this approach is the integration of both pure-tone threshold and supra-threshold auditory measures, enabling refined characterization of hearing impairment phenotypes and facilitating precision hearing-aid fitting. The methodology is distinguished by its scale (OHHR, N=1,127) and its explicit internal validation via manifold learning and normalized intrinsic metrics. Recent work has extended the pipeline to federated dataset integration, supporting global generalizability and robust, privacy-preserving clinical utility (Xu et al., 7 Jan 2026, Saak et al., 2024).
1. Feature Set and Data Architecture
The Hearing4All framework utilizes an extended clinical dataset (the Oldenburg Hearing Health Record, N=1,127, mean age = 67.2 years, SD = 12.0) capturing a broad array of audiometric and supra-threshold indices (Xu et al., 7 Jan 2026). Essential features include:
- Pure-tone air-conduction thresholds (11 frequencies, 0.25–8 kHz, both ears): standard audiogram.
- Loudness-growth parameters via Adaptive Categorical Loudness Scaling (ACALOS): transition level (LCUT), medium loudness (L25), uncomfortable loudness (L50), and the low-level slope (m_low), quantified at 1.5 and 4 kHz.
- Speech-in-noise thresholds: Göttingen Sentence Test (GÖSA) and Digit-Triplet Test (DTT), language specific.
- Optional components: cognitive screening (DemTect), verbal intelligence, health indices (SF-12); primarily excluded from core clustering.
Total dimensionality is approximately 15–20 per listener, optimized to capture both audibility deficits and supra-threshold distortions. In federated merging extensions, disclosed common features consist of age, SRT in noise (GÖSA), air- and bone-conduction PTA, asymmetry (ASYM), air–bone gap (ABG), UCL PTA, and ACALOS indices (L15, L35, L35–L15 at 1 and 4 kHz); dataset-specific features can enrich the initial clustering but are excluded from shared summaries (Saak et al., 2024).
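As a minimal sketch, the assembled per-listener feature matrix can be z-scored before clustering. The data, feature count, and scales below are synthetic placeholders, not the OHHR schema:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature matrix: rows = listeners, columns = audiogram
# thresholds plus supra-threshold indices (dimensions are illustrative).
n_listeners, n_features = 200, 18
X = rng.normal(loc=40.0, scale=15.0, size=(n_listeners, n_features))

# z-score each feature so thresholds (dB HL), SRTs (dB SNR), and loudness
# slopes (CU/dB) contribute on a comparable scale before clustering.
X_z = (X - X.mean(axis=0)) / X.std(axis=0)
```

Standardizing per feature prevents the audiogram's large dB range from dominating the mixture fit.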
2. Clustering and Manifold Learning Methodology
Profile construction is performed by fitting a Gaussian Mixture Model (GMM) to the z-scored feature space. Each subject's nearest Bisgaard audiogram (N1–N7, S1–S3) is encoded as a one-hot vector, supporting interpretable stratification (Xu et al., 7 Jan 2026).
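The one-hot audiogram encoding can be illustrated as follows; the `one_hot_bisgaard` helper and its class ordering are assumptions for this sketch, not the published implementation:

```python
import numpy as np

# The ten standard Bisgaard audiogram classes (N1-N7, S1-S3).
BISGAARD = ["N1", "N2", "N3", "N4", "N5", "N6", "N7", "S1", "S2", "S3"]

def one_hot_bisgaard(label: str) -> np.ndarray:
    """Encode a listener's nearest Bisgaard class as a one-hot vector."""
    vec = np.zeros(len(BISGAARD))
    vec[BISGAARD.index(label)] = 1.0
    return vec
```

The resulting ten-dimensional vector can be concatenated with the z-scored continuous features prior to GMM fitting.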
Methodological highlights include:
- Full-covariance GMM: each mixture component is modeled with an unconstrained covariance matrix, and models are fitted for each prescribed component count N.
- Assignment: Listeners assigned according to maximum posterior probability.
- Dimensionality reduction and visualization:
  - Principal Component Analysis (PCA): captures global variance structure.
  - t-Distributed Stochastic Neighbor Embedding (t-SNE): preserves local clusters by minimizing the Kullback–Leibler divergence

$$C = \mathrm{KL}(P \,\|\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}},$$

with $p_{ij}$ the high-dimensional affinities and $q_{ij}$ the low-dimensional (Student's t-distributed) affinities.
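The fitting, assignment, and embedding steps above can be sketched with scikit-learn. The data below is synthetic, standing in for the z-scored OHHR features, and the component count is illustrative:

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(1)
# Toy z-scored feature matrix with three latent subgroups (illustrative).
X = np.vstack([rng.normal(m, 1.0, size=(60, 8)) for m in (-2.0, 0.0, 2.0)])

# Full-covariance GMM with a prescribed component count N.
gmm = GaussianMixture(n_components=3, covariance_type="full",
                      random_state=0).fit(X)

# Each listener is assigned to the profile with maximum posterior probability.
posteriors = gmm.predict_proba(X)
profiles = posteriors.argmax(axis=1)

# PCA captures global variance; t-SNE preserves local neighborhoods.
X_pca = PCA(n_components=2).fit_transform(X)
X_tsne = TSNE(n_components=2, perplexity=30,
              random_state=0).fit_transform(X)
```

The two embeddings serve complementary roles: PCA for interpretable variance directions, t-SNE for visual cluster separability.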
The federated architecture enables iterative, distribution-level merging of locally computed AP histograms without raw data sharing. Similarity scoring between profiles is based on the overlapping density of feature distributions,

$$O_f(A, B) = \int \min\!\big(p_A^{f}(x),\, p_B^{f}(x)\big)\, dx,$$

with overall profile similarity taken as the median per-feature overlap,

$$S(A, B) = \operatorname{median}_f\, O_f(A, B).$$

Iterative merges continue until a predefined stopping criterion signals the limit of clinically interpretable fusion, typically a sharp drop in $S$ together with a change in median overlap variance. The resulting global pipeline is extensible to new features and datasets (Saak et al., 2024).
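The histogram-overlap scoring can be sketched as follows. The median aggregation across features is an assumption here; the published pipeline reports median overlap statistics, but its exact aggregation rule may differ:

```python
import numpy as np

def feature_overlap(h_a: np.ndarray, h_b: np.ndarray) -> float:
    """Overlap of two feature histograms sharing the same bin edges.

    Each histogram is normalized to a probability mass function, so the
    overlap is the summed bin-wise minimum, in [0, 1].
    """
    h_a = h_a / h_a.sum()
    h_b = h_b / h_b.sum()
    return float(np.minimum(h_a, h_b).sum())

def profile_similarity(hists_a, hists_b) -> float:
    """Aggregate per-feature overlaps into one profile-level similarity."""
    overlaps = [feature_overlap(a, b) for a, b in zip(hists_a, hists_b)]
    return float(np.median(overlaps))
```

Only these histogram summaries cross site boundaries, which is what makes the merging privacy-preserving.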
3. Determining the Optimal Number of Profiles
Exploration of cluster counts (N=2–15) revealed two distinct minima in the normalized Davies–Bouldin (DB) index: a trivial solution at N=2 (normal vs. impaired) and a stable minimum at N=13, validated by ANOVA (p < 0.05) with post-hoc comparisons favoring N=13 over all alternatives except the trivial N=2 and N=3 solutions (Xu et al., 7 Jan 2026). Selecting N=13 yields high clustering quality (compact, well-separated clusters) without sacrificing clinically essential granularity.
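A sweep over candidate profile counts using the DB index can be sketched as follows. The data is a synthetic three-cluster toy set, and the paper's additional cross-N normalization of the index is omitted:

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.metrics import davies_bouldin_score

rng = np.random.default_rng(2)
# Synthetic data with three well-separated subgroups (illustrative).
X = np.vstack([rng.normal(m, 0.5, size=(50, 6)) for m in (-3.0, 0.0, 3.0)])

# Fit a full-covariance GMM for each candidate N and record the DB index.
db_by_n = {}
for n in range(2, 8):
    labels = GaussianMixture(n_components=n, covariance_type="full",
                             random_state=0).fit_predict(X)
    db_by_n[n] = davies_bouldin_score(X, labels)

best_n = min(db_by_n, key=db_by_n.get)
```

On this toy data the minimum falls at the true subgroup count; on real audiometric data the curve is flatter, which is why the ANOVA validation step matters.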
Within federated integration, the same criterion was used: merging proceeds until subsequent merges would combine strongly dissimilar clusters, as indicated by a drop in profile similarity and a change in median overlap variance. This approach consistently produces 13 interpretable, clinically meaningful APs (Saak et al., 2024).
4. Intrinsic Evaluation Metrics
To benchmark cluster integrity and separation, three normalized indices were utilized:
| Metric (normalized) | Formula (core) | Desired Direction |
|---|---|---|
| Davies–Bouldin (DB) | $\mathrm{DB} = \frac{1}{N}\sum_{i=1}^{N} \max_{j \neq i} \frac{s_i + s_j}{d(c_i, c_j)}$ | Lower is better |
| Calinski–Harabasz (CH) | $\mathrm{CH} = \frac{\operatorname{tr}(B)/(N-1)}{\operatorname{tr}(W)/(n-N)}$ | Higher is better |
| Silhouette Index (SI) | $s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}$ | Higher is better |

Here $s_i$ denotes the mean within-cluster dispersion, $d(c_i, c_j)$ the centroid distance, $B$ and $W$ the between- and within-cluster scatter matrices, and $a(i)$, $b(i)$ the mean intra-cluster and nearest-cluster distances. Each index is additionally normalized with respect to the cluster count N for cross-N comparability. Hearing4All with 13 profiles achieves normalized DB ≈ 0.19, outperforming simple baselines (DB ≈ 0.40) and nearly matching Bisgaard (DB ≈ 0.15); CH and SI values are moderate, exceeding general phenotype approaches (Xu et al., 7 Jan 2026). Visual cluster separability in t-SNE and PCA is enhanced by multimodal feature incorporation.
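Computing the three raw indices with scikit-learn might look like this (synthetic two-cluster data; the cross-N normalization is omitted):

```python
import numpy as np
from sklearn.metrics import (davies_bouldin_score,
                             calinski_harabasz_score,
                             silhouette_score)

rng = np.random.default_rng(3)
# Two well-separated synthetic clusters with known labels (illustrative).
X = np.vstack([rng.normal(m, 0.4, size=(40, 5)) for m in (-2.0, 2.0)])
labels = np.repeat([0, 1], 40)

db = davies_bouldin_score(X, labels)      # lower is better
ch = calinski_harabasz_score(X, labels)   # higher is better
si = silhouette_score(X, labels)          # higher is better, range [-1, 1]
```

In practice these would be evaluated on the GMM cluster assignments rather than known labels.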
5. Comparative Framework Analysis and Federated Extension
When evaluated on a common dataset, 13-class Hearing4All profiles demonstrate robust internal consistency and clear cluster separation, surpassing audiometric-phenotype and general-phenotype frameworks and performing comparably to Bisgaard, WHO HI grades, WARHICS levels, and BEAR profiles. The innovative multimodal feature integration allows t-SNE plots to reveal subclusters not captured by audiogram slope alone; PCA and t-SNE together confirm the superiority of manifold-based profile solutions (Xu et al., 7 Jan 2026).
Federated merging extends this capability, allowing the integration of profiles from multiple sources. AP histograms from disparate datasets (A: 13 APs, B: 31 APs) are iteratively merged based on the overlap of feature distributions, resulting in a single set of 13 global APs. This process maintains profile interpretability across clinical settings and supports data privacy (Saak et al., 2024).
6. Classification Models and Clinical Utility
Random Forest classifiers, trained on merged global APs, support assignment of new patients to profiles given only common features (Saak et al., 2024). Ten feature-set scenarios address practical use cases:
- “ALL” (full common set): Kappa ≈ 0.55, F₁ ≈ 0.60
- “APP” (smartphone): Kappa ≈ 0.60, F₁ ≈ 0.65
- “HA” (hearing-aid fitting): Kappa ≈ 0.50, F₁ ≈ 0.55
- Single-measure models: Kappa ≈ 0.40–0.45, F₁ ≈ 0.45–0.50
Highest performance is obtained using feature combinations including loudness scaling, audiogram, and SRT. The “APP” scenario enables remote triage using minimal, accessible input. Sensitivity, precision, and F₁ score values are reported per profile, confirming clinical viability for stratification and customized fitting.
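A minimal sketch of the profile-assignment classifier: a Random Forest trained on toy "common features" and scored with Cohen's kappa and macro-F₁. The data and the label rule below are synthetic stand-ins, not the merged APs or their measures:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import cohen_kappa_score, f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
# Toy "common features" (e.g., age, PTA, SRT); the profile label is driven
# by the first feature only, mimicking assignment from shared measures.
X = rng.normal(size=(300, 5))
y = np.digitize(X[:, 0], bins=[-0.5, 0.5])  # 3 pseudo-profiles

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
y_hat = clf.predict(X_te)

kappa = cohen_kappa_score(y_te, y_hat)
f1 = f1_score(y_te, y_hat, average="macro")
```

The same scoring (kappa and per-class F₁) underlies the reported scenario comparisons; restricting the columns of X emulates the "APP" or "HA" feature subsets.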
Within the clinical workflow, the assignment to one of 13 APs guides selection of hearing-aid parameters—including compression ratios, gain, and noise reduction—based on the characteristic deficits and dynamic range of each profile. The framework also facilitates homogeneous subgroup enrollment in clinical trials and offers a structure for federated learning-based refinement as new datasets are incorporated (Xu et al., 7 Jan 2026).
7. Generalizability, Extensibility, and Future Directions
The federated merging mechanism is scalable, requiring only local AP generation, sharing of feature histograms, and application of the profile-merging algorithm. As additional datasets are added, the number of global APs will converge, representing a fixed, population-based profile set. Feature extensibility allows future integration of novel auditory tests, maintaining backward compatibility (Saak et al., 2024). Adoption will benefit from harmonized metadata standards and containerized workflows.
A plausible implication is that global profile convergence will enable precise stratification across clinical and research sites, rendering audiological characterization both reproducible and interpretable. The pipeline’s privacy-preserving design supports regulatory compliance and collaborative research.
In summary, the Hearing4All auditory profiles framework integrates multimodal audiological data via robust GMM clustering and scalable federated merging, achieving high intrinsic validity, clinical interpretability, and practical applicability in personalized hearing healthcare and research contexts (Xu et al., 7 Jan 2026, Saak et al., 2024).