Bi-dendrograms for clustering the categories of a multivariate categorical data set

Published 19 Sep 2025 in stat.ME | (2509.15939v1)

Abstract: The clustering of categories in a multivariate categorical data set is investigated, where the problem separates into that of merging categories of the same variable (i.e., within-variable categories), and combining categories of different variables (i.e., between-variable categories). For the within-variable problem, the objective is to arrive at fewer categories (and, consequently, lower data dimensionality) without affecting the essential features of the data set, thereby simplifying the interpretation of any analysis using the categorical variables. The categories can be of an ordinal or nominal nature, and this property is respected in the clustering, where only adjacent categories of ordinal variables can be combined. For the between-variable problem, the objective is to arrive at a small number of category clusters that typify the observations in the data set. In this latter problem there is no restriction on which categories can combine, as long as they do not combine within the same variable. In each of these problems, results are given in the form of a pair of dendrograms stacked one on top of the other, called a bi-dendrogram. For the within-variable problem, once all categories within each variable have been merged, the second stage is to cluster the variables themselves. For the between-variable problem, the second stage is to cluster groups of respondents that fall into the response sets arrived at in the first stage of clustering. The approach is illustrated using a sociological survey data set from the International Social Survey Program.
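
The within-variable step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the merging criterion, so the code below assumes one: at each step, merge the pair of categories whose combination loses the least Pearson chi-square in a contingency table of the variable's categories (rows) against some other classification (columns). The function names `chi_square` and `merge_categories`, and the toy counts, are hypothetical and are not taken from the paper. The only feature drawn directly from the abstract is the constraint that ordinal variables may merge adjacent categories only, while nominal variables may merge any pair.

```python
from itertools import combinations


def chi_square(table):
    """Pearson chi-square statistic of a contingency table given as a list of rows."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / n
            if exp > 0:
                stat += (obs - exp) ** 2 / exp
    return stat


def merge_categories(table, labels, ordinal):
    """Greedily merge categories (rows) until one remains.

    Assumed criterion (not from the paper): smallest loss of chi-square.
    Returns a list of (label_a, label_b, loss) tuples in merge order,
    which is the information a dendrogram would be drawn from.
    """
    rows = [list(r) for r in table]
    names = list(labels)
    history = []
    while len(rows) > 1:
        if ordinal:
            # Ordinal variable: only neighbouring categories may combine.
            candidates = [(i, i + 1) for i in range(len(rows) - 1)]
        else:
            # Nominal variable: any pair of categories may combine.
            candidates = list(combinations(range(len(rows)), 2))
        current = chi_square(rows)
        best = None
        for i, j in candidates:
            merged = [a + b for a, b in zip(rows[i], rows[j])]
            trial = [r for k, r in enumerate(rows) if k not in (i, j)]
            trial.insert(i, merged)
            loss = current - chi_square(trial)
            if best is None or loss < best[0]:
                best = (loss, i, j, merged)
        loss, i, j, merged = best
        history.append((names[i], names[j], loss))
        rows[i] = merged
        names[i] = names[i] + "+" + names[j]
        del rows[j], names[j]
    return history


if __name__ == "__main__":
    # Toy counts: categories A and B have identical profiles, C differs.
    for step in merge_categories([[10, 20], [10, 20], [30, 5]], ["A", "B", "C"], ordinal=False):
        print(step)
```

Treated as nominal, the toy variable merges A and B first because their profiles are identical and the merge costs essentially no chi-square. If the same categories were ordinal with C lying between A and B, that merge would not be allowed at the first step, since A and B would not be adjacent.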
