- The paper uses topological data analysis (TDA) to map knowledge sharing on GitHub, after classifying 13 communication channels with the SECI model.
- The analysis shows that newer projects adopt a more diverse set of channels than older ones, and that usage patterns differ across ecosystems such as Bower and npm.
- The study highlights practical implications for maintainers by emphasizing the strategic integration of tacit and explicit channels to enhance collaboration.
"A Topological Analysis of Communication Channels for Knowledge Sharing in Contemporary GitHub Projects" (1905.03593)
Introduction
This paper investigates the communication channels used in GitHub projects, guided by the SECI model of knowledge management, which divides knowledge transfer into four processes: Socialization, Externalization, Combination, and Internalization. The authors examine how these channels capture and share knowledge, focusing on two of those processes: externalization and combination.
Methodology
The authors conduct a topological analysis of seven library ecosystems on GitHub, comprising over 210,000 projects. A preliminary study classified 13 communication channels by the form of knowledge they carry (tacit or explicit), using the SECI model. This classification informed a topological data analysis (TDA) that mapped the adoption and evolution of communication channels over three years (2015-2017). TDA represents data geometrically, which reveals structural patterns within complex datasets.
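To make the pipeline concrete, the following is a minimal sketch of a Mapper-style construction, one common TDA technique. It is not the authors' implementation: this summary does not state which TDA tool or parameters were used, and the channel names, toy projects, filter function (number of adopted channels), interval cover, and Hamming-distance clustering are all assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical channel names giving the meaning of each vector position;
# the paper's full list of 13 channels is not reproduced here.
CHANNELS = ["wiki", "issues", "contributing", "changelog"]

# Toy data (assumed): project name -> binary adoption vector over CHANNELS.
PROJECTS = {
    "a": (0, 1, 0, 0),
    "b": (0, 1, 0, 0),
    "c": (1, 1, 0, 0),
    "d": (1, 1, 1, 0),
    "e": (1, 1, 1, 1),
}


def hamming(u, v):
    """Number of channels on which two projects differ."""
    return sum(x != y for x, y in zip(u, v))


def single_linkage(names, data, max_dist):
    """Group names into connected components of the graph that links
    projects whose adoption vectors differ in at most max_dist channels."""
    clusters = []
    unvisited = set(names)
    while unvisited:
        seed = unvisited.pop()
        component, frontier = {seed}, [seed]
        while frontier:
            current = frontier.pop()
            near = {n for n in unvisited
                    if hamming(data[current], data[n]) <= max_dist}
            unvisited -= near
            component |= near
            frontier.extend(near)
        clusters.append(frozenset(component))
    return clusters


def mapper(data, intervals, max_dist=1):
    """Mapper-style graph: filter by channel count, cluster within each
    overlapping interval, and link clusters that share a project."""
    nodes = []
    for low, high in intervals:
        members = [n for n, vec in data.items() if low <= sum(vec) <= high]
        nodes.extend(single_linkage(members, data, max_dist))
    edges = {(i, j) for i, j in combinations(range(len(nodes)), 2)
             if nodes[i] & nodes[j]}
    return nodes, edges


# Overlapping cover of the filter range 1..4 (assumed).
nodes, edges = mapper(PROJECTS, intervals=[(1, 2), (2, 3), (3, 4)])
for index, node in enumerate(nodes):
    print(index, sorted(node))
print(sorted(edges))
```

On this toy input the result is a chain of three nodes, from projects with few channels to projects with many. Real data would need a cover and distance threshold tuned to the dataset.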
Figure 1: A typical TDA visualization shows the shape of the data; here it is used to analyze communication channels.
Discussion of Communication Channels
The study categorizes GitHub communication channels such as pull requests, wikis, and issues according to the form of knowledge they carry. Tacit knowledge is often personalized and context-specific, as seen in wikis and GitHub pages, whereas explicit knowledge is structured, as in issue trackers and changelogs.
The analysis revealed that newer projects increasingly adopt diverse communication channels, reflecting an evolving approach to knowledge sharing. The study found that "externalization" channels, especially those fostering new knowledge capture like wikis, were favored alongside "combination" channels that manage existing knowledge like issue trackers and contributing guidelines.
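The mapping described above can be written down as a small lookup table. The sketch below encodes only the three assignments stated in this summary (wikis as externalization; issue trackers and contributing guidelines as combination). The identifiers and the `profile` helper are hypothetical, and the paper's full classification of 13 channels is not reproduced.

```python
from collections import Counter

# Only the assignments stated in the text; identifiers are illustrative.
SECI_PROCESS = {
    "wiki": "externalization",      # captures new knowledge
    "issues": "combination",        # manages existing knowledge
    "contributing": "combination",  # manages existing knowledge
}


def profile(adopted_channels):
    """Count a project's adopted channels per SECI process.

    Channels without a stated assignment are counted as 'unclassified'.
    """
    return Counter(SECI_PROCESS.get(channel, "unclassified")
                   for channel in adopted_channels)


print(profile(["wiki", "issues", "contributing", "changelog"]))
```

A per-project profile like this makes it easy to check whether a project has at least one channel for capturing new knowledge alongside those that organize existing knowledge.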
Results and Findings
Key results indicate:
- Evolution Over Time: Younger GitHub projects diversify their communication tools more than older ones. Channels such as contributing guidelines have grown in prevalence, pointing to a focus on structured community contributions.


Figure 2: Projects created in 2015 illustrate an older reliance on fewer communication channels.
- Variance Across Ecosystems: Different library ecosystems exhibit distinct patterns of communication channel usage. For example, ecosystems like Bower heavily utilize wikis for tacit knowledge sharing, whereas npm emphasizes issue trackers.


Figure 3: Topology for Bower libraries shows a triangular pattern indicating diverse channel adoption.
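Both comparisons above (by creation year and by ecosystem) reduce to grouping projects and measuring channel adoption per group. The sketch below shows one way to compute this. The records, field names, and numbers are invented toy data, not results from the paper.

```python
from collections import defaultdict

# Toy records (assumed): ecosystem, creation year, and adopted channels.
RECORDS = [
    {"ecosystem": "npm", "year": 2015, "channels": {"issues"}},
    {"ecosystem": "npm", "year": 2017, "channels": {"issues", "contributing"}},
    {"ecosystem": "Bower", "year": 2015, "channels": {"wiki"}},
    {"ecosystem": "Bower", "year": 2017,
     "channels": {"wiki", "issues", "changelog"}},
]


def mean_channel_count(records, key):
    """Average number of adopted channels per group (e.g. per year)."""
    counts = defaultdict(list)
    for record in records:
        counts[record[key]].append(len(record["channels"]))
    return {group: sum(values) / len(values)
            for group, values in counts.items()}


def adoption_rate(records, key, channel):
    """Share of projects in each group that adopted the given channel."""
    flags = defaultdict(list)
    for record in records:
        flags[record[key]].append(channel in record["channels"])
    return {group: sum(values) / len(values)
            for group, values in flags.items()}


print(mean_channel_count(RECORDS, "year"))
print(adoption_rate(RECORDS, "ecosystem", "wiki"))
```

Grouping by `year` addresses the evolution question, while grouping by `ecosystem` addresses the variance question; the same two helpers serve both.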
Evaluation of TDA
The study uses TDA to navigate high-dimensional data and argues that it has an advantage over traditional methods such as principal component analysis (PCA) because it preserves a detailed multi-channel landscape. TDA identifies dominant channels within ecosystems, a level of detail that PCA's summarized dimensions tend to obscure.
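The contrast with PCA can be illustrated on toy data. Each principal component is a weighted combination of the original channel variables, so a single component can blend several channels into one axis. The sketch below computes the first principal component by power iteration in plain Python. The data and channel names are assumptions, and this is not a reproduction of the paper's evaluation.

```python
import math

CHANNELS = ["wiki", "issues", "changelog"]  # hypothetical subset

# Toy adoption matrix (assumed): half the projects use only the wiki,
# the other half only the issue tracker.
ROWS = [(1, 0, 0), (1, 0, 0), (0, 1, 0), (0, 1, 0)]


def covariance(rows):
    """Sample covariance matrix of the columns of rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, steps=50):
    """Leading eigenvector of cov via power iteration.

    Uses a fixed start vector, which is adequate for this toy matrix
    but not a robust choice in general.
    """
    vec = [1.0] + [0.0] * (len(cov) - 1)
    for _ in range(steps):
        vec = [sum(row[j] * vec[j] for j in range(len(vec))) for row in cov]
        norm = math.sqrt(sum(x * x for x in vec))
        vec = [x / norm for x in vec]
    return vec


pc1 = first_component(covariance(ROWS))
print(dict(zip(CHANNELS, (round(x, 3) for x in pc1))))
```

Here the first component weights wiki and issues equally with opposite signs, so the two channels collapse onto a single axis, while a per-group view of adoption would keep each channel separately visible.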
Implications for Practice
- Adoption of Varied Channels: Project maintainers are encouraged to integrate multiple communication channels to enhance knowledge flow and engagement.
- Dynamic Nature of Channels: The evolution of channels indicates the need for continuous adaptation of communication strategies to current development practices.
- Strategic Knowledge Management: Understanding which channels carry tacit knowledge and which carry explicit knowledge can help maintainers improve project visibility and community investment.
Conclusion
The paper underscores the role of communication channels in shaping knowledge dynamics on GitHub. Using topological insights, researchers and practitioners can better understand collaboration across ecosystems and adapt project strategies to emerging social coding practices. Future work could further explore the cross-sectional usage of channels and how channel quality affects developer interaction.