
An Empirical Comparison of Dependency Network Evolution in Seven Software Packaging Ecosystems

Published 13 Oct 2017 in cs.SE | (1710.04936v1)

Abstract: Nearly every popular programming language comes with one or more package managers. The software packages distributed by such package managers form large software ecosystems. These packaging ecosystems contain a large number of package releases that are updated regularly and that have many dependencies on other package releases. While packaging ecosystems are extremely useful for their respective communities of developers, they face challenges related to their scale, complexity, and rate of evolution. Typical problems are backward-incompatible package updates, and the risk of (transitively) depending on packages that have become obsolete or inactive. This manuscript uses the libraries.io dataset to carry out a quantitative empirical analysis of the similarities and differences between the evolution of package dependency networks for seven packaging ecosystems of varying sizes and ages: Cargo for Rust, CPAN for Perl, CRAN for R, npm for JavaScript, NuGet for the .NET platform, Packagist for PHP, and RubyGems for Ruby. We propose novel metrics to capture the growth, changeability, reusability and fragility of these dependency networks, and use these metrics to analyse and compare their evolution. We observe that the dependency networks tend to grow over time, both in size and in number of package updates, while a minority of packages are responsible for most of the package updates. The majority of packages depend on other packages, but only a small proportion of packages accounts for most of the reverse dependencies. We observe a high proportion of fragile packages due to a high and increasing number of transitive dependencies. These findings are instrumental for assessing the quality of a package dependency network, and improving it through dependency management tools and imposed policies.

Citations (216)

Summary

  • The paper introduces novel indices to quantify growth, changeability, reusability, and fragility in seven software ecosystems.
  • It employs survival analysis and regression models to uncover diverse evolution patterns and uneven update distributions.
  • Findings reveal deep dependency layers that heighten ecosystem fragility, informing improved dependency management strategies.

The paper "An Empirical Comparison of Dependency Network Evolution in Seven Software Packaging Ecosystems" investigates the dynamics of package dependency networks across seven software ecosystems: Cargo, CPAN, CRAN, npm, NuGet, Packagist, and RubyGems. It uses the libraries.io dataset to analyze how these networks evolve in terms of growth, changeability, reusability, and fragility.

Research Questions and Methodology

The study addresses four main research questions:

  1. Growth: How do package dependency networks grow over time?
  2. Changeability: How frequently are packages updated?
  3. Reusability: To what extent do packages depend on other packages?
  4. Fragility: How prevalent are transitive dependencies?
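
To make the distinction between direct and transitive dependencies concrete, here is a minimal sketch on a toy network. The package names and the `DEPENDENCIES` mapping are hypothetical and are not taken from the paper or from libraries.io; the traversal is an ordinary breadth-first search, not necessarily the authors' implementation.

```python
from collections import deque

# Hypothetical toy network: package -> set of packages it directly depends on.
DEPENDENCIES = {
    "app": {"web", "log"},
    "web": {"http", "log"},
    "http": {"util"},
    "log": {"util"},
    "util": set(),
}


def transitive_dependencies(package, dependencies):
    """Return every package reachable from `package` via dependency edges."""
    seen = set()
    queue = deque(dependencies.get(package, ()))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited; this also guards against cycles
        seen.add(current)
        queue.extend(dependencies.get(current, ()))
    return seen


direct = DEPENDENCIES["app"]
transitive = transitive_dependencies("app", DEPENDENCIES)
print("direct:", sorted(direct))
print("transitive:", sorted(transitive))
```

In this toy network, "app" declares two direct dependencies but transitively relies on four packages, which is the kind of gap the fragility question is concerned with.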

The authors use statistical techniques, including survival analysis and regression models, to identify trends within the networks. They also propose three novel indices, the Changeability Index, the Reusability Index, and the P-Impact Index, to quantify these characteristics and compare them across ecosystems.
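
This summary does not spell out how the indices are computed. As an illustration only, the sketch below assumes an h-index-style definition: the largest n such that at least n packages have a count of at least n, where the count is the number of updates (for changeability) or the number of dependent packages (for reusability). The function name `h_style_index` and all sample data are hypothetical.

```python
def h_style_index(counts):
    """Largest n such that at least n items have a count of at least n."""
    ranked = sorted(counts, reverse=True)
    n = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            n = rank
        else:
            break  # counts only get smaller from here on
    return n


# Hypothetical data: number of updates per package during one month.
updates_per_package = {"a": 9, "b": 4, "c": 3, "d": 3, "e": 1, "f": 0}
changeability = h_style_index(updates_per_package.values())

# Hypothetical data: number of dependent packages per required package.
dependents_per_package = {"util": 120, "log": 45, "http": 2, "web": 1}
reusability = h_style_index(dependents_per_package.values())

print("changeability:", changeability)
print("reusability:", reusability)
```

An index of this shape is insensitive to a single extreme package: "util" having 120 dependents rather than 45 does not change the result, which makes the value comparable across ecosystems of different sizes.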

Key Findings

  1. Continuous Growth: All ecosystems exhibit growth in the number of packages and dependencies, although the growth rate and its complexity vary. Some networks grow linearly, while others, notably npm, exhibit exponential growth in both packages and dependencies.
  2. Frequent Updates: Most ecosystems have stable or growing numbers of package updates over time. A minority of packages are responsible for the majority of updates, with updates concentrated in newer, less stable packages. Notably, CRAN imposes policies that result in fewer, but more stable, updates.
  3. Reusability Patterns: Dependencies are abundant, and most packages either depend on other packages or are required by them. Reverse dependencies are distributed very unequally, with a small number of packages having a large number of dependents. The study's Reusability Index shows increasing reuse over time in most ecosystems.
  4. High Fragility: Transitive dependencies contribute to ecosystem fragility, as they can propagate failures. The studied networks often have deep dependency layers, exacerbating this issue. The P-Impact Index highlights a growing number of "high-impact" packages that can influence a significant portion of the ecosystem upon failure.
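
The idea behind a "high-impact" package can be sketched as follows. The code assumes that the P-Impact Index counts the packages that are transitively required by at least P percent of all packages; that reading, the function names, and the toy network are assumptions made for illustration and are not confirmed by this summary.

```python
# Hypothetical toy network: package -> set of packages it directly depends on.
# Every package is assumed to appear as a key.
DEPENDENCIES = {
    "app": {"web", "log"},
    "web": {"http", "log"},
    "http": {"util"},
    "log": {"util"},
    "util": set(),
}


def reverse_transitive_counts(dependencies):
    """Map each package to the number of packages that transitively require it."""
    counts = {pkg: 0 for pkg in dependencies}
    for pkg in dependencies:
        seen = set()
        stack = list(dependencies[pkg])
        while stack:
            dep = stack.pop()
            if dep in seen or dep == pkg:
                continue  # skip revisits and do not count a package for itself
            seen.add(dep)
            stack.extend(dependencies.get(dep, ()))
        for dep in seen:
            counts[dep] = counts.get(dep, 0) + 1
    return counts


def p_impact_index(dependencies, p):
    """Count packages transitively required by at least p percent of all packages."""
    threshold = len(dependencies) * p / 100
    counts = reverse_transitive_counts(dependencies)
    return sum(1 for count in counts.values() if count >= threshold)


print(reverse_transitive_counts(DEPENDENCIES))
print("packages reaching 50% of the network:", p_impact_index(DEPENDENCIES, 50))
```

In the toy network only "util" is required, directly or indirectly, by at least half of the packages, so a breaking change there would reach most of the network.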

Practical and Theoretical Implications

The study underscores the importance of understanding package dependency networks in managing software ecosystems' growth and complexity. It reveals the challenges posed by frequent updates and the intricate propagation of dependencies, providing insights that could inform better dependency management tools and strategies.

On the theoretical side, the authors suggest that Lehman's laws of software evolution, typically applied to individual software systems, extend to ecosystems when adapted to network characteristics such as growth and complexity.

Future Work

The paper prompts further exploration of ecosystem-specific dynamics and of the socio-technical network effects of developer interactions. Future research might also extend the analyses to other ecosystems and integrate complex network theories to better understand the emergent structures governing these ecosystems.

In conclusion, the study provides a foundation for understanding dependency networks in software packaging ecosystems, offering both quantitative insights and qualitative discussion that can guide ecosystem management and tool development.
