
Modeling citation concentration through a mixture of Leimkuhler curves

Published 13 Jan 2024 in cs.DL and stat.AP | (2401.07052v1)

Abstract: When the cumulative percentage of total citations to articles, ordered from most cited to least cited, is plotted against the cumulative percentage of articles, we obtain a Leimkuhler curve. In this study, we observe that standard Leimkuhler functions may not be sufficient to provide accurate fits to various empirical informetric data. We therefore introduce a new approach to Leimkuhler curves by fitting a known probability density function to the initial Leimkuhler curve, taking into account the presence of a heterogeneity factor. As a contribution to the existing literature, we introduce a pair of mixture distributions (called PG and PIG) to bibliometrics. In addition, we present closed-form expressions for the Leimkuhler curves. Some measures of citation concentration are examined empirically for the basic models (based on the Power and Pareto distributions) and the mixed models derived from these. An application to two sources of informetric data was conducted to see how the mixing models outperform the standard basic models. The different models were fitted using non-linear least squares estimation.
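To make the definition in the abstract concrete, the following is a minimal sketch of how an empirical Leimkuhler curve could be computed from raw citation counts. It is an illustration, not the authors' code: the function names `leimkuhler_points` and `gini_from_leimkuhler` and the sample counts are hypothetical. The Gini estimate uses the standard relation G = 2 * (area under the Leimkuhler curve) - 1, which is one common concentration measure and is not necessarily among those the paper reports.

```python
def leimkuhler_points(citations):
    """Empirical Leimkuhler curve as a list of (p, K) points.

    p is the cumulative share of articles and K the cumulative share of
    citations, with articles ranked from most cited to least cited.
    """
    ranked = sorted(citations, reverse=True)
    total = sum(ranked)
    if total <= 0:
        raise ValueError("citation counts must sum to a positive number")
    n = len(ranked)
    points = [(0.0, 0.0)]
    running = 0
    for i, count in enumerate(ranked, start=1):
        running += count
        points.append((i / n, running / total))
    return points


def gini_from_leimkuhler(points):
    """Trapezoidal estimate of the Gini index: 2 * area under K(p) - 1."""
    area = 0.0
    for (p0, k0), (p1, k1) in zip(points, points[1:]):
        area += (p1 - p0) * (k0 + k1) / 2
    return 2 * area - 1


# Hypothetical citation counts for five articles (illustrative only).
sample = [1, 10, 3, 5, 1]
curve = leimkuhler_points(sample)
gini = gini_from_leimkuhler(curve)
```

Here the most cited article (one fifth of the articles) accounts for half of all citations, so the curve passes through (0.2, 0.5). A parametric Leimkuhler function could then be fitted to such points by non-linear least squares, as the abstract describes.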

