Evaluation indicators for open-source software: a review

Zhao, Yuhang; Liang, Ruigang; Chen, Xiang; Zou, Jing

doi:10.1186/s42400-021-00084-8

Cybersecurity

Table 1 Statistics of investigated papers. “Source” denotes where the projects were collected for investigation, “# Project” indicates the number of investigated projects, and “Data analysis methods” lists the methods used to reach the conclusions

Study	Source	# Project	Data analysis methods
Lerner (2005)	SourceForge	40,000	OLS¹
Colazo et al. (2009)	SourceForge	62	OLS and Cox regression
Sen et al. (2008)	SourceForge	196 responses	Multinomial logit analysis
Grewal (2006)	SourceForge	108	LCA¹⁵
Crowston et al. (2004)	SourceForge	122	PSM²
Garousi (2009)	SourceForge	8,627	N.A.
Crowston et al. (2003)	Surveys via SlashDot³	170	Atlas-ti¹³
Sen (2006)	FreshMeat	12,923	FIML⁴
Wu et al. (2007)	SourceForge	56	3SLS⁵
Stewart et al. (2005)	FreshMeat	147	MANCOVA⁶
Raymond (1999)	Fetchmail¹²	N.A.	N.A.
Fershtman et al. (2004)	SourceForge	71	GLS⁷
Subramaniam (2009)	SourceForge	8,627	Random-effects and linear regression
Midha et al. (2012)	N.A.	283	VIF⁸
Colazo (2005)	SourceForge	62	OLS
Tsay et al. (2012)	GitHub	N.A.	Separate negative binomial regressions
Homscheid et al. (2016)	Survey	321	Theory-driven approach
Spaeth et al. (2015)	Maemo and OpenMoko	N.A.	N.A.
Teigland et al. (2014)	eZ Publish	N.A.	Abductive approach
Guinan et al. (1998)	15 organizations	66 Teams	PCA¹⁶
English et al. (2007)	SourceForge	110,933	N.A.
Beecher (2008)	Debian⁹	50	GQM¹⁰ Method
Robinson and Vlas (2015)	SourceForge	31	Six-Vertex measurement model
Comino et al. (2007)	SourceForge	88,192	N.A.
Giuri et al. (2004)	SourceForge	N.A.	Multinomial logit analysis
Schweik (2009)	SourceForge	107,747	N.A.
Ghapanchi (2015)	N.A.	1,409	PLS¹¹
Chang (2018)	CFA and brigades’ Slack channels	143	Inferential statistics method
Ke and Zhang (2011)	SourceForge	233	PLS
Peng (2019)	GitHub	N.A.	OLS, GLM¹⁷, BLR¹⁸
Feitelson et al. (2006)	SourceForge	1,681	Least-squares analysis
Emanuel et al. (2010)	SourceForge	160,141	Data mining 2-itemset association rule
Tamura and Yamada (2007)	Fedora Core Linux	N.A.	Neural network and NHPP model
Norikane et al. (2018)	Qt project database	N.A.	Prediction model
Bao et al. (2019)	GitHub	917	Wilcoxon rank-sum test with Bonferroni correction
Yang et al. (2013)	Ohloh	N.A.	Regression data analysis
Hanoğlu and Tarhan (2019)	GitHub	17	Understand 5.1 and JASP
Crowston and Shamshurin (2017)	ASF¹⁴ Incubator	74	Violin plot
Joy et al. (2018)	GitHub	130	OLS
Chen et al. (2015)	N.A.	70	Data Analysis
Greene and Fischer (2016)	GitHub	1,000	N.A.
Rebouças et al. (new12-20)	GitHub	35,360	Fisher’s Exact Test
Hata et al. (020803)	GitHub	22	Game-theoretical models
Fronchetti et al. (020804)	GitHub	450	Random Forest and KSC clustering algorithm¹⁹

¹OLS: Ordinary least squares regression
²PSM: Parametric Survival Model
³Surveys via SlashDot: The data was collected by surveying developers via SlashDot, a popular Web-based discussion board
⁴FIML: Full Information Maximum Likelihood
⁵3SLS: Three-Stage Least-Squares regression
⁶MANCOVA: Multivariate analysis of covariance
⁷GLS: Generalized least squares regression
⁸VIF: Variance Inflation Factors
⁹Debian: The survey was conducted among Linux kernel developers
¹⁰GQM: Goal, Question, Metric method
¹¹PLS: Partial least squares regression
¹²Fetchmail: Full-featured IMAP and POP client
¹³Atlas-ti: A program used for qualitative research or data analysis
¹⁴ASF: Apache Software Foundation
¹⁵LCA: Latent class cluster analysis
¹⁶PCA: Principal component analysis
¹⁷GLM: Generalized linear model
¹⁸BLR: Bayesian linear regression
¹⁹KSC clustering algorithm: K-Spectral Centroid clustering algorithm
