Six Sigma as a Management System
The process measurement system and the problem-solving methodology are applied to process improvement that is directly related to the organization’s strategy (Motorola, 2011). Motorola has found that using Six Sigma as a measurement system and as a methodology is not enough to drive improvements in an organization (Motorola, 2011); Six Sigma is therefore also used as a management system for achieving the organization’s business strategy.

General Electric (GE) defines Six Sigma as follows: “Six Sigma is a highly disciplined process that helps us focus on developing and delivering near-perfect products and services, the central idea behind Six Sigma is that if you can measure how many ‘defects’ you have in a process, you can systematically figure out how to eliminate them and get as close to ‘zero defects’ as possible. To achieve Six Sigma Quality, a process must produce no more than 3.4 defects per million opportunities. An ‘opportunity’ is defined as a chance for nonconformance, or not meeting the required specifications” (Electronic, 2005).

iSixSigma defines it as follows: “Six Sigma is a rigorous and disciplined methodology that uses data and statistical analysis to measure and improve a company’s operational performance by identifying and eliminating ‘defects’ in manufacturing and service-related processes. Commonly defined as 3.4 defects per million opportunities, Six Sigma can be defined and understood at three distinct levels: measurement system, methodology and philosophy” (isixsigma, 2011).

Linderman et al. (2003) emphasize the need for a common definition of Six Sigma: “Six Sigma is an organized and systematic method for strategic process improvement and new product and service development that relies on statistical methods and the scientific method to make dramatic reductions in customer defined defect rates”.
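The 3.4 defects per million opportunities (DPMO) threshold quoted above can be illustrated with a short calculation. The following is a minimal sketch, not taken from the cited sources: the function names and the sample figures are hypothetical, and the 1.5 sigma long-term shift is the common Six Sigma convention, assumed here because the text does not state it.

```python
from statistics import NormalDist


def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Defects per million opportunities for a batch of inspected units."""
    total_opportunities = units * opportunities_per_unit
    if total_opportunities <= 0:
        raise ValueError("at least one opportunity is required")
    return defects / total_opportunities * 1_000_000


def sigma_level(dpmo_value: float, shift: float = 1.5) -> float:
    """Sigma level for a DPMO strictly between 0 and 1,000,000.

    shift=1.5 is the conventional long-term shift (an assumption here).
    """
    yield_fraction = 1.0 - dpmo_value / 1_000_000
    return NormalDist().inv_cdf(yield_fraction) + shift


# Hypothetical example: 17 defects in 1,000 units, 5 opportunities each.
example = dpmo(defects=17, units=1_000, opportunities_per_unit=5)
print(f"DPMO = {example:.0f}, sigma level = {sigma_level(example):.2f}")

# The 3.4 DPMO threshold corresponds to roughly six sigma under this convention.
print(f"sigma level at 3.4 DPMO = {sigma_level(3.4):.2f}")
```

Under the assumed shift, a process at 3.4 DPMO evaluates to about 6.0, which is where the name of the method comes from.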
ISBSG Data repository
In software engineering, the data collected for empirical studies are very important. Data repositories such as the ISBSG provide a free set of questionnaires to collect data on software projects, including software functional size measured with standard measurement methods recognized by ISO. ISBSG collects data in a repository in Australia and provides an extract of the data to practitioners and researchers in an MS-Excel file (see Figure 1.2).

Figure 1.2 Management of the ISBSG repository (Cheikhi, 2008)

The data collection questionnaire is available on the ISBSG website (www.isbsg.org/datacollection-questionnaires) and includes a large amount of quantitative and descriptive information on the different characteristics of a software project, such as project team effort by development phase and the development methods and techniques used. ISBSG provides its users with a dictionary of the terms and measures it has defined (ISBSG, 2013) to facilitate understanding of the questionnaire, to assist in collecting project data for the repository, and to standardize the way the collected data are analyzed. The questionnaire consists of seven sections broken down into several sub-sections.

For a modest license fee, ISBSG offers the public data collected from various organizations around the world, covering different methodologies, techniques and phases of the software life cycle, in a standard format (Cheikhi, 2008). For example, ISBSG provides data that are useful for multiple purposes, such as comparing productivity models and building effort estimation models (Cheikhi, 2008). Such models can be used by organizations to improve their capacity to plan and control projects. In addition, the ISBSG repository collects a large amount of numeric data on the different characteristics of a software project, covering its various phases from planning to completion (Cheikhi, 2008).
The ISBSG collects data related to software quality that span the entire software life cycle, from project initiation to project completion.
Deletion methods for treatment of missing values
Missing data deletion techniques consist of deleting the fields that contain missing data and, because of their simplicity, they are widely used (Roth, 1994). However, they may not make the most efficient use of the data, because such handling can introduce bias unless the values are Missing Completely at Random (Song and Shepperd, 2007). Consequently, they should be used only in situations where the amount of missing data is very small (Song and Shepperd, 2007). Researchers have been cautioned against using the deletion methods because they have been shown to have serious limitations (Schafer, 1997).
• Listwise deletion: Listwise deletion is also referred to as casewise deletion or complete case analysis. This method uses only the cases (projects) that have no missing values in their data fields. It may result in many observations being deleted, but its simplicity can make it desirable (Graham and Schafer, 1999). This method is generally acceptable only if there is a small number of missing values and the data are randomly missing within the data set being used (Song and Shepperd, 2007). Listwise deletion is the simplest technique: all cases with missing data are removed (Van Hulse and Khoshgoftaar, 2008). When the analyst discards every project with missing data on any of the selected variables and proceeds with the analysis using standard methods, the results of the analysis will be unbiased, provided the data are missing completely at random (Graham, 2012). However, this procedure can lead to a large loss of observations, which may leave a small data set if the number of missing data fields is high, in particular when the original data set is small; this situation often occurs in software project estimation (Myrtveit, Stensrud and Olsson, 2001; Song and Shepperd, 2007). If the deleted cases do not represent a random sample from the entire population, the inference will be biased (Mockus, 2008). Also, fewer cases result in less efficient inference (Mockus, 2008).
• Pairwise deletion: Pairwise deletion is also referred to as the available case method. This method considers each data field separately: for a given field, the cases that contain data are used and those that do not are excluded, which reduces the number of cases removed compared with the listwise deletion method (Bala, 2013); however, this approach changes the sample size for each data field considered. Note that pairwise deletion becomes equivalent to listwise deletion when all the data fields are needed for a particular analysis, e.g. multiple regression analysis (Bala, 2013). This method gives unbiased results if the data are randomly missing (Little and Rubin, 2014). Pairwise deletion needs at least three variables in order to differ from listwise deletion (Mockus, 2008).
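The difference between the two deletion methods described above can be seen in how many cases each one retains. The following is a minimal sketch on a small, hypothetical set of project records (the field names `size`, `effort` and `defects` and all values are invented for illustration; they are not ISBSG data).

```python
# Hypothetical project records; None marks a missing value.
projects = [
    {"size": 120,  "effort": 900,  "defects": 14},
    {"size": 300,  "effort": None, "defects": 35},
    {"size": 80,   "effort": 610,  "defects": None},
    {"size": None, "effort": 1500, "defects": 22},
    {"size": 210,  "effort": 1700, "defects": 25},
]
variables = ["size", "effort", "defects"]


def listwise(records, names):
    """Keep only the records with no missing value in any of `names`."""
    return [r for r in records if all(r[n] is not None for n in names)]


def pairwise(records, a, b):
    """Keep the records where both fields `a` and `b` are observed."""
    return [r for r in records if r[a] is not None and r[b] is not None]


complete = listwise(projects, variables)
print("listwise:", len(complete), "of", len(projects), "projects kept")

# Pairwise deletion keeps a different subset for each pair of variables.
for i, a in enumerate(variables):
    for b in variables[i + 1:]:
        kept = pairwise(projects, a, b)
        print(f"pairwise ({a}, {b}):", len(kept), "projects kept")
```

With these five records, listwise deletion keeps two projects, while each pair of variables keeps three under pairwise deletion, which illustrates both the larger per-analysis sample size and the fact that the sample changes from one analysis to the next.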
The advantage of this method is that the sample size for each individual analysis is generally higher than with the listwise method (Song and Shepperd, 2007). It is necessary when the overall sample size is small or the amount of missing data is large (Song and Shepperd, 2007). Pairwise deletion is a procedure that focuses on the variance-covariance matrix: each element of that matrix is estimated from all the data available for that element (Graham, 2012). Pairwise deletion uses all available data (Graham, 2012); however, there is no obvious way to estimate standard errors (Graham, 2012). It may also generate an inconsistent covariance matrix when multiple variables contain missing values, as mentioned before; the listwise deletion method, on the other hand, always generates consistent covariance matrices (Graham and Schafer, 1999). Since the pairwise deletion method uses all of the observed data, it should perform better than the listwise deletion method when the data are missing completely at random and the correlations are small (Little and Rubin, 2014), as shown in the Kim and Curry study (Graham and Schafer, 1999). Studies have shown that when the correlations are large, the listwise deletion method performs better than the pairwise deletion method (Azen and Van Guilder, 1981).

However, these methods lead to inefficient analyses and, more seriously, commonly produce severely biased estimates (Donders et al., 2006). There are other techniques for handling missing data, such as imputation techniques, that give much better results (Donders et al., 2006): these techniques are easily accessible and available in standard statistical software, such as SAS. Nevertheless, there seems to be a general lack of understanding that has limited their use by researchers (Donders et al., 2006).
Haitovsky (1968) stated that imputation techniques might perform better than deletion techniques when the data set contains a large amount of missing data or the mechanism leading to the missing data is non-random.
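The element-by-element estimation of the variance-covariance matrix under pairwise deletion can be sketched as follows. This is an illustrative sketch only: the column names and values are hypothetical, and the helper `pairwise_cov` is introduced here for the example rather than taken from any cited work.

```python
from itertools import combinations_with_replacement

# Hypothetical columns; None marks a missing value.
data = {
    "size":    [1.0, 2.0, 3.0, None, 5.0],
    "effort":  [2.0, None, 6.0, 8.0, 10.0],
    "defects": [None, 1.0, 2.0, 4.0, 3.0],
}


def pairwise_cov(xs, ys):
    """Sample covariance from the cases where both values are observed.

    Returns (covariance, number of cases used).
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / (n - 1)
    return cov, n


# Each element of the matrix uses whatever cases are available for that pair.
cov_matrix = {
    (a, b): pairwise_cov(data[a], data[b])
    for a, b in combinations_with_replacement(data, 2)
}
for (a, b), (cov, n) in cov_matrix.items():
    print(f"cov({a}, {b}) = {cov:.3f} from n = {n}")
```

Because the variances here are computed from four cases and the covariances from three, the elements do not all come from the same sample. This is the reason such a matrix is not guaranteed to be consistent, and why a single sample size for standard errors is not obvious.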
Table of contents
INTRODUCTION
CHAPTER 1 LITERATURE REVIEW
1.1 Introduction
1.2 Definition
1.2.1 Six Sigma as a Measurement System
1.2.2 Six Sigma as a Problem Solving Methodology
1.2.2.1 Six Sigma DMAIC
1.2.2.2 Design for Six Sigma (DFSS)
1.2.2.3 Six Sigma as a Management System
1.3 Six Sigma Concepts
1.4 Tools and techniques in Six Sigma
1.5 Challenges of Implementing Six Sigma: Strengths and Weaknesses
1.5.1 Weaknesses
1.5.2 Strengths
1.6 Critical success factors of implementing Six Sigma
1.7 Different views on applying Six Sigma in software organizations
1.8 Why software organizations should choose Six Sigma?
1.9 The International Software Benchmarking Standards Group (ISBSG)
1.9.1 ISBSG Data repository
1.9.2 ISBSG Internal View
1.9.3 Anonymity of the data collected
1.9.4 Extract data from the ISBSG data repository
1.10 The PRedictOr Models In Software Engineering (PROMISE) repository
1.11 Methods for treating the missing values
1.11.1 Deletion methods for treatment of missing values
1.11.2 Imputation methods
1.12 Techniques to deal with outliers
1.13 Defect estimation models
1.13.1 Regression techniques
1.13.2 Estimation models: Evaluation criteria
1.14 Literature review of ISBSG-based studies dealing with missing values
CHAPTER 2 RESEARCH GOAL, OBJECTIVES AND METHODOLOGY
2.1 RESEARCH GOAL AND MOTIVATION
2.2 RESEARCH OBJECTIVES
2.3 THE RESEARCH METHODOLOGY
CHAPTER 3 DATA PREPARATION
3.1 ISBSG data collection questionnaire
3.2 Quality-related Information in the ISBSG Questionnaire
3.3 Analysis of the quality-related data fields in the ISBSG MS-Excel data extract (Release 12 of 2013)
3.3.1 First level of data preparation
3.3.2 Second level of data preparation
3.4 Mapping of the ISBSG Questionnaire to Six Sigma methodologies (DMAIC and DFSS)
3.5 Analysis of software projects of ISBSG dataset N=360 projects
3.5.1 Software projects’ development type analysis results
3.5.2 Six Sigma projects’ type analysis
3.6 Imputation and Defect estimation activities
CHAPTER 4 SINGLE IMPUTATION (SI)
4.1 Introduction
4.2 Implement the Imputation technique for Total Defects field with missing values
4.3 Summary
CHAPTER 5 REGRESSION IMPUTATION (RI)
5.1 Introduction
5.2 Implement the Imputation technique for Total Defects field with missing values
5.3 Summary
CHAPTER 6 STOCHASTIC REGRESSION IMPUTATION (SRI)
6.1 Introduction
6.2 Implement the Imputation technique for Total Defects field with missing values
6.3 Summary
CHAPTER 7 VERIFICATION STRATEGY FOR THE IMPUTATION TECHNIQUES
7.1 Introduction
7.2 Verification strategy: creating artificially missing values from a complete dataset
7.3 Dataset preparation of verification strategy: artificially initiate subset with missing data
7.4 Verification analysis for original complete data set N=49 projects
7.5 Verification analysis for imputed data set of N=49 software projects by Single imputation technique
7.6 Verification analysis for imputed dataset of N=49 software projects by Regression imputation technique
7.7 Verification analysis for imputed dataset of N=49 software projects by Stochastic regression imputation technique
7.8 Summary of comparison of performance of SI, RI and SRI techniques on TD estimation models, N=49 projects
CHAPTER 8 SIX SIGMA ANALYSIS FOR SOFTWARE PROJECTS OF THE ISBSG DATA SET
8.1 Introduction
8.2 Sigma analysis results of software projects of ISBSG data set N=360 projects
8.3 Classification of software projects based on Sigma levels of imputed SRI Dataset N=360 projects for defect estimation purposes
8.4 Summary
CONCLUSION
FUTURE WORK AND RECOMMENDATIONS
ANNEX I LIST OF APPENDICES ON CD-ROM
BIBLIOGRAPHY