Data Mining and the Project Management Environment
According to Bala (2008), data mining deals with the principle of extracting knowledge from large volumes of data and picking out relevant information that finds application in various business decision-making processes. By its very nature, the project oriented environment deals extensively with data, information and knowledge for a wide spectrum of decision-making scenarios.
Private and public sector organisations that are involved in delivering projects normally possess a tremendous amount of data related to past and current projects. This voluminous historical projects data is often of low value by itself. However, its hidden potential needs to be exploited for various purposes within the project life cycle to ensure the achievement of the business objectives and, more specifically, corporate success. Executive management must seek ways to exploit data to add value to processes and create a new reality in terms of establishing innovative practices by capturing intelligence and knowledge across the organisation. Hence, the project oriented environment, with its extensive data generating capability and capacity, has a direct potential link with data mining application concepts for private and public sector organisations.
Overseas, data mining techniques have been successfully applied to various private sector industries in marketing, financial services, and health care. Governments are using data mining for improving service delivery, analysing scientific information, managing human resources, detecting fraud, and detecting criminal and terrorist activities. However, literature is scarce regarding the application of data mining to a project oriented environment.
Organisation’s value chain
An organisation’s value chain becomes an important notion when examining the application of data mining to the project oriented environment. One should note that when referring to an organisation’s value chain we are in reality referring to two separate, concurrent but complementary value chains. One portrays the physical value chain and the other depicts the informational value chain. The physical value chain is the transformation of tangible resources, such as materials and labour, into a finished product or service, while the informational value chain consists of the data necessary to transform tangible resources into a finished product or service. Both value chains are necessary, each supporting the other, and ultimately they shape the basis of the organisation’s business survival.
Admittedly, in the knowledge management literature there is a major difficulty in the use of consistent vocabulary (Hicks et al., 2006). The informational value chain in this context is viewed to be similar to the knowledge hierarchy as defined by Nissen (2000). This researcher viewed the knowledge hierarchy as the traditional concept of knowledge transformations, where data is transformed into information, and information is transformed into knowledge. This is a rather simplistic representation of data transformation. Hicks et al. (2006, 2007) extended the knowledge hierarchy by adding a new personal knowledge class (wisdom). Furthermore, Pyle (2003) and Wong (2004) refer to the knowledge value chain, where data is viewed as a detailed record of selected events that is first identified and created, then summarised and structured into information for a specific purpose, and finally transformed into knowledge through a structured framework. Reference to the informational value chain in this text should be viewed as incorporating the notions presented by these researchers. Data mining or knowledge discovery refers to the process of finding interesting information in large repositories of data (Ayre, 2006). Therefore, the informational value chain is viewed as fundamental to the application of data mining in private and public sector entities.
Data mining is the process of analysing data from different perspectives and summarising it into useful information; information that can be used to increase revenue, cut costs, or both (Palace, 1996). Hence, the focus of data mining in the project oriented environment context is the exploitation and application of the organisation’s vast repository of projects data to the projects that are in the pipeline or are being implemented. The aim is to ensure the maximum return on project completion with the consequence that the undertaken projects will have the intended impact on the private and public sector organisations’ business strategy.
Role of ICT
ICT plays a crucial role in bringing together the processes and data to populate the projects data warehouse that may be mined to determine operational patterns and resolve specific concerns for private and public sector organisations. Firstly, input data may consist of raw data that act as the input transactions for Management Information Systems (MIS), which generate the transactional databases, and/or of documented project experiences (such as business strategies; contracts and project scopes; various concerns and solutions; and various conflicts and their resolution) that are entered directly into the projects data warehouse without MIS filtering. Secondly, MIS provides information for the projects data warehouse and may also utilise its transactional databases as an input source for Decision Support Systems (DSS) and Executive Information Systems (EIS). Thirdly, DSS and EIS may, after executing the relevant business models, provide information and knowledge to the projects data warehouse. Finally, the projects data warehouse will consist of data, information, and knowledge that will be used by data mining methods for the resolution of a wide spectrum of project related concerns. The long term objectives are to reconcile the varying views of data; provide a consolidated view of enterprise data; create a central point for accessing and sharing analytical data; and develop an enterprise approach to business intelligence and reporting.
This concept is in accordance with the five-tier knowledge management hierarchy of Hicks et al. (2007), where the data warehouse is populated from various sources, including individual experience; databases; learning systems; DSS and EIS; knowledge pooling; best practices; expert systems; and corporate strategy.
Data mining may be applied at various project stages to support project management success. These include:
(a) Project proposal preparation and project scope. At this project phase an organisation must overcome three major hurdles: to outclass its competitors by promptly providing a precise and accurate project bid; to be awarded the project; and to execute the project within the defined project scope and the established contractual parameters to achieve its estimated profit margin.
Nemati and Barko (2002) argue that we are living in an age where information is quickly becoming the differentiator between industry leading firms and second rate organisations. The application of data mining at this level would focus on the analysis of the projects data warehouse to find similar project requirements configuration for projects that have already been undertaken by the organisation. This may be achieved through text mining and rules generation using classification and association of the projects data warehouse. This data mining application goes beyond the concept of exploring patterns and relationships within the projects data warehouse to discover hidden knowledge; it would aim at enhancing the decision-making process by transforming data and information into actionable knowledge and gaining a strategic competitive advantage.
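As a minimal sketch of the first step described above, finding past projects whose requirements resemble a new bid, the following Python example ranks stored project descriptions by cosine similarity of their word counts. The project identifiers, descriptions and function names are hypothetical and chosen only for illustration; a real application would draw on the projects data warehouse and would likely use a fuller text mining pipeline, with classification and association rules applied afterwards.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split a requirements description into lowercase words."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar_projects(new_requirements, past_projects):
    """Return (score, project_id) pairs, most similar project first."""
    query = Counter(tokenize(new_requirements))
    scored = [
        (cosine_similarity(query, Counter(tokenize(description))), project_id)
        for project_id, description in past_projects.items()
    ]
    return sorted(scored, reverse=True)


# Hypothetical extract from a projects data warehouse.
past_projects = {
    "P-001": "two storey office block with reinforced concrete frame and basement parking",
    "P-002": "rural road resurfacing and drainage works",
    "P-003": "office refurbishment with new concrete floor slab and parking area",
}

ranking = rank_similar_projects(
    "new office block with concrete frame and parking", past_projects
)
for score, project_id in ranking:
    print(f"{project_id}: {score:.2f}")
```

The highest ranked projects would then be the natural candidates from which to retrieve earlier bids, scopes and outcomes when preparing the new proposal.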
(b) Accurate Estimation of Time and Cost to Project Completion. At the project planning stage, data mining may be applied to prepare utility data for current project activities. The projects data warehouse is analysed using cluster analysis to identify similar activities that were conducted in previous projects, and the related estimate data and alternative methods of executing the planned activities are then extracted for the current project. This data mining application may be particularly beneficial to construction industry projects, where the resultant analysis may provide a combination of ways of executing particular activities utilising different equipment, crew sizes, and working hours. Admittedly, the resultant activity estimates and alternative methods would still need to be reviewed, but the overall effort and cost to conduct this essential planning task would be significantly decreased.
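The clustering idea can be sketched as follows. This is an illustrative k-means example only, assuming each past activity is described by two hypothetical features (quantity of work and crew size) with a recorded duration in days; the figures are invented, the starting centroids are fixed for repeatability, and in practice the features would need to be scaled and chosen with care.

```python
import math
from statistics import mean


def assign(point, centroids):
    """Index of the centroid nearest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, centroids, iterations=10):
    """Plain k-means: returns the final centroids and a cluster label per point."""
    for _ in range(iterations):
        labels = [assign(p, centroids) for p in points]
        updated = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(tuple(mean(dim) for dim in zip(*members)) if members else old)
        centroids = updated
    return centroids, [assign(p, centroids) for p in points]


# Hypothetical past activities: ((quantity_m3, crew_size), duration_days).
past_activities = [
    ((50, 4), 5), ((60, 4), 6), ((55, 5), 5),
    ((200, 8), 12), ((220, 10), 13), ((210, 9), 11),
]

features = [f for f, _ in past_activities]
centroids, labels = kmeans(features, [features[0], features[3]])

# Estimate a planned activity from the durations of its nearest cluster.
new_activity = (205, 8)
cluster = assign(new_activity, centroids)
similar_durations = [
    duration for (_, duration), label in zip(past_activities, labels) if label == cluster
]
estimate = mean(similar_durations)
print(f"Estimated duration: {estimate} days from {len(similar_durations)} similar activities")
```

The estimate here is simply the mean duration of the matching cluster; the same grouping could equally be used to list the alternative equipment and crew combinations recorded for those activities, which the planner would still review.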
(c) Occupational Health and Safety. Many projects involving the output of physical items, such as in the engineering and construction environments, regularly encounter occupational health and safety issues. The consequences of accidents during project execution may turn out to be very harmful, both in terms of human casualties and project cost escalation. Hence, in a project oriented environment, data mining tools such as learning and discovery algorithms may be used to determine which project activities, skills and/or resources may be more prone to occupational health and safety issues, so that appropriate steps are taken to mitigate or prevent adverse occurrences. Furthermore, decision trees and association rules may be used to detect anomalies in the way project activities are being carried out in relation to past projects and current regulatory standards (e.g. engineering, construction, and occupational health and safety standards).
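To illustrate how an association rule might flag activities that are more prone to incidents, the sketch below computes the support, confidence and lift of one candidate rule over a handful of invented activity records. The attribute names and records are hypothetical, and a practical application would search many candidate rules (for example with an Apriori-style algorithm) rather than test a single one.

```python
def support(transactions, itemset):
    """Fraction of records that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= record for record in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical activity records drawn from past projects.
records = [
    {"scaffolding", "night_shift", "incident"},
    {"scaffolding", "night_shift", "incident"},
    {"scaffolding", "day_shift"},
    {"excavation", "day_shift"},
    {"excavation", "night_shift"},
    {"scaffolding", "night_shift"},
    {"excavation", "day_shift", "incident"},
    {"scaffolding", "day_shift"},
]

antecedent = {"scaffolding", "night_shift"}
rule_support = support(records, antecedent | {"incident"})
rule_confidence = confidence(records, antecedent, {"incident"})
# Lift above 1 means incidents are more frequent under the antecedent than overall.
rule_lift = rule_confidence / support(records, {"incident"})

print(f"support={rule_support:.2f} confidence={rule_confidence:.2f} lift={rule_lift:.2f}")
```

A rule with a lift well above 1 would suggest that the combination of conditions deserves closer attention, although with so few records the figures here carry no statistical weight.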
(d) Preventative Maintenance of Plant and Equipment. Many project oriented organisations, particularly those involving engineering and construction, increasingly rely on profits generated from the high utilisation of plant and equipment. The unscheduled disruption in the use of plant and equipment during project execution not only incurs direct costs of labour, replacement parts and consumables, but also the consequential costs of delays to contract, possible loss of client goodwill and, ultimately, loss of profit. Srinivas and Harding (2008) propose a data mining integrated architecture model that provides a mechanism for continuous learning and may be applied to resolve concerns regarding process planning and scheduling, including extracting knowledge to establish rules for identifying maintenance interventions. Wang (2007) illustrates the use of data mining to solve a scheduled maintenance problem in a manufacturing shop, which may also be applicable to a project environment. Wang’s data mining application has two objectives: classification, to determine which subsystems or components are most responsible for downtime (the “root cause”); and prediction, to forecast when preventative maintenance would be most effective in reducing failures. In this example, classification and prediction were achieved by utilising decision trees. Finally, the generated information may be used to establish maintenance policy guidelines, such as a planned plant and equipment maintenance schedule.
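The core step of a decision tree, choosing the attribute that best separates failures from non-failures, can be sketched as below. This is not Wang's implementation; it is a minimal information-gain calculation over invented equipment records, with hypothetical attribute names, showing how a tree might single out the subsystem as the first split when looking for the root cause of downtime.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of the target after splitting on the attribute."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def best_split(rows, attributes, target):
    """Attribute with the highest information gain, i.e. the root of the tree."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical equipment inspection records.
rows = [
    {"subsystem": "hydraulics", "overdue_service": "yes", "failed": "yes"},
    {"subsystem": "hydraulics", "overdue_service": "no", "failed": "yes"},
    {"subsystem": "hydraulics", "overdue_service": "yes", "failed": "yes"},
    {"subsystem": "engine", "overdue_service": "yes", "failed": "no"},
    {"subsystem": "engine", "overdue_service": "no", "failed": "no"},
    {"subsystem": "electrical", "overdue_service": "no", "failed": "no"},
    {"subsystem": "electrical", "overdue_service": "yes", "failed": "yes"},
    {"subsystem": "engine", "overdue_service": "no", "failed": "no"},
]

root = best_split(rows, ["subsystem", "overdue_service"], "failed")
gain = information_gain(rows, root, "failed")
print(f"First split on '{root}' (information gain {gain:.2f} bits)")
```

A full decision tree would repeat this selection within each branch; here only the first split is shown, which is enough to indicate which attribute is most associated with failure in the sample.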
The above are just a few examples of how data mining techniques may be applied in a project management environment. However, it is essential to have a senior management executive who will act as the organisation’s data mining champion to support the long-term sustainability of the data mining investment. With such sponsorship in place, using data mining in a project oriented environment can help an entity to achieve greater corporate success.
A more advanced and detailed description of how data mining may be applied in a project management environment is found in my chapter in the Handbook of Research on Data Mining in Public and Private Sectors: Organizational and Government Applications (IGI Global, New York).