Preview: IEEE Transactions on Software Engineering
IEEE Transactions on Software Engineering
The IEEE Transactions on Software Engineering is an archival journal published monthly. We are interested in well-defined theoretical results and empirical studies that have potential impact on the construction, analysis, or management of software. The sc
Published: Mon, 3 Nov 2014 15:36:52 GMT
PrePrint: Using Traceability Links to Recommend Adaptive Changes for Documentation Evolution
Developer documentation helps developers learn frameworks and libraries, yet developing and maintaining accurate documentation requires considerable effort and resources. Contributors who work on developer documentation often need to manually track all changes in the code, determine which changes are significant enough to document, and then, adapt the documentation. We propose AdDoc, a technique that automatically discovers documentation patterns, i.e., coherent sets of code elements that are documented together, and that reports violations of these patterns as the code and the documentation evolves. We evaluated our approach in a retrospective analysis of four Java open source projects and found that at least 50% of all the changes in the documentation were related to existing documentation patterns. Our technique allows contributors to quickly adapt existing documentation, so that they can focus their documentation effort on the new features.
PrePrint: Who Will Stay in the FLOSS Community? Modelling Participant's Initial Behaviour
Motivation: To survive and succeed, FLOSS projects need contributors able to accomplish critical project tasks. However, such tasks require extensive project experience of Long Term Contributors (LTCs). Aim: We measure, understand, and predict how the newcomers’ involvement and environment in the Issue Tracking System (ITS) affect their odds of becoming an LTC. Method: ITS data of Mozilla and Gnome, literature, interviews, and online documents were used to estimate the number of participants and to design measures of involvement and environment. A logistic regression model was used to explain and predict contributor’s odds of becoming an LTC. We also reproduced the results on new data provided by Mozilla. Results: We constructed nine measures of involvement and environment based on events recorded in an ITS. Macro-climate is the overall project environment while micro-climate is person-specific and varies among the participants. Newcomers who are able to get at least one issue reported in the first month to be fixed, doubled their odds of becoming an LTC. The macro-climate with high project popularity and the micro-climate with low attention from peers reduced the odds. The precision of LTC prediction was 38 times higher than for a random predictor. We were able to reproduce the results with new Mozilla data without losing the significance or predictive power of the previously published model. We encountered unexpected changes in some attributes and suggest ways to make analysis of ITS data more reproducible. Conclusions: The findings suggest the importance of initial behaviors and experiences of new participants and outline empiricallybased approaches to help the communities with the recruitment of contributors for long-term participation and to help the participants contribute more effectively. To facilitate the reproduction of the study and of the proposed measures in other contexts we provide the data we retrieved and the scripts we wrote at https://www.passion-lab.org/projects/developerfluency.html
PrePrint: On the Effectiveness of Contracts as Test Oracles in the Detection and Diagnosis of Functional Faults in Concurrent Object-Oriented Software
Design by Contract (DbC) is a software development methodology that focuses on clearly defining the interfaces between components to produce better quality object-oriented software. Though there exists ample support for DbC for sequential programs, applying DbC to concurrent programs presents several challenges. Using Java as the target programming language, we tackle such challenges by augmenting the Java Modelling Language (JML) and modifying the JML compiler to generate Runtime Assertion Checking code to support DbC in concurrent programs. We applied our solution in a carefully designed case study on a highly concurrent industrial software system from the telecommunications domain to assess the effectiveness of contracts as test oracles in detecting and diagnosing functional faults in concurrent software. Based on these results, clear and objective requirements are defined for contracts to be effective test oracles for concurrent programs whilst balancing the effort to design them. Effort is measured indirectly through the Contract Complexity Measure (CCM), a measure we define. Main results include that contracts of a realistic level of completeness and complexity can detect around 76% of faults and reduce the diagnosis effort for such faults tenfold. We, therefore, show that DbC can be applied to concurrent software and can be a valuable tool to improve the economics of software engineering.
PrePrint: NLP-KAOS for Systems Goal Elicitation: Smart Metering System Case Study
This paper presents a computational method that employs Natural Language Processing (NLP) and text mining techniques to support requirements engineers in extracting and modeling goals from textual documents. We developed a NLPbased goal elicitation approach within the context of KAOS goal-oriented requirements engineering method. The hierarchical relationships among goals are inferred by automatically building taxonomies from extracted goals.We use smart metering system as a case study to investigate the proposed approach. Smart metering system is an important subsystem of the next generation of power systems (smart grids). Goals are extracted by semantically parsing the grammar of goal-related phrases in abstracts of research publications. The results of this case study show that the developed approach is an effective way to model goals for complex systems, and in particular, for the research-intensive complex systems.
PrePrint: A Component Model for Model Transformations
Model-Driven Engineering promotes an active use of models to conduct the software development process. In this way, models are used to specify, simulate, verify, test and generate code for the final systems. Model transformations are key enablers for this approach, being used to manipulate instance models of a certain modelling language. However, while other development paradigms make available techniques to increase productivity through reutilization, there are few proposals for the reuse of model transformations across different modelling languages. As a result, transformations have to be developed from scratch even if other similar ones exist. In this paper, we propose a technique for the flexible reutilization of model transformations. Our proposal is based on generic programming for the definition and instantiation of transformation templates, and on component-based development for the encapsulation and composition of transformations. We have designed a component model for model transformations, supported by an implementation currently targeting the Atlas Transformation Language (ATL). To evaluate its reusability potential, we report on a generic transformation component to analyse workflow models through their transformation into Petri nets, which we have reused for eight workflow languages, including UML Activity Diagrams, YAWL and two versions of BPMN.
PrePrint: Input-Sensitive Profiling
In this article we present a building block technique and a toolkit towards automatic discovery of workload-dependent performance bottlenecks. From one or more runs of a program, our profiler automatically measures how the performance of individual routines scales as a function of the input size, yielding clues to their growth rate. The output of the profiler is, for each executed routine of the program, a set of tuples that aggregate performance costs by input size. The collected profiles can be used to produce performance plots and derive trend functions by statistical curve fitting techniques. A key feature of our method is the ability to automatically measure the size of the input given to a generic code fragment: to this aim, we propose an effective metric for estimating the input size of a routine and show how to compute it efficiently. We discuss several examples, showing that our approach can reveal asymptotic bottlenecks that other profilers may fail to detect and can provide useful characterizations of the workload and behavior of individual routines in the context of mainstream applications, yielding several code optimizations as well as algorithmic improvements. To prove the feasibility of our techniques, we implemented a Valgrind tool called aprof and performed an extensive experimental evaluation on the SPEC CPU2006 benchmarks. Our experiments show that aprof delivers comparable performance to other prominent Valgrind tools, and can generate informative plots even from single runs on typical workloads for most algorithmically-critical routines.
PrePrint: Requirements engineering using the agent paradigm: a case study of an aircraft turnaround simulator
In this paper, we describe research results arising from a technology transfer exercise on agent-oriented requirements engineering with an industry partner. We introduce two improvements to the state-of-the-art in agent-oriented requirements engineering, designed to mitigate two problems experienced by ourselves and our industry partner: (1) the lack of systematic methods for agent-oriented requirements elicitation and modelling; and (2) the lack of prescribed deliverables in agent-oriented requirements engineering. We discuss the application of our new approach to an aircraft turnaround simulator built in conjunction with our industry partner, and show how agent-oriented models can be derived and used to construct a complete requirements package. We evaluate this by having three independent people design and implement prototypes of the aircraft turnaround simulator, and comparing the three prototypes. Our evaluation indicates that our approach is effective at delivering correct, complete, and consistent requirements that satisfy the stakeholders, and can be used in a repeatable manner to produce designs and implementations. We discuss lessons learnt from applying this approach.
PrePrint: Predicting Vulnerable Software Components via Text Mining
This paper presents an approach based on machine learning to predict which components of a software application contain security vulnerabilities. The approach is based on text mining the source code of the components. Namely, each component is characterized as a series of terms contained in its source code, with the associated frequencies. These features are used to forecast whether each component is likely to contain vulnerabilities. In an exploratory validation with 20 Android applications, we discovered that a dependable prediction model can be built. Such model could be useful to prioritize the validation activities, e.g., to identify the components needing special scrutiny.
PrePrint: Test Code Quality and Its Relation to Issue Handling Performance
Automated testing is a basic principle of agile development. Its benefits include early defect detection, defect cause localization and removal of fear to apply changes to the code. Therefore, maintaining high quality test code is essential. This study introduces a model that assesses test code quality by combining source code metrics that reflect three main aspects of test code quality: completeness, effectiveness and maintainability. The model is inspired by the Software Quality Model of the Software Improvement Group which aggregates source code metrics into quality ratings based on benchmarking. To validate the model we assess the relation between test code quality, as measured by the model, and issue handling performance. An experiment is conducted in which the test code quality model is applied to 18 open source systems. The test quality ratings are tested for correlation with issue handling indicators, which are obtained by mining issue repositories. In particular, we study the (1) defect resolution speed, (2) throughput and (3) productivity issue handling metrics. The results reveal a significant positive correlation between test code quality and two out of the three issue handling metrics (throughput and productivity), indicating that good test code quality positively influences issue handling performance.