Predicting Coding Effort in Projects Containing XML
Abstract: This paper studies the problem of predicting the coding effort for a subsequent year of development by analysing metrics extracted from project repositories, with an emphasis on projects containing XML code. The study considers thirteen open source projects and applies machine learning algorithms to generate models to predict one-year coding effort, measured in terms of lines of code added, modified and deleted. Both organisational and code metrics associated with revisions are taken into account. The results show that coding effort is highly determined by the expertise of developers while metrics have little effect on improving the accuracy of estimations of coding effort. The study also shows that models trained on one project are unreliable at estimating effort in other projects.
Open source software development should strive for even greater code maintainability
Unlike the traditional closed source software (CSS), OSS can be freely used, modified, and redistributed. Its source code is also freely accessible. A study of almost six million lines of code tracks how freely accessible source code holds up against time and multiple iterations
Static source code checking for user-defined properties
Only a small fraction of the output generated by typical static analysis tools tends to reveal serious software defects. There are two main causes for this phenomenon. The first is that the typical static analyzer casts its nets too broadly, reporting everything reportable, rather
Plaggie: GNU-licensed source code plagiarism detection engine for Java exercises
A source code plagiarism detection engine, Plaggie, is presented. It is a stand-alone Java application that can be used to check Java programming exercises. Plaggie's functionality is similar to that of the previously published JPlag web service, but unlike JPlag, Plaggie
A convolutional attention network for extreme summarization of source code
Attention mechanisms in neural networks have proved useful for problems in which the input and output do not have fixed dimension. Often there exist features that are locally translation invariant and would be valuable for directing the model's attention, but previous attentional
Using Heuristic Search Techniques To Extract Design Abstractions From Source Code
As modern software systems are large and complex, appropriate abstractions of their structure are needed to make them more understandable and, thus, easier to maintain. Software clustering tools are useful to support the creation of these abstractions. In this
Identifying authorship by byte-level n-grams: The source code author profile (SCAP) method
Source code author identification deals with identifying the most likely author of a computer program, given a set of predefined author candidates. There are several scenarios where digital evidence of this kind plays a role in investigation and adjudication, such as code
Structured generative models of natural source code
We study the problem of building generative models of natural source code (NSC); that is, source code written by humans and meant to be understood by humans. Our primary contribution is to describe new generative models that are tailored to NSC. The models are
Bimodal modelling of source code and natural language
We consider the problem of building probabilistic models that jointly model short natural language utterances and source code snippets. The aim is to bring together recent work on statistical modelling of source code and work on bimodal models of images and natural
Reading source code
Source code is, among other things, a text to be read. In this paper I argue that reading source code is a key activity in software maintenance, and that we can profitably apply experiences and reading systems from text databases to the problem of reading source
Visualizing Software Product Line Variabilities in Source Code
Implementing software product lines is a challenging task. Depending on the implementation technique, the code that realizes a feature is often scattered across multiple code units. This way it becomes difficult to trace features in source code, which hinders
Source code instrumentation and quantification of events
Aspect-Oriented Programming is making quantified programmatic assertions over programs that otherwise are not annotated to receive these assertions. Varieties of AOP systems are characterized by which quantified assertions they allow, what they permit in the
A UNIX clone with source code for operating systems courses
Students learn by doing, not by listening. Physicists and chemists have long understood this, which is why students in these fields are required to perform experiments in the laboratory and write up their findings. Computer scientists also realize this basic truth, so many courses
Architecture of a source code exploration tool: A software engineering case study
We discuss the design of a software system that helps software engineers (SEs) to perform the task we call just in time comprehension (JITC) of large bodies of source code. We discuss the requirements for such a system and how they were gathered by studying SEs at
Phishing websites detection based on phishing characteristics in the webpage source code
World Wide Web Consortium (W3C) is the international standards organization for the World Wide Web (WWW). It develops standards, specifications and recommendations to enhance the interoperability and maximize consensus about the content of the web and
IRiSS - A Source Code Exploration Tool
IRiSS (Information Retrieval based Software Search) is a software exploration tool that uses an indexing engine based on an information retrieval method. IRiSS is implemented as an add-in to the Visual Studio .NET development environment and it allows
Source code review of the Diebold voting system
This report is a security analysis of the Diebold voting system, which consists primarily of the AccuVote-TSX (AV-TSX) DRE, the AccuVote-OS (AV-OS) optical scanner, and the GEMS election management system. It is based on a study of the system's source code that we
Learning Unified Features from Natural and Programming Languages for Locating Buggy Source Code
Bug reports provide an effective way for end-users to disclose potential bugs hidden in a software system, while automatically locating the potential buggy source code according to a bug report remains a great challenge in software maintenance. Many previous studies
PDE4Java: Plagiarism Detection Engine For Java Source Code: A Clustering Approach
The educational community across the world is facing the increasing problem of plagiarism. This widespread problem has motivated the need for an efficient, robust and fast detection procedure, which is difficult to achieve manually. The Plagiarism Detection Engine for Java
Intent operationalisation for source code generation
In research on software development, there has been little progress towards an efficient general development methodology that is effective and sufficient for a wide range of software problems in different domains. Also a challenge of having a universal