An Analytical Method for Refactoring Object Oriented Code

We present a new method based on software analysis for refactoring object-oriented programs. The segment to be refactored is decomposed by a parser into structural elements and described by five sets of relations organized in sparse matrix format. Rows correspond to tuples, columns to variables in the segment, column partitions to classes, and row partitions to methods. Row and column permutations and rearrangement of partitions are refactorings. A heuristic search algorithm is proposed, which follows prescribed rules or menu selections. The output is a design for the given segment.
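The matrix layout described above can be pictured with a small sketch. It is an illustration under stated assumptions, not the representation used in the method itself: it models a single incidence relation rather than the five sets of relations, and the names `Segment`, `permute`, `repartition`, and `render`, as well as the two example tuples, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical sparse view of a code segment (one relation only)."""
    rows: tuple          # tuple labels, in display order
    cols: tuple          # variable names, in display order
    entries: frozenset   # sparse entries: (tuple label, variable) pairs
    row_parts: tuple     # row partitions, read here as methods
    col_parts: tuple     # column partitions, read here as classes


def permute(seg, row_order, col_order):
    """Reorder rows and columns; the sparse entries are left untouched."""
    assert sorted(row_order) == sorted(seg.rows)
    assert sorted(col_order) == sorted(seg.cols)
    return Segment(tuple(row_order), tuple(col_order), seg.entries,
                   seg.row_parts, seg.col_parts)


def repartition(seg, row_parts, col_parts):
    """Regroup rows into methods and columns into classes."""
    # Every row and every column must land in exactly one partition.
    assert sorted(r for p in row_parts for r in p) == sorted(seg.rows)
    assert sorted(c for p in col_parts for c in p) == sorted(seg.cols)
    return Segment(seg.rows, seg.cols, seg.entries,
                   tuple(tuple(p) for p in row_parts),
                   tuple(tuple(p) for p in col_parts))


def render(seg):
    """Draw the matrix: 'x' where a tuple mentions a variable, '.' elsewhere."""
    lines = []
    for r in seg.rows:
        cells = "".join("x" if (r, c) in seg.entries else "." for c in seg.cols)
        lines.append(f"{r} {cells}")
    return "\n".join(lines)


# Assumed example: t1 is "area = width * height", t2 is "cost = area * rate".
seg = Segment(
    rows=("t1", "t2"),
    cols=("width", "height", "area", "rate", "cost"),
    entries=frozenset({("t1", "width"), ("t1", "height"), ("t1", "area"),
                       ("t2", "area"), ("t2", "rate"), ("t2", "cost")}),
    row_parts=(("t1", "t2"),),                               # one method
    col_parts=(("width", "height", "area", "rate", "cost"),),  # one class
)

# Split into two methods and two classes without touching any entry.
moved = repartition(
    seg,
    row_parts=[("t1",), ("t2",)],
    col_parts=[("width", "height", "area"), ("rate", "cost")],
)

print(render(seg))
```

In this sketch both operations leave `entries` unchanged, which is one simple way to read the claim that permutations and rearrangements of partitions are refactorings: the relations are preserved and only their arrangement changes.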

Our work draws heavily from mathematical notation by establishing a parallel between the “refactoring” of equations and that of computer code. The method divides the refactoring problem into parts suitable for separate analysis, makes critical refactoring information readily available, and presents it visually, which helps convey the nature of refactoring. Examples illustrate the algorithm. The first two examples demonstrate how rules are applied, how classes and methods are defined, how variables are classified as attributes or as arguments, temporaries, or return values of methods, how different search paths yield different refactorings, and how models are nested hierarchically. The Fowler refactoring Move Method is illustrated. The third example starts from a poor UML design and produces two different good designs. It illustrates how designs can be refactored, the use of relational operations to refactor, and the conversion between conditional logic and polymorphism. The technology can automate refactoring, or it can be combined with development to enhance coherence and integration in program evolution.

Refactoring is a technology that improves the quality of existing code by applying a series of elementary behavior-preserving transformations called refactorings. Research in refactoring officially started in 1992 with the publication of William Opdyke's PhD thesis [1]. More recently, Martin Fowler published a catalog of refactorings in book form [2], listing more than 70 refactorings. An extensive survey of software refactoring was published by Tom Mens et al. [3]. Much of the work loosely clusters along two main directions: methods based on graph representations of the code and intended for automation, and methods that support manual operations such as finding segments of code that need improvement or deciding which refactorings to apply.
In the first group, we should mention the detection of “bad smells” [2, 4, 5, 6]; the application of design patterns to automate refactoring in Java, although the authors admit to possible errors [7]; feature decomposition of programs, where a feature is an increment in program functionality [8]; detection of duplicated code [9, 10, 11]; detection of program invariants [12]; and the use of dynamic data obtained at run time [13]. The use of code metrics is attracting attention, including the use of distance cohesion metrics to associate methods and variables of a class [14], a combination of code metrics and visualization [15], and object-oriented metrics for bad smell detection [14]. Efforts in the second group include [16, 17, 18, 19]. Graph transformation and rewriting were formalized and used to analyze dependencies in refactoring [20, 21]. M. Verbaere et al. [22] note that existing tools are language-specific and contain significant bugs, even in sophisticated development environments, and propose a language for refactoring that manipulates a graph representation of the program. There is also a third, incipient direction. T. Mens et al. [3] suggest that refactorings can be classified according to the code quality attributes they affect, such as robustness, extensibility, reusability, or performance, and then used to enhance the attributes of interest. They propose to estimate the effect by considering the internal quality attributes affected, such as code size, complexity, coupling, and cohesion, which can be measured if access to program structure information is available. Code metrics can be used to determine the effect of refactorings on maintainability [23] or quality [24]. Performance can be improved by replacing conditional logic with polymorphism [25]. There has been work on specialized refactoring systems, such as architectural refactoring of inherited software, particularly corporate software [26], and clustering techniques for architecture recovery [27].
In aspect-oriented programming, related work includes using program slicing to improve method and aspect extraction [28], role-based refactoring [29], role mapping [30], and aspect refactoring [31]. Refactoring of non-object-oriented code has also been considered [32]. Code refactoring has been linked to algebraic factoring [8] and, in the particular case of user interface design, to matrix algebra [33]. The need to teach refactoring in Computer Science curricula has now been recognized [34]. In spite of all these efforts, refactoring remains a complex and risky procedure.

We propose a method for refactoring object-oriented software in which the segment of code of interest is separated into elements. The method is behavior-preserving and static. It is based on code analysis, intended for automation, language-independent, applicable to object-oriented and non-object-oriented code, and inspired by mathematical notation. Mathematical notation is mature and precise. For centuries, mathematicians have been writing equations, condensing properties and behavior into objects, “refactoring” their equations to make them more meaningful and easier to manipulate, using patterns, and perfecting a notation that would allow them to express all that. For example, a symmetric positive definite matrix A is a
