[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PVS] 1 year fellowship for a doctoral researcher on program analysis

(apologies for multiple postings)


The University of Namur, Belgium, is seeking a young researcher
(no PhD) for a 1 year research project entitled

"Automatic program analysis for refactoring logic programs".

A short description follows. Interested candidates should send their CV before September 15, 2008 to Wim Vanhoof (wva@xxxxxxxxxxxxxxxx).


Program refactoring [3] is the process of systematically changing the structure of a
program without changing its semantics. The goal of refactoring is to improve the design
of the code after it has been written, in order to facilitate maintenance (including further
development) of the software. Emerged from the OO and XP communities, program
refactoring has recently gained attention in the fields of functional [8] and logic
programming [11]. Within the software engineering community, the process of
refactoring is considered important and has been identified as central to software
development and maintenance [3].

At the basis of the refactoring process is a catalogue of available source-to-source
transformations - the so-called refactorings. For each refactoring, a set of conditions is
specified under which the transformation is correct in the sense that it preserves the
semantics of the program. The activity of refactoring consists then in repeatedly searching
through the source code, looking for a code fragment of which the design could be
improved by a particular refactoring from the catalogue. The refactoring is subsequently
applied, and the whole process is repeated. Although each transformation can have an
impact (positive or negative) on the performance of the program, the primary aim of each
transformation is to improve the readability and maintainability of the code.
Even if refactoring is basically a manual process that is performed by the programmer, the
need for automation is recognized due to the time-consuming and error-prone nature of
the refactoring activity [3, 8, 10]. Although tools do exist that aid the developer with
performing a particular refactoring on a selected fragment of source code - an example
being the Refactoring Browser that was developed for Smalltalk [10] - identifying where
to perform (a particular) refactoring in the code remains essentially a non-trivial, creative
and therefore manual process.

Although a substantial number of different refactorings has been identified in the
literature, the primary indication for where to perform refactoring seems to be the
presence of duplicated code [3]. The notion of duplicated code is not limited to literally
copied code, but refers to fragments of code that have the same input/output behaviour.
The goal of refactoring is then to transform the source code in such a way that the
duplication is removed. This generally requires to extract the duplicated input/output
behaviour into a new subroutine (be it a method, function or predicate), which may
require a generalisation of the concerned code fragments and a possible reorganisation of
the code as a whole [3, 11].

Code duplication can be caused by a number of reasons. First of all, unfamiliarity of the
developer with the existing code body may result in reimplementation of routines that
already exist in the application under development. Second, the “copy and paste”
technique is commonly used when the existing functionality has to be slightly adapted.
Even if in this case one usually does not end up literally duplicating input/output
behaviour, the changes introduced by adaptation are usually relatively minor and a
generalization of the original and the adapted fragments could be often proposed. Finally,
code duplication might result from a polyvariant program analysis.

The problem of deciding whether two code fragments implement the same functionality is
well-known to be undecidable. Nevertheless, automatic program analysis techniques have
been developed, notably in the context of imperative and object-oriented languages, that
are capable to detect such equivalence under particular circumstances and within a certain
error margin, e.g. [1, 5, 6, 7, 9]. Also related is the work on plagiarism detection for
programs written in such languages, e.g. [4, 14, 13].

In this project, we will investigate the automatic detection of duplicated code in
declarative programming languages, in particular logic programming languages such as
Prolog and Mercury. Declarative languages allow the developer to program at a much
higher level of abstraction than it is the case with most imperative languages. In a
declarative language, one describes properties of the desired solution rather than the
actual algorithm that should be used to find the solution.

We aim to develop an analysis that allows to find, without user intervention, duplicated
code fragments into a logic program and we will study if and how the results of this
analysis can be used to drive a number of refactorings to remove the unwanted
duplication from the program. These refactorings include predicate extraction (replacing
duplicated code fragments by a call to a newly defined predicate), the removal of
predicates implementing the same relation as another predicate and the generalization of
duplicated code fragments into calls to newly generated higher-order predicate [11, 12].
The development of such a duplicated code analysis for logic programs, and its
integration with refactoring, presents some interesting research opportunities. Firstly, the
declarative nature of logic programs makes it not straightforward to adapt the methods
developed in an imperative (or object-oriented) setting. Secondly, the fact that logic
programs have a small and formally well-defined semantics and use an explicit symbolic
data representation makes the use of advanced analyses possible. Therefore it might well
be possible to obtain more fine-grained results than is the case for imperative languages.
Finally, it might be worthwhile to investigate how the developed analysis could be used in
the context of plagiarism detection for logic programs.

[1] B. S. Baker. On finding duplication and near-duplication in large software systems. In Proc. Second
IEEE Working Conference on Reverse Engineering, pages 86-95, July 1995.
[2] F. Degrave and W. Vanhoof. Towards a normal form for Mercury programs. In A. King, ed. LOPSTR
2007, volume 4915 of Lecture notes in computer science, pages 43-58. Springer-Verlag, 2007.
[3] M. Fowler, K. Beck, J. Brant, W. Opdyke, and D. Roberts. Refactoring : Improving the Design of
Existing Code. Object Technology Series. Addison-Wesley, 1999.
[4] S. Horwitz. Identifying the semantic and textual differences between two versions of a program. ACM
SIGPLAN Notices, 25(6) :234-245, 1990.
[5] T. Kamiya, S. Kusumoto, and K. Inoue. Ccfinder : A multilinguistic token-based code clone detection
system for large scale source code. IEEE Trans. Software Eng., 28(7): 654-670, 2002.
[6] R. Komondoor and S. Horwitz. Using slicing to identify duplication in source code. In Static Analysis
Symposium, pages 40-56, 2001.
[7] K. Kontogiannis, R. de Mori, E. Merlo, M. Galler, and M. Bernstein. Pattern matching for clone and
concept detection. Autom. Softw. Eng., 3(1/2) :77-108, 1996.
[8] H. Li, C. Reinke, and S. Thompson. Tool support for refactoring functional programs. In J. Jeuring,
editor, ACM SIGPLAN 2003 Haskell Workshop. ACM 2003.
[9] J. Mayrand, C. Leblanc, and E. Merlo. Experiment on the automatic detection of function clones in a
software system using metrics. In Intl. Conf. on Software Maintenance, pages 244-253, 1996.
[10] D. Roberts, J. Brant, and R. E. Johnson. A refactoring tool for Smalltalk. Theory and Practice of Object
Systems (TAPOS), 3(4) :253-263, 1997.
[11] A. Serebrenik, T. Schrijvers and B. Demoen. Improving Prolog programs: Refactoring for Prolog.
Theory and practice of logic programming (Accepted) 2008.
[12] W. Vanhoof. Searching semantically equivalent code fragments in logic programs. In S. Etalle, editor,
LOPSTR 2004, volume 3573 of Lecture notes in computer science, pages 1-18. Springer-Verlag, 2005.
[13] J. Winstead and D. Evans. Towards differential program analysis. In Proceedings of the 2003
Workshop on Dynamic Analysis, 2003.
[14] W. Yang. Identifying syntactic differences between two programs. Software Practice and Experience,
21(7): 739-755, 1991.