Type Migration in Ultra-Large-Scale Codebases

Abstract

Type migration is a refactoring activity in which an existing type is replaced with another one throughout the source code. Manually performing type migration is tedious as programmers need to find all instances of the type to be migrated, along with its dependencies that propagate over assignment operations, method hierarchies, and subtypes. Existing automated approaches for type migration are not adequate for large codebases – they perform an intensive whole-program analysis that does not scale. If we could represent the type structure of the program as graphs, then we could employ a MapReduce parallel and distributed process that scales to hundreds of millions of LOC. We implemented this approach as an IDE-independent tool called T2R, which integrates with most build systems. We evaluated T2R’s accuracy, usefulness and scalability on seven open source projects and one proprietary codebase of 300M LOC. T2R generated 130 type migration patches with 97% accuracy, out of which 98% were accepted by the original developers.

Refactoring, Type Migration, Ultra-large-scale codebases, MapReduce, ICSE

BibTeX

@inproceedings{Ketkar:ICSE:2019:T2R,
    author={Ketkar, Ameya and Mesbah, Ali and Mazinanian, Davood and Dig, Danny and Aftandilian, Edward},
    title={Type Migration in Ultra-Large-Scale Codebases},
    booktitle={Proceedings of the 41st International Conference on Software Engineering},
    series = {ICSE 2019},
    location = {Montreal, Canada},
    numpages = {12},
    year = 2019,
}