Understanding the use of lambda expressions in Java

Abstract

Java 8 retrofitted lambda expressions, a core feature of functional programming, into a mainstream object-oriented language with an imperative paradigm. However, we do not know how Java developers have adapted to the functional style of thinking and, more importantly, what motivates them to adopt functional programming. Without such knowledge, researchers miss opportunities to improve the state of the art, tool builders use unrealistic assumptions, language designers fail to improve upon their designs, and developers are unable to explore efficient and effective use of lambdas.

We present the first large-scale, quantitative and qualitative empirical study to shed light on how imperative programmers use lambda expressions as a gateway into functional thinking. In particular, we statically analyze the source code of 241 open-source projects with 19,770 contributors to study the characteristics of 100,540 lambda expressions. Moreover, we investigate the historical trends and adoption rates of lambdas in the studied projects. To get a complementary perspective, we seek the underlying reasons why developers introduce lambda expressions by surveying 97 developers who are introducing lambdas in their projects, using the firehouse interview method.

Among other results, our findings reveal an increasing trend in the adoption of lambdas in Java: in 2016, the ratio of lambdas introduced per added line of code increased by 54% compared to 2015. Lambdas were used for various reasons, including but not limited to (i) making existing code more succinct and readable, (ii) avoiding code duplication, and (iii) simulating lazy evaluation of functions. Interestingly, we found that developers are using Java's built-in functional interfaces inefficiently, i.e., they prefer general functional interfaces over the specialized ones, overlooking the performance overheads that might be imposed. Furthermore, developers are not adopting techniques from functional programming, e.g., currying. Finally, we present the implications of our findings for researchers, tool builders, language designers, and developers.
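To make the terms in the findings above concrete, here is a minimal Java sketch of a general versus a specialized functional interface, lazy evaluation through a `Supplier`, and currying. The class name `LambdaFindingsSketch` and all member names are hypothetical and chosen only for illustration; none of this code is taken from the studied projects or from the paper.

```java
import java.util.function.Function;
import java.util.function.IntUnaryOperator;
import java.util.function.Supplier;

// Hypothetical illustration; not code from the paper or the analyzed projects.
class LambdaFindingsSketch {

    // General interface: arguments and results are boxed Integer objects.
    static final Function<Integer, Integer> DOUBLE_BOXED = x -> x * 2;

    // Specialized interface: works directly on primitive ints, so no boxing occurs.
    static final IntUnaryOperator DOUBLE_PRIMITIVE = x -> x * 2;

    // Lazy evaluation: the supplier is invoked only when verbose is true.
    static String describe(boolean verbose, Supplier<String> details) {
        return verbose ? "details: " + details.get() : "skipped";
    }

    // Currying: two-argument addition written as a chain of one-argument functions.
    static final Function<Integer, Function<Integer, Integer>> ADD = a -> b -> a + b;

    public static void main(String[] args) {
        System.out.println(DOUBLE_BOXED.apply(21));
        System.out.println(DOUBLE_PRIMITIVE.applyAsInt(21));
        // The supplier below would throw if evaluated; it is never called here.
        System.out.println(describe(false, () -> {
            throw new IllegalStateException("never evaluated");
        }));
        System.out.println(ADD.apply(2).apply(3));
    }
}
```

Both doubling functions compute the same result; the difference lies only in whether values pass through `Integer` wrappers, which is the kind of overhead the abstract refers to.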

Empirical Studies, Java 8, Firehouse Interview Method, Lambda Expressions, OOPSLA

Artifacts

  • The paper can be downloaded here.
  • See the list of the analyzed projects and navigate through the detected lambda expressions here.
  • The CSV files for the analysis can be downloaded here. The format of the files is as follows:

    File                              Description
    authors-commits-lambds-*.csv      Data for each author: # of lambdas and # of lines of code introduced
    lambda-method-location-*.csv      Location info for each lambda (new method, new class, existing method)
    lambdas-added-per-revision-*.csv  Data for each commit (author, committer, # of lambdas and LOC added)
    size-age-*.csv                    Data for the age of the projects and the first time lambdas were introduced
    size-authors-*.csv                Data for the # of authors per project
    size-commits-*.csv                Data for the # of commits per project
    size-java-files.csv               Data for the # of Java files per project
    tags.csv                          Data for the assigned tags
    tags-automatic-vs-manual.csv      Data for the tools used
    test-vs-production                Data indicating whether each lambda is in test or production code

    (Note that for some of the files we collected the data on three machines, so the * in a file name stands for the name of the machine used.)

  • The R scripts used for the analysis and for creating the plots for the paper can be downloaded here. The functions defined in the R scripts are as follows:

    Function Name               Description
    load()                      Loads the CSV files into the environment. MUST be run before everything else.
    getSizeCommitsPlot()        Gets the violin plot for the number of commits in the analyzed projects
    getSizeAgePlot()            Gets the violin plot for the age of the analyzed projects
    getSizeAuthorsPlot()        Gets the violin plot for the number of authors of the analyzed projects
    getSizeJavaFilesPlot()      Gets the violin plot for the size of the Java files
    lambdaTools()               Gets info about the IDEs used by the surveyed developers and the degree of automation (RQ4, Fig. 9)
    lambdasTrend()              Gets Kendall’s tau for each project (RQ1, Fig. 3)
    lambdasPerCommitOutliers()  Projects having commits with an outlier number of lambdas introduced (RQ4)
    lambdaEnthusiasts()         List of lambda enthusiasts (RQ3)
    whoIsMakingLambdas(F)       Ratio of developers making lambdas over team size (RQ3, Fig. 5.b)
    whoIsMakingLambdas(T)       Distribution of ratios of lambdas per developer (RQ3, Fig. 5.a)
    outsiderVsCore()            Ratio of lambdas per line for outsiders vs core developers (RQ3, Fig. 6)
    tagsInfo()                  Tags (RQ5)
    productionVsTest()          Gets info about lambdas in production vs test code (RQ2, Fig. 4)
  • We used a modified version of RefactoringMiner (a tool developed by Nikolaos Tsantalis that detects refactorings in the history of Java systems) to detect newly introduced lambda expressions in successive versions of the code. We call this fork LambdaMiner. Note that the version of LambdaMiner used for this publication is behind RefactoringMiner’s head; we plan to merge the changes as soon as possible.
  • The (anonymized) email data will be available upon request.
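
The lambdasTrend() function listed above reports Kendall’s tau, a rank correlation that indicates whether a series tends to rise or fall over time. As a rough illustration of the statistic only, here is a minimal Java sketch of the tau-a variant (ties count as neither concordant nor discordant). The class name, method name, and sample values are hypothetical; this is not the authors’ R implementation, which may use a different variant or tie handling.

```java
// Hypothetical illustration of Kendall's tau-a; not the authors' R code.
class KendallTauSketch {

    // tau-a = (concordant pairs - discordant pairs) / (n * (n - 1) / 2)
    static double kendallTau(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two series of equal length >= 2");
        }
        int n = x.length;
        long concordant = 0;
        long discordant = 0;
        for (int i = 0; i < n - 1; i++) {
            for (int j = i + 1; j < n; j++) {
                // The sign product is positive when both series move in the same direction.
                double s = Math.signum(x[j] - x[i]) * Math.signum(y[j] - y[i]);
                if (s > 0) {
                    concordant++;
                } else if (s < 0) {
                    discordant++;
                }
            }
        }
        return (concordant - discordant) / (n * (n - 1) / 2.0);
    }

    public static void main(String[] args) {
        // Made-up data: time index vs. ratio of lambdas per added line of code.
        double[] period = {1, 2, 3, 4, 5};
        double[] lambdaRatio = {0.10, 0.20, 0.15, 0.30, 0.40};
        System.out.println(kendallTau(period, lambdaRatio));
    }
}
```

A value near 1 suggests a consistently increasing trend, a value near -1 a decreasing one, and a value near 0 no monotonic trend.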

BibTeX

@article{MKTD:OOPSLA:2017,
   author={Mazinanian, Davood and Ketkar, Ameya and Tsantalis, Nikolaos and Dig, Danny},
   title={Understanding the use of lambda expressions in Java},
   year = 2017,
   month = Oct,
   volume = 1,
   number = {OOPSLA},
   journal = {Proc. ACM Program. Lang.}
}