Characteristics From Literature

On this page we report the characteristics of code recommender systems that we extracted by inspecting papers in the literature. In particular, we focused on papers that present:

Techniques for code generation/completion;
Empirical studies about code generation/completion techniques;
Techniques to generate code examples;
Empirical studies about code examples used by developers.

Note that, to define this list, we did not perform a systematic literature review to identify all papers in the surveyed areas. Instead, we relied on the experience of the six authors to identify a set of 41 peer-reviewed papers published in international conferences and journals, and applied backward snowballing on their references to identify additional relevant works. In the end, we inspected 53 papers.

Each paper was assigned to one author, who was responsible for extracting a list of "characteristics" described in the paper (if any). In some cases, the papers from which we extracted a characteristic did not explicitly state the need to consider it when building code recommender systems; however, this could be inferred from the text of the paper.

The following table presents the 14 extracted characteristics:

#	Attribute	Description	References
1	Characteristics of recommended code > Code quality > High readability and understandability > Concise Code	The recommended code must be as short and simple as possible.	[ 1 , 2 , 3 ]
2	Characteristics of recommended code > Code quality > Correctness	The recommended code must be bug-free.	[ 1 ]
3	Awareness > Developers' knowledge	If multiple recommendations are possible, the one using code that is more familiar to the developer must be used (e.g., the code using APIs already used in the past by the developer receiving the recommendation).	[ 4 ]
4	Characteristics of recommended code > Code quality > High readability and understandability	The recommended code must be readable (e.g., avoid very long statements, adopt indentation) and easy to understand.	[ 5 ]
5	Characteristics of recommended code > Code quality > High reusability	The recommended code is easy to reuse. This is particularly relevant for recommender systems suggesting code components at higher granularity (e.g., entire methods).	[ 5 ]
6	Characteristics of recommended code > Provides additional information > Documentation > Commented code	The recommended code features comments documenting the statements.	[ 2 ]
7	Characteristics of recommended code > Adaptive > Developer's coding style	The recommended code is adapted to the developer's coding style (e.g., using extra parentheses to better format code if this is a practice frequently adopted by the developer).	[ 6 ]
8	Characteristics of recommended code > Code quality > Meets best coding practices > Meets company/client standards	The recommended code can be configured to meet specific best coding practices defined by the company/client (e.g., if variables are named with camelCase, the same convention must be adopted in the recommended code).	[ 6 ]
9	Characteristics of recommended code > Structural characteristics > Precise typing information	The recommended code can be configured to meet specific best coding practices defined by the company/client (e.g., if variables are named with camelCase, the same convention must be adopted in the recommended code).	[ 7 ]
10	Usability > High responsiveness	The recommender generates recommendations quickly enough that it does not cause lag while coding.	[ 8 ]
11	Characteristics of recommended code > Adaptive > Coding context > Identifier names	The recommended code is adapted to the context of the recommendation by using, when possible, the same variable names as the code it completes.	[ 7 ]
12	Characteristics of recommended code > Code quality > Correctness > Correct syntax	The recommended code must not introduce syntax errors.	[ 9 , 10 , 8 ]
13	Characteristics of recommended code > Code quality > High readability and understandability > Step-by-step solution	If the recommended code spans many statements, it is divided into multiple chunks (separated by blank lines), each one responsible for a sub-task.	[ 2 ]
14	Characteristics of recommended code > Code quality > Correctness > Bug/Vulnerability free	The recommended code does not introduce bugs and/or vulnerabilities.	[ 11 ]
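
To make two of the characteristics above more concrete, here is a minimal Python sketch of how a recommender could post-process its candidate completions: it discards candidates that would introduce a syntax error (characteristic 12) and ranks the remaining ones from shortest to longest (a rough stand-in for characteristic 1). This is only an illustration and is not taken from any of the surveyed papers: the function names `is_syntactically_valid` and `filter_and_rank` are hypothetical, and using character length as a proxy for conciseness is an assumption.

```python
import ast


def is_syntactically_valid(prefix: str, candidate: str) -> bool:
    """Return True if the code being completed plus the candidate parses as Python."""
    try:
        ast.parse(prefix + candidate)
        return True
    except SyntaxError:
        return False


def filter_and_rank(prefix: str, candidates: list[str]) -> list[str]:
    """Drop candidates that introduce syntax errors, then order shortest first.

    Length is only an assumed, crude proxy for "concise code".
    """
    valid = [c for c in candidates if is_syntactically_valid(prefix, c)]
    return sorted(valid, key=len)


if __name__ == "__main__":
    prefix = "total = "
    candidates = ["sum(v for v in values)", "sum(values", "sum(values)"]
    for completion in filter_and_rank(prefix, candidates):
        print(prefix + completion)
```

The check only parses the code and never runs it, so it says nothing about the broader correctness characteristics (2 and 14), which would require testing or analysis beyond parsing.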

References:

[1] J. Kim, S. Lee, S. Hwang, and S. Kim. Adding examples into java documents. In 2009 IEEE/ACM International Conference on Automated Software Engineering, pages 540–544, 2009.

[2] S. M. Nasehi, J. Sillito, F. Maurer, and C. Burns. What makes a good code example?: A study of programming Q&A in stackoverflow. In 2012 28th IEEE International Conference on Software Maintenance (ICSM), pages 25–34, 2012.

[3] Vincent J Hellendoorn, Sebastian Proksch, Harald C Gall, and Alberto Bacchelli. When code completion fails: A case study on real-world completions. In 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE), pages 960–970. IEEE, 2019.

[4] Madhuri R Marri, Suresh Thummalapenta, and Tao Xie. Improving software quality via code searching and mining. In 2009 ICSE Workshop on Search-Driven Development-Users, Infrastructure, Tools and Evaluation, pages 33–36. IEEE, 2009.

[5] Laura Moreno, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto, and Andrian Marcus. How can I use this method? In Proceedings of the 37th International Conference on Software Engineering - Volume 1, ICSE ’15, pages 880–890, 2015.

[6] Htoo Htoo Sandi Kyaw, Shwe Thinzar Aung, Hnin Aye Thant, and Nobuo Funabiki. A proposal of code completion problem for java programming learning assistant system. In Conference on Complex, Intelligent, and Software Intensive Systems, pages 855–864. Springer, 2018.

[7] Daniel Perelman, Sumit Gulwani, Thomas Ball, and Dan Grossman. Type-directed completion of partial expressions. In Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’12, pages 275–286, 2012.

[8] Alexey Svyatkovskiy, Shao Kun Deng, Shengyu Fu, and Neel Sundaresan. Intellicode compose: Code generation using transformer. arXiv preprint arXiv:2005.08025, 2020.

[9] Luís Eduardo de Souza Amorim, Sebastian Erdweg, Guido Wachsmuth, and Eelco Visser. Principled syntactic code completion using placeholders. SLE 2016, pages 163–175, 2016.

[10] Wenhan Wang, Sijie Shen, Ge Li, and Zhi Jin. Towards full-line code completion with neural language models. arXiv preprint arXiv:2009.08603, 2020.

[11] Roei Schuster, Congzheng Song, Eran Tromer, and Vitaly Shmatikov. You autocomplete me: Poisoning vulnerabilities in neural code completion, 2020.