In-IDE Generation-based Information Support with a Large Language Model

Abstract:

Understanding code is challenging, especially when working in new and complex development environments. Code comments and documentation can help, but are typically scarce or hard to navigate. Large language models (LLMs) are revolutionizing the process of writing code. Can they do the same for helping understand it? In this study, we provide a first investigation of an LLM-based conversational UI built directly in the IDE that is geared towards code understanding. Our IDE plugin queries OpenAI’s GPT-3.5 and GPT-4 models with four high-level requests without the user having to write explicit prompts: to explain a highlighted section of code, provide details of API calls used in the code, explain key domain-specific terms, and provide usage examples for an API. The plugin also allows for open-ended prompts, which are automatically contextualized to the LLM with the program being edited. We evaluate this system in a user study with 32 participants, which confirms that using our plugin can aid task completion more than web search. We additionally provide a thorough analysis of the ways developers use, and perceive the usefulness of, our system, among others finding that the usage and benefits differ significantly between students and professionals. We conclude that in-IDE prompt-less interaction with LLMs is a promising future direction for tool builders.
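To make the "prompt-less" idea in the abstract concrete, here is a minimal sketch of how a button click might be turned into a contextualized prompt. It is not the authors' implementation: the template wording, the `EditorContext` type, and the `build_prompt` function are all hypothetical, and the abstract does not say how the plugin actually phrases its requests or how much of the program it sends.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical templates for the four high-level requests named in the
# abstract. The wording here is assumed, not taken from the paper.
REQUEST_TEMPLATES = {
    "explain_code": "Explain what the following highlighted code does.",
    "api_details": "Describe the API calls used in the following code.",
    "domain_terms": "Explain the key domain-specific terms in the following code.",
    "usage_example": "Provide a usage example for the API used in the following code.",
}


@dataclass
class EditorContext:
    """Assumed snapshot of the IDE state at the moment of the request."""
    file_text: str  # the program being edited
    selection: str  # the code the user highlighted (may be empty)


def build_prompt(ctx: EditorContext,
                 request: Optional[str] = None,
                 question: Optional[str] = None) -> str:
    """Assemble a prompt from a button click or an open-ended question."""
    if request is not None:
        if request not in REQUEST_TEMPLATES:
            raise ValueError(f"unknown request type: {request}")
        instruction = REQUEST_TEMPLATES[request]
    elif question:
        instruction = question
    else:
        raise ValueError("either a request type or a question is required")

    # The user never writes this part: the editor context is attached
    # automatically, which is what makes the interaction "prompt-less".
    parts = [instruction]
    if ctx.selection:
        parts.append("Highlighted code:\n" + ctx.selection)
    parts.append("Program being edited:\n" + ctx.file_text)
    return "\n\n".join(parts)


if __name__ == "__main__":
    ctx = EditorContext(
        file_text="import math\nprint(math.sqrt(4))",
        selection="math.sqrt(4)",
    )
    print(build_prompt(ctx, request="explain_code"))
```

The resulting string would then be sent to whichever model the tool uses. The point of the sketch is only that the user picks an intent and the tool supplies the wording and the context.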

From Discussion:

Comprehension outsourcing. Our analysis revealed an intriguing finding regarding participants’ behavior during the study, where some of them deferred their need for code comprehension to the LLM, which was well described by one participant as comprehension outsourcing. These participants prompted the model at a higher level directly and did not read and fully comprehend the code before making changes. As one participant commented, “I was surprised by how little I had to know about (or even read) the starter code before I can jump in and make changes.” This behavior might be attributed to developers’ inclination to focus on task completion rather than comprehending the software, as reported in the literature [42]. Or, participants may have also weighed the costs and risks of comprehending code themselves, and chosen to defer their comprehension efforts to the language model. While this behavior was observed in the controlled setting of a lab study and may not fully reflect how developers approach code comprehension in their daily work, it does raise concerns about the potential impact of such a trend (or over-reliance on LLMs [62]) on code quality. This highlights the importance of preventing developers who tend to defer their comprehension efforts to the LLM from being steered in directions that neither they nor the LLM are adequately equipped to handle. Studies showing developers’ heavy reliance on Stack Overflow, despite its known limitations in accuracy and currency [64, 69], further emphasize the need for caution before widely adopting LLM-based tools in code development. Research on developers’ motivations and reasons for code comprehension when LLMs are available will be valuable in informing future tool designs.

Source: In-IDE Generation-based Information Support with a Large Language Model (arxiv.org)

Ryan Watkins

Professor at George Washington University

I am a Professor with the Human-Technology Collaboration and Educational Technology programs at George Washington University in Washington, DC. I have written 12 books and more than 100 articles, and I co-host the Parsing Science podcast, where scientists tell the stories behind their research. I am also the developer of the WeShareScience.com online platform for sharing research videos, and of SciencePods.com, where researchers can create free podcasts about their science. My research interests include human interactions with intelligent machines, needs, needs assessments, and instructional design.