Understanding code is challenging, especially when working in new and complex development environments. Code comments and documentation can help, but are typically scarce or hard to navigate. Large language models (LLMs) are revolutionizing the process of writing code. Can they do the same for understanding it? In this study, we provide a first investigation of an LLM-based conversational UI built directly into the IDE and geared towards code understanding. Our IDE plugin queries OpenAI’s GPT-3.5 and GPT-4 models with four high-level requests, without the user having to write explicit prompts: to explain a highlighted section of code, provide details of API calls used in the code, explain key domain-specific terms, and provide usage examples for an API. The plugin also allows for open-ended prompts, which are automatically contextualized with the program being edited. We evaluate this system in a user study with 32 participants, which confirms that using our plugin can aid task completion more than web search. We additionally provide a thorough analysis of the ways developers use, and perceive the usefulness of, our system, finding among other things that usage and benefits differ significantly between students and professionals. We conclude that in-IDE prompt-less interaction with LLMs is a promising future direction for tool builders.
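To make the idea of prompt-less interaction concrete, the following is a minimal sketch of how the four high-level requests and the contextualized open-ended prompt might be assembled. It is not the plugin's actual implementation: the template wording, the `EditorState` type, the action names, and the `build_prompt` function are all hypothetical, and the call to the model itself is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical templates for the four prompt-less actions; the wording is an
# assumption for illustration, not the prompts used in the study.
TEMPLATES = {
    "explain_code": "Explain what the following highlighted code does:\n{selection}",
    "api_details": "Describe the API calls used in the following code:\n{selection}",
    "domain_terms": "Explain the key domain-specific terms in the following code:\n{selection}",
    "usage_example": "Provide a usage example for the API used in the following code:\n{selection}",
}


@dataclass
class EditorState:
    """Hypothetical snapshot of what the IDE knows about the current edit."""
    file_text: str   # full text of the program being edited
    selection: str   # code the user has highlighted


def build_prompt(state: EditorState,
                 action: Optional[str] = None,
                 user_query: Optional[str] = None) -> str:
    """Return the text that would be sent to the LLM.

    If `action` is given, the user wrote no prompt: a template is filled
    with the highlighted code. Otherwise the open-ended `user_query` is
    contextualized with the program being edited.
    """
    if action is not None:
        if action not in TEMPLATES:
            raise ValueError(f"unknown action: {action}")
        return TEMPLATES[action].format(selection=state.selection)
    if user_query:
        return (
            "The user is editing this program:\n"
            f"{state.file_text}\n\n"
            f"Question: {user_query}"
        )
    raise ValueError("either an action or a user query is required")
```

In this sketch, a button click in the IDE would map to one of the four action names, so the user never sees or writes the prompt text; only the open-ended path takes typed input.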
Comprehension outsourcing. Our analysis revealed an intriguing pattern in participants’ behavior during the study: some deferred their need for code comprehension to the LLM, which one participant aptly described as comprehension outsourcing. These participants prompted the model directly at a higher level and did not read and fully comprehend the code before making changes. As one participant commented, “I was surprised by how little I had to know about (or even read) the starter code before I can jump in and make changes.” This behavior might be attributed to developers’ inclination to focus on task completion rather than on comprehending the software, as reported in the literature. Alternatively, participants may have weighed the costs and risks of comprehending the code themselves and chosen to defer their comprehension efforts to the language model. Although this behavior was observed in the controlled setting of a lab study and may not fully reflect how developers approach code comprehension in their daily work, it does raise concerns about the potential impact of such a trend (or of over-reliance on LLMs) on code quality. It highlights the importance of preventing developers who tend to defer their comprehension efforts to the LLM from being steered in directions that neither they nor the LLM are adequately equipped to handle. Studies showing developers’ heavy reliance on Stack Overflow, despite its known limitations in accuracy and currency [64, 69], further emphasize the need for caution before widely adopting LLM-based tools in code development. Research on developers’ motivations and reasons for code comprehension when LLMs are available will be valuable in informing future tool designs.