Language Models and software developers

Sep 15, 2024

A rule of thumb for a SaaS company is to spend 40% of revenue on R&D. A cost of that size makes software development an obvious target for disruption. In addition, writing code is a task well suited to GenAI: code is easy to verify, many public repositories provide training data, we can generate a lot of synthetic data, and some code is already annotated in PRs or with comments. All of these factors explain why LLMs should be good at code generation. Several problems remain, though. How do we plan what to do with the code? How do we find the code that needs to be changed? Here are a few approaches.

In the simplest form, we can send a piece of code to ChatGPT and ask for help. Code2Prompt, a command-line tool hosted on GitHub, prepares a well-structured prompt from your code. In this case, we still need to know which piece of code to add to the prompt, and we need to provide a plan for what should be done with it.
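To make the idea concrete, here is a minimal sketch of assembling such a prompt by hand. It is not Code2Prompt's API; the `build_prompt` function, the prompt layout and the closing instruction are all assumptions made for illustration.

```python
def build_prompt(task: str, files: dict[str, str]) -> str:
    """Combine a task description and selected source files into one prompt.

    Hypothetical helper: the layout below is an illustrative choice,
    not the format produced by any particular tool.
    """
    parts = [f"Task: {task}", ""]
    for path, source in files.items():
        # Label every file so the model can refer to it by path.
        parts.append(f"File: {path}")
        parts.append(source.rstrip())
        parts.append("")
    parts.append("Return the changed files in full.")
    return "\n".join(parts)


if __name__ == "__main__":
    example_files = {"billing.py": "def total(items):\n    return sum(items)\n"}
    print(build_prompt("Add rounding to two decimals", example_files))
```

Note that the caller still has to choose which files go into `files` and what the task is, which is exactly the gap the approaches below try to close.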

The FunCoder paper shows how to improve planning. The idea is to use a Divide-and-Conquer approach with consensus. When we write code, we tend to split a task into functions and execute them in some order. FunCoder does the same by decomposing a problem into smaller functions and using them to generate a tree-like hierarchy of function calls before providing the final solution. Smaller functions are easy to generate and test, but this approach might lead to cascading errors, which is why FunCoder uses consensus to reduce inconsistencies.
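The consensus step can be sketched roughly as follows. This is a simplified illustration, not FunCoder's implementation: the candidates are plain Python callables standing in for LLM-generated functions, and `behaviour` and `pick_by_consensus` are hypothetical names. The assumption is that the candidate whose outputs agree most with the other candidates on the same inputs is the safest choice.

```python
from typing import Any, Callable, Sequence

Candidate = Callable[[Any], Any]


def behaviour(fn: Candidate, inputs: Sequence[Any]) -> tuple[str, ...]:
    """Record what a candidate returns (or raises) for every input."""
    outputs = []
    for value in inputs:
        try:
            outputs.append(repr(fn(value)))
        except Exception as exc:  # a crash is also observable behaviour
            outputs.append(f"error:{type(exc).__name__}")
    return tuple(outputs)


def pick_by_consensus(candidates: Sequence[Candidate], inputs: Sequence[Any]) -> Candidate:
    """Return the candidate whose outputs agree most with all other candidates."""
    behaviours = [behaviour(fn, inputs) for fn in candidates]

    def agreement(i: int) -> int:
        # Count matching outputs between candidate i and every other candidate.
        return sum(
            sum(a == b for a, b in zip(behaviours[i], behaviours[j]))
            for j in range(len(candidates))
            if j != i
        )

    return candidates[max(range(len(candidates)), key=agreement)]


# Three stand-in candidates for a small sub-function "square"; the last one is wrong.
square_candidates = [lambda x: x * x, lambda x: x ** 2, lambda x: x * 2]
square = pick_by_consensus(square_candidates, inputs=[0, 1, 2, 3])


def sum_of_squares(values: Sequence[int]) -> int:
    """Parent function composed from the sub-function chosen by consensus."""
    return sum(square(v) for v in values)
```

The wrong candidate agrees with the others only on some inputs, so it loses the vote, and the parent function is built on top of a sub-function that the majority supports.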

Another approach to improve planning is to use a multi-agent system, as in the CodeR paper. The paper introduces several agents (Manager, Reproducer, Fault Localizer, Editor and Verifier) and pre-defined actions such as searching files, editing and running shell commands. CodeR focuses on issue resolution. It prepares a plan before execution and stores it in a graph data structure. This helped CodeR achieve a 28% resolution rate on SWE-bench Lite with only one submission per issue.
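A plan stored as a graph could look something like the sketch below. The task names and dependencies are invented for illustration and are not CodeR's actual task graph format; the only real API used is `graphlib.TopologicalSorter` from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical plan: each task maps to the set of tasks it depends on.
plan = {
    "reproduce_issue": set(),
    "localize_fault": {"reproduce_issue"},
    "edit_code": {"localize_fault"},
    "verify_fix": {"edit_code", "reproduce_issue"},
}


def execution_order(task_graph: dict[str, set[str]]) -> list[str]:
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(task_graph).static_order())


if __name__ == "__main__":
    for step, task in enumerate(execution_order(plan), start=1):
        print(step, task)
```

Keeping the plan as data rather than as free text means it can be checked before execution, for example for cycles, which `TopologicalSorter` reports with a `CycleError`.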

Repository-level search is a somewhat different problem. Some codebases are huge, comprising tens of millions of lines of source code. The CODEXGRAPH paper addresses this problem by combining LLM agents with graph database interfaces, which improves code navigation and structure-aware context retrieval. We need to build a graph where Modules, Classes and Functions are nodes with meta information, and edges show relations such as Contains, Uses and Inherits. The result is a 10%-17% improvement in Retrieval-Augmented Code Generation (RACG) tasks.
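The graph structure described above can be illustrated with a small in-memory stand-in. The paper works with a graph database; the `Node` and `CodeGraph` classes here, and the example repository contents, are assumptions made only to show the shape of the data and of a structure-aware query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    kind: str  # "Module", "Class" or "Function"


class CodeGraph:
    """Tiny in-memory code graph: typed nodes connected by named relations."""

    def __init__(self) -> None:
        self.edges: list[tuple[Node, str, Node]] = []

    def add(self, source: Node, relation: str, target: Node) -> None:
        self.edges.append((source, relation, target))

    def targets(self, source_name: str, relation: str) -> list[Node]:
        """Nodes reached from `source_name` through `relation`."""
        return [t for s, r, t in self.edges if s.name == source_name and r == relation]

    def sources(self, target_name: str, relation: str) -> list[Node]:
        """Nodes that point at `target_name` through `relation`."""
        return [s for s, r, t in self.edges if t.name == target_name and r == relation]


# Hypothetical repository fragment.
billing = Node("billing", "Module")
invoice = Node("Invoice", "Class")
total = Node("Invoice.total", "Function")
report = Node("build_report", "Function")

graph = CodeGraph()
graph.add(billing, "Contains", invoice)
graph.add(invoice, "Contains", total)
graph.add(report, "Uses", total)

# "Which code would be affected if Invoice.total changes?"
callers = graph.sources("Invoice.total", "Uses")
```

A query like the last line returns only the structurally related code, which is what lets an agent put a small, relevant context into the prompt instead of whole files.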

The CodeAgent paper focuses on repo-level code changes. It leverages agent strategies and external tools such as web search to improve code generation. What matters most to me, though, is that the paper proposes using all the software artefacts included in a repository, such as documentation, code dependencies and the runtime environment.

A few thoughts. One element is missing: operation-time artefacts such as logs, traces and database query stats. These items are essential when making decisions during coding and bug fixing. According to the TIOBE index, Python is the most popular language and has a 6% lead. The popularity of a language is a proxy metric for the amount of code written in it. The more training code there is for a language, the better LLMs support it. Soon we will add one more factor when choosing a framework or language: how well language models support it.

References:

https://arxiv.org/abs/2408.03910 - CODEXGRAPH: Bridging Large Language Models and Code Repositories via Code Graph Databases
https://blossomstreetventures.medium.com/percent-of-saas-revenue-for-r-d-s-m-cogs-and-g-a-8f8cfbe33c2a - Percent of SaaS revenue for R&D, S&M, COGS, and G&A
https://github.com/raphaelmansuy/code2prompt - Code2Prompt is a powerful command-line tool that simplifies the process of providing context to Large Language Models (LLMs)
https://arxiv.org/abs/2405.20092 - Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation
https://arxiv.org/abs/2406.01304 - CodeR: Issue Resolving with Multi-Agent and Task Graphs
https://arxiv.org/abs/2401.07339 - CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges
https://www.tiobe.com/tiobe-index/ - TIOBE Index for September 2024

Shchegrikovich LLM
