Andrej Karpathy, Tesla’s former Head of AI, tweeted at the end of January: “The hottest new programming language is English” referring to the development of programming code via text prompts with ChatGPT, Copilot (from GitHub), CodeWhisperer (from Amazon) and more. Incidentally, among these coding assistants, the CoPilot (GitHub) enjoys the highest popularity; This application is also the first “Generative AI” application to reach a sales volume of 100 mUSD.

The Coding Wizards are used in the developer community, especially by juniors. This user behavior is also reflected, for example, in lower access to the site Stackoverflow.com: While the number of hits in July 2022 was still around 20 million per month, it has come down to 10 million a year later. In a response to that development, Stack Overflow has laid off 100 employees due to the impact on its own business. And another indicator: Rising numbers of unemployed IT professionals in the USA with a robust labor market; the Wall Street Journal states: “Joblessness in IT last month surpassed the national rate of 3.8%, a sign that entry-level IT hiring might be slowing as AI-enabled automation takes hold” (cf. article www.techmonitor.ai IT Unemployment Soars to 4.3% Amid Overall Jobs Growth)

Hence the question: Where can AI coding assistants be used really efficiently, where do coders need to be cautious? The page www.techmonitor.ai summarizes some experiences in the article Your AI coding assistant is a hot mess (from 29.06.2023) some experiences:

Developers like to use these tools for code review, which is often very time-consuming. The wizards also prove to be very helpful when learning new code. The tools are, however, also used for the generation of new code – as a prerequisite for a satisfactory result, however, it is recommended to generate new code for widely used programming languages – for rarely used programming languages, there is a lack of sufficient data to train the AI models properly.

Another “Golden Rule”: The closer the programming task is to standard requirements, the more reliable the result. Conversely, “they [note: the coding assistants] also might not be particularly useful for more ambitious projects. These models are trained on code that already exists, meaning the more novel or specific your use case, the less useful they’ll be.”

Incidentally, even AI coding assistants are prone to hallucinating. And since AI models are primarily trained with open source code (which inevitably contains bugs), AI assistants also reproduce such bugs. This is problematic in terms of cyber resilience of the code produced. Quite an effort of critical review is required here – but this mindset has not yet been consistently established. A study from Stanford University (December 2022) found that developers with access to AI programming assistants often produced more vulnerabilities than participants without access, but at the same time were more likely to believe they had written secure code.

The following (very detailed) article is definitely worth reading on the development of such a “critical mindset”: On Modern software quality, or why I think using language models for programming is a bad idea (from 30.03.2023) This article was recommended to me by a senior developer of a software company, who, after an experimental phase, had simply prohibited the use of AI coding assistants for juniors because too much inefficient code had been created, which had to be cleaned up again by the senior developers.

The author of the aforementioned article thinks that the use of language models for programming is a bad idea overall, and justifies this as follows; I have already listed some of the arguments, so the repetition once again underlines the relevance of these objections:

  • First, language models are constantly backward-looking, tending to favor the ordinary, average, mediocre, and predictable. However, they are great for modifying text and are occasionally useful for summaries.
  • Second, the security risks that arise from error-prone training data, among other things. Prompt injections can be used to inject malicious code into the model, while vulnerabilities in the training data can be exploited to manipulate the model’s output.
  • Third, language models are designed to generate outputs based on the inputs they receive, without questioning the validity of the inputs. This can lead to unrealistic specifications because the model simply generates outputs based on the inputs it receives, without questioning whether or not the inputs are valid or meaningful.
  • Fourth, automation bias leads to an uncritical evaluation of the output code, which in turn results in poor code quality [cf. the Stanford study]. “automation bias” is the tendency to rely too much on automated systems without questioning their results.
  • Fifth, language models are trained on existing datasets, which may contain biases and outdated information. This can lead to the generation of outdated code, which can harm the open source ecosystem.
  • Sixth, the use of language models can lead to the literal copying of open source.
  • Author

    The author is a manager in the software industry with international expertise: Authorized officer at one of the large consulting firms - Responsible for setting up an IT development center at the Bangalore offshore location - Director M&A at a software company in Berlin.