DeepMind’s large-scale language model AlphaCode has been hailed as a system that could rival human programmers, turning natural-language problem descriptions into working programs.
In a new paper published in Science, DeepMind researchers report that AlphaCode performs roughly on par with the median human competitor in standard programming competitions.
To validate its potential, DeepMind’s developers tested AlphaCode on Codeforces, a competitive programming site where human developers are given coding problems and ranked on their results.
Even though machine learning has made great strides, it is hard to build an AI that is good at multiple things. You can train a machine on data corresponding to one class of programming problems, but it will fail when you try to tackle another. So the researchers decided to skip hand-built knowledge of algorithms and code structures and treat programming as a translation problem: the task description is the source text, and the program is the target.
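The translation framing can be sketched as a plain sequence-to-sequence decoding loop. Here `model` is a hypothetical next-token predictor standing in for AlphaCode’s transformer, not a real API:

```python
def generate_program(description, model, max_tokens=768, eos="<|endofcode|>"):
    """Treat programming as translation: the problem statement is the
    source sequence and the program is the target sequence.

    `model(description, tokens_so_far)` is assumed to return the next
    code token; this is an illustrative stand-in for the trained model.
    """
    tokens = []
    for _ in range(max_tokens):
        nxt = model(description, tokens)  # predict the next code token
        if nxt == eos:                    # stop when the model ends the program
            break
        tokens.append(nxt)
    return "".join(tokens)
```

Sampling many programs per problem, as AlphaCode does, amounts to running this loop repeatedly with stochastic (temperature-based) token choices rather than a single greedy pass.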
AlphaCode, unveiled in early 2022, is a transformer-based language model with 41.4 billion parameters, roughly three and a half times the size of Codex, the 12-billion-parameter model behind GitHub Copilot. Earlier, another DeepMind AI, AlphaFold, managed to predict the structure of almost all proteins known to science.
Coding challenges typically consist of a task description, and the code a human participant submits is just one realization of that description. AlphaCode’s architecture therefore handles the problem in three phases.
- Data: AlphaCode is fed data from public repositories on GitHub.
- Learning: The model is then fine-tuned on a dataset of competitive programming problems, calibrating it to the task’s requirements.
- Sampling and Evaluation: After training, the AI tool samples a massive number of candidate programs for each problem. It then filters them against the problem’s example tests, clusters the survivors by behavior, and submits up to ten representatives for external evaluation.
Researchers found that AlphaCode generates many possible answers to each coding problem, about 40% of which either exhaust all available system memory or fail to produce a solution in a reasonable amount of time.
DeepMind found that large clusters of behaviorally similar programs tended to contain correct answers, while incorrect programs were scattered more randomly. By focusing on these clusters, AlphaCode answered about one-third of the coding challenges correctly. Many human programmers still did better, though: on average, AlphaCode ranked in the top 54% of participants.
This experiment is also interesting from the perspective of programming itself. AlphaCode was never given explicit examples of what constitutes an algorithm, and it is unknown whether its internal representation of a problem contains anything identifiable as one.
The advent of AlphaCode won’t immediately put engineers out of work, but future evolutions may well do just that. Recently another DeepMind AI, DeepNash, played the classic board game Stratego at human expert level, with its neural network even using bluffs to beat human players.