What to know
- Hugging Face has released StarCoder 2 in collaboration with ServiceNow and Nvidia.
- The AI-powered code generator comes in three sizes, trained by ServiceNow, Hugging Face, and Nvidia respectively. The two smaller models will run on most modern consumer GPUs.
- StarCoder 2 is reported to be more efficient than competing AI code generators and is aimed at developers who want to build applications quickly without sacrificing quality.
AI-based code generators are receiving a lot of attention from developers. Even though most such tools, including Amazon’s CodeWhisperer and GitHub Copilot, are far from perfect, demand for alternatives keeps growing. Hugging Face, in collaboration with ServiceNow and Nvidia, is answering that demand with the latest iteration of StarCoder, an open-source code generator that modern GPUs should have no problem running. Here’s all you need to know about it.
What is StarCoder 2
StarCoder 2, released by AI startup Hugging Face, is a family of large language models for code that comes in the following three variants:
- StarCoder2-3B model (with 3 billion parameters) trained by ServiceNow.
- StarCoder2-7B model (with 7 billion parameters) trained by Hugging Face.
- StarCoder2-15B model (with 15 billion parameters) trained by Nvidia.
All three models were trained on The Stack v2, a new code dataset that is seven times bigger than its first iteration, using newer training techniques meant to help the models understand programming languages and discussions about source code.
However, only the first two variants will be able to run on most modern consumer GPUs. The main reason is size: a model with fewer parameters needs less GPU memory. The two smaller models were also trained on 17 programming languages, while StarCoder2-15B was trained on 600+ programming languages (by Nvidia).
Nevertheless, even the smallest model (trained by ServiceNow) is reported to be as good as, if not better than, the previous iteration’s best.
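A rough way to see why model size matters for consumer GPUs is to estimate the memory needed just to hold each model's weights. The sketch below is a back-of-the-envelope calculation, not a measurement. It assumes the nominal parameter counts from the model names, 2 bytes per parameter for 16-bit weights (0.5 bytes for 4-bit quantization), and it ignores activations and other runtime overhead. The function name weight_memory_gb is hypothetical.

```python
# Nominal parameter counts taken from the model names (assumed exact here).
MODELS = {
    "StarCoder2-3B": 3_000_000_000,
    "StarCoder2-7B": 7_000_000_000,
    "StarCoder2-15B": 15_000_000_000,
}


def weight_memory_gb(num_params: int, bytes_per_param: float = 2.0) -> float:
    """Return the memory in GB (10**9 bytes) needed to store the weights alone."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, params in MODELS.items():
        fp16 = weight_memory_gb(params)        # 16-bit weights, 2 bytes each
        int4 = weight_memory_gb(params, 0.5)   # 4-bit weights, half a byte each
        print(f"{name}: ~{fp16:.0f} GB at 16-bit, ~{int4:.1f} GB at 4-bit")
```

Under these assumptions the 15B model needs roughly 30 GB for 16-bit weights alone, more than most consumer cards offer, while the 3B model needs about 6 GB.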
How does StarCoder 2 compare with other AI code generators
Like most AI code generators, StarCoder 2 can suggest ways to complete lines of code, summarize pieces of code, and retrieve code snippets when prompted. Reportedly, it is also more efficient than other code generators and has an edge over them in performance.
Furthermore, StarCoder 2 is said to need no more than a few hours to be deployed locally and learn a developer’s source code, after which it can be used to create apps and chatbots. It is also considered a more ethical AI code generator than some others, mostly because it was trained on data used under license from Software Heritage.
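To make the code-completion workflow concrete, here is a minimal sketch of running the smallest model locally with the Hugging Face transformers library. It assumes that a recent version of transformers and PyTorch are installed, that the weights are published on the Hugging Face Hub under the ID bigcode/starcoder2-3b, and that the machine has enough memory to load them. The helper name complete and the example prompt are hypothetical.

```python
MODEL_ID = "bigcode/starcoder2-3b"  # assumed Hub ID for the 3B variant


def complete(prompt: str, max_new_tokens: int = 48) -> str:
    """Ask the model to continue `prompt` and return the decoded text."""
    # Imported lazily so the module can be loaded without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Tokenize the prompt into PyTorch tensors and generate a continuation.
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # The decoded text contains the prompt followed by the completion.
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Downloads the weights on first run, which can take a while.
    print(complete("def fibonacci(n):\n"))
```

The sketch reloads the model on every call to keep the example short. A real tool would load the tokenizer and model once and reuse them.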
One important caveat is that StarCoder 2’s license, the BigCode OpenRAIL-M v1, may pose its own challenges for developers, as it does not allow them to use the code generator entirely as they like. Certain restrictions have been implemented to ensure compliance with laws and regulations, such as the EU AI Act.