Did you know that training a single Large Language Model (LLM) can emit as much CO2 as driving a car around the Earth’s equator approximately 300 times? Even when you scroll through Instagram, liking posts and sharing reels without a second thought, each interaction is powered by massive data centres that run around the clock and produce carbon emissions that are often overlooked. LLMs have taken the world by storm, and this footprint is a staggering part of their reality.
These models can process huge amounts of data, but there's a downside: training and maintaining them requires massive computational power, which leaves a large and mostly unnoticed trail of carbon emissions. For example, a single query to ChatGPT, the very popular LLM-powered chatbot, can consume up to 100 times more energy than a Google search. This is just the tip of the iceberg.
That’s where green coding steps in: an emerging practice that promotes sustainable software development by optimizing algorithms, designs, and hardware for energy efficiency. This article explores how green coding can address the environmental concerns around LLMs, paving the way for a future where machine learning and a healthy planet coexist.
The environmental cost of training and operating LLMs becomes clear from case studies of their carbon emissions and energy use. The box plot below shows the CO2 emitted, in grams, for each ML task.
The training of GPT-4, one of the most advanced language models to date, required resources that led to significant environmental impacts. For example, the carbon footprint generated during its development is comparable to the emissions from driving a gas-powered car nearly 29 million kilometres, or from powering 1,300 homes for a year! The operation of GPT-3 alone is estimated to emit around 8.4 tons of CO2 per year. One may not notice it, but beneath the surface lies an unseen threat to the environment.
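To put the driving comparisons on a common scale, the distances can be converted into laps of the equator and back. This is a minimal sketch: the only outside number is the Earth's equatorial circumference (about 40,075 km), and the helper names `laps_to_km` and `km_to_laps` are made up for illustration. It compares distances only and says nothing about the emissions per kilometre behind either figure.

```python
# Approximate equatorial circumference of the Earth, in kilometres.
EQUATOR_KM = 40_075


def laps_to_km(laps: float) -> float:
    """Distance covered by driving `laps` times around the equator."""
    return laps * EQUATOR_KM


def km_to_laps(km: float) -> float:
    """Number of equator laps that add up to `km` kilometres."""
    return km / EQUATOR_KM


if __name__ == "__main__":
    # "Around the equator approximately 300 times" from the introduction.
    print(f"300 laps is about {laps_to_km(300):,.0f} km")
    # "Nearly 29 million kilometres" quoted for GPT-4 above.
    print(f"29 million km is about {km_to_laps(29_000_000):,.0f} laps")
```

Under these figures, 300 laps come to roughly 12 million kilometres, while 29 million kilometres correspond to more than 700 laps.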
The graph to the right estimates how many homes could be powered with the energy used just to train these LLMs, and the numbers are eye-opening.
Beyond carbon emissions themselves, disinformation and a lack of transparency make the environmental impact even harder to manage. Greenhouse gas emissions are very difficult to assess because the tech companies that own these LLMs disclose so little. Recent estimates suggest that training BERT on a GPU emits about as much CO2 as a cross-country flight, that BLOOM’s training emits as much as 30 flights between London and New York, and that GPT-3’s training accounts for 500 tons of CO2 emissions (nearly 600 flights!). Furthermore, OpenAI recently disclosed that $700,000 was spent on running ChatGPT, a cost that isn’t sustainable even from a business perspective!
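The flight comparisons can be cross-checked with a little arithmetic. The sketch below derives a per-flight figure from the GPT-3 numbers quoted above (500 tons, nearly 600 flights) and applies it to BLOOM's 30 flights. It assumes, loudly, that both comparisons refer to the same kind of flight and that "nearly 600" can be rounded to 600; the constant and function names are hypothetical, and the result is an implied ratio, not a measured value.

```python
# Figures quoted in the text for GPT-3's training run.
GPT3_TONS_CO2 = 500.0
GPT3_FLIGHTS = 600.0  # "nearly 600 flights", rounded (assumption)

# Implied emissions per flight, in tons of CO2 (about 0.83).
TONS_PER_FLIGHT = GPT3_TONS_CO2 / GPT3_FLIGHTS


def flights_to_tons(flights: float) -> float:
    """Convert a number of flights into tons of CO2 using the implied ratio."""
    return flights * TONS_PER_FLIGHT


def tons_to_flights(tons: float) -> float:
    """Convert tons of CO2 into an equivalent number of flights."""
    return tons / TONS_PER_FLIGHT


if __name__ == "__main__":
    print(f"Implied tons of CO2 per flight: {TONS_PER_FLIGHT:.2f}")
    # BLOOM is compared to 30 London-New York flights in the text.
    print(f"BLOOM, 30 flights: about {flights_to_tons(30):.1f} tons of CO2")
```

Under these assumptions, BLOOM's training would come to roughly 25 tons of CO2, about one twentieth of the GPT-3 figure.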
Green coding offers a multi-faceted approach to tackling the environmental challenge of LLMs. Its textbook definition is the degree of eco-friendliness of a model for the specific problem at hand, gauged against selected sustainability metrics, but its scope extends far beyond that.
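One simple way to put a number on eco-friendliness is to estimate the emissions of a compute job from its energy use: energy is device power times runtime, scaled by the data centre's power usage effectiveness (PUE), and emissions are that energy times the carbon intensity of the local grid. The sketch below is illustrative only. The `ComputeJob` class and every number in the example are hypothetical placeholders, not measurements of any real model.

```python
from dataclasses import dataclass


@dataclass
class ComputeJob:
    """A training or inference job, described by rough hardware figures."""

    num_devices: int            # GPUs or other accelerators in use
    avg_power_watts: float      # average power draw per device
    hours: float                # wall-clock runtime
    pue: float                  # data-centre power usage effectiveness (>= 1.0)
    grid_kg_co2_per_kwh: float  # carbon intensity of the electricity grid

    def energy_kwh(self) -> float:
        """Total energy drawn, including data-centre overhead, in kWh."""
        watt_hours = self.num_devices * self.avg_power_watts * self.hours
        return watt_hours * self.pue / 1000.0

    def emissions_kg(self) -> float:
        """Estimated emissions in kilograms of CO2."""
        return self.energy_kwh() * self.grid_kg_co2_per_kwh


if __name__ == "__main__":
    # Hypothetical job: 8 devices at 300 W for 24 hours.
    job = ComputeJob(
        num_devices=8,
        avg_power_watts=300.0,
        hours=24.0,
        pue=1.5,
        grid_kg_co2_per_kwh=0.4,
    )
    print(f"Energy: {job.energy_kwh():.1f} kWh")
    print(f"Emissions: {job.emissions_kg():.2f} kg CO2")
```

An estimate like this makes the levers visible: a shorter runtime, more efficient hardware, a lower PUE, or a cleaner grid each reduce the final figure.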
The following metrics provide a comprehensive framework for evaluating the sustainability of generated code:
The rise of LLMs has introduced an era of innovation and endless possibilities, but like all powers, it comes with a hidden cost: a growing carbon footprint. As we explored, training and operating these models consumes vast amounts of energy and leaves a trail of carbon emissions. The challenges we face are real. As a community of developers, researchers, and users, it's important that we adopt and promote environmentally friendly coding practices. This means raising awareness, fostering collaboration, and integrating green coding principles into our daily routines, as well as exploring ways to incorporate them into LLM development workflows. Together, by taking action today, we can bridge the gap between AI and a healthy planet, and ensure that LLMs continue to evolve while their contribution to environmental destruction shrinks.