This article is part of a VB special issue. Read the full series here: Intelligent Sustainability.
Everything counts in large amounts. You don’t have to be Google, or build large AI models, to benefit from writing efficient code. But how do you measure that?
It’s complicated, but that’s what Abhishek Gupta and the Green Software Foundation (GSF) are relentlessly working on. The GSF is a nonprofit formed by the Linux Foundation, with 32 organizations and close to 700 individuals participating in various projects to further its mission.
Its mission is to build a trusted ecosystem of people, standards, tooling and best practices for creating and building green software, which it defines as “software that is responsible for emitting fewer greenhouse gases.”
The likes of Accenture, BCG, GitHub, Intel and Microsoft participate in GSF, and its efforts are organized across four working groups: standards, policy, open source and community.
Gupta chairs the Standards working group at GSF, in addition to serving as BCG’s senior responsible AI leader and expert and as founder and principal researcher of the Montreal AI Ethics Institute. He shared the group’s current work and roadmap on measuring the impact of software on sustainability.
The first step towards greener code is measuring its impact
The first thing Gupta notes about the GSF is that it focuses on reduction, not neutralization. This means that things like renewable energy credits or power purchase agreements, aiming to offset and neutralize, aren’t part of the GSF’s mission. The focus, Gupta said, is on actual reductions in how you design, develop, and deploy software systems. This is a work in progress, and a very complex exercise.
But companies at every scale can benefit from more efficient code. Think about what happens to your phone or laptop when running apps that involve more or less processing, e.g., playing videos versus editing text. The difference in battery drain is significant. The larger the scale, the larger the stakes: making large language models more efficient, for example, could result in considerable savings.
The first step towards improving is measuring, as the famous adage goes. The focal point of Gupta’s work with the GSF Standards working group is something called the software carbon intensity specification (SCI). The SCI specification defines a methodology for calculating the rate of carbon emissions for a software system.
The GSF has adopted the notion of carbon efficiency as a way of thinking about the carbon impacts of software systems. This, Gupta explained, is broken down into three parts: energy efficiency, hardware efficiency and carbon awareness.
Energy efficiency is trying to consume as little electricity as possible. Electricity is the main way software consumes energy, and in most parts of the world it’s primarily generated by burning fossil fuels. This is where its carbon impact comes from.
Hardware efficiency is trying to use the least amount of embodied carbon possible. Embodied carbon, Gupta noted, is meant to capture the carbon impact of everything that goes into hardware such as servers, chips and smartphones.
Carbon awareness focuses on trying to do more work when the electricity is “clean,” and less when the electricity is “dirty,” Gupta said. He also referred to the notion of energy proportionality. The idea there is that higher rates of utilization for a piece of hardware mean that electricity is turned into more useful work, rather than idling. When it comes to actually measuring impact, however, things get messy.
“Some folks look at Flops. Some look directly at the energy consumed by the systems, and there’s a variety of approaches that lead to quite different results. That’s one of the challenges that we face in the field,” Gupta said.
The goal, Gupta said, is to have energy efficiency, hardware efficiency and carbon awareness talked about very explicitly in the calculation. Ultimately, the SCI aims to become an official standard, promoting comparability.
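To make the idea concrete, here is a minimal sketch of how the three parts could feed a single number. It follows the formula the GSF publishes for the SCI, ((E × I) + M) per R, as best understood here: energy consumed (E) times the carbon intensity of the electricity (I), plus the share of embodied hardware emissions (M), divided by a functional unit (R). The class name, field names and figures below are illustrative assumptions, not part of the specification or of Gupta’s remarks.

```python
from dataclasses import dataclass


@dataclass
class SciInputs:
    # All field names are hypothetical; units are chosen for this sketch.
    energy_kwh: float            # E: electricity consumed by the software, in kWh
    intensity_g_per_kwh: float   # I: carbon intensity of that electricity, gCO2e/kWh
    embodied_g: float            # M: allocated embodied hardware emissions, gCO2e
    functional_units: float      # R: e.g. API calls served or minutes of use


def sci_score(x: SciInputs) -> float:
    """Return gCO2e per functional unit: ((E * I) + M) / R."""
    if x.functional_units <= 0:
        raise ValueError("functional_units must be positive")
    operational_g = x.energy_kwh * x.intensity_g_per_kwh
    return (operational_g + x.embodied_g) / x.functional_units


# Made-up numbers for illustration only, not measurements.
example = SciInputs(energy_kwh=12.0, intensity_g_per_kwh=400.0,
                    embodied_g=1200.0, functional_units=1_000_000)
print(f"{sci_score(example):.4f} gCO2e per API call")
```

In this sketch, using less electricity lowers E, running when the grid is cleaner lowers I, and using hardware more sparingly lowers M, which is one way to read the three-part breakdown above.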
Granularity and transparency are key for a complex undertaking
One of the key points that Gupta made is that “software and hardware are inextricably linked”. The GSF prioritizes reducing carbon emissions in software, but the choice and use of hardware is a very important part of that.
Nowadays, the cloud is where the majority of software is produced and deployed. When we talk about software systems deployed in the cloud, a question that Gupta said people often ask is about fractional use. If only a fraction of a piece of hardware is used, and only for a certain amount of time, how should that be accounted for? This is where time-sharing and resource sharing come into play.
These are ways to calculate what part of a hardware system’s embodied emissions should be taken into account when calculating the carbon intensity score for software. Scale is also considered, through a parameter Gupta called functional unit. That can be the number of minutes spent using the software, or the number of API calls served, for example.
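A rough sketch of the fractional-use idea: scale a machine’s total embodied emissions by the fraction of its lifespan the software reserves (the time share) and by the fraction of its resources the software reserves (the resource share). The function name, parameters and the server figures are hypothetical and chosen only to show the arithmetic.

```python
def embodied_share_g(total_embodied_g: float,
                     hours_reserved: float,
                     lifespan_hours: float,
                     resources_reserved: float,
                     total_resources: float) -> float:
    """Allocate part of a machine's embodied emissions to one workload."""
    if lifespan_hours <= 0 or total_resources <= 0:
        raise ValueError("lifespan_hours and total_resources must be positive")
    time_share = hours_reserved / lifespan_hours            # fraction of lifespan used
    resource_share = resources_reserved / total_resources   # fraction of machine used
    return total_embodied_g * time_share * resource_share


# Hypothetical server: 1,000 kgCO2e embodied, 4-year lifespan, 16 vCPUs.
# The workload reserves 2 of those vCPUs for 24 hours.
lifespan = 4 * 365 * 24
m = embodied_share_g(1_000_000.0, 24, lifespan, 2, 16)
print(f"{m:.1f} gCO2e allocated to the workload")
```

Dividing the result, together with the operational emissions, by a functional unit such as the number of API calls served in those 24 hours yields a per-unit figure that can be compared across deployments of different sizes.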
For hardware, essentially, the entire lifecycle analysis needs to be considered to be able to calculate embodied emissions. That is really complex, so the GSF started an initiative on creating open data sets that will help people calculate embodied emissions.
“When you reserve a particular instance on a cloud provider, they’ll give you some information about the performance of that node and its parameters. But then what are the specifics of that piece of hardware that is actually running your software?” Gupta said. “Getting transparency, getting data on that tends to be important as well. And that’s why we’re investing in creating some open data so that you can facilitate those calculations.”
Granularity is key, as Gupta emphasized, otherwise it all ends up being rather abstract and vague. Inevitably, this also leads to complexity, and questions about boundaries, i.e., what should be included in software carbon emissions calculations.
“You can think about memory, storage, compute, but also some things that we tend to forget. What is the logging infrastructure? Do you have any sort of monitoring in place? Do you have idle machines that are on standby for redundancy? Do you have some sort of build and deploy pipelines?” he said. “Then speaking of machine learning models. You can have an inventory of models that are used. You can have shadow deployments, canary deployments. You have all of these things, backups that are in place, that also end up being part of that boundary.”
The other important principle Gupta emphasized is transparency. Transparency about what is included in calculations, but also about how these calculations are done. For example, where direct observability is not possible, the GSF promotes what Gupta called “a lab based, or model-based approach”.
“When we talk about consumption of third-party modules, APIs, libraries, if you don’t have direct visibility, taking a lab-based or model-based approach where you can approximate and get some directional intelligence on what the carbon impacts are is still useful. And you can use that in your SCI score calculation, with the requirement that you are transparent and [state] that’s what you’ve done,” Gupta said.
From measuring to acting
Ultimately, the SCI, with all its intricacies and complexity, is a means to an end, and the goal is to make it accessible to everyone. The purpose, the GSF notes, is to help users and developers make informed choices about which tools, approaches, architectures, and services they use in the future. The SCI is a score rather than a total; lower numbers are better than higher numbers, and reaching zero is impossible.
It is possible to calculate an SCI score for any software application, from a large distributed cloud system to a small monolithic open-source library, any on-premise application or even a serverless function. The product or service may be running in any environment, whether a personal computer, private data center or a hyperscale cloud.
As Gupta noted, there is a panoply of related tools out there: Allen AI Institute’s Beaker, RAPL, Greenframe, Code Carbon and PowDroid, to name a few. The GSF offers a comprehensive list.
These tools can help enterprises get a better understanding of the energy consumption of their applications, but because everybody is doing it a bit differently, the results also tend to be different, Gupta said. This is why the GSF promotes adoption of the SCI.
An important aspect regardless of the choice of specific tool is actionable feedback. That is, the tool should not only measure the carbon impact of the software, but also offer suggestions for improvement. Some of these tools provide targeted recommendations on what parts of the code are more energy hungry, and where to optimize. But that’s not all that matters — recommendations about processes and choices are important too, Gupta said.
For AI systems, Gupta explained that people should also think about things like system design, training methodology and model architectures. Quantizing weights, using distilled networks and adopting TinyML approaches can all be quite useful in reducing the carbon impacts of systems. As there is “a tremendous push” for getting AI models to work on resource-constrained devices, that also has the byproduct of mitigating carbon impacts.
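As a small illustration of what quantizing weights means, the sketch below maps floating-point weights to 8-bit integer codes with one shared scale factor, a common symmetric scheme. It is a toy written in plain Python with made-up weights, not the method of any particular framework; real toolchains add calibration, per-channel scales and other refinements.

```python
from __future__ import annotations


def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Map floats to integer codes in [-127, 127] plus one shared scale factor."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(codes: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from the integer codes."""
    return [c * scale for c in codes]


# Made-up weights for illustration only.
weights = [0.82, -1.27, 0.05, 0.0, 0.64]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(codes, f"max rounding error: {max_error:.4f}")
```

Stored as 8-bit integers, such codes take one byte each instead of the four bytes of a 32-bit float, at the cost of a rounding error bounded by half the scale factor. Less data to store and move is the reason the technique suits resource-constrained devices.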
Making the right hardware choices can also help, according to Gupta. Using fit-for-purpose hardware, such as application-specific integrated circuits or AI chips like TPUs, may help reduce the amount of energy used to train AI models. The same goes for deploying AI models: there are systems specifically developed for that purpose, Gupta noted. Making tactical choices in terms of where and when models are trained can also provide benefits.
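The “when” part of that choice can be sketched as a simple scheduling problem: given an hourly forecast of grid carbon intensity, pick the start time whose window has the lowest total. The forecast values and the function name below are invented for illustration; a real setup would pull forecasts from a grid data provider and would also weigh deadlines and cost.

```python
from __future__ import annotations


def best_start_hour(forecast: list[float], duration_hours: int) -> int:
    """Return the start index whose window has the lowest total intensity."""
    if not 0 < duration_hours <= len(forecast):
        raise ValueError("duration must fit inside the forecast")
    # Slide a fixed-size window across the forecast, updating the sum in place.
    window = sum(forecast[:duration_hours])
    best_start, best_total = 0, window
    for start in range(1, len(forecast) - duration_hours + 1):
        window += forecast[start + duration_hours - 1] - forecast[start - 1]
        if window < best_total:
            best_start, best_total = start, window
    return best_start


# Hypothetical hourly forecast in gCO2e/kWh, lowest around hours 4 to 6.
forecast = [450, 430, 390, 310, 220, 180, 210, 300, 410, 470, 480, 460]
start = best_start_hour(forecast, duration_hours=3)
print(f"Start the 3-hour training job at hour {start}")
```

The same comparison can be run across regions rather than hours to address the “where” part, by scoring each candidate region’s forecast and choosing the lowest.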
At the moment, sustainability reporting on software is at an embryonic stage. It’s rarely done, it’s on a voluntary basis, and it’s not standardized. An example that comes to mind is Google Cloud Model Cards, used to report on AI models. Gupta believes that sustainability should become a first-class citizen everywhere, alongside business and functional considerations.
“When you have a product that needs to go out the door, the things that are optional are the first ones that are going to be dropped. If we start to incorporate these as mandatory requirements, then I think people would start paying more attention,” he said.
At the same time, Gupta added, as consumers become more savvy, looking at environmental impact scores and making choices based on that, that will also make a difference. If users are only willing to pay for software that is green, it will impact bottom lines, and organizations will be forced to change their practices.
Currently, the GSF is working on releasing the first official version of SCI, which Gupta noted will be “a huge milestone.” It is expected to be unveiled at the 2022 UN Climate Change Conference. As Gupta shared, organizations that are a part of the GSF are considering incorporating SCI into their measurement methodologies and the software systems that they build.
The GSF is also working on the awareness-raising front, including by holding summits around the world.
“We’re embarking on this mission to raise awareness. It’s not something that people really think about today. So, we’re getting people to become aware that — ‘Hey, green software is a thing, and this is why you should care about it,’” Gupta concluded.