BMX ride signature visualization
Revealing the unique style of each cyclist
For most people, the launch of ChatGPT in November 2022 was the moment AI became real. What started as a simple website quickly transformed into a global phenomenon, sparking a wave of competition among labs like Anthropic, Google, and Mistral. In just a few years, AI has shifted from futuristic promise to a multi-billion-dollar industry, with new models and products released at breakneck speed. But while performance is easy to market, ethics and sustainability remain harder to measure and easier to ignore.
Consumers can now choose from an ever-increasing number of AI chatbots, products and services. In an effort to win over consumers, companies promote their products as the newest, most intelligent and fastest AI products around.
But the increasing capability of AI technology is only one side of the story. There are many concerns about the environmental impact of these systems, which require huge data centers that consume vast amounts of resources. Furthermore, the data used to train these models is increasingly being scrutinized, raising ethical questions about the fair use of authored work. The result is information overload. Consumers have more choice than ever when it comes to AI tools, but struggle to make an informed decision. As a consumer, it has become practically impossible to choose a product based on anything more than the novelty and claimed performance of a model.
To bring the environmental and ethical ramifications of AI models to the forefront, we propose a universal rating system that evaluates not only technical performance but also these factors. We see three major impacts such a universal rating could bring to AI technology:
There are already many benchmarks in the field of AI, yet few combine the three crucial aspects of performance, sustainability and ethics. In our research, we found plenty of benchmarks and leaderboards evaluating performance and effectiveness metrics, as well as some focused on responsible AI metrics or environmental impact. But these frameworks seemed to operate in isolation.
Additionally, while these databases are valuable in developing a rich understanding of the wider impact of AI systems, their reach is limited. The information usually lives on dedicated websites and in reports and tables that make it possible to compare AI models thoroughly within a given framework, but the average consumer is unlikely to find and consult these databases before using AI products.
A universal rating system meets consumers where they can make a difference: when choosing and using products. Such a rating system needs to fulfill a number of requirements. First, the rating system should be identifiable as an independent score. Second, it should be easy for the user to tell whether a rating is good or bad. Finally, the system should balance simplicity with richness, providing consumers with enough insight to make informed decisions without overwhelming them.
Through an extensive design exploration, we’ve developed a first proposal of what a universal AI rating system could look like.
This rating consists of three parts. The performance pillar describes the capability of an AI model and looks at metrics such as accuracy and speed, compared to state-of-the-art benchmarks. The sustainability pillar scores the environmental impact of a model through metrics such as energy used and carbon emitted during both development and use. The ethics pillar scores the ethical impact of a model by analyzing biases and examining privacy and transparency standards.
The system uses a familiar A to F scale, with A representing the highest overall quality and F the lowest, following established conventions that consumers recognize from other rating systems.
Each of the three pillars, Performance, Ethics, and Sustainability, is scored separately, with a clear grade and an intuitive green-to-red color scale for instant recognition. While the three scores are combined into a single triangular visual, the individual ratings remain distinct, making it easy to spot where an AI tool excels and where it falls short. This balance allows users to grasp the overall impact at a glance, while still seeing the nuances across different dimensions. The design mirrors familiar consumer labels (like energy and nutrition ratings), lowering the barrier to understanding while opening up a more holistic view of AI evaluation.
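To make the per-pillar grading concrete, here is a minimal sketch of how a label could be derived from three scores. Everything specific in it is our own assumption for illustration: scores normalized between 0 and 1, six equal-width grade bands (A, B, C, D, E, F), the hex colors, and the names `to_grade`, `rate_model` and `PillarLabel`. It is not a finished methodology.

```python
from dataclasses import dataclass
from typing import Dict

# Assumption: six letter grades; the proposal only specifies "A to F".
GRADES = ["A", "B", "C", "D", "E", "F"]

# Assumption: illustrative green-to-red colors, one per grade.
COLORS = {
    "A": "#1a9641",
    "B": "#8cc63f",
    "C": "#f2d600",
    "D": "#f59e0b",
    "E": "#e8590c",
    "F": "#d7191c",
}


@dataclass(frozen=True)
class PillarLabel:
    grade: str
    color: str


def to_grade(score: float) -> str:
    """Map a normalized score in [0, 1] to a letter using equal-width bands."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    # A score of 1.0 lands in band 0 (A); a score of 0.0 is clamped to the last band (F).
    index = min(int((1.0 - score) * len(GRADES)), len(GRADES) - 1)
    return GRADES[index]


def rate_model(performance: float, sustainability: float, ethics: float) -> Dict[str, PillarLabel]:
    """Grade each pillar separately so the three ratings stay distinct."""
    scores = {
        "performance": performance,
        "sustainability": sustainability,
        "ethics": ethics,
    }
    labels = {}
    for pillar, score in scores.items():
        grade = to_grade(score)
        labels[pillar] = PillarLabel(grade=grade, color=COLORS[grade])
    return labels


if __name__ == "__main__":
    # Purely illustrative numbers, not real ratings.
    for pillar, label in rate_model(0.95, 0.55, 0.10).items():
        print(pillar, label.grade, label.color)
```

Keeping one grade per pillar, rather than collapsing everything into a single number, matches the idea that a user should see at a glance where a tool excels and where it falls short.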
Our rating system treats sustainability and ethics as being just as important as performance. This means an AI model might achieve an "A" in performance while receiving lower grades in sustainability or ethics, making it immediately visible how these shortcomings affect the model's overall profile.
While all three pillars are weighted equally in our system, we recognize that performance often drives initial decision-making. That's why performance is positioned at the top of our visualization, with ethics and sustainability as the foundational pillars below, emphasizing that strong performance should be built on ethical and sustainable practices.
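As a small worked illustration of equal weighting, the sketch below averages three pillar scores and reports the weakest pillar. The normalized 0 to 1 scores, the function names `overall_score` and `weakest_pillar`, and the idea of computing a single combined number at all are assumptions made for this example; the proposal itself keeps the three grades visible side by side.

```python
from typing import Mapping

# The three pillars of the proposed rating, each with the same weight.
PILLARS = ("performance", "sustainability", "ethics")


def overall_score(scores: Mapping[str, float]) -> float:
    """Equal weighting: the plain mean of the three pillar scores."""
    missing = [p for p in PILLARS if p not in scores]
    if missing:
        raise KeyError(f"missing pillar scores: {missing}")
    return sum(scores[p] for p in PILLARS) / len(PILLARS)


def weakest_pillar(scores: Mapping[str, float]) -> str:
    """Name the pillar with the lowest score, so it cannot hide behind the mean."""
    return min(PILLARS, key=lambda p: scores[p])


if __name__ == "__main__":
    # Purely illustrative numbers, not real ratings.
    example = {"performance": 0.9, "sustainability": 0.4, "ethics": 0.5}
    print(round(overall_score(example), 2))
    print(weakest_pillar(example))
```

With equal weights, a model that scores very high on performance but low on the other two pillars ends up with a middling average, which is the effect the equal weighting is meant to produce.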
The power of a universal rating system is that it informs consumers in the moments where they can make a choice. This means the visual should live comfortably in a myriad of contexts. High-level tools such as browser plugins or local applications can help users keep track of integrated AI models they encounter in websites and software. Model picker dropdowns have already become commonplace in many AI-powered products and services. In addition, platforms such as marketplaces and comparison tools can tap into the rating system to provide an extra layer of information for consumers. All of these places could redirect to a single database providing more in-depth information and help consumers find alternatives.
Please note that all the following visuals are purely illustrative and do not represent actual ratings based on data.
We acknowledge the technical challenges posed by a universal AI rating system, and we do not claim that our proposal is a perfect solution. Establishing comparable metrics will require substantial research, and today's high-scoring models could be outperformed and obsolete within months. Moreover, implementation would demand extensive collaboration between industry, academia, and policymakers.
Yet AI literacy has never been more urgent. As AI's influence accelerates, consumers need better tools to make informed choices about which models they use. And these choices should not be limited to performance. On the contrary: as the AI industry scales up, ethical and sustainable considerations are more important than ever. A universal rating system could spark the broader public discourse we need about AI's growing impact.
This vision requires partnership. We're eager to connect with experts and stakeholders who share our commitment to AI literacy and want to help build more nuanced conversations around AI technology.