
AI for Sustainability and Sustainability in AI

I will be referring to the following 3 papers on this very interesting topic.

(1) https://link.springer.com/article/10.1007/s43681-021-00043-6

Sustainable AI: AI for sustainability and the sustainability of AI

A. van Wynsberghe – AI and Ethics, 2021 – Springer

(2) https://www.researchgate.net/publication/342763375_Carbontracker_Tracking_and_Predicting_the_Carbon_Footprint_of_Training_Deep_Learning_Models/link/5f0ef0f2a6fdcc3ed7083852/download

(3) Lacoste, A., Luccioni, A., Schmidt, V., Dandres, T.: Quantifying the Carbon Emissions of Machine Learning. (2019)

While there is a tremendous push to use new-generation generative AI based on large language models to solve business problems, there are also voices of concern from experts in the community about the dangers and ethical consequences.  A lot has been written about this, but one aspect that has not gained sufficient traction, in my opinion, is Sustainable AI.

In (1), van Wynsberghe defines two disciplines within AI and sustainability: AI for Sustainability and Sustainable AI.

AI for Sustainability is any business application that uses AIML technology to solve climate problems, that is, the use of this new-generation technology to help address climate change and reduce CO2 emissions.  Major applications are being developed for optimal energy distribution across renewable and fossil energy sources.  Every additional percentage point of energy drawn from renewable sources reduces the use of fossil fuels and helps mitigate climate change.  Other applications include better climate predictions and the use of less water, pesticides, and fertilizers for food production.  Many Industry 4.0 applications that build new smart factories, smart cities, and smart buildings fall into this category.

On the other hand, Sustainable AI measures the massive energy consumed by GPUs and other computing, storage, and communications resources while building AI models, and suggests ways to reduce it.  While conventional software development and testing can be done on a few developers’ laptops with minimal use of IT resources, the AIML software development life cycle calls for massive training data and for deep learning neural networks with many millions of nodes.  Some of the new-generation large language models use billions of parameters, a scale that is hard to imagine.  The energy use does not stop there.  Fine-tuning for specific domains, or retraining, can be as energy-consuming as the original training, and sometimes even more so.  Some numbers mentioned in (1) are reproduced here to highlight the point.  Training one deep-learning NLP model produced emissions equivalent to 600,000 lbs of CO2.  Google’s AlphaGo Zero generated over 90 tonnes of CO2 over the 40 days it took for the initial training.  These numbers are large and, at the very least, call for review and discussion.  I hope I have been able to open your eyes and generate some interest in this new dimension of AI & Ethics, i.e., its impact on climate change.
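The two figures quoted from (1) use different units, which makes them hard to compare at a glance. The short sketch below converts pounds to tonnes, assuming avoirdupois pounds and metric tonnes (the source does not say which). The function name is only illustrative.

```python
# Conversion factor: one international avoirdupois pound in kilograms (exact by definition).
LB_TO_KG = 0.45359237


def pounds_to_tonnes(pounds: float) -> float:
    """Convert a mass in pounds to metric tonnes (1 tonne = 1000 kg)."""
    return pounds * LB_TO_KG / 1000.0


if __name__ == "__main__":
    nlp_model_tonnes = pounds_to_tonnes(600_000)
    print(f"600,000 lbs of CO2 is about {nlp_model_tonnes:.0f} tonnes")
```

Under these assumptions, the NLP model figure works out to roughly 272 tonnes, about three times the 90 tonnes quoted for AlphaGo Zero.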

I am sure many of you will ask, “Hasn’t every wave of industrialization, from horse carriages to automobiles or from steam engines to combustion engines, always increased the use of energy? Why do we need to worry about this for AI?”  Or, “There has been so much talk about how many light bulbs one can light with the power used for a simple Google search. Why worry about this now?”  All valid questions.

However, I will argue that

  1. The current climate change situation is already at a critical stage, and any unplanned, large-scale new use of energy can become “the straw that broke the camel’s back!”.
  2. A fully data-driven life cycle and deep neural networks with billions of parameters are being used for the first time at an industrial scale and industry-wide, and there are too many unknowns.

What are the suggestions?

  • Energy consumption measurement and publication must become part of the AI & Ethics practice followed by all AI development organizations.  The Carbontracker tool (2) and the Machine Learning Emissions Calculator (3) are suggestions for this crucial measurement.  I strongly recommend that organizations use their Quality & Metrics departments to research and agree on a measurement acceptable to all within each organization.  More research and discussion are needed on calculating the net increase in energy use compared to current IT tools, so as to get the right measurement.  In some cases, the current IT tools may rely on legacy mainframes and expensive dedicated communication lines that use large amounts of energy, so the net difference from using AIML may not be that large.
  • Predicting the energy use at the beginning of the AIML project life cycle is also required (3).
  • The predicted CO2-equivalent emissions need to be treated as another cost when approving AIML projects.
  • Emission prediction will also force AIML developers to select the right size of training data and the right models for the application.  Avoid the temptation of running the model on billions of data points just because the data is available!  Use the right tools for the right job.  You don’t need a tank to kill an ant!
  • Ask whether the use of deep learning is appropriate for the business application at hand.  For example, a simple HR application for recruitment or employee loyalty prediction built on deep learning models may turn out to be too expensive in terms of CO2 emissions and need not be considered a viable project.
  • CEOs should include this data in their Climate Change Initiatives Report to the Board and shareholders, and should also deduct the carbon credits used up by these AIML applications from the company’s carbon credit commitments.
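To make the suggestions on prediction and project approval concrete, here is a minimal sketch of an up-front emissions estimate. It is not the method used by the tools in (2) or (3). It assumes a simplified model in which energy is the number of GPUs times average power draw times training hours times a data-centre overhead factor (PUE), and emissions are that energy times the carbon intensity of the local grid. All names and numbers below are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class TrainingPlan:
    """Hypothetical description of a planned training run (illustrative fields only)."""
    gpu_count: int
    avg_gpu_power_watts: float    # assumed average power draw per GPU
    hours: float                  # planned wall-clock training time
    pue: float                    # data-centre power usage effectiveness (>= 1.0)
    grid_kg_co2e_per_kwh: float   # assumed carbon intensity of the local grid


def estimate_energy_kwh(plan: TrainingPlan) -> float:
    """Energy = GPUs x average power x hours x PUE, converted from Wh to kWh."""
    watt_hours = plan.gpu_count * plan.avg_gpu_power_watts * plan.hours * plan.pue
    return watt_hours / 1000.0


def estimate_emissions_kg(plan: TrainingPlan) -> float:
    """Emissions = energy (kWh) x grid carbon intensity (kg CO2e per kWh)."""
    return estimate_energy_kwh(plan) * plan.grid_kg_co2e_per_kwh


def within_carbon_budget(plan: TrainingPlan, budget_kg: float) -> bool:
    """Treat predicted emissions as a cost: approve only if they fit the budget."""
    return estimate_emissions_kg(plan) <= budget_kg


if __name__ == "__main__":
    # Made-up example values; replace them with measured or vendor-published figures.
    plan = TrainingPlan(gpu_count=8, avg_gpu_power_watts=300.0, hours=100.0,
                        pue=1.5, grid_kg_co2e_per_kwh=0.4)
    print(f"Estimated energy: {estimate_energy_kwh(plan):.1f} kWh")
    print(f"Estimated emissions: {estimate_emissions_kg(plan):.1f} kg CO2e")
    print("Within 200 kg budget:", within_carbon_budget(plan, budget_kg=200.0))
```

An estimate like this ignores CPU, memory, storage, and network energy, so it is better read as a lower bound for comparing options than as an audited figure. Measurement during actual training is still needed to check the prediction.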

More Later,

L Ravichandran

Small talk about Large Language Models

Since its formal launch, ChatGPT has been receiving a lot of press and has also been the topic of – heated – discussions in the recent past.

I had played with generative AI some time back and also shared the result in one of my earlier posts.

Post ChatGPT, investment in AI-based companies, or more specifically generative AI companies, has seen a sharp rise.

There is also a general sense of fear across industries, arising from uncertainty and from the dread that such technologies may take away specialized jobs and roles.

I was talking to an architect a few days ago and she said that in their community, the awe and fear of AI tech is unprecedented.

With just a few words as input, some of the sketches generated by tools like DALL-E, Craiyon, Stable Diffusion, etc. are apparently very realistic and logical. For example, when the query was to have the porch door opening out into the garden with a path to the main gate, the image was generated in less than a couple of minutes.

With all the promise of creating new content quickly, many questions have also come up, without clear answers.

The first – also a topic of interest on aithoughts.org – is that of ethics.

Take deep fakes, for instance. By the way, I had experimented with a technology that could have been used for this when I was looking for tools to simplify podcast editing. On a platform called Descript, I could train the model with my voice: I had to read a predefined text for about 30 minutes, and then it could synthesize any written text in my voice. At that time, the technology was not yet as mature as it is today, so I did not pursue it.

I digress..

Getting back to the debate on generative AI: the ethics of originality could influence how students create their assignment papers, or how more marketing content gets generated, all based on content that is already available on the net and ingested by the ChatGPT transformer. [I believe that there are now tools emerging that can check if content was generated by ChatGPT!]

Another aspect is the explainability of the generated content. Unless the source is known, it is not possible to assess the bias in the generated content, or to factor in an expert opinion where one is needed. The inherent bias in the training data is difficult to overcome, as much of this data is historical. If balanced data was not captured or recorded in the past, it is very difficult to fix the bias, or even to adjust the relevance of the data.

The third aspect is about the ‘originality’ or ‘uniqueness’ of the generated content – let me use the term solution from now on..

There is a lot of work being done in these areas, some in research institutions and some in companies applying them in specific contexts.

I had an opportunity recently to have a conversation with the founder of a startup that is currently in stealth mode, working on a ‘domain aware, large language model based’ generative AI solution.

It was a very interesting conversation that touched upon many of the points above.

 

You can listen to this conversation as a podcast in 2 parts here:

https://pm-powerconsulting.com/blog/the-potential-of-large-language-models-with-steven-aberle/

https://pm-powerconsulting.com/blog/episode-221/

 

Or watch the conversation as a video in 2 parts here:

https://www.youtube.com/watch?v=86fGLa9ljso

https://www.youtube.com/watch?v=f9DnDNUwFBs

 

Do share your comments and experiences with the emerging applications of GANs, Transformers, etc.