What Not to Say
Teaching chatbots to speak ‘properly’ and ‘decently’
Many of us have heard about Microsoft’s Tay.ai chatbot, which was released in 2016 and withdrawn within 24 hours after it began producing abusive output. It took less than a day to corrupt an innocent AI chatbot. What went wrong? Tay.ai’s learning module was excellent, which ironically was the problem: it rapidly picked up swear words, hate speech and similar phrases from the large number of people who used abusive language in conversations with it. Unlike humans, who have internal filters, Tay.ai learnt from these signals without restraint and started using the same phrases and hate language itself. All this happened in less than 24 hours, forcing Microsoft to pull the bot from public use.
I have been observing how my son and daughter-in-law teach my three-year-old granddaughter to use good language: basic things like saying ‘Please’, ‘Thank you’, ‘Good morning’ and ‘Good night’. In other words, decent and desirable language is taught first. They have also given strict instructions to us grandparents and the extended family about what to say – and what not to say – in front of the child. She will still hear some ‘bad words’ at school, in malls, on playgrounds and so on; that is beyond the parents’ control. In those cases, they explain that only a very few bad people use ‘bad’ language and that good people never do, thereby starting to lay down the internal filters in my granddaughter’s mind.
We should apply the same principle to these innocent but fast-learning chatbots. Let us first ‘teach’ the chatbot all the ‘good’ phrases like ‘Please’ and ‘Thank you’. Let us also ‘teach’ the chatbot to show empathy, with responses such as ‘Sorry that your product is not working. We will do everything possible to fix it’ and ‘Sorry to ask you to repeat that, as I did not understand your question’.
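As a minimal sketch of what ‘teaching’ the good phrases could look like in practice, here is a hypothetical curated-template layer. The intent names, templates and fallback response are my own illustrative assumptions, not taken from any production chatbot.

```python
# A minimal sketch of a curated "good phrases" layer for a chatbot.
# All intent names and templates are illustrative assumptions.

POLITE_TEMPLATES = {
    "greeting": "Good morning! How may I help you today?",
    "thanks": "Thank you for your patience.",
    "apology_product": ("Sorry that your product is not working. "
                        "We will do everything possible to fix it."),
    "apology_clarify": ("Sorry to ask you to repeat that, "
                        "as I did not understand your question."),
}

def polite_reply(intent: str) -> str:
    """Return a curated polite response, with a safe fallback."""
    return POLITE_TEMPLATES.get(intent, "Thank you. Could you tell me a little more?")
```

The point of the design is that the bot’s ‘nice’ vocabulary is curated by humans up front, just as parents teach the good words first.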
Finally, let us create a negative list of ‘bad’ phrases and hate language in all possible variations. English in the UK has British, Scottish and Irish variations, and a phrase considered acceptable in one area may be objectionable in another. The same applies to Australia, New Zealand, India, New York, the Northern USA, the Southern USA and so on. Let us build internal filters into these chatbots to ignore, or unlearn, these phrases during the learning process. By looking at the user’s IP address, the bot can identify the geographical location and apply the right language filters.
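Here is a minimal sketch of such an internal filter, assuming a negative list keyed by region and some geo-IP lookup. The locale codes, the placeholder phrases and the region_for_ip stub are all illustrative assumptions; a real system would use a proper geolocation service and carefully curated lists.

```python
# Sketch of an "internal filter" applied before the bot learns from input.
# Phrase lists and the geolocation lookup are illustrative placeholders.

NEGATIVE_LISTS = {
    "en-GB": {"example_gb_phrase"},   # placeholder entries only
    "en-US": {"example_us_phrase"},
    "en-IN": {"example_in_phrase"},
}

def region_for_ip(ip_address: str) -> str:
    """Hypothetical geo-IP lookup returning a locale code."""
    return "en-GB"  # stub for illustration

def is_learnable(utterance: str, ip_address: str) -> bool:
    """Return False if the utterance contains a negative-list phrase
    for the user's region, so the bot never learns from it."""
    region = region_for_ip(ip_address)
    text = utterance.lower()
    return not any(phrase in text for phrase in NEGATIVE_LISTS.get(region, set()))
```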
Will this work? As good parents, we have been teaching our kids and grandkids this way from time immemorial, and it mostly works: very few children grow up to become users of hate language.
Will it slow down the machine learning process? Perhaps a little, but that is a price worth paying compared with having a chatbot use foul language and upset your valuable customers.
You may be wondering whether this simple approach is supported by any AI research, or whether it is just a grandfather’s tale! There is plenty of research in this area that supports my approach.
There are many references to articles on the ‘Seldonian Algorithm’ for AI ethics. I want to refer to an article titled ‘Developing safer machine learning algorithms’ from UMass Amherst. The authors argue that the burden of ensuring that ML systems are well-behaved lies with the ML designer, not the end user, and they propose a three-step Seldonian algorithm. Let us look at it.
Step one is the Interface: the user specifies, through an interface, the undesirable or bad behaviours. The ML algorithm uses this interface and tries as far as possible to avoid those behaviours.
Step two is High-Probability Constraints: Seldonian algorithms guarantee, with high probability, that they will not cause the undesirable behaviour the user specified via the interface.
Step three is No Solution Found: Seldonian algorithms must be able to return No Solution Found (NSF) to indicate that they were unable to achieve what they were asked.
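To make the three steps concrete, here is a minimal sketch of a Seldonian-style candidate search. The function names and the constraint convention (g(candidate) <= 0 means ‘behaves well’) are my own illustrative assumptions; the published algorithms use statistical confidence bounds on held-out safety data rather than the plain threshold check shown here.

```python
from typing import Callable, List, Optional

# Step 1: the interface. The user supplies constraint functions g(c)
# that estimate undesirable behaviour; g(c) <= 0 means candidate c
# behaves acceptably. This convention is an illustrative assumption.
Constraint = Callable[[object], float]

def seldonian_search(candidates: List[object],
                     constraints: List[Constraint],
                     score: Callable[[object], float]) -> Optional[object]:
    """Return the best-scoring candidate that passes every safety test,
    or None ("No Solution Found") if no candidate is safe."""
    # Step 2: high-probability constraints. A real Seldonian algorithm
    # applies confidence bounds on held-out safety data; this sketch
    # uses a plain threshold check as a stand-in.
    safe = [c for c in candidates if all(g(c) <= 0 for g in constraints)]
    if not safe:
        return None  # Step 3: the algorithm says "No Solution Found"
    return max(safe, key=score)
```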
Let us consider two examples involving human life to illustrate the interface definitions. Example one is a robot that controls a robotic assembly line. The robot senses that a welding operation has gone out of sync and is making every welded car defective. The robot controller wants to issue an instruction to stop the assembly line immediately and get the welding station fixed. However, the user knows that an abrupt stoppage of the assembly line may harm factory workers at another station on the line. This undesirable decision – stopping the assembly line immediately – needs to be defined in the interface, because harm to humans outweighs the material loss of defective cars.
Example two is an autonomous truck carrying cargo on a hilly road with a cliff on its side. A human driver is coming fast in the wrong lane (the human’s fault) and approaching the truck, making a head-on collision certain. The only desirable outcome for the truck is to go off the cliff and destroy itself along with the cargo, rather than weighing other ‘optimal’ decisions that carry some probability of hitting the car and harming the human.
In our chatbot good-behaviour problem, the undesirable behaviours are uses of the phrases on the ‘Negative List’ for each geographical variation. The interface will hold this list together with the logic to identify the geographical variation.
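Tying this back to the sketches above, the negative list could be handed to the Seldonian interface as a constraint function. The wrapper below reuses NEGATIVE_LISTS and region_for_ip from the earlier filter sketch; sample_dialogues and the chatbot’s respond method are hypothetical names for a held-out test set and the candidate bot.

```python
# Expressing the negative list as a Seldonian-style constraint:
# g(chatbot) <= 0 means the candidate bot never emits a blocked phrase
# on the held-out dialogues. sample_dialogues() is a hypothetical helper.

def negative_list_constraint(chatbot) -> float:
    """Count negative-list violations over held-out test dialogues;
    zero violations satisfies the constraint (g <= 0)."""
    violations = 0
    for user_msg, ip_address in sample_dialogues():
        reply = chatbot.respond(user_msg).lower()
        region = region_for_ip(ip_address)
        if any(p in reply for p in NEGATIVE_LISTS.get(region, set())):
            violations += 1
    return float(violations)
```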
I am in discussions with some sponsors about a research project to develop an English-language chatbot etiquette engine. Initial reactions from the various stakeholders are positive – everyone agrees on the need for an etiquette engine as well as on my approach.
I will be delighted to receive critique and comments from all of you.
As a closing note, I wanted to tell you that Natural Language Processing (NLP) is taking huge strides. “NLP is eating ML” is the talk of the town. NLP research, supported by large language models, Transformers and the like, is moving way ahead. Investment is going into question answering, language generation, knowledge management, and unsupervised/reinforcement learning.
In addition to desirable behaviour, many other ethical issues need to be incorporated. For example:
· Transparency: Does everyone know broadly how learning is done and how decisions are taken?
· Explainability: For every individual decision, if requested, can we explain how the decision was taken?
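For the explainability point, here is a small sketch of what decision logging could look like for the etiquette filter above; the log fields are my own illustrative assumptions. Recording which phrase matched, and for which region, lets us explain any individual rejection on request.

```python
import datetime

# Sketch of decision logging for explainability. Every rejection records
# enough detail to explain the decision later; the fields are assumptions.

DECISION_LOG = []

def explainable_filter(utterance: str, ip_address: str) -> bool:
    """Like is_learnable(), but records why an utterance was rejected."""
    region = region_for_ip(ip_address)
    text = utterance.lower()
    for phrase in NEGATIVE_LISTS.get(region, set()):
        if phrase in text:
            DECISION_LOG.append({
                "time": datetime.datetime.utcnow().isoformat(),
                "region": region,
                "matched_phrase": phrase,
                "decision": "rejected",
            })
            return False
    return True
```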
Also, many current AI/ML algorithms, especially those based on neural networks, have become black boxes. We expect a shift towards simpler algorithms for enterprise usage.