Researchers warn of ‘stochastic parrot’ in AI text generators
AI researchers are warning about the ‘stochastic parrot’ effect in AI text generators. They would rather see companies train models on smaller, curated datasets.
Researchers working on the risks of AI recently held a ‘day of the stochastic parrot’, VentureBeat reports. The term comes from a paper on the risks of large language models (LLMs), the models behind text generators, whose authors included two researchers working at Google at the time. Such systems are called ‘stochastic parrots’ because, like the birds, they repeat words taken from their training data without understanding their meaning.
Many of the objections raised in that paper still hold. Because the training datasets are so large, they contain a lot of material that AI text generators should not repeat, such as calls for violence, pornography and discriminatory content. Companies try to filter it out by having people make corrections. In the case of OpenAI, those people are low-paid Kenyan workers, Time reported earlier. This creates a risk of worker exploitation, the researchers argue.
Moreover, texts on the internet are biased, the researchers warn, because they are written largely by white people and more often by men than by women. That bias automatically ends up in the LLM, without it being clear how it got there.
Proposed solutions include training on smaller, curated datasets, which should avoid many of the problems around moderation and bias. In addition, some scientists want AI models to be transparent about where their answers come from.