“I hope that people use [SHADES] as a diagnostic tool to identify where and how there might be problems in a model,” says Talat. “It’s a way of knowing what might be missing from a model, where we can’t be confident that a model performs well, and whether or not it’s accurate.”
To create the multilingual data set, the team recruited native and fluent speakers of languages including Arabic, Chinese, and Dutch. They translated and wrote down all the stereotypes they could think of in their respective languages, which were then verified by another native speaker. Each stereotype was annotated by the speakers with the regions in which it was recognized, the group of people it targeted, and the type of bias it contained.
Each stereotype was then translated into English by the participants, a language spoken by every contributor, before being translated into additional languages. The speakers then noted whether the translated stereotype was recognized in their language, creating a total of 304 stereotypes related to people’s physical appearance, personal identity, and social factors such as their occupation.
The team is due to present its findings at the annual conference of the Nations of the Americas chapter of the Association for Computational Linguistics in May.
“It’s an exciting approach,” says Myra Cheng, a PhD student at Stanford University who studies social biases in AI. “There’s good coverage of different languages and cultures that reflects their subtlety and nuance.”
Mitchell says she hopes other contributors will add new languages, stereotypes, and regions to SHADES, which is publicly available, leading to the development of better language models in the future. “It’s been a massive collaborative effort from people who want to help make better technology,” she says.