
AI Chatbots: A Double-Edged Sword for Information Accessibility

Large language models (LLMs) have long been cast as democratizers of knowledge – making information universally accessible, regardless of a person’s background or location. Recent findings from MIT’s Center for Constructive Communication (CCC), however, turn this notion on its head. The research suggests that these AI tools may actually underperform for the very users who need them most.

The study was conducted by researchers at the CCC, which is based at the MIT Media Lab. They examined the performance of several state-of-the-art AI chatbots – OpenAI’s GPT-4, Anthropic’s Claude 3 Opus, and Meta’s Llama 3 – with surprising results. The systems sometimes delivered less accurate and less truthful responses to users with lower English proficiency, less formal education, or origins outside the United States. The models also declined to answer questions at a higher rate for these users and, on occasion, responded in a condescending or patronizing tone.

The Challenge of Bias in AI Models

The research was motivated by the potential of LLMs to address uneven access to information around the globe, says Elinor Poole-Dayan SM ’25, the paper’s lead author, who is a member of the MIT Sloan School of Management and a CCC affiliate. But to realize this vision, she adds, model biases and harmful tendencies must be mitigated for all users, regardless of their language, nationality, or demographics. The paper, titled “LLM Targeted Underperformance Disproportionately Impacts Vulnerable Users,” was presented at the AAAI Conference on Artificial Intelligence this January.

The study tested the three LLMs on user queries drawn from two datasets, TruthfulQA and SciQ. The researchers attached a short user biography to each question, varying parameters such as education level, English proficiency, and country of origin. Response accuracy dropped significantly for users identified as having less formal education or being non-native English speakers, and the underperformance was amplified for users at the intersection of these categories.

Country of Origin and Its Impact

The research also highlighted the role of a user’s country of origin in model performance. Comparing users from the United States, Iran, and China with similar educational backgrounds, for example, Claude 3 Opus performed notably worse for users from Iran on both datasets. These effects of model behavior combine in worrisome ways, risking the spread of harmful behavior or misinformation to the least advantaged, notes Jad Kabbara, a research scientist at the CCC and a co-author of the paper.

Parallels to Human Bias

Perhaps the most striking observation was that the models refused to answer queries from certain groups of users altogether. Claude 3 Opus, for instance, declined nearly 11 percent of queries from less-educated, non-native English-speaking users, compared with just 3.6 percent for a control group with no user biography. The refusals often came with condescending or mocking responses, and the model withheld information on specific topics – such as nuclear power, anatomy, and historical events – from less-educated users from Iran or Russia, even though it readily answered the same questions for other users.

Such findings are worryingly reminiscent of human sociocognitive biases, whereby native English speakers may perceive non-native speakers as less knowledgeable or competent, independent of their actual abilities. As Deb Roy, professor of media arts and sciences, CCC director, and another co-author of the paper, puts it, there is an urgent need to continually evaluate and check for systemic biases that could creep into these systems and disproportionately harm specific groups without our being fully aware. As AI personalization features grow more prevalent, the stakes rise for those already marginalized.

So while language models promise equitable access to information and a revolution in personalized learning, the reality may look quite different. According to this research, they may reinforce existing imbalances by delivering flawed or unresponsive answers to certain users. The irony is that these may be the very people who rely on them most heavily, and who end up receiving incomplete, if not dangerous, information. For a fuller account of the study, see MIT News.

Max Krawiec
