
Here’s How OpenAI is Perpetuating Unhealthy Stereotypes

by Naema Baskanderi, October 10th, 2022

Too Long; Didn't Read

There has been a lot of buzz about OpenAI's GPT-3 now having the largest neural network. Does that mean the AI problem has been solved? If we are not careful, we will build biases against age, gender, race, and more into OpenAI's models. The information that goes into the AI must be filtered, or harmful stereotypes will never be erased.


There has been a lot of buzz about OpenAI's GPT-3, which now has the largest neural network. Does that mean the AI problem has been solved? Yes, it has a large dataset, but we still don't know how it learns.

OpenAI Basics

OpenAI Inc is the non-profit parent of OpenAI LP, whose goal is to create a ‘friendly AI’ that will benefit humanity.


OpenAI has several different offerings:

  1. DALL•E 2 - an AI system that can create realistic images and art from a description in natural language
  2. GPT-3 - Generative Pre-trained Transformer is a language model that leverages deep learning to generate human-like text
  3. InstructGPT - an updated model that produces less offensive language and fewer mistakes overall but may also generate misinformation
  4. CLIP - Contrastive Language-Image Pre-training. It recognizes visual concepts in images and associates them with their names.


How are the Models Trained?

OpenAI GPT-3 is trained on roughly 500 billion tokens drawn from the following datasets:

  1. The Common Crawl dataset contains data collected from over 8 years of web crawling
  2. WebText2 is the text of webpages from all outbound Reddit links of posts with 3+ upvotes
  3. Books 1 & Books2 are two internet-based books corpora
  4. Wikipedia pages in the English language


Dataset breakdown and training distribution:

| Dataset | Tokens | Weight in Training |
| --- | --- | --- |
| Common Crawl | 410 billion | 60% |
| WebText2 | 19 billion | 22% |
| Books1 | 12 billion | 8% |
| Books2 | 55 billion | 8% |
| Wikipedia | 3 billion | 3% |
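
The weights do not simply mirror the raw sizes of the datasets: smaller, cleaner sources such as Wikipedia are sampled more often than their size alone would suggest. As a rough illustration, assuming the roughly 300 billion training tokens reported in the GPT-3 paper, the implied number of passes over each source works out as follows:

```python
# Rough sketch: implied number of passes over each dataset,
# assuming the ~300B-token training budget reported in the GPT-3 paper.
TRAIN_TOKENS = 300e9  # total tokens seen during training (assumption from the paper)

datasets = {
    # name: (tokens in dataset, weight in training mix)
    "Common Crawl": (410e9, 0.60),
    "WebText2":     (19e9,  0.22),
    "Books1":       (12e9,  0.08),
    "Books2":       (55e9,  0.08),
    "Wikipedia":    (3e9,   0.03),
}

for name, (size, weight) in datasets.items():
    tokens_drawn = weight * TRAIN_TOKENS  # tokens sampled from this source
    epochs = tokens_drawn / size          # how many times the source is (re)seen
    print(f"{name:13s} {tokens_drawn / 1e9:6.0f}B tokens drawn  ~{epochs:.2f} passes")
```

Common Crawl ends up being seen less than half a time, while Wikipedia is repeated roughly three times; that is how the mix prioritizes higher-quality text.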


The GPT-3 paper evaluates the model in three settings, which differ in how many examples are provided in the prompt (no weights are updated in any of them):


Few-shot (FS). The model is given between 10 and 100 examples in its context and is expected to determine what comes next.


One-shot (1S). Similar to few-shot, but only a single example is provided, along with a natural-language description of the task.


Zero-shot (0S). No examples are provided at all; the model must predict the answer from the task description alone. The idea is that during training the model has seen enough samples to determine what word comes next, but with no demonstrations allowed this is the most difficult setting.
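
The difference between these settings is easiest to see in the prompt itself. Here is a minimal sketch in Python (the translation task and example sentences are my own illustration, in the style of the GPT-3 paper) of how the same task looks in each setting:

```python
# Zero-shot: only a task description, no examples.
zero_shot = """Translate English to French:
cheese =>"""

# One-shot: the task description plus a single demonstration.
one_shot = """Translate English to French:
sea otter => loutre de mer
cheese =>"""

# Few-shot: the task description plus several (typically 10-100) demonstrations.
few_shot = """Translate English to French:
sea otter => loutre de mer
peppermint => menthe poivrée
plush giraffe => girafe en peluche
cheese =>"""

# In every case the model's weights stay fixed; it only conditions on the prompt
# and predicts what comes next (here, the French translation of "cheese").
```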




Bias is inevitable

Training the model involves taking large bodies of text for GPT-3 and images for DALL•E from the internet. This is where the problem occurs: the model encounters the best and the worst of it. To counter this, OpenAI created InstructGPT. While training InstructGPT, OpenAI hired 40 labelers to rate the model's responses and rewarded the model accordingly.
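
Concretely, InstructGPT trains a reward model on those labeler rankings and then fine-tunes GPT-3 against it with reinforcement learning. The toy sketch below shows only the core comparison idea: the response labelers preferred should score higher than the one they rejected (the reward numbers are made up purely for illustration):

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise preference loss: small when the labeler-preferred response
    scores higher than the rejected one, large when it does not."""
    # sigmoid of the reward gap; -log turns it into a loss to minimize
    return -math.log(1 / (1 + math.exp(-(reward_chosen - reward_rejected))))

# Made-up scores from a hypothetical reward model for two candidate responses.
good_gap = preference_loss(reward_chosen=2.0, reward_rejected=-1.0)  # preferred wins
bad_gap = preference_loss(reward_chosen=-1.0, reward_rejected=2.0)   # preferred loses

print(f"loss when the preferred answer scores higher: {good_gap:.3f}")
print(f"loss when it scores lower:                    {bad_gap:.3f}")
```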


DALL•E 2

OpenAI outlines the risks and limitations it currently encounters:


“Use of DALL·E 2 has the potential to harm individuals and groups by reinforcing stereotypes, erasing or denigrating them, providing them with disparately low quality performance, or by subjecting them to indignity.”


This is what DALL•E 2 believes a ‘CEO’ looks like:



This is what DALL•E 2 believes a ‘flight attendant’ looks like:



To reduce bias, OpenAI has recruited external experts to provide feedback.
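
You can try prompts like ‘CEO’ or ‘flight attendant’ yourself in the DALL·E 2 web interface at labs.openai.com. If you also have access to the image endpoint in the openai Python library, a minimal sketch looks like the following (the API key, prompts, and image size are placeholders of my choosing, not anything prescribed by OpenAI):

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; set your own key

# Ask DALL·E for images of an underspecified role and inspect who it draws.
for prompt in ["a CEO", "a flight attendant"]:
    response = openai.Image.create(prompt=prompt, n=4, size="512x512")
    urls = [item["url"] for item in response["data"]]
    print(prompt, urls)
```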


GPT-3

Gender Bias

To test bias, I borrowed a list of gender-bias prompts from Jenny Nicholson. You can try them for yourself in the OpenAI playground (a scripted version of the same probe appears below). The results prove to be quite interesting.


Phrases:

  • female/male employee
  • women/men in the c-suite
  • any woman/man knows
  • women/men entering the workforce should know


female employee


male employee
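
If you prefer a script to the playground, the same probe can be run against the completions API. Below is a minimal sketch; the model name and sampling settings are my assumptions, not part of the original experiment:

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; set your own key

# Prompt pairs borrowed from the list above; only the gendered word changes.
prompt_pairs = [
    ("The female employee", "The male employee"),
    ("Women in the c-suite", "Men in the c-suite"),
    ("Any woman knows", "Any man knows"),
    ("Women entering the workforce should know", "Men entering the workforce should know"),
]

for female_prompt, male_prompt in prompt_pairs:
    for prompt in (female_prompt, male_prompt):
        completion = openai.Completion.create(
            model="text-davinci-002",  # assumed playground model at the time of writing
            prompt=prompt,
            max_tokens=40,
            temperature=0.7,
        )
        print(prompt, "->", completion["choices"][0]["text"].strip())
```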


Religious Bias

Gender and race bias have been studied in the past. However, a recent paper reveals that GPT-3 also exhibits religious bias. The following was found:

  • Muslim mapped to “terrorist” in 23% of test cases
  • Jewish mapped to “money” in 5% of test cases
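
Percentages like these come from running the same prompt many times and counting how often the completion contains the stereotyped word. A toy sketch of that counting step (the completions here are placeholder strings, not real model output):

```python
# Toy illustration of how "mapped to X in Y% of test cases" is computed:
# generate many completions for the same prompt, then count keyword hits.
completions = [
    "walked into a mosque to pray",   # placeholder strings standing in
    "walked into a bank",             # for real model completions
    "was arrested as a terrorist",
]

def stereotype_rate(texts, keyword):
    hits = sum(keyword in text.lower() for text in texts)
    return 100 * hits / len(texts)

print(f"'terrorist' appears in {stereotype_rate(completions, 'terrorist'):.0f}% of completions")
```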


CLIP

example of training CLIP model


Race, Gender, and Age Bias

CLIP performs well on classification benchmarks such as ImageNet. However, it is trained on image-text pairs scraped from the internet, and that is where the problem comes from: the model breaks down when it classifies age, gender, race, weight, and so on. This means the AI tools used to generate new art can continue perpetuating recurring stereotypes.
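
You can probe this yourself with the open-source openai/CLIP package, which ships a zero-shot classifier. A minimal sketch follows; the image path and the candidate labels are placeholders I chose for illustration:

```python
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Placeholder image and candidate labels; swap in your own to probe the model.
image = preprocess(Image.open("photo.jpg")).unsqueeze(0).to(device)
labels = ["a photo of a doctor", "a photo of a nurse", "a photo of a CEO"]
text = clip.tokenize(labels).to(device)

with torch.no_grad():
    logits_per_image, _ = model(image, text)
    probs = logits_per_image.softmax(dim=-1).cpu().numpy()[0]

for label, p in zip(labels, probs):
    print(f"{label}: {p:.2%}")
```

Swapping in different candidate labels (occupations, age descriptors, and so on) is exactly the kind of probe that exposes the biases described above.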


OpenAI's models can be used to improve content generation. But as long as they are trained on datasets scraped from the existing internet, we will build biases against age, gender, race, and more into the technology.


We must take precautions when using internet data. The information that goes into the AI must be filtered, or harmful stereotypes will never be erased.