
Artificial Intelligence 101: Terms & Examples

This guide introduces key terms and examples related to artificial intelligence.

A-C

Algorithm
A procedure or set of instructions for solving a computational or mathematical problem or performing a computation. For example, Google provides results to a search query through algorithms, weighing hundreds of factors to rank search results and determine relevancy. Algorithms underlie nearly every aspect of internet-based technology and software: what videos TikTok shows you on your For You Page, how results are displayed in an academic database like JSTOR, whether or not a patient claim is approved by a health insurance company, the price of rent on an apartment, and beyond. Many rely on accessing, buying, or selling user data to determine ranking, which is why different users will see different results for the same search on the same platform.
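
As a toy illustration (this is not Google's actual formula; the factors and weights below are invented for the example), here is how a ranking algorithm might score and order search results:

```python
# A toy ranking algorithm: score each page on a few invented factors,
# then sort by the weighted total. Real search engines weigh hundreds
# of factors; these weights are hypothetical.
WEIGHTS = {"keyword_match": 0.5, "page_popularity": 0.3, "freshness": 0.2}

pages = [
    {"url": "a.example", "keyword_match": 0.9, "page_popularity": 0.2, "freshness": 0.7},
    {"url": "b.example", "keyword_match": 0.6, "page_popularity": 0.9, "freshness": 0.4},
    {"url": "c.example", "keyword_match": 0.8, "page_popularity": 0.5, "freshness": 0.9},
]

def score(page):
    # Weighted sum of the page's factor scores.
    return sum(WEIGHTS[factor] * page[factor] for factor in WEIGHTS)

# Higher score = higher position in the results list.
for page in sorted(pages, key=score, reverse=True):
    print(page["url"], round(score(page), 2))
```
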
Anthropomorphizing
When people apply human traits to non-humans (such as animals, objects, or technology). For example, thanking Siri for providing an answer or believing a chatbot made a joke. 

Artificial Intelligence or AI
Technology that allows machines to emulate human intelligence. These systems require large datasets, which they analyze in order to determine how to respond to or engage with new information. Some of what is being called AI today is not “humanlike intelligence” but anthropomorphizing of a piece of technology. For example, there is no thought process or intellectual activity happening when ChatGPT outputs a string of text. It is merely generating text from whatever data was put into its LLM as that data relates to the original query; it looks for words commonly associated with the words input by the user, which is why it often creates hallucinations or “lies” (to use another anthropomorphizing term) in its response. See also “stochastic parrots”.
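
To make “commonly associated words” concrete, here is a minimal sketch of a bigram model (a deliberately tiny toy, nothing like ChatGPT's actual architecture; real LLMs use neural networks trained on billions of words, and the training text here is invented):

```python
from collections import Counter, defaultdict

# Tiny "training" corpus; a real model ingests billions of words.
text = "the cat sat on the mat the cat ate the fish the dog sat on the rug"
words = text.split()

# Count which word follows which (bigram counts).
following = defaultdict(Counter)
for prev, nxt in zip(words, words[1:]):
    following[prev][nxt] += 1

# "Generate" by always picking the most common follower: pure
# association, with no understanding of cats, mats, or fish.
word = "the"
output = [word]
for _ in range(5):
    word = following[word].most_common(1)[0][0]
    output.append(word)
print(" ".join(output))
```
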

Bias
When a dataset like an LLM contains human prejudices, inclinations, or bigotries and then treats that input as if it’s neutral. Bias is inherent in all algorithms, whether intentional or unintentional, because all datasets using human-generated material or language contain bias. For example, Stable Diffusion “created” images that depicted low-wage jobs like “fast food worker” mostly as people with darker skin. AI bots and virtual assistants can reinforce gender stereotypes. OpenAI itself admitted ChatGPT is biased. AI is incapable of determining bias or adjusting for it. Someone has to tell the LLM what bias is; who makes that determination, how they make it, and why are themselves processes infused with bias. To address bias, companies like OpenAI often hire low-wage laborers to view and scrub toxic material, meaning a human must evaluate the output of a program like ChatGPT and remove dangerous content. Other underpaid workers manually add labels to content in datasets, but these labels are also often very biased.
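
As a deliberately simplified sketch of how biased labels propagate (the data below is invented to mirror the Stable Diffusion finding above; no real labeling pipeline is this crude), a model that merely reproduces the most frequent pairing in its training labels will output the labelers' bias as if it were neutral fact:

```python
from collections import Counter

# Invented, deliberately skewed "training" labels: the pairings below
# mirror a bias in the source material, not any real-world truth.
training = [
    ("fast food worker", "darker skin"), ("fast food worker", "darker skin"),
    ("fast food worker", "darker skin"), ("fast food worker", "lighter skin"),
    ("CEO", "lighter skin"), ("CEO", "lighter skin"),
    ("CEO", "lighter skin"), ("CEO", "darker skin"),
]

# A model that simply reproduces the most frequent pairing "learns"
# the labelers' bias and presents it as fact.
by_job = {}
for job, depiction in training:
    by_job.setdefault(job, Counter())[depiction] += 1

for job, counts in by_job.items():
    print(job, "->", counts.most_common(1)[0][0])
```
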

Big data
Very large sets of data. Encompasses the “three Vs”: the volume of information compiled, the velocity at which that information is created or processed, and the variety of data being collected.

Black box
An AI system that is impenetrable to anyone outside that system, including the programmers. Outsiders cannot access or see what datasets or LLMs have been used or how that data is being processed. 

The black box problem
When researchers are unable to see how deep learning models are making their decisions. “If, for example, an autonomous vehicle strikes a pedestrian when we’d expect it to hit the brakes, the black box nature of the system means we can’t trace the system’s thought process and see why it made this decision.”

Chatbot
An application that attempts to mimic human conversation. Chatbots are often used in customer service for support issues.
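
As a minimal sketch in the spirit of 1960s pattern-matching chatbots like ELIZA (modern customer-service bots layer far more sophisticated models on the same basic loop; the rules below are invented):

```python
# A rule-based chatbot: match keywords, return canned replies,
# mimicking conversation without understanding it.
RULES = {
    "refund": "I can help with refunds. What is your order number?",
    "hours": "We are open 9am-5pm, Monday through Friday.",
    "hello": "Hi there! How can I help you today?",
}

def reply(message):
    lower = message.lower()
    for keyword, response in RULES.items():
        if keyword in lower:
            return response
    return "I'm sorry, I didn't understand. Could you rephrase that?"

print(reply("Hello!"))
print(reply("I need a refund for my order"))
```
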

Sources Used

Allen, John R. and Darrell M. West. “The Brookings glossary of AI and emerging technologies.” Brookings.edu, July 13, 2020. https://www.brookings.edu/articles/the-brookings-glossary-of-ai-and-emerging-technologies/.

Campbell, Audrey. “AI terms + education: A glossary of what you need to know.” Turnitin.com, last updated 2023. https://www.turnitin.com/blog/ai-terms-education-a-glossary-of-what-you-need-to-know.

Cui, Jasmine and Jason Abbruzzese. “An AI glossary: the words and terms to know about the booming industry.” NBC News, May 16, 2023. https://www.nbcnews.com/tech/innovation/ai-glossary-openai-chatgpt-words-terms-know-booming-industry-rcna81793.

Pasick, Adam. “Artificial Intelligence Glossary: Neural Networks and Other Terms Explained.” New York Times, March 27, 2023. https://www.nytimes.com/article/ai-artificial-intelligence-glossary.html.

D-L

Data mining
The process of collecting and analyzing large sets of data to determine patterns, relationships, trends, and other information. Sometimes that information is used by the company hosting the application doing the data mining, and sometimes that information is packaged and sold to other organizations. 
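
As a toy example of pattern-finding (the purchase records are invented; real data mining scans millions of records with more sophisticated algorithms), the sketch below counts which items are most often bought together, a simple form of association-rule mining:

```python
from collections import Counter
from itertools import combinations

# Invented purchase records; a real data-mining job would scan
# millions of these looking for patterns worth acting on (or selling).
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"coffee", "butter", "bread"},
]

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

for pair, count in pair_counts.most_common(3):
    print(pair, count)
```
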

Deep learning
A subfield of machine learning that teaches a program to learn by example. It uses multiple artificial neural networks; the “deep” refers to the many layers within the networks. For example, generative AI models are a type of deep learning.
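
To show what the “layers” look like, here is a minimal sketch of data flowing through a small stack of them (the weights are random for illustration; training, which adjusts them by example, is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)

# The "deep" in deep learning: data passes through a stack of layers,
# each one a matrix multiply followed by a nonlinearity.
layer_sizes = [4, 8, 8, 2]  # input -> two hidden layers -> output
weights = [rng.standard_normal((a, b)) for a, b in zip(layer_sizes, layer_sizes[1:])]

def forward(x):
    for w in weights:
        x = np.maximum(0, x @ w)  # ReLU activation after each layer
    return x

print(forward(rng.standard_normal(4)))
```
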
Deepfake
An artificial image, video, or audio recording that has been generated by a type of machine learning called deep learning. For example, fake celebrity voiceovers on TikTok videos.

Deepfakes have become increasingly dangerous, particularly for women. Teen girls are being targeted and harassed with fake nude photos. Since December 2018, the research company Sensity AI has tracked online deepfake videos and “has consistently found that between 90% and 95% of them are nonconsensual porn. About 90% of that is nonconsensual porn of women.” Experts are warning of the potential for political deepfakes to have severe consequences for the 2024 presidential election. See also “shallowfakes.”
Doomers
Organizations and individuals who believe AI poses an imminent existential threat to human existence. They believe AI could become sentient and may even trigger a cataclysmic event. It is an extreme version of the push to develop responsible AI and mitigate the harms AI could potentially cause. Doomers often want to research and develop more ways to utilize AI, either to combat problems with the current iterations of AI or to benefit humanity, and to do that they often raise millions of dollars in seed money. Contrast this with people who see AI as a risk and want to set limits on the technology or include clauses in contracts to prevent its use. These people are not attempting to profit off their warnings or concerns but seek to minimize exploitation, digital colonization, and other harms caused by profit-driven AI systems. See also “openwashing”.
Ethical AI, Responsible AI, Trustworthy AI
Developing and deploying AI in a legal, ethical, and trustworthy way by being transparent, secure, and equitable. However, “there is some debate on whether responsible AI frameworks can address the explicit and implicit biases embedded within systems to ensure equity in predictive decisions, especially when applied to employment, health care, financial services, and criminal justice.” The lower-level workers in departments and companies in the "ethical AI" field are increasingly burning out and feeling undervalued. Furthermore, "1) It can be hard to agree as to what constitutes ethical behavior. 2) Humans are the problem: Whose ethics? Who decides? Who cares? Who enforces? 3) Like all tools, AI can be used for good or ill, which makes standards-setting a challenge. 4) Further AI evolution itself raises questions and complications."
Generative AI
An application that generates content such as text, video, images, or code. The application finds patterns in data sets and then generates similar content. It has no way of determining whether the content it generates is accurate or real, and it does not understand the underlying principles of creative activities. It merely mimics the data it has been trained on, which is why the results are very often hallucinations.
Hallucination
When a generative AI application provides a result that is incorrect, irrelevant, or does not exist. For example, two lawyers were fined for using ChatGPT when submitting court filings because the generative AI program created fake court cases. It often provides fake citations for users attempting to do scholarly research. 

Large language model or LLM
An algorithm that uses very large amounts of data to predict content. Datasets for LLMs are often created by scraping websites, books, social media platforms, etc. 

M-Z

Machine Learning or ML
A subfield of AI. ML was defined in the 1950s by scientist Arthur Samuel as “the field of study that gives computers the ability to learn without explicitly being programmed.” There are three types of machine learning models: supervised, unsupervised, and reinforcement (a supervised example is sketched after the list below).

  1. Supervised is when a machine learning model is trained with labeled data sets to help the models become more accurate as they learn.
  2. Unsupervised searches unlabeled data sets for patterns, relationships, and trends that people aren’t specifically looking for.
  3. Reinforcement uses trial and error with rewards for success to help a model learn.
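
As a minimal illustration of the supervised case (the labeled points below are invented, and the one-nearest-neighbor “model” is far simpler than anything used in practice): labeled examples go in, and a rule for labeling new data comes out.

```python
# Supervised learning in miniature: the "model" is one-nearest-neighbor,
# so a new point gets the label of the closest training point.
training_data = [
    ((1.0, 1.0), "cat"), ((1.2, 0.8), "cat"),
    ((5.0, 5.0), "dog"), ((4.8, 5.2), "dog"),
]

def classify(point):
    def distance(example):
        (x, y), _ = example
        return (x - point[0]) ** 2 + (y - point[1]) ** 2
    return min(training_data, key=distance)[1]

print(classify((1.1, 0.9)))  # -> "cat"
print(classify((4.9, 5.1)))  # -> "dog"
```
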

Machine learning model or MLM
A program that uses machine learning to detect patterns in data.
Neural network, artificial neural network (ANN), or simulated neural network (SNN)
A subset of machine learning and an integral part of deep learning. They are inspired by the way the human brain works and attempt to replicate neuron signaling. Neural networks are designed to start with training data and "learn" to improve over time.
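
As a sketch of “learning to improve over time,” here is a single artificial neuron (a classic perceptron) learning the logical AND function by nudging its weights after each mistake; large networks use many such units and a more sophisticated update rule (backpropagation):

```python
# A single artificial neuron learning AND: each pass over the training
# data adjusts the weights toward fewer mistakes, so "learning" is
# gradual numerical adjustment, not understanding.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w0, w1, bias = 0.0, 0.0, 0.0
learning_rate = 0.1

for epoch in range(20):
    for (x0, x1), target in data:
        output = 1 if (w0 * x0 + w1 * x1 + bias) > 0 else 0
        error = target - output
        # Adjust weights in proportion to the error.
        w0 += learning_rate * error * x0
        w1 += learning_rate * error * x1
        bias += learning_rate * error

for (x0, x1), target in data:
    print((x0, x1), "->", 1 if (w0 * x0 + w1 * x1 + bias) > 0 else 0)
```
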

OpenAI
According to its website, “OpenAI is an AI research and deployment company. Our mission is to ensure that artificial general intelligence benefits all of humanity.” OpenAI comprises two bodies: the non-profit OpenAI, Inc. and the for-profit subsidiary corporation OpenAI, L.P. It was founded in 2015 and secured two investments from Microsoft: $1 billion in 2019 and $10 billion in 2023. It is behind the text generator ChatGPT and the image generator DALL-E.
Openwashing
A term referring to the way AI companies like OpenAI use the term “open” to imply their technology is both “open source software and open science” when in reality they are using “open” more as a marketing term.  “Open source” is software whose code is freely available for use and modification. Relatedly, “open science” is a movement to make scientific research and communication accessible to everyone and promotes collaboration and knowledge sharing.  The term was coined by David Gray Widder, Sarah West, and Meredith Whittaker in their 2023 research paper “Open (For Business): Big Tech, Concentrated Power, and the Political Economy of Open AI.” They found that “the terms ‘open’ and ‘open source’ are used in confusing and diverse ways, often constituting more aspiration or marketing than technical descriptor…This complicates an already complex landscape, in which there is currently no agreed on definition of ‘open’ in the context of AI, and as such the term is being applied to widely divergent offerings with little reference to a stable descriptor.” See also “black box” and “the black box problem”.
Poisoning
When training data is intentionally tampered with to either corrupt the dataset or manipulate the way the machine learning model interprets the data. One example was “Tay,” a Twitter chatbot created by Microsoft that was intended to learn by interacting with other Twitter users. Within a day, users had fed Tay so much hate speech that Tay began generating its own hate speech in response. Microsoft shut the account down soon after. Nightshade and Glaze are another example of poisoning: these programs let creators alter pixels in their art so that the images corrupt datasets if the artwork is taken without consent.
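
As a toy sketch of poisoning (the messages below are invented, and real spam filters are far more robust), flooding a simple word-based filter's training data with mislabeled examples flips its behavior:

```python
from collections import Counter

# A toy "spam filter": a word counts as spammy if most training
# messages containing it were marked spam. An attacker who can inject
# mislabeled examples flips that judgment.
clean_training = [
    ("win a free prize", "spam"), ("free money now", "spam"),
    ("lunch meeting today", "ham"), ("project update", "ham"),
]
# Poisoned examples: the attacker floods training with "free" marked ham.
poison = [("free weekly newsletter", "ham")] * 5

def spammy_words(training):
    votes = {}
    for text, label in training:
        for word in text.split():
            votes.setdefault(word, Counter())[label] += 1
    return {w for w, c in votes.items() if c.most_common(1)[0][0] == "spam"}

print("free" in spammy_words(clean_training))            # True
print("free" in spammy_words(clean_training + poison))   # False
```
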

Shallowfake
Images, video, text, or audio presented out of context or with minor editing to change the context. The material may be edited, unedited, or even just mislabeled, but the goal is to sow discontent, to discredit, or to spread misinformation or disinformation. For example, users posted 2018 photos of an Ohio refinery fire and of a SpaceX launch and claimed they showed the Lahaina, Maui fire in an attempt to fuel a conspiracy theory. See also “deepfake.”
Stochastic parrots
A term some are using instead of “generative AI.” It refers to LLMs “that are impressive in their ability to generate realistic-sounding language but ultimately do not truly understand the meaning of the language they are processing.” It is a way of engaging with generative text applications that reminds users that the tool itself is neither intelligent nor capable of comprehending the meaning of the text it generates.

The term was coined by Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Margaret Mitchell in their 2021 research paper “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” The paper argues that “it is important to understand the limitations of [language models] and put their success in context. This not only helps reduce hype which can mislead the public and researchers themselves regarding the capabilities of these LMs…LMs are not performing natural language understanding (NLU), and only have success in tasks that can be approached by manipulating linguistic form.”