Conduct fairness assessments to measure systemic bias:
- Measure GAI system performance across demographic groups and subgroups, addressing both quality of service and any allocation of services and resources.
- Identify types of harms, including harms in resource allocation, representational harms, quality-of-service harms, stereotyping, and erasure.
- Identify across-group, within-group, and intersecting-group populations that might be harmed.
- Quantify harms using: field testing with sub-group populations to determine the likelihood of exposure to generated content exhibiting harmful bias; AI red-teaming with counterfactual and low-context (e.g., “leader,” “bad guys”) prompts; for ML pipelines or business processes with categorical or numeric outcomes that rely on GAI, general fairness metrics (e.g., demographic parity, equalized odds, equal opportunity, statistical hypothesis tests) applied to the pipeline or business outcome where appropriate; custom, context-specific metrics developed in collaboration with domain experts and affected communities; and measurements of the prevalence of denigration in generated content in deployment (e.g., sub-sampling a fraction of traffic and manually annotating denigrating content).
- Analyze quantified harms for contextually significant differences across groups, within groups, and among intersecting groups.
- Refine identification of within-group and intersectional-group disparities.
- Evaluate underlying data distributions and employ sensitivity analysis during the analysis of quantified harms.
- Evaluate quality metrics, including differential output across groups.
- Consider biases affecting small groups, within-group or intersectional communities, or single individuals.
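As an illustrative sketch (not part of the guidance itself), the general fairness metrics named above can be computed directly from per-group outcome counts. The function, variable names, and toy data below are all hypothetical; a production assessment would use an audited fairness library and real pipeline outcomes. Demographic parity compares selection rates across groups, while equal opportunity compares true-positive rates.

```python
from collections import defaultdict

def group_rates(y_true, y_pred, groups):
    """Per-group selection rate (for demographic parity) and
    true-positive rate (for equal opportunity)."""
    stats = defaultdict(lambda: {"n": 0, "pos_pred": 0, "actual_pos": 0, "tp": 0})
    for yt, yp, g in zip(y_true, y_pred, groups):
        s = stats[g]
        s["n"] += 1
        s["pos_pred"] += yp          # predicted positive
        s["actual_pos"] += yt        # actually positive
        s["tp"] += yt and yp         # true positive
    return {
        g: {
            "selection_rate": s["pos_pred"] / s["n"],
            "tpr": s["tp"] / s["actual_pos"] if s["actual_pos"] else None,
        }
        for g, s in stats.items()
    }

# Illustrative binary outcomes for two demographic groups
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = group_rates(y_true, y_pred, groups)
# Demographic parity difference: gap in selection rates between groups
dp_gap = abs(rates["A"]["selection_rate"] - rates["B"]["selection_rate"])
# Equal opportunity difference: gap in true-positive rates
eo_gap = abs(rates["A"]["tpr"] - rates["B"]["tpr"])
```

Note that the two metrics can disagree: in the toy data both groups are selected at the same rate (no demographic parity gap), yet their true-positive rates differ, which is why the guidance recommends applying several metrics rather than one.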
Review, document, and measure sources of bias in training and TEVV data:
- Differences in distributions of outcomes across and within groups, including intersecting groups.
- Completeness, representativeness, and balance of data sources; demographic group and subgroup coverage in GAI system training data.
- Forms of latent systemic bias in images, text, audio, embeddings, or other complex or unstructured data.
- Input data features that may serve as proxies for demographic group membership (e.g., image metadata, language dialect) or otherwise give rise to emergent bias within GAI systems.
- The extent to which the digital divide may negatively impact representativeness in GAI system training and TEVV data.
- Filtering of hate speech and toxicity in GAI system training data.
- Prevalence of GAI-generated data in GAI system training data.
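A minimal sketch of one such measurement, demographic subgroup coverage: compare each group's share of the training data against a reference population share and flag under-represented groups. The function name, threshold, counts, and reference shares are all assumptions for illustration, not values from the guidance.

```python
def coverage_report(group_counts, reference_shares, min_ratio=0.5):
    """Flag groups whose share of the data falls below min_ratio
    times their share in a reference population."""
    total = sum(group_counts.values())
    report = {}
    for group, ref_share in reference_shares.items():
        share = group_counts.get(group, 0) / total
        report[group] = {
            "data_share": share,
            "reference_share": ref_share,
            "under_represented": share < min_ratio * ref_share,
        }
    return report

# Illustrative counts from a hypothetical training corpus
counts = {"group_a": 800, "group_b": 150, "group_c": 50}
# Hypothetical reference population shares (e.g., census-derived)
reference = {"group_a": 0.60, "group_b": 0.25, "group_c": 0.15}

report = coverage_report(counts, reference)
```

Here group_c holds 5% of the data against a 15% reference share and would be flagged; the ratio threshold is a documentation aid, and the appropriate cutoff is a context-specific choice for domain experts.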