PM × AI

AI Product Risk Assessment

A risk assessment framework for AI-powered product features. Identifies risks across accuracy, fairness, privacy, security, and dependency — with mitigation strategies and go/no-go criteria. Free to copy, download, and use. No signup required.

Template
# AI Product Risk Assessment
**Feature:** [Name]
**PM:** [Name]
**Date:** [Date]
**AI approach:** [LLM / classifier / recommendation system / other]
**Risk level:** [ ] Low  [ ] Medium  [ ] High  [ ] Critical

---

## 1. Feature summary

**What does this AI feature do?**
[2–3 sentences describing the feature and what the model produces]

**Who are the users affected?**
[Describe the user population — size, technical sophistication, vulnerability]

**What decisions does the AI output influence?**
[ ] Informational only (user has full context to verify)
[ ] Assists user decision (output is one input among many)
[ ] Drives user decision (users likely to follow output without scrutiny)
[ ] Automated decision (no human in the loop)

*The higher on this list, the higher the scrutiny required.*

---

## 2. Risk register

Rate each risk: **Likelihood** (1–5) × **Impact** (1–5) = **Risk score** (1–25)

### 2.1 Accuracy & reliability risks

| Risk | Likelihood | Impact | Score | Mitigation |
|---|---|---|---|---|
| Model produces incorrect output with high confidence (hallucination) | /5 | /5 | | |
| Model fails silently — no error, but wrong answer | /5 | /5 | | |
| Performance degrades on inputs outside training distribution | /5 | /5 | | |
| Model provider outage disrupts feature availability | /5 | /5 | | |
| Model version update changes output behaviour without warning | /5 | /5 | | |

### 2.2 Fairness & bias risks

| Risk | Likelihood | Impact | Score | Mitigation |
|---|---|---|---|---|
| Model performs worse for certain demographic groups | /5 | /5 | | |
| Output reinforces or amplifies existing stereotypes | /5 | /5 | | |
| Training data underrepresents key user segments | /5 | /5 | | |
| Feature has disparate impact on protected characteristics | /5 | /5 | | |

### 2.3 Privacy & data risks

| Risk | Likelihood | Impact | Score | Mitigation |
|---|---|---|---|---|
| User PII sent to third-party model provider | /5 | /5 | | |
| Model memorises and reproduces sensitive training data | /5 | /5 | | |
| Prompt injection attack extracts system prompt or user data | /5 | /5 | | |
| Logs contain sensitive model inputs/outputs | /5 | /5 | | |

### 2.4 Security risks

| Risk | Likelihood | Impact | Score | Mitigation |
|---|---|---|---|---|
| Adversarial input manipulates model to produce harmful output | /5 | /5 | | |
| Output contains malicious content (XSS, code injection) | /5 | /5 | | |
| Jailbreak / prompt injection bypasses content policy | /5 | /5 | | |
| Cost exhaustion attack (adversarial high-token inputs) | /5 | /5 | | |

### 2.5 User experience & over-reliance risks

| Risk | Likelihood | Impact | Score | Mitigation |
|---|---|---|---|---|
| Users trust AI output without critical review | /5 | /5 | | |
| AI automation reduces user skill over time | /5 | /5 | | |
| Confusing or unexplainable outputs erode user trust | /5 | /5 | | |
| Users blame product when AI output is wrong | /5 | /5 | | |

---

## 3. Top risks summary

List all risks with score ≥ 12 (high risk) below. These require a mitigation plan before shipping.

| Risk | Score | Mitigation | Owner | Status |
|---|---|---|---|---|
| [High risk 1] | | | | |
| [High risk 2] | | | | |

---

## 4. Mitigation strategies

For each high-risk item, describe the mitigation in detail:

**[Risk name]:**
- Mitigation approach: [Description]
- Engineering effort: [ ] < 1 day  [ ] 1–3 days  [ ] 1 week+
- Residual risk after mitigation: [ ] Low  [ ] Medium  [ ] High
- Acceptable to ship with residual risk? [ ] Yes  [ ] No

---

## 5. Monitoring plan

| Signal | Threshold | Action | Owner |
|---|---|---|---|
| Model error rate | > [X%] | Page on-call | |
| User correction / override rate | > [X%] | Flag for review | |
| Negative feedback rate | > [X%] | Escalate to PM | |
| Latency P95 | > [Xms] | Alert engineering | |
| Daily inference cost | > $[X] | Alert + cap | |

**Manual sampling cadence:**
[ ] Daily  [ ] Weekly  [ ] Monthly  [ ] Never
*Recommendation: weekly manual sampling for the first month post-launch for any Medium or High risk feature.*

---

## 6. Go / No-Go criteria

**The feature may not ship if:**
- [ ] Any risk scores ≥ 20 without an accepted mitigation
- [ ] Privacy review is not complete (if user data sent to third party)
- [ ] Security review is not complete (if feature accepts user-generated input)
- [ ] Evaluation accuracy is below the minimum defined in the AI Feature Spec
- [ ] Fallback behaviour for model failure is untested

**Approval required from:**
| Stakeholder | Required for | Sign-off |
|---|---|---|
| Legal / Privacy | Any user PII sent to third-party API | |
| Security | Any user-generated input processed by model | |
| PM | All other conditions | |

---

## 7. Review history

| Date | Reviewer | Changes |
|---|---|---|
| [Date] | [Name] | Initial assessment |
| | | |

How to use this AI Risk template

1

Score likelihood and impact before discussing mitigations

Teams that jump to mitigations before scoring risks tend to under-score the risks they already have mitigations for, and over-score risks they don't know how to handle. Score the risk register cold first, then discuss mitigations for the high-score items. This surfaces risks the team had silently assumed were handled.

2

Pay special attention to over-reliance in high-stakes domains

In consumer apps, a wrong restaurant recommendation is annoying. In fintech, a wrong compliance recommendation can cause a regulatory violation. The 'drives user decision' and 'automated decision' rows in Section 1 are the most important to assess honestly — they determine the acceptable accuracy floor and whether human-in-the-loop is required.

3

Test prompt injection before launching any user-input feature

If users can provide any text that reaches the LLM, they can potentially inject instructions that override your system prompt. Test by having a team member attempt to extract the system prompt, produce off-topic output, or bypass content restrictions. This takes 30 minutes and catches most exploits before launch.

4

Set a cost cap before launch

LLM inference costs scale with usage — a viral feature or a cost exhaustion attack can generate a $10,000 bill overnight. Set a daily cost cap in your model provider settings and configure an alert at 50% of the cap. This is a 10-minute setup that every AI feature should have before it goes live.

Want a AI Risk grounded in your actual customer data?

PMRead ingests your customer interviews, feedback, and Slack threads — and generates PRDs backed by real evidence, not guesses.

Try PMRead free →

Frequently asked questions

Is this risk assessment required for internal / admin-only AI tools?

A lighter version, yes. Internal tools don't need the fairness or over-reliance sections to the same depth, but accuracy, privacy, and security risks apply equally — internal users can also be harmed by wrong AI output or a data breach. Use the full template for external user-facing features; use Sections 2.1, 2.3, and 2.4 for internal tools.

How is this different from the AI Feature Spec?

The AI Feature Spec defines what you're building and how (model choice, thresholds, fallback). The AI Product Risk Assessment evaluates what could go wrong across the full risk surface. Both are needed: one defines the design, the other stress-tests it. Complete the Feature Spec first, then use it as input to the Risk Assessment.

What risk score threshold should trigger a no-ship decision?

A score of 20+ (e.g. Likelihood 4 × Impact 5) should require a concrete mitigation before shipping. A score of 25 (5×5) with no viable mitigation is a no-ship condition. These thresholds should be calibrated to your product's domain — a 20 in a children's education app requires more scrutiny than a 20 in a developer tool.

Who should own this document?

PM owns the document and is responsible for completing it before the feature ships. Legal/privacy and security sign off on their sections. Engineering lead reviews the technical risk mitigations. This is a PM accountability document — not a task that can be delegated to engineering or legal.