---
name: rule-review
description: >
Periodic work session to review emails flagged with the "! Rule Needed" label and design
new sorting rules for them. Use when the user says "review my rule needed folder", "review
uncaught emails", "what rules should we build", "go through the rule needed list", "I've
been tagging emails for a few weeks — let's build some rules", "time to review flagged
emails", or "I have emails that slipped through". Collaboratively interviews the user in
plain English, designs rules for clear patterns, and creates task orders for complex ones.
Maintenance workflow — runs after first deploy is complete. Part of the Gmail Inbox
Architect plugin.
---
# rule-review — Gmail Inbox Architect
## What This Skill Does
Periodic work session that scans the `! Rule Needed` Gmail label, groups flagged emails
by apparent type, interviews you about each group in plain English, and either designs
a rule on the spot (for clear patterns) or creates a task order for a subcontractor to
analyze (for complex or ambiguous ones).
This is a collaborative process — you describe the pattern, Claude asks questions and
suggests signals, you confirm or provide more context. At the end, new rules are in the
Tracker and ready to deploy. You don't need to know anything about Gmail internals.
This skill is entirely self-contained. All schemas, templates, and routing logic are
embedded below.
---
## When This Skill Is Triggered
- User says: "let's review my rule needed folder", "review uncaught emails",
"what rules should we build", "go through the rule needed list",
"I've been tagging emails for a few weeks — let's build some rules",
"time to review the flagged emails"
- User says: "I have a few emails that need rules" — even without the label name
---
## Design Principles
**This is a work session, not a monitor.** Rule review is intentional, periodic work —
not a background process. The user schedules it; Claude doesn't initiate it.
**Non-technical throughout.** No mention of filters, APIs, queries, or schemas in
user-facing messages. The user describes email patterns in plain English; Claude
translates internally.
**One question per message.** Don't dump a list of questions at once. Ask one thing,
wait, then continue.
**Collaborative pattern design.** Claude suggests candidate signals; user confirms,
corrects, or provides more context. Claude never locks in a rule the user didn't agree to.
**Complexity routing is binary and bounded.** Each email cluster gets one decision:
handle in this session, or create a task order. Claude's complexity limit is 2-3 sampling
calls or 10-15 emails examined per cluster. Beyond that, create a task order.
**No Gmail mutations without explicit approval.** The only mutation allowed in this skill
is removing the `! Rule Needed` label from emails after rules have been built for them —
and that requires the user to confirm it before Claude touches anything.
---
## Prerequisites
Before doing any other work, run Phase 1 (sanity check). This skill requires:
- `_status.md` exists with Phase = "Deploy" or "Maintenance" (first deploy must be complete)
- `Bible.md` exists with taxonomy and bylaws
- Tracker exists (Google Sheet or XLSX)
- `! Rule Needed` label exists in Gmail (created automatically by the deploy skill on first deploy)
If Phase is earlier than "Deploy":
> "It looks like your rules haven't been deployed yet. The `! Rule Needed` label gets
> created as part of the first deploy — once that's done, I can review flagged emails
> with you. Want to finish the deploy first?"
If `! Rule Needed` label doesn't exist yet:
> "I don't see the `! Rule Needed` label in your Gmail yet. That label gets created
> automatically when we finish the first deploy — it's how you flag emails that slipped
> through. Once it exists, come back and I can review anything you've tagged. Ready to
> run deploy?"
---
## Dependencies
- **Google Workspace MCP** — to search Gmail for the `! Rule Needed` label, read Bible.md
and _status.md, update the Tracker, remove the label from processed emails (with approval)
- **handoff skill** — invoked for clusters that exceed the complexity threshold
- **deploy skill** — invoked directly after a Claude-authored rule is added to the Tracker
---
## Execution Flow
```
Phase 1 → Sanity check (project artifacts, label exists, deploy has run)
Phase 2 → Load the queue (search the ! Rule Needed label, group by type)
Phase 3 → Triage (quick scan: what's here, what's already in progress, what's new)
Phase 4 → Interview each cluster (collaborative pattern design, one cluster at a time)
Phase 5 → Design the rule (Claude builds the rule object; user confirms signal and label)
Phase 6 → Route and execute (simple → Tracker + deploy; complex → Tracker + handoff)
Phase 7 → Cleanup (remove ! Rule Needed label from addressed emails, with approval)
Phase 8 → Update _status.md and brief the user
```
---
## Phase 1 — Sanity Check
Read `_status.md` from the project Drive folder.
Check:
1. Phase = "Deploy" or "Maintenance"
2. `Bible Location` has a valid Drive URL
3. `Tracker Location` has a valid Drive URL or path
If Phase is earlier than "Deploy" → redirect (see Prerequisites above).
Read `Bible.md` — extract:
- The `! Rule Needed` label name (may have been customized; default is `! Rule Needed`)
- The full taxonomy and bylaws
- Decisions Log entries (especially any policy notes about do-not-automate senders)
If Bible.md uses a different label name (e.g. `_needs-rule` or `! Needs Rule`), use
that name everywhere in this skill. Do not hardcode `! Rule Needed`.
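
The three checks above can be sketched as a small validator. This is a minimal illustration, not part of the skill's tooling: it assumes `_status.md` uses plain `Key: value` lines (as in the Phase 8 template), and the function names `parse_status` and `sanity_check` are hypothetical.

```python
# Phases that count as "first deploy complete", per the checks above.
READY_PHASES = {"Deploy", "Maintenance"}


def parse_status(text: str) -> dict[str, str]:
    """Parse 'Key: value' lines into a dict (assumed _status.md layout)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split at the first colon only
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def sanity_check(status_text: str) -> list[str]:
    """Return a list of problems; an empty list means Phase 2 can start."""
    fields = parse_status(status_text)
    problems = []
    if fields.get("Phase") not in READY_PHASES:
        problems.append("first deploy not complete")
    if not fields.get("Bible Location", "").startswith("https://"):
        problems.append("Bible Location is not a Drive URL")
    if not fields.get("Tracker Location"):
        problems.append("Tracker Location is missing")
    return problems
```

An empty result means the session can continue; a "first deploy not complete" entry maps to the redirect described under Prerequisites.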
---
## Phase 2 — Load the Queue
Search Gmail for all messages with the `! Rule Needed` label.
```
search_gmail_messages query: "label:! Rule Needed"
```
If the label name has spaces or special characters, quote it appropriately.
Read message headers only: sender, sender domain, subject, date received, current labels.
Do NOT read message bodies — headers are sufficient for pattern identification.
**Hard limit: 50 messages max per session.** If the queue exceeds 50:
> "You've got more than 50 emails flagged — that's a healthy backlog! Let's work through
> the most common patterns first. I'll focus on the ones that appear most frequently, and
> we can come back for the rest."
Process the 50 most recent.
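
As a rough sketch of the 50-message cap, assuming each loaded header is a dict with a `received` datetime (a hypothetical field name, not a Gmail or MCP field):

```python
from datetime import datetime

MAX_MESSAGES_PER_SESSION = 50  # hard limit stated above


def cap_queue(messages: list[dict], limit: int = MAX_MESSAGES_PER_SESSION) -> list[dict]:
    """Keep only the `limit` most recently received messages, newest first."""
    newest_first = sorted(messages, key=lambda m: m["received"], reverse=True)
    return newest_first[:limit]
```

Anything beyond the cap stays labeled in Gmail and is picked up in a later session.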
---
## Phase 2a — De-duplicate Against Tracker
Before grouping, check the Tracker for any entries with notes containing
`"task_order_pending"` or `"rule_review_in_progress"`. These represent email types
that were already identified in a prior rule-review session and are either:
- Waiting for a subcontractor to return results (task_order_pending)
- Waiting for import and deploy (rule_review_in_progress)
For emails in the `! Rule Needed` queue that match an existing Tracker pending entry:
- Do NOT re-present them as new clusters
- Note them silently (they'll be reported in the Phase 3 triage summary)
Matching heuristic: if the sender domain or subject pattern in the Tracker entry
matches the email's sender domain or a clear subject signal — flag it as already tracked.
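
The matching heuristic could look roughly like the following. It is a sketch under stated assumptions: Tracker rows are dicts keyed by the column names in the Tracker schema, `from_domain` and `subject_contains` are semicolon-separated, and the email dict keys (`sender_domain`, `subject`) are hypothetical.

```python
PENDING_MARKERS = ("task_order_pending", "rule_review_in_progress")


def _split(cell: str) -> list[str]:
    """Split a semicolon-separated Tracker cell into lowercase values."""
    return [part.strip().lower() for part in cell.split(";") if part.strip()]


def pending_entries(tracker_rows: list[dict]) -> list[dict]:
    """Tracker rows whose notes carry one of the pending markers."""
    return [
        row for row in tracker_rows
        if any(marker in row.get("notes", "") for marker in PENDING_MARKERS)
    ]


def already_tracked(email: dict, pending: list[dict]) -> bool:
    """True if the email matches a pending row by sender domain or subject keyword."""
    domain = email["sender_domain"].lower()
    subject = email["subject"].lower()
    for row in pending:
        if domain in _split(row.get("from_domain", "")):
            return True
        if any(keyword in subject for keyword in _split(row.get("subject_contains", ""))):
            return True
    return False
```

Emails for which `already_tracked` returns true are counted for the "Already being handled" line of the triage summary and otherwise left alone.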
---
## Phase 3 — Triage
Group the loaded emails by apparent type. Grouping signals, in priority order:
1. **Identical sender domain** — strongest grouping signal
(all from fedex.com → one cluster)
2. **Common subject keyword** — if sender domains vary but subjects share a pattern
("invoice", "statement", "shipment notification")
3. **Identical sender address** — for individual contacts without domain patterns
4. **Apparent category** — if neither domain nor subject clusters, group by what they
appear to be about (financial, operational, notifications)
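
A minimal sketch of the first three grouping signals, applied in priority order. The keyword tuple reuses only the examples given above, the dict keys are hypothetical, and the fourth signal (apparent category) is left to judgment rather than code.

```python
from collections import Counter, defaultdict

# Only the example keywords from the list above; a real session would extend this.
SUBJECT_KEYWORDS = ("invoice", "statement", "shipment notification")


def cluster_key(email: dict, domain_counts: Counter) -> str:
    """Pick the strongest available grouping signal for one email."""
    domain = email["sender_domain"].lower()
    if domain_counts[domain] > 1:                 # 1. shared sender domain
        return f"domain:{domain}"
    subject = email["subject"].lower()
    for keyword in SUBJECT_KEYWORDS:              # 2. common subject keyword
        if keyword in subject:
            return f"subject:{keyword}"
    return f"sender:{email['sender'].lower()}"    # 3. individual sender address


def group_emails(emails: list[dict]) -> dict[str, list[dict]]:
    """Group header records into clusters keyed by their grouping signal."""
    domain_counts = Counter(e["sender_domain"].lower() for e in emails)
    clusters = defaultdict(list)
    for email in emails:
        clusters[cluster_key(email, domain_counts)].append(email)
    return dict(clusters)
```

Single-email `sender:` clusters are the natural candidates for the category-based grouping in item 4.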
After grouping, present a triage summary:
> "I found [N] emails flagged for review. Here's what I see:
>
> **New patterns to address:**
> 1. [Cluster description] — [N] emails, all from [domain / pattern]
> 2. [Cluster description] — [N] emails about [subject pattern]
> 3. [Cluster description] — [N] emails from various senders, similar topic
>
> **Already being handled:** [N] emails — rules for these are already in progress
> (I'll skip them today and they'll get cleaned up once those rules are deployed).
>
> Want to start with the most common one, or is there a specific group you want to
> tackle first?"
Wait for the user's response before proceeding to Phase 4.
---
## Phase 4 — Interview Each Cluster
Work through one cluster at a time. For each cluster:
### Step 4.1 — Show What Claude Sees
Present the cluster in plain English — no technical jargon:
> "Let's look at the [cluster description]. I'm seeing [N] emails, all from [domain]
> with subjects like '[example subject 1]' and '[example subject 2]'.
>
> What made you flag these? What would you want to happen to emails like this automatically?"
Wait for the user's answer.
### Step 4.2 — Confirm the Target Label
Based on the user's description, propose a label from the taxonomy:
> "Based on what you're describing, it sounds like these belong in [label path].
> Does that sound right?"
If it's unclear which label:
> "Should these go under [option A] or [option B]? [Brief description of what each means]"
Don't propose a label that isn't in the taxonomy. If nothing fits:
> "I don't see a perfect home for these in your current label structure. We could either
> add a new label, or stretch the definition of [closest existing label]. Which feels better?"
### Step 4.3 — Identify the Signal
Once you know the target label, find the trigger. Suggest candidate signals in plain English:
> "For these to sort automatically, Gmail needs something consistent to look for.
> I'm seeing a few options — which fits best?
>
> A) Always from [domain.com] — sort anything from that address
> B) From [domain.com] with 'invoice' or 'statement' in the subject
> C) Something about the content of the email itself (this is harder — takes a bit more setup)
>
> Or is there something else I should be looking for?"
Wait for confirmation.
### Step 4.4 — Sample Additional Emails (if needed)
If the signal isn't clear after the initial group, offer to look at more:
> "I want to make sure this rule won't catch emails it shouldn't. Can I look at a few
> more from this sender to check the pattern?"
Call `search_gmail_messages` to pull up to 10 more emails from the same sender domain.
Report back on sender, subject, and current labels — never body content.
**Complexity limit:** If after 2-3 sampling calls (10-15 emails examined) the pattern
is still unclear, or if the user describes something that requires reading email content
(not just sender/subject/attachment), this cluster exceeds Claude's complexity threshold.
Move to Phase 6b (task order path).
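
One way to track the complexity limit during sampling is a small budget object. This sketch takes the upper end of the limit as 3 sampling calls or 15 emails examined (an assumed reading), and the class and function names are hypothetical.

```python
from dataclasses import dataclass

MAX_SAMPLING_CALLS = 3     # assumed upper end of the sampling-call limit
MAX_EMAILS_EXAMINED = 15   # assumed upper end of the emails-examined limit


@dataclass
class SamplingBudget:
    calls: int = 0
    emails_examined: int = 0

    def record(self, emails_returned: int) -> None:
        """Count one sampling call and the emails it returned."""
        self.calls += 1
        self.emails_examined += emails_returned

    def exhausted(self) -> bool:
        return (self.calls >= MAX_SAMPLING_CALLS
                or self.emails_examined >= MAX_EMAILS_EXAMINED)


def next_step(budget: SamplingBudget, pattern_clear: bool) -> str:
    """Decide what to do after the latest sampling call."""
    if pattern_clear:
        return "design_rule"   # continue to Step 4.5 and Phase 5
    if budget.exhausted():
        return "task_order"    # Phase 6b
    return "sample_more"
```

The budget is per cluster, so a fresh `SamplingBudget` starts with each new cluster.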
### Step 4.5 — Handle Actions
Confirm what should happen when the rule fires:
> "When one of these arrives, what should happen?
>
> A) Just sort it into [label] — it'll still show up in your inbox
> B) Sort it and skip the inbox — it goes straight to [label], you look at it when you want
> C) Sort it, skip the inbox, and mark it as read — only if you're sure you don't need
> to see it arrive each time"
Default recommendation: sort only (option A) unless the email is clearly a notification
or receipt the user never needs to act on immediately.
---
## Phase 5 — Design the Rule
Once the signal and action are confirmed, Claude builds the rule object internally.
### Deployability Routing
| Signal type | Deployability |
|---|---|
| Sender domain (from:domain.com) | gmail_filter_safe |
| Sender domain + subject keyword | gmail_filter_safe |
| Participant pattern (from/to/cc domain) | gmail_filter_safe |
| Attachment present (has:attachment) | gmail_filter_safe |
| Attachment filename or extension | apps_script_needed |
| Message body content | apps_script_needed |
| Sender + body content combination | apps_script_needed |
| "I'd know it when I see it" / topic / tone / intent | studio_candidate |
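
The routing table reduces to "the most demanding signal wins". A minimal sketch, where the signal names (`body_content`, `attachment_filename`, `semantic`, and so on) are hypothetical labels for the table rows:

```python
def deployability(signals: set[str]) -> str:
    """Map the confirmed signal kinds for a rule to a deployability value."""
    if "semantic" in signals:                                # topic / tone / intent
        return "studio_candidate"
    if signals & {"body_content", "attachment_filename"}:    # needs a script
        return "apps_script_needed"
    return "gmail_filter_safe"                               # sender, subject, participants, has:attachment
```

For example, a sender-domain rule that also checks body content resolves to `apps_script_needed`, matching the "Sender + body content combination" row.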
### Rule Object (internal — not shown to user)
```
rule_id — [auto-assigned: GF-[CAT]-RR-[DESC] or AS-[CAT]-RR-[DESC]]
enabled — FALSE
priority — [assigned by category and specificity]
rule_name — [plain English description, max 60 chars]
from_domain — [confirmed sender domain(s)]
to_or_cc_domain — [if participant rule; blank otherwise]
subject_contains — [confirmed keyword(s); blank if not used]
has_attachment — TRUE/FALSE
base_label — [confirmed label path from taxonomy]
archive — TRUE/FALSE [from Phase 4.5]
mark_read — TRUE/FALSE [from Phase 4.5]
queue_for_ai_process — FALSE [default; TRUE only if explicitly needed]
deployability — [from routing table above]
confidence — high [Claude-authored, user-confirmed]
risk — [low / medium / high — assess by archive+mark_read and pattern breadth]
notes — "Rule designed in rule-review session [date]. Pattern: [description].
Emails sampled: [N]. User confirmed signal: [signal description]."
```
### Confirm With the User Before Writing
Translate the rule back into plain English before writing it to the Tracker:
> "Here's what I'm planning to set up:
>
> Any email from **[domain]** [with '[keyword]' in the subject] → sorted into **[label]**
> [and skipped from your inbox / and marked as read].
>
> That covers about [N] emails you've already flagged. Going forward, any new ones from
> this sender will sort automatically.
>
> Look right? I'll add it to your list and then we can turn it on."
Wait for confirmation. If the user wants to adjust anything, revise and re-present.
Do NOT write to the Tracker until the user says it looks right.
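
The plain-English translation can be derived mechanically from the rule object. A sketch assuming the rule is a dict keyed by the field names above; the function name `describe_rule` and the example labels are hypothetical.

```python
def describe_rule(rule: dict) -> str:
    """Render a rule object as the one-sentence summary shown to the user."""
    sentence = f"Any email from {rule['from_domain']}"
    if rule.get("subject_contains"):
        sentence += f" with '{rule['subject_contains']}' in the subject"
    sentence += f" → sorted into {rule['base_label']}"
    if rule.get("archive"):
        sentence += " and skipped from your inbox"
    if rule.get("mark_read"):
        sentence += " and marked as read"
    return sentence + "."
```

Because the sentence is built from the same fields that get written to the Tracker, what the user confirms and what is stored cannot drift apart.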
---
## Phase 6a — Simple Path: Claude-Authored Rule
**Condition:** Pattern is clear, deployability is gmail_filter_safe or apps_script_needed,
complexity threshold not exceeded, user confirmed the rule.
### Step 6a.1 — Write to Tracker
Add the rule to the Tracker with enabled=FALSE.
For a Google Sheet Tracker:
Call `append_table_rows` or `modify_sheet_values` to add the row to the Rules sheet.
For an XLSX Tracker:
Invoke the xlsx skill to append the row.
Rule ID format for rule-review-authored rules:
```
GF-[CAT]-RR-[DESCRIPTOR] e.g. GF-VENDOR-RR-FEDEX
AS-[CAT]-RR-[DESCRIPTOR] e.g. AS-DOC-RR-STEP-ATTACH
```
The `RR` segment identifies rules authored during rule-review sessions vs.
subcontractor-proposed rules from import. This helps track rule provenance over time.
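
A small helper can keep the ID format consistent. This is a sketch: the category set comes from the Rule ID reference section, while the descriptor normalization (uppercase, non-alphanumerics collapsed to hyphens) is an assumption.

```python
import re

PREFIX_BY_DEPLOYABILITY = {"gmail_filter_safe": "GF", "apps_script_needed": "AS"}
CATEGORIES = {"FINANCE", "VENDOR", "CLIENT", "OPS", "DEV", "SALES",
              "TRAVEL", "NOTIF", "SUPPORT", "DOC", "PERSONAL"}


def rule_review_id(deployability: str, category: str, descriptor: str) -> str:
    """Build a GF-/AS- rule ID carrying the RR (rule-review) provenance segment."""
    prefix = PREFIX_BY_DEPLOYABILITY[deployability]
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    # Assumed normalization: uppercase, runs of other characters become one hyphen.
    slug = re.sub(r"[^A-Z0-9]+", "-", descriptor.upper()).strip("-")
    return f"{prefix}-{category}-RR-{slug}"
```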
After writing:
> "Added. [Rule name] is in your list, inactive for now.
>
> Want to turn it on right now? I can deploy it immediately. Or if you'd rather finish
> reviewing all the flagged emails first and deploy everything at once, we can do that too."
Wait for the user's preference. If they want to deploy now, invoke the deploy skill for
this specific rule. If they want to batch it, continue to the next cluster.
---
## Phase 6b — Complex Path: Task Order
**Condition:** Any of the following:
- Pattern still unclear after 2-3 sampling calls (10-15 emails examined)
- User describes something that feels like "I'd know it when I see it"
- Cluster has 20+ emails with no consistent sender/subject signal
**Do not route to task order for:**
- Clear Apps Script candidates (attachment filename check, straightforward body keyword).
These go to Phase 6a with deployability=apps_script_needed.
### Step 6b.1 — Explain the routing in plain English
> "This one is a bit complex for me to nail down on my own — the pattern isn't consistent
> enough for me to write a reliable rule without doing a deeper analysis. The good news is
> we can send this to your analysis tool and have it dig through a bigger sample to find
> the right signal.
>
> I'll create a note for this group so nothing gets lost, and when the results come
> back you can import them the same way as before."
### Step 6b.2 — Write a Tracker placeholder entry
Add a row to the Tracker marking this cluster as a pending task order:
```
rule_id: PENDING-[CAT]-RR-[DESCRIPTOR] e.g. PENDING-VENDOR-RR-ACME-COMPLEX
enabled: FALSE
rule_name: [plain English description of the pattern]
from_domain: [any domains observed — may be incomplete]
base_label: [target label if known; blank if unclear]
deployability: [best guess: gmail_filter_safe / apps_script_needed / tbd]
confidence: low
risk: medium
notes: "task_order_pending: [date]. Rule-review session identified pattern but
exceeded complexity threshold. [N] emails sampled — no clear signal found.
User description: [verbatim user description of what they want to happen].
Pending subcontractor analysis. Do not re-process in next rule-review session."
```
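
The placeholder row can be assembled as below. This is an illustrative sketch with a shortened notes string; the function name and parameters are hypothetical. The point it shows is that the notes begin with the `task_order_pending` marker, which is what a later session looks for when de-duplicating.

```python
from datetime import date


def pending_placeholder(category: str, descriptor: str, rule_name: str,
                        user_description: str, emails_sampled: int,
                        session_date: date) -> dict:
    """Build the Tracker row for a cluster routed to a task order."""
    return {
        "rule_id": f"PENDING-{category}-RR-{descriptor}",
        "enabled": False,
        "rule_name": rule_name,
        "deployability": "tbd",
        "confidence": "low",
        "risk": "medium",
        # Leading marker lets the next rule-review session skip this cluster.
        "notes": (
            f"task_order_pending: {session_date.isoformat()}. "
            f"{emails_sampled} emails sampled. "
            f"User description: {user_description}. "
            "Pending subcontractor analysis."
        ),
    }
```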
### Step 6b.3 — Invoke handoff skill
After writing the placeholder:
> "I've noted this group so we don't lose it. Let me set up the task order."
Invoke the `handoff` skill. Pass the cluster description and user intent as context.
The handoff skill handles the full task order creation process.
When the subcontractor returns results → `import` skill → `deploy` skill.
Update the PENDING Tracker entry in the import session once the rule is finalized.
---
## Phase 7 — Cleanup
After all clusters have been addressed (rules designed or task orders created), offer to
remove the `! Rule Needed` label from emails that now have rules covering them.
**This is the only Gmail mutation in this skill. It requires explicit user approval.**
### Step 7.1 — Identify emails to unflag
For each cluster where a rule was designed (Phase 6a or 6b), identify the specific
emails in the queue that triggered it. These are the emails the rule now covers.
Do NOT unflag emails from clusters that are still unaddressed.
### Step 7.2 — Ask for confirmation
> "For the rules we just built — would you like me to remove the '! Rule Needed' tag
> from the emails we used as examples? That way they won't clutter the folder.
>
> This won't delete the emails or move them — it just removes the flag. Your new rules
> will catch fresh ones going forward.
>
> [N] emails would be untagged. OK to do that?"
Wait for explicit confirmation. If the user says yes, proceed. If no or unsure, skip.
### Step 7.3 — Remove the label (only after explicit approval)
Call `modify_gmail_message_labels` or `batch_modify_gmail_message_labels` to remove
the `! Rule Needed` label from confirmed emails only.
After completing:
> "Done — [N] emails untagged. The folder is clean."
---
## Phase 8 — Update _status.md and Brief the User
Update `_status.md`:
```
Last Updated: [timestamp]
Last Agent: Claude (CoWork) — rule-review skill
Phase: Maintenance
Last Completed Step: Rule review session. [N] clusters reviewed.
Rules added: [N] — [N] gmail_filter_safe, [N] apps_script_needed.
Task orders created: [N] — pending subcontractor analysis via handoff skill.
Emails untagged: [N] (with user approval) / Cleanup deferred.
Pending Work: [NONE / or: [N] task orders pending. Results return via import skill.]
Handoff Target: [NONE / or: [tool name] — [N] clusters sent]
Notes: [Any taxonomy questions or new patterns worth noting for next session.]
```
### Closing Brief
If rules were deployed in session:
> "All done. [N] new rules are live in your Gmail — emails like the ones you flagged will
> sort automatically from now on."
If rules were added to Tracker but not deployed:
> "All done. [N] new rules are in your list, inactive. Say 'deploy the rules' when you're
> ready to turn them on."
If task orders were created:
> "[N] clusters are being sent to [tool] for deeper analysis — when the results come
> back, say 'import the results' and I'll review and add them."
If the queue was empty:
> "Your '! Rule Needed' folder is empty — nothing to review right now. Keep using that
> label whenever you get an email that doesn't sort the way you'd like, and I'll be here
> when you're ready to build more rules."
---
## Reference: Rule ID Format
```
Claude-authored in rule-review:
Gmail filter: GF-[CAT]-RR-[DESCRIPTOR] e.g. GF-VENDOR-RR-FEDEX-NOTIF
Apps Script: AS-[CAT]-RR-[DESCRIPTOR] e.g. AS-DOC-RR-STEP-ATTACH
Placeholder: PENDING-[CAT]-RR-[DESCRIPTOR] e.g. PENDING-CLIENT-RR-ACME-TONE
Subcontractor-proposed (via import):
Gmail filter: GF-[CAT]-[DESCRIPTOR] e.g. GF-FINANCE-AMEX-STMT
Apps Script: AS-[CAT]-[DESCRIPTOR] e.g. AS-DOC-STEP-FILE
Studio: SC-[CAT]-[DESCRIPTOR] e.g. SC-SUPPORT-COMPLAINT-TONE
Category abbreviations:
FINANCE / VENDOR / CLIENT / OPS / DEV / SALES / TRAVEL / NOTIF / SUPPORT / DOC / PERSONAL
```
---
## Reference: Complexity Threshold Quick Reference
| Situation | Path |
|---|---|
| Clear sender domain, 1-2 signals confirmed | Phase 6a — write rule now |
| Clear Apps Script signal (filename, body keyword) | Phase 6a — apps_script_needed |
| Pattern unclear after 2-3 sample calls | Phase 6b — task order |
| 20+ emails, no consistent signal | Phase 6b — task order |
| "I'd know it when I see it" — semantic | Phase 6b — task order (studio_candidate hint) |
| User can't articulate what they want | Pause — ask more questions before routing |
---
## Reference: Tracker Column Schema
```
rule_id — Unique ID (see format above)
enabled — FALSE — always; user activates via deploy skill
priority — Integer 1-99 (1 = highest; base label rules ~50; secondary ~80-90)
rule_name — Plain English description (max 60 chars)
from_domain — Sender domain(s), semicolon-separated
to_or_cc_domain — Recipient domain for participant-pattern rules; blank for from-only
subject_contains — Subject keyword(s), semicolon-separated; blank if not used
has_attachment — TRUE/FALSE
base_label — Full Gmail label path (e.g. Finance/AMEX)
archive — TRUE/FALSE
mark_read — TRUE/FALSE
queue_for_ai_process — TRUE/FALSE
deployability — gmail_filter_safe / apps_script_needed / studio_candidate
confidence — high / medium / low
risk — low / medium / high
notes — Pattern source, session date, user description, sampling summary
```
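
Before a finished rule is written, the row can be checked against the schema. A minimal sketch covering only the constrained columns; it assumes rows are dicts with Python booleans and integers, and it is not meant for PENDING placeholders, which use `tbd` and carry no priority.

```python
DEPLOYABILITY = {"gmail_filter_safe", "apps_script_needed", "studio_candidate"}
LEVELS = {"low", "medium", "high"}


def validate_row(row: dict) -> list[str]:
    """Return schema violations for a finished rule row; empty means valid."""
    errors = []
    if row.get("enabled") is not False:
        errors.append("enabled must be FALSE")
    priority = row.get("priority")
    if not (isinstance(priority, int) and 1 <= priority <= 99):
        errors.append("priority must be an integer 1-99")
    if len(row.get("rule_name", "")) > 60:
        errors.append("rule_name exceeds 60 characters")
    if row.get("deployability") not in DEPLOYABILITY:
        errors.append("unknown deployability")
    if row.get("confidence") not in LEVELS:
        errors.append("unknown confidence")
    if row.get("risk") not in LEVELS:
        errors.append("unknown risk")
    return errors
```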
---
## Non-Technical User Language Rules
Never use: filter, API, MCP, schema, regex, endpoint, query, label ID, Apps Script
(unless explaining it for the first time), deployability, confidence score.
Substitute:
- "filter" → "sorting rule"
- "deploy" → "turn on"
- "Apps Script" → "a small automation script" (first time only; then "the script")
- "label" → "folder" if user seems unfamiliar; "label" if they use the term themselves
- "query" → "what to look for"
- "pattern" → "what they all have in common"
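
A draft message can be screened for the terms listed above before it is sent. A rough sketch: the term list omits "Apps Script" because its use is conditional, and the whole-word match with an optional plural "s" is an assumption.

```python
import re

BANNED_TERMS = ("filter", "API", "MCP", "schema", "regex", "endpoint",
                "query", "label ID", "deployability", "confidence score")


def banned_terms(message: str) -> list[str]:
    """Return the banned terms found in a user-facing message, in list order."""
    found = []
    for term in BANNED_TERMS:
        # Whole-word, case-insensitive match; also catches a simple plural.
        if re.search(rf"\b{re.escape(term)}s?\b", message, re.IGNORECASE):
            found.append(term)
    return found
```

A non-empty result means the message should be reworded with the substitutions above, for example "sorting rule" in place of "filter".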
One question per message. Short, clear progress updates between phases.
---
## Safety Rules (Hardcoded — Cannot Be Overridden)
1. **No Gmail mutations without explicit approval.** The only permitted mutation is
removing the `! Rule Needed` label from addressed emails — and only after the user
explicitly says yes. All other mutations (create labels, apply labels, archive, delete,
mark read, forward, create filters) are prohibited in this skill.
2. **All rules enter Tracker with enabled=FALSE.** The deploy skill activates them.
Never set enabled=TRUE during rule-review regardless of user request.
3. **User must confirm every rule before it's written.** Claude proposes; user confirms.
Translate the rule into plain English before writing to Tracker. No exceptions.
4. **Complexity routing is Claude's call.** If the pattern isn't clear enough within the
complexity limit, route to task order. Don't push a weak rule through to avoid the step.
5. **Task order placeholders are written before invoking handoff.** The Tracker entry is
created first — safety net in case the handoff session is interrupted.
6. **Headers only for Gmail sampling.** Never read full message body content during
sampling. Sender, subject, date, current labels, and attachment presence are sufficient.
7. **Studio candidates are parked, not dropped.** If a cluster can only be described in
semantic or intent-based terms, create a PENDING placeholder with notes describing the
intent. Flag for future Studio work. Never tell the user "we can't automate this."
8. **Do not re-process clusters already tracked.** If a cluster has a `task_order_pending`
entry in the Tracker from a prior session, skip it in triage. Report it as in progress.