From 1d13a8d5f2d9a9dca56b654fb4a93c036867da6c Mon Sep 17 00:00:00 2001 From: mpmedia Date: Sun, 7 Jun 2026 14:34:49 -0500 Subject: [PATCH] feat: add YAML frontmatter to rule-review skill --- skills/rule-review/SKILL.md | 619 ++++++++++++++++++++++++++++++++++-- 1 file changed, 591 insertions(+), 28 deletions(-) diff --git a/skills/rule-review/SKILL.md b/skills/rule-review/SKILL.md index 7240438..e768583 100644 --- a/skills/rule-review/SKILL.md +++ b/skills/rule-review/SKILL.md @@ -1,34 +1,597 @@ -# rule-review +--- +name: rule-review +description: > + Periodic work session to review emails flagged with the "! Rule Needed" label and design + new sorting rules for them. Use when the user says "review my rule needed folder", "review + uncaught emails", "what rules should we build", "go through the rule needed list", "I've + been tagging emails for a few weeks — let's build some rules", "time to review flagged + emails", or "I have emails that slipped through". Collaboratively interviews the user in + plain English, designs rules for clear patterns, and creates task orders for complex ones. + Maintenance workflow — runs after first deploy is complete. Part of the Gmail Inbox + Architect plugin. +--- -## Trigger -Use this skill when the user wants to review the `! Rule Needed` label, says "let's go -through the rule needed folder", "what rules should we build", or periodically when that -label has accumulated enough candidates to be worth a review session. +# rule-review — Gmail Inbox Architect -## Purpose -Interactive interview over the contents of the `! Rule Needed` Gmail label. For each -flagged email type, Claude designs the right long-term rule using the full toolbox — -in cost order: Gmail filter first, then Apps Script, then Studio flow (last resort only). +## What This Skill Does -## Toolbox Decision Tree -1. **Gmail filter** — Can this be expressed as a from:/to:/subject:/has: query? Use this. - Free, instant, zero ongoing cost. -2. **Apps Script** — Does it require attachment filename inspection, body keyword + sender - combinations, or other logic filters can't express? Use Apps Script. -3. **Studio flow** — Only if (1) and (2) genuinely cannot do the job AND the action - required goes beyond labeling (creating Tasks, filing to Drive, forwarding with context, - triggering workflows). Studio is limited and expensive — flag this clearly to the user. +Periodic work session that scans the `! Rule Needed` Gmail label, groups flagged emails +by apparent type, interviews you about each group in plain English, and either designs +a rule on the spot (for clear patterns) or creates a task order for a subcontractor to +analyze (for complex or ambiguous ones). -## Interview Pattern -For each `! Rule Needed` email: -- What made you flag this? What did you want to happen automatically? -- Who sends it? Is the domain consistent? -- Is there a subject pattern? An attachment type? -- How often does this type arrive? -- What's the right action: label only, archive, star, create task, file attachment? +This is a collaborative process — you describe the pattern, Claude asks questions and +suggests signals, you confirm or provide more context. At the end, new rules are in the +Tracker and ready to deploy. You don't need to know anything about Gmail internals. -## Outputs -Proposed rules formatted for `import` → `deploy` flow, added to Tracker with status = proposed. +This skill is entirely self-contained. All schemas, templates, and routing logic are +embedded below. -## Skill Stub — Implementation Pending -Full SKILL.md to be written in build session. Load CW-020_Concept.md for scope and design intent. +--- + +## When This Skill Is Triggered + +- User says: "let's review my rule needed folder", "review uncaught emails", + "what rules should we build", "go through the rule needed list", + "I've been tagging emails for a few weeks — let's build some rules", + "time to review the flagged emails" +- User says: "I have a few emails that need rules" — even without the label name + +--- + +## Design Principles + +**This is a work session, not a monitor.** Rule review is intentional, periodic work — +not a background process. The user schedules it; Claude doesn't initiate it. + +**Non-technical throughout.** No mention of filters, APIs, queries, or schemas in +user-facing messages. The user describes email patterns in plain English; Claude +translates internally. + +**One question per message.** Don't dump a list of questions at once. Ask one thing, +wait, then continue. + +**Collaborative pattern design.** Claude suggests candidate signals; user confirms, +corrects, or provides more context. Claude never locks in a rule the user didn't agree to. + +**Complexity routing is binary and bounded.** Each email cluster gets one decision: +handle in this session, or create a task order. Claude's complexity limit is 2–3 sampling +calls or 10–15 emails examined per cluster. Beyond that, create a task order. + +**No Gmail mutations without explicit approval.** The only mutation allowed in this skill +is removing the `! Rule Needed` label from emails after rules have been built for them — +and that requires the user to confirm it before Claude touches anything. + +--- + +## Prerequisites + +Before doing any other work, run Phase 1 (sanity check). This skill requires: +- `_status.md` exists with Phase = "Deploy" or "Maintenance" (first deploy must be complete) +- `Bible.md` exists with taxonomy and bylaws +- Tracker exists (Google Sheet or XLSX) +- `! Rule Needed` label exists in Gmail (created automatically by the deploy skill on first deploy) + +If Phase is earlier than "Deploy": +> "It looks like your rules haven't been deployed yet. The `! Rule Needed` label gets +> created as part of the first deploy — once that's done, I can review flagged emails +> with you. Want to finish the deploy first?" + +If `! Rule Needed` label doesn't exist yet: +> "I don't see the `! Rule Needed` label in your Gmail yet. That label gets created +> automatically when we finish the first deploy — it's how you flag emails that slipped +> through. Once it exists, come back and I can review anything you've tagged. Ready to +> run deploy?" + +--- + +## Dependencies + +- **Google Workspace MCP** — to search Gmail for the `! Rule Needed` label, read Bible.md + and _status.md, update the Tracker, remove the label from processed emails (with approval) +- **handoff skill** — invoked for clusters that exceed the complexity threshold +- **deploy skill** — invoked directly after a Claude-authored rule is added to the Tracker + +--- + +## Execution Flow + +``` +Phase 1 → Sanity check (project artifacts, label exists, deploy has run) +Phase 2 → Load the queue (search the ! Rule Needed label, group by type) +Phase 3 → Triage (quick scan: what's here, what's already in progress, what's new) +Phase 4 → Interview each cluster (collaborative pattern design, one cluster at a time) +Phase 5 → Design the rule (Claude builds the rule object; user confirms signal and label) +Phase 6 → Route and execute (simple → Tracker + deploy; complex → Tracker + handoff) +Phase 7 → Cleanup (remove ! Rule Needed label from addressed emails, with approval) +Phase 8 → Update _status.md and brief the user +``` + +--- + +## Phase 1 — Sanity Check + +Read `_status.md` from the project Drive folder. + +Check: +1. Phase = "Deploy" or "Maintenance" +2. `Bible Location` has a valid Drive URL +3. `Tracker Location` has a valid Drive URL or path + +If Phase is earlier than "Deploy" → redirect (see Prerequisites above). + +Read `Bible.md` — extract: +- The `! Rule Needed` label name (may have been customized; default is `! Rule Needed`) +- The full taxonomy and bylaws +- Decisions Log entries (especially any policy notes about do-not-automate senders) + +If Bible.md uses a different label name (e.g. `_needs-rule` or `! Needs Rule`), use +that name everywhere in this skill. Do not hardcode `! Rule Needed`. + +--- + +## Phase 2 — Load the Queue + +Search Gmail for all messages with the `! Rule Needed` label. + +``` +search_gmail_messages query: "label:! Rule Needed" +``` + +If the label name has spaces or special characters, quote it appropriately. + +Read message headers only: sender, sender domain, subject, date received, current labels. +Do NOT read message bodies — headers are sufficient for pattern identification. + +**Hard limit: 50 messages max per session.** If the queue exceeds 50: +> "You've got more than 50 emails flagged — that's a healthy backlog! Let's work through +> the most common patterns first. I'll focus on the ones that appear most frequently, and +> we can come back for the rest." +> +> Process the 50 most recent. + +--- + +## Phase 2a — De-duplicate Against Tracker + +Before grouping, check the Tracker for any entries with notes containing +`"task_order_pending"` or `"rule_review_in_progress"`. These represent email types +that were already identified in a prior rule-review session and are either: +- Waiting for a subcontractor to return results (task_order_pending) +- Waiting for import and deploy (rule_review_in_progress) + +For emails in the `! Rule Needed` queue that match an existing Tracker pending entry: +- Do NOT re-present them as new clusters +- Note them silently (they'll be reported in the Phase 3 triage summary) + +Matching heuristic: if the sender domain or subject pattern in the Tracker entry +matches the email's sender domain or a clear subject signal — flag it as already tracked. + +--- + +## Phase 3 — Triage + +Group the loaded emails by apparent type. Grouping signals, in priority order: + +1. **Identical sender domain** — strongest grouping signal + (all from fedex.com → one cluster) +2. **Common subject keyword** — if sender domains vary but subjects share a pattern + ("invoice", "statement", "shipment notification") +3. **Identical sender address** — for individual contacts without domain patterns +4. **Apparent category** — if neither domain nor subject clusters, group by what they + appear to be about (financial, operational, notifications) + +After grouping, present a triage summary: + +> "I found [N] emails flagged for review. Here's what I see: +> +> **New patterns to address:** +> 1. [Cluster description] — [N] emails, all from [domain / pattern] +> 2. [Cluster description] — [N] emails about [subject pattern] +> 3. [Cluster description] — [N] emails from various senders, similar topic +> +> **Already being handled:** [N] emails — rules for these are already in progress +> (I'll skip them today and they'll get cleaned up once those rules are deployed). +> +> Want to start with the most common one, or is there a specific group you want to +> tackle first?" + +Wait for the user's response before proceeding to Phase 4. + +--- + +## Phase 4 — Interview Each Cluster + +Work through one cluster at a time. For each cluster: + +### Step 4.1 — Show What Claude Sees + +Present the cluster in plain English — no technical jargon: + +> "Let's look at the [cluster description]. I'm seeing [N] emails, all from [domain] +> with subjects like '[example subject 1]' and '[example subject 2]'. +> +> What made you flag these? What would you want to happen to emails like this automatically?" + +Wait for the user's answer. + +### Step 4.2 — Confirm the Target Label + +Based on the user's description, propose a label from the taxonomy: + +> "Based on what you're describing, it sounds like these belong in [label path]. +> Does that sound right?" + +If it's unclear which label: +> "Should these go under [option A] or [option B]? [Brief description of what each means]" + +Don't propose a label that isn't in the taxonomy. If nothing fits: +> "I don't see a perfect home for these in your current label structure. We could either +> add a new label, or stretch the definition of [closest existing label]. Which feels better?" + +### Step 4.3 — Identify the Signal + +Once you know the target label, find the trigger. Suggest candidate signals in plain English: + +> "For these to sort automatically, Gmail needs something consistent to look for. +> I'm seeing a few options — which fits best? +> +> A) Always from [domain.com] — sort anything from that address +> B) From [domain.com] with 'invoice' or 'statement' in the subject +> C) Something about the content of the email itself (this is harder — takes a bit more setup) +> +> Or is there something else I should be looking for?" + +Wait for confirmation. + +### Step 4.4 — Sample Additional Emails (if needed) + +If the signal isn't clear after the initial group, offer to look at more: + +> "I want to make sure this rule won't catch emails it shouldn't. Can I look at a few +> more from this sender to check the pattern?" + +Call `search_gmail_messages` to pull up to 10 more emails from the same sender domain. +Report back on sender, subject, and current labels — never body content. + +**Complexity limit:** If after 2–3 sampling calls (10–15 emails examined) the pattern +is still unclear, or if the user describes something that requires reading email content +(not just sender/subject/attachment), this cluster exceeds Claude's complexity threshold. +Move to Phase 6b (task order path). + +### Step 4.5 — Handle Actions + +Confirm what should happen when the rule fires: + +> "When one of these arrives, what should happen? +> +> A) Just sort it into [label] — it'll still show up in your inbox +> B) Sort it and skip the inbox — it goes straight to [label], you look at it when you want +> C) Sort it, skip the inbox, and mark it as read — only if you're sure you don't need +> to see it arrive each time" + +Default recommendation: sort only (option A) unless the email is clearly a notification +or receipt the user never needs to act on immediately. + +--- + +## Phase 5 — Design the Rule + +Once the signal and action are confirmed, Claude builds the rule object internally. + +### Deployability Routing + +| Signal type | Deployability | +|---|---| +| Sender domain (from:domain.com) | gmail_filter_safe | +| Sender domain + subject keyword | gmail_filter_safe | +| Participant pattern (from/to/cc domain) | gmail_filter_safe | +| Attachment present (has:attachment) | gmail_filter_safe | +| Attachment filename or extension | apps_script_needed | +| Message body content | apps_script_needed | +| Sender + body content combination | apps_script_needed | +| "I'd know it when I see it" / topic / tone / intent | studio_candidate | + +### Rule Object (internal — not shown to user) + +``` +rule_id — [auto-assigned: GF-[CAT]-RR-[DESC] or AS-[CAT]-RR-[DESC]] +enabled — FALSE +priority — [assigned by category and specificity] +rule_name — [plain English description, max 60 chars] +from_domain — [confirmed sender domain(s)] +to_or_cc_domain — [if participant rule; blank otherwise] +subject_contains — [confirmed keyword(s); blank if not used] +has_attachment — TRUE/FALSE +base_label — [confirmed label path from taxonomy] +archive — TRUE/FALSE [from Phase 4.5] +mark_read — TRUE/FALSE [from Phase 4.5] +queue_for_ai_process — FALSE [default; TRUE only if explicitly needed] +deployability — [from routing table above] +confidence — high [Claude-authored, user-confirmed] +risk — [low / medium / high — assess by archive+mark_read and pattern breadth] +notes — "Rule designed in rule-review session [date]. Pattern: [description]. + Emails sampled: [N]. User confirmed signal: [signal description]." +``` + +### Confirm With the User Before Writing + +Translate the rule back into plain English before writing it to the Tracker: + +> "Here's what I'm planning to set up: +> +> Any email from **[domain]** [with '[keyword]' in the subject] → sorted into **[label]** +> [and skipped from your inbox / and marked as read]. +> +> That covers about [N] emails you've already flagged. Going forward, any new ones from +> this sender will sort automatically. +> +> Look right? I'll add it to your list and then we can turn it on." + +Wait for confirmation. If the user wants to adjust anything, revise and re-present. +Do NOT write to the Tracker until the user says it looks right. + +--- + +## Phase 6a — Simple Path: Claude-Authored Rule + +**Condition:** Pattern is clear, deployability is gmail_filter_safe or apps_script_needed, +complexity threshold not exceeded, user confirmed the rule. + +### Step 6a.1 — Write to Tracker + +Add the rule to the Tracker with enabled=FALSE. + +For a Google Sheet Tracker: +Call `append_table_rows` or `modify_sheet_values` to add the row to the Rules sheet. + +For an XLSX Tracker: +Invoke the xlsx skill to append the row. + +Rule ID format for rule-review-authored rules: +``` +GF-[CAT]-RR-[DESCRIPTOR] e.g. GF-VENDOR-RR-FEDEX +AS-[CAT]-RR-[DESCRIPTOR] e.g. AS-DOC-RR-STEP-ATTACH +``` + +The `RR` segment identifies rules authored during rule-review sessions vs. +subcontractor-proposed rules from import. This helps track rule provenance over time. + +After writing: +> "Added. [Rule name] is in your list, inactive for now. +> +> Want to turn it on right now? I can deploy it immediately. Or if you'd rather finish +> reviewing all the flagged emails first and deploy everything at once, we can do that too." + +Wait for the user's preference. If they want to deploy now, invoke the deploy skill for +this specific rule. If they want to batch it, continue to the next cluster. + +--- + +## Phase 6b — Complex Path: Task Order + +**Condition:** Any of the following: +- Pattern still unclear after 2–3 sampling calls (10–15 emails examined) +- User describes something that feels like "I'd know it when I see it" +- Cluster has 20+ emails with no consistent sender/subject signal + +**Do not route to task order for:** +- Clear Apps Script candidates (attachment filename check, straightforward body keyword). + These go to Phase 6a with deployability=apps_script_needed. + +### Step 6b.1 — Explain the routing in plain English + +> "This one is a bit complex for me to nail down on my own — the pattern isn't consistent +> enough for me to write a reliable rule without doing a deeper analysis. The good news is +> we can send this to your analysis tool and have it dig through a bigger sample to find +> the right signal. +> +> I'll create a note for this group so nothing gets lost, and when the results come +> back you can import them the same way as before." + +### Step 6b.2 — Write a Tracker placeholder entry + +Add a row to the Tracker marking this cluster as a pending task order: + +``` +rule_id: PENDING-[CAT]-RR-[DESCRIPTOR] e.g. PENDING-VENDOR-RR-ACME-COMPLEX +enabled: FALSE +rule_name: [plain English description of the pattern] +from_domain: [any domains observed — may be incomplete] +base_label: [target label if known; blank if unclear] +deployability: [best guess: gmail_filter_safe / apps_script_needed / tbd] +confidence: low +risk: medium +notes: "task_order_pending: [date]. Rule-review session identified pattern but + exceeded complexity threshold. [N] emails sampled — no clear signal found. + User description: [verbatim user description of what they want to happen]. + Pending subcontractor analysis. Do not re-process in next rule-review session." +``` + +### Step 6b.3 — Invoke handoff skill + +After writing the placeholder: +> "I've noted this group so we don't lose it. Let me set up the task order." + +Invoke the `handoff` skill. Pass the cluster description and user intent as context. +The handoff skill handles the full task order creation process. + +When the subcontractor returns results → `import` skill → `deploy` skill. +Update the PENDING Tracker entry in the import session once the rule is finalized. + +--- + +## Phase 7 — Cleanup + +After all clusters have been addressed (rules designed or task orders created), offer to +remove the `! Rule Needed` label from emails that now have rules covering them. + +**This is the only Gmail mutation in this skill. It requires explicit user approval.** + +### Step 7.1 — Identify emails to unflag + +For each cluster where a rule was designed (Phase 6a or 6b), identify the specific +emails in the queue that triggered it. These are the emails the rule now covers. + +Do NOT unflag emails from clusters that are still unaddressed. + +### Step 7.2 — Ask for confirmation + +> "For the rules we just built — would you like me to remove the '! Rule Needed' tag +> from the emails we used as examples? That way they won't clutter the folder. +> +> This won't delete the emails or move them — it just removes the flag. Your new rules +> will catch fresh ones going forward. +> +> [N] emails would be untagged. OK to do that?" + +Wait for explicit confirmation. If the user says yes, proceed. If no or unsure, skip. + +### Step 7.3 — Remove the label (only after explicit approval) + +Call `modify_gmail_message_labels` or `batch_modify_gmail_message_labels` to remove +the `! Rule Needed` label from confirmed emails only. + +After completing: +> "Done — [N] emails untagged. The folder is clean." + +--- + +## Phase 8 — Update _status.md and Brief the User + +Update `_status.md`: + +``` +Last Updated: [timestamp] +Last Agent: Claude (CoWork) — rule-review skill +Phase: Maintenance +Last Completed Step: Rule review session. [N] clusters reviewed. + Rules added: [N] — [N] gmail_filter_safe, [N] apps_script_needed. + Task orders created: [N] — pending subcontractor analysis via handoff skill. + Emails untagged: [N] (with user approval) / Cleanup deferred. +Pending Work: [NONE / or: [N] task orders pending. Results return via import skill.] +Handoff Target: [NONE / or: [tool name] — [N] clusters sent] +Notes: [Any taxonomy questions or new patterns worth noting for next session.] +``` + +### Closing Brief + +If rules were deployed in session: +> "All done. [N] new rules are live in your Gmail — emails like the ones you flagged will +> sort automatically from now on." + +If rules were added to Tracker but not deployed: +> "All done. [N] new rules are in your list, inactive. Say 'deploy the rules' when you're +> ready to turn them on." + +If task orders were created: +> "[N] clusters are being sent to [tool] for deeper analysis — when the results come +> back, say 'import the results' and I'll review and add them." + +If the queue was empty: +> "Your '! Rule Needed' folder is empty — nothing to review right now. Keep using that +> label whenever you get an email that doesn't sort the way you'd like, and I'll be here +> when you're ready to build more rules." + +--- + +## Reference: Rule ID Format + +``` +Claude-authored in rule-review: + Gmail filter: GF-[CAT]-RR-[DESCRIPTOR] e.g. GF-VENDOR-RR-FEDEX-NOTIF + Apps Script: AS-[CAT]-RR-[DESCRIPTOR] e.g. AS-DOC-RR-STEP-ATTACH + Placeholder: PENDING-[CAT]-RR-[DESCRIPTOR] e.g. PENDING-CLIENT-RR-ACME-TONE + +Subcontractor-proposed (via import): + Gmail filter: GF-[CAT]-[DESCRIPTOR] e.g. GF-FINANCE-AMEX-STMT + Apps Script: AS-[CAT]-[DESCRIPTOR] e.g. AS-DOC-STEP-FILE + Studio: SC-[CAT]-[DESCRIPTOR] e.g. SC-SUPPORT-COMPLAINT-TONE + +Category abbreviations: + FINANCE / VENDOR / CLIENT / OPS / DEV / SALES / TRAVEL / NOTIF / SUPPORT / DOC / PERSONAL +``` + +--- + +## Reference: Complexity Threshold Quick Reference + +| Situation | Path | +|---|---| +| Clear sender domain, 1–2 signals confirmed | Phase 6a — write rule now | +| Clear Apps Script signal (filename, body keyword) | Phase 6a — apps_script_needed | +| Pattern unclear after 2–3 sample calls | Phase 6b — task order | +| 20+ emails, no consistent signal | Phase 6b — task order | +| "I'd know it when I see it" — semantic | Phase 6b — task order (studio_candidate hint) | +| User can't articulate what they want | Pause — ask more questions before routing | + +--- + +## Reference: Tracker Column Schema + +``` +rule_id — Unique ID (see format above) +enabled — FALSE — always; user activates via deploy skill +priority — Integer 1-99 (1 = highest; base label rules ~50; secondary ~80-90) +rule_name — Plain English description (max 60 chars) +from_domain — Sender domain(s), semicolon-separated +to_or_cc_domain — Recipient domain for participant-pattern rules; blank for from-only +subject_contains — Subject keyword(s), semicolon-separated; blank if not used +has_attachment — TRUE/FALSE +base_label — Full Gmail label path (e.g. Finance/AMEX) +archive — TRUE/FALSE +mark_read — TRUE/FALSE +queue_for_ai_process — TRUE/FALSE +deployability — gmail_filter_safe / apps_script_needed / studio_candidate +confidence — high / medium / low +risk — low / medium / high +notes — Pattern source, session date, user description, sampling summary +``` + +--- + +## Non-Technical User Language Rules + +Never use: filter, API, MCP, schema, regex, endpoint, query, label ID, Apps Script +(unless explaining it for the first time), deployability, confidence score. + +Substitute: +- "filter" → "sorting rule" +- "deploy" → "turn on" +- "Apps Script" → "a small automation script" (first time only; then "the script") +- "label" → "folder" if user seems unfamiliar; "label" if they use the term themselves +- "query" → "what to look for" +- "pattern" → "what they all have in common" + +One question per message. Short, clear progress updates between phases. + +--- + +## Safety Rules (Hardcoded — Cannot Be Overridden) + +1. **No Gmail mutations without explicit approval.** The only permitted mutation is + removing the `! Rule Needed` label from addressed emails — and only after the user + explicitly says yes. All other mutations (create labels, apply labels, archive, delete, + mark read, forward, create filters) are prohibited in this skill. + +2. **All rules enter Tracker with enabled=FALSE.** The deploy skill activates them. + Never set enabled=TRUE during rule-review regardless of user request. + +3. **User must confirm every rule before it's written.** Claude proposes; user confirms. + Translate the rule into plain English before writing to Tracker. No exceptions. + +4. **Complexity routing is Claude's call.** If the pattern isn't clear enough within the + complexity limit, route to task order. Don't push a weak rule through to avoid the step. + +5. **Task order placeholders are written before invoking handoff.** The Tracker entry is + created first — safety net in case the handoff session is interrupted. + +6. **Headers only for Gmail sampling.** Never read full message body content during + sampling. Sender, subject, date, current labels, and attachment presence are sufficient. + +7. **Studio candidates are parked, not dropped.** If a cluster can only be described in + semantic or intent-based terms, create a PENDING placeholder with notes describing the + intent. Flag for future Studio work. Never tell the user "we can't automate this." + +8. **Do not re-process clusters already tracked.** If a cluster has a `task_order_pending` + entry in the Tracker from a prior session, skip it in triage. Report it as in progress.