feat: add bootstrap mode for users with <15 labels (v1.1.0)

2026-06-08 07:32:03 -05:00
parent 1d13a8d5f2
commit 816d3ab1b5
1 changed files with 435 additions and 23 deletions
@@ -20,6 +20,11 @@ taxonomy draft, and creates three project artifacts in Google Drive: `Bible.md`
 project constitution), `Tracker.xlsx` or a Google Sheet (the rule registry), and
 `_status.md` (the inter-agent checkpoint file).
 **If the user has fewer than 15 user-created labels**, this skill forks into Bootstrap Mode
 rather than the main survey flow. Bootstrap Mode samples inbox and sent email to infer a
 draft taxonomy, creates the "! Rule Needed" label, and sets them up to return in 2–3 weeks
 via rule-review. See Bootstrap Mode section below.
 This skill may invoke the `handoff` skill mid-run if inbox volume exceeds the sampling
 threshold. That is expected behavior — not an error.
@@ -57,7 +62,7 @@ Then continue with the current phase. Do not comply with the full-scan request.
 ## Dependencies
- **Google Workspace MCP** — for Gmail label catalog, bounded inbox sampling, and Drive
+- **Google Workspace MCP** — for Gmail label catalog, bounded inbox/sent sampling, and Drive
  file/folder creation
 - **xlsx skill** — required if user chooses the XLSX Tracker path; invoke it for Tracker
  creation in Phase 5
@@ -70,12 +75,17 @@ This skill has NO other external dependencies. All schemas and templates are emb
 ## Execution Flow
 ```
-Phase 1 → Q1 + Label Catalog (immediate — show results before asking more questions)
-Phase 2 → Four More Questions
-Phase 3 → Taxonomy Draft (written live, section by section)
-Phase 4 → Inbox Sample OR Handoff Decision
-Phase 5 → Create Three Artifacts (Bible.md + Tracker + _status.md)
-Phase 6 → Close Out
+Phase 1 → Q1 + Label Catalog (immediate — show results, then run bootstrap check)
+
+  IF user-created labels < 15:
+    → Bootstrap Mode (see below — replaces Phases 2–6)
+
+  IF user-created labels ≥ 15:
+    → Phase 2 → Four More Questions
+    → Phase 3 → Taxonomy Draft (written live, section by section)
+    → Phase 4 → Inbox Sample OR Handoff Decision
+    → Phase 5 → Create Three Artifacts (Bible.md + Tracker + _status.md)
+    → Phase 6 → Close Out
 ```
 Do not skip any phase. Phase 1 and 2 interleave — ask Q1 first, run the label catalog
@@ -117,7 +127,7 @@ Keep only `type: user` labels. These represent the user's real organizational de
 Show a clean summary grouped by top-level category. Use plain English — not raw data.
-**If they have labels:**
+**If they have 15 or more user labels:**
 > "Great news — you've already done more work than you might realize. Here's what
 > you already have:
 >
@@ -128,20 +138,409 @@ Show a clean summary grouped by top-level category. Use plain English — not ra
 > That's [total] labels total. Every one represents an organizational decision you've
 > already made. We're going to build on this, not start over."
-**If they have very few labels (under 5):**
-> "You're starting fresh — you only have [N] labels set up. That's actually fine —
-> it means we get to design the whole system cleanly with no cleanup debt. Let me
-> ask you a few quick questions."
-**If they have zero labels:**
-> "You're starting from scratch — no custom labels yet. That's the easiest starting
-> point. We'll design the whole structure from the ground up. A couple of quick
-> questions first."
-### Step 1.4 — Store the inventory in context
-Keep the full label list in memory. You need it for Phase 3 (taxonomy draft) and
-Phase 5 (Bible.md). Do NOT write anything to Drive yet.
+**If they have fewer than 15 user labels:**
+> "You've got [N] labels set up — that's a starting point, but not quite enough to
+> build a full sorting system from yet. The good news: I can look at your actual email
+> patterns right now and sketch out what your system should look like based on who
+> you're actually talking to. Give me about a minute."
+Then immediately proceed to Bootstrap Mode. Do NOT continue to Phase 2.
+**If they have zero user labels:**
+> "You're starting from scratch — no custom labels yet. No problem. Let me take a
+> quick look at who you've been emailing and I'll draft a system based on your
+> real patterns."
+Then immediately proceed to Bootstrap Mode.
+
 ### Step 1.4 — Bootstrap Check
 Count only `type: user` labels from the catalog.
 - **If count < 15** → proceed to **Bootstrap Mode** (below). Do not run Phases 2–6.
 - **If count ≥ 15** → proceed to **Phase 2** (normal survey flow).
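The check above can be sketched in a few lines. This is a minimal illustration, assuming the label catalog arrives as a list of dicts with a `type` field; the `choose_flow` name and the dict shape are hypothetical and not part of the skill.

```python
BOOTSTRAP_THRESHOLD = 15  # from Step 1.4: fewer than 15 user labels forks to Bootstrap Mode


def choose_flow(labels):
    """Return 'bootstrap' or 'phase_2' based on the number of user-created labels.

    `labels` is assumed to be a list of dicts with a 'type' key
    (hypothetical shape; the real catalog format may differ).
    """
    # Only `type: user` labels count; system labels are ignored.
    user_count = sum(1 for label in labels if label.get("type") == "user")
    return "bootstrap" if user_count < BOOTSTRAP_THRESHOLD else "phase_2"
```

An empty catalog also falls into the bootstrap branch, which matches the zero-label case described in Phase 1.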
 ---
 ---
 # BOOTSTRAP MODE
 ## For users with fewer than 15 user-created labels
 Bootstrap Mode replaces Phases 2–6 for users who don't yet have an established label
 system. Instead of analyzing existing labels, we sample their actual email traffic to
 infer a draft taxonomy, set them up with the "! Rule Needed" label, and send them off
 for 2–3 weeks of tagging before returning via rule-review.
 ---
 ## Bootstrap Phase B1 — Sample Inbox and Sent Email
 ### Internal Domain Exclusion List (Hardcoded — MPM-Specific)
 **Always exclude these domains from ALL domain frequency analysis** — inbox and sent.
 These are MPM internal domains. Email to/from these addresses tells us nothing about
 external relationships worth labeling.
 ```
 mpmedia.tv
 messagepoint.tv
 messagepointmedia.com
 messagepoint.media
 mpm.to
 ```
 Additionally, exclude any email address whose local part (before the @) matches any
 of these patterns — these are automated/system senders regardless of domain:
 ```
 noreply
 no-reply
 donotreply
 do-not-reply
 mailer-daemon
 postmaster
 bounce
 bounces
 notifications
 notification
 automated
 ```
 Apply these exclusions before counting. They should never appear in the domain frequency
 tables or the draft taxonomy.
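The exclusion rules above could be expressed as a single predicate. This is a sketch under stated assumptions: the function name `is_excluded` is hypothetical, domains are compared by exact match (subdomains are not handled), and a local-part pattern is treated as matching when it appears anywhere in the local part. The source does not say whether the match should be exact or partial, so adjust as needed.

```python
INTERNAL_DOMAINS = {
    "mpmedia.tv",
    "messagepoint.tv",
    "messagepointmedia.com",
    "messagepoint.media",
    "mpm.to",
}
SYSTEM_LOCAL_PARTS = (
    "noreply", "no-reply", "donotreply", "do-not-reply", "mailer-daemon",
    "postmaster", "bounce", "bounces", "notifications", "notification", "automated",
)


def is_excluded(address):
    """Return True if the address should be dropped before any counting."""
    local, sep, domain = address.strip().lower().rpartition("@")
    if not sep:
        return True  # assumption: strings without an "@" are dropped as malformed
    if domain in INTERNAL_DOMAINS:
        return True  # internal MPM domain (exact match only)
    # Assumption: substring match on the local part, e.g. "bounces+123" matches "bounce".
    return any(pattern in local for pattern in SYSTEM_LOCAL_PARTS)
```

Substring matching errs on the side of excluding too much; an exact comparison on the local part would be the stricter alternative.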
 ### Step B1.1 — Inbox Sample (100 emails)
 Call `search_gmail_messages` with query `in:inbox` and `max_results: 100`.
 For each result, capture:
 - `from` — sender email address
 - `subject` — subject line
 - `snippet` — email preview/snippet (do NOT read full body)
 Extract the sender domain from the `from` field (the part after @).
 Apply the exclusion list. Discard excluded domains before counting.
 Build an **inbox domain frequency table**: domain → count of messages in sample.
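As an illustration, the inbox table could be built with a `Counter`. This sketch assumes each sampled message is a dict with a `from` string (a hypothetical shape; the actual `search_gmail_messages` result format is not specified here) and takes the exclusion check as a parameter.

```python
from collections import Counter
from email.utils import parseaddr


def inbox_domain_counts(messages, is_excluded=lambda address: False):
    """Map sender domain -> number of sampled inbox messages from that domain."""
    counts = Counter()
    for message in messages:
        # parseaddr handles both "jane@acme.com" and "Jane <jane@acme.com>".
        _name, address = parseaddr(message["from"])
        address = address.lower()
        if "@" not in address or is_excluded(address):
            continue  # exclusions are applied before counting
        counts[address.rsplit("@", 1)[1]] += 1
    return counts
```

`counts.most_common()` then gives the domains in descending order of frequency.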
 ### Step B1.2 — Sent Sample (up to 250 emails)
 Call `search_gmail_messages` with query `in:sent` and `max_results: 250`.
 For each result, capture:
 - `to` — recipient email address(es)
 - `date` — send date (for recency weighting)
 Extract the recipient domain(s) from the `to` field.
 Apply the exclusion list. Discard excluded domains before counting.
 **Recency weighting:** Double-count any domain that appears in sends within the last
 90 days. Recent sending patterns are stronger signal than historical ones.
 Build a **sent domain frequency table**: domain → weighted count of sends.
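The recency weighting can be read more than one way. The sketch below assumes the simplest reading: each send within the last 90 days counts as 2 and each older send counts as 1. It also assumes each message is a dict with a `to` string and a timezone-aware `date` (hypothetical shapes), and that a message addressed to several people at one domain counts as a single send to that domain.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone
from email.utils import getaddresses

RECENT_WINDOW = timedelta(days=90)


def sent_domain_counts(messages, now, is_excluded=lambda address: False):
    """Map recipient domain -> weighted count of sends (recent sends count double)."""
    counts = Counter()
    for message in messages:
        # Assumed interpretation: weight 2 inside the 90-day window, 1 outside it.
        weight = 2 if now - message["date"] <= RECENT_WINDOW else 1
        domains = set()
        for _name, address in getaddresses([message["to"]]):
            address = address.lower()
            if "@" in address and not is_excluded(address):
                domains.add(address.rsplit("@", 1)[1])
        for domain in domains:  # one send per domain per message
            counts[domain] += weight
    return counts
```

If the intended rule is instead to double a domain's whole count once any recent send exists, the weighting line is the only part that would change.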
 ### Step B1.3 — Domain Analysis
 Classify each domain that survived the exclusion filter:
 | Tier | Criteria | Signal | Rule Type Likely |
 |---|---|---|---|
 | **Bidirectional** | inbox_count ≥ 2 AND sent_count ≥ 2 | Strongest — real two-way relationship | participant-pattern (from OR to OR cc) |
 | **Sent-heavy** | sent_count ≥ 3, inbox_count < 2 | Likely client/partner you email proactively | participant-pattern; confirm with user |
 | **Inbox-heavy** | inbox_count ≥ 5, sent_count < 2 | Likely vendor, service, or notification sender | from-only filter |
 | **Low signal** | Meets none of the criteria above | Not enough data; skip | n/a |
 Drop all Low Signal domains. They don't have enough data to justify a rule.
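The tier table translates directly into a small classifier. This is a sketch; the function name and the lowercase tier strings are illustrative. Tiers are checked in table order, so a domain with counts of 2 and 2 is treated as Bidirectional, and anything that meets no other tier falls through to low signal.

```python
def classify_domain(inbox_count, sent_count):
    """Return the tier for one domain, using the thresholds from the table in Step B1.3.

    `sent_count` may be the recency-weighted count from Step B1.2.
    """
    if inbox_count >= 2 and sent_count >= 2:
        return "bidirectional"
    if sent_count >= 3 and inbox_count < 2:
        return "sent-heavy"
    if inbox_count >= 5 and sent_count < 2:
        return "inbox-heavy"
    return "low-signal"  # dropped from further analysis
```

For example, `classify_domain(8, 12)` returns `"bidirectional"` and `classify_domain(18, 0)` returns `"inbox-heavy"`.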
 Also scan the inbox subject lines for these common patterns to surface additional early wins:
 - Subject contains "invoice", "statement", "receipt", "payment" → Finance signal
 - Subject contains "shipping", "tracking", "order" → Vendor/Notifications signal
 - Subject contains "alert", "notification", "reminder" → Notifications signal
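The keyword scan above might look like the following. It is a minimal sketch using case-insensitive substring matching, which is an assumption: a word such as "border" would also match "order", so whole-word matching may be preferable in practice. The names `SIGNAL_KEYWORDS` and `subject_signals` are hypothetical.

```python
SIGNAL_KEYWORDS = {
    "Finance": ("invoice", "statement", "receipt", "payment"),
    "Vendor/Notifications": ("shipping", "tracking", "order"),
    "Notifications": ("alert", "notification", "reminder"),
}


def subject_signals(subject):
    """Return every signal whose keywords appear in the subject line."""
    lowered = subject.lower()
    return [
        signal
        for signal, keywords in SIGNAL_KEYWORDS.items()
        if any(keyword in lowered for keyword in keywords)
    ]
```

A subject can carry more than one signal; "Payment reminder" matches both Finance and Notifications.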
 ---
 ## Bootstrap Phase B2 — Draft Taxonomy
 Build a draft taxonomy from the domain analysis. Use the reference taxonomy from
 Phase 3 in the main flow (Finance, Vendors, Clients, Notifications, etc.) as a scaffold,
 and populate it with the actual domains you found.
 ### Step B2.1 — Map domains to categories
 For each domain that survived analysis:
 - Bidirectional domains → place in **Clients/** or **Operations/** (prompt user to confirm)
 - Sent-heavy domains → tentatively **Clients/**, flag for confirmation
 - Inbox-heavy, subject=invoice/statement/receipt → **Finance/**
 - Inbox-heavy, subject=shipping/tracking/order → **Vendors/Shipping** or **Notifications/**
 - Inbox-heavy, subject=alert/notification → **Notifications/**
 - Other inbox-heavy → **Vendors/** (default for unknown services)
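The mapping rules above can be sketched as a lookup. The tier and signal strings here are hypothetical labels chosen for this example, and the second return value only mirrors the cases where Step B2.1 explicitly asks for user confirmation.

```python
def suggest_category(tier, signal=None):
    """Return (suggested category, flag_for_user) for one domain.

    `tier` is one of "bidirectional", "sent-heavy", "inbox-heavy";
    `signal` is an optional subject-pattern signal for inbox-heavy domains.
    """
    if tier == "bidirectional":
        return ("Clients/ or Operations/", True)  # prompt user to choose
    if tier == "sent-heavy":
        return ("Clients/", True)  # tentative; flag for confirmation
    if tier == "inbox-heavy":
        by_signal = {
            "Finance": "Finance/",
            "Vendor/Notifications": "Vendors/Shipping or Notifications/",
            "Notifications": "Notifications/",
        }
        # Unknown services default to Vendors/.
        return (by_signal.get(signal, "Vendors/"), False)
    return (None, False)  # low-signal domains get no suggestion
```

The whole draft is still shown to the user in Step B2.2, so the flag marks only the entries that need a direct question.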
 ### Step B2.2 — Present findings
 Show a plain-English summary of what you found. Keep it conversational — this is not
 a technical report.
 > "Okay, I took a look at your recent email traffic. Here's what I found:
 >
 > **Looks like regular relationships (you email these people and they email back):**
 > [List bidirectional domains — 1-2 sentences each on what they seem to be]
 >
 > **Looks like services or vendors you use:**
 > [List inbox-heavy domains with pattern notes — e.g., 'americanexpress.com — 18 messages,
 > mostly statements and alerts']
 >
 > **Possible clients or partners you reach out to:**
 > [List sent-heavy domains]
 >
 > Based on this, here's the label structure I'd suggest starting with:
 >
 > [Show draft taxonomy — indented list, clearly labeled as 'suggested']
 >
 > Does this look roughly right? Any relationships I'm obviously missing or anything
 > that's in the wrong category?"
 Wait for feedback. Accept corrections. This is a conversation, not a presentation.
 Adjust the taxonomy based on what they say. Once they confirm it's roughly right,
 move to B3.
 ---
 ## Bootstrap Phase B3 — Ask Drive Folder Location
 > "Where should I set up your project folder in Google Drive? I can put it anywhere —
 > just name a folder and a parent location, or I can create it at the top level of
 > your Drive."
 Default if they say "wherever is fine": create `Gmail Inbox Architect — [first name or
 account prefix]` at Drive root.
 ---
 ## Bootstrap Phase B4 — Create "! Rule Needed" Label
 This label is how the user tags emails they want sorted automatically. It's the primary
 input to the rule-review skill.
 Ask for explicit approval before creating anything:
 > "The most important thing I can set up for you right now is a label called
 > '! Rule Needed'. Whenever you get an email and think 'I wish this went somewhere
 > automatically' — just tag it with that label. After a few weeks of doing that,
 > I can look at what you tagged and build the sorting rules for you.
 >
 > Can I create that label in your Gmail now?"
 Wait for an explicit "yes" before proceeding.
 If approved: call `manage_gmail_label` with `action: create` and `name: ! Rule Needed`.
 If declined: note it, tell them they can create it manually, and proceed.
 ---
 ## Bootstrap Phase B5 — Offer Starter Labels
 Based on the top 2–3 highest-confidence clusters from B2, offer to create matching
 labels now. Present as an offer — not automatic.
 > "Based on what I found, the most obvious labels to start with would be:
 >
 > - **[Category 1]** — for email from [domain(s)] (these are [plain English description])
 > - **[Category 2]** — for email from [domain(s)]
 > - **[Category 3]** — for email about [pattern description]
 >
 > Want me to create any of these? You don't have to — they can wait until we build the
 > full system later. But having even 2–3 labels gives you a head start."
 Create only the ones the user explicitly approves. One approval call per label.
 Use the same taxonomy path structure confirmed in B2 (e.g., `Finance/Receipts`,
 `Vendors/Shipping`).
 Do not create more than 5 labels in bootstrap mode regardless of what the domain
 analysis found. The goal is a quick start, not a complete system.
 ---
 ## Bootstrap Phase B6 — Create Lightweight Artifacts
 Create the project folder plus two artifacts, Bible.md and _status.md (not three: no Tracker yet, since there are no rules to track).
 ### Artifact B-1: Project Drive Folder
 Call `create_drive_folder` using the location from Phase B3.
 ### Artifact B-2: Bible.md (Bootstrap Version)
 Create a lightweight Bible.md that reflects bootstrap status. Key differences from the
 full survey Bible.md: taxonomy is marked as preliminary, no bylaws required yet,
 Tracker section is omitted.
 **Bootstrap Bible.md template:**
 ```markdown
 # Gmail Inbox Architect — Project Bible
 **Created:** [YYYY-MM-DD]
 **Account:** [email address]
 **Project Folder:** [Drive folder URL]
 **Phase:** Bootstrap — email patterns surveyed; taxonomy is preliminary
 **Mode:** Bootstrap (< 15 labels at setup — return via rule-review after tagging)
 ---
 ## Setup Notes
 This project was started in bootstrap mode — the user did not yet have an established
 label system. The taxonomy below was inferred from inbox and sent email patterns rather
 than from an existing label structure. It should be considered a working draft until
 confirmed through actual use.
 Return path: after 2–3 weeks of using "! Rule Needed", the user should say
 "review my rule needed folder" to run the rule-review skill and start building rules.
 ---
 ## Environment Decision
 **Tracker format:** To be determined — no Tracker created yet (not enough rules to track)
 **Rule:** Confirm Tracker format (Google Sheet vs. XLSX) when the first rule-review session runs.
 ---
 ## Label Taxonomy v1 (Preliminary)
 > Bootstrap draft. Inferred from inbox/sent domain patterns — not from an existing
 > label system. Expect significant revision after first rule-review session.
 [Paste the taxonomy confirmed in B2 — indented list]
 ---
 ## Domain Analysis Summary
 ### Bidirectional Domains (highest confidence — real two-way relationships)
 [List domains with counts: "acme.com — 8 inbox, 12 sent"]
 ### Sent-Heavy Domains (outbound relationships — likely clients/partners)
 [List domains with counts]
 ### Inbox-Heavy Domains (services/vendors/notifications)
 [List domains with counts + subject pattern notes]
 ### Exclusions Applied
 Internal MPM domains excluded from all analysis:
 mpmedia.tv, messagepoint.tv, messagepointmedia.com, messagepoint.media, mpm.to
 ---
 ## Labels Created at Bootstrap
 [List any labels created in B4 and B5, with dates]
 ---
 ## Key Decisions Log
 | Date | Decision | Reason |
 |---|---|---|
 | [today] | Bootstrap mode used | < 15 user labels at setup |
 | [today] | "! Rule Needed" label [created / not created] | [user's decision] |
 ---
 ## Open Questions
 - Confirm bidirectional domains: are these clients, partners, or internal contacts?
 - Bylaw definitions needed for all taxonomy categories
 - Tracker format (Google Sheet vs. XLSX) to be confirmed at first rule-review session
 ---
 ## Tools & Agents Involved
 - **Claude (CoWork)** — Architect. All Gmail mutations go through Claude.
 ---
 ## Change Log
 | Date | Agent | Change |
 |---|---|---|
 | [today] | Claude (survey — bootstrap) | Bootstrap Bible created. Taxonomy v1 preliminary draft from domain analysis. |
 ```
 ### Artifact B-3: _status.md
 ```markdown
 # Gmail Inbox Architect — Project Status
 **Last Updated:** [YYYY-MM-DD HH:MM UTC]
 **Last Agent:** Claude (CoWork) — survey skill (bootstrap mode)
 **Phase:** Bootstrap
 **Last Completed Step:** Bootstrap survey complete. Domain analysis run. Taxonomy v1 preliminary.
  "! Rule Needed" label [created / pending]. [N] starter labels created.
 **Pending Work:** User tagging emails with "! Rule Needed" for 2–3 weeks. Return via rule-review.
 **Handoff Target:** NONE
 **Tracker Format:** TBD — to be confirmed at first rule-review session
 **Tracker Location:** NONE — no Tracker created yet
 **Bible Location:** [Drive URL]
 **Labels Cataloged:** [N] user labels at setup (bootstrap threshold: < 15)
 **Taxonomy Status:** Preliminary v1 — inferred from domain analysis, not from label history
 **Bootstrap Mode:** TRUE
 **Notes:** Domain analysis: [N] bidirectional, [N] sent-heavy, [N] inbox-heavy domains identified.
  Return trigger: user says "review my rule needed folder" after 2–3 weeks of tagging.
  Do NOT re-run survey — go straight to rule-review.
 ---
 ## How To Use This File
 This file is the handoff checkpoint for the Gmail Inbox Architect project.
 Any agent starting a new session should:
 1. Read this file first to understand current state
 2. Read Bible.md for full project context
 3. Continue from "Pending Work" above
 4. NOTE: This project is in bootstrap mode. When the user returns, invoke rule-review,
   not survey.
 ```
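A returning session needs to read the `Bootstrap Mode` field before deciding which skill to run. The sketch below shows one way to do that, assuming the status file keeps the `**Key:** value` line format shown in the template. The function names are hypothetical, and indented continuation lines (as under Notes) are ignored for simplicity.

```python
import re

# Matches lines of the form "**Key:** value".
FIELD = re.compile(r"^\*\*(?P<key>[^*]+):\*\*\s*(?P<value>.*)$")


def parse_status(text):
    """Return a dict of the top-level fields found in _status.md."""
    fields = {}
    for line in text.splitlines():
        match = FIELD.match(line.strip())
        if match:
            fields[match.group("key")] = match.group("value").strip()
    return fields


def should_route_to_rule_review(status_text):
    """True when the project was set up in bootstrap mode."""
    value = parse_status(status_text).get("Bootstrap Mode", "")
    return value.upper() == "TRUE"
```

When this returns `True`, the agent would invoke rule-review rather than re-running the survey, as the template's final note requires.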
 ---
 ## Bootstrap Phase B7 — Bootstrap Close Out
 Be specific and brief. The user needs to know exactly what to do next.
 > "Here's where we are:
 >
 > ✅ **Draft label structure** — based on your actual email patterns, here's what I
 >    recommend: [brief 3–4 line recap of top taxonomy categories]
 >
 > ✅ **'! Rule Needed' label** — [created in your Gmail / ready to create when you want]
 >    Whenever you get an email and think 'this should be sorted automatically' — tag it
 >    with this label. That's all you have to do.
 >
 > ✅ **Project Bible** — [link]
 >    I've saved your draft label structure and everything I found about your email
 >    patterns. It's ready to pick up where we left off.
 >
 > **What to do next:**
 > Use '! Rule Needed' for the next 2–3 weeks. Tag anything that feels like it should
 > be automatic. Don't overthink it — if it bugs you that it's in your inbox, tag it.
 >
 > When you're ready to actually build the sorting rules, come back and say:
 > **'review my rule needed folder'**
 >
 > I'll look at what you tagged, group it by pattern, and we'll build rules together
 > based on your real email — not guesses.
 >
 > **You do not need to re-run this survey.** Start with 'review my rule needed folder'
 > when you return."
 ---
 ---
 # MAIN SURVEY FLOW
 ## For users with 15 or more user-created labels
 ---
@@ -511,6 +910,7 @@ Call `create_drive_file` in the project folder with this content:
 **Bible Location:** [Drive URL]
 **Labels Cataloged:** [N]
 **Taxonomy Status:** Draft v1 — [N] top-level categories confirmed
 **Bootstrap Mode:** FALSE
 **Notes:** [Anything the next agent or session needs to know — edge cases, unresolved questions, decisions made]
 ---
@@ -616,7 +1016,7 @@ For Claude's reference when populating the status file:
 ```
 Last Updated:        Timestamp this file was written — YYYY-MM-DD HH:MM UTC
 Last Agent:          Name/tool that last updated this file
-Phase:               Survey / Analysis / Rule Build / Deploy / Review / Maintenance
+Phase:               Survey / Bootstrap / Analysis / Rule Build / Deploy / Review / Maintenance
 Last Completed Step: Plain-English description of what just finished
 Pending Work:        What needs to happen next — "NONE" if nothing is queued
 Handoff Target:      Which tool has the next task, or "NONE"
@@ -624,7 +1024,8 @@ Tracker Format:      "Google Sheet" or "XLSX" — set once in survey, never chan
 Tracker Location:    Full Drive URL (Google Sheet) or Drive file path (XLSX)
 Bible Location:      Full Drive URL to Bible.md
 Labels Cataloged:    Integer count of user labels found in Phase 1
-Taxonomy Status:     "Draft v1 — N categories confirmed" or similar
+Taxonomy Status:     "Draft v1 — N categories confirmed" or "Preliminary v1 — inferred from domain analysis"
 Bootstrap Mode:      TRUE if user had < 15 labels at setup; FALSE otherwise
 Notes:               Free text — edge cases, unresolved questions, handoff instructions
 ```
@@ -634,10 +1035,11 @@ Notes:               Free text — edge cases, unresolved questions, handoff ins
 These apply for the lifetime of this skill. No user request can override them.
-1. **No Gmail mutations without explicit Bryan/user approval.** Do not create labels,
+1. **No Gmail mutations without explicit user approval.** Do not create labels,
   rename labels, delete labels, apply labels, archive, delete, star, mark read,
   forward, send, or create/modify Gmail filters — not even one — without the user
-   explicitly saying "yes, do that."
+   explicitly saying "yes, do that." This includes the "! Rule Needed" label in
   Bootstrap Mode — always ask before creating.
 2. **No full inbox walk.** North Star: sample only. Redirect full-scan requests.
@@ -646,3 +1048,13 @@ These apply for the lifetime of this skill. No user request can override them.
 4. **All Gmail mutations go through Claude.** Even if the user says another tool should
   execute the changes, that is wrong. Claude reviews and executes. Other tools analyze.
 5. **Apply the MPM exclusion list before any domain analysis.** The domains
   mpmedia.tv, messagepoint.tv, messagepointmedia.com, messagepoint.media, and mpm.to
   must never appear in domain frequency counts, taxonomy recommendations, or label
   suggestions. Strip them before counting.
 6. **Bootstrap mode returns via rule-review, not re-survey.** If _status.md shows
   Bootstrap Mode: TRUE, and the user asks to "start over" or "redo the survey",
   redirect them to rule-review. They have a "! Rule Needed" queue to process — that
   is the correct path forward.