---
name: survey
description: >
Guides a non-technical user through the opening session of building a Gmail classification
pipeline. Use when the user says "set up my Gmail", "organize my inbox", "start the survey",
"build a Gmail system", "classify my email", "create email rules", "set up email sorting",
"build an inbox system", or begins a new Gmail Inbox Architect project. Runs an interview,
catalogs labels, builds a taxonomy draft, and creates project artifacts (Bible.md, Tracker,
_status.md) in Google Drive. First skill to run — precedes handoff, import, deploy, and
rule-review. Part of the Gmail Inbox Architect plugin.
---
# survey — Gmail Inbox Architect
## What This Skill Does
Guides a user through the opening session of building a Gmail classification pipeline.
Catalogs their existing label structure, runs a short interview, builds a first-pass
taxonomy draft, and creates three project artifacts in Google Drive: `Bible.md` (the
project constitution), `Tracker.xlsx` or a Google Sheet (the rule registry), and
`_status.md` (the inter-agent checkpoint file).
This skill may invoke the `handoff` skill mid-run if inbox volume exceeds the sampling
threshold. That is expected behavior — not an error.
This skill is entirely self-contained. All schemas, templates, and examples are embedded
below. Do not reference any external URLs or repositories at runtime.
---
## When This Skill Is Triggered
Fire this skill when the user says any of the following (or close variants):
- "set up my inbox", "help me organize my email", "start the Gmail project"
- "build me a filter system", "let's tackle my inbox", "Gmail inbox architect"
- "start fresh with email classification", "I want to tame my inbox"
- "let's do the survey", "run the survey", "start the survey"
---
## ⚠️ North Star Constraint — Read This First, Every Time
**Never walk the full inbox. Never scan all messages. Never read every email.**
The survey builds a draft-one blueprint. It does not need to touch every email to do that.
If the user asks you to scan their entire inbox, read all their messages, or process
thousands of emails as part of this survey, respond with this exact redirect:
> "Right now my job is to build your draft-one blueprint — I just need to see the shape
> of your inbox, not read every message in it. The task orders I send to other tools will
> dig into the details; that's their job, not mine at this stage. Let's keep moving —
> I'll have something real for you to look at in just a few minutes."
Then continue with the current phase. Do not comply with the full-scan request.
---
## Dependencies
- **Google Workspace MCP** — for Gmail label catalog, bounded inbox sampling, and Drive
file/folder creation
- **xlsx skill** — required if user chooses the XLSX Tracker path; invoke it for Tracker
creation in Phase 5
- **handoff skill** — invoked automatically if sampling threshold is exceeded in Phase 4
This skill has NO other external dependencies. All schemas and templates are embedded below.
---
## Execution Flow
```
Phase 1 → Q1 + Label Catalog (immediate — show results before asking more questions)
Phase 2 → Four More Questions
Phase 3 → Taxonomy Draft (written live, section by section)
Phase 4 → Inbox Sample OR Handoff Decision
Phase 5 → Create Three Artifacts (Bible.md + Tracker + _status.md)
Phase 6 → Close Out
```
Do not skip any phase. Phases 1 and 2 interleave: ask Q1 first, run the label catalog
while you have the account, then ask Q2–Q5 conversationally while processing finishes.
---
## Phase 1 — Instant Win: What You Already Have
**Goal:** Show the user their existing label structure within the first few minutes.
This is the hook. They see something real and realize this process is already further
along than they thought.
### Step 1.1 — Ask Q1 only
Before doing anything else, ask one question:
> "What Gmail address are we setting this up for? Just confirm the account and
> we'll get started — I can usually pull up your existing label structure in about
> 30 seconds."
Wait for the answer.
### Step 1.2 — Pull the label catalog
Call `list_gmail_labels`. For each label returned, capture:
- `label_name` — full path including parent (e.g., `Finance/AMEX`)
- `parent` — the part before the first `/`, or blank if top-level
- `type` – `user` (manually created) or `system` (Gmail built-in)
- `messages_total` — total message count (if available)
- `messages_unread` — unread count (if available)
Filter OUT all system labels (Inbox, Sent, Drafts, Spam, Trash, All Mail, Important,
Starred, Scheduled, and any label with `type: system`).
Keep only `type: user` labels. These represent the user's real organizational decisions.
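
The filtering and grouping in this step can be sketched in Python. This is a minimal illustration, assuming each label arrives as a dict with `name` and `type` keys; the actual response shape of `list_gmail_labels` is not specified here, so those field names, the `SYSTEM_NAMES` set, and both helper functions are hypothetical.

```python
from collections import defaultdict

# Assumption: each label is a dict with "name" and "type" keys.
# The real response shape of list_gmail_labels may differ.
SYSTEM_NAMES = {
    "inbox", "sent", "drafts", "spam", "trash",
    "all mail", "important", "starred", "scheduled",
}


def user_labels(labels):
    """Keep only user-created labels, dropping Gmail built-ins."""
    kept = []
    for label in labels:
        name = label["name"]
        if label.get("type") != "user" or name.lower() in SYSTEM_NAMES:
            continue
        # parent is the part before the first "/", blank if top-level
        parent = name.split("/", 1)[0] if "/" in name else ""
        kept.append({**label, "label_name": name, "parent": parent})
    return kept


def group_by_top_level(labels):
    """Group label names under their top-level category for the summary."""
    groups = defaultdict(list)
    for label in labels:
        top = label["parent"] or label["label_name"]
        groups[top].append(label["label_name"])
    return dict(groups)
```

The grouped result maps directly onto the plain-English summary in Step 1.3: one line per top-level category, with a count and the label names under it.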
### Step 1.3 — Present the results
Show a clean summary grouped by top-level category. Use plain English — not raw data.
**If they have labels:**
> "Great news — you've already done more work than you might realize. Here's what
> you already have:
>
> **Finance** (3 labels) — AMEX, Invoices, Receipts
> **Vendors** (2 labels) — Shipping, Software
> ...and X more categories.
>
> That's [total] labels total. Every one represents an organizational decision you've
> already made. We're going to build on this, not start over."
**If they have very few labels (under 5):**
> "You're starting fresh — you only have [N] labels set up. That's actually fine —
> it means we get to design the whole system cleanly with no cleanup debt. Let me
> ask you a few quick questions."
**If they have zero labels:**
> "You're starting from scratch — no custom labels yet. That's the easiest starting
> point. We'll design the whole structure from the ground up. A couple of quick
> questions first."
### Step 1.4 — Store the inventory in context
Keep the full label list in memory. You need it for Phase 3 (taxonomy draft) and
Phase 5 (Bible.md). Do NOT write anything to Drive yet.
---
## Phase 2 — Four More Questions
Ask these conversationally after showing the label inventory. Never number them out loud.
Never ask two questions in the same message. Adapt the order naturally to the conversation.
### Q2 — Drive folder location
> "Where should I set up your project folder in Google Drive? I can put it anywhere —
> just name a folder and a parent location, or I can create it at the top level of your
> Drive if you're not sure."
Default if they say "wherever is fine": create `Gmail Inbox Architect — [first name or
account prefix]` at Drive root.
### Q3 — Google environment question (the key decision)
This determines XLSX vs. Google Sheet. Ask naturally:
> "Quick technical question — do all the tools you're planning to use with this project
> have a direct connection to Google Docs and Sheets? If you're not sure, just say so —
> it's easy to pick a safe default."
**Interpret answers:**
- "Yes" / "They all do" / "I'm only using Claude" → **Google Sheet**
- "No" / "Not sure" / "I use ChatGPT too" / "I have a local AI" → **XLSX + Markdown**
- "What does that mean?" → Explain, then ask again:
> "Some AI tools can read Google Sheets directly — others need a file like Excel.
> If you're only using tools connected to your Google account, we can use Google Sheets.
> If you're using anything else — ChatGPT, a local tool — use Excel instead so
> everything can read it. Which sounds like your situation?"
Record this decision. It gets written to Bible.md in Phase 5 and **must not be changed
without re-asking the user explicitly.**
### Q4 — Other agents they plan to use
> "Besides me, are you planning to use any other AI tools with this project —
> like ChatGPT, Gemini, or something you run on your own computer?"
Accept any answer. This informs the handoff skill later. Record what they say.
If "just you" — note it; handoff is still possible later if volume demands it.
### Q5 — Goals or problem areas
> "Last one — is there anything specific driving this? A type of email that's always
> a mess, or something you wish happened automatically?"
Accept any answer including "not really — just more organized." Record it for Bible.md.
---
## Phase 3 — Taxonomy Draft
**Goal:** Build a first-pass taxonomy from the label inventory and show it to the user
section by section. They should feel the system being built in front of them.
### Step 3.1 — Analyze existing labels
From Phase 1, identify:
1. **Existing top-level categories** — what the user already organized
2. **Gaps** — common business email types they're probably missing
3. **Consolidation candidates** — labels too granular or redundant
4. **Draft bylaws** — one-line rules for each confirmed category
**Reference taxonomy** (use as a starting scaffold — adapt to what you actually found):
```
Finance/
→ Anything money-in or money-out: invoices, receipts, statements, payments
Sublabels: AMEX, Ramp, Invoices, Receipts, Purchase-Orders
Operations/
→ Internal business operations: benefits, insurance, HR, legal, facilities
Sublabels: Benefits, Insurance, HR-Payroll, Legal, Facilities
Vendors/
→ Any company you buy from: hardware, software, services, shipping
Sublabels: Hardware, Software, Shipping, Services
Clients/
→ Any email where a client or reseller is a participant
Sublabels: Resellers/, Transit/, Corporate/
Development/
→ Software tools, code platforms, technical notifications
Sublabels: GitHub, GitLab, Odoo
Marketing/
→ Outbound marketing, trade shows, events
Sublabels: Trade-Shows/, Events/
Sales/
→ Inbound sales activity: RFPs, quotes, prospects
Sublabels: RFPs, Quotes, Prospects
Travel/
→ All travel bookings and confirmations
Sublabels: Air, Hotel, Car
Notifications/
→ Automated alerts that don't need a reply: shipping, system, order confirmations
Sublabels: System, Shipping, Order-Confirmations
Personal/
→ Non-work email using this account
Support/
→ Customer support requests and interactions
Documentation/
→ Technical documents, engineering files, contracts
```
Map the user's existing labels into this structure first. Then propose additions.
### Step 3.2 — Present the taxonomy draft
Show the user a clean indented list. Mark proposed additions clearly.
> "Here's my first draft of your label structure. I've kept everything you already have
> and added a few suggestions. Let me know what looks wrong — nothing is locked in yet.
>
> [Show taxonomy as clean indented list]
>
> Items marked *proposed* are things I think you're missing based on what I see.
> You don't have to add them now; just flag anything that looks off."
Wait for feedback. Accept corrections. Adjust. You don't need perfection — just a
working draft-one. Once the user says "looks good" or "let's keep going," move on.
### Step 3.3 — Confirm bylaws
For each top-level category the user confirms, you should have a one-line bylaw.
If you're not sure:
> "What kinds of email should go in [Label]? Just a rough description."
Keep bylaws short — one sentence each. You'll write them into Bible.md in Phase 5.
---
## Phase 4 — Inbox Sample or Handoff Decision
**Goal:** Validate the taxonomy draft against a small bounded sample of actual email,
OR decide this is too large for inline sampling and invoke `handoff`.
### Step 4.1 — Assess whether to sample or hand off
**Invoke `handoff` instead of sampling if ANY of the following is true:**
- User has more than 20 active user labels
- Any single label has more than 500 messages
- User described inbox as "out of control," "thousands of unread," or similar
- User said in Q4 they have a flat-rate tool available (ChatGPT, Gemini)
If handoff threshold is met:
> "Your inbox is substantial enough that I'd burn a lot of time and cost if I tried
> to sample it here. The smarter move is to hand this off to [tool from Q4, or 'a
> flat-rate tool'] for the heavy analysis — they can walk through the whole thing,
> then bring the results back to me for review. Want me to set that up?"
Then invoke the `handoff` skill with the current project context.
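
The four handoff conditions above can be read as a simple any-of check. The following Python sketch shows one way to express it; the `SurveyState` container, the phrase list, and the tool names in `FLAT_RATE_TOOLS` are assumptions for illustration. In practice the "or similar" wording calls for judgment rather than substring matching.

```python
from dataclasses import dataclass, field

MAX_LABELS = 20               # more than this many user labels -> hand off
MAX_MESSAGES_PER_LABEL = 500  # any single label above this -> hand off
# Hypothetical phrase list; the skill text says "or similar".
OVERWHELM_PHRASES = ("out of control", "thousands of unread")
FLAT_RATE_TOOLS = {"chatgpt", "gemini"}


@dataclass
class SurveyState:
    label_message_counts: dict                       # label_name -> messages_total (or None)
    inbox_description: str = ""                      # how the user described the inbox
    other_tools: list = field(default_factory=list)  # Q4 answer


def handoff_reasons(state):
    """Return the list of handoff conditions that are met (empty = sample inline)."""
    reasons = []
    if len(state.label_message_counts) > MAX_LABELS:
        reasons.append("more than 20 user labels")
    counts = state.label_message_counts.values()
    if any(n is not None and n > MAX_MESSAGES_PER_LABEL for n in counts):
        reasons.append("a label has more than 500 messages")
    text = state.inbox_description.lower()
    if any(phrase in text for phrase in OVERWHELM_PHRASES):
        reasons.append("user described the inbox as overwhelming")
    if any(tool.lower() in FLAT_RATE_TOOLS for tool in state.other_tools):
        reasons.append("a flat-rate tool is available")
    return reasons


def should_hand_off(state):
    return bool(handoff_reasons(state))
```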
**If sampling is appropriate (20 or fewer labels, manageable message counts):**
### Step 4.2 — Sample (bounded)
For each top-level label category (NOT every sublabel — one representative search
per top-level):
- Call `search_gmail_messages` with `label:[labelname]` and `max_results: 10`
- Look at sender domains and subject line patterns ONLY — do NOT read message bodies
- Note: which domains appear, any obvious patterns, anything that doesn't fit the bylaw
**Hard limit: 15 total search calls maximum. Stop at 15 even if labels remain.**
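
As an illustration of the bounds above, this Python sketch plans the sample searches without running them: one query per top-level category, 10 results each, capped at 15 calls. The function name and the returned dict shape are hypothetical, and the `label:` query is built exactly as written in this step. Labels containing spaces may need different quoting, which the sketch does not handle.

```python
MAX_SEARCH_CALLS = 15
RESULTS_PER_SEARCH = 10


def plan_sample_searches(label_names):
    """Plan one bounded search per top-level category.

    Returns (plan, skipped): the searches to run, and the top-level
    categories left out because the 15-call limit was reached.
    """
    top_levels = []
    for name in label_names:
        top = name.split("/", 1)[0]
        if top not in top_levels:  # keep first-seen order, no duplicates
            top_levels.append(top)

    plan = [
        {"query": f"label:{top}", "max_results": RESULTS_PER_SEARCH}
        for top in top_levels[:MAX_SEARCH_CALLS]
    ]
    skipped = top_levels[MAX_SEARCH_CALLS:]
    return plan, skipped
```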
Summarize what you found:
> "I took a quick look across [N] of your label categories. A few things I noticed:
> [2-3 observations]. This tells me [what it means for the taxonomy or rules]."
Adjust the taxonomy draft if anything looks obviously misclassified.
---
## Phase 5 — Create the Three Artifacts
Create the project folder first, then the three artifacts inside it. Show each link to the user as it's created.
### Artifact A: Project Drive Folder
Call `create_drive_folder` using the location confirmed in Q2.
Record the folder ID and URL — required for artifacts B, C, and D.
### Artifact B: Bible.md
Call `create_drive_file` in the project folder with the content below.
Use the actual session data where placeholders appear.
**Bible.md full template:**
```markdown
# Gmail Inbox Architect — Project Bible
**Created:** [YYYY-MM-DD]
**Account:** [email address from Q1]
**Project Folder:** [Drive folder URL]
**Phase:** Survey complete — ready for analysis
---
## Environment Decision
**Tracker format:** [Google Sheet / XLSX — from Q3]
**Reason:** [One sentence — what the user said or the default applied]
**Rule:** This format must not change without explicitly re-asking the user.
---
## Label Taxonomy v1
> Draft one. Built from existing labels + initial survey review.
> Update this section as the taxonomy is refined through task order returns.
[Paste the confirmed taxonomy as a clean indented list]
### Bylaws
[One line per top-level category — what belongs in it]
Example format:
- **Finance**: Any email involving money moving in or out — invoices, statements, receipts, payments
- **Vendors**: Any company we buy from — hardware, software, shipping, services
- **Notifications**: Automated alerts that don't need a reply — no action required
---
## Key Decisions Log
| Date | Decision | Reason |
|---|---|---|
| [today] | Tracker format: [format] | [Q3 answer or default] |
| [today] | [Other decisions made during session] | [Reason] |
---
## Open Questions
[Anything left unresolved — labels the user was unsure about, patterns that
need more data, edge cases flagged for a future session. Use bullet points.]
---
## Tools & Agents Involved
- **Claude (CoWork)** — Architect. All Gmail mutations and filter deployment go through Claude.
[Add any other tools from Q4:]
- [Tool name] — Analyst role. Returns batch work to Claude for review.
---
## Goals & Context
[Q5 answer — what the user said was driving this project]
---
## Change Log
| Date | Agent | Change |
|---|---|---|
| [today] | Claude (survey) | Initial Bible created. Taxonomy v1 drafted. |
```
After creating the file, share the link:
> "Your project Bible is set up — [link]. Every decision we make goes in here.
> Any tool that works on this project reads it first."
### Artifact C: Tracker
**Tracker column schema** (embed this — do not reference external files):
| Column | Type | Description |
|---|---|---|
| `rule_id` | text | Unique ID (e.g., GF-FINANCE-AMEX) |
| `enabled` | boolean | TRUE = active, FALSE = disabled |
| `priority` | number | Execution order when rules overlap (1 = highest) |
| `rule_name` | text | Human-readable description |
| `from_domain` | text | Sender domain(s), comma-separated |
| `to_or_cc_domain` | text | Recipient/CC domain for participant-pattern rules |
| `subject_contains` | text | Subject keyword(s), comma-separated |
| `has_attachment` | boolean | TRUE if attachment presence is a condition |
| `base_label` | text | Target Gmail label (full path, e.g., Finance/AMEX) |
| `archive` | boolean | TRUE = skip inbox on arrival |
| `mark_read` | boolean | TRUE = auto-mark read (use sparingly) |
| `queue_for_ai_process` | boolean | TRUE = flag for downstream AI review |
| `deployability` | text | gmail_filter_safe / apps_script_needed |
| `confidence` | text | high / medium / low — how certain is this rule |
| `risk` | text | low / medium / high — false-positive risk |
| `notes` | text | Decisions, exceptions, edge cases |
**Example rows** (for Claude's reference — do NOT pre-populate the user's Tracker):
```
rule_id,enabled,priority,rule_name,from_domain,to_or_cc_domain,subject_contains,has_attachment,base_label,archive,mark_read,queue_for_ai_process,deployability,confidence,risk,notes
GF-FINANCE-AMEX,TRUE,1,American Express statements,americanexpress.com,,statement,FALSE,Finance/AMEX,FALSE,FALSE,FALSE,gmail_filter_safe,high,low,From-only pattern; statements only
GF-VENDOR-SHIPPING,TRUE,2,Shipping notifications,ups.com,,,,Vendors/Shipping,TRUE,TRUE,FALSE,gmail_filter_safe,high,low,Archive + mark-read; pure notifications
GF-CLIENT-ACME,TRUE,1,Acme Transit (client),acme-transit.com,acme-transit.com,,FALSE,Clients/Transit,FALSE,FALSE,FALSE,gmail_filter_safe,high,low,Participant pattern — catches both inbound and CC
GF-NOTIF-GITHUB,TRUE,3,GitHub notifications,github.com,,,,Development/GitHub,FALSE,FALSE,FALSE,gmail_filter_safe,high,low,High volume; no mark-read
AS-DOC-STEPFILE,TRUE,1,Engineering STEP files,,,,TRUE,Documentation,FALSE,FALSE,FALSE,apps_script_needed,high,low,Attachment scan: *.stp *.step — Apps Script required
```
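
For reference, rows in this shape can be checked against the column schema with a short script. The sketch below uses Python's standard `csv` module; the function name and the decision to treat a blank boolean cell as "not set" (as in some example rows above) are assumptions, not part of the skill.

```python
import csv
import io

TRACKER_COLUMNS = [
    "rule_id", "enabled", "priority", "rule_name", "from_domain",
    "to_or_cc_domain", "subject_contains", "has_attachment", "base_label",
    "archive", "mark_read", "queue_for_ai_process", "deployability",
    "confidence", "risk", "notes",
]
BOOLEAN_COLUMNS = (
    "enabled", "has_attachment", "archive", "mark_read", "queue_for_ai_process",
)
ALLOWED_VALUES = {
    "deployability": {"gmail_filter_safe", "apps_script_needed"},
    "confidence": {"high", "medium", "low"},
    "risk": {"low", "medium", "high"},
}


def check_tracker_csv(text):
    """Return a list of problems found in Tracker CSV text (empty = clean)."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header != TRACKER_COLUMNS:
        return ["header row does not match the Tracker schema"]
    problems = []
    for line_no, row in enumerate(reader, start=2):
        if not row:
            continue  # ignore blank lines
        if len(row) != len(TRACKER_COLUMNS):
            problems.append(f"line {line_no}: expected 16 fields, got {len(row)}")
            continue
        record = dict(zip(TRACKER_COLUMNS, row))
        if not record["priority"].isdigit():
            problems.append(f"line {line_no}: priority must be a number")
        for col in BOOLEAN_COLUMNS:
            # Assumption: a blank cell means "not set"
            if record[col] not in ("TRUE", "FALSE", ""):
                problems.append(f"line {line_no}: {col} must be TRUE or FALSE")
        for col, allowed in ALLOWED_VALUES.items():
            if record[col] not in allowed:
                problems.append(f"line {line_no}: unexpected {col} value")
    return problems
```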
**If Google Sheet path:**
Call `create_spreadsheet` in the project folder named `Gmail Inbox Architect Tracker`.
Create two sheets:
- Sheet 1 named **Rules** — add the header row from the schema above
- Sheet 2 named **About** — add this text:
```
This tracker is the source of truth for all Gmail classification rules.
Columns:
rule_id — Unique rule identifier
enabled — TRUE/FALSE: whether the rule is active
priority — Execution order (1 = highest priority)
rule_name — Plain-English description
from_domain — Sender domain(s) the rule matches
to_or_cc_domain — Recipient domain for two-way relationship rules
subject_contains — Subject keyword trigger
has_attachment — Whether attachment presence matters
base_label — The Gmail label to apply
archive — Skip the inbox (yes/no)
mark_read — Auto-mark read (use sparingly)
queue_for_ai_process — Flag for AI downstream review
deployability — gmail_filter_safe or apps_script_needed
confidence — How certain we are this rule is correct
risk — False-positive risk level
notes — Decisions, exceptions, edge cases
```
**If XLSX path:**
Invoke the `xlsx` skill to create `Tracker.xlsx` in the project folder with:
- Sheet 1 named "Rules" with the header row from the schema above
- Sheet 2 named "About" with the column descriptions above
Do NOT pre-populate any data rows.
After creating, share the link:
> "Your Tracker is ready — [link]. It's empty right now, which is exactly right.
> Rules will be added here as task orders come back, and you'll review each one
> before anything changes in Gmail."
### Artifact D: _status.md
Call `create_drive_file` in the project folder with this content:
**_status.md full template:**
```markdown
# Gmail Inbox Architect — Project Status
**Last Updated:** [YYYY-MM-DD HH:MM UTC]
**Last Agent:** Claude (CoWork) — survey skill
**Phase:** Survey complete
**Last Completed Step:** Survey complete. Taxonomy v1 drafted. Bible.md, Tracker, and _status.md created.
**Pending Work:** [NONE — or describe if handoff was triggered]
**Handoff Target:** [NONE — or agent name + what was handed off]
**Tracker Format:** [Google Sheet / XLSX]
**Tracker Location:** [Drive URL or file path]
**Bible Location:** [Drive URL]
**Labels Cataloged:** [N]
**Taxonomy Status:** Draft v1 — [N] top-level categories confirmed
**Notes:** [Anything the next agent or session needs to know — edge cases, unresolved questions, decisions made]
---
## How To Use This File
This file is the handoff checkpoint for the Gmail Inbox Architect project.
Any agent starting a new session should:
1. Read this file first to understand current state
2. Read Bible.md for full project context
3. Open the Tracker to see the current rule inventory
4. Continue from "Pending Work" above
```
---
## Phase 6 — Close Out
Tell the user exactly what they now have and what comes next. Be specific. Be brief.
**Script (adapt to their situation):**
> "Here's what we built today:
>
> ✅ **Project Bible** — [link]
> Your taxonomy v1, all decisions made, and the ground rules for the whole project.
>
> ✅ **Tracker** — [link]
> Empty and ready. Rules get added here as analysis comes back, one batch at a time.
>
> ✅ **Status file** — saved in your project folder.
> Bookmarks exactly where we are so the next session picks up from the right place."
[If handoff was triggered, add:]
> "I've also prepared a task order for [tool]. Here's what to do:
> [Step-by-step instructions — paste prompt here, attach these files, bring back the result]."
[If no handoff:]
> "When you're ready for the next step, come back and say 'let's import the rules' —
> I'll walk you through reviewing and setting up the first batch of automatic sorting.
>
> **You don't need to do anything before then.** Your project files are all saved in Drive."
---
## Non-Technical User Language Rules
Apply these to every message during this skill. No exceptions.
**Never use these words:**
filter, API, MCP, schema, regex, syntax, endpoint, JSON, XML, MIME type, base64,
participant-domain pattern, deployability, curl, query, boolean
**Always substitute:**
- "filter" → "automatic sorting rule" or "automatic rule"
- "deploy / deployment" → "turn on" or "set up"
- "API / MCP" → skip entirely or say "a connection to Gmail"
- "schema" → "format" or "structure"
- "regex" → "a pattern" or "matching rule"
- "participant-domain" → "anyone involved in the conversation"
- "deployability" → "which tool handles this"
- "boolean" → "yes/no"
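
A draft message can be screened against the banned-word list before it is sent. The Python sketch below is one possible check, not part of the skill itself; the function name is hypothetical, and the matching is deliberately simple (whole words plus a trailing "s"), so forms like "filtering" are not caught.

```python
import re

BANNED_TERMS = [
    "filter", "API", "MCP", "schema", "regex", "syntax", "endpoint",
    "JSON", "XML", "MIME type", "base64", "participant-domain pattern",
    "deployability", "curl", "query", "boolean",
]

# Whole-word, case-insensitive match; an optional trailing "s" covers plurals.
_BANNED_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in BANNED_TERMS) + r")s?\b",
    re.IGNORECASE,
)


def banned_terms_in(message):
    """Return the banned terms found in a draft message, lowercased and sorted."""
    return sorted({m.group(1).lower() for m in _BANNED_PATTERN.finditer(message)})
```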
**One question per message.** Never ask two questions at once.
**Progress updates after each phase.** "We're about halfway through the setup" is fine.
Never show them phase numbers or reference this document.
**Uncertainty is valid.** If they say "I don't know" to any question, make a reasonable
default, state it in plain English, and move on.
**Celebrate what they already have.** Frame existing labels as a foundation, not a mess.
"You've already made the hard decisions" is almost always true and always useful.
---
## Embedded Reference Data
### Standard Top-Level Category Bylaws
Use these as defaults when drafting bylaws. Adjust to what the user actually confirms.
```
Finance: Anything involving money moving in or out — invoices, receipts,
statements, payments, purchase orders, card charges
Operations: Internal business management — benefits, insurance, HR, legal, facilities
Vendors: Any company or service you pay — hardware, software, shipping, services
Clients: Any email where a client, reseller, or prospect is a participant
Development: Software platforms, code repositories, technical notifications
Marketing: Outbound marketing activity — trade shows, campaigns, events
Sales: Inbound sales activity — RFPs, quotes, inquiries, prospects
Travel: All travel bookings and itinerary confirmations — air, hotel, car, transit
Notifications: Automated alerts that don't need a reply — no human follow-up required
Personal: Non-work email arriving at this account
Support: Customer support requests and active support threads
Documentation: Technical documents, engineering files, signed contracts, specs
```
### _status.md Field Definitions
For Claude's reference when populating the status file:
```
Last Updated: Timestamp this file was written — YYYY-MM-DD HH:MM UTC
Last Agent: Name/tool that last updated this file
Phase: Survey / Analysis / Rule Build / Deploy / Review / Maintenance
Last Completed Step: Plain-English description of what just finished
Pending Work: What needs to happen next — "NONE" if nothing is queued
Handoff Target: Which tool has the next task, or "NONE"
Tracker Format: "Google Sheet" or "XLSX" — set once in survey, never changed without re-asking
Tracker Location: Full Drive URL (Google Sheet) or Drive file path (XLSX)
Bible Location: Full Drive URL to Bible.md
Labels Cataloged: Integer count of user labels found in Phase 1
Taxonomy Status: "Draft v1 — N categories confirmed" or similar
Notes: Free text — edge cases, unresolved questions, handoff instructions
```
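
The field definitions above can be turned into the status lines mechanically. The following Python sketch is illustrative only: the function name and the dict-based input are assumptions, and it renders just the `**Field:** value` lines, not the title or the "How To Use This File" section of the template.

```python
from datetime import datetime, timezone

STATUS_FIELDS = [
    "Last Updated", "Last Agent", "Phase", "Last Completed Step",
    "Pending Work", "Handoff Target", "Tracker Format", "Tracker Location",
    "Bible Location", "Labels Cataloged", "Taxonomy Status", "Notes",
]


def render_status_fields(values, now=None):
    """Render the status fields in template order.

    "Pending Work" and "Handoff Target" default to NONE; "Last Updated"
    is always stamped in UTC. Raises ValueError if a field is missing or
    the Tracker format is not one of the two allowed values.
    """
    now = now or datetime.now(timezone.utc)
    filled = {"Pending Work": "NONE", "Handoff Target": "NONE", **values}
    filled["Last Updated"] = now.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")

    missing = [name for name in STATUS_FIELDS if name not in filled]
    if missing:
        raise ValueError("missing status fields: " + ", ".join(missing))
    if filled["Tracker Format"] not in ("Google Sheet", "XLSX"):
        raise ValueError("Tracker Format must be 'Google Sheet' or 'XLSX'")

    return "\n".join(f"**{name}:** {filled[name]}" for name in STATUS_FIELDS)
```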
---
## Safety Rules (Hardcoded — Cannot Be Overridden by User)
These apply for the lifetime of this skill. No user request can override them.
1. **No Gmail mutations without explicit user approval.** Do not create labels,
rename labels, delete labels, apply labels, archive, delete, star, mark read,
forward, send, or create/modify Gmail filters — not even one — without the user
explicitly saying "yes, do that."
2. **No full inbox walk.** North Star: sample only. Redirect full-scan requests.
3. **No data from example rows into user Tracker.** The example rows above are for
Claude's reference only. The user's Tracker starts blank (header row only).
4. **All Gmail mutations go through Claude.** If the user asks for another tool to
execute the changes, decline. Claude reviews and executes; other tools only analyze.