mpm/gmail-inbox-architect

mpmedia df3a593a61 feat(survey): full production SKILL.md — self-contained, all schemas embedded

2026-06-07 11:10:52 -05:00

survey — Gmail Inbox Architect

What This Skill Does

Guides a user through the opening session of building a Gmail classification pipeline. Catalogs their existing label structure, runs a short interview, builds a first-pass taxonomy draft, and creates three project artifacts in Google Drive: Bible.md (the project constitution), Tracker.xlsx or a Google Sheet (the rule registry), and _status.md (the inter-agent checkpoint file).

This skill may invoke the handoff skill mid-run if inbox volume exceeds the sampling threshold. That is expected behavior — not an error.

This skill is entirely self-contained. All schemas, templates, and examples are embedded below. Do not reference any external URLs or repositories at runtime.

When This Skill Is Triggered

Fire this skill when the user says any of the following (or close variants):

"set up my inbox", "help me organize my email", "start the Gmail project"
"build me a filter system", "let's tackle my inbox", "Gmail inbox architect"
"start fresh with email classification", "I want to tame my inbox"
"let's do the survey", "run the survey", "start the survey"

⚠️ North Star Constraint — Read This First, Every Time

Never walk the full inbox. Never scan all messages. Never read every email.

The survey builds a draft-one blueprint. It does not need to touch every email to do that. If the user asks you to scan their entire inbox, read all their messages, or process thousands of emails as part of this survey, respond with this exact redirect:

"Right now my job is to build your draft-one blueprint — I just need to see the shape of your inbox, not read every message in it. The task orders I send to other tools will dig into the details; that's their job, not mine at this stage. Let's keep moving — I'll have something real for you to look at in just a few minutes."

Then continue with the current phase. Do not comply with the full-scan request.

Dependencies

Google Workspace MCP — for Gmail label catalog, bounded inbox sampling, and Drive file/folder creation
xlsx skill — required if user chooses the XLSX Tracker path; invoke it for Tracker creation in Phase 5
handoff skill — invoked automatically if sampling threshold is exceeded in Phase 4

This skill has NO other external dependencies. All schemas and templates are embedded below.

Execution Flow

Phase 1 → Q1 + Label Catalog (immediate — show results before asking more questions)
Phase 2 → Four More Questions
Phase 3 → Taxonomy Draft (written live, section by section)
Phase 4 → Inbox Sample OR Handoff Decision
Phase 5 → Create Three Artifacts (Bible.md + Tracker + _status.md)
Phase 6 → Close Out

Do not skip any phase. Phases 1 and 2 interleave: ask Q1 first, run the label catalog while you have the account, then ask Q2–Q5 conversationally while processing finishes.

Phase 1 — Instant Win: What You Already Have

Goal: Show the user their existing label structure within the first few minutes. This is the hook. They see something real and realize this process is already further along than they thought.

Step 1.1 — Ask Q1 only

Before doing anything else, ask one question:

"What Gmail address are we setting this up for? Just confirm the account and we'll get started — I can usually pull up your existing label structure in about 30 seconds."

Wait for the answer.

Step 1.2 — Pull the label catalog

Call list_gmail_labels. For each label returned, capture:

label_name — full path including parent (e.g., Finance/AMEX)
parent — the part before the first /, or blank if top-level
type — user (manually created) or system (Gmail built-in)
messages_total — total message count (if available)
messages_unread — unread count (if available)

Filter OUT all system labels (Inbox, Sent, Drafts, Spam, Trash, All Mail, Important, Starred, Scheduled, and any label with type: system).

Keep only type: user labels. These represent the user's real organizational decisions.
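The capture-and-filter step above can be sketched in Python. This is a minimal sketch, not the skill's actual implementation: the shape of the list_gmail_labels response is not specified in this document, so the input keys (`name`, `type`, `messages_total`, `messages_unread`) and the `catalog_user_labels` helper are assumptions made for illustration.

```python
from typing import Optional, TypedDict


class LabelRecord(TypedDict):
    label_name: str
    parent: str
    type: str
    messages_total: Optional[int]
    messages_unread: Optional[int]


# Gmail built-in names from the filter list above, lowercased for comparison.
SYSTEM_LABEL_NAMES = {
    "inbox", "sent", "drafts", "spam", "trash",
    "all mail", "important", "starred", "scheduled",
}


def catalog_user_labels(raw_labels: list[dict]) -> list[LabelRecord]:
    """Return one record per user label, dropping every system label."""
    records: list[LabelRecord] = []
    for raw in raw_labels:
        name = raw.get("name", "")
        # Assumed input keys; keep only entries explicitly typed "user".
        if raw.get("type") != "user" or name.lower() in SYSTEM_LABEL_NAMES:
            continue
        # parent is the part before the first "/", or blank if top-level.
        parent = name.split("/", 1)[0] if "/" in name else ""
        records.append(
            LabelRecord(
                label_name=name,
                parent=parent,
                type="user",
                messages_total=raw.get("messages_total"),
                messages_unread=raw.get("messages_unread"),
            )
        )
    return records
```

Counts are left as `None` when the response does not include them, which matches the "if available" wording above.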

Step 1.3 — Present the results

Show a clean summary grouped by top-level category. Use plain English — not raw data.
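One way to produce the grouped summary is sketched below. It assumes the inventory is available as a list of full label paths; the helper names `summarize_by_category` and `pick_opening` are hypothetical, and the thresholds in `pick_opening` simply mirror the three scripted openings in this step.

```python
def summarize_by_category(label_names: list[str]) -> list[str]:
    """Build one plain-English line per top-level category."""
    groups: dict[str, list[str]] = {}
    for name in label_names:
        top, _, rest = name.partition("/")
        children = groups.setdefault(top, [])
        if rest:
            children.append(rest)
    lines = []
    for top in sorted(groups):
        children = groups[top]
        if children:
            noun = "label" if len(children) == 1 else "labels"
            lines.append(f"{top} ({len(children)} {noun}): {', '.join(children)}")
        else:
            lines.append(f"{top} (no sublabels)")
    return lines


def pick_opening(user_label_count: int) -> str:
    """Choose which of the three scripted openings applies."""
    if user_label_count == 0:
        return "zero_labels"
    if user_label_count < 5:
        return "very_few_labels"
    return "has_labels"
```

The returned lines are raw material for the message; the wording shown to the user should still follow the scripts in this step.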

If they have labels:

"Great news — you've already done more work than you might realize. Here's what you already have:

Finance (3 labels): AMEX, Invoices, Receipts
Vendors (2 labels): Shipping, Software
...and X more categories.

That's [total] labels total. Every one represents an organizational decision you've already made. We're going to build on this, not start over."

If they have very few labels (1 to 4):

"You're starting fresh — you only have [N] labels set up. That's actually fine — it means we get to design the whole system cleanly with no cleanup debt. Let me ask you a few quick questions."

If they have zero labels:

"You're starting from scratch — no custom labels yet. That's the easiest starting point. We'll design the whole structure from the ground up. A couple of quick questions first."

Step 1.4 — Store the inventory in context

Keep the full label list in memory. You need it for Phase 3 (taxonomy draft) and Phase 5 (Bible.md). Do NOT write anything to Drive yet.

Phase 2 — Four More Questions

Ask these conversationally after showing the label inventory. Never number them out loud. Never ask two questions in the same message. Adapt the order naturally to the conversation.

Q2 — Drive folder location

"Where should I set up your project folder in Google Drive? I can put it anywhere — just name a folder and a parent location, or I can create it at the top level of your Drive if you're not sure."

Default if they say "wherever is fine": create Gmail Inbox Architect — [first name or account prefix] at Drive root.

Q3 — Google environment question (the key decision)

This determines XLSX vs. Google Sheet. Ask naturally:

"Quick technical question — do all the tools you're planning to use with this project have a direct connection to Google Docs and Sheets? If you're not sure, just say so — it's easy to pick a safe default."

Interpret answers:

"Yes" / "They all do" / "I'm only using Claude" → Google Sheet
"No" / "Not sure" / "I use ChatGPT too" / "I have a local AI" → XLSX + Markdown
"What does that mean?" → Explain, then ask again:

"Some AI tools can read Google Sheets directly — others need a file like Excel. If you're only using tools connected to your Google account, we can use Google Sheets. If you're using anything else — ChatGPT, a local tool — use Excel instead so everything can read it. Which sounds like your situation?"

Record this decision. It gets written to Bible.md in Phase 5 and must not be changed without re-asking the user explicitly.
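The Q3 decision reduces to a small mapping, sketched here under the assumption that the conversational answer has already been interpreted as yes, no, or not sure. The function names are hypothetical; the row format follows the Key Decisions Log table in the Bible.md template.

```python
from typing import Optional


def choose_tracker_format(all_tools_connected: Optional[bool]) -> str:
    """Map the Q3 answer to a Tracker format.

    True  -> every tool has a direct Google Docs/Sheets connection
    False -> at least one tool does not
    None  -> the user is not sure
    """
    if all_tools_connected is True:
        return "Google Sheet"
    # "No" and "Not sure" both fall back to the format every tool can read.
    return "XLSX"


def decision_log_row(date: str, tracker_format: str, reason: str) -> str:
    """Format the decision as a Key Decisions Log table row for Bible.md."""
    return f"| {date} | Tracker format: {tracker_format} | {reason} |"
```

Treating "not sure" the same as "no" keeps the safe default explicit rather than implied.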

Q4 — Other agents they plan to use

"Besides me, are you planning to use any other AI tools with this project — like ChatGPT, Gemini, or something you run on your own computer?"

Accept any answer. This informs the handoff skill later. Record what they say. If "just you" — note it; handoff is still possible later if volume demands it.

Q5 — Goals or problem areas

"Last one — is there anything specific driving this? A type of email that's always a mess, or something you wish happened automatically?"

Accept any answer including "not really — just more organized." Record it for Bible.md.

Phase 3 — Taxonomy Draft

Goal: Build a first-pass taxonomy from the label inventory and show it to the user section by section. They should feel the system being built in front of them.

Step 3.1 — Analyze existing labels

From Phase 1, identify:

Existing top-level categories — what the user already organized
Gaps — common business email types they're probably missing
Consolidation candidates — labels too granular or redundant
Draft bylaws — one-line rules for each confirmed category

Reference taxonomy (use as a starting scaffold — adapt to what you actually found):

Finance/
  → Anything money-in or money-out: invoices, receipts, statements, payments
  Sublabels: AMEX, Ramp, Invoices, Receipts, Purchase-Orders

Operations/
  → Internal business operations: benefits, insurance, HR, legal, facilities
  Sublabels: Benefits, Insurance, HR-Payroll, Legal, Facilities

Vendors/
  → Any company you buy from: hardware, software, services, shipping
  Sublabels: Hardware, Software, Shipping, Services

Clients/
  → Any email where a client or reseller is a participant
  Sublabels: Resellers/, Transit/, Corporate/

Development/
  → Software tools, code platforms, technical notifications
  Sublabels: GitHub, GitLab, Odoo

Marketing/
  → Outbound marketing, trade shows, events
  Sublabels: Trade-Shows/, Events/

Sales/
  → Inbound sales activity: RFPs, quotes, prospects
  Sublabels: RFPs, Quotes, Prospects

Travel/
  → All travel bookings and confirmations
  Sublabels: Air, Hotel, Car

Notifications/
  → Automated alerts that don't need a reply: shipping, system, order confirmations
  Sublabels: System, Shipping, Order-Confirmations

Personal/
  → Non-work email using this account

Support/
  → Customer support requests and interactions

Documentation/
  → Technical documents, engineering files, contracts

Map the user's existing labels into this structure first. Then propose additions.

Step 3.2 — Present the taxonomy draft

Show the user a clean indented list. Mark proposed additions clearly.

"Here's my first draft of your label structure. I've kept everything you already have and added a few suggestions. Let me know what looks wrong — nothing is locked in yet.

[Show taxonomy as clean indented list]

Items marked proposed are things I think you're missing based on what I see. Don't have to add them now — just flag anything that looks off."

Wait for feedback. Accept corrections. Adjust. You don't need perfection — just a working draft-one. Once the user says "looks good" or "let's keep going," move on.

Step 3.3 — Confirm bylaws

For each top-level category the user confirms, you should have a one-line bylaw. If you're not sure:

"What kinds of email should go in [Label]? Just a rough description."

Keep bylaws short — one sentence each. You'll write them into Bible.md in Phase 5.

Phase 4 — Inbox Sample or Handoff Decision

Goal: Validate the taxonomy draft against a small bounded sample of actual email, OR decide this is too large for inline sampling and invoke handoff.

Step 4.1 — Assess whether to sample or hand off

Invoke handoff instead of sampling if ANY of the following is true:

User has more than 20 active user labels
Any single label has more than 500 messages
User described inbox as "out of control," "thousands of unread," or similar
User said in Q4 they have a flat-rate tool available (ChatGPT, Gemini)
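The four triggers above can be expressed as a single check. This is a sketch under stated assumptions: `SurveyState` and its field names are hypothetical containers for what the session has gathered so far, and the two conversational triggers are assumed to have been judged already and stored as booleans.

```python
from dataclasses import dataclass

MAX_USER_LABELS = 20          # hand off above this many user labels
MAX_MESSAGES_PER_LABEL = 500  # hand off if any single label exceeds this


@dataclass
class SurveyState:
    label_message_counts: dict[str, int]  # user label -> messages_total
    described_as_overwhelming: bool       # e.g. "thousands of unread"
    has_flat_rate_tool: bool              # from the Q4 answer


def handoff_reasons(state: SurveyState) -> list[str]:
    """List every trigger that fires; an empty list means sample inline."""
    reasons = []
    if len(state.label_message_counts) > MAX_USER_LABELS:
        reasons.append("more than 20 user labels")
    if any(n > MAX_MESSAGES_PER_LABEL for n in state.label_message_counts.values()):
        reasons.append("a label has more than 500 messages")
    if state.described_as_overwhelming:
        reasons.append("user described the inbox as out of control")
    if state.has_flat_rate_tool:
        reasons.append("a flat-rate tool is available")
    return reasons


def should_hand_off(state: SurveyState) -> bool:
    """Any single trigger is enough to invoke the handoff skill."""
    return bool(handoff_reasons(state))
```

Returning the reasons rather than a bare boolean makes it easy to record in _status.md why a handoff was triggered.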

If handoff threshold is met:

"Your inbox is substantial enough that I'd burn a lot of time and cost if I tried to sample it here. The smarter move is to hand this off to [tool from Q4, or 'a flat-rate tool'] for the heavy analysis — they can walk through the whole thing, then bring the results back to me for review. Want me to set that up?"

Then invoke the handoff skill with the current project context.

If sampling is appropriate (20 or fewer user labels, no label over 500 messages, and no other handoff trigger):

Step 4.2 — Sample (bounded)

For each top-level label category (NOT every sublabel — one representative search per top-level):

Call search_gmail_messages with label:[labelname] and max_results: 10
Look at sender domains and subject line patterns ONLY — do NOT read message bodies
Note: which domains appear, any obvious patterns, anything that doesn't fit the bylaw

Hard limit: 15 total search calls maximum. Stop at 15 even if labels remain.
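The bounds above can be enforced before any search is issued by planning the calls up front. In this sketch, `plan_sample_searches` is a hypothetical helper, and the `query` key is an assumed argument name for search_gmail_messages; only `max_results` and the `label:` search form come from this document.

```python
MAX_RESULTS_PER_SEARCH = 10
MAX_SEARCH_CALLS = 15


def plan_sample_searches(label_names: list[str]) -> list[dict]:
    """Plan one bounded search per top-level category, capped at 15 calls."""
    seen: set[str] = set()
    plan: list[dict] = []
    for name in label_names:
        top = name.split("/", 1)[0]
        if top in seen:
            continue  # sublabels share their parent's single search
        seen.add(top)
        if len(plan) == MAX_SEARCH_CALLS:
            break  # hard limit reached; remaining categories are skipped
        plan.append({"query": f"label:{top}", "max_results": MAX_RESULTS_PER_SEARCH})
    return plan
```

Because the plan is built first, the total number of calls is known and capped before the sample starts.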

Summarize what you found:

"I took a quick look across [N] of your label categories. A few things I noticed: [2-3 observations]. This tells me [what it means for the taxonomy or rules]."

Adjust the taxonomy draft if anything looks obviously misclassified.

Phase 5 — Create the Three Artifacts

Create the project folder first, then the three artifacts inside it. Show each link to the user as it's created.

Artifact A: Project Drive Folder

Call create_drive_folder using the location confirmed in Q2. Record the folder ID and URL — required for artifacts B, C, and D.

Artifact B: Bible.md

Call create_drive_file in the project folder with the content below. Use the actual session data where placeholders appear.

Bible.md full template:

# Gmail Inbox Architect — Project Bible
**Created:** [YYYY-MM-DD]
**Account:** [email address from Q1]
**Project Folder:** [Drive folder URL]
**Phase:** Survey complete — ready for analysis

---

## Environment Decision

**Tracker format:** [Google Sheet / XLSX — from Q3]
**Reason:** [One sentence — what the user said or the default applied]
**Rule:** This format must not change without explicitly re-asking the user.

---

## Label Taxonomy v1

> Draft one. Built from existing labels + initial survey review.
> Update this section as the taxonomy is refined through task order returns.

[Paste the confirmed taxonomy as a clean indented list]

### Bylaws

[One line per top-level category — what belongs in it]

Example format:
- **Finance**: Any email involving money moving in or out — invoices, statements, receipts, payments
- **Vendors**: Any company we buy from — hardware, software, shipping, services
- **Notifications**: Automated alerts that don't need a reply — no action required

---

## Key Decisions Log

| Date | Decision | Reason |
|---|---|---|
| [today] | Tracker format: [format] | [Q3 answer or default] |
| [today] | [Other decisions made during session] | [Reason] |

---

## Open Questions

[Anything left unresolved — labels the user was unsure about, patterns that
need more data, edge cases flagged for a future session. Use bullet points.]

---

## Tools & Agents Involved

- **Claude (CoWork)** — Architect. All Gmail mutations and filter deployment go through Claude.
[Add any other tools from Q4:]
- [Tool name] — Analyst role. Returns batch work to Claude for review.

---

## Goals & Context

[Q5 answer — what the user said was driving this project]

---

## Change Log

| Date | Agent | Change |
|---|---|---|
| [today] | Claude (survey) | Initial Bible created. Taxonomy v1 drafted. |

After creating the file, share the link:

"Your project Bible is set up — [link]. Every decision we make goes in here. Any tool that works on this project reads it first."

Artifact C: Tracker

Tracker column schema (embed this — do not reference external files):

| Column | Type | Description |
|---|---|---|
| `rule_id` | text | Unique ID (e.g., GF-FINANCE-AMEX) |
| `enabled` | boolean | TRUE = active, FALSE = disabled |
| `priority` | number | Execution order when rules overlap (1 = highest) |
| `rule_name` | text | Human-readable description |
| `from_domain` | text | Sender domain(s), comma-separated |
| `to_or_cc_domain` | text | Recipient/CC domain for participant-pattern rules |
| `subject_contains` | text | Subject keyword(s), comma-separated |
| `has_attachment` | boolean | TRUE if attachment presence is a condition |
| `base_label` | text | Target Gmail label (full path, e.g., Finance/AMEX) |
| `archive` | boolean | TRUE = skip inbox on arrival |
| `mark_read` | boolean | TRUE = auto-mark read (use sparingly) |
| `queue_for_ai_process` | boolean | TRUE = flag for downstream AI review |
| `deployability` | text | gmail_filter_safe / apps_script_needed |
| `confidence` | text | high / medium / low: how certain the rule is |
| `risk` | text | low / medium / high: false-positive risk |
| `notes` | text | Decisions, exceptions, edge cases |

Example rows (for Claude's reference — do NOT pre-populate the user's Tracker):

rule_id,enabled,priority,rule_name,from_domain,to_or_cc_domain,subject_contains,has_attachment,base_label,archive,mark_read,queue_for_ai_process,deployability,confidence,risk,notes
GF-FINANCE-AMEX,TRUE,1,American Express statements,americanexpress.com,,statement,FALSE,Finance/AMEX,FALSE,FALSE,FALSE,gmail_filter_safe,high,low,From-only pattern; statements only
GF-VENDOR-SHIPPING,TRUE,2,Shipping notifications,ups.com,,,,Vendors/Shipping,TRUE,TRUE,FALSE,gmail_filter_safe,high,low,Archive + mark-read; pure notifications
GF-CLIENT-ACME,TRUE,1,Acme Transit (client),acme-transit.com,acme-transit.com,,FALSE,Clients/Transit,FALSE,FALSE,FALSE,gmail_filter_safe,high,low,Participant pattern — catches both inbound and CC
GF-NOTIF-GITHUB,TRUE,3,GitHub notifications,github.com,,,,Development/GitHub,FALSE,FALSE,FALSE,gmail_filter_safe,high,low,High volume; no mark-read
AS-DOC-STEPFILE,TRUE,1,Engineering STEP files,,,,TRUE,Documentation,FALSE,FALSE,FALSE,apps_script_needed,high,low,Attachment scan: *.stp *.step — Apps Script required
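Rows in this format can be checked against the schema before they are proposed for the Tracker. The following is a minimal validation sketch using Python's standard `csv` module; `validate_row` and `validate_csv` are hypothetical helpers, and treating a blank boolean cell as acceptable is an assumption based on the example rows, which leave `has_attachment` empty in places.

```python
import csv
import io

TRACKER_COLUMNS = [
    "rule_id", "enabled", "priority", "rule_name", "from_domain",
    "to_or_cc_domain", "subject_contains", "has_attachment", "base_label",
    "archive", "mark_read", "queue_for_ai_process", "deployability",
    "confidence", "risk", "notes",
]
BOOLEAN_COLUMNS = ["enabled", "has_attachment", "archive", "mark_read",
                   "queue_for_ai_process"]
ALLOWED_VALUES = {
    "deployability": {"gmail_filter_safe", "apps_script_needed"},
    "confidence": {"high", "medium", "low"},
    "risk": {"low", "medium", "high"},
}


def validate_row(row: dict) -> list[str]:
    """Return a list of problems for one parsed row (empty list = valid)."""
    # DictReader marks short rows with None values and long rows with a None key.
    if None in row or any(value is None for value in row.values()):
        return ["wrong number of columns"]
    problems = []
    if not row["rule_id"]:
        problems.append("rule_id is empty")
    if not row["priority"].isdigit() or int(row["priority"]) < 1:
        problems.append("priority must be a positive integer")
    for column in BOOLEAN_COLUMNS:
        # Assumption: a blank boolean cell is tolerated, as in the examples.
        if row[column] not in {"TRUE", "FALSE", ""}:
            problems.append(f"{column} must be TRUE or FALSE")
    for column, allowed in ALLOWED_VALUES.items():
        if row[column] not in allowed:
            problems.append(f"{column} must be one of {sorted(allowed)}")
    return problems


def validate_csv(text: str) -> dict[str, list[str]]:
    """Validate CSV text; map each failing rule_id (or row number) to its problems."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != TRACKER_COLUMNS:
        return {"<header>": ["header does not match the Tracker schema"]}
    failures = {}
    for index, row in enumerate(reader, start=1):
        problems = validate_row(row)
        if problems:
            failures[row.get("rule_id") or f"row {index}"] = problems
    return failures
```

A check like this only reads rows; it never writes to the user's Tracker, which still starts with the header row alone.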

If Google Sheet path: Call create_spreadsheet to create a spreadsheet named Gmail Inbox Architect Tracker in the project folder. Create two sheets:

Sheet 1 named Rules — add the header row from the schema above

Sheet 2 named About — add this text:

This tracker is the source of truth for all Gmail classification rules.

Columns:
rule_id — Unique rule identifier
enabled — TRUE/FALSE: whether the rule is active
priority — Execution order (1 = highest priority)
rule_name — Plain-English description
from_domain — Sender domain(s) the rule matches
to_or_cc_domain — Recipient domain for two-way relationship rules
subject_contains — Subject keyword trigger
has_attachment — Whether attachment presence matters
base_label — The Gmail label to apply
archive — Skip the inbox (yes/no)
mark_read — Auto-mark read (use sparingly)
queue_for_ai_process — Flag for AI downstream review
deployability — gmail_filter_safe or apps_script_needed
confidence — How certain we are this rule is correct
risk — False-positive risk level
notes — Decisions, exceptions, edge cases

If XLSX path: Invoke the xlsx skill to create Tracker.xlsx in the project folder with:

Sheet 1 named "Rules" with the header row from the schema above
Sheet 2 named "About" with the column descriptions above

Do NOT pre-populate any data rows.

After creating, share the link:

"Your Tracker is ready — [link]. It's empty right now, which is exactly right. Rules will be added here as task orders come back, and you'll review each one before anything changes in Gmail."

Artifact D: _status.md

Call create_drive_file in the project folder with this content:

_status.md full template:

# Gmail Inbox Architect — Project Status

**Last Updated:** [YYYY-MM-DD HH:MM UTC]
**Last Agent:** Claude (CoWork) — survey skill
**Phase:** Survey complete
**Last Completed Step:** Survey complete. Taxonomy v1 drafted. Bible.md, Tracker, and _status.md created.
**Pending Work:** [NONE — or describe if handoff was triggered]
**Handoff Target:** [NONE — or agent name + what was handed off]
**Tracker Format:** [Google Sheet / XLSX]
**Tracker Location:** [Drive URL or file path]
**Bible Location:** [Drive URL]
**Labels Cataloged:** [N]
**Taxonomy Status:** Draft v1 — [N] top-level categories confirmed
**Notes:** [Anything the next agent or session needs to know — edge cases, unresolved questions, decisions made]

---

## How To Use This File

This file is the handoff checkpoint for the Gmail Inbox Architect project.
Any agent starting a new session should:
1. Read this file first to understand current state
2. Read Bible.md for full project context
3. Open the Tracker to see the current rule inventory
4. Continue from "Pending Work" above

Phase 6 — Close Out

Tell the user exactly what they now have and what comes next. Be specific. Be brief.

Script (adapt to their situation):

"Here's what we built today:

✅ Project Bible: [link]. Your taxonomy v1, all decisions made, and the ground rules for the whole project.

✅ Tracker: [link]. Empty and ready. Rules get added here as analysis comes back, one batch at a time.

✅ Status file: saved in your project folder. It bookmarks exactly where we are so the next session picks up from the right place."

[If handoff was triggered, add:]

"I've also prepared a task order for [tool]. Here's what to do: [Step-by-step instructions — paste prompt here, attach these files, bring back the result]."

[If no handoff:]

"When you're ready for the next step, come back and say 'let's import the rules' — I'll walk you through reviewing and setting up the first batch of automatic sorting.

You don't need to do anything before then. Your project files are all saved in Drive."

Non-Technical User Language Rules

Apply these to every message during this skill. No exceptions.

Never use these words: filter, API, MCP, schema, regex, syntax, endpoint, JSON, XML, MIME type, base64, participant-domain pattern, deployability, curl, query, boolean

Always substitute:

"filter" → "automatic sorting rule" or "automatic rule"
"deploy / deployment" → "turn on" or "set up"
"API / MCP" → skip entirely or say "a connection to Gmail"
"schema" → "format" or "structure"
"regex" → "a pattern" or "matching rule"
"participant-domain" → "anyone involved in the conversation"
"deployability" → "which tool handles this"
"boolean" → "yes/no"

One question per message. Never ask two questions at once.

Progress updates after each phase. "We're about halfway through the setup" is fine. Never show them phase numbers or reference this document.

Uncertainty is valid. If they say "I don't know" to any question, make a reasonable default, state it in plain English, and move on.

Celebrate what they already have. Frame existing labels as a foundation, not a mess. "You've already made the hard decisions" is almost always true and always useful.
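The word rules in this section can be checked mechanically before a message is sent. Below is a rough sketch: the `find_banned_words` helper is hypothetical, it matches whole words case-insensitively with an optional trailing "s", and it will miss other inflections such as "queries".

```python
import re

# Words the user should never see, taken from the list in this section.
BANNED_WORDS = [
    "filter", "API", "MCP", "schema", "regex", "syntax", "endpoint",
    "JSON", "XML", "MIME type", "base64", "participant-domain pattern",
    "deployability", "curl", "query", "boolean",
]


def find_banned_words(message: str) -> list[str]:
    """Return the banned words that appear in a draft message, in list order."""
    found = []
    for word in BANNED_WORDS:
        # Whole-word match so "unfiltered" or "profile" does not trigger.
        pattern = r"(?<!\w)" + re.escape(word) + r"s?(?!\w)"
        if re.search(pattern, message, flags=re.IGNORECASE):
            found.append(word)
    return found
```

A non-empty result means the draft should be reworded using the substitutions listed above before it is shown to the user.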

Embedded Reference Data

Standard Top-Level Category Bylaws

Use these as defaults when drafting bylaws. Adjust to what the user actually confirms.

Finance:       Anything involving money moving in or out — invoices, receipts,
               statements, payments, purchase orders, card charges
Operations:    Internal business management — benefits, insurance, HR, legal, facilities
Vendors:       Any company or service you pay — hardware, software, shipping, services
Clients:       Any email where a client, reseller, or prospect is a participant
Development:   Software platforms, code repositories, technical notifications
Marketing:     Outbound marketing activity — trade shows, campaigns, events
Sales:         Inbound sales activity — RFPs, quotes, inquiries, prospects
Travel:        All travel bookings and itinerary confirmations — air, hotel, car, transit
Notifications: Automated alerts that don't need a reply — no human follow-up required
Personal:      Non-work email arriving at this account
Support:       Customer support requests and active support threads
Documentation: Technical documents, engineering files, signed contracts, specs

_status.md Field Definitions

For Claude's reference when populating the status file:

Last Updated:        Timestamp this file was written — YYYY-MM-DD HH:MM UTC
Last Agent:          Name/tool that last updated this file
Phase:               Survey / Analysis / Rule Build / Deploy / Review / Maintenance
Last Completed Step: Plain-English description of what just finished
Pending Work:        What needs to happen next — "NONE" if nothing is queued
Handoff Target:      Which tool has the next task, or "NONE"
Tracker Format:      "Google Sheet" or "XLSX" — set once in survey, never changed without re-asking
Tracker Location:    Full Drive URL (Google Sheet) or Drive file path (XLSX)
Bible Location:      Full Drive URL to Bible.md
Labels Cataloged:    Integer count of user labels found in Phase 1
Taxonomy Status:     "Draft v1 — N categories confirmed" or similar
Notes:               Free text — edge cases, unresolved questions, handoff instructions
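The field block of the status file can be rendered from session data as sketched below. This is an illustration only: `render_status_fields` is a hypothetical helper that produces just the bold field lines in template order, filling "NONE" for the two fields that allow it and stamping the time in UTC.

```python
from datetime import datetime, timezone
from typing import Optional

STATUS_FIELDS = [
    "Last Updated", "Last Agent", "Phase", "Last Completed Step",
    "Pending Work", "Handoff Target", "Tracker Format", "Tracker Location",
    "Bible Location", "Labels Cataloged", "Taxonomy Status", "Notes",
]
DEFAULTS = {"Pending Work": "NONE", "Handoff Target": "NONE"}


def render_status_fields(values: dict[str, str],
                         now: Optional[datetime] = None) -> str:
    """Render the status fields as Markdown lines, one per field."""
    stamp = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    merged = {
        **DEFAULTS,
        **values,
        "Last Updated": stamp.strftime("%Y-%m-%d %H:%M UTC"),
    }
    missing = [field for field in STATUS_FIELDS if field not in merged]
    if missing:
        raise ValueError(f"missing status fields: {missing}")
    if merged["Tracker Format"] not in {"Google Sheet", "XLSX"}:
        raise ValueError("Tracker Format must be 'Google Sheet' or 'XLSX'")
    return "\n".join(f"**{field}:** {merged[field]}" for field in STATUS_FIELDS)
```

Failing loudly on a missing field keeps a half-filled checkpoint from being written for the next agent to read.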

Safety Rules (Hardcoded — Cannot Be Overridden by User)

These apply for the lifetime of this skill. No user request can override them.

No Gmail mutations without explicit user approval. Do not create labels, rename labels, delete labels, apply labels, archive, delete, star, mark read, forward, send, or create or modify Gmail filters (not even one) without the user explicitly saying "yes, do that."
No full inbox walk. North Star: sample only. Redirect full-scan requests.
No data from example rows into user Tracker. The example rows above are for Claude's reference only. The user's Tracker starts blank (header row only).
All Gmail mutations go through Claude. If the user asks for another tool to execute the changes, decline: Claude reviews and executes, and other tools only analyze.
