# handoff — Gmail Inbox Architect

## What This Skill Does

Produces a complete task order package for a subcontractor AI tool to walk Gmail labels,
identify sender patterns, and propose classification rules. One task order per major label
tree (or sub-tree if the tree is large). The user is the delivery person — they paste the
prompt, attach the files, and bring back the results.

This skill is invoked either:
- Automatically by the `survey` skill when inbox volume exceeds the inline sampling threshold
- Directly by the user ("create the task orders", "set up the handoff")

This skill is self-contained apart from the Drive access listed under Dependencies. All templates are embedded below.

---

## When This Skill Is Triggered

- Survey invokes it directly (most common)
- User says: "create the task orders", "set up the handoff", "build the task orders",
  "I want to hand this off to ChatGPT", "prepare the handoff package"

---

## Prerequisites — Survey Must Be Complete

Before doing any other work, run Phase 1 (sanity check). This skill requires:
- `_status.md` exists in the project folder with Phase = "Survey complete" or later
- `Bible.md` exists with at least taxonomy v1
- Tracker exists (Google Sheet or XLSX)

If any artifact is missing or the phase is wrong, do NOT proceed. Say:

> "It looks like the survey hasn't finished yet. Before I can build the task orders, I need
> the project setup to be complete. Let's run the survey first — just say 'start the survey'
> and I'll walk you through it."

---

## Dependencies

- **Google Workspace MCP**: reads `Bible.md` and `_status.md` from the project Drive folder,
  and creates the task order folders and files in Drive
- No other external dependencies

---

## Execution Flow

```
Phase 1 → Sanity check (verify survey artifacts exist and are complete)
Phase 2 → Confirm subcontractor tool (and verify it has Gmail access)
Phase 3 → Read project context from Bible.md
Phase 4 → Build batch plan (divide label trees into task order batches)
Phase 5 → Generate task order packages in Drive (one folder per batch)
Phase 6 → Brief the user (exactly how to deliver each task order)
Phase 7 → Update _status.md
```

---

## Phase 1 — Sanity Check

Read `_status.md` from the project Drive folder.

Check:
1. `Phase` field contains "Survey complete" or any later phase
2. `Bible Location` field has a valid Drive URL
3. `Tracker Location` field has a valid Drive URL or file path

If any check fails → redirect to survey (see Prerequisites above).

If `Phase` is already "Handoff in progress" or later, ask:
> "It looks like we've already started task orders for this project. Do you want to add
> more batches, or are you bringing back results from a batch that's already been sent?"

Wait for answer. If adding batches → continue. If returning results → invoke `import` skill.
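
The three checks above can be sketched in Python. This is a minimal sketch, assuming `_status.md` holds one `Field: value` pair per line (as the Phase 7 block suggests). The helper names, the `PHASES` list, and the Drive URL prefixes are assumptions for illustration; this skill only names the phases "Survey complete" and "Handoff in progress".

```python
# Hypothetical helpers: names, phase list, and URL prefixes are assumptions.
PHASES = ["Survey complete", "Handoff in progress"]  # phases that allow handoff
DRIVE_PREFIXES = ("https://drive.google.com/", "https://docs.google.com/")


def parse_status(text):
    """Parse 'Field: value' lines from _status.md into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def sanity_check(fields):
    """Return a list of failed checks; an empty list means proceed."""
    failures = []
    phase = fields.get("Phase", "").lower()
    if not any(known.lower() in phase for known in PHASES):
        failures.append("Phase is earlier than 'Survey complete'")
    if not fields.get("Bible Location", "").startswith(DRIVE_PREFIXES):
        failures.append("Bible Location is not a Drive URL")
    if not fields.get("Tracker Location", ""):
        failures.append("Tracker Location is empty")
    return failures
```

Any non-empty result corresponds to the "redirect to survey" branch. Later phases defined by other skills would need to be appended to `PHASES`.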

---

## Phase 2 — Confirm Subcontractor Tool

If the subcontractor tool is NOT already known from survey Q4, ask:

> "Which AI tool are you planning to use for the analysis — ChatGPT, Gemini, or something
> else?"

Wait for answer. Then confirm Gmail access:

> "Does [tool] have a direct connection to your Gmail account? For example, in ChatGPT,
> this would be the Gmail skill or a Google Workspace connection."

**If yes:** Continue.

**If no or unsure:**
> "This process requires the tool to be able to read your Gmail directly. Without that
> connection, it won't be able to walk your labels and propose rules. You'll need someone
> technical on your team to help get that set up before we can proceed. Once it's connected,
> come back and I'll build the task orders."

Do NOT attempt to guide the Gmail connection setup. Stop here and wait.

Record the tool name. This goes into every task order prompt.

**Supported tool names for prompt optimization:**
- ChatGPT (any variant with Gmail/Workspace skill)
- Gemini (Google account connected)
- Other / Unknown → use generic prompt template
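
As a rough illustration, the mapping from the recorded tool name to a prompt template could look like the sketch below. The function name and the substring matching are assumptions; the three return values mirror the list above.

```python
def template_for(tool_name):
    """Map the user's stated tool to one of the three prompt templates."""
    name = tool_name.strip().lower()
    if "gpt" in name:        # "ChatGPT", "ChatGPT Plus", "GPT-4", ...
        return "chatgpt"
    if "gemini" in name:
        return "gemini"
    return "generic"         # Other / Unknown
```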

---

## Phase 3 — Read Project Context

Read `Bible.md` from the project Drive folder. Extract:
- Label taxonomy v1 (the full indented list)
- All bylaws
- Environment decision (Google Sheet vs. XLSX)
- Any policy notes from the Decisions Log

Also load the label inventory: take the labels from the Tracker if it has been populated,
and cross-check the total against the `Labels Cataloged` count in `_status.md`.

If the Tracker is still blank (survey just completed), the label inventory is what you
held in context from the survey. Use it now.

Build an internal working list of all user labels with:
- `label_name` — full path
- `parent` — top-level category
- `messages_total` — count from survey

---

## Phase 4 — Build Batch Plan

Divide the label inventory into batches. Each batch becomes one task order.

### Primary split: by top-level label category

Each top-level category (Finance, Vendors, Clients, etc.) is one batch.

### Sub-batch trigger: split a top-level category if EITHER is true
- The category has more than 5 sublabels
- The category has more than 500 total messages across all its sublabels

When sub-batching, split by:
- Natural sublabel groupings (e.g., Clients/Resellers as one batch, Clients/Corporate as another)
- Or alphabetically for very large flat lists

### Batching rules
- Never put more than 8 labels in a single task order
- Never put more than 1,000 messages in a single task order (estimate from counts)
- Always keep a top-level category whole if it fits within the limits
- Notifications, Personal, and system-like labels (low volume, clear rules) can be
  grouped together into a single "Miscellaneous" batch
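
The split logic can be sketched as follows. This is one reading of the rules, not a definitive implementation: it treats the 5-label / 500-message trigger as the point where a split is attempted, and the 8-label / 1,000-message limits as the caps that actually close a batch, so a category that trips the trigger but still fits the caps stays whole. It uses the alphabetical fallback only; natural sublabel groupings and the "Miscellaneous" merge are not modeled. All names are hypothetical.

```python
from collections import defaultdict

SPLIT_LABELS, SPLIT_MESSAGES = 5, 500  # sub-batch trigger
MAX_LABELS, MAX_MESSAGES = 8, 1000     # hard caps per task order


def plan_batches(labels):
    """labels: iterable of (label_name, parent, messages_total) tuples.

    Returns a list of (batch_name, [(label_name, messages_total), ...]).
    """
    by_parent = defaultdict(list)
    for name, parent, total in labels:
        by_parent[parent].append((name, total))

    batches = []
    for parent, members in by_parent.items():
        total = sum(count for _, count in members)
        if len(members) <= SPLIT_LABELS and total <= SPLIT_MESSAGES:
            batches.append((parent, members))  # fits: keep the category whole
            continue

        # Trigger fired: pack alphabetically, closing a part at either cap.
        parts, current, running = [], [], 0
        for name, count in sorted(members):
            if current and (len(current) >= MAX_LABELS
                            or running + count > MAX_MESSAGES):
                parts.append(current)
                current, running = [], 0
            current.append((name, count))
            running += count
        parts.append(current)

        if len(parts) == 1:
            batches.append((parent, parts[0]))  # still within the hard caps
        else:
            for i, part in enumerate(parts, start=1):
                batches.append((f"{parent} (part {i})", part))
    return batches
```

A single label with more than 1,000 messages cannot be split further, so it ends up alone in its own part and should be called out when the plan is presented.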

### Present the batch plan to the user

Before generating any files, show the user the plan:

> "Here's how I'm going to split the work into task orders:
>
> Batch 1 — Finance (3 labels, ~800 messages)
> Batch 2 — Vendors (4 labels, ~400 messages)
> Batch 3 — Clients/Resellers (5 labels, ~900 messages)
> Batch 4 — Clients/Corporate (2 labels, ~300 messages)
> Batch 5 — Development + Notifications + Personal (6 labels, ~600 messages)
>
> That's 5 task orders total. Does this look right, or do you want to adjust anything?"

Wait for approval. Adjust if needed. Then proceed.

---

## Phase 5 — Generate Task Order Packages

Create a folder called `Task Orders` in the project Drive folder.
Inside it, create one subfolder per batch named: `Batch [N] - [Category Name]`

Inside each batch subfolder, create 4 files using `create_drive_file`:

### File 1: `00_PROMPT.md`

This is the text the user pastes into the subcontractor tool. It must be entirely
self-contained — the user pastes it without modification.

**PROMPT.md template (ChatGPT with Gmail skill):**

```
You are a Gmail classification analyst working on an inbox organization project.

PROJECT CONTEXT: Before doing anything else, read the file 01_CONTEXT.md that has been
attached to this message. It contains the taxonomy, bylaws, and project rules that
govern all rules you propose. Do not proceed without reading it.

YOUR TASK — [BATCH NAME, e.g., "Finance label tree"]:
Walk the Gmail labels listed in 02_LABEL_LIST.csv. For each label:

STEP 1 — Review messages
  - Open the label in Gmail
  - Review the 20-30 most recent messages (not just subject lines — look at sender domains)
  - Note: who is sending mail to this label, what subject patterns appear, any attachment types

STEP 2 — Identify patterns  
  For each distinct sender domain you find:
  - Does it clearly belong in this label? Or in a different one?
  - Is a subject qualifier needed to avoid labeling too broadly?
  - Does it send emails that clearly require separate work-type labels
    (e.g., invoices vs. general updates from the same domain)?
  - Are there attachment patterns (file types, naming conventions)?

STEP 3 — Propose rules
  Return exactly three CSV files using the schemas in 03_OUTPUT_FORMAT.md:
  
  filter_rules.csv    — rules Gmail can apply natively (domain, subject, participant pattern)
  apps_script_rules.csv — rules that require attachment filename inspection
  taxonomy_notes.csv  — taxonomy gaps, policy observations, things that need Claude's review

IMPORTANT RULES:
  - Include your basis for EVERY rule in the notes column
  - Use participant-domain logic (from OR to OR cc) for client/reseller domains
  - Use from-only logic for vendors, notification services, and financial senders
  - Do NOT propose rules that depend on reading email body text — filters and Apps Script
    work from headers, subject lines, and attachment filenames only
  - Confidence and risk ratings must reflect your actual certainty
  - Flag ambiguous cases in taxonomy_notes.csv with status "do_not_automate_yet"

Return all three CSV files as attachments. Do not summarize them — just return the files.
```

**PROMPT.md template (Gemini):**
Use the same content as the ChatGPT template above. No Gemini-specific changes are
needed for this task.

**PROMPT.md template (Generic / Unknown tool):**

```
You are a Gmail classification analyst working on an inbox organization project.

You need Gmail access for this task. Before proceeding, confirm that you can read
Gmail labels and message sender information (not message bodies) for this account.
If you cannot, say so immediately.

[Then continue with the PROJECT CONTEXT, YOUR TASK, STEP 1-3, and IMPORTANT RULES
sections of the ChatGPT template above, written out in full]
```

Substitute `[BATCH NAME]` with the actual batch name (e.g., "Finance label tree").
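
Because the user pastes `00_PROMPT.md` without modification, it helps to check that no bracketed placeholder survives rendering. The sketch below is one way to do that in Python. The function name and both regular expressions are assumptions, and the second pattern treats any bracketed span starting with a capital letter as an unexpanded placeholder.

```python
import re

BATCH_PLACEHOLDER = re.compile(r"\[BATCH NAME[^\]]*\]")
LEFTOVER = re.compile(r"\[[A-Z][^\]\n]*\]")


def render_prompt(template, batch_name):
    """Fill in the batch name and refuse to emit unexpanded placeholders."""
    # A function replacement avoids backslash escapes in batch_name being
    # interpreted by re.sub.
    rendered = BATCH_PLACEHOLDER.sub(lambda _match: batch_name, template)
    leftover = LEFTOVER.search(rendered)
    if leftover:
        raise ValueError(f"unexpanded placeholder: {leftover.group(0)}")
    return rendered
```

With this check, the generic template fails loudly until its bracketed continuation line has been expanded into the full step text.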

---

### File 2: `01_CONTEXT.md`

This is the project context the subcontractor reads first. Extract from Bible.md.

**CONTEXT.md template:**

```markdown
# Gmail Inbox Architect — Project Context for Analysis

**Project account:** [email address from Bible.md]
**Date generated:** [today]
**Batch:** [batch name]

---

## Label Taxonomy

This is the confirmed taxonomy for this project. All rules you propose must fit within
this structure. Do not propose labels outside this taxonomy without flagging it in
taxonomy_notes.csv.

[Paste the full taxonomy from Bible.md here — the indented label list]

---

## Bylaws

These are the rules for what belongs in each label. They are binding — follow them
when deciding which label a rule should target.

[Paste all bylaws from Bible.md here]

---

## Key Rules for This Analysis

1. Participant-domain logic applies to client and reseller labels:
   Use (from:domain.com OR to:domain.com OR cc:domain.com) — not just from:.
   
2. From-only logic applies to vendors and services:
   Use from:domain.com only — not participant pattern.
   
3. Secondary labels are additive:
   A thread can have both a client label AND a work-type label (Finance/Invoices,
   Sales/Quotes, etc.). Propose secondary rules where patterns are clear.

4. Apps Script is for attachment inspection only:
   Propose apps_script_needed rules ONLY when the trigger is an attachment filename
   or extension pattern, never body text.

5. Flag ambiguous cases:
   If you are unsure whether a domain belongs in this taxonomy, flag it in
   taxonomy_notes.csv with status "do_not_automate_yet" and explain why.

---

## Decisions Log (Key Policy Notes)

[Paste relevant entries from Bible.md Decisions Log — especially any policy notes
about specific senders, domains, or label structures]
```

Fill in actual content from Bible.md at runtime.
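
Key rules 1 and 2 in the template map directly onto Gmail search operators (`from:`, `to:`, `cc:`, `OR`). The sketch below shows how a query string for either pattern might be built. The function name is hypothetical and `example.com` stands in for a real sender domain.

```python
def domain_query(domains, participant):
    """Build a Gmail search query for one or more domains.

    participant=True  -> from OR to OR cc (client and reseller labels)
    participant=False -> from only (vendors, services, financial senders)
    """
    if not domains:
        raise ValueError("at least one domain is required")
    operators = ("from", "to", "cc") if participant else ("from",)
    clauses = [f"{op}:{domain}" for domain in domains for op in operators]
    if len(clauses) == 1:
        return clauses[0]
    return "(" + " OR ".join(clauses) + ")"
```

For example, `domain_query(["example.com"], participant=True)` yields `(from:example.com OR to:example.com OR cc:example.com)`, matching the pattern in rule 1.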

---

### File 3: `02_LABEL_LIST.csv`

The specific labels in this batch, with message counts. The subcontractor uses this
to know exactly what to walk.

**Schema:**

```
label_name,parent,messages_total,messages_unread,bylaw,scan_priority,notes
```

Populate from the label inventory built in Phase 3. Assign `scan_priority`:
- 1 = highest message volume or most business-critical
- 2 = medium
- 3 = low volume or low urgency

Add the `bylaw` text for the parent label so the subcontractor has it inline.

**Example:**
```
Finance/AMEX,Finance,412,0,"Any email involving money moving in or out — invoices statements receipts payments",1,Check for statement keyword; from-only pattern
Finance/Invoices,Finance,238,12,"Any email involving money moving in or out",1,May overlap with vendor domains; check for secondary rules
Finance/Receipts,Finance,856,0,"Any email involving money moving in or out",2,High volume; likely archive+mark-read candidates
```
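
The example rows above avoid commas inside the `bylaw` text. If the file is generated with a CSV library rather than string concatenation, bylaws can keep their original punctuation, because fields containing commas are quoted automatically. A minimal sketch using Python's standard `csv` module follows; the function name is hypothetical.

```python
import csv
import io

COLUMNS = ["label_name", "parent", "messages_total", "messages_unread",
           "bylaw", "scan_priority", "notes"]


def label_list_csv(rows):
    """rows: list of dicts keyed by COLUMNS. Returns CSV text with a header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # raises ValueError on keys outside COLUMNS
    return buffer.getvalue()
```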

---

### File 4: `03_OUTPUT_FORMAT.md`

The exact schemas the subcontractor must use for its return files.

**OUTPUT_FORMAT.md template:**

```markdown
# Required Output Format

Return exactly three CSV files. Use these exact column names. Do not add or remove columns.

---

## File 1: filter_rules.csv

Rules that Gmail can execute natively. Use this for domain, subject keyword,
and participant-pattern rules.

Columns (in this order):
rule_id,enabled,priority,rule_name,from_domain,to_or_cc_domain,subject_contains,has_attachment,base_label,archive,mark_read,queue_for_ai_process,deployability,confidence,risk,notes

Field definitions:
rule_id         — Your proposed ID. Format: GF-[CATEGORY]-[DESCRIPTION] e.g. GF-FINANCE-AMEX-STMT
enabled         — FALSE (Claude will review and enable approved rules)
priority        — Integer 1-99 (lower = higher priority). Base labels: 50. Secondary: 80-90.
rule_name       — Plain English description (max 60 chars)
from_domain     — Sender domain(s). Multiple: use semicolons. e.g. domain1.com; domain2.com
to_or_cc_domain — For participant rules: same domain as from_domain. Leave blank for from-only rules.
subject_contains — Subject keyword(s). Multiple: use semicolons. Leave blank if not needed.
has_attachment  — TRUE or FALSE
base_label      — Full Gmail label path. e.g. Finance/AMEX
archive         — TRUE or FALSE (TRUE = skip inbox)
mark_read       — TRUE or FALSE (use sparingly — only for pure notifications)
queue_for_ai_process — TRUE or FALSE (TRUE = flag for Claude's downstream review)
deployability   — Always: gmail_filter_safe
confidence      — high / medium / low (your certainty this rule is correct)
risk            — low / medium / high (false-positive risk if this fires on wrong emails)
notes           — REQUIRED. Your basis: what emails triggered this proposal and why.

---

## File 2: apps_script_rules.csv

Rules that require attachment filename/extension inspection. Use ONLY for rules where
the trigger is an attachment file type or naming pattern.

Same columns as filter_rules.csv, with these differences:
rule_id         — Format: AS-[CATEGORY]-[DESCRIPTION] e.g. AS-DOC-STEP-FILE
deployability   — Always: apps_script_needed
notes           — Must include: exact filename pattern (e.g. *.stp, *.step, inv-*.pdf)

---

## File 3: taxonomy_notes.csv

Observations about the taxonomy — gaps, policy notes, things that need Claude's review.
This is where you flag anything that doesn't fit neatly into a rule.

Columns (in this order):
proposal_id,status,proposed_label,parent,reason,candidate_sources,notes

Field definitions:
proposal_id     — TAX-NNN sequential number
status          — One of: confirmed_existing / proposed_new / policy_note / do_not_automate_yet
proposed_label  — The label path being referenced or proposed (or blank for policy notes)
parent          — Parent label (or blank)
reason          — Why you're flagging this
candidate_sources — Domain(s) or sender(s) that prompted this observation
notes           — Additional context. For do_not_automate_yet: explain what would need to
                  be true before this could be automated.

---

## Important Notes

- Return ALL THREE files even if one has zero rows (include header row only)
- Do not return the files as code blocks — attach them as actual CSV files
- Do not summarize or explain in chat — just return the files
- If you are unsure about something, put it in taxonomy_notes.csv with status do_not_automate_yet
```
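
To illustrate how strict the `filter_rules.csv` schema is, the sketch below checks a returned file against the field definitions above. It is a hedged example rather than part of this skill's flow (results come back through the `import` skill), and the function name is hypothetical.

```python
import csv
import io

FILTER_COLUMNS = (
    "rule_id,enabled,priority,rule_name,from_domain,to_or_cc_domain,"
    "subject_contains,has_attachment,base_label,archive,mark_read,"
    "queue_for_ai_process,deployability,confidence,risk,notes"
).split(",")
LEVELS = {"high", "medium", "low"}


def validate_filter_rules(csv_text):
    """Return a list of problems found in a returned filter_rules.csv."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != FILTER_COLUMNS:
        return ["header does not match the required columns"]

    problems = []
    for line_no, row in enumerate(reader, start=2):
        cell = {key: (row.get(key) or "").strip() for key in FILTER_COLUMNS}
        where = cell["rule_id"] or f"line {line_no}"
        if not cell["rule_id"].startswith("GF-"):
            problems.append(f"{where}: rule_id must start with GF-")
        if cell["enabled"] != "FALSE":
            problems.append(f"{where}: enabled must be FALSE")
        if not (cell["priority"].isdecimal() and 1 <= int(cell["priority"]) <= 99):
            problems.append(f"{where}: priority must be an integer 1-99")
        if cell["deployability"] != "gmail_filter_safe":
            problems.append(f"{where}: deployability must be gmail_filter_safe")
        if cell["confidence"] not in LEVELS or cell["risk"] not in LEVELS:
            problems.append(f"{where}: confidence and risk must be high/medium/low")
        if not cell["notes"]:
            problems.append(f"{where}: notes is required")
    return problems
```

A header-only file passes with an empty list, which matches the instruction to return all three files even when one has zero rows.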

---

## Phase 6 — Brief the User

After all task order folders are created in Drive, give the user clear delivery instructions.

> "Your task orders are ready in your project Drive folder — I've created [N] batches.
> Here's how to use them:
>
> **For each batch (start with Batch 1):**
> 1. Open the batch folder in Drive
> 2. Open `00_PROMPT.md` — copy the entire text
> 3. Paste it into [tool name] and attach the other three files from the folder
> 4. Submit it and let the tool work
> 5. When it returns three CSV files, bring them back here and say
>    'import the Finance results' (or whatever batch you just ran)
>
> You don't need to run all the batches before bringing results back.
> You can run Batch 1, bring the results to me, then run Batch 2. I'll
> review and refine the rules as each batch comes back.
>
> Which batch do you want to start with?"

---

## Phase 7 — Update _status.md

After all task orders are generated, update `_status.md`:

```
Last Updated: [timestamp]
Last Agent: Claude (CoWork) — handoff skill
Phase: Handoff in progress
Last Completed Step: [N] task order batches created in Drive Task Orders folder.
Pending Work: User delivering batches to [tool name]. Return results via import skill.
Handoff Target: [tool name]
Notes: Batches created: [list batch names]. Each returns 3 CSV files to import.
```

---

## Batch Size Reference

| Condition | Action |
|---|---|
| Top-level category ≤ 5 sublabels AND ≤ 500 messages | One batch |
| Top-level category > 5 sublabels OR > 500 messages | Split into sub-batches |
| Any batch > 8 labels | Split further |
| Any batch > ~1,000 messages | Split further |
| Notifications + Personal + low-volume misc | Group together |

---

## Embedded Template: Rule ID Format

```
Gmail filter rules:   GF-[CATEGORY]-[SUBCATEGORY]-[DESCRIPTOR]
                      GF-FINANCE-AMEX-STMT
                      GF-CLIENT-RESELLER-ACME-PARTICIPANT
                      GF-VENDOR-SHIPPING-FEDEX

Apps Script rules:    AS-[CATEGORY]-[DESCRIPTOR]
                      AS-DOC-STEP-FILE
                      AS-FINANCE-VENDOR-INVOICE-PDF
```

Category abbreviations to use in rule IDs:
```
FINANCE / VENDOR / CLIENT / OPS / DEV / SALES / TRAVEL / NOTIF / SUPPORT / DOC / PERSONAL
```
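
The ID format and category list above can be expressed as a single regular expression. The sketch below is an assumption about how strictly the format should be read: it requires uppercase letters and digits, a known category, and at least one further segment.

```python
import re

CATEGORIES = ["FINANCE", "VENDOR", "CLIENT", "OPS", "DEV", "SALES",
              "TRAVEL", "NOTIF", "SUPPORT", "DOC", "PERSONAL"]
RULE_ID = re.compile(r"^(GF|AS)-(%s)(-[A-Z0-9]+)+$" % "|".join(CATEGORIES))


def is_valid_rule_id(rule_id):
    """True if rule_id follows the GF-/AS- pattern with a known category."""
    return RULE_ID.match(rule_id) is not None
```

All five example IDs in the block above pass this check, while an ID such as `GF-FINANCE` (no descriptor) does not.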

---

## Safety Rules (Cannot Be Overridden)

1. No Gmail mutations. This skill creates Drive files only. It does not touch Gmail.

2. Always confirm the tool has Gmail access before generating task orders. If access
   is uncertain, stop and redirect the user to someone technical on their team.

3. Never skip the sanity check. If survey artifacts are missing, redirect to survey.

4. Task orders are reviewed by Claude before anything goes to Gmail. Import skill
   brings results back to Claude. Deploy skill executes. Nothing is automatic.

5. The subcontractor proposes. Claude decides. Always.