refactor(handoff): remove prescriptive batching — subcontractor determines its own methodology

2026-06-07 11:51:16 -05:00
parent ff7be709fb
commit 42eaeda0d0
+50 -83
@@ -131,44 +131,36 @@ Build an internal working list of all user labels with:
---
## Phase 4 — Build Batch Plan
## Phase 4 — Prepare the Task Order
Divide the label inventory into batches. Each batch becomes one task order.
Claude's job here is to prepare the context — not to plan the subcontractor's methodology.
A capable peer model will determine its own optimal approach once it has the project context
and the output requirements. Do not prescribe how many messages to review, how to split the
work, or in what order to process labels.
### Primary split: by top-level label category
### Step 4.1 — Compile the full label inventory
Each top-level category (Finance, Vendors, Clients, etc.) is one batch.
From the label list built in Phase 3, prepare a complete LABEL_LIST.csv containing ALL user
labels with message counts, scan priority guidance, and bylaw references. This is the
subcontractor's working inventory — give it everything.
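As an illustration of Step 4.1, the following is a minimal sketch of serializing the Phase 3 working list into LABEL_LIST.csv text. The column names (`label`, `message_count`, `scan_priority`, `bylaw_refs`), the helper `build_label_list_csv`, and the sample rows are assumptions made for this example; the real schema is whatever the project's templates define.

```python
import csv
import io

# Hypothetical column names; the project's actual LABEL_LIST.csv schema may differ.
FIELDNAMES = ["label", "message_count", "scan_priority", "bylaw_refs"]


def build_label_list_csv(labels):
    """Serialize every label row to CSV text, sorted by label name.

    No rows are filtered or batched: the subcontractor receives the full inventory.
    """
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    for row in sorted(labels, key=lambda r: r["label"]):
        # Missing fields become empty cells rather than raising an error.
        writer.writerow({name: row.get(name, "") for name in FIELDNAMES})
    return buf.getvalue()


# Sample rows are invented for illustration only.
labels = [
    {"label": "Vendors/Hosting", "message_count": 120,
     "scan_priority": "normal", "bylaw_refs": "B-3"},
    {"label": "Finance/Invoices", "message_count": 800,
     "scan_priority": "high", "bylaw_refs": "B-1;B-2"},
]
label_list_csv = build_label_list_csv(labels)
```

The resulting text could then be passed as the file content when the Drive file is created in Phase 5.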
### Sub-batch trigger: split a top-level category if EITHER is true
- The category has more than 5 sublabels
- The category has more than 500 total messages across all its sublabels
### Step 4.2 — Create the initial task order
When sub-batching, split by:
- Natural sublabel groupings (e.g., Clients/Resellers as one batch, Clients/Corporate as another)
- Or alphabetically for very large flat lists
Generate one task order covering the full label inventory. The subcontractor decides
whether to process everything in one pass or propose its own batching plan. Both are fine.
### Batching rules
- Never put more than 8 labels in a single task order
- Never put more than 1,000 messages in a single task order (estimate from counts)
- Always keep a top-level category whole if it fits within the limits
- Notifications, Personal, and system-like labels (low volume, clear rules) can be
grouped together into a single "Miscellaneous" batch
If the subcontractor returns a batching proposal rather than results:
- Review the proposal with the user in plain English
- Ask for approval
- Create additional task order folders for subsequent batches as requested
### Present the batch plan to the user
### Step 4.3 — Confirm with the user before creating Drive files
Before generating any files, show the user the plan:
> "I'm going to prepare a task order with your full label inventory and let [tool] determine
> the most efficient way to work through it — it may do everything in one pass or propose
> splitting into batches. Either way works. Ready to go?"
> "Here's how I'm going to split the work into task orders:
>
> Batch 1 — Finance (3 labels, ~800 messages)
> Batch 2 — Vendors (4 labels, ~400 messages)
> Batch 3 — Clients/Resellers (5 labels, ~2,100 messages)
> Batch 4 — Clients/Corporate (2 labels, ~300 messages)
> Batch 5 — Development + Notifications + Personal (6 labels, ~600 messages)
>
> That's 5 task orders total. Does this look right, or do you want to adjust anything?"
Wait for approval. Adjust if needed. Then proceed.
Wait for confirmation, then proceed to Phase 5.
---
@@ -184,68 +176,52 @@ Inside each batch subfolder, create 4 files using `create_drive_file`:
This is the text the user pastes into the subcontractor tool. It must be entirely
self-contained — the user pastes it without modification.
**PROMPT.md template (ChatGPT with Gmail skill):**
**PROMPT.md template (ChatGPT with Gmail skill / Gemini):**
```
You are a Gmail classification analyst working on an inbox organization project.
You are a Gmail classification analyst. You have been given a Gmail inbox organization project.
PROJECT CONTEXT: Before doing anything else, read the file CONTEXT.md that has been
attached to this message. It contains the taxonomy, bylaws, and project rules that
govern all rules you propose. Do not proceed without reading it.
Read CONTEXT.md before doing anything else. It contains the taxonomy, bylaws, and project
decisions that govern all rules you propose.
YOUR TASK — [BATCH NAME, e.g., "Finance label tree"]:
Walk the Gmail labels listed in LABEL_LIST.csv. For each label:
Your goal: analyze the Gmail labels in LABEL_LIST.csv and propose classification rules that
will automatically sort incoming email according to the taxonomy in CONTEXT.md.
STEP 1 — Review messages
- Open the label in Gmail
- Review the 20-30 most recent messages (not just subject lines — look at sender domains)
- Note: who is sending mail to this label, what subject patterns appear, any attachment types
You have direct access to the Gmail account. Use that access as you see fit to identify
sender patterns, domain signals, subject patterns, and attachment types — whatever gives
you confident rule proposals.
STEP 2 — Identify patterns
For each distinct sender domain you find:
- Does it clearly belong in this label? Or in a different one?
- Is a subject qualifier needed to avoid labeling too broadly?
- Does it send emails that clearly require separate work-type labels
(e.g., invoices vs. general updates from the same domain)?
- Are there attachment patterns (file types, naming conventions)?
Return your findings using the exact schemas in OUTPUT_FORMAT.md:
STEP 3 — Propose rules
Return exactly three CSV files using the schemas in OUTPUT_FORMAT.md:
filter_rules.csv — rules Gmail can execute natively (domain, subject, participant)
apps_script_rules.csv — rules requiring attachment filename/extension inspection
taxonomy_notes.csv — taxonomy gaps, policy observations, and anything needing review
filter_rules.csv — rules Gmail can apply natively (domain, subject, participant pattern)
apps_script_rules.csv — rules that require attachment filename inspection
taxonomy_notes.csv — taxonomy gaps, policy observations, things that need Claude's review
If the label inventory is large and you prefer to work in batches, propose a batching
plan and execute the first batch. Subsequent task orders will be prepared for remaining work.
IMPORTANT RULES:
- Include your basis for EVERY rule in the notes column
- Use participant-domain logic (from OR to OR cc) for client/reseller domains
- Use from-only logic for vendors, notification services, and financial senders
- Do NOT propose rules that depend on reading email body text — filters and Apps Script
work from headers, subject lines, and attachment filenames only
- Confidence and risk ratings must reflect your actual certainty
- Flag ambiguous cases in taxonomy_notes.csv with status "do_not_automate_yet"
Hard constraints that must be respected:
- Participant-domain logic (from OR to OR cc) for client and reseller labels
- From-only logic for vendors, services, and financial senders
- No rules dependent on message body text — filters and Apps Script work from headers,
subjects, and attachment filenames only
- Document your basis for every proposed rule in the notes column
- Use taxonomy_notes.csv status "do_not_automate_yet" for anything you are not confident about
Return all three CSV files as attachments. Do not summarize them — just return the files.
Return all three CSV files. Do not summarize — just return the files.
```
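The two matching constraints in the template above (participant-domain logic versus from-only logic) can be sketched as Gmail search queries. This is a hedged illustration: `build_gmail_query`, the `label_kind` values, and the example domains are hypothetical, and the sketch assumes that a bare domain after the `from:`, `to:`, and `cc:` operators matches addresses at that domain.

```python
# Label kinds that use participant-domain logic (from OR to OR cc).
# The kind names are assumptions for this sketch.
PARTICIPANT_KINDS = {"client", "reseller"}


def build_gmail_query(domain, label_kind):
    """Return a Gmail search query for one sender domain.

    Client and reseller labels match the domain anywhere among the
    participants; every other kind (vendors, services, financial
    senders) matches on the From header only.
    """
    if label_kind in PARTICIPANT_KINDS:
        return f"from:{domain} OR to:{domain} OR cc:{domain}"
    return f"from:{domain}"


client_query = build_gmail_query("acme.example", "client")
vendor_query = build_gmail_query("billing.example", "vendor")
```

Neither query inspects message body text, which keeps both within the header-only constraint stated in the template.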
**PROMPT.md template (Gemini):**
Use the same content as ChatGPT above. Gemini's Gmail integration works identically
for this task.
**PROMPT.md template (Generic / Unknown tool):**
```
You are a Gmail classification analyst working on an inbox organization project.
[Add this preamble before the main prompt above:]
You need direct Gmail access for this task. Before proceeding, confirm that you can
read Gmail labels and message sender information for this account. If you cannot, say
so immediately — do not attempt to work around it.
You need Gmail access for this task. Before proceeding, confirm that you can read
Gmail labels and message sender information (not message bodies) for this account.
If you cannot, say so immediately.
[Then continue with the same STEP 1-3 content as ChatGPT template above]
[Then continue with the ChatGPT template above]
```
Substitute `[BATCH NAME]` with the actual batch name (e.g., "Finance label tree").
---
### File 2: `01_CONTEXT.md`
@@ -463,16 +439,6 @@ Notes: Batches created: [list batch names]. Each returns 3 CSV files to import.
---
## Batch Size Reference
| Condition | Action |
|---|---|
| Top-level category ≤ 8 labels AND ≤ 500 messages | One batch |
| Top-level category > 5 sublabels OR > 500 messages | Split into sub-batches |
| Never more than 8 labels per batch | Split further |
| Never more than ~1,000 messages per batch | Split further |
| Notifications + Personal + low-volume misc | Group together |
---
## Embedded Template: Rule ID Format
@@ -509,3 +475,4 @@ FINANCE / VENDOR / CLIENT / OPS / DEV / SALES / TRAVEL / NOTIF / SUPPORT / DOC /
5. The subcontractor proposes. Claude decides. Always.