assistant-claw/atlas/skills/claw-email-parser/SKILL.md
Vega (Atlas scaffolding) ce9f27320a Add Atlas profile under atlas/ — boss-perspective project execution radar
This adds the full Atlas (总助 Claw / 老板视角项目执行雷达) scaffolding as a
sibling profile to the existing Vega general-purpose assistant. All Atlas content
lives under atlas/ to keep the existing top-level skeleton intact.

What's included:

- atlas/IDENTITY.md, SOUL.md, USER.md, AGENTS.md, MEMORY.md, BOOTSTRAP.md,
  HEARTBEAT.md, TOOLS.md (+ zh-CN mirrors) — full OpenClaw 8-piece set
  matching the zero-cca convention
- atlas/skills/ — 6 sub-skills with frontmatter:
  claw-email-parser / claw-project-tracker / claw-people-observer /
  claw-customer-radar / claw-boss-distiller / claw-report-writer
- atlas/skills/claw-boss-distiller/ — adapter notes for nuwa-skill, 5-layer
  boss_skill seed template (23 rules across Expression DNA / Mental Models /
  Decision Heuristics / Anti-Patterns / Honest Boundaries), and a complete
  synthetic distillation demo (10 input emails -> validated 5-layer output)
- atlas/mcp-tools/email-extractor/ — Python implementation of stages 1-3
  (fetch + decode + dequote), 7 pytest tests passing, CLI: atlas-extract
- atlas/state-schemas/ — formal JSON schemas for project / person / customer
  cards with the no-employee-rating hard constraint baked in
- atlas/client-deck/ — 2-page client-facing pitch document
- autopilots/atlas-*.yaml — 5 autopilot configs (daily / weekly / monthly /
  quarterly + andon event-triggered) for a future Multica-side scheduler

Notes:

- nuwa-skill (MIT, https://github.com/alchaincyf/nuwa-skill) NOT vendored;
  fetch at deploy time via instructions in
  atlas/skills/claw-boss-distiller/upstream/README.md
- Vega-side prompts/skills/tools/autopilots/docs scaffold left untouched
- Top-level README.md updated with a brief Atlas pointer; rest preserved
2026-05-09 17:00:29 +08:00

2.5 KiB
Raw Blame History

name description
claw-email-parser Wraps the email-extractor MCP tool; orchestrates fetch → extract → write canonical Email JSON. The thin LLM layer that decides what to do with low-confidence extractions.

claw-email-parser

Purpose

Atlas's intake skill. Given a date range, pull and extract emails into canonical JSON, surface anything the rule-based extractor was uncertain about, and let the LLM make a judgment call (or punt to the boss).

Inputs

  • since: ISO date — e.g. 2025-05-09 (V0 default = 12 months ago)
  • until: ISO date — e.g. today
  • mode: full_backfill | incremental (incremental reads state/.last_sync)

Outputs

  • N × state/extracted/YYYY-MM/<thread_id>/<msg_id>.json
  • Updated state/.last_sync
  • Run summary in state/runs/YYYY-MM-DD.extract.json with counts: fetched, extracted, failed, low_confidence_intents, new_customer_domains, new_alias_collisions

Judgment Rules (LLM layer)

The MCP tool handles 95% mechanically. The LLM layer handles:

  1. Intent classification ambiguity (confidence < 0.6) — read the message and call it
  2. Customer domain disambiguationsupport@notify.clientco.com vs wang@clientco.com — same customer? boss-confirm or auto-merge based on org-name presence
  3. Alias merging proposals — when the same human appears under 2+ identities, propose a merge with evidence (signature line match, project overlap)
  4. Dequote escape hatch — if regex strategies leave a clearly garbage body_text_clean (e.g., 90% punctuation), retry with LLM-based dequoting

Failure Modes

Failure Behavior
Email server unreachable Retry with backoff; if 3 retries fail, write failure to run summary, exit gracefully
Single message extraction fails Skip + log, do not abort run
Quota / rate limit Exponential backoff; checkpoint progress to resume next run
LLM call fails Mark intent = unknown, low_confidence = true, continue

Sample I/O

Input: Run incremental from state/.last_sync = 2026-05-08T00:00:00Z

Output (run summary):

{
  "run_id": "2026-05-09T07:30:00Z-extract",
  "fetched": 47,
  "extracted_ok": 45,
  "extraction_failed": 2,
  "low_confidence_intents": 4,
  "new_customer_domains": ["@newprospect.io"],
  "new_alias_collisions": 1,
  "duration_ms": 31200
}

Dependencies

  • MCP tool: email-extractor (see mcp-tools/email-extractor.md)
  • State: read state/people/aliases.json, state/customers/domain_map.json