This adds the full Atlas (总助 Claw, i.e. "chief-of-staff Claw / boss's-eye project-execution radar") scaffolding as a sibling profile to the existing Vega general-purpose assistant. All Atlas content lives under `atlas/` to keep the existing top-level skeleton intact.

What's included:

- `atlas/IDENTITY.md`, `SOUL.md`, `USER.md`, `AGENTS.md`, `MEMORY.md`, `BOOTSTRAP.md`, `HEARTBEAT.md`, `TOOLS.md` (+ zh-CN mirrors) — the full OpenClaw 8-piece set matching the zero-cca convention
- `atlas/skills/` — 6 sub-skills with frontmatter: claw-email-parser / claw-project-tracker / claw-people-observer / claw-customer-radar / claw-boss-distiller / claw-report-writer
- `atlas/skills/claw-boss-distiller/` — adapter notes for nuwa-skill, a 5-layer boss_skill seed template (23 rules across Expression DNA / Mental Models / Decision Heuristics / Anti-Patterns / Honest Boundaries), and a complete synthetic distillation demo (10 input emails → validated 5-layer output)
- `atlas/mcp-tools/email-extractor/` — Python implementation of stages 1–3 (fetch + decode + dequote), 7 passing pytest tests, CLI: `atlas-extract`
- `atlas/state-schemas/` — formal JSON schemas for project / person / customer cards, with the no-employee-rating hard constraint baked in
- `atlas/client-deck/` — 2-page client-facing pitch document
- `autopilots/atlas-*.yaml` — 5 autopilot configs (daily / weekly / monthly / quarterly + andon event-triggered) for a future Multica-side scheduler

Notes:

- nuwa-skill (MIT, https://github.com/alchaincyf/nuwa-skill) is NOT vendored; fetch it at deploy time via the instructions in `atlas/skills/claw-boss-distiller/upstream/README.md`
- The Vega-side prompts/skills/tools/autopilots/docs scaffold is left untouched
- Top-level `README.md` updated with a brief Atlas pointer; the rest is preserved
| name | description |
|---|---|
| claw-email-parser | Wraps the email-extractor MCP tool; orchestrates fetch → extract → write canonical Email JSON. The thin LLM layer that decides what to do with low-confidence extractions. |
# claw-email-parser

## Purpose

Atlas's intake skill. Given a date range, pull and extract emails into canonical JSON, surface anything the rule-based extractor was uncertain about, and let the LLM make a judgment call (or punt to the boss).
## Inputs

- `since`: ISO date — e.g. `2025-05-09` (V0 default = 12 months ago)
- `until`: ISO date — e.g. today
- `mode`: `full_backfill` | `incremental` (incremental reads `state/.last_sync`)
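The `since` default and incremental behavior can be sketched in a few lines. This is a minimal illustration, assuming the `.last_sync` file holds a bare ISO timestamp; the function name and the 365-day approximation of "12 months" are assumptions, not the shipped implementation.

```python
from datetime import date, timedelta
from pathlib import Path

LAST_SYNC = Path("state/.last_sync")

def resolve_since(mode: str, today: date) -> str:
    """Pick the extraction start date: last checkpoint for incremental runs,
    otherwise the V0 default of 12 months back (approximated as 365 days)."""
    if mode == "incremental" and LAST_SYNC.exists():
        return LAST_SYNC.read_text().strip()  # ISO timestamp written by the previous run
    return (today - timedelta(days=365)).isoformat()
```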
## Outputs

- N × `state/extracted/YYYY-MM/<thread_id>/<msg_id>.json`
- Updated `state/.last_sync`
- Run summary in `state/runs/YYYY-MM-DD.extract.json` with counts: fetched, extracted, failed, low_confidence_intents, new_customer_domains, new_alias_collisions
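The canonical output layout above can be expressed as a small path builder. A sketch only: the helper name is hypothetical, but the `YYYY-MM/<thread_id>/<msg_id>.json` layout follows this doc.

```python
from pathlib import Path

def extracted_path(root: Path, sent_iso: str, thread_id: str, msg_id: str) -> Path:
    """Build state/extracted/YYYY-MM/<thread_id>/<msg_id>.json from an
    ISO-8601 sent timestamp (year-month is its first 7 characters)."""
    year_month = sent_iso[:7]  # "2026-05" from "2026-05-09T07:30:00Z"
    return root / "extracted" / year_month / thread_id / f"{msg_id}.json"
```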
## Judgment Rules (LLM layer)

The MCP tool handles ~95% of messages mechanically. The LLM layer handles:

- Intent classification ambiguity (confidence < 0.6) — read the message and make the call
- Customer domain disambiguation — `support@notify.clientco.com` vs `wang@clientco.com`: same customer? Boss-confirm or auto-merge based on org-name presence
- Alias merging proposals — when the same human appears under 2+ identities, propose a merge with evidence (signature-line match, project overlap)
- Dequote escape hatch — if the regex strategies leave a clearly garbage `body_text_clean` (e.g., 90% punctuation), retry with LLM-based dequoting
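The "clearly garbage body" trigger for the dequote escape hatch could look like the following. The 90% threshold is the example from this doc; the function name and exact ratio logic are illustrative assumptions.

```python
def looks_garbled(body_text_clean: str, threshold: float = 0.9) -> bool:
    """True when the dequoted body is mostly punctuation/symbols, which
    suggests the regex dequote strategies failed and an LLM retry is warranted."""
    stripped = [c for c in body_text_clean if not c.isspace()]
    if not stripped:
        return True  # an empty body after dequoting is also a failure signal
    non_alnum = sum(1 for c in stripped if not c.isalnum())
    return non_alnum / len(stripped) >= threshold
```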
## Failure Modes
| Failure | Behavior |
|---|---|
| Email server unreachable | Retry with backoff; if 3 retries fail, write failure to run summary, exit gracefully |
| Single message extraction fails | Skip + log, do not abort run |
| Quota / rate limit | Exponential backoff; checkpoint progress to resume next run |
| LLM call fails | Mark intent = unknown, low_confidence = true, continue |
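The retry-with-backoff rows in the table above amount to the pattern below. A sketch under stated assumptions: the delays, the 3-retry cap's exact placement, and the use of `ConnectionError` are illustrative, not the shipped tool's values.

```python
import time

def fetch_with_backoff(fetch, retries: int = 3, base_delay: float = 1.0):
    """Call fetch(), retrying on connection errors with exponential backoff.
    After the final retry fails, re-raise so the caller can write the failure
    to the run summary and exit gracefully."""
    for attempt in range(retries + 1):
        try:
            return fetch()
        except ConnectionError:
            if attempt == retries:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```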
## Sample I/O

Input: incremental run with `state/.last_sync = 2026-05-08T00:00:00Z`

Output (run summary):

```json
{
  "run_id": "2026-05-09T07:30:00Z-extract",
  "fetched": 47,
  "extracted_ok": 45,
  "extraction_failed": 2,
  "low_confidence_intents": 4,
  "new_customer_domains": ["@newprospect.io"],
  "new_alias_collisions": 1,
  "duration_ms": 31200
}
```
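One invariant implicit in the sample is that every fetched message is accounted for: `fetched == extracted_ok + extraction_failed`. A tiny hypothetical check, useful as a sanity assertion over run summaries:

```python
def counts_consistent(summary: dict) -> bool:
    """Every fetched message must end up either extracted or failed."""
    return summary["fetched"] == summary["extracted_ok"] + summary["extraction_failed"]
```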
## Dependencies

- MCP tool: `email-extractor` (see `mcp-tools/email-extractor.md`)
- State: read `state/people/aliases.json`, `state/customers/domain_map.json`