Three things in this commit:

1. Atlas skills are now agentskills.io / Hermes-compatible
   - Each atlas/skills/claw-*/SKILL.md frontmatter is enriched with version,
     author, license, and a metadata.hermes block (tags, category,
     related_skills, boundaries)
   - New atlas/skills/DESCRIPTION.md, per the Hermes category convention
   - New atlas/INTEGRATION-hermes.md: a step-by-step SOP for installing Atlas
     onto the hermes-agent runtime (cp skills, fetch nuwa upstream, configure
     env, wire cron, smoke test). It documents the branding override and the
     self-improving-loop guardrail.

2. nuwa-skill mirror prep (waiting on org-repo creation)
   - scripts/mirror-nuwa-to-moments.sh: one-shot bare clone + push --mirror
   - docs/decisions/0001-mirror-nuwa-skill.md: ADR explaining the rationale,
     the bot-token scope limitation, and the manual one-time repo creation
     step required at https://git.moments.top/repo/create

3. README rewrite
   - Atlas-forward navigation table ("想做什么 → 看哪里", "what you want to
     do → where to look")
   - Quickstart sections for browsing, running tests locally, fetching
     nuwa upstream (public + air-gapped variants), and Hermes integration
   - Preserved all original Vega working agreements
   - Roadmap with explicit Atlas / Vega tracks

The bot account (multica-bot) lacks the write:organization scope, so it cannot
create the nuwa-skill repo via the API. After a human creates the empty repo at
git.moments.top/Moments.top/nuwa-skill, run scripts/mirror-nuwa-to-moments.sh
to populate it.
72 lines
2.7 KiB
Markdown
---
name: claw-email-parser
description: Wraps the email-extractor MCP tool; orchestrates fetch → extract → write canonical Email JSON. The thin LLM layer that decides what to do with low-confidence extractions.
version: 0.1.0
author: Moments / Atlas team
license: MIT
metadata:
  hermes:
    category: atlas
    tags: [email, imap, extraction, atlas, intake]
    related_skills: [claw-project-tracker, claw-people-observer, claw-customer-radar]
---

# claw-email-parser

## Purpose

Atlas's intake skill. Given a date range, pull and extract emails into canonical JSON, surface anything the rule-based extractor was uncertain about, and let the LLM make a judgment call (or punt to the boss).

## Inputs

- `since`: ISO date, e.g. `2025-05-09` (V0 default = 12 months ago)
- `until`: ISO date, e.g. today
- `mode`: `full_backfill` | `incremental` (incremental reads `state/.last_sync`)
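
The inputs above could be validated along these lines. This is a minimal sketch, not the skill's actual code: `RunInputs`, `months_ago`, and `parse_inputs` are hypothetical names, and it assumes the "12 months ago" default is counted back from today's date.

```python
import calendar
from dataclasses import dataclass
from datetime import date
from typing import Optional

VALID_MODES = ("full_backfill", "incremental")


@dataclass(frozen=True)
class RunInputs:
    since: date
    until: date
    mode: str


def months_ago(day: date, months: int) -> date:
    """Shift `day` back by `months`, clamping to the last day of the target month."""
    total = day.year * 12 + (day.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def parse_inputs(since: Optional[str], until: Optional[str],
                 mode: str, today: date) -> RunInputs:
    """Parse ISO dates and apply defaults (assumed: until = today, since = 12 months back)."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    until_d = date.fromisoformat(until) if until else today
    since_d = date.fromisoformat(since) if since else months_ago(today, 12)
    if since_d > until_d:
        raise ValueError("since must not be later than until")
    return RunInputs(since=since_d, until=until_d, mode=mode)
```

Passing `today` explicitly keeps the function deterministic, which makes the default easy to test.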

## Outputs

- N × `state/extracted/YYYY-MM/<thread_id>/<msg_id>.json`
- Updated `state/.last_sync`
- Run summary in `state/runs/YYYY-MM-DD.extract.json` with counts: fetched, extracted, failed, low_confidence_intents, new_customer_domains, new_alias_collisions
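
The output layout above can be sketched as two small path helpers. The function names are hypothetical, and two things are assumed rather than stated in this document: that `YYYY-MM` comes from the message's received date, and that `thread_id` and `msg_id` are already filesystem-safe.

```python
from datetime import date, datetime
from pathlib import PurePosixPath


def extracted_path(received_at: datetime, thread_id: str, msg_id: str) -> PurePosixPath:
    """Build state/extracted/YYYY-MM/<thread_id>/<msg_id>.json.

    Assumption: the month bucket is taken from the message's received date.
    """
    month = received_at.strftime("%Y-%m")
    return PurePosixPath("state/extracted") / month / thread_id / f"{msg_id}.json"


def run_summary_path(run_date: date) -> PurePosixPath:
    """Build state/runs/YYYY-MM-DD.extract.json for the given run date."""
    return PurePosixPath("state/runs") / f"{run_date.isoformat()}.extract.json"
```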

## Judgment Rules (LLM layer)

The MCP tool handles 95% of cases mechanically. The LLM layer handles:

1. **Intent classification ambiguity** (confidence < 0.6): read the message and make the call
2. **Customer domain disambiguation**: are `support@notify.clientco.com` and `wang@clientco.com` the same customer? Ask the boss to confirm, or auto-merge based on org-name presence
3. **Alias merging proposals**: when the same human appears under 2+ identities, propose a merge with evidence (signature line match, project overlap)
4. **Dequote escape hatch**: if the regex strategies leave a clearly garbage `body_text_clean` (e.g., 90% punctuation), retry with LLM-based dequoting
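
Rules 1 and 4 are the two that reduce to a numeric check, so they can be sketched as simple predicates. The helper names are hypothetical. The 0.6 threshold comes from rule 1; the 0.9 punctuation ratio treats the "90% punctuation" example as a cutoff, which is an assumption. Only ASCII punctuation is counted.

```python
import string

CONFIDENCE_THRESHOLD = 0.6   # from rule 1
GARBAGE_PUNCT_RATIO = 0.9    # assumed cutoff, based on the "90% punctuation" example


def needs_llm_intent_review(confidence: float) -> bool:
    """True when the rule-based intent classification is too uncertain."""
    return confidence < CONFIDENCE_THRESHOLD


def punctuation_ratio(text: str) -> float:
    """Share of non-whitespace characters that are ASCII punctuation."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    punct = sum(1 for c in chars if c in string.punctuation)
    return punct / len(chars)


def needs_llm_dequote(body_text_clean: str) -> bool:
    """True when the regex-dequoted body looks like garbage and should be retried."""
    return punctuation_ratio(body_text_clean) >= GARBAGE_PUNCT_RATIO
```

Rules 2 and 3 depend on evidence such as org names and signature lines, so they are left to the LLM layer rather than a fixed threshold.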

## Failure Modes

| Failure | Behavior |
|---------|---------|
| Email server unreachable | Retry with backoff; if 3 retries fail, write failure to run summary, exit gracefully |
| Single message extraction fails | Skip + log, do not abort run |
| Quota / rate limit | Exponential backoff; checkpoint progress to resume next run |
| LLM call fails | Mark intent = `unknown`, low_confidence = true, continue |
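
The "Email server unreachable" row can be sketched as a retry wrapper. This is an illustration under stated assumptions, not the skill's implementation: the function names, the `failure` key in the summary, the use of `ConnectionError` as the "unreachable" signal, and the 1 s base delay are all hypothetical. "3 retries" is read as one initial attempt plus three retries.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_with_backoff(operation: Callable[[], T], retries: int = 3,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `operation`; on ConnectionError, retry up to `retries` times.

    Delays double each time: base_delay, 2x, 4x, ...
    The last ConnectionError is re-raised once retries are exhausted.
    """
    attempt = 0
    while True:
        try:
            return operation()
        except ConnectionError:
            if attempt >= retries:
                raise
            sleep(base_delay * (2 ** attempt))
            attempt += 1


def fetch_or_record_failure(fetch: Callable[[], T], summary: dict,
                            sleep: Callable[[float], None] = time.sleep) -> Optional[T]:
    """Return the fetch result, or record the failure in the run summary and return None."""
    try:
        return retry_with_backoff(fetch, sleep=sleep)
    except ConnectionError as exc:
        # Hypothetical summary key; the real run-summary schema may differ.
        summary["failure"] = f"email server unreachable: {exc}"
        return None
```

Injecting `sleep` lets a test replace real waiting with a recorder, so the backoff schedule can be checked without delays.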

## Sample I/O

**Input:** Run incremental from `state/.last_sync` = 2026-05-08T00:00:00Z

**Output (run summary):**
```json
{
  "run_id": "2026-05-09T07:30:00Z-extract",
  "fetched": 47,
  "extracted_ok": 45,
  "extraction_failed": 2,
  "low_confidence_intents": 4,
  "new_customer_domains": ["@newprospect.io"],
  "new_alias_collisions": 1,
  "duration_ms": 31200
}
```
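
A run summary like the sample above lends itself to a quick sanity check. The sketch below assumes an invariant that this document does not state but that the sample satisfies: every fetched message is either extracted or failed (47 = 45 + 2). `check_run_summary` is a hypothetical helper.

```python
import json
from typing import List

SAMPLE_SUMMARY = """
{
  "run_id": "2026-05-09T07:30:00Z-extract",
  "fetched": 47,
  "extracted_ok": 45,
  "extraction_failed": 2,
  "low_confidence_intents": 4,
  "new_customer_domains": ["@newprospect.io"],
  "new_alias_collisions": 1,
  "duration_ms": 31200
}
"""

COUNT_KEYS = ("fetched", "extracted_ok", "extraction_failed",
              "low_confidence_intents", "new_alias_collisions")


def check_run_summary(summary: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means the summary looks sane."""
    problems = []
    for key in COUNT_KEYS:
        if summary[key] < 0:
            problems.append(f"{key} is negative")
    # Assumed invariant: fetched = extracted_ok + extraction_failed.
    if summary["fetched"] != summary["extracted_ok"] + summary["extraction_failed"]:
        problems.append("fetched != extracted_ok + extraction_failed")
    return problems


if __name__ == "__main__":
    print(check_run_summary(json.loads(SAMPLE_SUMMARY)))
```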

## Dependencies

- MCP tool: `email-extractor` (see `mcp-tools/email-extractor.md`)
- State: read `state/people/aliases.json`, `state/customers/domain_map.json`