The standard · April 2026 revision
One portable SME package, every LLM
SMETP v0.5 is the open protocol for capturing what an expert knows and shipping it as a folder any modern agent runtime can execute — Anthropic Agent Skills, ChatGPT Custom GPTs, Gemini system instructions, OpenAI function-calling, MCP servers. One canonical package. Many derivative wrappers.
Spec version 0.5.0 · 7-phase lifecycle · Phase 0 · Phase 1 · Phase 2 · Phase 5 · Phase 6
v0.5.0 — what changed from v0.4
The April 2026 deep-research report traced the gap between SMETP's prior artifact (a single skill.md) and what the elicitation, calibration, and agent-packaging literature requires. v0.5 closes that gap. The release is additive only: v0.4 documents still validate and round-trip cleanly.
- Portable skill package is the canonical artifact, replacing single-file skill.md. The package is a folder with SKILL.md, MANIFEST.yaml, references/, providers/, monitors/, and scripts/.
- Anthropic-style thin SKILL.md: frontmatter + when-to-use + when-not-to-use, with bulk material under references/ using progressive disclosure.
- Per-LLM compiled wrappers ship in providers/: Claude system prompt, ChatGPT instructions, Gemini system_instruction JSON, OpenAI function-calling tool schemas, MCP server descriptor.
- Validation manifest is now a JSON document (not a narrated paragraph) with SME-match, outcome-match, Brier, ECE, AUROC, group calibration, and adversarial pass-rate.
- Confidence is a fitted object, not a hand-picked number. confidence_model.method is isotonic, platt, bayesian-posterior, or uncalibrated, with bins from a held-out split.
- Monitors are first-class: monitors/thresholds.yaml ships with default triggers (PSI, ECE rolling, decision-rate shift, override-rate, missing-data, policy refresh).
- Deterministic, reproducible compile: same bag, same bytes. The package can be hashed, cached, and audit-compared without surprises.
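One way to check the "same bag, same bytes" property is to hash a compiled folder and compare digests across two compiles. The sketch below is an assumption about how such a check could look, not SMETP's own hashing scheme; the function name package_digest is hypothetical.

```python
import hashlib
from pathlib import Path


def package_digest(root: str) -> str:
    """Return a SHA-256 hex digest over relative paths and file bytes.

    Hypothetical helper: the protocol text does not define a digest format.
    """
    base = Path(root)
    digest = hashlib.sha256()
    # Sort by POSIX-style relative path so traversal order never affects the hash.
    files = sorted(
        (p for p in base.rglob("*") if p.is_file()),
        key=lambda p: p.relative_to(base).as_posix(),
    )
    for path in files:
        rel = path.relative_to(base).as_posix().encode("utf-8")
        data = path.read_bytes()
        # Length-prefix each field so name/content boundaries are unambiguous.
        digest.update(len(rel).to_bytes(8, "big"))
        digest.update(rel)
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()
```

If two compiles of the same input yield different digests, something non-deterministic (timestamps, key ordering, line endings) has leaked into the output.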
The package layout
Every paid package compiles to this folder. The free tier ships only SKILL.md, README.md, and three reference markdowns — enough to paste into any LLM, but without the JSON schema, validation, providers, or monitors that paid tiers ship.
skill-name/
├── SKILL.md # Anthropic-style thin nav
├── README.md # package-level orientation
├── MANIFEST.yaml # name, version, owners, runs_on
├── CHANGELOG.md
├── references/
│ ├── skill-document.json # canonical wire schema
│ ├── workflow.md # ordered decision flow
│ ├── graph.md # entities + mermaid
│ ├── dictionary.md # vocabulary the SME uses
│ ├── decision-logic.md # zones, compensatory rules, disqualifiers
│ ├── elicitation-provenance.md # CDM/ACTA sessions, coders, kappa
│ ├── edge-cases.md # known failure modes
│ ├── policy-sources.md # regulations + jurisdictions
│ └── validation-manifest.json # SME-match · outcome-match · Brier · ECE
├── providers/
│ ├── README.md
│ ├── claude-system-prompt.md # Claude Projects / Anthropic Agent Skills
│ ├── chatgpt-instructions.md # Custom GPT instructions
│ ├── gemini-system.json # Vertex AI system_instruction + tools
│ ├── openai-tools.json # function-calling tool schemas
│ └── mcp-server.json # MCP server descriptor
├── monitors/
│ └── thresholds.yaml # drift signals + re-elicitation triggers
└── scripts/
  └── execute.py # runtime stub (calls @smetp/runtime)
The 7-phase lifecycle
v0.5 fuses the strongest pieces of CDM, ACTA, SHELF, IDEA, Anthropic Agent Skills, OpenAI traces, and NIST-style lifecycle management into one governed pipeline. Each phase has a primary output and a minimum exit gate.
| Phase | Primary output | Exit gate |
|---|---|---|
| 0 · Scope | Risk memo, regulatory map, success criteria | Use case stable, high-value, has historical cases |
| 1 · Capture | Transcripts, incident timelines, ACTA artifacts, elicited thresholds | ≥3 concrete cases plus routine-cue audit |
| 2 · Analyze | Coded factors, decision requirements, contradiction log, ontology crosswalk | Dual-coder review; contradictions explicit |
| 3 · Codify | Canonical skill package + executable logic | Every rule has provenance, missing-data handling, safety |
| 4 · Validate | Validation manifest + release recommendation | SME-match, outcome-match, calibration, adversarial pass |
| 5 · Deploy | Shadow + rollout plan | Human-review pathway, rollback triggers configured |
| 6 · Monitor | Drift dashboard, override review, re-elicitation plan | Owners + thresholds + recurrence assigned |
Validation, calibration, monitoring
v0.5 separates fidelity from usefulness. Fidelity asks whether the package reproduces the expert; usefulness asks whether it predicts or improves outcomes. Both ship in the same manifest:
- Fidelity — SME-match, dual-coder concordance, second-reviewer agreement (kappa).
- Predictive validity — outcome-match, AUROC, AUPRC when labels exist.
- Calibration — Brier score, ECE, reliability diagram (always shipped, never optional).
- Robustness — adversarial pass-rate, OOD performance, shadow-run delta.
- Fairness & compliance — group calibration, prohibited-feature checks, protected-proxy review.
- Operations — latency, missing-data rate, escalation rate, override rate.
- Auditability — rule IDs, evidence refs, trace completeness, reproducibility of every logged decision.
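For readers unfamiliar with the calibration metrics named above, here is a minimal sketch of the Brier score and an equal-width-bin ECE for binary outcomes. The function names and the ten-bin default are assumptions for illustration; the source does not say how the manifest computes them.

```python
from typing import List, Sequence, Tuple


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared error between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Equal-width-bin ECE: weighted mean of |observed rate - mean confidence|."""
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_conf = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        ece += (len(members) / total) * abs(observed - mean_conf)
    return ece
```

A package that says 0.9 on cases that come out positive only half the time would score an ECE near 0.4 on those cases, which is the kind of gap a reliability diagram makes visible.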
monitors/thresholds.yaml ships with sensible defaults: feature drift (PSI > 0.20), decision drift (rate shift > 15%), calibration drift (rolling ECE > 0.05), override drift (> 10% in any zone), missing-data spike, policy refresh, and time-based revalidation. Tune per domain on the Ultimate tier.
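To make the drift triggers concrete, the sketch below computes a Population Stability Index over binned proportions and checks a set of metrics against the numeric defaults quoted above. The dictionary keys and function names are hypothetical, since the actual schema of thresholds.yaml is not shown here, and percentages are assumed to be expressed as fractions.

```python
import math
from typing import Dict, List, Optional, Sequence

# Numeric defaults come from the text; the key names are assumptions.
DEFAULT_THRESHOLDS: Dict[str, float] = {
    "psi": 0.20,
    "decision_rate_shift": 0.15,
    "rolling_ece": 0.05,
    "override_rate": 0.10,
}


def psi(expected: Sequence[float], actual: Sequence[float], eps: float = 1e-6) -> float:
    """Population Stability Index between baseline and current bin proportions."""
    if len(expected) != len(actual):
        raise ValueError("bin counts must match")
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)  # floor proportions to avoid log(0)
        total += (a - e) * math.log(a / e)
    return total


def breached(
    metrics: Dict[str, float], thresholds: Optional[Dict[str, float]] = None
) -> List[str]:
    """Return the sorted names of metrics strictly above their threshold."""
    limits = DEFAULT_THRESHOLDS if thresholds is None else thresholds
    return sorted(name for name, limit in limits.items() if metrics.get(name, 0.0) > limit)
```

For example, a feature whose two-bin split moves from 50/50 to 80/20 gives a PSI of roughly 0.42, well above the 0.20 default, so breached would report it.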
One protocol, two artifact tiers
The standard ships in two flavors so the wow moment is free:
- Free starter: three markdowns (skill.md, workflow.md, graph.md) you can paste into any LLM as a system prompt. No card, no signup.
- Full v0.5 package: the folder above, compiled deterministically from the same bag, with per-LLM wrappers, the validation manifest, and monitors. On Paid and Ultimate tiers.
The spec is the standard
@smetp/spec is the contract. It ships as MIT-licensed JSON Schema + zod under packages/spec. The semver of the package is the semver of the protocol. Major bumps are breaking changes; minor bumps are additive. v0.5 adds PackageLayout and ProviderWrappers schemas alongside the existing SkillDocument.
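The versioning rule can be read as a simple compatibility check: a reader accepts any document with the same major version and a minor version no newer than its own. The sketch below assumes plain MAJOR.MINOR.PATCH strings; the function names are hypothetical and not part of @smetp/spec.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH', tolerating a leading 'v'."""
    major, minor, patch = (int(part) for part in version.lstrip("v").split("."))
    return major, minor, patch


def can_read(reader: str, document: str) -> bool:
    """True when the document's major matches and its minor is not newer."""
    r_major, r_minor, _ = parse_version(reader)
    d_major, d_minor, _ = parse_version(document)
    return r_major == d_major and d_minor <= r_minor
```

Under this reading, a v0.5.0 reader accepts a v0.4.0 document, while a v0.4.0 reader makes no promise about a v0.5.0 one.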
The graph is the moat
Run it yourself in 60 seconds
# Capture an SME and compile the v0.5 package locally
npx @smetp/cli interview \
--domain finance \
--role "mortgage underwriter" \
--sme "Jane Doe" \
--years 18 \
--out jane.json
npx @smetp/cli compile \
--in jane.json \
--tier paid \
--out ./jane-package
# Drop it into Claude Projects, ChatGPT, Gemini, MCP …
ls jane-package/providers
That's the entire dependency on us: zero. The hosted product is what you graduate to when you want the cross-tenant graph, the live capture canvas, and the operator co-pilot in the loop.