We run on this too.
This is the actual system we run Sidekick on, the same one we build for you. The repetitive work between your tools gets done, checked, and handed back for your okay.
Your tools stay.
The busywork between them goes.
It is one place that does your repeat work, checks itself, and never sends anything without your say-so.
- Inbox Triage: sorts and drafts your email
- Meeting Notes: summarizes your calls
- Action Items: tracks every to-do
- Pipeline Pulse: flags deals going cold
- Prospect Prep: briefs you before calls
- Outreach Drafts: writes your follow-ups
- Email Voice: replies that sound like you
- Posts + Blog: drafts your content
- AI Search: get found by AI assistants
- Client Health: spots quiet clients
- Weekly Report: assembles your week
- Knowledge Base: keeps it all findable
You give it the idea. It hands back finished work, not a rough draft.
One-shot is a real skill we built into this workspace. Most AI gives you a first draft and a to-do list. One-shot does the whole job in one run: it researches, plans, and builds, then grades its own work and fixes the gaps before you ever see it.
What a skill is: a saved, repeatable workflow, a process the system runs the same way every time. One-shot is one of ours, shown here in action.
Not a demo. This is how we run.
The whole thing, in detail.
Everything above is the shape of the system. This is the full specification: what is inside it, how each layer works, what it solves, and where its real limits are. It is written for someone already using AI seriously who wants the mechanism, not the brochure. Everything here is the system as it runs today, measured failure rates and platform limits included. We would rather you decide on the true shape of the thing than on a polished version of it.
AI workspaces decay by default
Adopt Claude or ChatGPT seriously and the first few weeks feel like progress. Then the slide starts. The same failures show up across very different people, in the same order, and they compound quietly enough to look like the model getting worse. It does not get dumber. It gets buried.
None of these is dramatic on day one. By week six the workspace is a junk drawer, you trust it less than at the start, and the honest read is that the tool never paid back its own learning curve. That is what turns “it knows me” into “it has gotten worse.”
Want the plain-English version of this? Read Why Your AI Gets Worse the More You Use It.
You get a workspace that gets better the more you use it
The deliverable is not a clever prompt pack. It is an operating environment that gets more useful with use: context compounds, corrections become rules, and improvements made anywhere in the system propagate forward into every new build. Every build starts as a copy of the Master Template (currently v1.9) and is populated with your context. The workspace is plain files in a folder you own: no database, no proprietary runtime, fully inspectable and portable.
|-- CLAUDE.md the instruction layer; loads at every session start
|-- Start_Here.md one-page entry point, names your first-win task
|-- Your_Workflows.md your 2-3 highest-priority workflows, copy-paste prompts
|-- INDEX.md master map of every file in the workspace
|-- Starter_Prompts.txt the full skill routing table: trigger phrases
|-- Skills/ 8 day-one skills + Skills/Operator/ (30-skill library)
|-- Guides/ branded PDF guide set, dashboard template, Your First Month
|-- Reference/ Business + Comms profiles, Current_State, Action_Items, Decisions
|-- People/ one context file per key relationship
|-- Meetings/ structured call notes, one file per meeting
`-- SOPs/ · Active/ · Deliverables/ your processes, work in progress, finished output
A few root files carry the day-one load: Start_Here.md (how to start a session and your named first win, no placeholders), Your_Workflows.md (your priority workflows with copy-paste prompts), and Starter_Prompts.txt (the routing table that makes triggering reliable). A reader who opens the folder cold finds an entry point, a map, and no internal build artifacts.
The index is what makes memory reliable
Most people run their AI memory like a junk drawer: everything goes in, nothing comes out, and eventually you are afraid to open it. The fix is a handful of boring disciplines, none of which need a better model. CLAUDE.md is the one file the platform reliably loads at session start, so it carries everything that must always hold: identity, the routing table, the memory rules, and the safety rules.
Give the workspace an index. A simple map of every file and what it holds lets the model look up the one relevant, current thing and read just that. With no index, it rereads the entire filing cabinet to answer one question, fills its working memory (the context window) with noise, and forgets the answer. People read that as the model getting forgetful. What is really happening is it has more memory than it can sort, so it holds all of it at once and weights none of it. The index is also where relevance and recency get decided. For most people hitting the “it cannot remember anything” wall, this is the highest-impact fix almost nobody has in place.
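As an illustration of the index idea, here is a minimal Python sketch that walks a workspace folder and produces a one-line-per-file map. The function name `build_index`, the line format, and the choice to use each file's first line as its summary are assumptions made for this example, not the workspace's actual index format.

```python
from datetime import datetime, timezone
from pathlib import Path


def build_index(root: Path) -> str:
    """Return index text: one line per Markdown file with path, summary, and date."""
    lines = ["# INDEX", ""]
    for path in sorted(root.rglob("*.md")):
        if path.name == "INDEX.md":
            continue  # the index does not list itself
        text = path.read_text(encoding="utf-8").strip()
        # Assumption: the first line of a file is a usable one-line summary.
        summary = text.splitlines()[0].lstrip("# ").strip() if text else "(empty)"
        modified = datetime.fromtimestamp(
            path.stat().st_mtime, tz=timezone.utc
        ).date().isoformat()
        rel = path.relative_to(root).as_posix()
        lines.append(f"- {rel} | {summary} | updated {modified}")
    return "\n".join(lines) + "\n"
```

With a map like this, a session can read one short file, pick the single relevant and most recent entry, and open only that, instead of loading the whole folder into the context window.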
Rewrite the state, do not append to it. The current-state file is rewritten at the end of each session so it always reflects what is true now, instead of stacking session recaps until it contradicts itself. This single discipline kills most of the instruction-pileup decay on its own.
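A rough sketch of the rewrite-not-append discipline, in Python. The section headings and the `rewrite_state` signature are hypothetical; the point is only that the file is replaced whole, so yesterday's recap cannot linger next to today's truth.

```python
import os
import tempfile
from pathlib import Path


def rewrite_state(state_file: Path, priorities: list[str], open_questions: list[str]) -> None:
    """Replace the entire current-state file with what is true now."""
    body = ["# Current state", "", "## Priorities"]
    body += [f"- {item}" for item in priorities]
    body += ["", "## Open questions"]
    body += [f"- {item}" for item in open_questions]
    text = "\n".join(body) + "\n"

    # Write to a temporary file in the same folder, then swap it into place,
    # so a crash mid-write never leaves a half-written state file behind.
    fd, tmp_name = tempfile.mkstemp(dir=state_file.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    os.replace(tmp_name, state_file)
```

Calling it twice leaves only the second call's content in the file, which is the behavior the paragraph above describes.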
Route memory by type. What is true now (priorities, business, people) lives in one place. What happened when (decisions, meeting notes) lives in another. How things get done (standing rules, workflows) lives in a third. Every “remember this” is sorted to the right type before it is written, so the assistant can actually find it later.
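The three-way routing can be sketched as a small lookup. In the real workspace the sorting is a judgment the assistant makes; the keyword cues below are a crude stand-in for that judgment, and the destination file names (including `SOPs/Standing_Rules.md`) are assumptions for illustration.

```python
from enum import Enum


class MemoryType(Enum):
    STATE = "what is true now"
    HISTORY = "what happened when"
    PROCESS = "how things get done"


# Hypothetical destinations, loosely modeled on the folder layout shown earlier.
DESTINATION = {
    MemoryType.STATE: "Reference/Current_State.md",
    MemoryType.HISTORY: "Reference/Decisions.md",
    MemoryType.PROCESS: "SOPs/Standing_Rules.md",
}

# Illustrative cue words only; a keyword match is far weaker than real classification.
PROCESS_CUES = ("always", "never", "from now on")
HISTORY_CUES = ("decided", "agreed", "met with")


def classify(note: str) -> MemoryType:
    """Sort a 'remember this' note into one of the three memory types."""
    lowered = note.lower()
    if any(cue in lowered for cue in PROCESS_CUES):
        return MemoryType.PROCESS
    if any(cue in lowered for cue in HISTORY_CUES):
        return MemoryType.HISTORY
    return MemoryType.STATE


def route(note: str) -> str:
    """Return the file a note should be written to."""
    return DESTINATION[classify(note)]
```

For example, `route("From now on, invoices go out on Fridays")` lands in the process file, while `route("We decided to drop the annual plan")` lands in the decisions log.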
Every drafting and analysis skill reads this layer before producing anything, which is why output quality scales with workspace age here and degrades with it elsewhere. The architecture is the moat; the prompts are replaceable. A weekly scheduled lint pass catches stale content, contradictions, and dead references before they mislead a session.
Most builds run 40+ skills, built around your work
A skill is a saved specification that wraps a recurring piece of judgment: what to read, what to produce, what never to do. A core set is active from day one, and a deeper library opens up as your usage matures, so most builds end up running more than 40 skills in total. Every skill carries explicit trigger phrases, because manual triggering is the reliable path on this platform (more on that under limits).
| Day-one skill | What it does |
|---|---|
| session | Opens and closes the session: reads context, then updates current state, logs decisions, and integrates corrections. The ritual that makes everything else persistent. |
| daily-brief | A sub-3-minute morning orientation from priorities, action items, decisions, meetings, and live calendar, email, and CRM where connected. |
| post-call | Raw notes or a transcript in; structured meeting notes, signals, a follow-up draft, and a next action out, in one pass. |
| quick-capture | “Remember this” routed to the right file and confirmed in one line. No questions, no analysis. |
| action-items | The live task ledger: add, review, filter by person, prioritize, hygiene sweep. The owner column drives the dashboard split. |
| meeting-prep | A one-page brief for an upcoming meeting: agenda, objectives, relationship context, recent signals. |
| executive-comms | Drafts a message to a named recipient in your calibrated voice and their register. Never sends without approval. |
| dashboard-render | Renders the live daily dashboard from your action items, decisions, and current state. |
| Operator library · introduced as usage matures | |
| Documents | content, docx / pptx / xlsx, outreach, prompt-sharpener: long-form writing, branded documents, and drafts in your voice. |
| Planning | plan / build, decision-brief, strategic-analysis, board-prep: multi-step work spec-locked, scored, and steelman-first. |
| Voice | my-voice / voice-extractor, stakeholder-comms, people-intelligence: voice pulled from real samples, multi-audience handling. |
| Email | inbox-organizer, email-drafter: triage plus reply drafting staged in Gmail. Never sends. |
| Meetings | meeting-intelligence, week-ahead / weekly-review: transcript processing and the weekly cadence. |
| System health | feedback-loop, sop-capture, tune-this, quality-loop, workspace-eval, context-lint, update-diff, monthly-review, schedule. |
The baseline: we turn your two or three highest-impact workflows into custom skills, with trigger phrases rich enough to catch the natural ways you would ask. From there it grows with you. Once those first workflows are running, the coaching tends to surface the next ones. Clients who take the coaching often finish with half a dozen or more custom skills, covering ground they did not know could be automated when they started, and the bulk of their operation ends up running on tooling built specifically for them. One delivered example: a bookkeeping practice owner runs her core bookkeeping workflows as custom skills.
A live dashboard and a cadence that runs itself
The dashboard is your persistent operating view, rendered as a live artifact from your working files, not a separate database that drifts. Tiles come from your action items, decisions, and current state, and the owner column splits work into “the system can handle” and “only you.” It is plain HTML you own, with a setup guide for adding tiles.
Scheduled tasks are reliable for these cadences but are timeout-sensitive on long runs and depend on you granting the task its tool access. The automation layer is real, but it is a cadence layer, not an autonomous agent fleet. Onboarding is paced on a maturity ladder: Setup, Habit, Independent, Leverage, Mastery.
A quality floor that has itself been tested
You will paste emails, transcripts, and documents from strangers into the workspace. The safety layer assumes that. External content is treated as data, never instructions: it cannot invoke a skill, change a rule, or authorize an action, and instructions found inside content are surfaced, never silently obeyed. Every outbound skill drafts only; sending is always a human click, which caps the blast radius of any manipulation at a staged draft.
No workspace ships on a feeling. Every build passes eight mechanical floor gates, each a literal pass or fail with no evaluator discretion: cross-client contamination, stale pricing, placeholders shipped as real, wrong identity, leaked internal content, dead references, core-skill integrity, and structural completeness. A single failure is a red verdict that blocks delivery. Then ten graded dimensions score the build against a written bar, ending with an experience score taken after a four-scenario simulation of your first week.
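To make "literal pass or fail" concrete, here is a minimal Python sketch of two of the eight gates and the blocking verdict. The in-memory `Workspace` model, the double-brace placeholder convention, the index line format, and the "green" label for a clean result are all assumptions for this example; only the "red on any failure" rule comes from the description above.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical in-memory model: relative path -> file text.
Workspace = dict[str, str]


@dataclass(frozen=True)
class Gate:
    name: str
    check: Callable[[Workspace], bool]  # True means the gate passes


def no_placeholders(ws: Workspace) -> bool:
    # Assumed placeholder convention: double-brace tokens such as {{CLIENT_NAME}}.
    return not any(re.search(r"\{\{.*?\}\}", text) for text in ws.values())


def no_dead_references(ws: Workspace) -> bool:
    # Assumed index format: lines that start with "- path/to/file.md".
    listed = re.findall(r"^- (\S+\.md)", ws.get("INDEX.md", ""), flags=re.MULTILINE)
    return all(path in ws for path in listed)


GATES = [
    Gate("placeholders shipped as real", no_placeholders),
    Gate("dead references", no_dead_references),
]


def verdict(ws: Workspace, gates: list[Gate] = GATES) -> tuple[str, list[str]]:
    """Return ("red", failed gate names) if any gate fails, else ("green", [])."""
    failed = [gate.name for gate in gates if not gate.check(ws)]
    return ("red" if failed else "green", failed)
```

Each check returns a plain boolean, so there is nothing for an evaluator to interpret: one failed gate is enough to block delivery.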
A quality check you have never tested is just a vibe. So we tested ours. In June 2026 we ran a seeded-defect audit against our own framework: ten known defect classes planted into workspace copies, then blind evaluators who did not know what was planted ran the gate.
Corrections become rules, and rules propagate
The compounding claim is mechanical, not aspirational. When you correct the system, the correction is scored on four signals: phrase strength, repetition, specificity, and conflict with existing rules. High-confidence corrections integrate automatically; mid-confidence ones surface for your approval; the rest park and accumulate evidence. Instruction-layer changes never auto-apply regardless of score.
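The three-way outcome can be sketched in a few lines of Python. The four signal names follow the paragraph above, but the 0-to-1 scale, the way they are combined, and both thresholds are invented for illustration; the actual scoring is not specified here.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only to make the example concrete.
AUTO_THRESHOLD = 0.75
REVIEW_THRESHOLD = 0.45


@dataclass(frozen=True)
class Correction:
    phrase_strength: float  # 0..1, how emphatic the wording was
    repetition: float       # 0..1, how often the same correction has appeared
    specificity: float      # 0..1, how precisely it names the behavior
    conflict: float         # 0..1, how strongly it contradicts existing rules
    touches_instruction_layer: bool = False


def confidence(c: Correction) -> float:
    """Assumed combination: average the supporting signals, discount by conflict."""
    support = (c.phrase_strength + c.repetition + c.specificity) / 3
    return support * (1 - c.conflict)


def disposition(c: Correction) -> str:
    """Return "integrate", "ask", or "park" for a scored correction."""
    score = confidence(c)
    if score >= AUTO_THRESHOLD:
        # Instruction-layer changes never auto-apply, whatever the score.
        return "ask" if c.touches_instruction_layer else "integrate"
    if score >= REVIEW_THRESHOLD:
        return "ask"
    return "park"
```

A strong, specific, repeated correction with no conflicts integrates on its own; the same correction aimed at the instruction layer is surfaced for approval instead.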
Every corner cut that costs a redirect is logged as a named pattern, and a pattern recurring across three sessions is promoted to a standing rule, so the same mistake structurally cannot keep happening quietly. Validated improvements land in the Master Template through a tracked log, forward-only into new builds; delivered workspaces are never retro-edited. You pick up changes when you say “check for updates” and the system reports exactly what is new since your install date, with adoption left to you.
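The three-session promotion rule described above is simple enough to sketch directly. The `PatternLog` class and its method names are hypothetical; the threshold of three sessions is taken from the text.

```python
from collections import defaultdict

PROMOTION_THRESHOLD = 3  # sessions, per the rule described above


class PatternLog:
    """Tracks named patterns and promotes recurring ones to standing rules."""

    def __init__(self) -> None:
        self._sessions: dict[str, set[str]] = defaultdict(set)
        self.standing_rules: list[str] = []

    def record(self, pattern: str, session_id: str) -> bool:
        """Log one occurrence; return True if this call promoted the pattern."""
        # A set means repeats inside one session count once.
        self._sessions[pattern].add(session_id)
        seen_in = len(self._sessions[pattern])
        if seen_in >= PROMOTION_THRESHOLD and pattern not in self.standing_rules:
            self.standing_rules.append(pattern)
            return True
        return False
```

Counting distinct sessions rather than raw occurrences is a design choice in this sketch: one bad afternoon does not create a rule, but the same slip on three separate days does.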
Constraints we design around, not away
These are documented, verified constraints of the Claude and Cowork platform as of June 2026. The architecture choices only make sense against them, which is why they are in this document rather than hidden under it.
What the quality gate does not certify:
- Your activation. The gate certifies the workspace as delivered; whether you embed it in daily operations is coached, not guaranteed.
- Long-term drift. Priorities shift. The lint and review cadences catch drift, but only if you run them or keep the schedules on.
- Voice under pressure. A voice calibrated from polished writing may differ in a tense thread; the review mode converges it over time.
- Connector uptime. The gate cannot test the real-world stability of your Gmail, Drive, or CRM connections.
The system is wrong for buyers who want fully hands-off AI, who do not work at a computer daily, or who need deep enterprise integration. If that is you, we will say so in the first conversation rather than sell past the line.
Signed engagement to independent operator
Plenty of people will configure an AI workspace for you. The difference here is the system around it: memory engineered against decay, a quality gate that has itself been measured with planted defects, an improvement loop that turns corrections into rules, and a written record of the limits. You are not buying prompts. You are buying an operating environment with a maintenance discipline, and this is its honest specification.
This one is ours. We build you yours.
Same engine, your business. Different skills, your tools, your voice, shaped to how you actually work. Tell us what your week looks like and we will show you what it could run on.