Driver View — MIXI Executive Workshop

DRIVER VIEW · facilitator only

Shortcuts: Space start/pause · N next · B prev

Prompts (Click to Copy)

P1-01 Spec-to-build (cross-review · Plan mode)

Create a new empty folder at ~/workshop-demo/ai-executive/ and build inside it the MIXI "AI Executive cross-review agenda evaluation system". This is a working executive-support tool: three agents (CEO / CFO / CTO) evaluate an agenda from their own angle, cross-review each other, and then a Facilitator agent produces a consolidated recommendation. The point of the tool is surfacing blind spots that a single evaluation can't see.

## Three-phase processing

### Phase 1: Independent evaluation (parallel)
Each agent (CEO / CFO / CTO) evaluates the agenda without seeing the others. Output fields:
- stance: "support" / "oppose" / "conditional support" / "defer"
- summary: 2–3 line take
- key_points: 3 main points
- risks: 2 concerns
- quantitative_notes: numbers (CFO must include, others optional)

### Phase 2: Cross-review (parallel)
Each agent reads the other two Phase 1 outputs and responds. Four fields per agent:
- agreements: 1–2 points they concur with
- blind_spots: 2–3 things the others missed — this is where the tool earns its keep
- counter_arguments: 0–2 dissenting points
- questions: 1–2 follow-up questions or info requests

### Phase 3: Synthesis (serial)
A Facilitator agent reads everything from Phase 1 and Phase 2 and emits:
- consensus: 2–4 agreed points
- conflicts: disagreements + a recommended resolution path
- actions: 3 executable actions (title / owner / deadline / success_metric)
- residual_risks: remaining risks + how to monitor them
- decision_recommendation: "GO" / "NO-GO" / "conditional GO" + reason
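
For reference, the output fields of the three phases could be pinned down roughly as follows. This is a stdlib-only sketch: dataclasses stand in for the Pydantic v2 models requested under Technical requirements, and the class names (`AgentEval`, `CrossReview`, `Action`, `Synthesis`), the separate `reason` field, and the `check_eval` helper are illustrative assumptions rather than part of the spec.

```python
from dataclasses import dataclass
from typing import Literal, Optional, get_args

Stance = Literal["support", "oppose", "conditional support", "defer"]
Decision = Literal["GO", "NO-GO", "conditional GO"]


@dataclass
class AgentEval:
    """Phase 1 output, one per agent."""
    stance: str
    summary: str
    key_points: list[str]                     # 3 main points
    risks: list[str]                          # 2 concerns
    quantitative_notes: Optional[str] = None  # required for the CFO only


@dataclass
class CrossReview:
    """Phase 2 output, one per agent."""
    agreements: list[str]
    blind_spots: list[str]
    counter_arguments: list[str]
    questions: list[str]


@dataclass
class Action:
    title: str
    owner: str
    deadline: str
    success_metric: str


@dataclass
class Synthesis:
    """Phase 3 output from the Facilitator."""
    consensus: list[str]
    conflicts: list[str]
    actions: list[Action]
    residual_risks: list[str]
    decision_recommendation: str  # one of the Decision values
    reason: str


def check_eval(role: str, ev: AgentEval) -> list[str]:
    """Return the Phase 1 rule violations; an empty list means the eval is valid."""
    problems = []
    if ev.stance not in get_args(Stance):
        problems.append(f"unknown stance: {ev.stance!r}")
    if len(ev.key_points) != 3:
        problems.append("key_points must have 3 items")
    if len(ev.risks) != 2:
        problems.append("risks must have 2 items")
    if role == "CFO" and not ev.quantitative_notes:
        problems.append("CFO must include quantitative_notes")
    return problems
```

With Pydantic v2 the same constraints would normally live in the model definitions themselves; the explicit checker here only makes the rules visible.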

## Agent specs

- CEO: "You are MIXI's CEO. Focus: (1) alignment with the 3-year vision, (2) competitive edge, (3) mid-to-long-term shareholder value. Put executive intuition into words. JSON output required." Key: OPENAI_API_KEY_CEO
- CFO: "You are MIXI's CFO. Focus: (1) ROI/IRR, (2) quantified risk, (3) financial discipline. No optimism bias — always cover the downside. JSON output required." Key: OPENAI_API_KEY_CFO
- CTO: "You are MIXI's CTO. Focus: (1) technical feasibility, (2) scalability, (3) engineering load. Picture the system six months in. Be honest about tech debt. JSON output required." Key: OPENAI_API_KEY_CTO
- Facilitator: "Strategy-meeting facilitator. Consolidate the 3 evaluations + cross-reviews into one recommendation. Don't force consensus — surface the disagreement clearly. JSON output required." Key: OPENAI_API_KEY (may reuse any of the above)

## Technical requirements

Backend: Python 3.11+ / FastAPI (async) / OpenAI SDK AsyncOpenAI / Pydantic v2 / python-dotenv
- Phase 1 and Phase 2 run with asyncio.gather (parallel)
- OpenAI calls use response_format={"type":"json_object"} for structured output
- Rate-limit retry: exponential backoff, max 3 (1s / 2s / 4s)
- Timeout: 30s per phase
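
The parallel, retry, and timeout rules above might combine along these lines. It is a sketch under assumptions: `RateLimited` is a hypothetical stand-in for the SDK's rate-limit exception, `with_backoff` and `run_phase` are illustrative names, and "max 3 (1s / 2s / 4s)" is read as up to three retries after the first attempt.

```python
import asyncio


class RateLimited(Exception):
    """Hypothetical stand-in for the SDK's rate-limit error."""


async def with_backoff(call, retries=3, base_delay=1.0, sleep=asyncio.sleep):
    """Await call(); on RateLimited wait 1s, 2s, 4s between attempts, then give up."""
    for attempt in range(retries + 1):
        try:
            return await call()
        except RateLimited:
            if attempt == retries:
                raise  # retries exhausted, let the caller handle it
            await sleep(base_delay * 2 ** attempt)


async def run_phase(calls, timeout=30.0, sleep=asyncio.sleep):
    """Run every agent call in parallel under a single phase-level timeout."""
    tasks = [with_backoff(call, sleep=sleep) for call in calls]
    # wait_for cancels the whole gather if the phase exceeds the timeout.
    return await asyncio.wait_for(asyncio.gather(*tasks), timeout=timeout)
```

Passing `sleep` in as a parameter keeps the backoff schedule testable without real waiting.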

Frontend: single static/index.html / vanilla JS / SSE for streaming / marked.js from CDN for Markdown rendering
- UI updates per phase/agent completion (progress bar + card light-up)
- "Copy to clipboard" so the output drops straight into meeting notes

API: POST /api/review { "agenda": "..." } → SSE stream
- event: phase1_start / agent_eval (×3) / phase1_done
- event: phase2_start / cross_review (×3) / phase2_done
- event: phase3_start / synthesis / phase3_done
- event: complete / error
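
As an illustration of the stream format, each event could be serialized as a standard SSE frame (an `event:` line, a `data:` line, then a blank line). The helper names `sse_frame` and `expected_events` and the JSON payload shape are assumptions for this sketch, not part of the spec.

```python
import json


def sse_frame(event: str, data: dict) -> str:
    """Format one Server-Sent Events frame: event line, data line, blank line."""
    payload = json.dumps(data, ensure_ascii=False)  # stays on one line
    return f"event: {event}\ndata: {payload}\n\n"


def expected_events(agents=("CEO", "CFO", "CTO")) -> list[str]:
    """Event names for one fully successful review, in emission order."""
    names = []
    for phase, per_agent in ((1, "agent_eval"), (2, "cross_review")):
        names.append(f"phase{phase}_start")
        names.extend(per_agent for _ in agents)  # one event per agent
        names.append(f"phase{phase}_done")
    names += ["phase3_start", "synthesis", "phase3_done", "complete"]
    return names
```

A successful run therefore emits 14 events; within a parallel phase the per-agent events arrive in completion order, so the frontend should key on the agent name in the payload rather than on position.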

Docker: docker compose up -d to start / localhost:8000 serves both frontend and API / base image python:3.11-slim

Logs: raw IO saved to logs/{timestamp}-{request_id}.json

## 🔑 API key handling: Web UI settings page (important)

- Don't ask the user to edit .env by hand (Windows UX priority)
- The browser's /settings page uses password inputs
- App atomically writes to .env and auto-checks .gitignore
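
One common way to get the atomic write is to write a temporary file in the same directory and then rename it over `.env`. The sketch below assumes that approach; `write_env_atomic` is an illustrative name, and the permission step also covers the chmod 600 rule listed under Security.

```python
import os
import tempfile
from pathlib import Path


def write_env_atomic(path: Path, values: dict[str, str]) -> None:
    """Write KEY=value lines to a temp file, then swap it into place."""
    body = "".join(f"{key}={value}\n" for key, value in values.items())
    # The temp file must sit in the same directory so the rename stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=".env.", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(body)
            f.flush()
            os.fsync(f.fileno())   # make sure the bytes are on disk before the swap
        os.chmod(tmp, 0o600)       # owner read/write only (limited effect on Windows)
        os.replace(tmp, path)      # readers see either the old file or the new one
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)         # never leave a half-written temp file behind
        raise
```

If `.env` is bind-mounted into the container as a single file, the rename may fail; mounting the containing directory instead is one way around that.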

### Settings endpoints
- GET /settings                      → settings page HTML
- GET /api/settings/status           → { all_keys_set, masked, gitignore_ok }
- POST /api/settings                 → validate (OpenAI test call) → write .env → apply immediately
- POST /api/settings/rotate          → replace one key
- GET /api/settings/gitignore-check  → return .gitignore state

### First-run redirect
- GET / checks status
- all_keys_set=false → auto-redirect to /settings?first_run=true
- All keys set → evaluation screen
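
The redirect decision itself is small enough to isolate as a pure function. In this sketch `REQUIRED_KEYS` mirrors the four key names from the Agent specs, while `all_keys_set` and `first_run_redirect` are illustrative names.

```python
from typing import Mapping, Optional

REQUIRED_KEYS = (
    "OPENAI_API_KEY",
    "OPENAI_API_KEY_CEO",
    "OPENAI_API_KEY_CFO",
    "OPENAI_API_KEY_CTO",
)


def all_keys_set(env: Mapping[str, str]) -> bool:
    """True only when every required key is present and non-blank."""
    return all(env.get(name, "").strip() for name in REQUIRED_KEYS)


def first_run_redirect(env: Mapping[str, str]) -> Optional[str]:
    """Return the redirect target for GET /, or None to show the evaluation screen."""
    if not all_keys_set(env):
        return "/settings?first_run=true"
    return None
```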

### Security
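- /settings endpoints only accept localhost (publish the port as 127.0.0.1:8000:8000 in docker-compose.yml so the app is reachable only from the host machine)
- Never log or render the raw key (show only the last 4 chars, with the rest masked)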
- Light regex check on OpenAI key format
- .env is chmod 600
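
The masking and format rules could look roughly like this. The pattern is an assumed shape (an `sk-` prefix followed by at least 20 URL-safe characters), not an official format, so it should stay a light pre-check ahead of the real validation call; `mask_key` and `looks_like_openai_key` are illustrative names.

```python
import re

# Assumed key shape, not an official spec: "sk-" plus 20 or more URL-safe characters.
KEY_PATTERN = re.compile(r"sk-[A-Za-z0-9_-]{20,}")


def looks_like_openai_key(key: str) -> bool:
    """Cheap format check to catch obvious paste mistakes before the test call."""
    return KEY_PATTERN.fullmatch(key.strip()) is not None


def mask_key(key: str, visible: int = 4) -> str:
    """Show only the last few characters; never reveal the key length."""
    key = key.strip()
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * 8 + key[-visible:]
```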

## File structure

ai-executive/
├── .env.example
├── .gitignore
├── README.md
├── docker-compose.yml
├── Dockerfile
├── requirements.txt
├── app/
│   ├── __init__.py
│   ├── main.py       # FastAPI + SSE + redirect logic
│   ├── agents.py     # 4 agents
│   ├── pipeline.py   # 3-phase orchestration
│   ├── settings.py   # Settings API + .env writer
│   ├── env_writer.py # atomic .env write + gitignore check
│   ├── models.py     # Pydantic schemas
│   └── prompts.py    # system prompts
└── static/
    ├── index.html    # agenda evaluation UI
    └── settings.html # API key settings page

## Constraints
- 16 files max (the full tree under File structure) / 700 lines of code max
- Runtime: typically 40–70s end to end (hard ceiling 90s total, i.e. the 30s timeout × 3 phases)
  - Phase 1 parallel: 8–20s / Phase 2 parallel: 10–25s / Phase 3: 5–15s
- Cost (7 OpenAI calls):
  - gpt-4o-mini: $0.01–0.02/agenda (default)
  - gpt-4o: $0.25–0.50/agenda
- Model switch: DEFAULT_MODEL env var + /settings dropdown
- First-run setup (docker compose up → /settings → evaluation) must fit in 60 seconds

## Phase 2 partial-failure handling (required)
If 1–2 agents fail:
1. Retry once
2. If still failing, record {"status":"unavailable"}
3. Facilitator consolidates only available outputs, flags the gap in residual_risks
4. Mark decision_recommendation confidence as "weak"
5. UI grays out the missing card, shows a retry button
Never produce a GO/NO-GO from 2 out of 3 voices without flagging it. Gaps must be visible.
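
The steps above can be sketched as follows. The function names, the result dictionary shape, and the "normal" confidence label are assumptions for illustration; only `{"status":"unavailable"}` and the "weak" label come from the spec.

```python
import asyncio


async def call_with_one_retry(call):
    """Try once, retry once, then report the agent as unavailable."""
    for _ in range(2):
        try:
            return {"status": "ok", "output": await call()}
        except Exception:
            continue  # second pass is the single retry
    return {"status": "unavailable"}


async def cross_review_phase(calls: dict):
    """calls maps an agent name to a zero-argument coroutine function."""
    names = list(calls)
    results = await asyncio.gather(*(call_with_one_retry(calls[n]) for n in names))
    by_agent = dict(zip(names, results))
    missing = [n for n, r in by_agent.items() if r["status"] == "unavailable"]
    return {
        "reviews": by_agent,
        "missing": missing,  # the Facilitator flags these in residual_risks
        "confidence": "weak" if missing else "normal",  # "normal" is an assumed label
    }
```

Because each call catches its own failures, one failing agent never cancels the other two, and the `missing` list gives both the Facilitator and the UI an explicit record of the gap.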

## What I want in Plan mode first

1. File list + responsibility per file
2. Pseudocode for each phase (≤ 30 lines)
3. API endpoint spec (including full SSE event list)
4. Error-handling approach
5. Skeleton Pydantic models
6. External deps (requirements.txt equivalent)
7. Startup smoke-test steps

After I approve, generate all files. Don't place API keys yet — just create .env.example and I'll fill in keys through the Settings UI.

P1-02 API keys through the Web UI (6 principles)

Start the app with docker compose up -d. Open http://localhost:8000 in a browser.

Because this is a first run, the app should auto-redirect to /settings?first_run=true. I'll paste the four OpenAI keys I issued this morning (OPENAI_API_KEY / _CEO / _CFO / _CTO) into the four password fields by hand.

When I click "Validate & Save" the app should:
1. Test-call OpenAI with each key to verify it's valid
2. If all four pass, atomically write them to .env
3. Check that .env is in .gitignore, and add it if not
4. Reflect keys into the in-memory cache (no restart needed)
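
Step 3 could be handled by a small helper along these lines. It is a sketch: `ensure_env_ignored` is an illustrative name, and the check only looks for a literal `.env` or `/.env` line, so wildcard or negated patterns are not interpreted. `git check-ignore` remains the authoritative test.

```python
from pathlib import Path


def ensure_env_ignored(gitignore: Path, entry: str = ".env") -> bool:
    """Return True if the entry was already listed, False if it had to be added."""
    if gitignore.exists():
        lines = gitignore.read_text(encoding="utf-8").splitlines()
    else:
        lines = []
    if any(line.strip() in (entry, f"/{entry}") for line in lines):
        return True
    lines.append(entry)
    # Rewrite with a trailing newline so the new entry never fuses with the last line.
    gitignore.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return False
```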

After a successful save, run git check-ignore -v .env in the terminal to prove .env is being ignored.

I want to show the executives that key values never touch chat and never appear in the screen recording. Make the password-input + validation flow work end to end so we can demo it cleanly.

P1-03 Run an agenda, watch the stream

After saving Settings, the app should return me to /. In the agenda text area, paste:

Should MIXI invest ¥300M in new game X?

and hit "Evaluate agenda". You'll see SSE stream Phase 1 → 2 → 3 over about 40–45 seconds:
- Phase 1: three agent cards light up one by one
- Phase 2: cross-review — watch the blind_spots section
- Phase 3: Facilitator's consolidated report, ending in GO / NO-GO / conditional GO

If something errors out, diagnose and fix it yourself (the SSE async plumbing, the Settings redirect logic, the AsyncOpenAI client, whatever it is). Then verify that "Copy to clipboard" captures the full Markdown report so it can be pasted straight into the meeting minutes.

P1-04 Add a feature in plain English

Add a new section at the end of the report called "Three concrete actions for next quarter". The section reads all three exec evaluations and outputs one consolidated list of three actions.

After implementing, verify hot reload picks it up on browser refresh. Walk me through the change via Visual Diff and let me Accept it.

P2-01 One-pager

Generate a one-pager for a new MIXI initiative called "AI Executive Talent Academy".
Include: problem / solution / target / differentiation / monetization / 12-month roadmap / team.
Format for A4 portrait print. Use MIXI's warm brand palette.

P2-02 Monthly review, 5 slides

Five slides for a monthly performance review.
1: Executive summary — 3 KPIs
2: Segment performance — table + small chart
3: Three highlights of the month
4: Concerns and countermeasures
5: Focus for next month
Aim for a draft in 3 minutes. Assume an exec will tweak it before the morning meeting.

P3-01 Cowork PDF summary

Read the PDF at ~/Downloads/mixi-ir.pdf and write a 3-line summary to ~/Desktop/summary.md. Just the essentials, quote any numbers that matter, exec-level granularity.

Fallback

When things break:

Wi-Fi totally down: switch to iPhone tethering (needs 3 Mbps+). If tethering also fails, hit [B] for the pre-recorded demo, tell the room "we're going offline for this part" without over-apologizing, and run the rest on video through to the Cowork demo in Part 3. Neither fallback needs a network: [A] is the local backup copy of ai-executive (use it if the app won't start), and [B] is a local video file.

Session Notes

6 Principles — Quick Ref

Never paste in chat (→ password input)
Go through .env (→ app writes it)
Check .gitignore (→ auto on Save)
Rotate on a schedule (→ /settings button)
Least privilege (→ at OpenAI issuance)
Split prod / dev keys (→ revoke today's key after session)

URLs

localhost:8000 → main (evaluation)
localhost:8000/settings → key setup
First run: auto-redirect

Offline fallback

Wi-Fi flaky → iPhone tethering (3 Mbps+)
Totally down → [B] pre-recorded video
ai-executive won't start → [A] copy backup