Karamelmar 7c9b61bfce Fix end-to-end startup: project registration, credentials, trust dialog, ready marker

- start.sh: auto-register project in ~/.config/context-studio/projects/ before
  launching Electron — without this acquireProjectLock() silently skips writing
  the lock file, waitForServers() never finds the registry port, all agent ports
  stay null (localhost:null errors)

- start.sh: mount all known Claude Code credential locations into container
  (~/.claude/.credentials.json, ~/.claude.json, $CLAUDE_CONFIG_DIR variants)
  not just ~/.anthropic which was empty on this system

- bin/claude: create /tmp/cs-ready-<agentId> on host after 3s delay so CS Core's
  CLI ready marker poll resolves instead of timing out after 10s

- workflow.sh: add hasTrustDialogAccepted:true to all agent settings.json so
  claude goes straight to priming without the folder trust dialog

- prereqs.sh: add ensure_api_key() — checks all credential locations, prompts
  with masked input if none found, offers to save to shell profile

- wizard.sh: trap SIGINT for graceful abort — gum confirm popup, reverts created
  project dir and cloned core dir, leaves installed packages untouched

- core.sh: set _WIZARD_CORE_CLONED=true before clone for cleanup tracking

- electron-config.js: increase serverStartupTimeout 30s→90s (config file in
  core/config/, not source — safe to edit)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-03-09 21:20:25 +01:00


Handover — Context Studio Wizard

Last updated: 2026-03-09

Current status

Fully working end-to-end. The wizard generates a project, ./start.sh starts the container, registers the project, launches the Electron UI, agents start cleanly, and kai's terminal opens and primes without any trust dialog or startup errors.

What was fixed this session (newest first)

1. Trust dialog bypass

File: lib/workflow.sh → agent .claude/settings.json
Symptom: Claude Code shows "Quick safety check — do you trust this folder?" on every start, blocking /prime injection.
Fix: Added "hasTrustDialogAccepted": true to every generated agent's .claude/settings.json.
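A minimal sketch of how a generator could emit that setting. The function name write_agent_settings is hypothetical, and the real lib/workflow.sh probably writes further keys next to this one; only hasTrustDialogAccepted comes from these notes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: writes a minimal .claude/settings.json for one agent.
# Only the hasTrustDialogAccepted key is taken from the handover notes.
write_agent_settings() {
  local agent_dir="$1"
  mkdir -p "$agent_dir/.claude"
  cat > "$agent_dir/.claude/settings.json" <<'JSON'
{
  "hasTrustDialogAccepted": true
}
JSON
}
```

Called once per agent directory, for example write_agent_settings "$PROJECT_DIR/workflow/agents/kai". Note that this sketch overwrites an existing settings.json rather than merging into it.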

2. CLI ready marker — container/host gap

File: lib/container.sh → generated bin/claude
Symptom: [CLI] Timeout waiting for CLI ready marker for kai (10000ms), i.e. a 10s delay before /prime.
Root cause: CS Core polls /tmp/cs-ready-<agentId> on the host. Claude runs inside the container, so it can't create this file on the host.
Fix: bin/claude detects when it's being invoked interactively from an agent PTY ($PWD == $PROJECT_DIR/workflow/agents/*), extracts the agent ID from basename "$PWD", and spawns a background job (sleep 3 && touch /tmp/cs-ready-<agentId>) before running podman exec.
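The detection and marker logic can be sketched roughly as follows. The function names and the delay and marker_dir parameters are introduced here for illustration only (they keep the sketch testable); the generated bin/claude may be structured differently.

```shell
#!/usr/bin/env bash
# Prints the agent ID if dir lies under <project_dir>/workflow/agents/,
# otherwise returns non-zero. The ID is the basename of dir, as in the notes.
agent_id_from_dir() {
  local dir="$1" project_dir="$2"
  case "$dir" in
    "$project_dir"/workflow/agents/?*) basename "$dir" ;;
    *) return 1 ;;
  esac
}

# Creates the ready marker in the background after a delay.
# The notes describe a 3 second delay and /tmp; both are defaults here.
spawn_ready_marker() {
  local agent_id="$1" delay="${2:-3}" marker_dir="${3:-/tmp}"
  ( sleep "$delay" && touch "$marker_dir/cs-ready-$agent_id" ) &
}

# Usage inside a wrapper, before handing off to podman exec:
#   if id="$(agent_id_from_dir "$PWD" "$PROJECT_DIR")"; then
#     spawn_ready_marker "$id"
#   fi
```

The fixed delay is a heuristic: the marker says "probably ready", not "confirmed ready", which is acceptable here because the alternative was a guaranteed 10s timeout.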

3. Project not registered → `localhost:null` for all agents

File: lib/container.sh → generated start.sh
Symptom: All agent SSE URLs are http://localhost:null/... → DOMException, no agent communication.
Root cause (deep):

waitForServers() in server-management.js polls runtimeConfig.findRuntimeByWorkflowDir()
That function maps workflowDir → project UUID → lock file at ~/.config/context-studio/projects/locks/<uuid>.json
acquireProjectLock() in launcher.js writes the lock file, but silently skips it if the project is not registered in ~/.config/context-studio/projects/<uuid>.json
Without the lock file, waitForServers() always times out → applyRuntimePorts() never called → all agent ports remain null

Fix: start.sh now auto-registers the project before launching Electron. It scans ~/.config/context-studio/projects/ for an existing entry matching $PROJECT_DIR/workflow, and if none is found, writes a new <uuid>.json registration file using python3.
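The scan-then-register step might look like the sketch below. Two deliberate departures from the real start.sh: it uses plain shell instead of python3 to stay dependency-free, and the JSON field names "id" and "workflowDir" are placeholders, since the actual registration schema is not recorded in these notes. Check an existing <uuid>.json before relying on it.

```shell
#!/usr/bin/env bash
# Escapes backslashes and double quotes so a path can sit inside a JSON string.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Registers workflow_dir under projects_dir unless an entry already mentions it.
# Prints the path of the existing or newly written registration file.
# The optional third argument fixes the UUID (useful for testing).
register_project() {
  local projects_dir="$1" workflow_dir="$2" uuid="${3:-}"
  local escaped existing
  escaped="$(json_escape "$workflow_dir")"
  mkdir -p "$projects_dir"

  # Look for a top-level *.json entry containing the quoted workflow path.
  existing="$(grep -l -F -- "\"$escaped\"" "$projects_dir"/*.json 2>/dev/null | head -n 1)"
  if [ -n "$existing" ]; then
    printf '%s\n' "$existing"
    return 0
  fi

  # Linux kernel UUID source first, uuidgen as a fallback.
  if [ -z "$uuid" ]; then
    uuid="$(cat /proc/sys/kernel/random/uuid 2>/dev/null || uuidgen)"
  fi

  # ASSUMPTION: "id" and "workflowDir" are placeholder field names.
  printf '{\n  "id": "%s",\n  "workflowDir": "%s"\n}\n' "$uuid" "$escaped" \
    > "$projects_dir/$uuid.json"
  printf '%s\n' "$projects_dir/$uuid.json"
}
```

The function is idempotent: running it twice for the same workflow directory returns the first registration instead of creating a second one, which matters because start.sh runs on every launch.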

4. `serverStartupTimeout` is in `electron-config.js`, not `system.json`

File: ~/.context-studio/core/config/electron-config.js
Symptom: 30s startup timeout even when servers start in ~3s.
Root cause: The timeout is read from ctx.getElectronConfig()?.startup.serverStartupTimeout. This comes from electron-config.js in the core config dir, NOT from the workflow's system.json.
Fix: Changed serverStartupTimeout from 30000 to 90000 in electron-config.js.
Note: electron-config.js is a config file (not source), so editing it is appropriate.

5. Credential mounts — `~/.claude/.credentials.json` not mounted

File: lib/container.sh → generated start.sh
Symptom: Claude Code inside container can't authenticate to Anthropic.
Root cause: start.sh was only mounting ~/.anthropic (empty on this system). Actual credentials are at ~/.claude/.credentials.json (or $CLAUDE_CONFIG_DIR/.credentials.json, ~/.claude.json, $CLAUDE_CONFIG_DIR/.claude.json).
Fix: start.sh now builds _CREDS_ARGS array and conditionally mounts whichever credential files exist on the host. All known Claude Code credential locations are checked.
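A sketch of the conditional-mount idea, under one loud assumption: each file is mounted at the same absolute path inside the container. The real container-side target paths are not given in these notes, and the function name build_creds_args is hypothetical; only the _CREDS_ARGS array name and the candidate locations come from the text above.

```shell
#!/usr/bin/env bash
# Fills the global array _CREDS_ARGS with one read-only -v mount per
# credential file that exists on the host.
# ASSUMPTION: container path == host path; adjust to the real layout.
build_creds_args() {
  local host_home="$1" config_dir="${2:-}"
  local candidates=(
    "$host_home/.claude/.credentials.json"
    "$host_home/.claude.json"
  )
  if [ -n "$config_dir" ]; then
    candidates+=("$config_dir/.credentials.json" "$config_dir/.claude.json")
  fi

  _CREDS_ARGS=()
  local src
  for src in "${candidates[@]}"; do
    # Skip anything that is missing, so podman never sees a bad mount.
    [ -f "$src" ] || continue
    _CREDS_ARGS+=(-v "$src:$src:ro")
  done
}

# Usage:
#   build_creds_args "$HOME" "${CLAUDE_CONFIG_DIR:-}"
#   podman run -d "${_CREDS_ARGS[@]}" ...
```

Building an array rather than a string keeps paths with spaces intact when they are expanded into the podman command line.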

6. API key check in wizard

File: lib/prereqs.sh → ensure_api_key()
Symptom: Agents fail if ANTHROPIC_API_KEY not set and no credentials file mounted.
Fix: Added ensure_api_key() called from check_prerequisites(). Checks in order: ANTHROPIC_API_KEY env var → $CLAUDE_CONFIG_DIR/.credentials.json → ~/.claude/.credentials.json → ~/.claude.json → $CLAUDE_CONFIG_DIR/.claude.json → ~/.anthropic/.credentials.json. If none found, prompts for API key with masked input and offers to save to shell profile.
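The lookup order can be sketched as a small function. The name find_credential_source and its return format are made up for this illustration; ensure_api_key() itself may be organised differently, and the prompt-and-save step is only hinted at in the trailing comment.

```shell
#!/usr/bin/env bash
# Prints the first credential source found, following the order in the notes,
# or returns non-zero if there is none.
find_credential_source() {
  local home_dir="$1" config_dir="${2:-}"
  if [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    printf 'env:ANTHROPIC_API_KEY\n'
    return 0
  fi

  local candidates=()
  if [ -n "$config_dir" ]; then
    candidates+=("$config_dir/.credentials.json")
  fi
  candidates+=("$home_dir/.claude/.credentials.json" "$home_dir/.claude.json")
  if [ -n "$config_dir" ]; then
    candidates+=("$config_dir/.claude.json")
  fi
  candidates+=("$home_dir/.anthropic/.credentials.json")

  local f
  for f in "${candidates[@]}"; do
    if [ -f "$f" ]; then
      printf '%s\n' "$f"
      return 0
    fi
  done
  return 1
}

# Usage, with a masked prompt as the fallback:
#   if ! find_credential_source "$HOME" "${CLAUDE_CONFIG_DIR:-}" >/dev/null; then
#     read -rsp "Anthropic API key: " ANTHROPIC_API_KEY; echo
#   fi
```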

7. Ctrl+C graceful abort with cleanup

File: wizard.sh
Fix: trap 'handle_sigint' INT in main(). On Ctrl+C: shows gum confirm popup. If confirmed: removes $PROJECT_DIR (if created) and $CS_CORE_DIR (if cloned this session).
State flags: _WIZARD_PROJECT_CREATED and _WIZARD_CORE_CLONED (set at moment of action).
Installed packages (git, podman) are never reverted.
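A rough sketch of the trap-and-flags pattern described above. The flag names, handle_sigint, and the use of gum confirm come from these notes; cleanup_wizard_artifacts, the prompt text, and exit code 130 are assumptions made for the example.

```shell
#!/usr/bin/env bash
_WIZARD_PROJECT_CREATED=false
_WIZARD_CORE_CLONED=false

# Removes only what this wizard run created; installed packages stay.
cleanup_wizard_artifacts() {
  if [ "$_WIZARD_PROJECT_CREATED" = true ] && [ -n "${PROJECT_DIR:-}" ]; then
    rm -rf -- "$PROJECT_DIR"
  fi
  if [ "$_WIZARD_CORE_CLONED" = true ] && [ -n "${CS_CORE_DIR:-}" ]; then
    rm -rf -- "$CS_CORE_DIR"
  fi
}

handle_sigint() {
  # gum confirm exits 0 when the user confirms, non-zero otherwise.
  if gum confirm "Abort the wizard and remove created files?"; then
    cleanup_wizard_artifacts
    exit 130  # conventional exit status for termination by SIGINT
  fi
  # Declined: fall through and let the wizard continue.
}

main() {
  trap 'handle_sigint' INT
  # ... wizard steps; each step sets its flag at the moment it acts ...
}
```

Guarding each rm -rf with both a flag and a non-empty path check means a Ctrl+C early in the run, before anything was created, cannot delete a pre-existing directory.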

Key architecture facts

Lock file mechanism (critical for startup)

Lock file: ~/.config/context-studio/projects/locks/<uuid>.json
Project registration: ~/.config/context-studio/projects/<uuid>.json
start.sh auto-registers the project before launching Electron
Without registration, acquireProjectLock() silently skips writing the lock file
Without the lock file, all agent ports remain null → localhost:null errors

How CS Core + Electron work together

start.sh starts the container, registers the project, then launches Electron
Electron's server-management.js spawns node core/start.js --ui-mode=headless
That process starts all A2A agent servers on the host (not in container)
Servers register with the registry (port 8000), write the lock file
waitForServers() polls until lock file appears + health check passes
applyRuntimePorts() is called → agent ports loaded from lock file
CLI ready marker (/tmp/cs-ready-<agentId>) created by bin/claude after 3s delay
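When debugging this sequence from a terminal, a small polling helper can confirm whether the lock file ever appears. This is a host-side diagnostic sketch, not CS Core's waitForServers() implementation; the function name and the one-second interval are assumptions, and it checks only for the file, not the health endpoint.

```shell
#!/usr/bin/env bash
# Waits until the lock file for a project UUID exists, checking once per
# second. Returns 0 when found, 1 after timeout_s seconds without it.
wait_for_lock_file() {
  local uuid="$1" timeout_s="${2:-90}"
  local locks_dir="${3:-$HOME/.config/context-studio/projects/locks}"
  local waited=0
  while [ ! -f "$locks_dir/$uuid.json" ]; do
    if [ "$waited" -ge "$timeout_s" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Usage:
#   wait_for_lock_file "<uuid>" 90 || echo "no lock file: is the project registered?"
```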

Credential lookup order (container mounts)

ANTHROPIC_API_KEY env var (passed via -e)
~/.anthropic/ (mounted read-only, always)
$CLAUDE_CONFIG_DIR/.credentials.json or ~/.claude/.credentials.json (mounted if exists)
~/.claude.json (mounted if exists)
$CLAUDE_CONFIG_DIR/.claude.json (mounted if exists)

What runs where

Container: Claude Code binary only (sleep infinity + podman exec)
Host: Electron UI, CS Core, all A2A agent server processes (node)
bin/claude: bridges host agent calls → container claude
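The bridge in bin/claude can be pictured as a function that assembles a podman exec command line. This is a sketch under stated assumptions: the container name is a placeholder, the project directory is taken to be mounted at the same path inside the container, and the fallback rule (use the project root when the caller's directory is outside the project) is a guess at what "workdir fallback" means.

```shell
#!/usr/bin/env bash
# Builds the podman exec command line into the global array _EXEC_CMD.
# Arguments: container name, project dir, caller's cwd, then claude's args.
build_exec_cmd() {
  local container_name="$1" project_dir="$2" cwd="$3"
  shift 3

  # ASSUMED fallback: directories outside the project are not visible in
  # the container, so use the project root instead.
  local workdir="$project_dir"
  case "$cwd" in
    "$project_dir"|"$project_dir"/*) workdir="$cwd" ;;
  esac

  _EXEC_CMD=(podman exec -i)
  if [ -t 0 ]; then
    _EXEC_CMD+=(-t)  # allocate a TTY only when stdin is a terminal
  fi
  _EXEC_CMD+=(--workdir "$workdir" "$container_name" claude "$@")
}

# Usage at the end of a wrapper script:
#   build_exec_cmd "$CONTAINER_NAME" "$PROJECT_DIR" "$PWD" "$@"
#   exec "${_EXEC_CMD[@]}"
```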

Generated files per project

bin/claude — wrapper: workdir fallback, credential routing, ready marker creation
start.sh — starts container, mounts credentials, registers project, launches Electron
stop.sh — force-removes container
update.sh — git pull core, npm update claude-code in container, apt upgrade

File locations

Wizard: /home/elmar/Projects/ContextStudioWizard/
Core config (editable): /home/elmar/.context-studio/core/config/electron-config.js
Core (read-only source): /home/elmar/.context-studio/core/ (never modify source, never push)
CS settings: ~/.config/context-studio/projects/
Lock files: ~/.config/context-studio/projects/locks/

What NOT to do

Never modify ~/.context-studio/core/ source files — read-only
electron-config.js in core/config/ is an exception — it is a config file, safe to edit
Never commit or push to the core repo
