
Collaborative Machines — Multi-Agent Reasoning Platform

Evolutionary multi-agent systems using first-principles decomposition, polyglot memory, and human-in-the-loop governance

26 Capabilities
0 Fully Tested
145 Use Cases

Overview

Collaborative Machines is an evolutionary multi-agent reasoning platform that enables teams of AI agents to solve complex, high-dimensional problems through first-principles decomposition, polyglot memory, and human-in-the-loop governance. The platform reached **v1.0.0 GA on 2026-03-19** and now delivers **26 capabilities** across 6 domains, plus a new **operator console backend**:

Core reasoning: a 4-stage first-principles engine (Clarify→Strip→Rebuild→Critique pipeline) and **32 canonical agent roles** with 8-stage bootstrap and post-bootstrap validation.

Multi-agent coordination: team synthesis across 10 archetypes and MetaOrganization coordination with abstraction-graph routing.

Polyglot persistence: pgvector episodic memory, Neo4j knowledge graph, and MongoDB executive state, with graceful degradation.

Intelligence and governance: OODA loop scheduling with bounded queue + circuit breaker, 5-gate HITL approval with persistent MongoDB store and SLA scheduler, the full DNA Evolution pipeline with the `promote_dna.py` CLI, perspective-driven self-assessment using real DB queries, and 4-domain simulation environments with counterfactual replay.

Skills and integration: 35+ typed skills with policy enforcement, a LangChain/LangGraph bridge, and an MCP server with lifecycle CI tests.

Operational excellence: circuit breaker resilience, injection defense, OpenTelemetry observability, and Prometheus/Grafana dashboards.

Operator console backend: decision traces, learning events, and WebSocket live updates with authenticated pub/sub.

Functional behavior is verified by **270+ Python integration tests (~945 test functions across 90 files)**; end-to-end browser-driven verification covers the operator console's auth, health, HITL, task lifecycle, and cross-endpoint smoke flows. Playwright coverage of additional capabilities is the next milestone.

Why Collaborative Machines

📊

Executive View

Collaborative Machines delivers a structured hierarchy where Agents form Teams and Teams form Organizations — each agent specializing in a domain (cost engineering, threat analysis, compliance auditing, systems engineering) and synthesizing perspectives through first-principles reasoning. Unlike single-purpose integrations, the platform reasons across domains, incorporates human judgment at critical decision points through a 5-gate HITL model, and produces an auditable trace showing assumptions stripped, primitives identified, alternatives evaluated, and confidence scored. Human-in-the-loop governance gates ensure humans approve high-stakes decisions (new abstractions, cross-org trade-offs, safety determinations) while allowing agents to operate autonomously on routine tasks.…

⚙️

Technical Architecture

The platform runs Python 3.12 + FastAPI with polyglot persistence across three stores: PostgreSQL + pgvector (HNSW) for episodic memory and semantic search, Neo4j 5.26 for knowledge graph traversal and goal hierarchies, and MongoDB 7.0 for executive state, decision traces, and HITL workflows. The first-principles engine implements a 4-stage state machine (Clarify, Strip, Rebuild, Critique) with configurable iteration limits and confidence thresholds. Agent identity is defined in YAML DNA files with primitive taxonomies, skill packs, and memory configurations — 31 canonical roles ship out of the box.…

👤

User Experience

Operations teams submit tasks through the REST API or integrate via LangChain/LangGraph pipelines. The system decomposes problems using first-principles reasoning — stripping assumptions and conventions to identify fundamental primitives, then rebuilding solutions from ground truth. When multiple perspectives are needed, teams of specialized agents (threat analyst + cost engineer + compliance auditor) each reason independently before a synthesist agent merges their outputs, flagging disagreements for human review.…

Available Now — 26 Capabilities

First-Principles Engine (4-Stage Pipeline)

🟩 Delivered

CAP-CM-001 · 3 use cases

4-stage pipeline (Clarify → Strip → Rebuild → Critique) with iteration control, confidence scoring, and primitive taxonomy. Invoked by LivingAgent runtime.

Agent DNA Loading & Bootstrap

🟩 Delivered

CAP-CM-002 · 8 use cases

31 canonical agent roles in dna/agents/. 8-stage bootstrap sequence with strict-mode YAML validation. Integration-tested via RA-01–RA-04. Runtime introspection exposed via /api/v1/admin/agents (Pydantic-validated list of bootstrapped agents) and /api/v1/admin/bootstrap-report (Stage 8 team/org coverage and validation pass status).

Living Agent Runtime

🟩 Delivered

CAP-CM-003 · 19 use cases

RECEIVE → RETRIEVE → ASSEMBLE → RUN FP → VALIDATE → PERSIST → RETURN execution flow. Memory retrieval with graceful degradation on store failure. Restart control plane: POST /api/v1/agents/{agent_id}/restart reloads DNA from disk and rebuilds FP engine + skill executor in place.

Team Coordination & Synthesis

🟩 Delivered

CAP-CM-004 · 3 use cases

12 team archetypes in dna/teams/. Multi-agent dispatch, conflict detection, result aggregation, pattern mining. RA-01–RA-04 integration tested.

Organization Coordination

🟩 Delivered

CAP-CM-005 · 17 use cases

Cross-team routing, abstraction governance, domain synthesis.

Triple-Store Memory (pgvector + Neo4j + MongoDB)

🟩 Delivered

CAP-CM-006 · 3 use cases

Polyglot persistence: pgvector HNSW for episodic memory, Neo4j for knowledge graph and goal hierarchies, MongoDB for executive state and decision traces. Graceful degradation on single-store failure.
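
Graceful degradation on single-store failure can be sketched as a retrieval step that catches a fault in one store and continues with the others. This is a minimal sketch under stated assumptions: `retrieve_context`, the store names, and the callable-per-store shape are hypothetical, and the real stores would be database clients rather than lambdas.

```python
import logging
from typing import Any, Callable

logger = logging.getLogger("cm.memory.sketch")


def retrieve_context(
    query: str,
    stores: dict[str, Callable[[str], list[Any]]],
) -> tuple[dict[str, list[Any]], list[str]]:
    """Query every store; a failing store yields an empty result, not an error."""
    context: dict[str, list[Any]] = {}
    degraded: list[str] = []
    for name, fetch in stores.items():
        try:
            context[name] = fetch(query)
        except Exception as exc:  # broad on purpose: any store fault degrades
            logger.warning("store %s unavailable: %s", name, exc)
            context[name] = []
            degraded.append(name)
    return context, degraded


def _broken_graph(query: str) -> list[Any]:
    """Stand-in for a knowledge-graph client whose connection is down."""
    raise ConnectionError("graph store offline")


context, degraded = retrieve_context(
    "supplier risk",
    {
        "episodic": lambda q: [f"episode about {q}"],
        "graph": _broken_graph,
        "executive": lambda q: [{"state": "active"}],
    },
)
```

Returning the list of degraded stores alongside the context lets the caller record in its trace that the answer was produced with partial memory.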

Task Persistence & Orphan Recovery

🟩 Delivered

CAP-CM-007 · 3 use cases

MongoDB-backed task persistence with compound indexes, soft-delete, orphan sweep.
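
One way to picture an orphan sweep is shown below. The sketch assumes an orphan is a task still marked running whose worker has stopped sending heartbeats; that definition, the field names (`status`, `heartbeat_at`, `deleted_at`), and the five-minute default are hypothetical, and plain dictionaries stand in for MongoDB documents.

```python
from datetime import datetime, timedelta, timezone
from typing import Any


def sweep_orphans(
    tasks: list[dict[str, Any]],
    now: datetime,
    stale_after: timedelta = timedelta(minutes=5),
) -> list[str]:
    """Re-queue running tasks whose heartbeat is older than stale_after."""
    recovered = []
    for task in tasks:
        if task.get("deleted_at") is not None:
            continue  # soft-deleted records are kept but never revived
        stale = now - task["heartbeat_at"] > stale_after
        if task["status"] == "running" and stale:
            task["status"] = "queued"
            recovered.append(task["id"])
    return recovered


now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
tasks = [
    {"id": "t1", "status": "running",
     "heartbeat_at": now - timedelta(minutes=30), "deleted_at": None},
    {"id": "t2", "status": "running",
     "heartbeat_at": now - timedelta(seconds=10), "deleted_at": None},
    {"id": "t3", "status": "running",
     "heartbeat_at": now - timedelta(hours=2), "deleted_at": now},
]
recovered = sweep_orphans(tasks, now)
```

Only `t1` is recovered here: `t2` still has a fresh heartbeat, and `t3` is soft-deleted, so the sweep leaves it alone.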

Goal & Plan Registry

🟩 Delivered

CAP-CM-008 · 3 use cases

StrategicGoal, TacticalGoal, Plan lifecycle with cascade abandonment and variance tracking. MongoDB + Neo4j hierarchy. UI lives in cm_web_ui/src/pages/GoalsPlans.tsx (sibling project).

OODA Loop Scheduler

🟩 Delivered

CAP-CM-009 · 10 use cases

Per-agent async observe/orient/decide/act cycle with config-driven cadence, preemption/interrupt queue, simulation-gated decide phase.
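
The cycle can be sketched with `asyncio` as follows. This is an illustration under stated assumptions, not the scheduler itself: `ooda_loop`, the `simulate` gate, the string-based observations, and the fixed cycle count are hypothetical simplifications.

```python
import asyncio
from typing import Callable


async def ooda_loop(
    interrupts: asyncio.Queue,
    simulate: Callable[[str], bool],
    cadence_s: float,
    max_cycles: int,
) -> list[str]:
    """Run a fixed number of cycles and return the actions taken."""
    actions: list[str] = []
    for cycle in range(max_cycles):
        # Observe: a pending interrupt preempts the routine observation.
        try:
            observation = interrupts.get_nowait()
        except asyncio.QueueEmpty:
            observation = f"routine-{cycle}"
        # Orient: normalise the observation into a situation label.
        situation = observation.upper()
        # Decide: the candidate action must pass the simulation gate.
        candidate = f"handle:{situation}"
        decision = candidate if simulate(candidate) else f"defer:{situation}"
        # Act: record the decision, then wait for the next tick.
        actions.append(decision)
        await asyncio.sleep(cadence_s)
    return actions


async def demo() -> list[str]:
    interrupts: asyncio.Queue = asyncio.Queue(maxsize=8)  # bounded queue
    interrupts.put_nowait("alert")
    return await ooda_loop(
        interrupts,
        simulate=lambda action: "ALERT" not in action,  # toy gate (assumption)
        cadence_s=0.0,
        max_cycles=2,
    )


actions = asyncio.run(demo())
```

In the demo the queued interrupt is picked up in the first cycle and deferred because the toy gate rejects it; the second cycle falls back to a routine observation and proceeds.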

HITL Governance (5-Gate Approval)

🟩 Delivered

CAP-CM-010 · 3 use cases

5 approval gates (G1-G5), 4 autonomy levels (Autonomous→Notify→Approve→Direct), SLA enforcement, confidence recalibration, MongoDB + in-memory stores.
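
The four autonomy levels and SLA enforcement lend themselves to a small sketch. The level names come from the description above; which levels pause for a person, and the helper names `blocks_on_human` and `sla_breached`, are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from enum import IntEnum


class Autonomy(IntEnum):
    """The four autonomy levels, ordered from least to most human control."""
    AUTONOMOUS = 0
    NOTIFY = 1
    APPROVE = 2
    DIRECT = 3


def blocks_on_human(level: Autonomy) -> bool:
    """Assumed semantics: APPROVE and DIRECT wait for a person; the rest proceed."""
    return level >= Autonomy.APPROVE


def sla_breached(requested_at: datetime, now: datetime, sla: timedelta) -> bool:
    """True when an approval request has waited longer than its SLA."""
    return now - requested_at > sla


requested = datetime(2026, 1, 1, 9, 0, tzinfo=timezone.utc)
late = sla_breached(requested, requested + timedelta(hours=5), timedelta(hours=4))
on_time = sla_breached(requested, requested + timedelta(hours=1), timedelta(hours=4))
```

Using an ordered enum keeps the gate check to a single comparison; a per-gate policy table would be the natural next step if the five gates (G1-G5) require different levels.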

DNA Evolution Pipeline

🟩 Delivered

CAP-CM-011 · 9 use cases

Trigger detection (performance degradation, repeated escalation, capability gap, self-review), simulation validation, trial execution with parallel versioning, rollback on failure.

Perspective Engine

🟩 Delivered

CAP-CM-012 · 3 use cases

4 scorers (efficiency, quality, learning rate, collaboration), composite awareness aggregation, self-review triggers.
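
Composite aggregation over the four scorers can be sketched as a weighted mean with a threshold that triggers self-review. The equal default weights, the 0.6 threshold, and the function names are assumptions; the source does not state how the platform combines the scores.

```python
from typing import Mapping, Optional

SCORERS = ("efficiency", "quality", "learning_rate", "collaboration")


def composite_awareness(
    scores: Mapping[str, float],
    weights: Optional[Mapping[str, float]] = None,
) -> float:
    """Weighted mean of the four scorer outputs, each expected in [0, 1]."""
    weights = weights or {name: 1.0 for name in SCORERS}
    total = sum(weights[name] for name in SCORERS)
    return sum(scores[name] * weights[name] for name in SCORERS) / total


def needs_self_review(scores: Mapping[str, float], threshold: float = 0.6) -> bool:
    """Flag the agent for self-review when the composite falls below threshold."""
    return composite_awareness(scores) < threshold


sample = {"efficiency": 0.8, "quality": 0.6, "learning_rate": 0.4, "collaboration": 0.2}
composite = composite_awareness(sample)
```

For the sample scores the equal-weight composite is about 0.5, which is below the assumed threshold and would trigger a self-review.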

Simulation Environment

🟩 Delivered

CAP-CM-013 · 17 use cases

4 domain environments (cost_schedule, cybersecurity_triage, rfx_proposal, infrastructure_incident). Episode storage with Qdrant/vector.

Skills Runtime

🟩 Delivered

CAP-CM-014 · 10 use cases

35+ typed skills with registry, resolver, executor, policy enforcer (deny-by-default), and audit trail. Jinja2 rendering for skill templates.
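
Deny-by-default enforcement means a skill runs only when an explicit grant exists, and every attempt is recorded. The following sketch shows the idea with a hypothetical `PolicyEnforcer` class; the grant structure, agent names, and audit tuple format are assumptions rather than the platform's actual types.

```python
from typing import Any, Callable

AuditEntry = tuple[str, str, str]  # (agent_id, skill_name, outcome)


class PolicyEnforcer:
    """Runs a skill only if the agent holds an explicit grant for it."""

    def __init__(self, grants: dict[str, set[str]]) -> None:
        self._grants = grants
        self.audit: list[AuditEntry] = []

    def execute(
        self,
        agent_id: str,
        skill_name: str,
        skill: Callable[..., Any],
        *args: Any,
    ) -> Any:
        # Unknown agents and unlisted skills both resolve to "denied".
        allowed = skill_name in self._grants.get(agent_id, set())
        self.audit.append((agent_id, skill_name, "allowed" if allowed else "denied"))
        if not allowed:
            raise PermissionError(f"{agent_id} may not run {skill_name}")
        return skill(*args)


enforcer = PolicyEnforcer({"cost-engineer": {"estimate"}})
total = enforcer.execute("cost-engineer", "estimate", sum, [10, 20])
```

The audit entry is written before the permission check raises, so denied attempts leave a trail as well as successful ones.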

LangChain/LangGraph Bridge

🟩 Delivered

CAP-CM-015 · 3 use cases

CMAgentTool for LangChain chains, cm_agent_node for LangGraph state machines, subgraph builder, HITL interrupt bridge (EscalationSignal → interrupt).

Knowledge Base Integration

🟩 Delivered

CAP-CM-016 · 3 use cases

KB retriever with semantic search, domain taxonomy, agent profile coupling.

MCP Server

🟩 Delivered

CAP-CM-017 · 2 use cases

SSE transport, task/KB/agent/system tools. FastMCP adapter. Tool implementations are mostly stubs.

Cross-Model Validation (Anthropic)

🟩 Delivered

CAP-CM-018 · 3 use cases

MultiModelValidator with revision loop and 4-dimension scoring (accuracy, safety, completeness, clarity). Anthropic as verifier.

REST API (Tasks, Goals, HITL, Admin)

🟩 Delivered

CAP-CM-019 · 3 use cases

FastAPI with auth (API key + JWT), rate limiting, CORS, observability middleware. Task CRUD, goal/plan CRUD, HITL queue, admin introspection, KB search.

WebSocket Live Updates

🟩 Delivered

CAP-CM-020 · 3 use cases

WS manager with room-based messaging, presence tracking, connection lifecycle.

Error Handling & Resilience

🟩 Delivered

CAP-CM-021 · 3 use cases

Circuit breaker (state machine), cascading recovery orchestration, graceful degradation on memory store failure, retry with exponential backoff + jitter.
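
Retry with exponential backoff and jitter can be sketched as below. The example uses the "full jitter" variant, where each delay is drawn uniformly between zero and an exponentially growing ceiling; the source does not say which variant the platform uses, and `retry`, `backoff_delay`, and the default values are hypothetical.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float, cap: float, rng: Callable[[], float]) -> float:
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * (2 ** attempt))


def retry(
    fn: Callable[[], T],
    attempts: int = 4,
    base: float = 0.5,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    """Call fn until it succeeds or the attempt budget is spent."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # budget spent: surface the last error
            sleep(backoff_delay(attempt, base, cap, rng))
    raise ValueError("attempts must be at least 1")


calls = {"n": 0}


def flaky() -> str:
    """Fails twice with a transient error, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return "ok"


slept: list[float] = []
outcome = retry(flaky, sleep=slept.append)  # record delays instead of sleeping
```

Injecting `sleep` and `rng` keeps the helper testable without real waiting; in production code the defaults would be used.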

Security & Access Control

🟩 Delivered

CAP-CM-022 · 3 use cases

JSONB parameterization (SQL injection defense), MongoDB regex escaping, classification enforcement (deny-by-default), scope isolation (agent/team/org), audit logging.
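
MongoDB regex escaping can be illustrated with Python's standard `re.escape`, which neutralises metacharacters in user input before it is placed in a `$regex` filter. This is a sketch, not the platform's code: `contains_filter` is a hypothetical helper, and it assumes that escaping produced for Python's regex syntax is also safe for MongoDB's Perl-compatible engine, which holds for the common metacharacters shown here.

```python
import re
from typing import Any


def contains_filter(field: str, user_text: str) -> dict[str, Any]:
    """Build a case-insensitive substring filter that treats input literally."""
    return {field: {"$regex": re.escape(user_text), "$options": "i"}}


# Without escaping, ".*" would match every document.
query = contains_filter("title", ".*")
pattern = query["title"]["$regex"]
```

The resulting pattern matches only titles that contain the literal characters `.*`, so a user cannot widen the search by typing regex syntax.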

Observability Stack

🟩 Delivered

CAP-CM-023 · 3 use cases

structlog structured logging, OpenTelemetry tracing, Prometheus custom metrics (FP_STAGE_*, AGENT_RUN_*, OODA_*), Grafana dashboards. Note: the facts bundle dated 2026-05-06 records implementation_present=false for UC-CM-66, UC-CM-67, and UC-CM-68; these use cases reference observability surfaces that are not yet exposed as queryable API endpoints.

Health Monitoring

🟩 Delivered

CAP-CM-024 · 2 use cases

Liveness, readiness, and deep health checks covering pgvector, Neo4j, MongoDB, bootstrap.

Learning & Feedback Loop

🟩 Delivered

CAP-CM-025 · 3 use cases

Feedback capture, event storage (MongoDB), self-improvement triggers.

Decision Trace & Learning Surfaces (Operator Console Backend)

🟩 Delivered

CAP-CM-026 · 3 use cases

Backend surface for the operator console (Dashboard, Decision Explorer, HITL Queue, Learning Feed). /api/v1/traces supports filter by agent/team/pipeline_stage/confidence with pagination; backed by InMemoryTraceStore. /api/v1/learning/events[+/metrics] backed by InMemoryLearningEventStore. WebSocket /api/v1/ws with ConnectionManager and channel-based pub/sub on hitl and learning channels; Bearer JWT / X-API-Key auth required before connect (CM-SEC-001). Frontend surfaces live in sibling cm_web_ui/ project.
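
Filtering with pagination over an in-memory trace store can be sketched as follows. The filter dimensions (agent, team, pipeline stage, confidence) come from the description above; the `Trace` fields, the `min_confidence`, `offset`, and `limit` parameter names, and the response shape are assumptions and may differ from the actual `/api/v1/traces` contract.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Trace:
    """Hypothetical trace record with only the filterable fields."""
    agent: str
    team: str
    pipeline_stage: str
    confidence: float


def query_traces(
    traces: list[Trace],
    *,
    agent: Optional[str] = None,
    team: Optional[str] = None,
    pipeline_stage: Optional[str] = None,
    min_confidence: Optional[float] = None,
    offset: int = 0,
    limit: int = 50,
) -> dict[str, Any]:
    """Apply every filter that was supplied, then return one page of results."""
    def keep(t: Trace) -> bool:
        return (
            (agent is None or t.agent == agent)
            and (team is None or t.team == team)
            and (pipeline_stage is None or t.pipeline_stage == pipeline_stage)
            and (min_confidence is None or t.confidence >= min_confidence)
        )

    matched = [t for t in traces if keep(t)]
    return {"total": len(matched), "items": matched[offset:offset + limit]}


store = [
    Trace("a1", "red", "critique", 0.9),
    Trace("a1", "red", "strip", 0.4),
    Trace("a2", "blue", "critique", 0.7),
    Trace("a1", "red", "critique", 0.6),
    Trace("a1", "red", "rebuild", 0.95),
]
page = query_traces(store, agent="a1", min_confidence=0.5, offset=1, limit=1)
```

Returning the total match count alongside the page lets a client render pagination controls without issuing a second query.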

Capability Maturity Levels

🟢 Delivered & Tested · 🟩 Delivered · 🟡 Partially Delivered · 🟨 Stubbed · 🟠 Designed · 🟣 Future

Shared Platform Foundation

All RDS products share infrastructure that accelerates delivery and ensures consistency:

sf_shared

LLM factory, auth, BaseTask, agent profiles

sf-ui

React components, hooks, Tailwind palette

Knowledge Base

pgvector hybrid search, 7 content domains

Collaboration Platform

WebSocket rooms, presence, real-time sync

Interested in Collaborative Machines — Multi-Agent Reasoning Platform?

RDS delivers and extends Collaborative Machines through fixed-cost Capability Delivery Sprints — start with a Capability Pilot to see multi-agent reasoning working on your hardest problems in weeks.
