FlowScript

Your AI rewrites your auth system in 20 minutes flat. Two days later your senior engineer is still explaining what it broke — because it didn't know you rejected JWT three weeks ago, or why, or what that decision blocks.



FlowScript gives AI tools a structured reasoning graph — not text search over notes, but typed queries over actual decision logic: why(), tensions(), blocked(), alternatives(), whatIf(). Sub-ms local traversal on project-scale graphs.

CLAUDE.md stores facts. FlowScript stores reasoning — why you chose Postgres, what tensions that creates, what's blocked, what breaks if you change your mind. They're complementary. Use both.

This isn't competing with RAG either. RAG finds relevant documents. FlowScript answers "why did we decide against the alternatives?" Different operation entirely.


Try It

Use Claude Code or Cursor? Install and add to MCP config:

npm install -g flowscript-core

In ~/.claude/settings.json (Claude Code) or your Cursor MCP config:

{
  "mcpServers": {
    "flowscript": {
      "command": "flowscript-mcp",
      "args": ["--demo", "./project-memory.json"]
    }
  }
}

Restart your editor. You now have 14 reasoning tools. --demo seeds a sample project so you can explore immediately — ask about tensions, blockers, or why a decision was made. Remove --demo when you're ready for your own project.

Copy this CLAUDE.md snippet into your project to tell the agent when to record decisions, tensions, and blockers during normal coding.

Use something else? FlowScript is just an npm package:

import { Memory } from 'flowscript-core';

const mem = new Memory();
const q = mem.question("Which database for agent memory?");
mem.alternative(q, "Redis").decide({ rationale: "speed critical" });
mem.alternative(q, "SQLite").block({ reason: "no concurrent writes" });
mem.tension(mem.thought("sub-ms reads"), mem.thought("$200/mo cluster"), "performance vs cost");

mem.query.tensions();   // structured tradeoffs with named axes
mem.query.blocked();    // what's stuck + downstream impact
mem.save("./memory.json");

Building with agent frameworks? Drop-in adapters for LangGraph, CrewAI, Google ADK, and the OpenAI Agents SDK: pip install flowscript-agents[langgraph] (details below).


What the Queries Actually Return

These traverse a typed reasoning graph, not strings. Here's real output from the demo project:

tensions() — every tradeoff, with named axes:

><[performance vs cost]
  "Redis gives sub-ms reads but cluster costs $200/mo"
  vs "PostgreSQL on shared hosting is $15/mo and handles our scale"

><[statelessness vs revocability]
  "JWT with refresh tokens — stateless, scalable"
  vs "JWT revocation needs a blocklist — server-side state anyway"

blocked() — what's stuck and why:

[blocked] SQLite — zero ops, embedded, good enough for MVP
  reason: "Cannot handle concurrent writes from multiple API workers"

why("modular-monolith") — causal chain backward:

Collapsed back to a modular monolith — same code boundaries, one deploy
  ← Started with microservices but the overhead is killing velocity
    → Premature distribution is worse than premature optimization

alternatives("database-question") — what was considered, what was decided:

? Which database for user sessions and agent state?
  [decided] PostgreSQL — "Best balance of cost, reliability, and query power"
  [open]    Redis — sub-ms reads, great for session cache
  [blocked] SQLite — cannot handle concurrent writes

whatIf("postgresql") — downstream impact if this changes:

If PostgreSQL is removed:
  → Session storage needs replacement
  → Server-side auth decision invalidated
  → ULID migration becomes irrelevant

The reasoning behind these queries lives in a .fs file — the human-readable format your PM can review without knowing code:

? Which database for user sessions and agent state?
  [decided] || PostgreSQL — battle-tested, ACID, rich querying
  || Redis — sub-ms reads, great for session cache
  [blocked] || SQLite — zero ops, embedded, good enough for MVP

thought: Started with microservices but overhead is killing velocity
  -> thought: Collapsed back to modular monolith — same boundaries, one deploy
    <- Premature distribution is worse than premature optimization

But I Already Use CLAUDE.md

Good. Keep using it. CLAUDE.md tells your agent what to do. FlowScript tells it how to reason about what it's doing.

Facts: "we use Postgres, prefer functional style, run tests first." Reasoning: why Postgres, what tensions that creates, what's blocked, what breaks if you change your mind.

They're complementary. CLAUDE.md is your agent's cheat sheet. FlowScript is its working memory.

The difference matters when your agent hits a decision point. With just CLAUDE.md, it re-derives every tradeoff from scratch. With FlowScript, it queries tensions() and already knows the performance-vs-cost axis from three weeks ago. It queries blocked() and knows SQLite is off the table and why. It queries why("postgres") and gets the full causal chain without re-reading the codebase.


Works With Your Stack

MCP (Claude Code, Cursor)

14 tools, local file persistence. Setup above. Your reasoning stays on your machine — no cloud, no telemetry.

Agent Frameworks (Python)

pip install flowscript-agents[langgraph]   # or crewai, google-adk, openai-agents, all

Drop-in replacements for each framework's native memory interface. Same API your framework expects, but now query.tensions() works:

from flowscript_agents.langgraph import FlowScriptStore

store = FlowScriptStore("./agent-memory.json")

# Standard LangGraph operations
store.put(("agents", "planner"), "db_decision", {"value": "chose Redis for speed"})
items = store.search(("agents", "planner"), query="Redis")

# The part that's new — semantic queries on the same data
blockers = store.memory.query.blocked()
tensions = store.memory.query.tensions()
why_chain = store.memory.query.why(node_id)

Also available: CrewAI (FlowScriptStorage), Google ADK (FlowScriptMemoryService), OpenAI Agents SDK (FlowScriptSession). All expose .memory.query for FlowScript queries.


Temporal Intelligence

Memory that gets smarter over time, not just bigger.

| Tier | Meaning | Behavior |
|------|---------|----------|
| current | Recent observations | May be pruned if not reinforced |
| developing | Emerging patterns (2+ touches) | Building confidence |
| proven | Validated through use (3+ touches) | Protected from pruning |
| foundation | Core truths | Always preserved, even under budget pressure |

Nodes graduate automatically. prune() moves dormant nodes to an append-only audit trail (.audit.jsonl) — crash-safe, always recoverable. Proven and foundation tiers survive budget constraints. Your agent never loses hard-won knowledge.

The memory compresses itself, and the compression reveals structure that verbosity obscures. A decision that keeps coming back up earns its place. One-off observations fade.
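The graduation rule above can be sketched in a few lines. This is an illustrative model, not FlowScript's internals — the thresholds (2+ touches → developing, 3+ → proven) come from the tier table, but the names `touch` and `prunable` are assumptions:

```typescript
// Minimal sketch of tier graduation: nodes climb tiers as they accumulate
// touches; proven and foundation tiers survive pruning.
type Tier = 'current' | 'developing' | 'proven' | 'foundation';

interface TieredNode { text: string; touches: number; tier: Tier }

function touch(node: TieredNode): TieredNode {
  const touches = node.touches + 1;
  let tier: Tier = node.tier;
  if (tier !== 'foundation') {
    if (touches >= 3) tier = 'proven';        // validated through use
    else if (touches >= 2) tier = 'developing'; // emerging pattern
  }
  return { ...node, touches, tier };
}

function prunable(node: TieredNode): boolean {
  // Only current and developing nodes may be pruned under budget pressure.
  return node.tier === 'current' || node.tier === 'developing';
}

let n: TieredNode = { text: 'chose Postgres', touches: 1, tier: 'current' };
n = touch(n); // 2 touches → developing
n = touch(n); // 3 touches → proven
console.log(n.tier, prunable(n)); // → proven false
```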

Session Lifecycle

Queries keep knowledge alive. Every query — why(), whatIf(), tensions(), blocked(), alternatives() — touches the returned nodes, incrementing frequency and updating timestamps. This is what drives graduation: knowledge that keeps getting queried earns its place.

// Start of session: orient with token-budgeted summary + active issues
const orientation = mem.sessionStart({ maxTokens: 4000 });
// → summary, blockers, tensions, garden stats, tier distribution

// During session: queries automatically touch returned nodes
mem.query.tensions();     // nodes get frequency++
mem.query.why(nodeId);    // causal chain nodes get frequency++

// End of session: prune dormant, save, get before/after stats
const wrap = mem.sessionWrap();
// → { nodesBefore, tiersBefore, pruned, gardenAfter, nodesAfter, tiersAfter, saved }

Touch-on-query is enabled by default. Disable with { touchOnQuery: false } for read-only analysis.
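Conceptually, touch-on-query is a thin wrapper around each query's results. A sketch under assumed names (`withTouch`, `Touched` are illustrative, not the library's API):

```typescript
// Sketch: bump frequency and timestamp on every returned node, unless
// touchOnQuery is disabled for read-only analysis.
interface Touched { id: string; frequency: number; lastTouched: number }

function withTouch<T extends Touched>(
  results: T[],
  opts: { touchOnQuery: boolean } = { touchOnQuery: true },
): T[] {
  if (!opts.touchOnQuery) return results; // read-only: leave nodes untouched
  const now = Date.now();
  return results.map(r => ({ ...r, frequency: r.frequency + 1, lastTouched: now }));
}

const hits = withTouch([{ id: 'postgres', frequency: 2, lastTouched: 0 }]);
console.log(hits[0].frequency); // → 3
```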


The Complete Developer Loop

import { Memory } from 'flowscript-core';

// 1. Load or create (zero-friction first run)
const mem = Memory.loadOrCreate('./agent-memory.json');

// 2. Wire to your agent (14 tools, OpenAI function calling format)
const tools = mem.asTools();

// 3. Agent builds reasoning via tool calls during work
//    (no FlowScript syntax needed — the agent handles it)

// 4. Inject memory into prompts (respects token budget)
const context = mem.toFlowScript({
  maxTokens: 4000,
  strategy: 'tier-priority'  // proven knowledge always included
});

// 5. Or extract from existing conversations
const mem2 = await Memory.fromTranscript(agentLog, {
  extract: async (prompt) => await yourLLM(prompt)
});

// 6. End of session — prune dormant, save, get stats
const wrap = mem.sessionWrap();
// → pruned dormant nodes to .audit.jsonl, saved to disk
// → { nodesBefore, pruned, nodesAfter, tiersAfter, saved }

Four budget strategies: tier-priority (foundation first), recency, frequency, relevance (topic match). ~3:1 compression ratio vs prose.
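The tier-priority strategy amounts to greedy packing: fill the budget foundation-first, then proven, developing, current. A hedged sketch — token counts are approximated as word counts here, and `packByTier` is an illustrative name, not the SDK's:

```typescript
// Greedy tier-priority packing: highest-tier knowledge is always included
// first; packing stops at the first entry that would exceed the budget.
type Tier = 'foundation' | 'proven' | 'developing' | 'current';
const TIER_ORDER: Tier[] = ['foundation', 'proven', 'developing', 'current'];

interface Entry { text: string; tier: Tier }

function packByTier(entries: Entry[], maxTokens: number): string[] {
  const out: string[] = [];
  let used = 0;
  for (const tier of TIER_ORDER) {
    for (const e of entries.filter(x => x.tier === tier)) {
      const cost = e.text.split(/\s+/).length; // crude token estimate
      if (used + cost > maxTokens) return out;
      out.push(e.text);
      used += cost;
    }
  }
  return out;
}

const packed = packByTier(
  [
    { text: 'one-off debugging note from yesterday', tier: 'current' },
    { text: 'we use Postgres for sessions', tier: 'foundation' },
    { text: 'performance vs cost tension on caching', tier: 'proven' },
  ],
  10,
);
console.log(packed); // → ['we use Postgres for sessions']
```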


Comparison

| | FlowScript | Embedding stores | CLAUDE.md / state dicts |
|---|------------|------------------|-------------------------|
| "Why did we decide X?" | why(id) — typed causal chain | No | No |
| "What's blocking progress?" | blocked() — with impact scoring | No | Manual grep |
| "What tradeoffs exist?" | tensions() — named axes | No | No |
| "What alternatives were considered?" | alternatives(id) | No | If you wrote them down |
| "What if we change this?" | whatIf(id) — downstream impact | No | No |
| Human-readable export | .fs files | No | Yes (but flat) |
| Token-budgeted injection | 4 strategies | No | Manual truncation |
| Temporal tiers + graduation | Automatic | No | No |
| Audit trail | .audit.jsonl (append-only) | No | Git history |

Under the hood: a local symbolic graph, not a vector database. Nodes are typed (thought, question, decision, insight, action, completion). Relationships are typed (causes, tension, derives_from, temporal, alternative). States are typed (decided, blocked, exploring, parked). Queries traverse this structure. No embeddings, no LLM calls, no network.
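To make "queries traverse this structure" concrete, here is a toy model of the idea — typed nodes, typed edges, and why() as a plain backward walk over causes edges. The types mirror the lists above; the traversal itself is an assumption about shape, not FlowScript's actual implementation:

```typescript
// Toy typed reasoning graph with a backward causal traversal.
type NodeType = 'thought' | 'question' | 'decision' | 'insight' | 'action' | 'completion';
type EdgeType = 'causes' | 'tension' | 'derives_from' | 'temporal' | 'alternative';

interface GraphNode { id: string; type: NodeType; text: string }
interface Edge { from: string; to: string; type: EdgeType }

function why(nodes: Map<string, GraphNode>, edges: Edge[], id: string): string[] {
  // Walk 'causes' edges backward from the node, collecting the causal chain.
  const chain: string[] = [];
  let current: string | undefined = id;
  while (current) {
    const node = nodes.get(current);
    if (!node) break;
    chain.push(node.text);
    current = edges.find(e => e.type === 'causes' && e.to === current)?.from;
  }
  return chain;
}

const nodes = new Map<string, GraphNode>([
  ['monolith', { id: 'monolith', type: 'decision', text: 'Collapsed back to a modular monolith' }],
  ['overhead', { id: 'overhead', type: 'thought', text: 'Microservices overhead is killing velocity' }],
]);
const edges: Edge[] = [{ from: 'overhead', to: 'monolith', type: 'causes' }];

console.log(why(nodes, edges, 'monolith'));
// → ['Collapsed back to a modular monolith', 'Microservices overhead is killing velocity']
```

No embeddings or network calls are involved — the query is pure graph traversal, which is why it stays sub-millisecond on project-scale graphs.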


Ecosystem

| Package | What | Install |
|---------|------|---------|
| flowscript-core | TypeScript SDK + MCP server + CLI | npm install flowscript-core |
| flowscript-agents | Python — LangGraph, CrewAI, Google ADK, OpenAI Agents | pip install flowscript-agents |
| flowscript-ldp | Python IR + query engine (foundation layer) | pip install flowscript-ldp |
| flowscript.org | Web editor + D3 visualization + live queries | Browser |

FlowScript is the first implementation of LDP Mode 3 (Semantic Graphs). Three independent systems converged on symbolic notation for AI reasoning without cross-pollination — SynthLang, FlowScript, MetaGlyph. When independent builders converge, the insight is structural.


Documentation

FLOWSCRIPT_SYNTAX.md — 21-marker spec | QUERY_ENGINE.md — 5 queries, TypeScript API | FLOWSCRIPT_LEARNING.md — beginner guide | examples/ — demo memory, CLAUDE.md snippet, golden files


Governance

FlowScript's why() produces typed explanatory chains and the append-only audit trail provides decision provenance — the kind of structured explainability that regulations like the EU AI Act (Articles 12, 13, 86) are starting to require for high-risk AI systems.


Contributing

Use FlowScript. Report what's friction. Open issues with evidence from real use. Framework integration PRs welcome.

Issues | Discussions


Known Limitations

  • Single-agent access: save() uses non-atomic writeFileSync. Two agents writing the same file concurrently will clobber each other. Use separate memory files per agent.
  • .fs format is lossy: Temporal metadata, config, and snapshots are not preserved in .fs files. Use .json for the full operational loop.
  • CJS package: Exports CommonJS. Works in ESM projects via Node.js CJS interop (import { Memory } from 'flowscript-core' works in Node 18+). Native ESM build planned.

MIT. Built by Phillip Clapham.

About

Structured reasoning graph for AI coding tools. Five typed queries — why(), tensions(), blocked(), alternatives(), whatIf() — over your project's actual decision logic. MCP server + TypeScript SDK. 628 tests.
