
Beyond Logs: The F-Pulse Steward and the Workspace-Reliability Layer Every ETL Tool Forgot

June 7, 2026 · 10 min read · By Hybridyn Engineering
Updated June 2026: This post documents the Steward at its 1.1 launch, when two detectors shipped. Since then the layer has grown to 25 live detectors across all seven observability levels (35 finding kinds total; 10 still contract-ready). The architecture described below is unchanged — only the number of shipped detectors has grown.

There's a layer above pipeline execution that nobody ships. Every popular ETL tool — Fivetran, Airbyte, Airflow, Prefect, Dagster, Hevo, Matillion — stops at "build, schedule, run, show logs." The implicit assumption is that observability lives per-job: was this run successful, did this DAG complete, is this connector throwing 429s.

That's enough for a five-pipeline workspace. It is dangerously not enough for a fifty-pipeline workspace.

This post is about the layer we've spent the last quarter building into F-Pulse OSS — the Steward — and why we believe a workspace-level reliability observer is the next obvious thing every data tool should ship. The Steward is the answer to a question per-job monitoring can't answer: what is structurally wrong with the way we've assembled all of these pipelines, taken as a whole?

The pattern we kept seeing

Three real things we saw, repeatedly, in workspaces we visited:

  1. Two engineers built effectively the same pipeline. Same Salesforce object → almost the same Snowflake table. No alert fires because both pipelines run green. The duplicate burns warehouse compute, doubles the API cost on the Salesforce side, and quietly drifts apart until a stakeholder asks "why do these numbers disagree."
  2. The same source got pulled three times for three downstream uses. Stripe charges flowing to one warehouse for finance, another for product analytics, a third for a BI extract. All three pay the API cost, all three replay incremental logic, all three handle the same edge cases differently. A managed table downstream of one well-monitored pull would have replaced all three.
  3. A duplicate got resolved, then re-introduced six weeks later. Someone rebuilt the consolidation a teammate had deleted. There was no record that this had been resolved, so there was no obvious signal that this was a regression. The team relearned the same lesson.

None of these are pipeline failures. They're architecture failures. The pipelines are individually fine. The composition of the pipelines, looked at together, is wrong.

Per-job monitoring is structurally blind to this class of issue. You'd need a layer that scans across pipelines, looks for patterns at the workspace level, remembers what you decided about those patterns, and gets louder when you ignore them.

That's the Steward.

What the Steward is, exactly

A read-only workspace observer that runs alongside F-Pulse's existing execution and monitoring layers. It scans your workflow set, derives findings against a typed contract, and surfaces them in a dropdown panel in the F-Pulse UI. It never mutates pipelines. It never runs fixes. It never changes execution policy. Advisor, not actor.

The contract covers seven observability levels and thirty-five finding kinds:

| Level | What it watches | Example finding kinds |
|---|---|---|
| Architecture | Structural / design-level: duplicate extraction, redundant transfer, lineage cascade | duplicate_source, duplicate_pipeline, redundant_transfer, lineage_cascade |
| Pipeline | End-to-end run health, SLA, partial output | sla_breach, partial_output, retry_storm, failure_rca |
| Node | Step-level transforms, join/filter/cast behaviour | empty_output, join_explosion, filter_dropped_all, cast_failure |
| Connector | Source/sink transport: auth, rate limit, reachability | connector_auth_failure, connector_rate_limit, credential_near_expiry |
| Data | Schema, freshness, volume, quality | schema_drift, null_spike, freshness_miss, volume_anomaly, partition_missing |
| Governance | PII movement, credential sprawl, environment crossing | pii_leak, credential_sprawl, env_crossing, unapproved_destination |
| Cost | Cost drift, runaway compute, warehouse waste | cost_drift, warehouse_waste, cost_recommendation |

F-Pulse 1.1 launched with two detectors: duplicate_source and duplicate_pipeline, implemented in a sub-agent called the Archeologist. (Since launch the layer has grown to 25 live detectors across all seven levels — see the note at the top.) The remaining finding kinds are contract-ready — the UI, the notification bridge, the Memory Layer, and the suppression rules don't need re-shaping when their detectors ship. The schema is frozen. The plumbing is wired. Detectors slot in.

We'd rather show "0 live" on a level than soft-label everything "Certified" or "Beta." If something is contract-ready but not yet implemented, we say so. The matrix is honest.

What the Archeologist actually does today

The Archeologist runs on demand (via the Re-scan button) and is fast enough — sub-50ms on a typical workspace — to run on every /api/steward/findings request. It computes two signatures per pipeline:

  • Source signature: SHA-256[:16] hash of connection_id + connector_type + at least one object-identity field (table / file_path / query / url / object / endpoint). Pagination params, retry config, and sample size are deliberately excluded — they don't change what dataset is being read. Field ordering inside params is normalised so two semantically-equal sources hash identically.
  • Pipeline signature: the set of source signatures plus the set of sink signatures. Two pipelines with the same source and sink set have the same pipeline signature, regardless of the intermediate transform graph.
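The two signatures can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Archeologist's actual code: the function names, the canonical-JSON encoding, reading "SHA-256[:16]" as the first 16 hex characters, and reusing the same hash for sinks are all assumptions made for the example.

```python
import hashlib
import json

# Object-identity fields, taken from the list in the text above.
IDENTITY_FIELDS = ("table", "file_path", "query", "url", "object", "endpoint")


def _short_hash(payload: dict) -> str:
    """SHA-256 of a canonical JSON encoding, truncated to 16 hex characters."""
    # sort_keys normalises field ordering, so equal payloads hash identically.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def endpoint_signature(connection_id: str, connector_type: str, params: dict) -> str:
    """Hash only what identifies the dataset; every other param is ignored."""
    identity = {k: params[k] for k in IDENTITY_FIELDS if k in params}
    if not identity:
        raise ValueError("need at least one object-identity field")
    return _short_hash(
        {
            "connection_id": connection_id,
            "connector_type": connector_type,
            "identity": identity,
        }
    )


def pipeline_signature(source_sigs: set, sink_sigs: set) -> str:
    """Depends only on the source set and the sink set, not on the transforms."""
    return _short_hash({"sources": sorted(source_sigs), "sinks": sorted(sink_sigs)})
```

Because only the identity fields are hashed, two reads of the same Salesforce object with different page sizes or retry settings produce the same signature.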

Then it flags:

  • Duplicate source — two or more pipelines read the same logical source into different destinations. Surfaces a consolidation opportunity — usually a managed table downstream of one extract.
  • Duplicate pipeline — two or more pipelines have the same source set and the same sink set. Usually an accident — two people built equivalent flows.
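The two flags amount to a group-by over signatures. The sketch below is illustrative only: the `Pipeline` record and `find_duplicates` function are hypothetical names, signatures are treated as opaque strings, and the medallion-chain and suppression exclusions are not modelled.

```python
from collections import defaultdict
from typing import NamedTuple


class Pipeline(NamedTuple):
    name: str
    sources: frozenset  # source signatures
    sinks: frozenset    # sink signatures


def find_duplicates(pipelines):
    """Return (duplicate_source, duplicate_pipeline) as dicts of sorted name lists."""
    by_source = defaultdict(list)
    by_shape = defaultdict(set)
    for p in pipelines:
        by_shape[(p.sources, p.sinks)].add(p.name)
        for sig in p.sources:
            by_source[sig].append(p)

    duplicate_source = {}
    for sig, readers in by_source.items():
        # Flag only when one source lands in more than one distinct sink set.
        if len({p.sinks for p in readers}) > 1:
            duplicate_source[sig] = sorted(p.name for p in readers)

    # Same source set and same sink set: equivalent flows.
    duplicate_pipeline = {
        shape: sorted(names) for shape, names in by_shape.items() if len(names) > 1
    }
    return duplicate_source, duplicate_pipeline
```

In this sketch two pipelines with identical shape are reported once, as a duplicate pipeline, rather than also as a duplicate source; whether the real detector emits both findings is not stated in this post.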

What it deliberately does not flag:

  • Linear medallion chains (raw → staging → cleansed → modeled reading from the same source). That's a single logical dataset traversing layers, not a duplicate.
  • Fan-out (same source, different sinks) as a duplicate pipeline. It is still flagged as duplicate_source, to surface the consolidation opportunity, but not as duplicate_pipeline, because the shapes differ.
  • Workflows explicitly dismissed as intentional. The Steward remembers via a per-workspace suppression file and stops flagging them. The dismissal carries an optional reason ("DR replication", "data-vault gold layer", "legacy job kept for audit only") that's logged in memory.

The Memory Layer

Detection is the easy part. The interesting part is what happens between detections — the part that turns a one-shot scanner into something that remembers.

When you take action on a finding ("Dismiss (intentional)", "Acknowledge", or "Mark resolved"), the Steward writes an event to a per-workspace append-only JSONL journal at /steward//memory.jsonl. Three behaviours emerge from the journal:

1. Persistent-occurrence escalation

Every emit is recorded with the scan ID it came from. The StewardMemory.persistent_occurrences() aggregate counts the distinct scans in which a signature has appeared, not the number of workflows flagged within a scan. Re-running the same scan twice in 30 seconds doesn't inflate the counter, because the two re-runs share scan boundaries: the user wouldn't perceive them as separate events.

When persistent occurrences cross escalate_after_n_occurrences (configurable; default 5), the next scan promotes the finding's severity one step — P3 → P2 → P1. A line is appended to the finding body explaining why ("_Severity escalated from P2 to P1 because this finding has been surfaced in 7 separate scans without resolution._"). P1 findings (after escalation) light the header badge red instead of violet.

What you keep ignoring gets louder, not quieter. This is the bit that fights alert fatigue from the other side: noise about what hasn't been done, not just notifications about what is broken.
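The counting and promotion logic described above can be sketched as follows. The journal event keys (`type`, `signature`, `scan_id`) are hypothetical, and reading "cross" as "strictly more than the threshold" is an assumption; only the default of 5 and the P3 → P2 → P1 ladder come from the text.

```python
import json

SEVERITY_LADDER = ["P3", "P2", "P1"]  # least to most severe


def persistent_occurrences(journal_lines, signature):
    """Count the distinct scan IDs in which `signature` was emitted."""
    scans = set()
    for line in journal_lines:
        event = json.loads(line)
        if event.get("type") == "emit" and event.get("signature") == signature:
            scans.add(event["scan_id"])  # a repeated scan ID is counted once
    return len(scans)


def escalate(severity, occurrences, escalate_after_n_occurrences=5):
    """Promote one step once the threshold is crossed; P1 is the ceiling."""
    if occurrences <= escalate_after_n_occurrences:
        return severity
    idx = SEVERITY_LADDER.index(severity)
    return SEVERITY_LADDER[min(idx + 1, len(SEVERITY_LADDER) - 1)]
```

Counting a set of scan IDs rather than raw emit events is what keeps repeated emits from a single scan from inflating the counter.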

2. Rebound detection

If you mark a finding as resolved and the same signature later re-emerges (someone rebuilt the duplicate, a teammate reverted your consolidation, an upstream config got rolled back), the next emit is annotated as a rebound:

(rebounded) 2 pipelines have identical source → sink shape

…

_This finding had been resolved previously (last on
2026-06-05T12:17:08+00:00) and has re-appeared. Likely a regression —
review whether the original fix was reverted or a teammate
re-introduced the pattern._

Rebound is now a first-class state — distinct from open — because a regression is genuinely different information from a new issue. It is the explicit answer to "have we seen this before, and what did we decide last time."
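A rebound check of this kind reduces to remembering the last resolve timestamp per signature. The sketch below assumes hypothetical event keys (`type`, `signature`, `at`) and a hypothetical `annotate_rebound` helper; it mirrors the annotation format shown above but is not the Steward's actual code.

```python
def annotate_rebound(events, signature, title):
    """Return (state, text) for the next emit of `signature`.

    `events` is an iterable of dicts in journal order; "at" holds an
    ISO-8601 timestamp string.
    """
    last_resolved = None
    for event in events:
        if event["signature"] == signature and event["type"] == "resolved":
            last_resolved = event["at"]  # keep the most recent resolve
    if last_resolved is None:
        return "open", title
    note = (
        f"This finding had been resolved previously (last on {last_resolved}) "
        "and has re-appeared."
    )
    return "rebounded", f"(rebounded) {title}\n\n_{note}_"
```

A signature with no resolve event in the journal is returned unchanged in the open state, so a genuinely new issue is never labelled a regression.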

3. Sanitized dismiss reasons (durable lessons)

When you dismiss a finding with a reason, the reason text is logged in the journal after being scrubbed for five secret patterns:

  • AWS access key IDs (AKIA / ASIA prefixes)
  • Bearer tokens
  • password=… / secret=… k/v pairs
  • user:password@host URI credentials
  • Private-IP ranges
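One way to scrub those five pattern families is a short list of regular expressions applied in order. The expressions and replacement tokens below are illustrative assumptions, not the Steward's actual rules; the private-IP pattern covers only the IPv4 RFC 1918 ranges.

```python
import re

# (pattern, replacement) pairs, applied in order. All are illustrative.
_PATTERNS = [
    # AWS access key IDs: AKIA/ASIA prefix plus 16 uppercase alphanumerics.
    (re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"), "[REDACTED_AWS_KEY]"),
    # Bearer tokens.
    (re.compile(r"(?i)\bBearer\s+[A-Za-z0-9\-._~+/]+=*"), "[REDACTED_BEARER]"),
    # password=... / secret=... key/value pairs.
    (re.compile(r"(?i)\b(password|secret)\s*=\s*\S+"), r"\1=[REDACTED]"),
    # user:password@host credentials inside a URI.
    (re.compile(r"(?<=://)[^\s/:@]+:[^\s/@]+@"), "[REDACTED_CREDENTIALS]@"),
    # Private IPv4 ranges: 10/8, 172.16/12, 192.168/16.
    (
        re.compile(
            r"\b(?:10\.\d{1,3}|172\.(?:1[6-9]|2\d|3[01])|192\.168)\.\d{1,3}\.\d{1,3}\b"
        ),
        "[REDACTED_IP]",
    ),
]


def sanitize_reason(text: str) -> str:
    """Replace anything matching a secret pattern before the reason is stored."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

For example, `sanitize_reason("postgres://etl:s3cr3t@db.internal/x")` keeps the host and path but replaces the `etl:s3cr3t` pair with a redaction token.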

The sanitized reasons accumulate into a corpus of typed, human-approved lessons. In 1.1, the lesson store + search API are reachable via POST /api/steward/lessons/search — the foundation for the 1.2 Incident Analyst, which will automatically search prior lessons whenever a new failure path matches a previously-resolved signature.

That's the Memory Layer in one line: operator fixes become typed, human-approved, searchable lessons.

Why this ships in OSS

Most "AI for data" features in 2026 are tier-gated. Enterprise-only AIOps add-ons start at high-five-figures-monthly. We deliberately put the Steward in F-Pulse OSS for three reasons:

1. It's what makes F-Pulse feel like more than another ETL tool. The first time a user adds their fifth pipeline and sees "3 pipelines read the same source — consider a managed table" appear in the header, they understand the product is thinking with them, not just running their jobs. That moment shouldn't sit behind a paywall — it's the moment that earns the install.

2. Single-user OSS still benefits. Even one engineer on one laptop accidentally builds duplicates. The Steward catches them. The capability is valuable regardless of team size.

3. Plus monetises team-scale, not capability-scale. F-Pulse+ adds cross-workspace correlation (shared Steward memory across many teams), RBAC-aware approval chains on proposed actions, and SLA-backed integrations with PagerDuty / Opsgenie. But the detection capability itself lives in OSS. We don't paywall thinking.

What's pure code vs LLM

No part of the detection path is an LLM. The Archeologist is plain Python: hash the source params, hash the sink params, group by signature, emit findings. The escalation logic is plain Python: walk the journal, count distinct scans, compare to threshold. The rebound logic is plain Python: track resolve timestamps per signature, annotate the next emit.

There are no hallucinated findings, because there are no findings synthesized by a model. Re-running the detector on the same input produces the same outputs (deterministic finding IDs are SHA-256[:16] hashes of the underlying signatures). The persistence layer uses this for upsert semantics — a finding seen 17 times across scans is one row with occurrences = 17, not 17 rows.
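Deterministic IDs are what make the upsert possible. A minimal sketch, assuming the ID hashes the finding kind together with the signature and that the store is a plain dict (both are assumptions for illustration, as are the function names):

```python
import hashlib


def finding_id(kind: str, signature: str) -> str:
    """Same kind and signature always yield the same 16-hex-character ID."""
    return hashlib.sha256(f"{kind}:{signature}".encode("utf-8")).hexdigest()[:16]


def upsert(store: dict, kind: str, signature: str, scan_id: str) -> dict:
    """Insert the finding on first sight; afterwards only update its scan count."""
    fid = finding_id(kind, signature)
    row = store.setdefault(
        fid, {"kind": kind, "signature": signature, "scans": set(), "occurrences": 0}
    )
    row["scans"].add(scan_id)  # a repeated scan ID does not add a new entry
    row["occurrences"] = len(row["scans"])
    return row
```

Feeding the same finding through `upsert` from 17 different scans leaves one row in `store` with `occurrences` equal to 17.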

LLMs sit next to the Steward — the F-Pulse AI Copilot can summarise a finding or draft a remediation plan when you ask — but the LLM is never in the decision path. We made that choice on purpose. The Steward must be reproducible, must be auditable, and must be trustworthy enough that a P1-escalated finding can be acted on without a second-guess pass.

The honest scope today

To be precise about what 1.1 ships and what it doesn't:

Ships in 1.1:

  • Architecture-level: duplicate_source + duplicate_pipeline detectors (Archeologist), sub-50ms scan
  • Memory Layer: persistent-occurrence counter, severity escalation (P3 → P2 → P1), rebound detection, sanitized dismiss reasons
  • Lesson store with propose / approve / revalidate workflow + POST /api/steward/lessons/search API
  • 7-state finding lifecycle (open / acknowledged / dismissed / resolved / rebounded / suppressed / expired)
  • Memory tab in the Steward dropdown (live in-app view of persistent-occurrence counts + recent event stream)
  • 12-scenario validation pack (currently 12/12 passing) covering everything that ships

Contract-ready for 1.2 onwards (frozen schema, no live detector yet):

  • Pipeline-level: sla_breach, partial_output, retry_storm, failure_rca
  • Node-level: empty_output, join_explosion, filter_dropped_all, cast_failure
  • Connector-level: auth_failure, rate_limit, unreachable, credential_near_expiry
  • Data-level: schema_drift, null_spike, freshness_miss, volume_anomaly, partition_missing
  • Governance-level: pii_leak, credential_sprawl, env_crossing, unapproved_destination
  • Cost-level: cost_drift, warehouse_waste, cost_recommendation

This is the bet: the contract is general enough that the remaining detector kinds slot into the same UI, the same notification bridge, the same Memory Layer, and the same suppression model without re-shaping anything. If the detector is pure code (and we will not ship one that isn't), it inherits all of the above for free.

Try it

F-Pulse OSS 1.1 ships under Apache 2.0 — no row counting, no per-seat fees, no telemetry-on-by-default.

git clone https://github.com/hybridyn/fpulse-oss
cd fpulse-oss
docker compose up -d

Open http://localhost:8001. Add five pipelines. Watch the Steward badge in the header. The first time it tells you something true about your workspace that you didn't notice on your own, you'll understand why we think this layer is the next obvious thing every data tool should ship.

Want to talk about it? Reach the team — we'd be especially interested in workspaces where the Archeologist surfaces something surprising.
