Ask HN: Is Orchestration Solving the Wrong Problem in Data Pipelines?
I have spent most of my career in data integration (StreamSets was my last company), and I keep running into the same pattern:
* When pipelines break, we reach for more orchestration.
* When semantics get lost, we bolt on a catalog.
* When trust disappears, we add observability layers.
Each of these layers feels like a workaround. The real issue seems to be that our data lifecycle is fragmented: context, ownership, and meaning get lost as data moves between tools and teams.
What if we made tables the core unit of data flow? What if teams could publish, transform, and subscribe to versioned tables directly?
That's the idea behind Tabsdata. It is:
* Built around immutable, versioned tables
* Declarative instead of orchestrated
* Based on free-form Python transformations in a controlled environment
* Open core, implemented in Rust with Python bindings
* Delivered with both a UI and a full CLI
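To make "declarative instead of orchestrated" a bit more concrete, here is a toy in-memory sketch in plain Python. It is not the Tabsdata API: TableStore, publish, and subscribe are names I made up for illustration, and a real system would persist versions and track lineage rather than hold lists in memory.

```python
from dataclasses import dataclass, field
from typing import Callable

Rows = tuple[dict, ...]  # one snapshot of a table, stored as a tuple of row dicts


@dataclass
class TableStore:
    """Hypothetical toy store for illustration only; not the Tabsdata API."""

    # table name -> list of snapshots; new versions are appended, old ones never overwritten
    versions: dict[str, list[Rows]] = field(default_factory=dict)
    # source table name -> list of (target table name, transform function)
    subscribers: dict[str, list[tuple[str, Callable[[Rows], list[dict]]]]] = field(
        default_factory=dict
    )

    def subscribe(self, source: str, target: str,
                  transform: Callable[[Rows], list[dict]]) -> None:
        """Declare that `target` is derived from `source` via `transform`."""
        self.subscribers.setdefault(source, []).append((target, transform))

    def publish(self, name: str, rows: list[dict]) -> int:
        """Append a new version of `name`, then refresh every table derived from it."""
        snapshot = tuple(dict(r) for r in rows)  # copy rows so callers cannot mutate the stored version
        self.versions.setdefault(name, []).append(snapshot)
        for target, transform in self.subscribers.get(name, []):
            self.publish(target, transform(snapshot))
        return len(self.versions[name])  # 1-based version number just written


store = TableStore()
# The dependency is declared once; no scheduler or DAG definition is involved.
store.subscribe("orders", "big_orders",
                lambda rows: [r for r in rows if r["amount"] >= 100])

store.publish("orders", [{"id": 1, "amount": 50}, {"id": 2, "amount": 250}])
store.publish("orders", [{"id": 3, "amount": 500}])
```

After the two publishes, "orders" has two versions and "big_orders" has two derived versions. Nothing scheduled the transform; it ran because its input table got a new version, and every earlier version is still there to inspect.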
Demo video: https://www.youtube.com/watch?v=qCZIRC9khmA
Website: https://tabsdata.com
I would love your feedback:
* Have orchestration tools become a crutch in your data stack?
* What would a table-first approach need to earn your trust?
* Are there better ways to preserve semantics and metadata across teams?
Thanks in advance for any thoughts or pushback.