I was a Data Scientist for 10 years before becoming a quadriplegic. For the past 3 months, I built VibeETL from scratch: A lightning-fast, visual Alteryx alternative powered by Polars & React Flow.

Posted by card_chase@reddit | LocalLLaMA | 18 comments

Hey r/LocalLLaMA

I spent nearly a decade working in the trenches as a data scientist, wrestling with massive datasets, handling messy enterprise schemas, and using just about every major ETL tool on the market. A few years ago, my life changed completely when I became a quadriplegic. But my passion for building software close to the metal never stopped.

For the past 3 months, I’ve dedicated my time to engineering a visual data manipulation platform from the ground up—exactly how I always wished it existed when I was working in the industry.

It’s called VibeETL, and it is officially ready for the community to test, break, and scale.

🔗 Repository: https://github.com/cardchase/VibeETL

⚡ Built for True Scalability & Infinite Speed

Because I’ve worked with heavy legacy systems, I designed VibeETL to completely avoid visual and computational lag:

Blazing Fast Polars Core: The backend is powered entirely by Polars and its Rust-native optimizations, with zero-copy data transport over the Apache Arrow memory format.
Zero-Dependency BFS Snap Layout: I ripped out heavyweight third-party layout libraries like dagre to eliminate Vite HMR dependency freezes, and wrote a native topological BFS layout algorithm directly inside the React Flow canvas that instantly snaps connected nodes into place from left to right.
Lag-Free UI Buffering: Form parameter side-panels buffer input locally in a shielded component (SafeInput). Keystrokes stay inside that component, so editing complex custom formulas or working with 40+ column layouts (for example sports-betting and historical odds data) eliminates typing lag without thrashing the master canvas.
Isolated Process Jailing: The Python Code node runs custom data scripts and machine learning algorithms inside an ephemeral local subprocess jail with a strict 30-second execution cutoff, so a runaway script cannot freeze the machine or crash the main server thread.
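
To make the layout idea concrete, here is a minimal sketch of a breadth-first topological layering in Python. It is not VibeETL's actual code (that lives in the React Flow frontend); the function name `layered_layout` and the `x_gap`/`y_gap` spacing values are assumptions made for illustration.

```python
from collections import deque


def layered_layout(nodes, edges, x_gap=250, y_gap=120):
    """Assign (x, y) positions so that every edge points left to right.

    nodes: list of node ids; edges: list of (source, target) pairs.
    Hypothetical helper for illustration only.
    """
    children = {n: [] for n in nodes}
    indegree = {n: 0 for n in nodes}
    for src, dst in edges:
        children[src].append(dst)
        indegree[dst] += 1

    # Kahn-style BFS: start from nodes with no inputs. A node's column is
    # one past the deepest of its parents.
    column = {n: 0 for n in nodes}
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            column[child] = max(column[child], column[node] + 1)
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if visited != len(column):
        raise ValueError("graph contains a cycle, so no left-to-right order exists")

    # Stack nodes that share a column vertically, in input order.
    rows_used = {}
    positions = {}
    for node in nodes:
        col = column[node]
        row = rows_used.get(col, 0)
        rows_used[col] = row + 1
        positions[node] = (col * x_gap, row * y_gap)
    return positions
```

For a small pipeline such as input -> filter -> join -> output, each node lands one column to the right of its deepest upstream node, which is the "snap left to right" behaviour described above.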

🌌 Built for the AI Age: Drop in Your Own Custom Tools!

I designed VibeETL with a strict rule: Absolute Community Extensibility. The manifest-driven Python backend makes it incredibly easy to build new processing blocks.

If you use autonomous coding agents (such as an agent on the Antigravity platform), you can hand it the workspace base template folder, ask it to write a new data tool, drop the generated folder straight into the codebase, open a Pull Request, and contribute to the ecosystem right away.

🛠️ Where I Need the Community's Help to Test & Harden:

While the primary data ingestion paths, data cleansing engines, database read/write blocks, and high-density spreadsheet grid layouts are fully stable and hardened to enterprise specs on local machine environments, I haven't been able to fully validate some of the external cloud paths myself.

I would love for developers, cloud architects, and machine learning specialists to pull the repo and actively test/break:

The Gemini Vision AI Integration: Validating image captioning ingestion pipelines and token processing loops.
Cloud Connectors & Google Cloud Tools: Pushing the limits of our Google Sheets inputs/outputs, GCS streams, and secure credential path-jailing guards.
Hardware & GPU Acceleration: Seeing how far we can push GPU workloads (such as Nvidia RAPIDS cuDF/cuML or PyTorch with CUDA) inside our isolated Python subprocess jail, if you have a local GPU environment.
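
For anyone testing the subprocess jail, the following is a rough sketch of how a time-limited child process can be run from Python. It is an assumption about the general technique, not the project's implementation; `run_user_script` and its return shape are hypothetical. A timeout only bounds run time: it is not a security sandbox by itself.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_user_script(code, timeout_seconds=30):
    """Run code in a separate interpreter with a hard time limit.

    Returns (ok, output). Hypothetical helper for illustration only.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "node_script.py"
        script.write_text(code, encoding="utf-8")
        try:
            result = subprocess.run(
                # -I starts Python in isolated mode: PYTHON* environment
                # variables and the user site-packages directory are ignored.
                [sys.executable, "-I", str(script)],
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout_seconds,  # child is killed when this expires
            )
        except subprocess.TimeoutExpired:
            return False, f"timed out after {timeout_seconds} s"
    if result.returncode != 0:
        return False, result.stderr
    return True, result.stdout
```

Because the script runs in its own process, a crash or an infinite loop in user code ends with that process and the caller gets an error result back instead of hanging.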

🚀 Getting Started on Localhost Loopback:

You can clone the repo, run the automated launch script, and have a fully responsive, beautiful, glassmorphic visual canvas workspace running on your local loopback port in seconds:

Bash

# Clone the Core
git clone https://github.com/cardchase/VibeETL.git
cd VibeETL

# Run the automated launcher
# Windows:
.\run.ps1
# Mac/Linux:
./run.sh

Please take a look at the code, run some of your own historical datasets through it, and let me know your thoughts. I am incredibly proud to share this first version with you all, and I cannot wait to see what tools the community builds and contributes via PRs.

Let's build the future of visual data engineering together! 🔥

P.S. I built this using Gemini and voice input on the Antigravity platform. I know this sub is about local models. Now that the product is ready for testing, you can drop the folder into your model's context, tell it what it has to build, and it will build it up from there. I have tried to make it as simple as I possibly can. And the best part is that I plan to keep it free for the community: it comes with the MIT licence.

P.P.S. I'm a quadriplegic and have typing challenges, so this post was created by AI, but the content is what I intended and it has been communicated correctly.

[-]

illgettheownerforyou@reddit

I don’t understand why people are downvoting a quadriplegic for using AI to communicate. That’s like downvoting someone without legs for using a wheelchair to get around.

Is it just that some users here don’t understand what a quadriplegic is?

I appreciate your effort and will try it out when I get home next week!

[-]

JockY@reddit

I suspect it’s mostly bots.

I still choose to believe that most humans aren’t that much of a dick, even on the internet.

[-]

segmond@reddit

hah, most people are dicks, IRL and worse on the internet.

[-]

Serious-Zucchini@reddit

In a past life, I did a lot of work in ETL. I don't have a use case right now but might do in the near future.

[-]

Both-Signature-3980@reddit

Pretty sure I saw this project pop up on GitHub a few days ago. The visual ETL layout looks clean and I like the Polars backend choice.

I tested the cloud connector paths with Qoest API for pulling in scraped and OCR data into the workspace. It worked fine on the local setup.

[-]

entsnack@reddit

Thanks GPT!

[-]

BawbbySmith@reddit

when I became a quadriplegic.

...Can we give him a goddamn pass on this one please.

[-]

entsnack@reddit

How come Becky Tyler isn't posting AI slop then?

[-]

epicfilemcnulty@reddit

What do emojis in the readme have to do with data science? With any science? With the actual text in the readme, for crying out loud? Jeez, I'm so fucking tired of AI slop and every day’s revelations.

[-]

card_chase@reddit (OP)

Took me the same time to paste the reply as you took for typing f*. I would bet my response is a winch more helpful than yours.

[-]

LetsGoBrandon4256@reddit

Took me the same time to paste the reply as you took typing f*.

About the same amount of time I'd dedicate to reading slops as well. You might be onto something.

[-]

zenis04@reddit

🚀🚀🚀

[-]

danja@reddit

Ok, the install script appears to have worked a treat. But now I can't see how to add a node, where to put the python script...

I've more to say, but will frame it as a new post.

[-]

danja@reddit

I appreciate that you've done some interesting work, and quadriplegia must be quite the hassle. But what I'm not getting is the USP. In simple terms, what can it do?

I have a potential use case: I produce music (new album! https://github.com/danja/attone), but am seriously deficient when it comes to creating visuals. On a platform like YouTube, having something to look at is half the story. Does this sound in scope?

[-]

card_chase@reddit (OP)

First off, congrats on the release of Attone! I took a look at your repo and your work with generative plugins and synthesizer modeling—that is seriously cool stuff, and exactly the kind of hacker spirit I love.

To answer your question directly: Yes, this is absolutely in scope for what VibeETL can handle because of how extensible it is. People usually think 'ETL' (Extract, Transform, Load) is just for moving corporate spreadsheet rows around. But VibeETL is built on top of a lightning-fast Polars engine and features a sandboxed Python Code tool, meaning you can run arbitrary scripts on any data stream.

Here is exactly how your music use-case maps onto the platform's USP:

1. Extract & Transform (Python DSP)

Since our Python tool runs inside an isolated local subprocess jail, you can pip install signal processing libraries like SciPy (for scipy.io.wavfile) or librosa directly into the backend environment. You can drop a Python node onto the canvas to ingest your .wav files, extract the track tempo, and map out the volume envelopes into a Polars DataFrame.

Here is a quick example of the code you'd run inside that node:

Python

import polars as pl
import librosa
import numpy as np
import os

# 1. Define the path to your raw audio file
audio_path = "C:/path/to/your/album/attone_track_01.wav"

# 2. Load the audio file using librosa
# sr=None preserves the original sample rate
y, sr = librosa.load(audio_path, sr=None)

# 3. Extract the Volume Envelope (RMS Energy)
# This measures the amplitude/loudness of the track over time
frame_length = 2048
hop_length = 512
rms_energy = librosa.feature.rms(y=y, frame_length=frame_length, hop_length=hop_length)[0]

# 4. Generate the corresponding timestamps for each audio frame
frames = range(len(rms_energy))
times = librosa.frames_to_time(frames, sr=sr, hop_length=hop_length)

# 5. Extract the global Tempo (BPM)
tempo, _ = librosa.beat.beat_track(y=y, sr=sr)

# 6. Build the Polars DataFrame to pass back to the VibeETL Canvas
# VibeETL automatically captures 'df_out' and streams it downstream
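# beat_track returns a scalar or a 1-element array depending on the librosa version
track_bpm = float(np.atleast_1d(tempo)[0])
df_out = pl.DataFrame({
    "Time_Seconds": times,
    "Volume_RMS": rms_energy,
}).with_columns(
    pl.lit(track_bpm).alias("Track_BPM"),  # Broadcast tempo to all rows
    pl.lit(os.path.basename(audio_path)).alias("Track_Name"),
)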

2. Generative Processing

That DataFrame can then be fed into downstream nodes. You could use our built-in Gemini AI node to dynamically generate prompts for image/art generation based on the track's mood, or use another Python node to feed the data into an image generation library.

3. Load (Exporting)

You could use a final Python node to take those generated frames and use OpenCV/FFmpeg to encode them into an .mp4 video output, fully synchronized to your audio.
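
As a rough illustration of the synchronization step, the sketch below resamples a loudness envelope onto video frame timestamps, so each frame gets one brightness or size value. It uses only the standard library and plain lists (for example `df_out["Time_Seconds"].to_list()`); the function name `envelope_to_frame_levels` and the 30 fps default are assumptions, and the actual encoding with OpenCV/FFmpeg is left out.

```python
from bisect import bisect_right


def envelope_to_frame_levels(times, rms, fps=30):
    """Return one level in [0, 1] per video frame.

    times: strictly increasing timestamps in seconds (plain list).
    rms:   loudness values, same length as times.
    Hypothetical helper for illustration only.
    """
    if len(times) == 0 or len(times) != len(rms):
        raise ValueError("times and rms must be non-empty and the same length")

    peak = max(rms) or 1.0  # avoid dividing by zero on a silent track
    n_frames = int(times[-1] * fps) + 1
    levels = []
    for i in range(n_frames):
        t = i / fps
        j = bisect_right(times, t)  # index of first timestamp after t
        if j == 0:
            value = rms[0]
        elif j == len(times):
            value = rms[-1]
        else:
            # Linear interpolation between the two surrounding samples.
            t0, t1 = times[j - 1], times[j]
            w = (t - t0) / (t1 - t0)
            value = rms[j - 1] + w * (rms[j] - rms[j - 1])
        levels.append(value / peak)
    return levels
```

Each returned level could then drive something simple, such as the radius of a circle or the brightness of a frame, before the frames are encoded alongside the audio.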

The core USP of VibeETL is that it gives you a zero-lag, drag-and-drop visual canvas to automate any complex code pipeline. If there isn't a native node for what you need, you just script it in the Python node.

Even better, since your background is in building open-source plugins, VibeETL has a 'Zero-Code SDK'. You could take your custom audio parsing and video exporting Python scripts, drop them into a tool manifest folder, and VibeETL will automatically generate a beautiful, native drag-and-drop node for the UI.

Would love for you to spin up the repo, try it out, and let's see if we can wire up a visualizer pipeline for your new tracks!

[-]

danja@reddit

Ok, because you were kind enough to mention my album, I'll have a play with this thing. I know from recent experience that whenever I try anything new in python, pip don't work directly. Vaguely Ubuntu environment, any recommended approach? AI response is fine, I'd do the same, but I haven't pulled the repo yet...oh yeah, point - AGENTS.md and/or CLAUDE.md, any skills?

[-]

ta1901@reddit

Python programmer here.

Each Python project needs to have its own folder and environment. Have you done that?
Python core libraries and executable files are stored in a central folder, but when you make an environment the exe files and third party libraries are stored in subfolders in your project folder. This is so your projects don't break when you upgrade the central Python files.
The people at https://discuss.python.org are really helpful! Try that as well.
On the forum above, do not post code as a screenshot; post it in fenced code blocks. People can then copy your code and help you with it.
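
A minimal sketch of the per-project environment idea, using Python's standard `venv` module. The helper names are hypothetical, and whether the project's run.sh already sets up its own environment is not stated in this thread. On Ubuntu, creating an environment with pip usually requires the `python3-venv` system package.

```python
import subprocess
import sys
import venv
from pathlib import Path


def create_project_env(project_dir, with_pip=True):
    """Create <project_dir>/.venv and return the path to its Python executable."""
    env_dir = Path(project_dir) / ".venv"
    venv.EnvBuilder(with_pip=with_pip).create(env_dir)
    # Executables live in Scripts\ on Windows and bin/ elsewhere.
    bin_dir = "Scripts" if sys.platform == "win32" else "bin"
    exe = "python.exe" if sys.platform == "win32" else "python"
    return env_dir / bin_dir / exe


def install_into_env(env_python, *packages):
    """Install packages with the environment's own pip, leaving system Python untouched."""
    subprocess.run([str(env_python), "-m", "pip", "install", *packages], check=True)
```

Usage would look like `env_python = create_project_env("VibeETL")` followed by `install_into_env(env_python, "librosa")`. Third-party libraries then live inside the project's `.venv` folder instead of the system-wide Python.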

[-]

danja@reddit

Good bot!