Sandboxed agent runs for GitHub repos with replayable video output
Posted by delecioushelix@reddit | programming | 2 comments
I’ve been experimenting with a workflow for making coding-agent runs more observable.
Instead of asking an agent to summarize a repo, the system runs the agent in a sandbox against a GitHub repo and records the actual terminal/browser session. The result is a replayable video of what happened: setup, failures, retries, browser state, and final output.
The motivation is that text summaries from agents hide a lot: the failed commands, the retries, and the intermediate state never make it into the write-up. For repo evaluation, the path matters as much as the final answer.
High-level flow:
GitHub repo → sandbox → agent run → terminal/browser recording → processed replay
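To make the "agent run → recording → processed replay" part of that flow concrete, here is a minimal sketch of one way the terminal side could be captured. It is not the project's actual implementation: the `Event` record, the stage names, and the JSON layout are all hypothetical, and it does no sandboxing and no browser or video capture. It only shows the idea of keeping a timestamped event log, including failures, instead of a summary.

```python
import json
import subprocess
import sys
import time
from dataclasses import asdict, dataclass


@dataclass
class Event:
    """One replayable line of output (hypothetical format, not the project's)."""
    offset_s: float  # seconds since the whole run started
    stage: str       # illustrative stage label, e.g. "setup" or "agent"
    text: str        # one line of combined stdout/stderr


def record_stage(stage, argv, started_at, cwd=None):
    """Run one command and return its output as timestamped events."""
    events = []
    proc = subprocess.Popen(
        argv,
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave errors with normal output
        text=True,
    )
    for line in proc.stdout:
        events.append(Event(time.monotonic() - started_at, stage, line.rstrip("\n")))
    code = proc.wait()
    # Keep the exit code in the log so failed steps stay visible in the replay.
    events.append(Event(time.monotonic() - started_at, stage, f"[exit {code}]"))
    return events


def record_run(stages, cwd=None):
    """Run each (stage, argv) pair in order and collect one event log."""
    started_at = time.monotonic()
    events = []
    for stage, argv in stages:
        events.extend(record_stage(stage, argv, started_at, cwd))
    return events


def to_replay_json(events):
    """Serialize the event log so a separate viewer could replay it later."""
    return json.dumps([asdict(e) for e in events], indent=2)


if __name__ == "__main__":
    # Stand-in commands; a real run would execute inside an isolated sandbox.
    demo_stages = [
        ("setup", [sys.executable, "-c", "print('installing deps')"]),
        ("agent", [sys.executable, "-c", "import sys; print('tests failed'); sys.exit(1)"]),
    ]
    print(to_replay_json(record_run(demo_stages)))
```

Because every line carries an offset from the start of the run, a viewer can play the log back at the original pace or jump straight to the first non-zero exit code, which is the kind of detail a text summary tends to drop.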
Demo: https://www.trymyrepo.com
Architecture notes: https://www.trymyrepo.com/how-it-works
I'm planning to open source it next week. I'm curious whether people here think this kind of “visual evidence” is useful for agent workflows, or whether logs and traces are enough.
programming-ModTeam@reddit
r/programming is not a place to post your project, get feedback, ask for help, or promote your startup.
Technical write-ups on what makes a project technically challenging, interesting, or educational are allowed and encouraged, but just a link to a GitHub page or a list of features is not allowed.
The technical write-up must be the focus of the post, not just a tickbox-checking exercise to get us to allow it. This is a technical subreddit.
We don't care what you built, we care how you build it.
fiskfisk@reddit
We don't need LLM generated spam. Sod off.