I built a local AI agent that turns my messy computer into a private, searchable memory
Posted by AlanzhuLy@reddit | LocalLLaMA | 23 comments
My own computer is a mess: Obsidian markdowns, a chaotic downloads folder, random meeting notes, endless PDFs. I've spent hours digging for one piece of information I know is in there somewhere, and I'm sure plenty of valuable insights are still buried.
So I built Hyperlink — an on-device AI agent that searches your local files, powered by local AI models. 100% private. Works offline. Free and unlimited.
https://reddit.com/link/1nfa11x/video/fyfbgmuivrof1/player
How I use it:
- Connect my entire desktop, downloads folder, and Obsidian vault (1000+ files) and have them scanned in seconds. No more re-uploading updated files to a chatbot.
- Ask my PC questions the way I'd ask ChatGPT and get answers from my files in seconds, with inline citations to the exact source file.
- Target a specific folder (@research_notes) and have it "read" only that set, like a ChatGPT Project. I can keep my "context" (files) organized on my PC and use it directly with the AI, with no re-uploading or re-organizing.
- The AI agent also understands text in images (screenshots, scanned docs, etc.); there's a rough sketch of that OCR step right after this list.
- I can also pick any Hugging Face model (GGUF + MLX supported) for different tasks. I particularly like OpenAI's GPT-OSS. It feels like using ChatGPT’s brain on my PC, but with unlimited free usage and full privacy.
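For the curious, the OCR step in tools like this is conceptually something like the following. This is a minimal sketch using pytesseract as a stand-in, not Hyperlink's actual pipeline, and the file name is a placeholder:

```python
from PIL import Image   # pip install pillow
import pytesseract      # pip install pytesseract; also needs the Tesseract binary installed

# Extract text from a screenshot or scanned doc so it can be indexed like any other file.
# "screenshot.png" is a placeholder path.
text = pytesseract.image_to_string(Image.open("screenshot.png"))
print(text)
```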
Download and give it a try: hyperlink.nexa.ai
Works today on Mac + Windows; an ARM build is coming soon. It's completely free and private to use.
I'm looking to expand features, so suggestions and feedback are welcome! I'd also love to hear: what kinds of use cases would you want a local AI agent like this to solve?
Hyperlink uses Nexa SDK (https://github.com/NexaAI/nexa-sdk), which is an open-source local AI inference engine.
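If you've never run local inference before, here's a rough idea of what an engine like this does, sketched with llama-cpp-python as a stand-in. This is not Nexa SDK's API, and the repo and filename below are just examples of a quantized GGUF model:

```python
from huggingface_hub import hf_hub_download  # pip install huggingface_hub
from llama_cpp import Llama                  # pip install llama-cpp-python

# Pull a quantized GGUF model from Hugging Face, then run it entirely on-device.
model_path = hf_hub_download(
    repo_id="TheBloke/Mistral-7B-Instruct-v0.2-GGUF",  # example repo
    filename="mistral-7b-instruct-v0.2.Q4_K_M.gguf",   # example quantization
)
llm = Llama(model_path=model_path, n_ctx=4096)

out = llm("Summarize the key points of my meeting notes:\n...", max_tokens=256)
print(out["choices"][0]["text"])
```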
Bytebirdie@reddit
Hey there! How do you connect your agent to the data? How have you made it private?
AlanzhuLy@reddit (OP)
It is private because everything, including the indexing model, runs locally on device. The agent also auto-tracks changes in the folders you've indexed.
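For anyone wondering what auto-tracking looks like mechanically, here's a minimal sketch using Python's watchdog library. It illustrates the general approach, not our actual implementation; the vault path and the reindex hook are placeholders:

```python
import time
from watchdog.observers import Observer              # pip install watchdog
from watchdog.events import FileSystemEventHandler

class ReindexHandler(FileSystemEventHandler):
    # Re-index a file whenever it is created or modified.
    def on_any_event(self, event):
        if not event.is_directory and event.event_type in ("created", "modified"):
            print(f"re-indexing {event.src_path}")   # placeholder for the real reindex call

observer = Observer()
observer.schedule(ReindexHandler(), "/path/to/obsidian-vault", recursive=True)
observer.start()
try:
    while True:
        time.sleep(1)   # keep watching until interrupted
except KeyboardInterrupt:
    observer.stop()
observer.join()
```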
SimonPage@reddit
I'm curious why your model downloads at like... 1.5 MB/s, when I have a download speed of 102.96 Mbps according to speedtest.net (both "live" current values -- just too lazy to upload an image to imgur).
Any way to speed it up? I seriously don't plan to wait 1.5 hours for your model to load.
AlanzhuLy@reddit (OP)
Which continent/country are you in? Let me check
SimonPage@reddit
USA... it did eventually speed up and I got the application to load. Thanks!
TaiMaiShu-71@reddit
Do you have a repo? What is the future plan, open or closed source? I like the idea.
AlanzhuLy@reddit (OP)
The frontend is closed source; you can download it here: hyperlink.nexa.ai
Our backend AI engine is open source here: https://github.com/NexaAI/nexa-sdk
Would love to hear your feedback on what use case we should target next!
torako@reddit
i was mildly interested until i heard closed source, nevermind
twack3r@reddit
So this is RAG without an API endpoint?
AlanzhuLy@reddit (OP)
Yes. All in one app.
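In case "RAG" is new to anyone reading: the retrieval half is conceptually as simple as embedding your file chunks and your question, then comparing them. Here's a minimal sketch with sentence-transformers; it's illustrative only, not Hyperlink's implementation, and the chunks are placeholders:

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

# A small embedding model that runs fully on-device once downloaded.
model = SentenceTransformer("all-MiniLM-L6-v2")

chunks = [
    "2024-03-12 meeting notes: launch moved to Q3 ...",            # placeholder file chunks
    "research summary: retrieval quality depends on chunking ...",
]
chunk_vecs = model.encode(chunks, normalize_embeddings=True)

query = "when is the launch?"
q_vec = model.encode([query], normalize_embeddings=True)[0]

# Vectors are normalized, so a dot product is cosine similarity.
scores = chunk_vecs @ q_vec
best = int(np.argmax(scores))

# The top chunk (plus its file path, for the inline citation) is what
# gets stuffed into the local LLM's prompt to generate the answer.
print(chunks[best], scores[best])
```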
UdyrPrimeval@reddit
Hey, turning chaotic notes into neat knowledge graphs with a local AI agent? That's genius! I've got a pile of scribbles that could use that magic touch without shipping data to the cloud.
A few tweaks to level it up:
- Integrate something like Neo4j for graph visualization; it makes querying intuitive, but the trade-off is added overhead on lighter machines, so optimize for RAM.
- Fine-tune your LLM on domain-specific notes (e.g., via LoRA), which boosts accuracy, though training time can drag; in my experience, starting with small batches avoids frustration.
- Add versioning for graphs to track changes over time, though it might complicate the pipeline if not kept modular.
Local setups like this are perfect for privacy nuts. Communities here share great forks, and events like LLM workshops or hackathons (e.g., the Sensay Hackathon) can help iterate on agent ideas.
SM8085@reddit
Neat. Personally, I would like remote (but still on my LAN) inference through an API endpoint as an option. I've got my LLM rig in the dining room and point everything at it. That way the 18-32+ GB requirement lands on the rig, while something like my cheap laptop can still run the app, since the laptop isn't doing the inference.
gadgetb0y@reddit
This. I run my local models on a beefier machine on my LAN using LM Studio or Ollama.
johnerp@reddit
I second this; if you could get this to the top of the feature list, OP, that would be awesome.
AlanzhuLy@reddit (OP)
This is an interesting idea! Maybe with an API endpoint, we could let users interact with this local AI agent from a phone while the inference stays on a more powerful device.
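For reference, the pattern people are describing already works with tools like llama.cpp's llama-server, which speaks an OpenAI-compatible API. A client anywhere on the LAN can then do something like the following; the IP, port, and model name are placeholders, and this is a sketch of the general pattern, not a Hyperlink feature today:

```python
from openai import OpenAI  # pip install openai

# Point an OpenAI-compatible client at a llama.cpp server running elsewhere on the LAN.
client = OpenAI(base_url="http://192.168.1.50:8080/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="local-model",  # single-model servers typically ignore this field
    messages=[{"role": "user", "content": "Summarize my research notes on chunking."}],
)
print(resp.choices[0].message.content)
```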
whatever462672@reddit
Add inference over a network API. I have a separate llama.cpp rig in my rack so the heat doesn't follow me. Almost no tools let you simply enter an endpoint IP.
-dysangel-@reddit
https://i.redd.it/3b9b0o9jgtof1.gif
JazzlikeLeave5530@reddit
Or use Everything, which is faster and better than the Start menu search.
AlanzhuLy@reddit (OP)
lmao. Love the suggestion of replacing the search bar with hyperlink
teh_spazz@reddit
I’m dying lmao.
No-Mountain3817@reddit
Option to use models already downloaded on the system.
Where does it store models?
PhotoRepair@reddit
So, like NVIDIA's Chat with RTX (https://www.nvidia.com/en-in/ai-on-rtx/chat-with-rtx-generative-ai/)? How does it differ?
AlanzhuLy@reddit (OP)
Chat with RTX is tied to NVIDIA GPUs. Hyperlink runs on CPUs and GPUs from Apple, AMD, Intel, and NVIDIA, and it integrates with your local file system, so it keeps track of file changes automatically. You also get inline citations to verify answers and explore the context. And unlike most local AI tools, we've polished the UI so it feels closer to cloud apps.