maddogawl

Phi-4 has been released

Posted by paf1138@reddit | LocalLLaMA | View on Reddit | 229 comments

maddogawl@reddit

Do you think that issue would also affect running it in LM Studio on AMD hardware? I can't get the model to load for the life of me. I tried ROCm, Vulkan, and a super low context window, and it still won't load. Q3, Q4, Q6, none of them load for me :/ The error is very vague: (Exit code: 0). Some model operation failed. Try a different model and/or config.

DeepSeek V3 is the shit.

Posted by Odd-Environment-7193@reddit | LocalLLaMA | View on Reddit | 301 comments

We deserve something better than LangChain

Posted by Available_Ad_5360@reddit | LocalLLaMA | View on Reddit | 68 comments

Get multiple llms talking to each other?

Posted by Aggressive_Special25@reddit | LocalLLaMA | View on Reddit | 4 comments

maddogawl@reddit

I’m working on a better guide, but here’s a video I did on having agents work together to build a book. The GitHub link is there as well. I’m currently working on 2 side projects: 1. Having 2 LLMs play a game. 2. Having a ton of agents build a game design document. I should have videos out on those this week. https://youtu.be/EVrL6Qg7e9A
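
For anyone who wants the basic shape of "two LLMs talking to each other" before watching, here is a minimal sketch. It is not the code from the video or the GitHub repo. The names `converse` and `make_stub` are hypothetical, and the stub speakers stand in for real model calls; the assumption is that each speaker is just a function that takes the transcript so far and returns a reply.

```python
from typing import Callable, List, Tuple

# One transcript entry: (speaker name, text).
Turn = Tuple[str, str]
# Assumption: a "speaker" is any function from the transcript so far to a reply.
Speaker = Callable[[List[Turn]], str]


def converse(agent_a: Speaker, agent_b: Speaker, opening: str, turns: int = 4) -> List[Turn]:
    """Alternate between two speakers, starting from an opening line by A."""
    transcript: List[Turn] = [("A", opening)]
    # B answers the opening line first, then the two take turns.
    order = [("B", agent_b), ("A", agent_a)]
    for i in range(turns):
        name, speak = order[i % 2]
        transcript.append((name, speak(transcript)))
    return transcript


def make_stub(label: str) -> Speaker:
    """Hypothetical stand-in for a model call: echoes part of the last message."""
    def speak(history: List[Turn]) -> str:
        _, last_text = history[-1]
        return f"{label} heard: {last_text[:20]}"
    return speak


if __name__ == "__main__":
    for name, text in converse(make_stub("alice"), make_stub("bob"), "hello", turns=3):
        print(f"{name}: {text}")
```

To use real models, replace each stub with a function that sends the transcript to whatever local server or API you run and returns the reply text.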

Deepseek-V3 GGUF's

Posted by fraschm98@reddit | LocalLLaMA | View on Reddit | 79 comments

maddogawl@reddit

I bet it's like a small heater lol! That's a really nice build you have. I was thinking about building a dedicated LLM machine with 4 to 8 of the Intel B580s, but I need to get one first to see how it performs. That, or get another 7900XTX and add it to my main computer.

LLM as survival knowledge base

Posted by NickNau@reddit | LocalLLaMA | View on Reddit | 152 comments

maddogawl@reddit

I was recently watching Silo on Apple TV, which got me thinking about how we could store all of the world's history without needing physical copies. I feel like LLMs are destined for that; we could send the entire Earth's history to another planet in the future. It's really amazing to think about. I'm really curious how close we are to that today. Could we take open-source DeepSeek V3 and have it give us detailed history lessons, and how accurate would it be? My mind is spinning lol

deepseek suks

Posted by RouteGuru@reddit | LocalLLaMA | View on Reddit | 11 comments

maddogawl@reddit

Curious what small coding errors you are seeing, and with what technology, language, etc.? I'm primarily working with TypeScript and Python, and I've found it to be near perfect. With Claude I usually have to tweak things a lot after it gives me code, whereas DeepSeek almost always seems to be good on the first try.

I used AI agents to see if I could write an entire book | AutoGen + Mistral-Nemo

Posted by maddogawl@reddit | LocalLLaMA | View on Reddit | 30 comments

maddogawl@reddit (OP)

I love that character creator. One idea I had for writing a book was to create an agent for each character and have it write its own story from its point of view, based on the situation around it. It seems ambitious to pull off, but it could be so cool. I could see character generators being used to fill out extras in the book that have full storylines of their own. Thank you for sharing KoboldAI, I did not know that existed. Checking it out more tonight!

I used AI agents to see if I could write an entire book | AutoGen + Mistral-Nemo

Posted by maddogawl@reddit | LocalLLaMA | View on Reddit | 30 comments

maddogawl@reddit (OP)

I found that the bigger the better. I landed on around 32k minimum with Mistral-Nemo to get the best results, but I did get okay results at half that.

I used AI agents to see if I could write an entire book | AutoGen + Mistral-Nemo

Posted by maddogawl@reddit | LocalLLaMA | View on Reddit | 30 comments

maddogawl@reddit (OP)

I agree, and I do think it will get there. Part of me wonders if a model could be fine-tuned on writing styles and trained to respond not as an assistant but as a writer. Then you could use assistant models together with the writer models. Totally theoretical, but I agree that right now they are not good enough.

I used AI agents to see if I could write an entire book | AutoGen + Mistral-Nemo

Posted by maddogawl@reddit | LocalLLaMA | View on Reddit | 30 comments

maddogawl@reddit (OP)

Here is the source code. I would love to hear any feedback on this little side project I took on: [https://github.com/adamwlarson/ai-book-writer](https://github.com/adamwlarson/ai-book-writer) I really feel we are just scratching the surface of AI agents and their possibilities.

µLocalGLaDOS - offline Personality Core

Posted by Reddactor@reddit | LocalLLaMA | View on Reddit | 136 comments

Radeon GPU for local LLM?

Posted by PetMogwai@reddit | LocalLLaMA | View on Reddit | 32 comments

maddogawl@reddit

How do you manage drivers when you have AMD and Nvidia both installed at the same time? I've always thought that could cause a ton of headaches.

What's your primary local LLM at the end of 2024?

Posted by AaronFeng47@reddit | LocalLLaMA | View on Reddit | 210 comments

maddogawl@reddit

I’m really enjoying Phi-4, the unofficial release. It seems to be good or decent at everything I try, from coding to writing. QwQ is probably my next one.

My Apple Intelligence Writing Tools for Windows/Linux/macOS app just had a huge new update. It supports a ton of local LLM implementations, and is open source & free :D. You can now chat with its one-click summaries of websites/YT videos/docs, and bring up an LLM chat UI anytime. Here's a new demo!

Posted by TechExpert2910@reddit | LocalLLaMA | View on Reddit | 38 comments

maddogawl@reddit

This is amazing, thank you for making this! Sent you a tip; hopefully you are able to keep working on it. As a developer myself, I know how hard it is to find time to build things like this for the community.

How does a model like QwQ do calculations like 4692*2 „in its head“?

Posted by andWan@reddit | LocalLLaMA | View on Reddit | 33 comments

Latest AMD Driver 24.12.1 performs significantly worse than 24.8.1 running QwQ.

Posted by maddogawl@reddit | LocalLLaMA | View on Reddit | 21 comments

Thoughts on Intel B850 GPU with 12GB VRAM for $250?

Posted by Chemical_Elk7746@reddit | LocalLLaMA | View on Reddit | 70 comments

Open models wishlist

Posted by hackerllama@reddit | LocalLLaMA | View on Reddit | 238 comments

maddogawl@reddit

Thank you so much for reaching out to the community to ask. My main use case is coding, and I've found that Gemma 27b, while a good model, just isn't great for my use case. I'd love a model with some additional reasoning capabilities that I could run locally. My go-to local model at the moment is QwQ, which is incredible but very wordy. More context is always better for what I'm doing, but I can work around that most of the time.

Gemini 2.0 Flash Experimental, anyone tried it?

Posted by robberviet@reddit | LocalLLaMA | View on Reddit | 67 comments

maddogawl@reddit

I'm using Sonnet 3.5 and putting together some larger tests at the moment, and it's really blowing my mind how much it's competing with 3.5 for my use cases. I primarily use it for coding: a mix of data science ML model building, data cleaning, and feature engineering, as well as backend and frontend code using Vue.js and TypeScript.

Gemini 2.0 Flash beating Claude Sonnet 3.5 on SWE-Bench was not on my bingo card

Posted by jd_3d@reddit | LocalLLaMA | View on Reddit | 169 comments

maddogawl@reddit

Using Gemini 2.0 Flash today was amazing. My only gripe is that I hit moments where responses were erroring out or taking 300+ seconds. I have a feeling this is a scaling issue since it was just released. It really crushed code for me today.

Gemini 2.0 Flash Experimental, anyone tried it?

Posted by robberviet@reddit | LocalLLaMA | View on Reddit | 67 comments

maddogawl@reddit

I've been trying it out, doing side-by-side comparisons with Claude and QwQ on a specific data science problem where I want to create a model that generates a propensity score. This is a very narrow use case, but here is what I found.

Pros:
1. The response time is incredibly fast.
2. The quality is on par with Claude for the first response, using an identical setup and prompts.
3. Both initial versions were very flawed.

Cons:
1. Fixing errors in 2.0: pasting the Python error leads to a new version of the code that still isn't fixed. I gave it 5 attempts, and the problem wasn't resolved. Claude had similar issues, but they were resolved after 3 attempts.

Mixed:
1. The models each one generated were fine, but what I liked about Google's was how it attempted to test multiple models against each other, where Claude just picked one.
2. The final quality of the model is still up in the air, but the features generated by the Google model were much more basic, where Claude put together some much more complex features.

I eventually hit a point with Google's where it quit giving me responses. I'm assuming they are hitting demand limits.

Llama 3.3 on a 4090 - quick feedback

Posted by latentmag@reddit | LocalLLaMA | View on Reddit | 105 comments

maddogawl@reddit

Woah, I didn't know you could cross brands/architectures that way. I assumed they all had to be the same card. So you can run model inference across 2 different GPUs?

Meta releases Llama3.3 70B

Posted by Amgadoz@reddit | LocalLLaMA | View on Reddit | 246 comments

maddogawl@reddit

I have a MacBook Pro M1. I'll have to give that a try, though it may not be good enough. I'm so curious how a Mac can load a 70B param model when a top-of-the-line graphics card in a Windows PC can't.

Meta releases Llama3.3 70B

Posted by Amgadoz@reddit | LocalLLaMA | View on Reddit | 246 comments

maddogawl@reddit

What do you guys use to run models like this? My limit seems to be 32B param models with limited context windows. I have 24GB of VRAM and I'm thinking I need to add another 24GB, but I'm curious if that would even be enough.
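
A rough back-of-the-envelope check for the "would 48GB be enough" question, as a sketch only. It counts weight memory as parameters times bits per weight, and the 2 GB overhead for KV cache and runtime is a placeholder assumption that grows with context length. The function names are hypothetical, and real quantized files usually use somewhat more than their nominal bits per weight.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 10^9 bytes)."""
    # params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9.
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float, vram_gb: float,
         overhead_gb: float = 2.0) -> bool:
    """True if weights plus an assumed fixed overhead fit in the given VRAM."""
    return weight_memory_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb


if __name__ == "__main__":
    for params, bits, vram in [(32, 4, 24), (70, 4, 24), (70, 4, 48), (70, 16, 48)]:
        need = weight_memory_gb(params, bits)
        verdict = "fits" if fits(params, bits, vram) else "does not fit"
        print(f"{params}B at {bits}-bit: ~{need:.0f} GB of weights, {verdict} in {vram} GB")
```

By this estimate a 70B model at about 4 bits per weight needs roughly 35 GB for weights, so it would not fit in 24GB but could fit in 48GB with a modest context window.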

Intel Battlemage GPUs Just Got Announced

Posted by Someone13574@reddit | LocalLLaMA | View on Reddit | 141 comments

maddogawl@reddit

I didn't think about this, but it could be worthwhile picking up like 4 of these GPUs to run local LLMs. I currently have a 7900XTX and hit limits pretty hard. How do the current Intel cards compare to AMD in performance running LLMs?