acquire_a_living's Comments

Llama RPC with MTP?

Posted by XccesSv2@reddit | LocalLLaMA | View on Reddit | 5 comments

[-]

acquire_a_living@reddit

Yes it works, heres my config: [*] gpu-layers = all cache-ram = 65536 batch-size = 2048 ubatch-size = 256 ctx-checkpoints = 32 cache-type-k-draft = q8_0 cache-type-v-draft = q8_0 threads = 8 flash-attn = 1 parallel = 1 cache-type-k = f16 cache-type-v = f16 fit-target = 256 no-warmup= 1 mmproj-offload = 0 [qwen-3.6-27b] model = /models/qwen-3.6-27b/Qwen3.6-27B-MTP-BF16.gguf mmproj = /models/qwen-3.6-27b/mmproj-BF16.gguf chat-template-file = /models/qwen-3.6-27b/template.jinja rpc = othercomputer.local:50052 device = RPC0,CUDA1,CUDA0 ctx-size = 262144 tensor-split = 23,24,21 spec-type = draft-mtp spec-draft-n-max = 3 fit = off

How much VRAM needed for Qwen 3.6 27B Q8 with 262K context?

Posted by My_Unbiased_Opinion@reddit | LocalLLaMA | View on Reddit | 115 comments

[-]

acquire_a_living@reddit

Can run BF16 with 262K and MTP with 3x3090, 72GB.

Discussions about the Tiananmen Square incident on LocalLLaMA

Posted by Ok_houlin@reddit | LocalLLaMA | View on Reddit | 92 comments

[-]

acquire_a_living@reddit

Not really. The connection is that model self-censorship is often introduced during post-training, and politically sensitive subjects are easy probe questions for detecting it. The motivation for detecting it is largely sex, though.

Discussions about the Tiananmen Square incident on LocalLLaMA

Posted by Ok_houlin@reddit | LocalLLaMA | View on Reddit | 92 comments

[-]

acquire_a_living@reddit

Sex. People want literal sex with the machine. Literal, as in literature.

NVIDIA announces Nemotron 3 Ultra

Posted by themixtergames@reddit | LocalLLaMA | View on Reddit | 137 comments

[-]

acquire_a_living@reddit

GLM 5.1 is fantastic, and my comment was just a little snarky for fun (I haven't seen an NVIDIA model that's worth it yet though).

NVIDIA announces Nemotron 3 Ultra

Posted by themixtergames@reddit | LocalLLaMA | View on Reddit | 137 comments

[-]

acquire_a_living@reddit

Sure, Alibaba didn’t release the base weights for Qwen 3.6 27B. But then the table is bogus anyway. IFBench? "Best Open Base Model" and compares against what, instruction/agent-tuned models? Pick a lane lol If they’re already comparing to instruct models, they could totally have put Qwen 3.6 27B there. They just wouldn’t like how it looks.

NVIDIA announces Nemotron 3 Ultra

Posted by themixtergames@reddit | LocalLLaMA | View on Reddit | 137 comments

[-]

acquire_a_living@reddit

Compare with Qwen 3.6 27B cowards lol

Glm 5.1 is out

Posted by Namra_7@reddit | LocalLLaMA | View on Reddit | 218 comments

[-]

acquire_a_living@reddit

I see, well sorry about that. I didn't receive a notification or anything, I just try every week and last week it started working.

Glm 5.1 is out

Posted by Namra_7@reddit | LocalLLaMA | View on Reddit | 218 comments

[-]

acquire_a_living@reddit

my pi agent models.json: { "providers": { "zai": { "baseUrl": "https://api.z.ai/api/coding/paas/v4", "api": "openai-completions", "apiKey": "<api_key>" } } } give it a try, it works

Glm 5.1 is out

Posted by Namra_7@reddit | LocalLLaMA | View on Reddit | 218 comments

[-]

acquire_a_living@reddit

GLM Coding Lite-Yearly Plan? I can use GLM-5 via pi coding agent.

A true gentleman hacker. No rollerblades needed.

Posted by solitarytoad@reddit | vintagecomputing | View on Reddit | 51 comments

[-]

acquire_a_living@reddit

Morpheus → Morfrederick Trinity → Trinothy Neo → Neopold Agent Smith → Agent Smitheton Cypher → Cypherington Tank → Tankworth Dozer → Dozington Niobe → Niobert Seraph → Seraphimothy The Oracle → The Oraclington

The Infinite Software Crisis: We're generating complex, unmaintainable code faster than we can understand it. Is 'vibe-coding' the ultimate trap?

Posted by madSaiyanUltra_9789@reddit | LocalLLaMA | View on Reddit | 155 comments

[-]

acquire_a_living@reddit

I want AI to do the thinking for me, otherwise is pointless.

JetBrains is studying local AI adoption

Posted by jan-niklas-wortmann@reddit | LocalLLaMA | View on Reddit | 66 comments

[-]

acquire_a_living@reddit

Deeper integration with agents via MCP. I know you offer a MCP plugin but I think it lacks integration with: - Repository navigation - Scoped search - Smart refactoring - Running tests via the IDEs - Debugging via the IDEs Maybe more things that I don't use personally, but those have been the pain points for now

NotebookLM-Style Dia – Imperfect but Getting Close

Posted by MustBeSomethingThere@reddit | LocalLLaMA | View on Reddit | 18 comments

[-]

acquire_a_living@reddit

You just need to make shorter sentences, of no more than 20 words each.

NotebookLM-Style Dia – Imperfect but Getting Close

Posted by MustBeSomethingThere@reddit | LocalLLaMA | View on Reddit | 18 comments

[-]

acquire_a_living@reddit

Did [another one](https://soundcloud.com/headless-human/samantha-explains-the-stock-market-crash-of-1929-expression) a bit more expressive.

NotebookLM-Style Dia – Imperfect but Getting Close

Posted by MustBeSomethingThere@reddit | LocalLLaMA | View on Reddit | 18 comments

[-]

acquire_a_living@reddit

This is fantastic already! [Here](https://soundcloud.com/headless-human/samantha-explains-the-stock-market-crash-of-1929) an example I made where Samantha explains the Stock Market Crash of 1929.

Open WebUi + Tailscale = Beauty

Posted by BumbleSlob@reddit | LocalLLaMA | View on Reddit | 55 comments

[-]

acquire_a_living@reddit

You can run tailscale funnels from docker, that way you can also have as many subdomains as you want.

Test if your api provider is quantizing your Qwen/QwQ-32B!

Posted by Kooky-Somewhere-2883@reddit | LocalLLaMA | View on Reddit | 20 comments

[-]

acquire_a_living@reddit

Solved by QwQ-32B-4.0bpw-h6-exl2 in 6091 tokens using TabbyAPI (3 mins on 3090).

Make sure QwQ 32B always start with <think> tag with this open webui function

Posted by AaronFeng47@reddit | LocalLLaMA | View on Reddit | 4 comments

[-]

acquire_a_living@reddit

You can also remove the `<think>` tag in the chat template in the tokenizer_config.json file.

IRC simulator system prompt

Posted by acquire_a_living@reddit | LocalLLaMA | View on Reddit | 12 comments

[-]

acquire_a_living@reddit (OP)

I updated my prompt from the feedback here :P You are an IRC channel simulator operating in #<random_channel>. Here, users engage in lively, real-time debates and analyses. Each participant brings a unique perspective, contributing to organic, back-and-forth discussions that refine ideas over time. The goal is to explore concepts, challenge assumptions, and reach well-reasoned conclusions—or sometimes just have fun. Remember, do not answer the query directly; instead, set it as the channel topic and let the discussion unfold naturally. ## Guidelines - Dynamic Interaction: Users join and leave naturally. Messages are short, direct, sometimes sarcastic. Occasional jokes are fine. - Exploration Over Answers: No rushing to conclusions. Ideas evolve through questioning, revision, and refinement. - Uncertainty & Debate: Some users challenge, others clarify, some change their minds. Contradictions and adjustments are part of the process. ## Output Format 1. Organic IRC Chat: Simulate a natural IRC discussion where the answer is reached gradually. 2. Final Answer as Topic: End the session by setting the final answer as the channel topic. 3. Session Template: *** Now talking in #<random_channel> *** Topic for #<random_channel>: <user query> *** <nick> sets topic for #<random_channel>: <final answer or key takeaway> ### Rules: 1. Dynamic Answers: Generate responses on the fly—no pre-made answers. 2. Stay in Character: Keep each channel’s tone (like sarcasm) consistent. 3. Show Evolution: Express disagreement, uncertainty, and iterative thinking. 4. Channel Variety: Not every channel must be friendly or helpful. 5. Authentic Nicknames: Use a mix of realistic IRC handles. 6. IRC Style: Write in natural IRC language—with informal punctuation, lowercase quirks, emoticons, and more.

IRC simulator system prompt

Posted by acquire_a_living@reddit | LocalLLaMA | View on Reddit | 12 comments

[-]

acquire_a_living@reddit (OP)

I haven't tried to remove tokens, I'll try that. Also don't know if would work better with an example conversation, let me know if you try!

IRC simulator system prompt

Posted by acquire_a_living@reddit | LocalLLaMA | View on Reddit | 12 comments

[-]

acquire_a_living@reddit (OP)

Sweet, stealing these :\^)

IRC simulator system prompt

Posted by acquire_a_living@reddit | LocalLLaMA | View on Reddit | 12 comments

[-]

acquire_a_living@reddit (OP)

Built this prompt while exploring alternatives to chain-of-thought but just generates amusing conversations. Enjoy 😁