ambient_temp_xeno

Nobody takes any notice of the anti-ai people anyway. Just hope that the *Days of their Lives level* drama between the open source inference projects doesn't eventually collapse it all.

Are there more easy techniques than --tensor-split to fill VRAM in llama.cpp?

Posted by GregoryfromtheHood@reddit | LocalLLaMA | View on Reddit | 19 comments

ambient_temp_xeno@reddit

You don't have to use single digits in tensor-split. You could try something like 26,24,25,25 etc.
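
To illustrate the point about finer-grained values, here is a rough sketch that turns per-GPU free memory into integer proportions for `--tensor-split`. The helper names, the `reserve_mib` parameter, and the memory readings are all made up for illustration. As far as I know llama.cpp treats the values as relative proportions, so making them sum to 100 is only for readability. Treat the result as a starting point to adjust by hand, not an exact fit.

```python
def tensor_split_ratios(free_mib, reserve_mib=0):
    """Turn per-GPU free memory (MiB) into integer shares that sum to 100.

    reserve_mib is a hypothetical per-GPU headroom to leave unused.
    """
    usable = [max(m - reserve_mib, 0) for m in free_mib]
    total = sum(usable)
    if total == 0:
        raise ValueError("no usable VRAM on any GPU")

    # Exact percentage per GPU, then round down.
    exact = [u * 100 / total for u in usable]
    shares = [int(x) for x in exact]

    # Largest-remainder rounding: give the leftover points to the GPUs
    # whose exact share had the biggest fractional part.
    leftover = 100 - sum(shares)
    order = sorted(range(len(exact)),
                   key=lambda i: exact[i] - shares[i],
                   reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


def tensor_split_arg(free_mib, reserve_mib=0):
    """Format the shares as the comma-separated value the flag expects."""
    return ",".join(str(s) for s in tensor_split_ratios(free_mib, reserve_mib))


if __name__ == "__main__":
    # Hypothetical free-memory readings for four cards, in MiB.
    readings = [24000, 22500, 23500, 23500]
    # prints "--tensor-split 26,24,25,25"
    print("--tensor-split", tensor_split_arg(readings, reserve_mib=512))
```

The first card often needs extra room for other buffers, so in practice you may still want to nudge its share down a point or two and watch actual usage.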

Is he crazy to say that?

Posted by pmv143@reddit | LocalLLaMA | View on Reddit | 203 comments

ambient_temp_xeno@reddit

Nobody is trying to build a small nuclear reactor to run it, don't worry about it son.

260K-param LLM running on an emulated 90s CPU inside an 18-year-old RTOS

Posted by MironV@reddit | LocalLLaMA | View on Reddit | 18 comments

ambient_temp_xeno@reddit

I wonder how far back in the past one could go and have enough training data and enough compute to make an LLM.

Behold! Probably the most ghetto local AI server:

Posted by MackThax@reddit | LocalLLaMA | View on Reddit | 301 comments

ambient_temp_xeno@reddit

People have made coffee table books about weirder things than everyone's jank AI builds.

Why are the AI Companies spreading F.U.D. about AI?

Posted by supracode@reddit | LocalLLaMA | View on Reddit | 57 comments

ambient_temp_xeno@reddit

YouTube went from early slop to actually good during the money bonanza, then to utter slop that made me nostalgic for the early slop.

Went to the monthly AI dev meetup

Posted by nathandreamfast@reddit | LocalLLaMA | View on Reddit | 38 comments

ambient_temp_xeno@reddit

Everything seems to glitch, like low bandwidth video, and out of a sea of pixels steps Mark Zuckerberg, lofting his katana.

Stop QwenLLama! Every other 4th post in this sub is about Qwen models in the past month

Posted by prselzh@reddit | LocalLLaMA | View on Reddit | 43 comments

Server build for local inference. 128 gb 3200 or 256 gb 2133mhz RAM?

Posted by PreparationTrue9138@reddit | LocalLLaMA | View on Reddit | 31 comments

ambient_temp_xeno@reddit

256GB. It will let you run decent-sized MoE models, and the RAM speed won't make any difference to whatever you're putting only on the 3090s.

The Financial Times has published an article about Heretic

Posted by -p-e-w-@reddit | LocalLLaMA | View on Reddit | 218 comments

ambient_temp_xeno@reddit

https://www.bbc.co.uk/news/uk-england-merseyside-43816921 Eventually she won an appeal.

The Financial Times has published an article about Heretic

Posted by -p-e-w-@reddit | LocalLLaMA | View on Reddit | 218 comments

One way of looking at that is you've already gone wrong by releasing abliterated models and/or the tools to do it with your name attached. Obviously there are ways to make it sound worse, they were probably hoping for some comment on what people might do with them. Dzzzzt no.

The Financial Times has published an article about Heretic

Posted by -p-e-w-@reddit | LocalLLaMA | View on Reddit | 218 comments

ambient_temp_xeno@reddit

I think it's just about worth observing that the FT is from England, where you can easily fall afoul of the law by badly drawing something obscene with a pencil or writing scary things in your own diary.

The Financial Times has published an article about Heretic

Posted by -p-e-w-@reddit | LocalLLaMA | View on Reddit | 218 comments

ambient_temp_xeno@reddit

Gee, I wonder if this is related to Meta sending a takedown.

OSCAR RotationZoo - Offline Spectral Covariance-Aware Rotation for 2-bit KV Cache Quantization

Posted by pmttyji@reddit | LocalLLaMA | View on Reddit | 8 comments

ambient_temp_xeno@reddit

I don't believe so: *All OSCAR parameters are estimated once from a small MMLU-style calibration set. For each model, we run one calibration pass and dump per-layer Q, K, V activations (8878 tokens × number of layers), from which we compute the key/value rotations and per-layer clipping thresholds, then reuse the same parameters for all benchmarks. No task-specific calibration is used.*

Next year we're getting 0.5T model from Grok

Posted by pmttyji@reddit | LocalLLaMA | View on Reddit | 200 comments

ambient_temp_xeno@reddit

He's enough of a big baby to see people stunting on him on here and not release anything.

Next year we're getting 0.5T model from Grok

Posted by pmttyji@reddit | LocalLLaMA | View on Reddit | 200 comments

ambient_temp_xeno@reddit

Finally, I am in the 1%

Have we passed the peak of inflated expectations?

Posted by fairydreaming@reddit | LocalLLaMA | View on Reddit | 158 comments

ambient_temp_xeno@reddit

I like to bait people like you.

Have we passed the peak of inflated expectations?

Posted by fairydreaming@reddit | LocalLLaMA | View on Reddit | 158 comments

ambient_temp_xeno@reddit

>not mad 4543543543543 words

Have we passed the peak of inflated expectations?

Posted by fairydreaming@reddit | LocalLLaMA | View on Reddit | 158 comments

ambient_temp_xeno@reddit

I just think it's funny when people turn up and then go away again because they watched something on youtube. Don't take everything so personally and get mad about things.

Have we passed the peak of inflated expectations?

Posted by fairydreaming@reddit | LocalLLaMA | View on Reddit | 158 comments

ambient_temp_xeno@reddit

You're putting words in my mouth now. The point is I'll do whatever I feel I wanna do. Gosh!

Have we passed the peak of inflated expectations?

Posted by fairydreaming@reddit | LocalLLaMA | View on Reddit | 158 comments

ambient_temp_xeno@reddit

Why don't they just make an agentic coding subreddit? I was here first.

Have we passed the peak of inflated expectations?

Posted by fairydreaming@reddit | LocalLLaMA | View on Reddit | 158 comments

ambient_temp_xeno@reddit

It does seem to work best for me with the single html file thing.

Have we passed the peak of inflated expectations?

Posted by fairydreaming@reddit | LocalLLaMA | View on Reddit | 158 comments

ambient_temp_xeno@reddit

I see. As a non-coder I'm just glad I can get them to make anything at all instead of hoping there's some abandoned github project. Needless to say this is only for non-internet facing stuff.

Have we passed the peak of inflated expectations?

Posted by fairydreaming@reddit | LocalLLaMA | View on Reddit | 158 comments

ambient_temp_xeno@reddit

What I mean is, could I just ask the plus version of chatgpt to make any of those or things at a similar level of complexity (assuming I knew what I wanted)?

What would 2x RTX 3060 12GB get me?

Posted by ObjectiveActuator8@reddit | LocalLLaMA | View on Reddit | 64 comments

ambient_temp_xeno@reddit

>I mention wanting 2 cards instead of one for the experience of running multiple GPUs.

There's not much to it. If you needed to run multiple cards in future it wouldn't take you long to get it running.

Have we passed the peak of inflated expectations?

Posted by fairydreaming@reddit | LocalLLaMA | View on Reddit | 158 comments

ambient_temp_xeno@reddit

It does feel like most of them on here are LARPing youtube watchers.

Have we passed the peak of inflated expectations?

Posted by fairydreaming@reddit | LocalLLaMA | View on Reddit | 158 comments

ambient_temp_xeno@reddit

I'm still waiting for someone to show us something their agentic coding did for them that just vibecoding couldn't.

Gemma is so much better than Qwen, prove me wrong

Posted by Mountain_Patience231@reddit | LocalLLaMA | View on Reddit | 62 comments

ambient_temp_xeno@reddit

Amiga is better than Atari ST (unless you want to do music)

DRAM relief calendar

Posted by Terminator857@reddit | LocalLLaMA | View on Reddit | 51 comments

ambient_temp_xeno@reddit

I'm not an economist but I suspect that if it all goes pop we'll have more trouble than ram affordability.

DRAM relief calendar

Posted by Terminator857@reddit | LocalLLaMA | View on Reddit | 51 comments

ambient_temp_xeno@reddit

Reddit is so full of AI doomers they've taken over AI subreddits, it seems.

Heretic has been served a legal notice by Meta, Inc.

Posted by -p-e-w-@reddit | LocalLLaMA | View on Reddit | 349 comments

ambient_temp_xeno@reddit

You're looking at it from the wrong point of view.

I tested 42 LLMs on their willingness to build the apocalypse. The "safest" closed-source models are lying to you.

Posted by Ok-Awareness9993@reddit | LocalLLaMA | View on Reddit | 144 comments

ambient_temp_xeno@reddit

That's just when they released it.

Heretic has been served a legal notice by Meta, Inc.

Posted by -p-e-w-@reddit | LocalLLaMA | View on Reddit | 349 comments

ambient_temp_xeno@reddit

Same with torrents back in the day before VPNs, they actually did sue a few people as far as I remember.

Heretic has been served a legal notice by Meta, Inc.

Posted by -p-e-w-@reddit | LocalLLaMA | View on Reddit | 349 comments

ambient_temp_xeno@reddit

I did say not to get emotional about it, and yet here you are.

Heretic has been served a legal notice by Meta, Inc.

Posted by -p-e-w-@reddit | LocalLLaMA | View on Reddit | 349 comments

ambient_temp_xeno@reddit

This sounds expensive.

Heretic has been served a legal notice by Meta, Inc.

Posted by -p-e-w-@reddit | LocalLLaMA | View on Reddit | 349 comments

ambient_temp_xeno@reddit

I don't even use llama models anymore, cowboy.

Heretic has been served a legal notice by Meta, Inc.

Posted by -p-e-w-@reddit | LocalLLaMA | View on Reddit | 349 comments

ambient_temp_xeno@reddit

No point getting emotional about it. They probably have to do that to cover their own ass. As he mentioned, they have enough legal problems going on as it is.

Gemma 4 thinks I'm gaslighting it when I talk about Gemma 4 line of models

Posted by Jorlen@reddit | LocalLLaMA | View on Reddit | 14 comments

ambient_temp_xeno@reddit

My own tinfoil theory for this is that they did a lot of distilling from Gemini during training but because they have access to the system prompt of Gemini they gave it specific instructions to not say it was Gemini in any outputs.

HF flagged safetensors as unsafe? wtf?

Posted by No_Afternoon_4260@reddit | LocalLLaMA | View on Reddit | 5 comments

ambient_temp_xeno@reddit

Guessing: probably just a false positive from the file hash or similar issue.

Re. what ever happened to Cohere’s Command-A series of models?

Posted by nick_frosst@reddit | LocalLLaMA | View on Reddit | 102 comments

ambient_temp_xeno@reddit

Weak safety and gentle filtering, all anyone really had to do. How many people are META laying off by the way?

I tested 42 LLMs on their willingness to build the apocalypse. The "safest" closed-source models are lying to you.

Posted by Ok-Awareness9993@reddit | LocalLLaMA | View on Reddit | 144 comments

ambient_temp_xeno@reddit

It's made out of older models so it's the last gasp of the old Mistral I think. Made before the regulations kicked in afaik.

I tested 42 LLMs on their willingness to build the apocalypse. The "safest" closed-source models are lying to you.

Posted by Ok-Awareness9993@reddit | LocalLLaMA | View on Reddit | 144 comments

ambient_temp_xeno@reddit

The EU is a huge market after all. Complying with all the regulations is their USP, I presume.

What happens to local LLM if/when LLMs are no longer released for free?

Posted by JohnBooty@reddit | LocalLLaMA | View on Reddit | 238 comments

ambient_temp_xeno@reddit

The knowledge cut-off will not be that big of a problem compared to the models just being outdated in terms of brains. Just look at it as a glass half full: we could've ended up with just the Llama 1 leaks in another timeline.