Elven77AI

Can 4chan data REALLY improve a model? TURNS OUT IT CAN!

Posted by Sicarius_The_First@reddit | LocalLLaMA | View on Reddit | 157 comments

Elven77AI@reddit

Also, the identities are anonymous: training on Reddit makes the model learn a "fictional identity bank" spread over various usernames (associative identity), while 4chan forces a more coherent single vector, the same "Anonymous" poster responsible for all replies. Perhaps that appears more coherent during training and lets the model skip identity modeling?

Can 4chan data REALLY improve a model? TURNS OUT IT CAN!

Posted by Sicarius_The_First@reddit | LocalLLaMA | View on Reddit | 157 comments

Elven77AI@reddit

> The finetune was literally on an extremely noisy 4chan dataset, it should have eaten glue.

Hmm, perhaps the post->reply structure in flat threads provides a better dialogue model than a threaded dialogue tree (Reddit), since the clue to what post X replies to (>>post number) is a direct pointer that an LLM digests better than an external "post X appears below Y" relation. I.e., the advantage would be that the thread context, as an interlocking tree of posts explicitly referencing each other by post number, outperforms threaded/quoted nesting structure in training.
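To make the "direct pointer" idea concrete, here is a minimal sketch of pulling explicit reply edges out of a flat thread. It assumes replies are marked as `>>` followed by a numeric post ID; the `reply_edges` name and the sample posts are made up for illustration, and nothing here is taken from the finetune's actual preprocessing.

```python
from __future__ import annotations

import re

# Assumed reply marker: ">>" followed by a numeric post ID.
REF = re.compile(r">>(\d+)")


def reply_edges(posts: dict[int, str]) -> list[tuple[int, int]]:
    """Return (replying_post, referenced_post) pairs found in a flat thread.

    References to posts that are not in the thread, and self-references,
    are ignored.
    """
    edges = []
    for post_id, text in posts.items():
        for match in REF.finditer(text):
            target = int(match.group(1))
            if target in posts and target != post_id:
                edges.append((post_id, target))
    return edges


# Hypothetical flat thread: every reply carries its own pointer(s).
thread = {
    100: "OP text",
    101: ">>100 agreed",
    102: ">>100 >>101 both wrong",
    103: ">>999 dead link",
}
edges = reply_edges(thread)
```

Each post carries its reply targets inside its own text, so the relation survives even when the posts are serialized in plain chronological order; in a nested tree the same relation exists only in the surrounding layout.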

DeepSeek-R1’s paper was updated 2 days ago, expanding from 22 pages to 86 pages and adding a substantial amount of detail.

Posted by Nunki08@reddit | LocalLLaMA | View on Reddit | 55 comments

Elven77AI@reddit

This seems like it: dumping dozens of pages means it's no longer relevant to their current research and they have moved on to something far more effective (i.e. no competitor advantage), likely a new reasoning architecture built from https://huggingface.co/papers/2512.24880

Physical documentation for LLMs in Shenzhen bookstore selling guides for DeepSeek, Doubao, Kimi, and ChatGPT.

Posted by abdouhlili@reddit | LocalLLaMA | View on Reddit | 55 comments

Elven77AI@reddit

What is the use case for this? Is this prompt engineering DeepSeek to be more focused? Then it's a one-page cheat sheet. There isn't enough material for a book.

When will the free ride be over?

Posted by DeltaSqueezer@reddit | LocalLLaMA | View on Reddit | 75 comments

Elven77AI@reddit

They are trying to get market share; once a stable userbase forms, they move to a freemium-type service and make it limited/low-quality for free users. It's not a charity. The forces they compete with on cost, however, can easily sway consumers towards something cheaper, so they have to maintain some basic competitive offer that prevents users from migrating or using local models. The free ride is the "loss leader" cost of making their models famous and their API dependencies entrenched in the market, plus the benefit of free training material from user prompts.

[2510.05688] vAttention: Verified Sparse Attention

Posted by Elven77AI@reddit | LocalLLaMA | View on Reddit | 3 comments

Elven77AI@reddit (OP)

tl;dr: The new sparse attention scheme "matches full model quality with up to 20x sparsity". Repo for their modified models: https://github.com/xAlg-ai/sparse-attention-hub

More love for GLM4.6 (evaluation vs. Claude 4.5 for NLP tasks)

Posted by LoveMind_AI@reddit | LocalLLaMA | View on Reddit | 61 comments

Elven77AI@reddit

With presence/frequency/repetition penalties, the model's "lapse into Chinese" is more likely when it's trying to repeat something but has to rephrase it due to the penalties: since the majority of its training corpus is Chinese and Chinese tokens are the default, the rephrase shifts to another language.
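A toy sketch of the mechanism, using the additive presence/frequency penalty form documented for OpenAI-style samplers (logit minus count times frequency penalty, minus presence penalty if the token was already generated). The vocabulary, logits, and the `apply_penalties` helper are invented for illustration, and whether this really explains the language switch is still just a hypothesis.

```python
from collections import Counter


def apply_penalties(logits, generated, presence_penalty=0.0, frequency_penalty=0.0):
    """Subtract additive presence/frequency penalties from raw logits.

    logits: dict mapping token -> raw logit for the next step.
    generated: list of tokens already emitted.
    """
    counts = Counter(generated)
    adjusted = {}
    for token, logit in logits.items():
        seen = counts[token]
        penalty = seen * frequency_penalty
        if seen > 0:
            penalty += presence_penalty
        adjusted[token] = logit - penalty
    return adjusted


# Made-up next-token logits: the English token leads, with its Chinese
# synonym (placeholder name) as the runner-up.
logits = {"model_en": 5.0, "model_zh": 4.2, "banana": 1.0}
generated = ["model_en", "model_en"]  # the English token was already used twice

adjusted = apply_penalties(
    logits, generated, presence_penalty=0.5, frequency_penalty=0.4
)
best = max(adjusted, key=adjusted.get)
```

Here the already-used English token drops from 5.0 to 3.7, so the untouched Chinese synonym becomes the top candidate. The multiplicative "repetition penalty" variant is not shown.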

Biggest Provider for the community for at moment thanks to them

Posted by dead-supernova@reddit | LocalLLaMA | View on Reddit | 281 comments

Elven77AI@reddit

It just shows the scale of software innovation: anyone reading arXiv preprints can see for themselves that the vast majority of papers on AI come from China. That is despite the GPU embargoes and much less financing per project.

inclusionAI/Ring-flash-2.0

Posted by nullmove@reddit | LocalLLaMA | View on Reddit | 13 comments

Clever code is probably the worst code you could write

Posted by Rtzon@reddit | programming | View on Reddit | 341 comments

Elven77AI@reddit

Disagree. Either the code is clever or I'll use AI to generate it. I don't like to waste time reimplementing wheels and boilerplate; it's soul-draining to write "dumb code" to add functionality, like 'small talk' but it lasts for hours. Without AI writing dumb code by the kilobyte, I'd be spending most of my time debugging dumb code doing something even dumber (e.g. corner cases in C/C++).

PubChem is down, DNS record gone

Posted by geaibleu@reddit | PrepperIntel | View on Reddit | 39 comments

The US Government's open data is currently being scrubbed

Posted by InvisibleBobby@reddit | PrepperIntel | View on Reddit | 175 comments

Elven77AI@reddit

Has anyone analyzed what data is being scrubbed? What exactly is hidden? Exposing this might be important, since e.g. EPA datasets are considered authoritative (like https://www.epa.gov/chemical-data-reporting ).

Shock poll: 41 percent of young voters find killing of UnitedHealthcare CEO acceptable

Posted by 1DarkStarryNight@reddit | anime_titties | View on Reddit | 418 comments

Elven77AI@reddit

Oh, you're Russian, so you miss a lot of context: you have to understand that the US insurance industry is not something normal. It's more like a mafia extortion scheme with extra steps, one of which is killing people by forcing them to pay for "protection" (insurance) and not providing it, so the patient dies or gets crippled, and if they sue, the insurers have the best lawyers extortion money can buy.

I just had the scariest dream/nightmare in my life

Posted by Midnight-blue1513@reddit | HighStrangeness | View on Reddit | 57 comments

Interesting real life story on a man encountering a space-time distortion

Posted by Projectcultureshock@reddit | HighStrangeness | View on Reddit | 70 comments

Elven77AI@reddit

Source is “The Trap of the Devil” in: “Planet X” monthly newspaper, Kiev, Ukraine July 2005 https://www.fern-flower.org/en/articles/devils-trap

Alternative to Reddit/forum where knowledgeable people are

Posted by Nicoleism101@reddit | RedditAlternatives | View on Reddit | 26 comments

Elven77AI@reddit

Smart people (I'd assume smarter than myself) seem to have a penchant for forming academic circles, similar to gamers' Discord groups/clans, so you have e.g. a "Quantum Gravity group" operating in some space that excludes non-members and is not subject to public opinion. If you could read it, that would defeat the point of the group and make it "too public", like e.g. the Linux kernel mailing list dealing with constant drama.

How much do you value information density in Reddit-like UI?

Posted by kinghuang@reddit | RedditAlternatives | View on Reddit | 19 comments

Why is Lemmy.world so toxic?

Posted by Character-Storage661@reddit | RedditAlternatives | View on Reddit | 87 comments

Elven77AI@reddit

Based on my extensive browsing of Lemmy servers:

1. They seem to be copying Reddit's structures at scale without having nearly as many users, so you get empty subreddits. There is a lack of niche topics/hobbies. Nobody seems to want to build anything unpopular as a long-term investment.
2. Mods are very pedantic and enforce their arbitrary rules very effectively: there is far more moderation per user than on Reddit. Ironically, the moderation is strongest at enforcing low posting rates in communities to avoid spam: only a few users are actually posting versus hundreds of viewers, and these rules limit them.
3. Suffocating ideological conformity: each Lemmy instance seems to have an insular ideological platform of "us vs them", and they don't like opinions outside the consensus.

Have any of you peeps looked at "classic forums" (like Xenforo / Invision) as viable Reddit Alternatives? What would be your reasoning to use those community frameworks? If not, why not?

Posted by prankster999@reddit | RedditAlternatives | View on Reddit | 15 comments

Elven77AI@reddit

Google is unfortunately not capable of finding much due to changes in its algorithm; you're better off with other search engines (Bing/Yandex/DuckDuckGo). There is much more spam and many more fake pages to sort through, but it's possible: unless your niche is unpopular, it should be indexed and linked from somewhere. You can check the Majestic Million ( https://majestic.com/reports/majestic-million ) to see how much exposure a site has on the web, including rare websites that would fail to show up in most searches.

Have any of you peeps looked at "classic forums" (like Xenforo / Invision) as viable Reddit Alternatives? What would be your reasoning to use those community frameworks? If not, why not?

Posted by prankster999@reddit | RedditAlternatives | View on Reddit | 15 comments

Elven77AI@reddit

The problem with forums is exposure at scale: subreddits share 'exposure space' with all of Reddit. Try finding a forum for any niche with search engines and compare it with finding a subreddit. It's the same with blogs and personal pages: they can't compete with centralized services on exposure. The 'discovery' of forums needs some central directory to compete with Reddit.

Now, imagine you do get exposure: your users will have to register to post on each of the forums they read (unlike one registration for all subreddits). This adds friction, and they're not that interested in 'just a forum': without a huge amount of content to read, there is no point in creating an empty forum (same with Reddit clones). Maintaining the forum and dealing with forum hosters/companies for just a tiny community with low growth potential is not appealing to most people.

Reddit clones let content concentrate in self-moderated sub-forums, and that's the genius of this dynamic scheme: topics (tags) -> communities -> self-moderation. Forums are just this with manual moderation and intervention: you can't just randomly grant anyone mod powers or allow subforums to be created organically, but at Reddit scale it self-organizes into successful subforums that compete for attention with the rest.

Building Reddit right

Posted by lumpyvasdeferens@reddit | RedditAlternatives | View on Reddit | 11 comments

Elven77AI@reddit

Remove karma; replace it with log2(#replies)*log10(total_reply_text_length) to sort threads/subthreads, with cutoffs for top 1h/1d/1w/1m. Bootstrap with something like subSimulatorGPT2 and stealthily delete the initial bot posts that get no replies. Don't allow embedding any media, only links: text is much cheaper to host (if you don't rely on dynamically constructed pages, make it 100% Cloudflare-compatible). Monetization: sponsored posts that stay "at top" of a subreddit for N hours. Avoid captchas; instead use algorithmic triggers to mark bots. Basically, don't anger potential power users with hostile design. Reddit changing its design/API on a whim is a prime example of alienating decisions; people rely on things staying as they are for years.
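A rough sketch of that ranking, assuming a thread is just a record with a creation time, a reply count, and the total character length of its replies. The `Thread` fields, the 30-day month, and the choice to score 0 when there are no replies are my assumptions; note that log2(1) = 0, so a thread with a single reply also scores 0 under this formula.

```python
import math
from dataclasses import dataclass


@dataclass
class Thread:
    title: str
    created_at: float       # Unix timestamp in seconds
    num_replies: int
    total_reply_chars: int  # summed length of all reply texts


# Cutoff windows in seconds (a month is approximated as 30 days).
WINDOWS = {"1h": 3600, "1d": 86400, "1w": 7 * 86400, "1m": 30 * 86400}


def thread_score(num_replies: int, total_reply_chars: int) -> float:
    """log2(#replies) * log10(total reply text length); 0 if there is nothing to log."""
    if num_replies < 1 or total_reply_chars < 1:
        return 0.0
    return math.log2(num_replies) * math.log10(total_reply_chars)


def top_threads(threads, now: float, window: str):
    """Threads created within the window, best score first."""
    cutoff = now - WINDOWS[window]
    recent = [t for t in threads if t.created_at >= cutoff]
    return sorted(
        recent,
        key=lambda t: thread_score(t.num_replies, t.total_reply_chars),
        reverse=True,
    )


now = 1_000_000.0
threads = [
    Thread("short chatter", now - 100, 2, 100),
    Thread("long discussion", now - 100, 8, 1000),
    Thread("old megathread", now - 10_000, 64, 1_000_000),
]
hot = [t.title for t in top_threads(threads, now, "1h")]
```

Both factors are logarithmic, so a thread needs both many replies and substantial reply text to rank high, and spamming one-word replies gives diminishing returns.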

Social websites with nested comments v6

Posted by 1billionthuser@reddit | RedditAlternatives | View on Reddit | 20 comments

Elven77AI@reddit

They allow sorting replies by rating and are easier to follow than a nest of chronological posts referring to or quoting multiple posts (often nested). Threaded discussions also allow hiding irrelevant subthreads, while flat threads force you to skip posts in the middle of relevant content.

Strange lights and hums in the sky appear all over the world, a harbinger of an alien invasion?

Posted by JuliaJune96@reddit | HighStrangeness | View on Reddit | 6 comments

[Hype Train] Your friendly reminder that benevolent canine aliens are supposed to be revealing themselves TOMORROW (12/23)!

Posted by ResplendentShade@reddit | HighStrangeness | View on Reddit | 413 comments

Why do programmers need private offices with doors? (Do Not Disturb)

Posted by Mariambarouma@reddit | programming | View on Reddit | 382 comments

Elven77AI@reddit

A better analogy is interrupting an online game without save states vs a single-player game you can reload from a save state, recovering the progress/levels/items back to the point where you left off.

Fossil: A Git alternative with batteries included

Posted by ketralnis@reddit | programming | View on Reddit | 90 comments

Elven77AI@reddit

Alright, suppose i self-host my Fossil, how do i collaborate with others using their own self-hosted fossil without a central hub or ability to locate the code i'm wanting to fix/fork/review?

The NSA advises move to memory-safe languages

Posted by ketralnis@reddit | programming | View on Reddit | 554 comments

Elven77AI@reddit

Why not reform the C/C++ standards to mandate specific memory-safe features as the default? Migrating away from C/C++ codebases is a non-starter for most companies. The overhead of buffer overflow checking can be eliminated by proving at compile time that all writes are limited to the buffer length, so if the buffer could be written to outside the limit, the compiler would raise an "ambiguous write error" instead of compiling it. Runtime allocations would of course need to be checked against limits at runtime, but since most of these exploits target fixed buffers, the priority should be making this "compile-time check for buffer operations outside of range" a mandatory step (which could be disabled with something like -funsafe-buffers).

Whole ship found in a mine in Alps in 1460

Posted by nixmix85@reddit | HighStrangeness | View on Reddit | 340 comments

Elven77AI@reddit

The damage seems to be something that sucked the ship into deep space at very high speed and then after some time teleported/portaled the ship into a hidden cave.

4K UAP Satellite Footage

Posted by littlespacemochi@reddit | HighStrangeness | View on Reddit | 239 comments

Elven77AI@reddit

Weird 4-engine tandem-wing aircraft/drone with a tail. Linear movement along the ground. The design is clearly terrestrial; it reminds me of a hybrid airship type of aircraft. It's also clearly slower than a jet.

PFAS levels in ground and air could be *drumroll* higher than expected, research suggests

Posted by hitchinvertigo@reddit | collapse | View on Reddit | 148 comments

Elven77AI@reddit

Gem there: The Netherlands revised its soil limits upwards after about 70% of building projects at the time were **halted because soil remediation was required and builders protested against the thresholds**, the Stockholm paper noted.