ninjasaid13
-
Better & Faster Large Language Models via Multi-token Prediction
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 37 comments
-
LongCat-Next: Lexicalizing Modalities as Discrete Tokens
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 4 comments
-
View-oriented Conversation Compiler for Agent Trace Analysis
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 0 comments
-
Intern-S1-Pro - 1 Trillion parameters Open-Weights for Scientific Multimodal Foundation Model
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 1 comment
-
microsoft/Phi-4-reasoning-vision-15B · Hugging Face
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 1 comment
-
LightMem: Lightweight and Efficient Memory-Augmented Generation
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 3 comments
-
Can anyone tell me the performance of LLaVA vs BLIP?
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 2 comments
-
OpenCUA: Open Foundations for Computer-Use Agents
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 10 comments
-
Ming-flash-omni-Preview
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 0 comments
-
Latent Sketchpad: Sketching Visual Thoughts to Elicit Multimodal Reasoning in MLLMs
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 0 comments
-
Nvidia's OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 5 comments
-
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 13 comments
-
[2510.13804] Generative Universal Verifier as Multimodal Meta-Reasoner
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 0 comments
-
AndesVL Technical Report: An Efficient Mobile-side Multimodal Large Language Model
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 0 comments
-
LLM-JEPA: Large Language Models Meet Joint Embedding Predictive Architectures
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 2 comments
-
SDQ-LLM: Sigma-Delta Quantization for 1-bit LLMs of any size
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 2 comments
-
An Open-source Omni Chatbot for Long Speech and Voice Clone
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 17 comments
-
RLP: Reinforcement as a Pretraining Objective
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 1 comment
-
Dynalang code released
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 2 comments
-
NCSOFT/VARCO-VISION-2.0-14B · Hugging Face
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 13 comments
-
Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 0 comments
-
Kwai Keye-VL 1.5 Technical Report
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 4 comments
-
Thyme: Think Beyond Images
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 1 comment
-
Technical Report of TeleChat2, TeleChat2.5 and T1
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 3 comments
-
MemOS: A Memory OS for AI System
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 19 comments
-
Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 0 comments
-
Phi-4-mini-flash-reasoning
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 15 comments
-
Code for Skywork-R1V3-38B
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 0 comments
-
Kwai-Keye/Keye-VL-8B-Preview · Hugging Face
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 2 comments
-
Stream-Omni: Simultaneous Multimodal Interactions with Large Language-Vision-Speech Model
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 5 comments
-
inclusionAI/Ming-Lite-Omni · Hugging Face
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 11 comments
-
Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 3 comments
-
GitHub - jacklishufan/LaViDa: Official Implementation of LaViDa: A Large Diffusion Language Model for Multimodal Understanding
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 3 comments
-
Pretraining on the Test Set Is All You Need
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 2 comments
-
Open-Sourced Multimodal Large Diffusion Language Models
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 17 comments
-
GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 2 comments
-
Qwen-14B model
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 1 comment
-
Aya Vision: Advancing the Frontier of Multilingual Multimodality
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 17 comments
-
R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 2 comments
-
Llama Nemotron - a nvidia Collection
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 0 comments
-
DFloat11: Lossless LLM Compression for Efficient GPU Inference
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 6 comments
-
Yo'Chameleon: Personalized Vision and Language Generation
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 1 comment
-
Skywork-R1V2-38B - New SOTA open-source multimodal reasoning model
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 18 comments
-
OpenGVLab/InternVL3-78B · Hugging Face
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 8 comments
-
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 1 comment
-
Tina: Tiny Reasoning Models via LoRA
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 1 comment
-
Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 0 comments
-
Meta Perception Language Model: Enhancing Understanding of Visual Perception Tasks
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 27 comments
-
An Easy-to-use Knowledge Editing Framework for LLMs.
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 4 comments
-
Browser Qwen
Posted by ninjasaid13@reddit | LocalLLaMA | View on Reddit | 16 comments