Can somebody please do a benchmark of the new Mistral-Large-2411?
Posted by CH1997H@reddit | LocalLLaMA | 25 comments
I can't find a single benchmark anywhere. Nobody has made one. Very unusual.
Is it better than the previous Mistral Large? How does it compare against Llama 3.1 405B?
That's all up to your imagination 😊
x0xxin@reddit
Has anyone run Mistral Large 2411 with a draft model in TabbyAPI?
When I run Mistral Large 2407 in Q4 with Mistral 7B v0.3 Q4 as a draft model, inference averages 22 tokens per second. Running Mistral Large 2411 with the same settings averages 8 tokens per second. Curious what could be causing this. It's slow even when I lower the max sequence length to 32768.
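(For anyone trying to reproduce this: a draft model in TabbyAPI is set in config.yml. A minimal sketch; the section and key names below come from a typical TabbyAPI sample config and the folder names are hypothetical, so check your version's config_sample.yml.)

```yaml
model:
  model_dir: models
  model_name: Mistral-Large-Instruct-2411-exl2-4.0bpw      # hypothetical folder
  max_seq_len: 32768
draft_model:
  draft_model_dir: models
  draft_model_name: Mistral-7B-Instruct-v0.3-exl2-4.0bpw   # hypothetical folder
```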
lolzinventor@reddit
I'm using Mistral Large 2407 Q5 for generating a CoT training data set; I've got a couple of weeks to go yet. Do you think it is worth pausing the job and switching to 2411? Or would I possibly not even notice the 1% difference between 2407 and 2411?
FullOf_Bad_Ideas@reddit
Q5 quant for batch dataset creation? Is this cheaper than just using their API? By Q5 I am assuming a GGUF quant, yes?
lolzinventor@reddit
Yes GGUF. It is cheaper for me.
FullOf_Bad_Ideas@reddit
Can you explain how you achieve that? I'm curious. Do you use an inference engine meant for batches, like vLLM or Aphrodite-engine? How many GPUs are you using for it? Do you have free electricity that makes this economically sound? I was under the impression that Q5 GGUF quants would perform quite slowly in terms of total throughput with batched inference, so a lot of power and GPUs for not a lot of tokens.
lolzinventor@reddit
I have 2 servers of 4x3090. Each server is running llama.cpp with the Q5 GGUF. I have a process that interacts with a PostgreSQL DB to find the next job for each llama instance, getting about 8 tok/sec per server. Being llama.cpp, each GPU is only loaded about 25%. Not the fastest, but it's slow and steady. Target is 50K CoT Q/A pairs; got about 40K right now.
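Not my exact code, but the loop described above looks roughly like this; a minimal sketch, assuming a hypothetical jobs table and llama.cpp's built-in server running on each box (table name, columns, and URL are all illustrative):

```python
import psycopg2   # pip install psycopg2-binary
import requests

LLAMA_URL = "http://localhost:8080/completion"  # llama.cpp server endpoint

conn = psycopg2.connect("dbname=cot user=worker")

while True:
    # Claim the next pending job; SKIP LOCKED lets multiple workers
    # (one per llama.cpp instance) poll the same table safely.
    with conn, conn.cursor() as cur:
        cur.execute(
            """UPDATE jobs SET status = 'running'
               WHERE id = (SELECT id FROM jobs WHERE status = 'pending'
                           ORDER BY id LIMIT 1 FOR UPDATE SKIP LOCKED)
               RETURNING id, prompt"""
        )
        row = cur.fetchone()
    if row is None:
        break  # queue drained
    job_id, prompt = row
    resp = requests.post(LLAMA_URL, json={"prompt": prompt, "n_predict": 2048})
    answer = resp.json()["content"]
    with conn, conn.cursor() as cur:
        cur.execute(
            "UPDATE jobs SET status = 'done', answer = %s WHERE id = %s",
            (answer, job_id),
        )
conn.close()
```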
Rows look like this:
#Question
How does Plato's allegory of the cave inform our understanding of the relationship between subjective human experience, objective reality, and the eternal forms, and what implications does this have for the role of reason and emotion in the pursuit of true knowledge?
#DATA
- Plato's allegory of the cave
- Subjective human experience
- Objective reality
- Eternal forms
- Role of reason
- Role of emotion
- Pursuit of true knowledge
#UNKNOWN
- The relationship between subjective human experience, objective reality, and the eternal forms as informed by Plato's allegory of the cave
- The implications for the role of reason and emotion in the pursuit of true knowledge
#CONNECTION
- Plato's theory of forms: the idea that non-physical (but substantial) forms represent the most accurate reality
- The allegory of the cave as a metaphor for enlightenment and the journey from ignorance to knowledge
- The distinction between the visible world (subjective experience) and the intelligible world (objective reality)
- The roles of reason and emotion in Plato's epistemology
#COT=4
Explain Plato's allegory of the cave and its core elements: the prisoners, the shadows, the fire, and the sun.
Analyze how the allegory distinguishes between subjective human experience (the shadows) and objective reality (the world outside the cave).
Connect the allegory to Plato's theory of forms, highlighting how the eternal forms represent the ultimate reality.
Discuss the implications for the roles of reason and emotion in the pursuit of true knowledge, emphasizing the necessity of reason for comprehending the forms and the potential hindrance of emotion in this process.
lolzinventor@reddit
#ANSWER
Plato's allegory of the cave is a profound exploration of the nature of reality, perception, and the pursuit of true knowledge. This allegory, found in Plato's "The Republic," uses a vivid metaphor to illustrate the relationship between subjective human experience, objective reality, and the eternal forms. By delving into these themes, we can gain insights into the roles of reason and emotion in the quest for understanding the true nature of the world.
### The Allegory of the Cave and Its Core Elements
Plato's allegory begins with a group of prisoners who have been confined in a cave since birth, chained in such a way that they can only see the wall directly in front of them. Behind the prisoners, a fire burns, and between the fire and the prisoners, puppeteers cast shadows onto the wall. The prisoners, having never seen anything else, believe these shadows to be the only reality. The allegory progresses as one prisoner is freed and, after initial pain and confusion, sees the fire and eventually the world outside the cave, illuminated by the sun. This journey represents the ascent from ignorance to enlightenment.
### Distinguishing Between Subjective Experience and Objective Reality
The shadows on the wall symbolize subjective human experience. The prisoners' perception of reality is limited to these shadows, which are mere representations of objects being manipulated by the puppeteers. This metaphor underscores the idea that our immediate sensory experiences can be deceptive and incomplete. In contrast, the world outside the cave represents objective reality. The freed prisoner's discovery of the true nature of the objects and the sun (which symbolizes the ultimate truth or the Form of the Good) highlights the distinction between the visible world of appearances and the intelligible world of true forms.
### Connecting the Allegory to Plato's Theory of Forms
Plato's theory of forms posits that there is a realm of non-physical, eternal forms that represent the ideal and most accurate reality. The objects and phenomena we perceive in the physical world are imperfect copies of these forms. In the allegory, the sun outside the cave represents the Form of the Good, the ultimate source of truth and knowledge. The freed prisoner's journey to see the sun directly parallels the philosopher's quest to comprehend the eternal forms. This allegory suggests that true knowledge lies not in the sensory world but in the contemplation of the ideal forms that underlie it.
### The Role of Reason and Emotion in the Pursuit of True Knowledge
The allegory also has significant implications for the roles of reason and emotion in the pursuit of true knowledge. Reason, represented by the freed prisoner's ability to question and ultimately comprehend the world outside the cave, is essential for understanding the eternal forms. It is through rational thought that one can transcend the limitations of subjective experience and grasp the objective reality of the forms.
Emotion, on the other hand, is depicted as a potential hindrance. The initial pain and confusion experienced by the freed prisoner when exposed to the light symbolize the emotional resistance to change and new knowledge. Emotions can cloud judgment and make it difficult to see beyond immediate, subjective experiences. Plato suggests that the pursuit of true knowledge requires the ability to overcome emotional obstacles and use reason to achieve a higher understanding.
### Summary and Key Insights
Plato's allegory of the cave offers a powerful metaphor for the relationship between subjective human experience, objective reality, and the eternal forms. It illustrates how our immediate perceptions can be misleading and how true knowledge lies in the contemplation of the ideal forms. The journey of the freed prisoner highlights the necessity of reason in comprehending the ultimate reality and the potential hindrance of emotion in this process.
This analysis underscores the importance of rational thought and the pursuit of objective truth in philosophy and education. It encourages us to question our assumptions and strive for a deeper understanding of the world beyond our immediate experiences. Furthermore, it invites reflection on the balance between reason and emotion in our own lives, reminding us that while emotions are a part of human experience, they should not overshadow the pursuit of truth and knowledge.
In conclusion, Plato's allegory of the cave provides a timeless framework for exploring the nature of reality, perception, and the pursuit of true knowledge. It emphasizes the critical role of reason and the need to transcend subjective experiences to grasp the objective reality of the eternal forms. This exploration encourages further inquiry into the relationship between reason and emotion, as well as the nature of the ideal forms in Plato's philosophy.
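For anyone who wants to consume rows in this format, a minimal sketch of a parser (the function name and return shape are mine, not part of the dataset):

```python
def parse_row(text: str) -> dict[str, list[str]]:
    """Split a row into sections keyed by its '#' headers,
    e.g. 'Question', 'DATA', 'UNKNOWN', 'CONNECTION', 'COT=4', 'ANSWER'."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        # Section headers use a single '#'; the answer body itself contains
        # '###' markdown headings, so those must not start a new section.
        if line.startswith("#") and not line.startswith("##"):
            current = line[1:].strip()
            sections[current] = []
        elif current is not None and line.strip():
            sections[current].append(line.strip())
    return sections

row = parse_row(open("row.txt").read())  # hypothetical file holding one row
print(row["Question"][0])  # the question text
print(row["COT=4"])        # the four chain-of-thought steps
```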
segmond@reddit
I don't know, that 1% could turn into more for you. But 2411 seems to perform CoT better without prompting. So it's probably worth it to get the newest model and continue from there.
lolzinventor@reddit
There is no harm in downloading and having a look first...
lemon07r@reddit
I like my magical imagination
DragonfruitIll660@reddit
So far, from my very limited testing, it seems to be better than the previous version. It's not perfect (still hoping for an LLM that can accurately keep track of stats and status tags for an RPG game), but it seems more detailed at remembering lore. Running a Q4_XS quant, so there may be even greater quality gains at higher quants.
cm8t@reddit
It is certainly better than 2407: better reasoning and big-picture understanding.
Caffeine_Monster@reddit
It certainly has better (more human) prose and stronger prompt following.
However, I don't think the model is fundamentally any stronger (which is why we keep seeing it land within the margin of error of 2407 on perplexity tests).
thezachlandes@reddit
I just uploaded a Q2 MLX quant in case anyone wants to try this on macOS! Size loaded in LM Studio is 42.31 GB. I'm going to upload the Q4 a little later.
Get it here: https://huggingface.co/zachlandes/Mistral-Large-Instruct-2411-Q2-MLX/
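A minimal sketch for trying it from Python with mlx-lm (pip install mlx-lm); I'm assuming the standard mlx_lm load/generate API here:

```python
from mlx_lm import load, generate

# First call downloads the weights from the Hugging Face repo linked above.
model, tokenizer = load("zachlandes/Mistral-Large-Instruct-2411-Q2-MLX")
print(generate(model, tokenizer, prompt="Hello, my name is", max_tokens=64))
```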
Admirable-Star7088@reddit
I was wondering the same, and the best way to benchmark a model is to try it out yourself. My curiosity made me download Mistral-Large-2411 (Q4_K_M), even though I believed it would most likely crash my system. To my surprise, it actually worked (right at my system's limit).
So far, in my own testing, it's a very powerful model with excellent reasoning and prompt-following. I've had a bunch of fun feeding it hypothetical / logical questions and have also compared it against Nemotron 70b and Athene-V2 70b. Mistral-Large-2411 stands strong and is one of my favorite models right now.
hello_2221@reddit
Just curious, what's your setup?
Admirable-Star7088@reddit
64 GB DDR5 RAM and 16 GB VRAM = 80 GB in total. Getting ~1 t/s with Mistral Large Q4_K_M.
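For reference, a partial-offload setup like that looks roughly like this with llama-cpp-python (a sketch; the file name and layer count are illustrative, so tune n_gpu_layers to what fits in 16 GB of VRAM):

```python
from llama_cpp import Llama  # pip install llama-cpp-python

llm = Llama(
    model_path="Mistral-Large-Instruct-2411-Q4_K_M.gguf",
    n_gpu_layers=20,  # layers offloaded to VRAM; the rest run from system RAM
    n_ctx=8192,
)
out = llm("Explain speculative decoding in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```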
segmond@reddit
I haven't played with Athene-V2-70b; how does it compare to Nemotron 70b?
Admirable-Star7088@reddit
I would need to test Athene-V2-70b a bit more to be more confident in my answer, but so far it has been roughly on par with Nemotron 70b for me, they are both very intelligent.
The main difference in a nutshell, as far as I can see, is that Nemotron seems to do deeper analysis of your hypothetical / logical requests, while Athene goes more straight to the point (but still makes good points).
For example, in story writing, I have so far found Nemotron to write more fascinating and interesting texts, as it fills in interesting details and makes the story more awe-inspiring, while Athene tends to progress the story more straightforwardly (but still in a logical and intelligent way).
This is my "feel" about these models right now.
Healthy-Nebula-3603@reddit
2407 (columns below: tasks completed, cumulative HellaSwag accuracy %)
llama-perplexity.exe --model models/new3/Mistral-Large-Instruct-2407-Q3_K_S.gguf --threads 30 -ngl 35 --hellaswag --hellaswag-tasks 400 -f models\hellaswag_val_full.txt -c 8192 --no-mmap
390 90.76923077
391 90.53708440
392 90.56122449
393 90.58524173
394 90.60913706
395 90.63291139
396 90.65656566
397 90.68010076
398 90.70351759
399 90.72681704
400 90.75000000
2411
bin\llama-perplexity.exe --model models/new3/Mistral-Large-Instruct-2411-Q3_K_S.gguf --threads 30 -ngl 35 --hellaswag --hellaswag-tasks 400 -f models\hellaswag_val_full.txt -c 8192 --no-mmap
390 88.97435897
391 88.74680307
392 88.77551020
393 88.80407125
394 88.83248731
395 88.86075949
396 88.88888889
397 88.91687657
398 88.94472362
399 88.97243108
400 89.00000000
The newer one scores a bit worse here (89.00 vs. 90.75 after 400 tasks); note these are HellaSwag accuracies from the llama-perplexity tool rather than perplexity values....
YearZero@reddit
You can find it here:
https://oobabooga.github.io/benchmark.html
https://aider.chat/docs/leaderboards/
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard (no 2407 on this one to compare against, though)
segmond@reddit
It is better than 2407.
Healthy-Nebula-3603@reddit
I'm going to do that today for perplexity... locally.