physics knowledge-based LLM
Posted by shanedrum@reddit | LocalLLaMA | 11 comments
Hi!
I'm pretty new to the world of LLMs. I recently wrote my physics dissertation and I used ChatGPT a lot (don't worry, it didn't write any part of my thesis, I just used it to find quotes and such). This really gave me experience with LLMs and how powerful they can be. I'm pretty privacy-oriented, so I didn't really like the closed-source, cloud-based nature of ChatGPT, and I downloaded Ollama and got to tinkering.
My question is: are there any LLMs (maybe some from Meta) that I can download locally and plug into open-webui that have solid physics knowledge? For example, I can ask ChatGPT to explain the Einstein equations, or even hyper-specific topics like asymptotic symmetries. Are there any locally-installed LLMs that can do the same?
Thanks!
Paulonemillionand3@reddit
All LLMs will have some physics knowledge. Why don't you, you know, experiment?
relmny@reddit
That's actually good advice from my POV...
OP only needs a computer and Internet access, that's it. Then download/install open-webui, Jan, or LM Studio, download some models, and have a set of questions ready to ask.
There's no better answer than the one you can get yourself... especially when it's free.
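If you'd rather script that than click around, here's a minimal sketch of such a self-test against Ollama's local HTTP API (the model tags and questions are just examples; swap in whatever you've actually pulled):

```python
# Minimal sketch: probe local models with a fixed set of physics questions.
# Assumes an Ollama server is running on its default port (11434).
import json
import urllib.request

QUESTIONS = [
    "Explain the Einstein field equations.",
    "What are asymptotic symmetries in general relativity?",
]

def ask(model: str, prompt: str) -> str:
    # POST to Ollama's /api/generate endpoint; stream=False returns one JSON blob.
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

for model in ["llama3.1:8b", "mistral:7b"]:  # example tags, not recommendations
    for q in QUESTIONS:
        print(f"--- {model}: {q}\n{ask(model, q)[:500]}\n")
```

Run it once per candidate model, compare the answers on topics you know well, and draw your own conclusions.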
MrSomethingred@reddit
No local model, even a fine-tuned one, will match OpenAI's or Anthropic's big models when it comes to physics knowledge.
What you can get great performance out of is putting physics PDFs into OpenWebUI and letting the model do RAG with them. (Essentially skimming the textbook before answering you.)
But regardless of what Sam Altman says, no LLM is replacing scientists any time soon, so they're far better at quickly reminding you of things you already understand than at trying to do new physics.
shanedrum@reddit (OP)
Yeah, I didn't really expect to rival the giants with my tiny RTX 2060.
Hmm, that's a good idea! I've read about RAG before, but didn't know I could just upload the PDF to OpenWebUI and let the model train itself on it. That's cool! So in theory I could upload my dissertation?
Yeah, that's all I use it for, just to remind myself of stuff. I'm not so naive as to believe an LLM is going to replace physicists.
Thanks for the tip!
MrSomethingred@reddit
It is not quite "training" per se; rather, it uses one system to skim the document for relevant parts, and then injects those alongside your question.
So if you have questions about your own research, e.g. "what was this value?", then uploading your dissertation would work.
In my experience, I have found it more useful to upload "reference" materials for RAG, e.g. textbooks. That gives it the kind of general reference material it can use to answer the sorts of questions you might have.
shanedrum@reddit (OP)
That's interesting. So could I in principle upload a PDF of a textbook and have my model be fine-tuned in this fashion?
ShengrenR@reddit
No - don't get 'fine-tuning' and RAG wires crossed in your mind. At least in this context, fine-tuning a model involves actually updating the numerical weights in the network in order to modify its behavior. It's generally good at style/format/form changes, but unless you have a TON of material (tons of those textbooks) and do continued pre-training, you're not going to impart much 'knowledge' into the model, while running a significant risk of breaking other parts of it.
RAG is a system: it's information lookup (the most common/popular approach is embedding cosine similarity between a chunk of text in your textbook and the question) that then dumps that information into the LLM context. So when you ask about chiral symmetry breaking, it goes and grabs a few (..N) chunks of the book that you've pre-indexed and reads them along with your question in order to answer you. RAG is essentially an open-book test for the model: the 'brains' are the LLM, but the source material is whatever you feed it. This works reasonably well for topics that fit into your context window, but if you need it to digest the whole textbook and understand it in order to answer your question, you're in trouble.
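To make the lookup half concrete, here's a rough sketch of embedding-based retrieval. The chunking is deliberately naive, and the embedding model name is just a common example, not necessarily what OpenWebUI uses under the hood:

```python
# Rough sketch of embedding cosine-similarity retrieval for RAG.
# sentence-transformers is one common choice; the model name is an example.
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def chunk(text: str, size: int = 800) -> list[str]:
    # Naive fixed-size chunking; real pipelines split on sections/paragraphs.
    return [text[i:i + size] for i in range(0, len(text), size)]

def top_k_chunks(question: str, chunks: list[str], k: int = 3) -> list[str]:
    # Normalized embeddings make the dot product equal cosine similarity.
    q = embedder.encode([question], normalize_embeddings=True)
    c = embedder.encode(chunks, normalize_embeddings=True)
    scores = (c @ q.T).ravel()
    return [chunks[i] for i in np.argsort(scores)[::-1][:k]]

# Pre-extracted plain text of the textbook, not the raw PDF.
textbook = open("textbook.txt").read()
question = "What is chiral symmetry breaking?"
context = "\n\n".join(top_k_chunks(question, chunk(textbook)))
prompt = f"Answer using the excerpts below.\n\n{context}\n\nQuestion: {question}"
# `prompt` then goes to the LLM; that's the "open-book test" part.
```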
IrisColt@reddit
Question: Explain asymptotic symmetries in the context of Physics.
Answers: [the model outputs were posted as screenshots and aren't reproduced here]
Draw your own conclusions.
AerosolHubris@reddit
What do you mean give you quotes and stuff?
And yeah, as the other commenters mentioned, RAG is probably your best bet. You may not need the privacy of a local model just for Q&A, though I get why you'd use one for original research and for general privacy concerns.
shanedrum@reddit (OP)
> What do you mean give you quotes and stuff?
Like "what are some little-known quotes by famous physicists". Then I pick one I like, check the internet to see if its real, and use it in the paper. I even used it to help formatting latex.
In general, I'm pretty privacy-oriented, so I always prefer local solutions to cloud-based ones. That's especially true for writing academic papers, like when I ask it to point out editing errors in my drafts.
Thanks I'll look into RAG!
No-Refrigerator-1672@reddit
In my experience, ChatGPT was extremely good at taking a PDF of a paper and then answering my questions about its contents. Ollama will soon get an update with Llama 3.2 support, and then your OpenWebUI will get the ability to process visual data* like graphs and diagrams. I think it'd be best to incorporate RAG into your workflow after that update, so you won't have to redo things.
*Technically, you can do vision now with LLaVA models in OpenWebUI, but Llama 3.2 is much more recent, so I expect it to be much better.
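If you want to try the do-it-now option, a minimal sketch with the ollama Python client might look like this (the model tag and image path are placeholders; any pulled vision model works the same way):

```python
# Sketch: asking a local vision model about a figure via the ollama Python client.
# Requires `pip install ollama`, a running Ollama server, and a pulled vision model.
import ollama

response = ollama.chat(
    model="llava:13b",  # example tag; use whatever vision model you've pulled
    messages=[{
        "role": "user",
        "content": "Describe what this plot shows and read off the axis labels.",
        "images": ["figure1.png"],  # placeholder path to a local image file
    }],
)
print(response["message"]["content"])
```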