Handwritten formula to latex format
Posted by DataScientia@reddit | LocalLLaMA | View on Reddit | 15 comments
Is there is machine learning model? which take the input as some formula (image) and convert that the latex format. I have tried on some good ocr models, the text are captured accurately but formula is not captured well in latex format
H4medm@reddit
you can try fine-tuning got-ocr or qwen2 vl 2b using ms-swift with this command :
CUDA_VISIBLE_DEVICES=0 swift sft \ --model_type got-ocr2 --model_id_or_path stepfun-ai/GOT-OCR2_0 \ --sft_type lora \ --dataset latex-ocr-handwrite
DataScientia@reddit (OP)
Thanks for this, even i was trying to do this, but i was not getting handwritten latex dataset. Can you share the link of that dataset? I couldn’t find the data on huggingface( the one which you mentioned above)
Inevitable-Start-653@reddit
Try got-ocr. I've used it in my project here and it works very well on hand written equations.
https://github.com/RandomInternetPreson/Lucid_Autonomy
DataScientia@reddit (OP)
I tried got ocr , but it was not giving good results. Example: i wrote x and it detected as 2c. Here let me tell you i am considering bad handwriting
RandiyOrtonu@reddit
have you tried got ocr 2?
DataScientia@reddit (OP)
Yes got ocr 2 , the one which was released recently
Inevitable-Start-653@reddit
Hmm, that stinks :( Another suggestion might be Aria, but it requires a lot of vram. Their HF page has a link to try the model for free.
https://huggingface.co/rhymes-ai/Aria
Have you tried miniCPM v1.6 this is a good local smaller vision model I've had good success with understanding handwritten equations.
https://huggingface.co/openbmb/MiniCPM-V-2_6
DataScientia@reddit (OP)
Yea that aria is huge model for this usecase. But i will check out mini cpm. Thanks man
leelweenee@reddit
Qwen2-VL-7B instruct. is very good at this. You can tell it something like: "just the formulas" or "just markdown" or "just the text"
DataScientia@reddit (OP)
Sorry i forgot to mention i am looking for small models less than 2b parameters. Just for this i cannot call those big models
RandiyOrtonu@reddit
2B is also available
_supert_@reddit
Gpt-4o-mini
DataScientia@reddit (OP)
Kind of costly , i am looking for something open source small models
_supert_@reddit
You could try llama 3.2.
DataScientia@reddit (OP)
Yea that aria is huge model for this usecase. But i will check out mini cpm. Thanks man