NaturalLM - A Natural Sounding LLM

Posted by random-tomato@reddit | LocalLLaMA

So I did continuous pretraining of Mistral Nemo 12B (base) on my dataset here: https://huggingface.co/datasets/qingy2019/Natural-Text

And here's the fine-tuned model: https://huggingface.co/qingy2019/NaturalLM

It actually sounds pretty good. Here's a sample of its output (the prompt is in bold):

"To take a look at the history of computers, we have to go back to 1822. Charles Babbage had just created what he called a 'Difference Engine' which was a mechanical device for calculating tables of numbers. But he had a much bigger idea. He wanted to build an even more powerful machine that could solve any mathematical problem. He called it the 'Analytical Engine'. It would have been able to store numbers on punch cards and it could have performed calculations at the touch of a button. It was a bit like a giant calculator. Babbage even came up with a design for a 'programmer' that could input instructions to the Analytical Engine. But Babbage never managed to complete his dream, he didn't have the money and he didn't have the technology. It would be another 100 years before Babbage's vision became a reality. In 1936, Alan Turing, a young mathematician at Cambridge University, wrote a paper called 'On Computable Numbers'. In it he described a simple machine that could solve mathematical problems. He called it a 'universal computing machine'. Turing's machine was a bit like Babbage's Analytical Engine but it was much simpler. It had a central processing unit or CPU, a memory and a way of inputting and outputting information. It could be programmed to perform a sequence of steps, called an algorithm, and it could keep going until it reached a solution. Turing's machine was the forerunner of the modern computer. In 1941, the British Government commissioned a team of engineers led by Alan Turing to build a machine that could crack the German Enigma code. The result was the Colossus, the world's first electronic computer. It was a huge machine that took up a whole room. It had 1,500 valves, 5,000 capacitors and 1,000 relays. It was so big that it had to be built in pieces and then assembled on site. The Colossus was a huge success. It was able to crack the Enigma code in minutes, whereas it had taken hours before. 
The Colossus was so successful that the British Government decided to build more of them. In 1944, they built the Colossus Mark II, which was even bigger and more powerful than the original. The Colossus Mark II was able to crack the German Enigma code in seconds."

Right now it's still a base model, and I haven't yet found a suitable instruct-tuning dataset, but I'm putting this out here in case it's useful to anyone :D