I Created an AI Research Assistant that actually DOES research! Feed it ANY topic, it searches the web, scrapes content, saves sources, and gives you a full research document + summary. Uses Ollama (FREE) - Just ask a question and let it work! No API costs, open source, runs locally!
Posted by CuriousAustralianBoy@reddit | LocalLLaMA | 134 comments
Automated-AI-Web-Researcher: After months of work, I've made a Python program that turns local LLMs running on Ollama into online researchers for you. Literally type a single question or topic, walk away, and come back to a text document full of research content with links to the sources, plus a summary, and you can ask it questions about the findings too! And more!
What My Project Does:
This automated researcher uses internet searching and web scraping to gather information based on your topic or question of choice. The LLM breaks your query down into up to 5 specific research focuses designed to explore different aspects of the topic, prioritises them by relevance, and then systematically investigates each one through targeted web searches and content analysis, starting with the most relevant.
After gathering content from those searches and exhausting all of the focus areas, it reviews what it has found and uses that information to generate new focus areas. In testing it has often found new, relevant focus areas based on findings in research content it had already gathered (for example, a specific case study that it then searches for directly in relation to your topic). This use of already-gathered research to develop new areas to investigate has sometimes led to interesting and novel research focuses that would never occur to humans. Mileage may vary, and this program is still a prototype, but shockingly, it actually works!
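To make the flow concrete, here's a rough sketch of the loop in pseudocode; the function names and structure are illustrative only, not the actual code:

```
# Illustrative sketch of the research loop; names here are hypothetical.
def research(query, llm, search, scrape, max_rounds=10):
    findings = []
    # break the query into up to 5 focus areas, ranked by relevance
    focuses = llm.generate_focus_areas(query, prior=findings)[:5]
    for _ in range(max_rounds):
        for focus in focuses:  # most relevant first
            for result in search(focus):
                findings.append({
                    "focus": focus,
                    "url": result.url,
                    "text": scrape(result.url),  # full content, saved with source
                })
        # focuses exhausted: derive new ones from the gathered content
        focuses = llm.generate_focus_areas(query, prior=findings)[:5]
        if not focuses:
            break
    return findings
```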
Key features:
- Continuously generates new research focuses based on what it discovers
- Saves every piece of content it finds in full, along with source URLs
- Creates a comprehensive summary of the research content when you're done, and uses it to respond to your original query/question
- Enters conversation mode after providing the summary, where you can ask specific questions about its findings and research, including things not mentioned in the summary, as long as the gathered research contains relevant information
- You can run it as long as you want, until the LLM's context is at its max, at which point it automatically stops researching while still letting you get the summary and ask questions. Or stop it at any time, which will cause it to generate the summary
- Also includes a pause feature so you can assess whether enough research has been gathered, then choose to unpause and continue, or to terminate the research and receive the summary
- Works with popular Ollama local models (recommended: phi3:3.8b-mini-128k-instruct or phi3:14b-medium-128k-instruct, the ones I have tested so far and found to work)
- Everything runs locally on your machine, yet it still gives you results from the internet; with only a single query you can get a massive amount of actual research back in a relatively short time
The best part? You can let it run in the background while you do other things. Come back to find a detailed research document with dozens of relevant sources and extracted content, all organised and ready for review, plus a summary of relevant findings, and the ability to ask the LLM questions about those findings. Perfect for hard-to-research and novel questions, for topics you can't be bothered to dig into yourself, or just for satisfying your curiosity about complex topics!
GitHub repo with full instructions:
https://github.com/TheBlewish/Automated-AI-Web-Researcher-Ollama
(Built using Python, fully open source, and should work with any Ollama-compatible LLM, although only phi 3 has been tested by me)
Target Audience:
Anyone who values locally run LLMs, anyone who wants to do comprehensive research from a single input, and anyone who likes innovative and novel uses of AI that (to my knowledge) even large companies haven't tried yet.
If you're into AI, or just curious about what it can do and how easily you can find quality information by having it search online for you, check this out!
Comparison:
Where this differs from pre-existing programs and applications is that it conducts research continuously from a single query, running potentially hundreds of searches, gathering content from each one, and saving that content into a document with links to every website it gathered information from.
Again: potentially hundreds of searches, all from a single query, and not random searches either; each is well thought out and explores a different aspect of your topic/query to gather as much usable information as possible.
Not only does it gather this information, it summarises it all as well. When you end a research session it goes through everything it has found, extracts the relevant parts, and gives you what matters for your question. You can then still ask it anything you want about the research, and it will use any of the gathered information to respond.
To top it all off, compared to services like ChatGPT's web search, this is completely open source and runs 100% locally on your own device, with any LLM model of your choosing. I have only tested Phi 3, but others likely work too!
Just-Contract7493@reddit
wonder if this can take apis....
CuriousAustralianBoy@reddit (OP)
what does that mean?
Just-Contract7493@reddit
an API from, for instance, kobold
CuriousAustralianBoy@reddit (OP)
It's just for Ollama at the moment unfortunately! Getting it to work at all was quite a challenge, and I have never used anything other than llama.cpp or Ollama before, sorry!
But ollama's free so you can still try it if you want.
RedditPolluter@reddit
In addition to using OpenAI's protocol, another thing that could enhance its appeal is enabling the configuration of a maximum number of API calls, to minimize the risk of unexpectedly high API fees. For conservative configurations, to avoid abrupt halting between the stages or layers of complexity preceding the final output, you could have each stage take a percentage of the total quota rather than a fixed number of calls.
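Something like this toy sketch is what I mean (all names made up, nothing from the project): each stage reserves a share of whatever quota remains rather than a fixed call count:

```
# Toy sketch of a percentage-based call budget; names are hypothetical.
class CallBudget:
    def __init__(self, total_calls):
        self.remaining = total_calls

    def stage_quota(self, fraction):
        # reserve a fraction of the remaining quota for one stage
        quota = int(self.remaining * fraction)
        self.remaining -= quota
        return quota

budget = CallBudget(total_calls=100)
search_calls = budget.stage_quota(0.6)   # 60 calls for the search stage
summary_calls = budget.stage_quota(0.5)  # 20 calls (half of the remaining 40)
```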
RedditPolluter@reddit
Making it compatible with OpenAI's endpoint is the most standardized way of maximizing support, since even Google has now added support for that same protocol. The Mistral API and Ollama also support it. It's not Ollama's default API, but it's exposed by mirroring the same paths, like /v1/chat/completions. If you use the libraries for OpenAI's API, people can simply swap out the domain name and plug in virtually anything.
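For example, with the openai Python library, pointing at a local Ollama server is just a base_url swap (the model name here is whatever you've pulled locally):

```
# The same client code talks to Ollama's OpenAI-compatible endpoint;
# only base_url (and the model name) differ from a hosted provider.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")  # key is ignored by Ollama
resp = client.chat.completions.create(
    model="phi3:3.8b-mini-128k-instruct-q6_K",
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
```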
GimmePanties@reddit
I sent you a PR which adds wider support. It does OpenAI calls now, which supports pretty much anything that isn't Anthropic; just set the base URL to your preferred endpoint.
NEEDMOREVRAM@reddit
Any idea how hard it would be to upload your files to Qwen 2.5 32B Coder and have her modify them so that instead of Ollama it uses an API from Oobabooga or Kobold? I know nothing about code (I just started learning Python). And thank you for creating this program; I think you are on to something big here.
Just-Contract7493@reddit
oh it okie and thank
Fragrant-Purple504@reddit
Took a quick look and can see you've put some thought and effort into this, thanks for sharing! Will hopefully get to test it out this week.
CuriousAustralianBoy@reddit (OP)
Thanks very much! Yeah, it took a lot of thought and quite a lot of effort to get it functional. I appreciate it! Let me know what you think!
help_all@reddit
Does it take care of bot detection when scraping sites? Most sites will have it.
JustinPooDough@reddit
OP should consider enhancing this with undetected-chromedriver and possibly also proxy-server rotation. Basically, implement the strategies that existing industrial scraping solutions use.
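Something along these lines, as a sketch only; the proxy list is a placeholder, and real pipelines use a rotation service:

```
# Sketch: undetected-chromedriver with naive proxy rotation.
import random
import undetected_chromedriver as uc

PROXIES = ["http://proxy1:8080", "http://proxy2:8080"]  # placeholders

def fetch(url):
    options = uc.ChromeOptions()
    options.add_argument(f"--proxy-server={random.choice(PROXIES)}")
    driver = uc.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()
```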
ViperAMD@reddit
Use SeleniumBase, even better!
MmmmMorphine@reddit
Phew, so glad someone beat me to the punch here. I knew something like this (background as well as active/chat-available knowledge development and iteration, with 'versioning' and references, using almost exclusively local resources, to put it as briefly as possible) would necessarily be part of my larger goal.
Thank you for your work, this does look very promising indeed
David_Delaune@reddit
I think it's a really neat project, thanks for sharing it.
NewZealandIsNotFree@reddit
Awesome!
TheTerrasque@reddit
Looks nice. I haven't really looked at the code yet, but some suggestions:
bronkula@reddit
It seems odd to suggest supporting openai, when it seems the whole pitch is local llm usage.
bunchedupwalrus@reddit
It’s become the standard in a lot of ways tbf. Simplifies swapping providers from local to cloud etc
ForsookComparison@reddit
a necessary evil that doesn't really benefit OAI that much. I'll pay that tax.
allegedrc4@reddit
It's not even a necessary evil. It's just something that has OpenAI's name on it but nothing to do with them, from your perspective.
RazzmatazzReal4129@reddit
I think they mean openai api...not openai the hosted llm. It's just a standard for communicating with a llm.
rhet0rica@reddit
To clarify the other responses, the API is just the protocol that chatbots use to communicate with frontends. Everyone standardized on the format that OpenAI originated for their own services because it was a decent design. Tools must use the same API to be compatible.
my_name_isnt_clever@reddit
I host my own LiteLLM proxy so I can run everything through it. If something supports openai spec I can use any models I want.
TheTerrasque@reddit
Most local LLM solutions that offer an API support the OpenAI API.
The_Seeker_25920@reddit
Great suggestions here, this is a cool project, maybe I’ll throw some of these in a PR
TheTerrasque@reddit
Someone already made a PR adding OpenAI API support... which got rejected.
my_name_isnt_clever@reddit
This is probably not the project for me then. That's a shame.
AdHominemMeansULost@reddit
the curses requirement you have doesn't exist on Windows
simqune@reddit
Yeah, on Windows you want to open requirements.txt and change curses-windows to windows-curses
solidsnakeblue@reddit
I solved this by installing 'windows-curses', hope it helps
Throwawaytodelete123@reddit
PLEASE add support for Sci-Hub. Sci-Hub is what nearly every scientist uses to access scientific papers that are hidden behind paywalls.
Sci-Hub lets you take the DOI of a scientific paper and access it directly. I don't know how to code, but there are a few unofficial APIs...
https://github.com/Tishacy/SciDownl
https://pypi.org/project/scihub/
https://github.com/zaytoun/scihub.py
https://scihub.copernicus.eu/twiki/do/view/SciHubWebPortal/APIHubDescription
Main site... https://sci-hub.se/
For doing any research it's essential that everyone can access everything. Ironically, without it only "free" science gets more citations because it's easier to access, which means privatized science doesn't contribute as much, and then all of us miss out; it's bad for science. This might actually make a huge improvement to your research model, because when it finds a paper it will nearly always be able to download it.
I've been trying to write this ability into the SAKANA AI Scientist, but I don't know coding.
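Roughly, what those wrapper libraries do is resolve a DOI to the PDF embedded in a mirror page. A hedged sketch of the idea (mirror URL and page markup change often, so treat this as illustrative only):

```
# Illustrative only: fetch a Sci-Hub page for a DOI and pull out the
# embedded PDF link. Mirror availability and markup vary.
import requests
from bs4 import BeautifulSoup

MIRROR = "https://sci-hub.se"  # example mirror

def pdf_url_for_doi(doi):
    page = requests.get(f"{MIRROR}/{doi}", timeout=30)
    soup = BeautifulSoup(page.text, "html.parser")
    embed = soup.find("embed") or soup.find("iframe")
    if embed and embed.get("src"):
        src = embed["src"]
        return src if src.startswith("http") else "https:" + src
    return None
```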
GoogleOpenLetter@reddit
Can confirm. It's near universal that scientists hate these paywalls; the scientists don't get paid anything, only the publisher does. The authors want their papers disseminated as much as possible, and they'll always give them for free to anyone who asks. There's even a Twitter account dedicated to doing this called "I Can Haz PDF?"
https://www.ibtimes.co.uk/i-can-haz-pdf-academics-tweet-secret-code-word-get-expensive-research-papers-free-1525241
Large institutions pay pennies because they buy subscriptions, but for anyone else, including people in developing countries, the charge is normally around $40 USD per paper. Essentially all it does is act as a huge barrier to entry that prevents scientific participation by the little guy. The system is a total scam.
Incorporating it here makes a LOT of sense. It's probably easier for the bot to do it this way, because it's a direct link with none of the messing around.
anticapacitor@reddit
Oh wow! What a share! I'm kinda speechless but I have to say thank you! Really!
I even managed to get it going. It's running right now, so I can't say how it works out in the end, but apparently it's going very well! I was dumbfounded at first about what I was even going to research as a test lol.
Btw, in the instructions for git clone, you have "YourUserName" instead of your actual GitHub name, just FYI.
Oh, and I found that I actually had to name the Ollama model "custom-phi3-32k-Q4_K_M" regardless of what I used in the FROM field of the "Modelfile" file (I used mistral-nemo 12B Q4_0 atm). At a 38000 context length, my 16 GB VRAM splits it at 3% CPU / 97% GPU (so not much slowdown).
(Ah, I found where to change the model name now, guess I should RTFM to the end 😁)
CuriousAustralianBoy@reddit (OP)
haha thanks, I just fixed the readme; I was in a rush to get it out there.
And yeah, the llm_config.py file is where you change the model name and other settings!
Thanks for your input! I was speechless too when I saw how it worked, shocked that I could make something like this in a month or two, although I did spend most of my time on it. Just glad it seems to have resulted in something quite cool by the end!
But let me know what you think!
Purple-Test-7139@reddit
I'm still not super sure how to set it up. This might be too basic, but would it be possible for you to give slightly more detailed / exact syntax for the setup?
CuriousAustralianBoy@reddit (OP)
well where are you getting stuck?
You need Ollama; after that's set up completely, follow these steps (I'll put everything I can think of so that you don't miss any steps):
1. Start the Ollama server:
ollama serve
2. Download the model:
ollama run phi3:3.8b-mini-128k-instruct-q6_K
3. Make a file named MODELFILE containing:
FROM phi3:3.8b-mini-128k-instruct-q6_K
PARAMETER num_ctx 38000
4. Once the model is done downloading and lets you talk to it from the ollama run window, close that window and open a new one, then make a Python virtual environment by typing in the terminal (the first bit is to navigate to the program files):
cd Automated-AI-Web-Researcher-Ollama
python -m venv venv
source venv/bin/activate
5. Then, in that terminal, once you're in the virtual environment, type:
pip install -r requirements.txt
This will install the requirements. When it's done, with the ollama serve window still running, type in the terminal (the one you installed the requirements in is fine):
ollama create research-phi3 -f MODELFILE
6. Now, the last thing before running the program: open the llm_config.py script and you will see a section that looks like this:
LLM_CONFIG_OLLAMA = {
    "llm_type": "ollama",
    "base_url": "http://localhost:11434",  # default Ollama server URL
    "model_name": "custom-phi3-32k-Q4_K_M",  # Replace with your Ollama model name
    "temperature": 0.7,
    "top_p": 0.9,
    "n_ctx": 55000,
    "context_length": 55000,
    "stop": ["User:", "\n\n"]
}
Where it says model name, replace "custom-phi3-32k-Q4_K_M" with the model you just made from the model file, which would be "research-phi3", then save it.
7. Run the program:
python Web-LLM.py
And that's it, it should work! Please let me know if you have any issues! Sorry for the long guide, I just wanted to make it as clear as possible!
Gilgameshcomputing@reddit
Brilliant, thank you for leading us non-coders step by step. Much appreciated 🙏🏻
clust3rfuck@reddit
Can you point out where this llm_config.py file is? I am very new to this stuff.
Icy_distribution8763@reddit
Dude! This is amazing! Definitely will be trying it out this week
D0TTTT@reddit
Looks great! Will try it out on the weekend and let you know. Thank you for sharing this.
PurpleReign007@reddit
Awesome! Stoked to try this on some research
solidsnakeblue@reddit
How are all the Windows users getting this to run? I'm getting a "No module named 'termios'" error (research_manager.py line 17), and Google suggests that's not something Windows can install.
solidsnakeblue@reddit
ChatGPT and I rewrote the parts around the termios problem and got it running. So far I've got it working with:
LM Studio
OpenAI
OpenRouter
Google API via an OpenAI proxy
This thing is great! I've been making some tweaks to the amount it scrapes per site and the number of sites.
It doesn't seem to summarize correctly, though; I'm having to grab the .txt file and give it to the AI manually. I'll try to solve that next.
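For anyone else hitting the termios wall on Windows, the guard we ended up with looks roughly like this (a sketch of the idea, not the exact patch):

```
# Sketch of a cross-platform single-keypress read: termios is POSIX-only,
# msvcrt is the Windows equivalent.
import sys

try:
    import termios
    import tty
    HAVE_TERMIOS = True
except ImportError:  # Windows
    import msvcrt
    HAVE_TERMIOS = False

def read_key():
    if HAVE_TERMIOS:
        fd = sys.stdin.fileno()
        old = termios.tcgetattr(fd)
        try:
            tty.setraw(fd)
            return sys.stdin.read(1)
        finally:
            termios.tcsetattr(fd, termios.TCSADRAIN, old)
    return msvcrt.getwch()
```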
CheatCodesOfLife@reddit
Thanks for this, explains a lot. I hate dealing with ollama and am very glad exllamav2 is adding vision models now.
trantrungtin@reddit
RemindMe! 1 month
Ornery_Meat1055@reddit
After a bit of tinkering I got it working and submitted my research question via @ ... and Ctrl+D (I'm on Mac).
But I don't see anything progressing so far?
eleqtriq@reddit
Can’t wait to try it!
Arkonias@reddit
I don't use ollama, I use LM Studio. Is it easy to use the LM Studio API with it?
ForsookComparison@reddit
Ollama is as user-friendly as these tools get I feel. It's worth spending 15 minutes or so with to figure it out.
CuriousAustralianBoy@reddit (OP)
It was written for Ollama; I have never used LM Studio, unfortunately.
hugganao@reddit
awesome! thanks for the share. So which opensource model have you had the best results with?
Ornery_Meat1055@reddit
I want to see the hallucination benchmark for this
Daarrell@reddit
I got some error trying to install curses requirement on windows :/
Affectionate_Bet_820@reddit
Awesome work OP, kudos. I was always in need of such a solution and will definitely explore it. By the way, does anyone have a list of similar open-source tools/research agents? I'm still figuring out what will be best for my use case, so I'd appreciate it if somebody could point me towards a repo listing such agentic frameworks customised for academic research work, if any.
UsualYodl@reddit
Just need to add my positive feedback to the concert. This is great and inspiring. Note: I recently gave myself 6 months to come up with something similar (I want a feature that weighs research quality according to various criteria). Thank you for sharing, and for the inherent tips!
Ben52646@reddit
You are awesome! Thank you!!
flitzbitz@reddit
!RemindMe 1 month
fleiJ@reddit
!remind me 4 weeks
dogcomplex@reddit
RemindMe! 4 weeks
fleiJ@reddit
Hm this is not how it works I guess😂
CuriousAustralianBoy@reddit (OP)
Thanks very much!
planetearth80@reddit
Please consider adding docker support. Great work!
The_Seeker_25920@reddit
Super cool project, I need to check this out! Are you open to having more contributors?
ElegantGrocery8081@reddit
Testing it now! Super simple to set up. Thanks from Spain
RikuDesu@reddit
Looks cool. I've been using ScrapeGraphAI to do something similar, but I'm excited to try your solution.
bladablu@reddit
Looks great, thank you for sharing, I will test it soon and report back.
schorhr@reddit
This is awesome.
Still, now I'm curious what'll happen if your research topic is "actually, the earth is flat", and what kind of sources it would dig up ;-)
Platfizzle@reddit
No option for a self hosted openai endpoint for Tabby/etc users?
DomeGIS@reddit
Hey, this is great, this was exactly what I was looking for! I was always wondering why nobody had built it so far.
I just had a peek at the web scraping part and noted that it "only" scrapes the HTML. If you call it a "research" assistant, it might be mistaken for academic research, which would require scientific resources like papers.
In case you want to consider Google Scholar papers as additional resource: https://github.com/do-me/research-agent It's very simple but works.
A friend of mine developed something more advanced: https://github.com/ferru97/PyPaperBot
winkler1@reddit
The `phi3:14b-medium-128k-instruct` model referenced in the readme seems invalid?
```
researcher ❯ ollama create research-phi3 -f modelfile
transferring model data
pulling manifest
Error: pull model manifest: file does not exist
```
https://ollama.com/search?q=phi3%3A14b-medium-128k-instruct -> no models found
CuriousAustralianBoy@reddit (OP)
You have to pick a specific quant; I swear I put an example in the readme, but just look for one in the model list on the Ollama website.
winkler1@reddit
Ahhh... it's under the View All. Was not seeing any instruct models in the short list. Thx.
goqsane@reddit
Starred. Am impressed. Please tell me, do you support the use case of a separate Ollama instance (i.e. not on your computer but on another one on your network)? I've got a whole server full of LLMs and I don't like running them on my "work" computer.
LoadingALIAS@reddit
Yeah, this is awesome. Thank you so much. I'm about to dive in!
estebansaa@reddit
This is awesome. You could remove the context window limitation by using RAG to store and retrieve the data as the work progresses.
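A minimal sketch of the idea, assuming sentence-transformers (the model name is just an example): embed each scraped chunk as it arrives, then retrieve only the top-k relevant chunks back into the prompt.

```
# Minimal RAG sketch: embed chunks as they arrive, retrieve top-k per query.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # example model

class ChunkStore:
    def __init__(self):
        self.chunks, self.vectors = [], []

    def add(self, text):
        self.chunks.append(text)
        self.vectors.append(model.encode(text, normalize_embeddings=True))

    def top_k(self, query, k=5):
        q = model.encode(query, normalize_embeddings=True)
        scores = np.array([v @ q for v in self.vectors])
        return [self.chunks[i] for i in scores.argsort()[::-1][:k]]
```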
frobnosticus@reddit
Okay. This seems fantastic. I can't get involved in the yak shaving required to play with it right now, but I absolutely will.
o7
!RemindMe 1 week
micseydel@reddit
How easy would it be to point it at and limit it to a local Obsidian vault? (No internet access.) An Obsidian vault is essentially just a wiki of Markdown notes that use [[wikilinks]].
wortelbrood@reddit
remindme! 1 week
theeditor__@reddit
nice! Have you tried using more powerful LLMs? A video demo would be nice!
CuriousAustralianBoy@reddit (OP)
haha, I just uploaded a video demo to the GitHub before I saw this comment, check it out!
And no, I have not; my computer isn't very powerful and I was very focused on getting it working properly rather than testing different LLMs. If you feel like testing it, though, I'd be curious to hear how it performs!
All good if not; I'm just happy it's done, it took months!
LeBoulu777@reddit
I'm new to AI and I'm building a computer with 2 x 3060 = 24 GB VRAM. Would that be enough to use your script in an efficient way? 🤔
NEEDMOREVRAM@reddit
Do you think a Q8 quant would perform better? And would it be hard for a n00b like me to modify OP's Python code using Qwen 2.5 Coder 32B so that instead of Ollama it uses an API from, say, Kobold or Oobabooga?
GimmePanties@reddit
It wasn't hard at all. I added support for Anthropic and OpenAI / OpenAI-like models. That lets you use it with Kobold and Oobabooga, because they support OpenAI calls; just set your server as the base URL.
I've sent OP a PR, but you can grab the fork here: https://github.com/NimbleAINinja/Automated-AI-Web-Researcher-Hosted - the files you want are llm_config.py and llm_wrapper.py
CuriousAustralianBoy@reddit (OP)
I reviewed it; does it work? I really would be surprised: none of the functions are written for it and there are thousands of lines of code. If it works, that's great, I just seriously don't think it would, based on what I saw in your code.
But if it does, please let me know!
GimmePanties@reddit
Of course it works, lol, I wouldn't have submitted it without testing. It's not thousands of lines of code that needed to be rewritten; the API endpoint is an abstraction, so whether you're passing the prompt to Ollama, to llama.cpp, or to the openai library (which has the thousands of lines of code already written for you), it's functionally the same to the rest of your code in that it returns a response.
All the local providers (LM Studio, Ollama, Oobabooga, Kobold, etc.) provide an OpenAI-compatible endpoint, as do most of the online providers. The nice part of this for a developer is that you can write one bit of code and then repoint to different providers just by changing the base_url, the model, and optionally the api_key.
Anthropic is the only one I can think of that doesn't have an OpenAI endpoint, now that Google implemented one last week. But Anthropic has their own library, which is just as simple to call.
ab2377@reddit
Great!!
eggs-benedryl@reddit
I'll have to look at this once I get home. I will say, stuff like this probably gets a bit more adoption if it has even a basic UI, but if it works, it works.
Very cool looking
candre23@reddit
How does this account for the fact that at least half the information on the internet is factually false, and 87% of statistics are made up? Is there any kind of source validation process, or is this a likely scenario:
BusRevolutionary9893@reddit
The ocean will rise about 6.5 inches, or 165 mm, and no one will notice. Oxygen levels are higher due to increased vegetative growth from the increase in CO2, accompanied by greater crop harvests. There are fewer climate-related deaths because cold weather kills far more people than hot weather. Finally, climate grifters are once again predicting the upcoming ice age that will end humanity if our world governments don't give huge sums of money to large corporations to save us from climate change. Society once again fails to see through the lobbying campaigns and propaganda, because still so few people can think for themselves.
Orolol@reddit
It's always sad to see people fall for Oil corporations propaganda.
BusRevolutionary9893@reddit
In 2023 $528 billion was spent on oil and natural gas worldwide vs $1.8 trillion on clean energy.
beybileyt@reddit
I like your attitude.
BusRevolutionary9893@reddit
There are actual upsides to an increase in temperature. There's no reason to scare children into thinking we're all going to die. Humans have existed with much higher and lower global temperatures. With modern advancements, I think we'll be fine.
AuggieKC@reddit
People in general don't like to hear that the vast areas of permafrost that will become agriculturally viable hugely outweigh the already highly variable coastlines they are so attached to.
pohui@reddit
Bad bot.
WhyNotCollegeBoard@reddit
Are you sure about that? Because I am 99.95379% sure that BusRevolutionary9893 is not a bot.
CuriousAustralianBoy@reddit (OP)
why don't you actually test it before you try to diss the thing mate!
Dyonizius@reddit
I was going to ask this too. Which search engine does it support? Can you slap a self-hosted search aggregator on there?
SillyHats@reddit
It's a respectful phrasing of a legitimate concern. I completely understand perceiving it as an attack, but this sort of thing you have to just force yourself to not take personally.
CuriousAustralianBoy@reddit (OP)
ALSO, the beauty of this is that you can directly see every single source it used in the summary in the research text file: what site it came from and what it said. So if you're wanting to check the authenticity of the research, you absolutely can!
Dax_Thrushbane@reddit
Thanks for this - looks very good.
Psychedelic_Traveler@reddit
Support for LM Studio?
SillyHats@reddit
I think you can rely on your audience to have their inference backend already set up fine. (I figure, anyone who would need to be walked through installing ollama, is probably not going to be interested in trying a CLI tool). So, simplify all of that stuff in the README down to "give my thing your backend's URL in [whatever way]".
And make that [whatever way] something clean and straightforward - right now I see llm_config.py has a base_url field for ollama, but not llama.cpp, so I'm not clear how it would even use llama.cpp; I guess it just assumes 127.0.0.1:8080? (I have my llama.cpp setup all nicely tucked away on its own server, so any AI tool that wants any backend configuration beyond a single plain URL is a non-starter for me; I would imagine a lot of other local people might be similar)
But: it looks like you've done a good deal of coding legwork to build a thing I've been wanting, so thanks very much! I wouldn't be critiquing it if I didn't think there was something worthwhile here! I'm definitely going to take a close look. Also this appears vastly more presentable than anything I would have thrown together in college, lol
No-Refrigerator-1672@reddit
I've tried this script and want to provide some feedback:
I don't have phi3 downloaded, so I tried the script with both Llama3.2-vision:11b and Qwen2.5:14b, giving them up to 15 minutes to do the research. In both cases the script did not work as expected. Both models generate completely empty research summaries. Both models investigate the same search query over and over again, occasionally changing one or two words in the query. Llama3.2-vision always assesses the research as sufficient, but then generates empty summaries and answers that there's not enough data in Q&A mode. Qwen2.5 seems to adequately assess the research itself, but completely fails at Q&A. At the moment it seems like the project is incompatible with anything but phi3. I may download phi3 and test it again later.
In case you need an example, below are my test results with Qwen 2.5.
================================================================================
RESEARCH SUMMARY
================================================================================
Original Query: Compare Tesla M40 and P102-100 refromance for LLM inference using llama.cpp, ollama, exllamav2, vllm, and possibly other software if you find it.
Generated on: 2024-11-20 12:19:13
### Comparison of Tesla M40 and P102-100 for LLM Inference Using llama.cpp, ollama, exllamav2, vllm
================================================================================
End of Summary
================================================================================
================================================================================
Research Conversation Mode
================================================================================
Instructions:
- Type your question and press CTRL+D to submit
- Type 'quit' and press CTRL+D to exit
- Your messages appear in green
- AI responses appear in cyan
Your question (Press CTRL+D to submit):
So what's your verdict?
Submitted question:
So what's your verdict?
AI Response:
Based on the provided research content:
--------------------------------------------------------------------------------
Your question (Press CTRL+D to submit):
CuriousAustralianBoy@reddit (OP)
Yeah, I did find llama3 to work as well; honestly though, try phi3 instead, it's just a quick download away! Again, it's literally the model I tested the whole program with, so I have no clue what any others are liable to do. You can try medium or mini.
HOWEVER, as I specified in the instructions on the GitHub page, you need to make a new custom model, because even models that should have larger context sizes are for some reason defaulting to something like a 2k-token max. If you don't do that, it could explain the lack of summary, since the LLM just runs out of context after a couple of searches. But that's just a theory!
Eugr@reddit
You can set the context window size in the request. For Ollama it's the num_ctx parameter in options. You could even set it automatically to the max size the model supports by reading the model info and getting its trained context size from there.
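i.e. something like this against Ollama's native API (the model name is just an example):

```
# Passing the context window per request via Ollama's native API.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "phi3:3.8b-mini-128k-instruct-q6_K",
        "prompt": "Hello",
        "stream": False,
        "options": {"num_ctx": 38000},  # context size for this request only
    },
)
print(resp.json()["response"])
```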
GimmePanties@reddit
OP, nice work. I would consider ignoring robots.txt, because that's more for mass web scraping and this is a user-directed tool. I was getting a lot of URLs skipped because it was being enforced.
Search mode doesn't seem to be implemented? Research mode works fine.
CuriousAustralianBoy@reddit (OP)
I have considered it, maybe I will in the future. And yeah, search mode is actually a leftover from this program's predecessor, which only really did web searches and scraping; in the process of implementing this massive new version I totally broke it and haven't had the time to fix it. Thanks for the input though!
The research mode is essentially a better version of the search anyway!
GimmePanties@reddit
Okay. Maybe take the / operator out of the menu then? It confused me, since it's the first option.
CuriousAustralianBoy@reddit (OP)
yeah I will do that right now good call
Butthurtz23@reddit
Wow, it's great that it will totally shut down those fake Karen researchers! Too bad universities just love to accept papers that have been peer-reviewed by experts like Karen with doctoral degrees. Your project will be challenged, but don't let that deter you from continuing your work!
obsolesenz@reddit
I want to use this to change my digital footprint. Does this make sense? I hate the ads I get from data brokers. I want to have this running on my M4 Mac mini with 16 GB of RAM that I'm currently using as an Apple TV replacement, and just have it search AI scholarly stuff 24/7, or better yet guitars. I have a visceral hatred for data brokers and would love to poison my profile with this!
CuriousAustralianBoy@reddit (OP)
I don't think you can poison your profile with this, because:
but that's just imo
drAndric@reddit
Instant Star. I'll check it later today, sounds awesome. Thanks for the hard work and sharing.
CuriousAustralianBoy@reddit (OP)
thanks mate I appreciate it!
no_username_for_me@reddit
This looks kinda cool, but perhaps a demonstration of something the top LLMs can't handle off the shelf would be more effective. Here is 4o's response to the question you asked in the demo, which I think you will agree is much more comprehensive:
Smelling salts work through a combination of chemical and biological mechanisms that stimulate the body's nervous system. Here's a breakdown of their function:
Chemical Mechanism
Smelling salts typically contain ammonium carbonate (NH₄HCO₃) or a similar compound. When the salts are exposed to air, they release ammonia gas (NH₃), which has a strong, pungent smell.
- Reaction: the compound decomposes when exposed to air or heat: NH₄HCO₃ → NH₃ + CO₂ + H₂O
- The released ammonia gas is highly volatile and easily inhaled through the nose.
Biological Mechanism
1. Irritation: when inhaled, the ammonia gas irritates the nasal mucosa and the lining of the respiratory tract. This irritation triggers a reflexive response in the nervous system.
2. Activation of the sympathetic nervous system: the irritation stimulates the trigeminal nerve (cranial nerve V), which activates the sympathetic nervous system. This results in:
- Increased respiratory rate (rapid inhalation).
- Increased heart rate and blood pressure.
- Heightened alertness.
3. Arousal response: the body's reflexive reaction to the strong odor acts as a wake-up call, jolting a person out of a faint or drowsy state. It can momentarily counteract light-headedness or fainting by increasing oxygen intake and blood flow to the brain.
Uses and Limitations
Safety Concerns
Overuse or prolonged exposure to ammonia gas can cause:
- Irritation to the respiratory system.
- Damage to mucous membranes.
- Reflexive choking or coughing.
They should be used sparingly and under proper supervision.
CuriousAustralianBoy@reddit (OP)
It's just a demo to show how it works; if you wanna test it out more thoroughly, feel free to do so!
Bil_Wi_theScience_Fi@reddit
Saved! Can’t wait to try it out
InsightfulLemon@reddit
+1 Saving for later
young_picassoo@reddit
Really nice
shepbryan@reddit
Hurrah!
wontreadterms@reddit
This is neat. It's an interesting implementation of CoT with web scraping. It would be interesting if the agent had other tools to retrieve information, not just web scraping, like direct API access to search engines.
It would be amazing to port this flow as a Task in my framework: https://github.com/MarianoMolina/project_alice
The web scraping functionality, and other search APIs, are already implemented as tools/tasks, and a complex task flow like yours could be a good way of exploiting all these tools while building better/more complex agent structures. You can use multiple API providers by default, including LM Studio as a local deployment.
I'm planning on adding CUDA support by v0.4 (and probably Ollama), and I'm launching v0.3 in a few days with a bunch of cool updates (to get an early look: https://github.com/MarianoMolina/project_alice/tree/development)
yehiaserag@reddit
Is it possible to have this in a docker image or a compose?
helvetica01@reddit
!remindme 1 week
RemindMeBot@reddit
I will be messaging you in 7 days on 2024-11-27 12:41:46 UTC to remind you of this link
SophiaBackstein@reddit
I will check that out :)
worry_always@reddit
Looks cool, will try it out soon.
ahmcode@reddit
Looks awesome !!! And the video is very funny, I something today 😁
TanaMango@reddit
Does anyone wanna build some app with me or help me with a project? I really need a bi project for my resume, thank you!
XhoniShollaj@reddit
That's pretty cool, thank you for sharing. A next step could be to integrate agents with resources / cloud services for automated experimentation plus documentation and paper generation.
CuriousAustralianBoy@reddit (OP)
The whole point is that it's locally running, free, and doesn't require external services. And why would I generate papers, especially when I'm running such small LLMs?
It's only a 3.8b model that I mostly tested on. The point is to find reliable, real papers and information, and then gather the info for you from real research, with links to it, not to try and make papers up yourself!
That's just my 2 cents on it, but thanks for the input!
jacek2023@reddit
step towards AGI :)