A while ago I installed AutoGPT.
It essentially brings the functionality of ChatGPT to your local computer. It might even work as a Docker image on a Synology NAS.
What helps a lot is that you can add data sources to AutoGPT: sources like the internet, personal documents, etc.
Several language models can be downloaded, ranging from small to large.
I was wondering whether those kinds of language models could be used by a Xojo app…
Whisper, by OpenAI, has a downloadable app that works mainly offline. A YouTube video by TroubleChute explains how to install it on Windows, though installing on a Mac should not be a problem.
From there, Whisper could presumably be called via Shell.
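To make the idea concrete, here is a minimal sketch of calling the Whisper command-line tool from a subprocess, written in Python for illustration (a Xojo app would do the same thing through its Shell class). The file name is a placeholder; `--model` and `--output_format` are standard flags of OpenAI's `whisper` CLI.

```python
import shutil
import subprocess

def build_whisper_command(audio_path: str, model: str = "tiny") -> list[str]:
    """Build the command line for OpenAI's Whisper CLI.

    --model selects the downloaded model size (tiny ... large);
    --output_format txt writes a plain-text transcript next to the input.
    """
    return ["whisper", audio_path, "--model", model, "--output_format", "txt"]

def transcribe(audio_path: str, model: str = "tiny") -> str:
    """Run Whisper in a subprocess and return its console output."""
    if shutil.which("whisper") is None:
        raise RuntimeError("Whisper CLI not found on PATH")
    result = subprocess.run(
        build_whisper_command(audio_path, model),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

In Xojo, the equivalent would be joining the same arguments into a command string and passing it to `Shell.Execute`.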
Embedding these offline AI models opens up a lot of possibilities, I think.
Multiple models can be used.
A “tiny” model could be used in a mobile app for simple tasks: summarizing text, finding basic info in data input, even simple math. More complex text generation could be done in a web app.
I played with it for a bit but found a couple of things. First, despite the ability to pull in data sources from elsewhere, ChatGPT’s results were simply better more often. Secondly, it really ran up my power bill: all that GPU activity cost me more than the $20 I pay for ChatGPT every month.
But it’s great that it exists and that these open source tools are available to everyone. They will get better and better, and eventually I wouldn’t be surprised to see most programming languages and/or OSes ship with some sort of built-in LLM, as ubiquitous as database engines are today.
An offline LLM can be useful, or even essential, when sensitive data needs to be processed.
Security
I work as a freelance cameraman and video editor. Besides the regular TV jobs, I also film at locations with a strict security policy, such as quarterly reports for large companies (ING Bank, Unilever, etc.).
I also occasionally work at the International Court of Justice in The Hague.
Sometimes I need transcriptions of the raw footage and the end product. Typing them in by hand can take a long time, so AI is really helpful. But it goes against the policies of some companies.
No internet connection
Besides sensitive content, there are places where an offline LLM could be handy. A couple of times a year I film a rally with classic cars, and I find myself in places with poor cell reception.
A simple first pass by a small local LLM could work; at a later point, an online LLM could clean up that first pass.
Summarize longer source data stream
Also, creating a first pass with a simple offline LLM doesn’t cost me any tokens. Those tokens aren’t that expensive, but it adds up.
For instance: I have a transcribe module that uses GPT-4 by OpenAI. Longer media files need to be split up:
- I use MBS to convert any media input into low-quality mono audio files
- I use MBS to detect the best places where I could make the splits, without chopping up words
- OpenAI’s Whisper transcribes the audio
- GPT-4 enhances the text
- A local LLM would generate a summary that is fed as context to the subsequent Whisper passes.
Right now, I use a cheap GPT model to do the summarizing.
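The steps above can be sketched as a loop where each chunk’s transcript is enhanced, and a running summary becomes the context for the next chunk. This is a minimal Python illustration of the flow only; the four callables stand in for the MBS conversion/Whisper/GPT-4/local-LLM stages and are hypothetical placeholders, not real APIs.

```python
from typing import Callable

def run_pipeline(
    chunks: list[str],
    transcribe: Callable[[str, str], str],   # stand-in for Whisper: (chunk, context) -> text
    enhance: Callable[[str], str],           # stand-in for the GPT-4 cleanup step
    summarize: Callable[[str], str],         # stand-in for the cheap/local summarizing LLM
) -> str:
    """Process audio chunks in order, feeding a running summary as context."""
    context = ""
    parts: list[str] = []
    for chunk in chunks:
        raw = transcribe(chunk, context)      # context keeps names/terms consistent across chunks
        parts.append(enhance(raw))
        context = summarize(" ".join(parts))  # summary of everything transcribed so far
    return " ".join(parts)
```

The design point is simply that the summary step can run on a small offline model, since it only has to keep later chunks consistent, while the enhancement step is where quality matters.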
…Yes, I totally agree that the quality of a local LLM cannot be compared with an online service. But as an assistant’s assistant, it might work well.