Let’s say I want to download a model and run it locally. How the heck do I even get started with that in Xojo? I am not talking about generative AI.
Although I haven’t used them for a bit now, a couple good places to start would be…
Just plan on having plenty of free disk space: IIRC the language models are generally 7+ GB each and range up into the tens of GB, so you can quickly and easily plow through a hundred GB or more. Of course, having a fast Internet connection will help too.
This is all old news (6+ months), which on current AI timelines is ancient, so others may have more recent developments on this front.
You can try llama.cpp: https://github.com/ggml-org/llama.cpp (LLM inference in C/C++)
LM Studio: you set up a local server using any of the models available in LM Studio.
Then on the Xojo side it’s an HTMLViewer or a TCPSocket.
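If you’d rather make the request directly instead of scripting an HTMLViewer or hand-rolling a TCPSocket, a URLConnection also works. Here is a minimal sketch, assuming LM Studio’s local server is running with its default OpenAI-compatible endpoint on port 1234 and that a model is already loaded (the model name below is a placeholder):

```xojo
' Minimal sketch: call LM Studio's local OpenAI-compatible server from Xojo.
' Assumes the server is running on LM Studio's default port 1234 and a model is loaded.
Var userMsg As New Dictionary
userMsg.Value("role") = "user"
userMsg.Value("content") = "Summarize what Xojo is in one sentence."

Var messages() As Variant
messages.Add(userMsg)

Var body As New Dictionary
body.Value("model") = "local-model"   ' placeholder; use the identifier LM Studio shows for your model
body.Value("messages") = messages

Var conn As New URLConnection
conn.SetRequestContent(GenerateJSON(body), "application/json")
' SendSync blocks until the reply arrives (30 s timeout); use Send plus the ContentReceived event for async.
Var raw As String = conn.SendSync("POST", "http://localhost:1234/v1/chat/completions", 30)

' The reply text is in choices(0).message.content of the returned JSON.
Var parsed As Dictionary = ParseJSON(raw)
Var choices() As Variant = parsed.Value("choices")
Var firstChoice As Dictionary = choices(0)
Var msg As Dictionary = firstChoice.Value("message")
MessageBox(msg.Value("content"))
```

The same call should work against any other OpenAI-compatible local server; only the URL and model name change.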
Create your server and call it from anywhere in your lab using the Ollama API.
You will “just” need to create the Xojo client.
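To give an idea of how small that client can be, here is a minimal sketch against Ollama’s /api/generate endpoint, assuming Ollama is running locally on its default port 11434 and the model (llama3 in this example) has already been pulled:

```xojo
' Minimal Xojo client sketch for a local Ollama server.
' Assumes Ollama is listening on its default port 11434 and the "llama3" model has been pulled.
Var body As New Dictionary
body.Value("model") = "llama3"
body.Value("prompt") = "Explain what a large language model is in two sentences."
body.Value("stream") = False   ' ask for one complete JSON reply instead of a stream of chunks

Var conn As New URLConnection
conn.SetRequestContent(GenerateJSON(body), "application/json")
Var raw As String = conn.SendSync("POST", "http://localhost:11434/api/generate", 60)

' With stream = false, the generated text comes back in the "response" field.
Var parsed As Dictionary = ParseJSON(raw)
MessageBox(parsed.Value("response"))
```

In a desktop app you would normally use Send and handle the ContentReceived event instead, so the UI doesn’t block while the model is generating.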
Oh duh. I guess that’s my 3am brain not connecting the dots. I was thinking “run locally” meant “in process,” when “on prem” is more practical.
There are several ways to do it. Running Ollama locally and accessing its web API works, although there is a fair amount of fiddling around to get the request format correct; spaces seemed to be an issue, though I think Ollama may have fixed this in the latest release.
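One way to sidestep most of that fiddling is to never build the request JSON by hand: put the values in a Dictionary and let GenerateJSON handle the escaping. A quick sketch of the difference (the prompt text is just an example):

```xojo
Var prompt As String = "Summarize: ""Hello, world"" in five words"

' Fragile: hand-built JSON breaks as soon as the prompt contains quotes, newlines, etc.
Var fragile As String = "{""model"": ""llama3"", ""prompt"": """ + prompt + """, ""stream"": false}"

' Robust: build a Dictionary and let GenerateJSON handle escaping and spacing.
Var body As New Dictionary
body.Value("model") = "llama3"
body.Value("prompt") = prompt
body.Value("stream") = False
Var safe As String = GenerateJSON(body)
```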
For the main online services, the API calls are documented in detail.