Announcing WikipediaKit: Open source Wikipedia API access

I’m pleased to announce the release of WikipediaKit, a Xojo module that lets you query the free, public Wikipedia API to retrieve descriptions, excerpts and the full HTML content of articles.

Repository

Project page: garrypettet.com
GitHub repository: https://github.com/gkjpettet/WikipediaKit

Usage

The bulk of the work is done by the WikipediaKit.Search class. It performs synchronous queries to the Wikipedia API. This is a free API that doesn’t require an API token. You should supply your own user agent string with each request (the API requires this).

The class can either run a search query and return the HTML contents of the page that is the best match to the query string (determined by the API) or it can return an array of search results ordered by closest match (lowest array index is the best match).

Var engine As New WikipediaKit.Search("MySearcher")

// Let's search for "Jupiter".
Var query As String = "Jupiter"

// Get the full page HTML contents for the best match.
// If html is empty then the search failed.
Var html As String = engine.SearchAndGetPageHTMLContent(query)

// We could also request a bunch of matches from the API:
Var allMatches() As WikipediaKit.SearchResult
allMatches = engine.Search(query, 10) // Limit to 10 results.

// Or we can try to get just the best match (uses our own internal heuristics):
Var bestMatch As WikipediaKit.SearchResult
bestMatch = engine.FindBestMatch(query)

// Valid WikipediaKit.SearchResults will have a `Key` property set by the API.
// You can use this to retrieve a specific page from the API.
Var contents As String = engine.GetPageHTML(bestMatch.Key)

Be aware that the Search class makes synchronous requests, so each call blocks until the API responds.
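If blocking the UI is a problem, one option is to run the query on a Thread. The following is only a rough sketch, not part of WikipediaKit: `SearchThread`, its `Query` and `HTML` properties, and the `mWorker` property are hypothetical names you would add yourself in the IDE. It relies on Xojo's standard `Thread.AddUserInterfaceUpdate` method and `UserInterfaceUpdate` event.

```xojo
// Hypothetical Thread subclass "SearchThread" with two public
// String properties added in the IDE: Query and HTML.

// --- SearchThread.Run event handler ---
// Runs off the main thread, so the blocking call does not freeze the UI.
Var engine As New WikipediaKit.Search("MySearcher")
HTML = engine.SearchAndGetPageHTMLContent(Query)
// Ask the main thread to fire UserInterfaceUpdate.
AddUserInterfaceUpdate("done" : True)

// --- SearchThread.UserInterfaceUpdate(data() As Dictionary) event handler ---
// Runs on the main thread, so it is safe to touch controls here.
If HTML = "" Then
  System.DebugLog("Search failed for: " + Query)
Else
  System.DebugLog("Received " + HTML.Length.ToString + " characters")
End If

// --- Caller, e.g. a button's Pressed event ---
// mWorker is an assumed SearchThread property on the window; it keeps
// a reference to the thread while it runs.
mWorker = New SearchThread
mWorker.Query = "Jupiter"
mWorker.Start
```

An empty `HTML` value is treated as a failed search, matching the comment in the example above.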


thanks a lot, will use it


on first open i have

on first run i have

thanks

Hmm that’s a weird bug.

I hadn’t reopened the project so didn’t see it.

I’ll investigate. I think it’s a compiler bug because there is a class and a method with the same name.


I saw the deprecated constructor warning in 2025r2.1. You should file a ticket; nothing is jumping out at me as obviously wrong. I could not reproduce it in a sample project.

Managed to reproduce the deprecated constructor warning. It does have to do with the class and method having the same name. Honestly, when I write code, if I’m tempted to name a class and method the same thing, I remind myself that means I haven’t made my code self-explanatory.

Attachment: deprecated-constructor.xojo_xml_project (2.3 KB)

Did not get the compiler errors.

Is this constructor for convenience? Without defining the non-parameter constructor as private, we aren’t forced to use the parameter-accepting constructor, which bypasses what reads like your User Agent setup.


The easiest fix here was to simply change the Search class to SearchEngine.

I’m not sure I’m understanding your question @Tim_Parnell: A user agent is required by the Wikipedia API which is why I provide a default one that can be overridden.


Your design doesn’t prevent

var oQuery as new SearchQuery

So if the User Agent is a required parameter, you need to define the constructor without parameters as private to prevent the code above.

I very rarely use magic constructors; I think they make for poor, non-self-explanatory code. When I do use them, the first thing I do is make the non-parameter constructor private.
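As a rough illustration of that pattern (hypothetical method bodies, not WikipediaKit's actual source; `mUserAgent` is an assumed private property), the two constructors on the class might look like this:

```xojo
// Hypothetical layout; mUserAgent is an assumed private String property.

Private Sub Constructor()
  // Private scope: calling New on this class with no arguments
  // will not compile outside the class itself.
End Sub

Public Sub Constructor(userAgent As String)
  // The only constructor callers can reach, so a user agent
  // is always supplied.
  mUserAgent = userAgent
End Sub
```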

works great thanks

So if I understand correctly, you can use this, then use your HTML parser to prepare the content for storing in some LLM?
