Extracting XLS (Excel) file natively

If it helps, and if your file is an XLSX file, here is a public repo (pure Xojo, no plugin) where I try to do exactly that. Only the reader is usable.

https://github.com/slo1958/sl-xj-lib-xlsx.git

You only need the project/xlsx-lib subfolder. Hope it helps…

@William_Reynolds In your initial post you said

The rest of this thread has been about .xlsX (with the exception of references to plugins, which tend to handle both). Those are completely different file types.

Could you please confirm which type of Excel file you have to process?

I strongly recommend you do not try to parse these files yourself, regardless of which of these Excel file types you actually get. Use a plugin, an external library, or a web service to read the file or convert it to something you can handle. Preferably, ask the party responsible for the external automation to send you a file you can process natively, such as CSV, XML (implementing one or more tables), or JSON.

Hi @Stefan_von_Allmen,

It is XLSX, and these files are coming from an antiquated Warehouse Management System which apparently only offers this format (I’ve asked for CSV, JSON, etc. and got shot down).

The solution might just be to put Microsoft’s Power Automate in the middle: have it ingest and convert the data, then have Xojo grab and use the output. I was just hoping for a Xojo-native solution in order to have fewer external system dependencies.

If it really is XLSX, that’s an easy format to read via Xojo’s XML classes; I’m not sure why the other poster here is so adamantly opposed to this approach. You owe it to yourself to crack open one of these documents in an XML reader or even a text editor and see what you think.

Personally, I’d be eager to avoid looping in another system like Microsoft Power Automate; you’re adding significant complexity and maintenance cost.

Again, as I’ve asked before: did you open the file in a binary editor to see if it’s really XLS(X)?
It could be a simple text file.
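The same check can be scripted instead of done by eye in a binary editor. Below is a minimal sketch in Python (used here only for illustration, not Xojo). It relies on two well-known signatures: a ZIP container such as XLSX starts with `PK\x03\x04`, and a legacy XLS is an OLE2 compound file starting with `D0 CF 11 E0 A1 B1 1A E1`. The function names are hypothetical, and a matching signature only suggests the container type; it does not prove the file is a valid workbook.

```python
# Signatures of the two container formats Excel files normally use.
ZIP_MAGIC = b"PK\x03\x04"                          # XLSX is a ZIP archive
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")     # legacy XLS is an OLE2 compound file


def sniff_excel_kind(header: bytes) -> str:
    """Classify a file by its first bytes (hypothetical helper)."""
    if header.startswith(ZIP_MAGIC):
        return "zip container (possibly XLSX)"
    if header.startswith(OLE2_MAGIC):
        return "OLE2 container (possibly legacy XLS)"
    return "unknown (possibly plain text)"


def sniff_excel_file(path: str) -> str:
    """Read the first 8 bytes of the file at `path` and classify them."""
    with open(path, "rb") as f:
        return sniff_excel_kind(f.read(8))
```

A CSV or tab-separated export that was merely renamed to .xlsx would fall into the "unknown" branch.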

Hi @Jean-Yves_Pochez,

I’m not sure which content you’re pointing me to - here is a look ‘inside’ the xlsx file…

So these are definitely real XLSX files. They can be dealt with using XML methods.

Examining each .XML, they appear to contain structural and formatting information - but no real content. This is uncharted territory for me.
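A likely reason the sheet XML looks empty: in the XLSX format, text cells usually do not hold their text. A cell marked `t="s"` stores only an index into a separate table in `xl/sharedStrings.xml`. The sketch below illustrates that lookup in Python (for illustration only; it is not Xojo and not code from any library mentioned in this thread). The function names are hypothetical, the default sheet path `xl/worksheets/sheet1.xml` is an assumption, and every cell is assumed to carry an `r` reference attribute.

```python
import xml.etree.ElementTree as ET
import zipfile

# Main SpreadsheetML namespace used by workbook, sheet and shared-string parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def load_shared_strings(zf: zipfile.ZipFile) -> list:
    """Return the shared-string table as a list (empty if the part is absent)."""
    if "xl/sharedStrings.xml" not in zf.namelist():
        return []
    root = ET.fromstring(zf.read("xl/sharedStrings.xml"))
    strings = []
    for si in root.findall("m:si", NS):
        # An <si> holds either a single <t> or several rich-text runs <r><t>..</t></r>.
        pieces = si.findall("m:t", NS) + si.findall("m:r/m:t", NS)
        strings.append("".join(t.text or "" for t in pieces))
    return strings


def read_sheet_cells(zf: zipfile.ZipFile,
                     sheet_path: str = "xl/worksheets/sheet1.xml") -> dict:
    """Map cell references such as 'A1' to their values, all returned as text."""
    shared = load_shared_strings(zf)
    root = ET.fromstring(zf.read(sheet_path))
    cells = {}
    for c in root.findall("m:sheetData/m:row/m:c", NS):
        ctype = c.get("t")
        v = c.find("m:v", NS)
        if ctype == "inlineStr":
            # Text stored directly inside the cell instead of the shared table.
            value = "".join(t.text or "" for t in c.findall("m:is/m:t", NS))
        elif v is None or v.text is None:
            continue  # styled but empty cell
        elif ctype == "s":
            value = shared[int(v.text)]  # index into xl/sharedStrings.xml
        else:
            value = v.text  # numbers, date serials, booleans: left as raw text
        cells[c.get("r")] = value
    return cells
```

Opening the archive with `zipfile.ZipFile(path)` and passing it to `read_sheet_cells` returns a dictionary keyed by cell reference. Turning the raw values into numbers or dates is deliberately left out of this sketch.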

Well, if you do not want to use a plugin, you can have a look at sl-xj-lib-xlsx on GitHub (see link above). The most common issues (shared strings, type guessing, finding the selected sheet from the catalog, …) are handled there. It has been tested with XLSX files generated by WPS and OnlyOffice. The only potential issue is file size, since I’m using XMLDocument() to load the catalog, the worksheets, …
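As a rough illustration of what "finding a sheet from the catalog" involves: `xl/workbook.xml` lists each sheet by name with a relationship id, and `xl/_rels/workbook.xml.rels` maps that id to the part holding the sheet. The Python sketch below shows that two-step lookup. It is not the library's code; `list_sheets` is a hypothetical name, and the sketch assumes both parts exist at their conventional paths.

```python
import posixpath
import xml.etree.ElementTree as ET
import zipfile

MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
DOC_REL_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def list_sheets(zf: zipfile.ZipFile) -> dict:
    """Return {sheet name: path of the sheet part inside the archive}."""
    # Step 1: relationship id -> target, from the workbook's .rels part.
    rels = ET.fromstring(zf.read("xl/_rels/workbook.xml.rels"))
    targets = {
        rel.get("Id"): rel.get("Target")
        for rel in rels.findall("{%s}Relationship" % PKG_REL_NS)
    }

    # Step 2: sheet name -> relationship id, from the workbook catalog.
    workbook = ET.fromstring(zf.read("xl/workbook.xml"))
    sheets = {}
    for sheet in workbook.findall("{%s}sheets/{%s}sheet" % (MAIN_NS, MAIN_NS)):
        target = targets[sheet.get("{%s}id" % DOC_REL_NS)]
        if target.startswith("/"):
            path = target.lstrip("/")  # absolute target within the package
        else:
            # Relative targets are resolved against the workbook's folder, xl/.
            path = posixpath.normpath(posixpath.join("xl", target))
        sheets[sheet.get("name")] = path
    return sheets
```

The returned path can then be read from the same archive and parsed as a worksheet, rather than guessing a file name such as sheet1.xml.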

You can probably import a few things from there in your project.

Help yourself! 🙂

That’s awesome! I’ve been playing with it all day and have modified it so that it no longer extracts to a temporary folder. It can now load, parse, and extract zip/XLSX files directly in memory. All without needing a 3rd party plugin or library. Just plain old Xojo.
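The Xojo changes themselves are in the Pull Request and are not reproduced here. To illustrate the general idea of skipping the temporary folder, here is a minimal Python sketch that opens the archive from a byte buffer and keeps the XML parts in a dictionary. `read_xlsx_parts` is a hypothetical name, and keeping only `.xml` and `.rels` entries is an assumption made to keep the example small.

```python
import io
import zipfile


def read_xlsx_parts(data: bytes) -> dict:
    """Unzip an XLSX held in memory; return {part name: raw bytes}."""
    parts = {}
    # BytesIO makes the byte buffer look like a file, so nothing touches disk.
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for name in zf.namelist():
            if name.endswith((".xml", ".rels")):
                parts[name] = zf.read(name)
    return parts
```

The trade-off is memory: every decompressed part is held at once, which is convenient for small workbooks but may be a problem for very large ones.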

I just sent you a Pull Request with the changes.

Merged your request and added a mix of it to the main code. The user can now choose between in-memory unzip and the temp folder.
