Optimising code with GPT

For the last two days I have been asking GPT to optimise code for speed.
It does a good job in general, but one thing really blew my mind.

I had a method that converts byte size of a file into a readable format like 15.9MB, 1.53GB, …

That method looked like this:

if bytes = 0 then return "0 Bytes"
var k as integer = 1000
var sizes() as string
sizes =  array("Bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")
var i as double = floor( log(bytes) / log(k) )
return format(bytes / pow(k , i),"#.00") + " " + sizes(i)
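
For readers who want to check the numbers in this thread without Xojo at hand, here is a rough Python transliteration of the method above. It is only a sketch: the name `readable_size` is made up for illustration, and Python's `:.2f` formatting is assumed to match `format(..., "#.00")` for values of 1 or more.

```python
import math

# Same unit names as the sizes() array in the Xojo method.
SIZES = ["Bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def readable_size(num_bytes: int, k: int = 1000) -> str:
    """Log-based conversion, mirroring the Xojo method line by line."""
    if num_bytes == 0:
        return "0 Bytes"
    # Index of the largest power of k that does not exceed num_bytes.
    i = math.floor(math.log(num_bytes) / math.log(k))
    # Scale into that unit and keep two decimals.
    return f"{num_bytes / k ** i:.2f} {SIZES[i]}"


print(readable_size(12345678))  # 12.35 MB
print(readable_size(5))         # 5.00 Bytes
```

Note that the unit index comes from a floating-point logarithm, so this sketch makes no promise about inputs that sit exactly on a power of 1000.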

So I asked GPT:
“Optimise for speed the following xojo code: if bytes = 0 then return "0 Bytes"
var k as integer = 1000
var sizes() as string
sizes = array("Bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")
var i as double = floor( log(bytes) / log(k) )
return format(bytes / pow(k , i),"#.00") + " " + sizes(i)”

And the answer was this:

The code you provided is used to format a given number of bytes into a human-readable string that includes units (e.g. “KB”, “MB”, etc.). Here are a few potential optimizations you can make to improve its speed:

  1. Use integer math instead of floating-point math: The log and pow functions in your code use floating-point math, which can be slower than integer math. Since the sizes array is only 9 elements long, you can precompute the powers of 1000 and store them in a separate array. Then you can use integer division and modulus to compute the appropriate index into the sizes array:
if bytes = 0 then return "0 Bytes"

var k as integer = 1000
var sizes() as string = array("Bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")
var powers() as integer = array(1, k, k*k, k*k*k, k*k*k*k, k*k*k*k*k, k*k*k*k*k*k, k*k*k*k*k*k*k, k*k*k*k*k*k*k*k)

dim i as integer = 0
while bytes >= powers(i+1) and i < 8
  i = i + 1
wend

return str(bytes \ powers(i)) + "." + right("0" + str(bytes mod powers(i) \ powers(i-1)), 2) + " " + sizes(i)
  2. Avoid unnecessary function calls: The format function in your code is used to format the number of bytes with two decimal places. However, you can achieve the same result using integer division and modulus. This avoids the overhead of a function call:
if bytes = 0 then return "0 Bytes"

var k as integer = 1000
var sizes() as string = array("Bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB")
var powers() as integer = array(1, k, k*k, k*k*k, k*k*k*k, k*k*k*k*k, k*k*k*k*k*k, k*k*k*k*k*k*k, k*k*k*k*k*k*k*k)

dim i as integer = 0
while bytes >= powers(i+1) and i < 8
  i = i + 1
wend

return str(bytes \ powers(i)) + "." + right("0" + str(bytes mod powers(i) \ powers(i-1)), 2) + " " + sizes(i)

By implementing these optimizations, you can significantly improve the performance of your code while still producing the same output.

So I tried this with 1000 files.
The solution provided by GPT was about 280% faster! Incredible. To be honest, I would never have come up with that code myself. Pretty nifty.

Was it with v4 or v3.5?

I tested your code and GPT code for bytes = 12345678

  • your code 12.35 MB
  • GPT code 12.45 MB

Maybe GPT fixed a bug too? Or does GPT have a bug?

Shouldn’t the base be 1024 rather than 1000?

I guess GPT isn’t sentient yet. :laughing:

Try passing 5. Five bytes.

I want to see what ChatGPT does with that powers(i-1)

In the past, yes. But in 2008 it changed to 1000 with ISO/IEC 80000, part 13 (the computing part). Base 2 still exists but has new names and symbols: Ki for the old K, Mi for the old M, and so on. Both nomenclatures coexist now.
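
To make the two conventions concrete, here is a small Python sketch (not Xojo, just for quick checking) that formats the same byte count with decimal and with binary prefixes. The function name `format_size` and the shortened unit lists are assumptions made for this illustration only.

```python
# Decimal (SI) prefixes use base 1000; binary (IEC) prefixes use base 1024.
SI_UNITS = ["Bytes", "KB", "MB", "GB", "TB"]
IEC_UNITS = ["Bytes", "KiB", "MiB", "GiB", "TiB"]


def format_size(num_bytes: int, base: int, units: list) -> str:
    """Divide by the base until the value fits the current unit."""
    value = float(num_bytes)
    index = 0
    while value >= base and index < len(units) - 1:
        value /= base
        index += 1
    return f"{value:.2f} {units[index]}"


print(format_size(1_000_000_000, 1000, SI_UNITS))   # 1.00 GB
print(format_size(1_000_000_000, 1024, IEC_UNITS))  # 953.67 MiB
```

The same billion bytes reads as 1.00 GB or 953.67 MiB, and that gap is what the disagreement over base 1000 versus base 1024 comes down to.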

The dumbing-down never ends…

It started with greedy people disputing the advertised capacities of storage devices (like HDDs) and suing companies. The rationale was that K is 1000 everywhere, so it couldn’t be 1024 in computing.

or two?

Edit: I guess with so much knowledge it is hard to believe that anything is only “bytes” anymore :slight_smile:

I remember when a can of tuna fish was 12 ounces, now it’s like 9.5 :sweat_smile:

I’m seeing lots of downsized products due to inflation too.

A 1,000,000-byte MB makes your internet speed numbers look better, too :slight_smile:

Here is one approach from ChatRPT, Rick’s Pre-trained Transformation:

If bytes = 1 Then Return "1 Byte"

const k As Integer = 1000

Static sizes() As String = Array("Bytes", "KB", "MB", "GB", "TB", "PB", "EB")
Static powers() As Int64

Var i As Integer

If powers.LastIndex < 6 Then // init once
  For i = 0 to 6             // Int64 can't handle 1000^7
    powers.Add k^i
  Next
  powers.Add 0              // Filler
End

For i = 0 to 6
  If bytes<powers(i+1) Then Exit
Next

Return str(bytes \ powers(i)) + If(i>0, "." + right("0" + str(bytes mod powers(i) \ powers(i-1)), 2),"") + " " + sizes(i)

If we use bytes = 12345678

This code:

"0" + str(bytes mod powers(i) \ powers(i-1))

gives us “0345” and Right selects the last 2 (45) instead of 34 or rounded 35 as it should, no?

For example, 12901 bytes will return 12.01 KB and 12090 will return 12.90 KB
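
The effect is easy to reproduce outside Xojo. The following Python sketch mirrors only the fraction expression; `gpt_fraction` and `truncated_fraction` are hypothetical names, and the second function is just one possible truncating fix, not something proposed in the thread.

```python
# Powers of 1000, mirroring the powers() array in the Xojo code.
POWERS = [1000 ** n for n in range(9)]


def remainder_in_lower_unit(num_bytes: int, i: int) -> int:
    """Three-digit remainder, e.g. 345 for 12,345,678 bytes at i = 2 (MB)."""
    if i < 1:
        # Python would silently wrap POWERS[-1], so reject i = 0 explicitly.
        raise ValueError("i must be at least 1")
    return (num_bytes % POWERS[i]) // POWERS[i - 1]


def gpt_fraction(num_bytes: int, i: int) -> str:
    """Same as Right("0" + Str(...), 2): keeps the LAST two digits."""
    return ("0" + str(remainder_in_lower_unit(num_bytes, i)))[-2:]


def truncated_fraction(num_bytes: int, i: int) -> str:
    """Pad to three digits and keep the FIRST two (truncates, no rounding)."""
    return f"{remainder_in_lower_unit(num_bytes, i):03d}"[:2]


print(gpt_fraction(12345678, 2), truncated_fraction(12345678, 2))  # 45 34
print(gpt_fraction(12901, 1), truncated_fraction(12901, 1))        # 01 90
print(gpt_fraction(12090, 1), truncated_fraction(12090, 1))        # 90 09
```

Keeping the first two digits fixes the digit mix-up, but it still truncates rather than rounds.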

Yep, that residual is wrong too. Rounding it with integer math gets so complicated that it would be much easier to use a double to compute the final answer.

I don’t know how big the speed hit is from using:

"0" + str(Round(bytes Mod powers(i) \ powers(i-1)/10))

but at least 12901 will be 12.90 KB (instead of 12.01 KB) and 12090 will return 12.09 KB (instead of 12.90 KB)

Ok, RickGPT did the int math including rounding.

If bytes = 1 Then Return "1 Byte"

Const k As Integer = 1000

Static sizes() As String = Array("Bytes", "KB", "MB", "GB", "TB", "PB", "EB")
Static powers() As Int64

Var i As Integer

If powers.LastIndex < 6 Then  // init once
  For i = 0 to 6              // Int64 can't handle 1000^7
    powers.Add k^i
  Next
End

Var powerCut As Int64
var powerUnit As String

For i = 0 to 6
  If i > 5 Or bytes < powers(i+1) Then 
    powerCut = powers(i)
    powerUnit = sizes(i)
    Exit
  End
Next

Var intPart, temp As Int64
Var fracPart As Integer

If bytes >= k Then
  intPart = bytes \ powerCut
  temp = bytes - (intPart * powerCut)
  fracPart = temp \ (powerCut \ 100)
  temp = (temp - (fracpart *  (powerCut \ 100))) \ (powerCut \ 1000)
  If temp >= 5 then fracPart = fracPart + 1
Else
  intPart = bytes
  fracPart = 0
End

Return intPart.ToString + If(i>0 And fracPart>0, "." + Right("0" + fracPart.ToString, 2), "") + " " + powerUnit

My approach:

Public Function PrettySize(aSize as Integer) As String
  If aSize<10000 Then
    Return Str( aSize)
  Elseif aSize<1000000 Then
    Return Str( aSize\1000)+kKb
  Elseif aSize<1000000000 Then
    Return Str( aSize\1000000)+kMb
  Elseif aSize<1000000000000 Then
    Return Str( aSize\1000000000)+kGb
  Else
    Return Str( aSize\1000000000000)+kTb
  End If
  
End Function

Good, but too inaccurate. The good part is that it opened my eyes to a bug in my int rounding routine. PrettySize(1999999) returns 1MB, but it should return 1.99MB or, better, 2MB.

The ChatGPT approach for fast int math, with many human corrections + rounding:

If bytes = 1 Then Return "1 Byte"

Const k As Integer = 1000

Static sizes() As String = Array("Bytes", "KB", "MB", "GB", "TB", "PB", "EB")
Static powers() As Int64

Var i As Integer

If powers.LastIndex < 6 Then  // init once
  For i = 0 to 6              // Int64 can't handle 1000^7
    powers.Add k^i
  Next
End

Var powerCut As Int64
var powerUnit As String

For i = 0 to 6
  If i > 5 Or bytes < powers(i+1) Then 
    powerCut = powers(i)
    powerUnit = sizes(i)
    Exit
  End
Next

Var intPart, fracPart, temp As Int64

If bytes >= k Then
  intPart = bytes \ powerCut
  temp = bytes - (intPart * powerCut)
  fracPart = temp \ (powerCut \ 100)
  temp = (temp - (fracpart *  (powerCut \ 100))) \ (powerCut \ 1000)
  If temp >= 5 then   // int rounding
    fracPart = fracPart + 1
    If fracPart>99 Then
      fracPart = 0
      intPart = intPart + 1
    End
  End
Else
  intPart = bytes
  fracPart = 0
End

Return intPart.ToString + If(i>0 And fracPart>0, "." + Right("0" + fracPart.ToString, 2), "") + " " + powerUnit
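
As a sanity check of the rounding and carry logic, here is a rough Python transliteration of the routine above. It is a sketch for experimenting, not a replacement for the Xojo code: the name `pretty_size` is made up, and Python's `//` is assumed to stand in for Xojo's `\` (the two agree for the non-negative values used here).

```python
K = 1000
SIZES = ["Bytes", "KB", "MB", "GB", "TB", "PB", "EB"]
POWERS = [K ** n for n in range(7)]  # 1000^0 .. 1000^6, as in the Xojo code


def pretty_size(num_bytes: int) -> str:
    if num_bytes == 1:
        return "1 Byte"

    # Pick the largest unit that fits; index 6 (EB) is the cap.
    for i in range(7):
        if i > 5 or num_bytes < POWERS[i + 1]:
            power_cut = POWERS[i]
            power_unit = SIZES[i]
            break

    if num_bytes >= K:
        int_part = num_bytes // power_cut
        temp = num_bytes - int_part * power_cut
        frac_part = temp // (power_cut // 100)  # first two decimals
        # Third decimal digit decides whether to round up.
        temp = (temp - frac_part * (power_cut // 100)) // (power_cut // 1000)
        if temp >= 5:
            frac_part += 1
            if frac_part > 99:  # carry, e.g. 1.999 -> 2
                frac_part = 0
                int_part += 1
    else:
        int_part = num_bytes
        frac_part = 0

    # The fraction is only shown when it is non-zero, as in the original.
    show_frac = i > 0 and frac_part > 0
    frac_text = "." + ("0" + str(frac_part))[-2:] if show_frac else ""
    return f"{int_part}{frac_text} {power_unit}"


print(pretty_size(12345678))  # 12.35 MB
print(pretty_size(1999999))   # 2 MB
print(pretty_size(12901))     # 12.90 KB
print(pretty_size(12090))     # 12.09 KB
```

One thing the sketch makes visible is that a carry at the top of a unit is not promoted to the next unit: 999,999 bytes comes out as "1000 KB" rather than "1 MB".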