Optimising code with GPT

Ignoring that we are using floating-point math here, I would use “your” approach; for me it is just enough:

Public Function PrettySize(aSize as Integer) As String

  Select Case aSize
  Case Is < 1000
    Return aSize.ToString + " Bytes"
  Case Is < 1000000
    Return Str( aSize/1000, "#0.00")+" KB"
  Case Is < 1000000000
    Return Str( aSize/1000000, "#0.00")+" MB"
  Case Is < 1000000000000
    Return Str( aSize/1000000000, "#0.00")+" GB"
  End
  
  Return Str( aSize/1000000000000, "#0.00")+" TB"

End Function
What I learned is that the ChatGPT code:

  • is faster
  • introduced a rounding error (“Right 2”) that makes 12090 bytes show as 12.90 KB
  • introduced a crash with powers(i-1) when bytes < 1000
  • does not warn or advise that ZB and YB can’t work, that EB only goes up to 9.22 or so on 64-bit systems, or that results differ if we compile for 32-bit systems
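
To put numbers on that last point: a signed 64-bit integer tops out at 2^63 - 1, which is roughly 9.22 EB in powers of 1000, so ZB and YB values can't even be passed in. A quick check, sketched in Python rather than Xojo only because its integers never overflow (the constant names are mine):

```python
INT64_MAX = 2**63 - 1          # largest value a signed 64-bit integer can hold
INT32_MAX = 2**31 - 1          # same limit when Integer is only 32 bits wide

EB = 1000**6                   # one exabyte, SI definition
ZB = 1000**7                   # one zettabyte

whole, rest = divmod(INT64_MAX, EB)
hundredths = rest * 100 // EB  # two decimals, truncated

print(f"Int64 max is {whole}.{hundredths:02d} EB")   # Int64 max is 9.22 EB
print(ZB > INT64_MAX)                                # True
print(INT32_MAX // 1000**3)                          # 2
```

So a 32-bit Integer can't even hold 3 GB, which is why the parameter type matters as much as the unit table.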

My take: “Currently it creates bugs that will blast the face of the copy/paste people off at some point.”

My takeaway is that the overconfidence of humans can be their undoing.

The AI got a better looking answer for:

System.DebugLog PrettySize(999999999)
System.DebugLog PrettySize(1000000000)

Which was?

I guess the AI gets 999.99 MB and the last code you posted 1000.00 MB, but I don’t know if the other option makes a difference.

So the AI gets a few results correct, but things like 999x99999 (change x to 0-9) will always report 999.99MB

mankind created AI

This is on purpose. I even wondered whether people would want an option to round up (current behavior) or to round down, which Julian thinks is better.

Rounding aside, it came out with 1000.00MB which kind of goes against it outputting 1.00 GB, but you know, test test test :slight_smile:
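
The 1000.00 MB result is what you get whenever the value is rounded after the unit has already been chosen. A small Python sketch of the effect (not the Xojo code from this thread; the variable names are made up):

```python
size = 999_999_999
MB = 1_000_000

# Unit picked first (size < 1 GB, so MB), then rounded to two decimals:
rounded = f"{size / MB:.2f} MB"

# Same unit, but truncated with integer math instead of rounded:
hundredths = size * 100 // MB
truncated = f"{hundredths // 100}.{hundredths % 100:02d} MB"

print(rounded)     # 1000.00 MB
print(truncated)   # 999.99 MB
```

Either you truncate, or you round first and only then decide which unit the result belongs to.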

Removed rounding on limit until crossing the border

Public Function PrettySize(sizeInBytes As Int64) As String

  If sizeInBytes = 1 Then Return "1 Byte"
  
  Const k As Integer = 1000
  
  Static sizes() As String = Array("Bytes", "KB", "MB", "GB", "TB", "PB", "EB")
  Static powers() As Int64
  
  Var i As Integer
  
  If powers.LastIndex < 6 Then  // init once
    For i = 0 to 6              // Int64 can't handle 1000^7
      powers.Add k^i
    Next
  End
  
  Var powerCut As Int64
  var powerUnit As String
  
  For i = 0 to 6
    If i > 5 Or sizeInBytes < powers(i+1) Then 
      powerCut = powers(i)
      powerUnit = sizes(i)
      Exit
    End
  Next
  
  Var intPart, fracPart, temp As Int64
  
  If sizeInBytes >= k Then
    intPart = sizeInBytes \ powerCut
    temp = sizeInBytes - (intPart * powerCut)
    fracPart = temp \ (powerCut \ 100)
    temp = (temp - (fracpart *  (powerCut \ 100))) \ (powerCut \ 1000)
    If temp >= 5 And fracPart<99 Then   // int rounding while not crossing limits
      fracPart = fracPart + 1
    End
  Else
    intPart = sizeInBytes
    fracPart = 0
  End
  
  Return intPart.ToString + If(fracPart>0, "." + fracPart.ToString("00"), "") + " " + powerUnit
  
End Function

Interestingly even StrFormatByteSizeW is wrong in Windows 10

In Windows 10, size is reported in base 10 rather than base 2. For example, 1 KB is 1000 bytes rather than 1024.

9.76 KB (10000)
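
That output only makes sense if the divisor is 1024. A quick check in Python (plain arithmetic, not the Windows API itself; the helper name is hypothetical), with the result truncated to two decimals:

```python
def truncated_hundredths(size_in_bytes: int, k: int) -> str:
    """Divide by k and keep two decimals, truncating instead of rounding."""
    hundredths = size_in_bytes * 100 // k
    return f"{hundredths // 100}.{hundredths % 100:02d}"

print(truncated_hundredths(10000, 1024))  # 9.76
print(truncated_hundredths(10000, 1000))  # 10.00
print(f"{10000 / 1024:.2f}")              # 9.77 (rounded instead of truncated)
```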

Is that the same as “mostly pregnant”? :slight_smile:

This version has 2 modes. SI Metric (k=1000) and Binary (k=1024)

It defaults to Metric. It also just truncates to 2 decimals; I removed the rounding. No one cares about it: if MS cared, 10000 would be 9.77 KiB and not 9.76.

System.DebugLog PrettyMemSize(10000)          // 10 KB
System.DebugLog PrettyMemSize(10000, true)    // 9.76 KiB

Public Function PrettyMemSize(sizeInBytes As Int64, binary As Boolean = False) As String
  
  If sizeInBytes = 1 Then Return "1 Byte"
  
  Static notInited As Boolean = True
  
  Static units(1, 6) As String
  
  Static powers(1, 6) As Int64
  
  Var k As Integer = If(binary, 1024, 1000)
  Var sel As Integer = If(binary, 1, 0)
  
  Var i As Integer
  
  If notInited Then  // init once
    
    notInited = False
    
    Var asm() As String = Array("Bytes", "KB", "MB", "GB", "TB", "PB", "EB") // Metric
    Var asb() As String = Array("Bytes", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB") // Binary
    
    For i = 0 to 6              // Int64 can't handle 1000^7
      powers(0,i) = 1000^i      // Metric
      powers(1,i) = 1024^i      // Binary
      units(0, i) = asm(i)      // Metric
      units(1, i) = asb(i)      // Binary
    Next
    
  End
  
  Var powerCut As Int64
  var powerUnit As String
  
  For i = 0 to 6
    If i > 5 Or sizeInBytes < powers(sel, i+1) Then 
      powerCut = powers(sel, i)
      powerUnit = units(sel, i)
      Exit
    End
  Next
  
  Var intPart, fracPart, temp As Int64
  
  If sizeInBytes >= k Then
    intPart = sizeInBytes \ powerCut
    temp = sizeInBytes - (intPart * powerCut) // remainder
    If binary Then
      fracPart = temp / powerCut * 100   // binary needed float point
    Else
      fracPart = temp \ (powerCut \ 100)
    End
  Else
    intPart = sizeInBytes
    fracPart = 0
  End
  
  Return intPart.ToString + If(fracPart>0, "." + fracPart.ToString("00"), "") + " " + powerUnit
  
End Function

Precomputed statics, removed unnecessary optimization, cleaned up and compressed:

I observed that Int64 can't handle petabyte math reliably, so I removed k^6 and beyond due to excessive aberrations. The Int128 era arrived a few years ago.
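
One plausible source of such aberrations (my reading, not something the post confirms) is the intermediate product: the function below computes the fraction as remainder * 100 \ powerCut, and that multiplication overflows Int64 once the unit is EB, while it still fits at PB. Checked in Python, whose integers don't overflow (the helper name is hypothetical):

```python
INT64_MAX = 2**63 - 1

def worst_case_product(power_cut: int) -> int:
    """Largest possible value of remainder * 100 for a given unit size."""
    return (power_cut - 1) * 100

print(worst_case_product(1000**5) <= INT64_MAX)  # True  (PB is safe)
print(worst_case_product(1024**5) <= INT64_MAX)  # True  (PiB is safe)
print(worst_case_product(1000**6) <= INT64_MAX)  # False (EB would overflow)
```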

Public Function PrettyMemSize(sizeInBytes As Int64, binary As Boolean = False) As String
  
  If sizeInBytes = 1 Then Return "1 Byte"
  
  Static unitsSI() As String = Array("Bytes", "KB", "MB", "GB", "TB", "PB")
  Static unitsBinary() As String = Array("Bytes", "KiB", "MiB", "GiB", "TiB", "PiB")
  
  Static powersSI() As Int64 = Array(1, 1000, 1000000, 1000000000, 1000000000000, 1000000000000000)
  Static powersBinary() As Int64 = Array(1, 1024, 1048576, 1073741824, 1099511627776, 1125899906842624)
  
  Var powerCut As Int64
  var powerUnit As String
  
  For i As Integer = 0 to 5
    If i > 4 Or sizeInBytes < If(binary, powersBinary(i+1), powersSI(i+1)) Then
      powerCut = If(binary, powersBinary(i), powersSI(i))
      powerUnit = If(binary, unitsBinary(i), unitsSI(i))
      Exit
    End
  Next
  
  Var intPart, fracPart As Int64
  
  intPart = sizeInBytes \ powerCut
  fracPart = sizeInBytes - (intPart * powerCut)
  fracPart = fracPart * 100 \ powerCut
  
  Return intPart.ToString + If(fracPart>0, "." + fracPart.ToString("00"), "") + " " + powerUnit
  
End Function

I’m not sure how you are measuring performance. This code takes a string as input…and string manipulation is usually very expensive in terms of processing. If I can take advantage of using powers of 1000 instead of 1024 I could use the length of the numeric string which keeps me processing using small integers. How awful is this?
It does support strings like “123456789101112131415161718”


Dim length as Integer
Dim shortbytes as String
Dim prettysize as string


length = len(bytestr)

Select Case length 
Case 1, 2, 3
  If bytestr = "1" then 
    prettysize = "1 byte"
  Else
    prettysize = bytestr + " Bytes"
  End 
Case 4 
  shortbytes = str((val(left(bytestr,4))+ 5))
  prettysize = left(shortbytes,1)+"."+mid(shortbytes,2,2) + " KB"
case 5
  shortbytes  = str((val(left(bytestr,5))+ 5))
  prettysize = left(shortbytes,2)+"."+mid(shortbytes,3,2) + " KB"
case 6
  shortbytes = str((val(left(bytestr,6))+ 5))
  prettysize = left(shortbytes,3)+"."+mid(shortbytes,4,2) + " KB"
Case 7 
  shortbytes = str((val(left(bytestr,4))+ 5))
  prettysize = left(shortbytes,1)+"."+mid(shortbytes,2,2) + " MB"
case 8
  shortbytes  = str((val(left(bytestr,5))+ 5))
  prettysize = left(shortbytes,2)+"."+mid(shortbytes,3,2) + " MB"
case 9
  shortbytes = str((val(left(bytestr,6))+ 5))
  prettysize = left(shortbytes,3)+"."+mid(shortbytes,4,2) + " MB"
Case 10 
  shortbytes = str((val(left(bytestr,4))+ 5))
  prettysize = left(shortbytes,1)+"."+mid(shortbytes,2,2) + " GB"
case 11
  shortbytes  = str((val(left(bytestr,5))+ 5))
  prettysize = left(shortbytes,2)+"."+mid(shortbytes,3,2) + " GB"
case 12
  shortbytes = str((val(left(bytestr,6))+ 5))
  prettysize = left(shortbytes,3)+"."+mid(shortbytes,4,2) + " GB"
Case 13 
  shortbytes = str((val(left(bytestr,4))+ 5))
  prettysize = left(shortbytes,1)+"."+mid(shortbytes,2,2) + " TB"
case 14
  shortbytes  = str((val(left(bytestr,5))+ 5))
  prettysize = left(shortbytes,2)+"."+mid(shortbytes,3,2) + " TB"
case 15
  shortbytes = str((val(left(bytestr,6))+ 5))
  prettysize = left(shortbytes,3)+"."+mid(shortbytes,4,2) + " TB"
Case 16
  shortbytes = str((val(left(bytestr,4))+ 5))
  prettysize = left(shortbytes,1)+"."+mid(shortbytes,2,2) + " PB"
case 17
  shortbytes  = str((val(left(bytestr,5))+ 5))
  prettysize = left(shortbytes,2)+"."+mid(shortbytes,3,2) + " PB"
case 18
  shortbytes = str((val(left(bytestr,6))+ 5))
  prettysize = left(shortbytes,3)+"."+mid(shortbytes,4,2) + " PB"
Case 19 
  shortbytes = str((val(left(bytestr,4))+ 5))
  prettysize = left(shortbytes,1)+"."+mid(shortbytes,2,2) + " EB"
case 20
  shortbytes  = str((val(left(bytestr,5))+ 5))
  prettysize = left(shortbytes,2)+"."+mid(shortbytes,3,2) + " EB"
case 21
  shortbytes = str((val(left(bytestr,6))+ 5))
  prettysize = left(shortbytes,3)+"."+mid(shortbytes,4,2) + " EB"
Case 22 
  shortbytes = str((val(left(bytestr,4))+ 5))
  prettysize = left(shortbytes,1)+"."+mid(shortbytes,2,2) + " ZB"
case 23
  shortbytes  = str((val(left(bytestr,5))+ 5))
  prettysize = left(shortbytes,2)+"."+mid(shortbytes,3,2) + " ZB"
case 24
  shortbytes = str((val(left(bytestr,6))+ 5))
  prettysize = left(shortbytes,3)+"."+mid(shortbytes,4,2) + " ZB"
Case 25 
  shortbytes = str((val(left(bytestr,4))+ 5))
  prettysize = left(shortbytes,1)+"."+mid(shortbytes,2,2) + " YB"
case 26
  shortbytes  = str((val(left(bytestr,5))+ 5))
  prettysize = left(shortbytes,2)+"."+mid(shortbytes,3,2) + " YB"
case 27
  shortbytes = str((val(left(bytestr,6))+ 5))
  prettysize = left(shortbytes,3)+"."+mid(shortbytes,4,2) + " YB"
Else
  prettysize = "Undefined"
End Select
return prettysize

I would say useless, because the objective is to “print” a string from a value obtained from a system call. On modern OSs such a value usually comes as a 64-bit integer, so it will never be bigger than 18446744073709551615.

I’ll refrain from analyzing such code deeply. :smiley:

Well, my point was that this could be modified to handle a string whose length could be represented by an Int64. In that case you would need an array of 1/3 that size to hold the friendly names after yottabytes. And you probably only need 3 case statements, to handle #.##, ##.##, and ###.##. I was just thinking it would be interesting to see how a string-based solution works. You could obviously add a str(x) in the routine so you could receive a double and turn it into a string. I don’t know how nasty those function calls are. At some point you are sending a string back anyway.
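
The three-case idea can be sketched roughly like this. It is Python rather than Xojo, the names are mine, and it truncates instead of adding 5, which sidesteps carries (as far as I can tell, the posted code turns "9999" into "1.00 KB"):

```python
UNITS = ["Bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

def pretty_size_from_digits(bytestr: str) -> str:
    """Format a string of decimal digits using only its length and leading digits."""
    if not (bytestr.isascii() and bytestr.isdigit()):
        return "Undefined"
    digits = bytestr.lstrip("0") or "0"      # drop leading zeros, keep a lone "0"
    if digits == "1":
        return "1 Byte"
    group = (len(digits) - 1) // 3           # 0 = Bytes, 1 = KB, 2 = MB, ...
    if group == 0:
        return digits + " Bytes"
    if group >= len(UNITS):
        return "Undefined"                   # longer than the unit table covers
    int_len = len(digits) - 3 * group        # 1, 2 or 3 digits before the point
    return f"{digits[:int_len]}.{digits[int_len:int_len + 2]} {UNITS[group]}"

print(pretty_size_from_digits("12090"))                        # 12.09 KB
print(pretty_size_from_digits("123456789101112131415161718"))  # 123.45 YB
```

The length picks the unit and the number of digits before the point, so no arithmetic on the value is needed at all. Supporting names past YB would only mean extending the UNITS list.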

I asked ChatGPT what the calculation basis is for memory size according to the ISO/IEC 80000 chapter 13 standard: should it be divided by 1000 or 1024?

Chapter 13 of ISO/IEC 80000 provides a definition for binary (bi-) multiples used in computing and telecommunications to express quantities of digital data.

According to this standard, binary multiples are defined as powers of 2, not as powers of 10. This means that the prefix “kilo” (symbol “k”) corresponds to 2^10, or 1024, and not to 10^3, or 1000.

Thus, to calculate the memory size according to the ISO/IEC 80000 chapter 13 standard, it is necessary to divide by 1024, and not by 1000. For example, 1 kilobyte (1 KB) corresponds to 1024 bytes, and 1 megabyte (1 MB) is 1024 kilobytes, or 1,048,576 bytes.

My question is: If I take the calculation base of 1000 bytes for 1KB, is it the same to calculate a file size as to calculate a memory size?

Thanks

Another proof that ChatGPT is not a reliable source of information.

I advise you to read the ISO/IEC 80000 chapter 13 standard instead of asking ChatGPT. Memory is usually reported in binary, meaning a short value in 2^n multiples (KiB, MiB, GiB, etc., not the old KB, MB, etc.).

Media capacity is reported using a short value in 10^n multiples (KB, MB, GB, TB…).

That said,

is simply false using CURRENT standards. (Yes, it was ok in the past, I used it for decades)
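
On the question of whether it matters which base you pick: the two conventions drift further apart with every prefix, so a file size computed with 1000 and a memory size computed with 1024 are not interchangeable. A quick Python check (the function name is just for illustration):

```python
def binary_vs_si_gap(step: int) -> float:
    """Percent by which 1024**step exceeds 1000**step (step 1 = KiB vs KB)."""
    return (1024**step / 1000**step - 1) * 100

pairs = ["KiB vs KB", "MiB vs MB", "GiB vs GB", "TiB vs TB"]
for step, name in enumerate(pairs, start=1):
    print(f"{name}: {binary_vs_si_gap(step):.2f}%")
```

The gap is about 2.4% at the kilo step and close to 10% at the tera step.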
