Handling bad JSON

Wondering the best way to handle JSON that has formatting problems.

This is the JSON I’m dealing with; it comes directly from YouTube: playlist.json (shared via Dropbox)

It has a formatting problem with the following portion:

{"accessibilityData":{"label":"Play More Than You Know - Axwell \/\\\ Ingrosso - 3 minutes, 24 seconds"}},"accessibilityPauseData":{"accessibilityData":{"label":"Pause More Than You Know - Axwell \/\\\ Ingrosso - 3 minutes, 24 seconds"}}

That’s because the artist in the playlist has a forward slash and backslash in their name “Axwell /\ Ingrosso” and YouTube is escaping it with an extra backslash, which can be appropriate by some escaping standards, but apparently is not what Xojo is expecting. An online validator also flags this as problematic, but several of my JSON viewers handle it just fine.

Apropos, when I try to load it into a JSONItem, I get: lexical error: inside a string, '\' occurs before a character which it may not.

But the rest of the data in the JSON is fine except for these keys. Unfortunately there doesn’t seem to be a way to parse the rest since I can’t load it at all into a JSONItem.

So how do you manage this? Is there a way to load the JSONItem anyway and ignore just the problematic key/value pairs?

To be clear, I know I can just do a ReplaceAll("/\\\\ ", "/\\ ") to fix this instance of the issue. But this is just one of many different escaping issues caused by how YouTube does things in their JSON.

And boy, even the forum software is changing how these escapes are typed.

You can try to use a different JSON library. However, it’s your job to clean up the bad data.

You can pipe the JSON to python -m json.tool (at least on macOS). If there is an error, it reports the position:

Invalid \escape: line 1 column 69 (char 68)
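
For anyone who does have Python available, the same position information can be read programmatically. This is a minimal sketch, not something from the posts above: `find_json_error` is a hypothetical helper name, and it relies only on the standard library's `json.JSONDecodeError`, which exposes the message, line, column and character offset of the first syntax error.

```python
import json


def find_json_error(text):
    """Return (message, line, column, char_offset) for the first JSON
    syntax error in text, or None if the text parses cleanly."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return (err.msg, err.lineno, err.colno, err.pos)
    return None


# A shortened stand-in for the playlist data: backslash + space is not a
# valid JSON escape, so the parser rejects the whole document.
sample = '{"label": "Axwell /\\ Ingrosso"}'
print(find_json_error(sample))
```

Printing a few characters on either side of the reported offset is usually enough to see which escape the parser tripped over.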

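Public Function FixEscapeSpaceJSON(jsonString As String) As String
  // Protect valid "\\" pairs with a placeholder (Chr(1)), replace the invalid
  // escape "\ " (backslash + space) with "\u0020", then restore the "\\" pairs.
  Return jsonString.ReplaceAll("\\",Chr(1))_
  .ReplaceAll("\ ","\u0020")_
  .ReplaceAll(Chr(1),"\\")
End Function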


@Beatrix_Willius Will give it a try with MBS later but I was hoping to avoid refactoring all my code.

@Jean-Yves_Pochez Sadly, it seems that Python is no longer preinstalled on newer versions of macOS, and it doesn’t ship with Windows either. I have installed it manually on my own machine, but I can’t count on my users having it.

@Rick_Araujo Interesting, so we’re replacing each double backslash with a placeholder, then each single backslash with a Unicode-escaped space, then changing the placeholder back to a double backslash? But won’t this break other escaped characters like \" ?

Incorrect.

We protect the double backslashes first to avoid misinterpreting edge cases, then search the result for the target case "\ " (backslash + space), replace it with a proper escape, and finally put the double backslashes back.
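
The three steps can be checked outside Xojo with a small Python port. This is only an illustrative sketch: `fix_escape_space` and `PLACEHOLDER` are names made up here, and like the Xojo snippet it assumes the placeholder character (code point 1) never occurs in the input.

```python
import json

# Assumption carried over from the Xojo snippet: code point 1 does not
# appear anywhere in the incoming JSON text.
PLACEHOLDER = "\x01"


def fix_escape_space(json_string):
    """Replace the invalid escape backslash+space with \\u0020 while
    leaving valid escapes such as \\\\ and \\" untouched."""
    return (
        json_string
        .replace("\\\\", PLACEHOLDER)   # 1. hide valid double backslashes
        .replace("\\ ", "\\u0020")      # 2. repair backslash + space
        .replace(PLACEHOLDER, "\\\\")   # 3. restore the double backslashes
    )


# The problematic label from the playlist, as a raw string.
raw = r'{"label":"Play More Than You Know - Axwell \/\\\ Ingrosso - 3 minutes, 24 seconds"}'
print(json.loads(fix_escape_space(raw))["label"])

# An escaped quote contains no backslash + space, so it passes through as-is.
quoted = r'{"q": "say \"hi\""}'
print(fix_escape_space(quoted) == quoted)
```

Only the exact two-character sequence backslash + space is rewritten, which is why other escapes such as \" are not affected.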


Ahhhh I see it, I didn’t notice the whitespace after the \ on my phone but on desktop it’s more obvious. Thanks for that snippet, will give it a try!


Theoretically, this version could be a few clock cycles faster:

Public Function FixEscapeSpaceJSON(jsonString As String) As String
  // Same three steps as before; &u1 is a literal for code point 1, so no Chr(1) call is needed.
  Return jsonString.ReplaceAll("\\",&u1)_
  .ReplaceAll("\ ","\u0020")_
  .ReplaceAll(&u1,"\\")
End Function
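
Since backslash + space was described as only one of several escaping problems in this feed, the same idea can be generalized. The sketch below is an assumption about how one might do that, not something proposed in the thread: it walks the text one backslash pair at a time, keeps every escape that JSON allows, and drops the backslash from any other pair. `drop_stray_backslashes` is a hypothetical name, and it does not verify that `\u` is followed by four hex digits.

```python
import json
import re

# A backslash followed by any single character. Scanning left to right
# consumes "\\" as one valid pair, so no placeholder is needed.
_ESCAPE_PAIR = re.compile(r"\\(.)", re.DOTALL)

# Characters that may legally follow a backslash in a JSON string.
_VALID_AFTER_BACKSLASH = set('"\\/bfnrtu')


def drop_stray_backslashes(json_string):
    """Remove the backslash from escapes JSON does not define
    (e.g. backslash+space becomes a plain space); keep valid escapes."""
    def repair(match):
        following = match.group(1)
        if following in _VALID_AFTER_BACKSLASH:
            return match.group(0)   # valid escape: leave untouched
        return following            # invalid escape: keep only the character
    return _ESCAPE_PAIR.sub(repair, json_string)


raw = r'{"label":"Axwell \/\\\ Ingrosso"}'
print(json.loads(drop_stray_backslashes(raw))["label"])
```

Whether dropping the stray backslash is the right repair depends on what the producer meant. For the artist name in this thread it gives the same result as the `\u0020` replacement.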