| Ticket Hash: | b3d38f063f899f931799dc6181c230331bbf4e9c | ||
| Title: | dom jsonEscape \x7f should be escaped | ||
| Status: | Closed | Type: | Code_Defect |
| Severity: | Important | Priority: | Immediate |
| Subsystem: | | Resolution: | Not_A_Bug |
| Last Modified: | 2026-02-16 20:39:20 | Created: | 2026-02-02 17:47:23 |
| Version Found In: | 0.9.6 | ||
| User Comments: | ||||
|
anonymous added on 2026-02-02 17:47:23:
Thanks for the great tdom. I tried to use tdom's JSON string formatting in the tcllib json module. When running the test suite, there is one major difference: the DEL control character "\x7F" is not escaped.

    dom jsonEscape \x7f

IMHO, this is a non-printable character and should be escaped. I tried the 0.9.6 download with Tcl 9.1a1rc0 on 64-bit Windows. The tcllib test is json-write-4.0.1. It is currently in branch "e15c2a7c-json-write-hao".

Thanks again, Harald

rolf added on 2026-02-04 00:24:43:
Sorry, missed this again. I think I disagree. The character \x7f is allowed literally in a JSON string. The jsonEscape method escapes all code points that JSON serialization requires to be escaped and passes everything else through. Maybe json-write tries to do something more. Basically, every code point in JSON data could be \u escaped. If a writer chooses to escape more than the necessary code points, it is of course free to do so. A string comparison of JSON serialized by different writers may very well differ, although both writers work according to the spec. (I'm not aware of an established "canonical JSON serialization", but I would not be too surprised if someone has tried. For XML, tdom has asCanonicalXML.)

anonymous added on 2026-02-09 17:53:38:
Thanks, great. At least, it would be very practical if \x7f were represented by \u007f. I searched for a long time to find the reason why the test fails. As "\x7f" is the "Delete" control code, it is not shown on the console, and a character may disappear too. Anyway, you may close the bug, no problem.

Thanks for all, Harald

rolf added on 2026-02-09 23:05:56:
Well, as I wrote, a JSON writer may choose to \u escape every character (which would perhaps be a bit unwieldy).
The syntax rules are clear (see [https://www.rfc-editor.org/rfc/rfc8259#section-7]):

    string = quotation-mark *char quotation-mark

    char = unescaped /
        escape (
            %x22 /          ; "    quotation mark  U+0022
            %x5C /          ; \    reverse solidus U+005C
            %x2F /          ; /    solidus         U+002F
            %x62 /          ; b    backspace       U+0008
            %x66 /          ; f    form feed       U+000C
            %x6E /          ; n    line feed       U+000A
            %x72 /          ; r    carriage return U+000D
            %x74 /          ; t    tab             U+0009
            %x75 4HEXDIG )  ; uXXXX                U+XXXX

    escape = %x5C              ; \
    quotation-mark = %x22      ; "
    unescaped = %x20-21 / %x23-5B / %x5D-10FFFF

And the sentence in the same section: "Any character may be escaped."

JSON requires escaping every control character (more precisely, every C0 control character, U+0000 through U+001F), _with the exception of \x7f_. At the moment the rule used by jsonEscape is clear: escape everything that must be escaped according to the spec and write everything else literally. Other JSON writers work the same way. See this sqlite3 example:

    package require sqlite3
    sqlite3 db :memory:
    # The JSON text is the string "a<DEL>b", including the double quotes
    set json \"a\u007fb\"
    puts [db eval "SELECT json('$json')"]

That said, what the tcllib json writer does is not wrong. The output is correct. It may even be seen as more user-friendly, because the character stands out more when you look at the output, say for debugging reasons. But then: where to draw the line? \x7f is hard to see, but so is \u200B.

rolf added on 2026-02-16 20:39:20:
Conversation seems to have come to an end. Closing as "Not a Bug". | ||||
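
The rule discussed in this ticket (escape only what RFC 8259 requires, write everything else literally) can be sketched in plain Tcl. This is only an illustration and not tdom's actual jsonEscape implementation; the proc name `minimalJsonEscape` and its optional `escapeDel` flag are hypothetical names introduced here. The flag shows the other valid choice from the thread, a writer that additionally escapes DEL.

```tcl
# Hypothetical sketch, not tdom's code: escape a string for use inside a
# JSON string literal. By default only the code points that RFC 8259
# requires are escaped ('"', '\', and U+0000 through U+001F).
# With escapeDel set to 1, U+007F is also written as \u007f.
proc minimalJsonEscape {str {escapeDel 0}} {
    set out ""
    foreach ch [split $str ""] {
        switch -exact -- $ch {
            "\"" { append out "\\\"" }
            "\\" { append out "\\\\" }
            "\b" { append out "\\b" }
            "\f" { append out "\\f" }
            "\n" { append out "\\n" }
            "\r" { append out "\\r" }
            "\t" { append out "\\t" }
            default {
                scan $ch %c code
                if {$code < 0x20 || ($escapeDel && $code == 0x7f)} {
                    # Remaining control characters get the generic \uXXXX form
                    append out [format "\\u%04x" $code]
                } else {
                    # Everything else, including DEL by default, passes through
                    append out $ch
                }
            }
        }
    }
    return $out
}

# DEL is passed through literally by default (invisible on most consoles)
puts [minimalJsonEscape "a\u007fb"]
# With the optional flag, DEL is made visible as an escape
puts [minimalJsonEscape "a\u007fb" 1]
```

Both outputs are valid JSON string content according to the grammar above; they differ only in how visible the DEL character is to a human reader.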