tDOM

View Ticket

2026-02-16
20:39 Closed ticket [b3d38f063f]: dom jsonEscape \x7f should be escaped plus 4 other changes artifact: 86412a2202 user: rolf
2026-02-09
23:05 Ticket [b3d38f063f]: 3 changes artifact: 94a557e0d5 user: rolf
17:53 Ticket [b3d38f063f]: 3 changes artifact: 5152afe7d4 user: anonymous
2026-02-04
00:24 Ticket [b3d38f063f]: 5 changes artifact: 3ea08bf272 user: rolf
2026-02-02
17:47 New ticket [b3d38f063f]. artifact: 0d1df0d96a user: anonymous

Ticket Hash: b3d38f063f899f931799dc6181c230331bbf4e9c
Title: dom jsonEscape \x7f should be escaped
Status: Closed
Type: Code_Defect
Severity: Important
Priority: Immediate
Subsystem:
Resolution: Not_A_Bug
Last Modified: 2026-02-16 20:39:20
Created: 2026-02-02 17:47:23
Version Found In: 0.9.6
User Comments:
anonymous added on 2026-02-02 17:47:23:

Thanks for the great tDOM.

I tried to use the JSON string formatting from tDOM in the tcllib json module.

When running the test suite, there is one major difference.

The DEL control character "\x7f" is not escaped.

dom jsonEscape \x7f

IMHO, this is a non-printable character and should be escaped.

I tried the 0.9.6 download with Tcl 9.1a1rc0 on 64-bit Windows.

The tcllib test is json-write-4.0.1. It is currently in branch "e15c2a7c-json-write-hao"

Thanks again, Harald


rolf added on 2026-02-04 00:24:43:

Sorry, missed this again.

I think I disagree. The character \x7f is allowed literally in a JSON string. The jsonEscape method escapes all code points that JSON serialization requires to be escaped and passes everything else through.

Maybe json-write tries to do something more. Basically, every code point in a JSON payload could be \u escaped. If a writer chooses to escape more than the necessary code points, it is of course free to do so. A string comparison of JSON serialized by different writers may very well differ, although both writers work according to the spec.

(I'm not aware of an established "canonical JSON serialization" but would not be too surprised if someone tried. For XML tdom has asCanonicalXML.)


anonymous added on 2026-02-09 17:53:38:

Thanks, great.

At least, it would be very practical if \x7f were represented as \u007f. I searched for a long time to find the reason why the test fails. As "\x7f" is the "Delete" control code, it is not shown on the console, and a character may disappear too.

Anyway, you may close the bug, no problem.

Thanks for all, Harald


rolf added on 2026-02-09 23:05:56:
Well, as I wrote, a JSON writer may choose to \u escape every character (which
would perhaps be a bit unwieldy). The syntax rules are clear (see
[https://www.rfc-editor.org/rfc/rfc8259#section-7]):

      string = quotation-mark *char quotation-mark

      char = unescaped /
          escape (
              %x22 /          ; "    quotation mark  U+0022
              %x5C /          ; \    reverse solidus U+005C
              %x2F /          ; /    solidus         U+002F
              %x62 /          ; b    backspace       U+0008
              %x66 /          ; f    form feed       U+000C
              %x6E /          ; n    line feed       U+000A
              %x72 /          ; r    carriage return U+000D
              %x74 /          ; t    tab             U+0009
              %x75 4HEXDIG )  ; uXXXX                U+XXXX

      escape = %x5C              ; \

      quotation-mark = %x22      ; "

      unescaped = %x20-21 / %x23-5B / %x5D-10FFFF

And the sentence in the same section:

"Any character may be escaped."

JSON requires escaping every control character (more precisely, every C0 control character, U+0000 through U+001F), _with the exception of \x7f_.

At the moment, the rule used by jsonEscape is clear: escape everything that must
be escaped according to the spec and write everything else literally.
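The rule can be sketched in pure Tcl. This is only an illustration under stated assumptions, not tDOM's actual implementation: the proc name minimalJsonEscape is hypothetical, and it uses \uXXXX for all C0 controls instead of the short forms such as \n.

```tcl
# Hypothetical helper (not tDOM code): escape only what RFC 8259
# section 7 requires and pass every other code point through.
proc minimalJsonEscape {str} {
    set out ""
    foreach ch [split $str ""] {
        scan $ch %c code
        if {$ch eq "\"" || $ch eq "\\"} {
            # Quotation mark and reverse solidus must be escaped.
            append out "\\" $ch
        } elseif {$code < 0x20} {
            # C0 control characters U+0000..U+001F must be escaped.
            append out [format "\\u%04x" $code]
        } else {
            # Everything else, including DEL (U+007F), is legal as-is.
            append out $ch
        }
    }
    return $out
}

puts [minimalJsonEscape "a\nb"]                     ;# prints a\u000ab
puts [string length [minimalJsonEscape "a\x7fb"]]   ;# prints 3, DEL is kept literally
```

Checking the length rather than printing the result avoids the very problem discussed in this ticket: a literal DEL is easy to miss on a console.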

Other JSON writers work the same way. See this sqlite3 example:

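package require sqlite3
sqlite3 db :memory:
# The Tcl word \"a\u007fb\" is the JSON text "a<DEL>b" with a literal DEL character.
set json \"a\u007fb\"
# json() validates and minifies its argument; the DEL comes back unescaped.
puts [db eval "SELECT json('$json')"]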

That said, what the tcllib json-writer does is not wrong. The output is correct.
It may even be seen as more user-friendly, because the character stands out
more while you are looking at the output, say for debugging reasons.
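A stricter writer of this kind might look like the following sketch. It is not tcllib's code: the proc name strictJsonEscape is hypothetical, and the only assumption taken from this ticket is that DEL is escaped in addition to the mandatory code points.

```tcl
# Hypothetical helper (not tcllib code): like a minimal escaper,
# but additionally \u escapes DEL (U+007F) so it shows up in output.
proc strictJsonEscape {str} {
    set out ""
    foreach ch [split $str ""] {
        scan $ch %c code
        if {$ch eq "\"" || $ch eq "\\"} {
            append out "\\" $ch
        } elseif {$code < 0x20 || $code == 0x7f} {
            # Mandatory C0 escapes plus the optional escape of DEL.
            append out [format "\\u%04x" $code]
        } else {
            append out $ch
        }
    }
    return $out
}

puts [strictJsonEscape "a\x7fb"]   ;# prints a\u007fb
```

Both forms are valid JSON and decode to the same three-character string; only a byte-wise comparison of the serialized text tells them apart.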

But then: where to draw the line? \x7f is hard to see, but so is \u200B.

rolf added on 2026-02-16 20:39:20:

The conversation seems to have come to an end. Closing as "Not a Bug".