Many hyperlinks are disabled.
Use anonymous login
to enable hyperlinks.
Overview
Comment: | Merged from trunk. |
---|---|
Downloads: | Tarball | ZIP archive | SQL archive |
Timelines: | family | ancestors | descendants | both | schema |
Files: | files | file ages | folders |
SHA3-256: |
1fc373ed259f37a7804bb01c4718ade2 |
User & Date: | rolf 2020-05-13 23:50:18 |
Context
2020-05-14
| ||
23:12 | There is still a bit work left to do in checkElementEnd. check-in: 9f3926e748 user: rolf tags: schema | |
2020-05-13
| ||
23:52 | Merged from schema. check-in: f63309e59e user: rolf tags: wip | |
23:50 | Merged from trunk. check-in: 1fc373ed25 user: rolf tags: schema | |
23:49 | Added method clearString to the dom command. check-in: 5300f428a6 user: rolf tags: trunk | |
2020-05-02
| ||
00:41 | Merge the blunder in without documentation. check-in: a88689ecce user: rolf tags: schema | |
Changes
Changes to CHANGES.
1 2 3 4 5 6 7 |
2019-12-31 Rolf Ade <rolf@pointsman.de> Updated to expat 2.2.9. 2018-10-12 Rolf Ade <rolf@pointsman.de> Updated to expat 2.2.6. |
> > > > |
1 2 3 4 5 6 7 8 9 10 11 |
2020-05-14 Rolf Ade <rolf@pointsman.de> Added method clearString to the dom command. 2019-12-31 Rolf Ade <rolf@pointsman.de> Updated to expat 2.2.9. 2018-10-12 Rolf Ade <rolf@pointsman.de> Updated to expat 2.2.6. |
Changes to doc/dom.xml.
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
...
503
504
505
506
507
508
509
510
511
512
513
514
515
516
|
investigated or modified with the full range of the doc and node methods. Please note that the element node names and the text node values within the tree may be outside of what the appropriate XML productions allow.</desc> </optdef> <optdef> <optname>-jsonmaxnesting</optname> <optarg>integer</optarg> <desc>This option only has effect if used together with the <m>-json</m> option. The current implementation uses recursive descent JSON parser. In order to avoid using excess stack space, any JSON input that has more than a certain levels of nesting is considered invalid. The default maximum nesting is 2000. The option -jsonmaxnesting allows the user to adjust that.</desc> </optdef> <optdef> <optname>--</optname> <desc>The option <m>--</m> marks the end of options. While respected in general this option is only needed in case of parsing JSON data, which may start with a ................................................................................ <command><cmd>dom</cmd> <method>isCharData</method> <m>string</m></command> <desc>Returns 1 if every character in <m>string</m> is a valid XML Char according to production 2 of the <ref href="http://www.w3.org/TR/2000/REC-xml-20001006.html">XML 1.0</ref> recommendation. Otherwise it returns 0.</desc> </commanddef> <commanddef> <command><cmd>dom</cmd> <method>isBMPCharData</method> <m>string</m></command> <desc>Returns 1 if every character in <m>string</m> is a valid XML Char with a Unicode code point within the Basic Multilingual Plane (that means, that every character within the string is at most 3 bytes long). Otherwise it returns 0.</desc> |
>
>
>
>
>
>
>
>
|
>
>
>
>
>
>
>
>
>
>
>
>
|
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
...
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
|
investigated or modified with the full range of the doc and node methods. Please note that the element node names and the text node values within the tree may be outside of what the appropriate XML productions allow.</desc> </optdef> <optdef> <optname>-jsonroot <document element name></optname> <desc>If given makes the given element name the document element of the resulting doc. The parsed content of the JSON string will be the childs of this document element node.</desc> </optdef> <optdef> <optname>-jsonmaxnesting</optname> <optarg>integer</optarg> <desc>This option only has effect if used together with the <m>-json</m> option. The current implementation uses a recursive descent JSON parser. In order to avoid using excess stack space, any JSON input that has more than a certain levels of nesting is considered invalid. The default maximum nesting is 2000. The option -jsonmaxnesting allows the user to adjust that.</desc> </optdef> <optdef> <optname>--</optname> <desc>The option <m>--</m> marks the end of options. While respected in general this option is only needed in case of parsing JSON data, which may start with a ................................................................................ <command><cmd>dom</cmd> <method>isCharData</method> <m>string</m></command> <desc>Returns 1 if every character in <m>string</m> is a valid XML Char according to production 2 of the <ref href="http://www.w3.org/TR/2000/REC-xml-20001006.html">XML 1.0</ref> recommendation. Otherwise it returns 0.</desc> </commanddef> <commanddef> <command><cmd>dom</cmd> <method>clearString</method> <m>string</m></command> <desc>Returns the string given as argument cleared out from any characters not allowed as XML parsed character data.</desc> </commanddef> <commanddef> <command><cmd>dom</cmd> <method>isBMPCharData</method> <m>string</m></command> <desc>Returns 1 if every character in <m>string</m> is a valid XML Char with a Unicode code point within the Basic Multilingual Plane (that means, that every character within the string is at most 3 bytes long). Otherwise it returns 0.</desc> |
Changes to generic/dom.c.
350 351 352 353 354 355 356 357 358 359 360 361 362 363 |
if (clen > 4) return 0; if (UTF8_XMLCHAR((unsigned const char *)p,clen)) p += clen; else return 0; } return 1; } /*--------------------------------------------------------------------------- | domIsBMPChar | \--------------------------------------------------------------------------*/ int domIsBMPChar ( |
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > |
350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 |
if (clen > 4) return 0; if (UTF8_XMLCHAR((unsigned const char *)p,clen)) p += clen; else return 0; } return 1; } /*--------------------------------------------------------------------------- | domClearString | \--------------------------------------------------------------------------*/ char * domClearString ( char *str, int *haveToFree ) { const char *p, *s; char *p1, *clearedstr; int clen, i, rewrite = 0; p = str; while (*p) { clen = UTF8_CHAR_LEN(*p); if (clen > 4 || !UTF8_XMLCHAR((unsigned const char*)p,clen)) { rewrite = 1; break; } p += clen; } if (!rewrite) { *haveToFree = 0; return str; } s = p; p += clen; while (*p) p++; clearedstr = MALLOC (sizeof(char) * (p-str)); p1 = clearedstr; while (str < s) { *p1 = *str; p1++; str++; } str += clen; while (*str) { clen = UTF8_CHAR_LEN(*str); if (clen <= 4 && UTF8_XMLCHAR((unsigned const char*)str,clen)) { for (i = 0; i < clen; i++) { *p1 = *str; p1++; str++; } } else { str += clen; } } *p1 = '\0'; *haveToFree = 1; return clearedstr; } /*--------------------------------------------------------------------------- | domIsBMPChar | \--------------------------------------------------------------------------*/ int domIsBMPChar ( |
Changes to generic/dom.h.
838 839 840 841 842 843 844 845 846 847 848 849 850 851 |
void tcldom_tolower (const char *str, char *str_out, int len); int domIsNAME (const char *name); int domIsPINAME (const char *name); int domIsQNAME (const char *name); int domIsNCNAME (const char *name); int domIsChar (const char *str); int domIsBMPChar (const char *str); int domIsComment (const char *str); int domIsCDATA (const char *str); int domIsPIValue (const char *str); void domCopyTo (domNode *node, domNode *parent, int copyNS); void domCopyNS (domNode *from, domNode *to); domAttrNode * domCreateXMLNamespaceNode (domNode *parent); |
> |
838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 |
void tcldom_tolower (const char *str, char *str_out, int len);
int domIsNAME (const char *name);
int domIsPINAME (const char *name);
int domIsQNAME (const char *name);
int domIsNCNAME (const char *name);
int domIsChar (const char *str);
char * domClearString (char *str, int *haveToFree);
int domIsBMPChar (const char *str);
int domIsComment (const char *str);
int domIsCDATA (const char *str);
int domIsPIValue (const char *str);
void domCopyTo (domNode *node, domNode *parent, int copyNS);
void domCopyNS (domNode *from, domNode *to);
domAttrNode * domCreateXMLNamespaceNode (domNode *parent);
|
Changes to generic/tcldom.c.
216 217 218 219 220 221 222 223 224 225 226 227 228 229 .... 6243 6244 6245 6246 6247 6248 6249 6250 6251 6252 6253 6254 6255 6256 6257 6258 6259 6260 .... 6835 6836 6837 6838 6839 6840 6841 6842 6843 6844 6845 6846 6847 6848 6849 6850 6851 6852 6853 6854 6855 6856 6857 6858 6859 6860 6861 6862 6863 6864 6865 6866 6867 6868 6869 6870 6871 6872 6873 6874 .... 7090 7091 7092 7093 7094 7095 7096 7097 7098 7099 7100 7101 7102 7103 |
) " createNodeCmd ?-returnNodeCmd? ?-tagName name? ?-jsonType jsonType? ?-namespace URI? (element|comment|text|cdata|pi)Node cmdName \n" " setStoreLineColumn ?boolean? \n" " setNameCheck ?boolean? \n" " setTextCheck ?boolean? \n" " setObjectCommands ?(automatic|token|command)? \n" " isCharData string \n" " isComment string \n" " isCDATA string \n" " isPIValue string \n" " isName string \n" " isQName string \n" " isNCName string \n" " isPIName string \n" ................................................................................ jsonRoot = Tcl_GetString(objv[1]); } else { SetResult("The \"dom parse\" option \"-jsonroot\" " "expects the document element name of the " "DOM tree to create as argument."); return TCL_ERROR; } if (!domIsNAME(jsonRoot)) { SetResult("-jsonroot value: not a valid element name"); return TCL_ERROR; } objv++; objc--; continue; case o_simple: takeSimpleParser = 1; objv++; objc--; continue; case o_html: ................................................................................ Tcl_Interp * interp, int objc, Tcl_Obj * const objv[] ) { GetTcldomTSD() char * method, tmp[300]; int methodIndex, result, i, bool; Tcl_CmdInfo cmdInfo; Tcl_Obj * mobjv[MAX_REWRITE_ARGS]; static const char *domMethods[] = { "createDocument", "createDocumentNS", "createNodeCmd", "parse", "setStoreLineColumn", "isCharData", "isName", "isPIName", "isQName", "isComment", "isCDATA", "isPIValue", "isNCName", "createDocumentNode", "setNameCheck", "setTextCheck", "setObjectCommands", "featureinfo", "isBMPCharData", #ifdef TCL_THREADS "attachDocument", "detachDocument", #endif NULL }; enum domMethod { m_createDocument, m_createDocumentNS, m_createNodeCmd, m_parse, m_setStoreLineColumn, m_isCharData, m_isName, m_isPIName, m_isQName, m_isComment, m_isCDATA, m_isPIValue, m_isNCName, m_createDocumentNode, m_setNameCheck, m_setTextCheck, m_setObjectCommands, m_featureinfo, m_isBMPCharData #ifdef TCL_THREADS ,m_attachDocument, m_detachDocument #endif }; static const char *nodeModeValues[] = { "automatic", "command", "token", NULL ................................................................................ CheckArgs(3,3,2,"feature") return tcldom_featureinfo(clientData, interp, --objc, objv+1); case m_isBMPCharData: CheckArgs(3,3,2,"string"); SetBooleanResult(domIsBMPChar(Tcl_GetString(objv[2]))); return TCL_OK; } SetResult( dom_usage); return TCL_ERROR; } #ifdef TCL_THREADS |
> > < < < < | | | | > > > > > > > > > > > > |
216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 .... 6245 6246 6247 6248 6249 6250 6251 6252 6253 6254 6255 6256 6257 6258 .... 6833 6834 6835 6836 6837 6838 6839 6840 6841 6842 6843 6844 6845 6846 6847 6848 6849 6850 6851 6852 6853 6854 6855 6856 6857 6858 6859 6860 6861 6862 6863 6864 6865 6866 6867 6868 6869 6870 6871 6872 .... 7088 7089 7090 7091 7092 7093 7094 7095 7096 7097 7098 7099 7100 7101 7102 7103 7104 7105 7106 7107 7108 7109 7110 7111 7112 7113 |
) " createNodeCmd ?-returnNodeCmd? ?-tagName name? ?-jsonType jsonType? ?-namespace URI? (element|comment|text|cdata|pi)Node cmdName \n" " setStoreLineColumn ?boolean? \n" " setNameCheck ?boolean? \n" " setTextCheck ?boolean? \n" " setObjectCommands ?(automatic|token|command)? \n" " isCharData string \n" " clearString string \n" " isBMPCharData string \n" " isComment string \n" " isCDATA string \n" " isPIValue string \n" " isName string \n" " isQName string \n" " isNCName string \n" " isPIName string \n" ................................................................................ jsonRoot = Tcl_GetString(objv[1]); } else { SetResult("The \"dom parse\" option \"-jsonroot\" " "expects the document element name of the " "DOM tree to create as argument."); return TCL_ERROR; } objv++; objc--; continue; case o_simple: takeSimpleParser = 1; objv++; objc--; continue; case o_html: ................................................................................ Tcl_Interp * interp, int objc, Tcl_Obj * const objv[] ) { GetTcldomTSD() char * method, tmp[300], *clearedStr; int methodIndex, result, i, bool; Tcl_CmdInfo cmdInfo; Tcl_Obj * mobjv[MAX_REWRITE_ARGS], *newObj; static const char *domMethods[] = { "createDocument", "createDocumentNS", "createNodeCmd", "parse", "setStoreLineColumn", "isCharData", "isName", "isPIName", "isQName", "isComment", "isCDATA", "isPIValue", "isNCName", "createDocumentNode", "setNameCheck", "setTextCheck", "setObjectCommands", "featureinfo", "isBMPCharData", "clearString", #ifdef TCL_THREADS "attachDocument", "detachDocument", #endif NULL }; enum domMethod { m_createDocument, m_createDocumentNS, m_createNodeCmd, m_parse, m_setStoreLineColumn, m_isCharData, m_isName, m_isPIName, m_isQName, m_isComment, m_isCDATA, m_isPIValue, m_isNCName, m_createDocumentNode, m_setNameCheck, m_setTextCheck, m_setObjectCommands, m_featureinfo, m_isBMPCharData, m_clearString #ifdef TCL_THREADS ,m_attachDocument, m_detachDocument #endif }; static const char *nodeModeValues[] = { "automatic", "command", "token", NULL ................................................................................ CheckArgs(3,3,2,"feature") return tcldom_featureinfo(clientData, interp, --objc, objv+1); case m_isBMPCharData: CheckArgs(3,3,2,"string"); SetBooleanResult(domIsBMPChar(Tcl_GetString(objv[2]))); return TCL_OK; case m_clearString: CheckArgs(3,3,2,"string"); clearedStr = domClearString (Tcl_GetString (objv[2]), &bool); if (bool) { newObj = Tcl_NewStringObj (clearedStr, -1); FREE (clearedStr); Tcl_SetObjResult (interp, newObj); } else { Tcl_SetObjResult (interp, objv[2]); } return TCL_OK; } SetResult( dom_usage); return TCL_ERROR; } #ifdef TCL_THREADS |
Changes to tests/dom.test.
1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 |
} {0} test dom-3.41 {isPIValue} { dom isPIValue "some invalid processing instruction data?>" } {0} test dom-4.1 {-useForeignDTD 0} { set doc [dom parse -useForeignDTD 0 {<root/>}] $doc delete } {} test dom-4.2 {-useForeignDTD 1 with document with internal subset} {need_uri} { set baseURI [tdom::baseURL [file join [pwd] [file dir [info script]] dom.test]] |
> > > > > > > > > > > > > > > > > > > > > |
1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080 1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 |
} {0} test dom-3.41 {isPIValue} { dom isPIValue "some invalid processing instruction data?>" } {0} test dom-3.43 {clearString} { set result [list] foreach str { \u0001 a\u0002 \u0003b a\u0004b a\u0004\u0005b a\u0004c\u0005b a\u0004d\u0005\u0006b a\u0004d\u0005\uD800\uD801\uD802_foo_bar \uD800\uD801\uD802_foo_bar_baz\uD802_didum\uDFFF \uD800\uD801\uD802_foo_bar_baz\uD802_didum\uE000 \u0004\u0005\uDABC abc } { lappend result [dom clearString $str] } set result } [list {} a b ab ab acb adb ad_foo_bar _foo_bar_baz_didum _foo_bar_baz_didum\uE000 {} abc] test dom-4.1 {-useForeignDTD 0} { set doc [dom parse -useForeignDTD 0 {<root/>}] $doc delete } {} test dom-4.2 {-useForeignDTD 1 with document with internal subset} {need_uri} { set baseURI [tdom::baseURL [file join [pwd] [file dir [info script]] dom.test]] |