Update some xml.cc comments and add encoding and standalone declarations

The code comments explain the details; the changes follow from reading the
XML 1.0 spec. Some history:

http://seclists.org/nmap-dev/2013/q4/48 - Added DOCTYPE
http://seclists.org/nmap-dev/2014/q2/82 - broke Zenmap
http://seclists.org/nmap-dev/2014/q3/331 - reduced DOCTYPE
dmiller 2014-10-29 14:46:11 +00:00
parent c7e5ad67b5
commit 183566948f

xml.cc (18 lines changed)

@@ -124,12 +124,13 @@
This is a simple library for writing XML. It handles two main things:
keeping track of the element stack, and escaping text where necessary.
If you wanted to write this XML:
-<?xml version="1.0"?>
+<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
+<!DOCTYPE elem>
<elem name="&amp;10.5"></elem>
these are the functions you would call. Each one is followed by the text
it prints enclosed in ||.
-xml_start_document() |<?xml version="1.0"?>|
+xml_start_document("elem") |<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n<!DOCTYPE elem>|
xml_newline(); |\n|
xml_open_start_tag("elem"); |<elem|
xml_attribute("name", "&%.1f", 10.5); | name="&amp;10.5"|
@@ -166,7 +167,7 @@ Additional functions are
xml_write_raw Raw unescaped output.
xml_write_escaped XML-escaped output.
xml_write_escaped_v XML-escaped output, with a va_list.
-xml_start_document Writes <?xml version="1.0"?>.
+xml_start_document Writes <?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n<!DOCTYPE elem>.
xml_depth Returns the size of the element stack.
The library makes it harder but not impossible to make non-well-formed
@@ -308,7 +309,7 @@ int xml_write_escaped_v(const char *fmt, va_list va) {
return 0;
}
-/* Write the XML declaration: <?xml version="1.0"?>
+/* Write the XML declaration: <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
* and the DOCTYPE declaration: <!DOCTYPE rootnode>
*/
int xml_start_document(const char *rootnode) {
@@ -316,6 +317,15 @@ int xml_start_document(const char *rootnode) {
return -1;
if (xml_attribute("version", "1.0") < 0)
return -1;
+/* Practically, Nmap only uses ASCII, but UTF-8 encompasses ASCII and allows
+ * for future expansion */
+if (xml_attribute("encoding", "UTF-8") < 0)
+return -1;
+/* This indicates that parsers don't have to go elsewhere for entity
+ * declarations and so forth. We had trouble with this when we defined a
+ * PUBLIC doctype. */
+if (xml_attribute("standalone", "yes") < 0)
+return -1;
if (xml_close_pi() < 0)
return -1;
if (xml_newline() < 0)
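
For reference, the prologue that this change is meant to produce can be sketched outside Nmap. The block below is only an illustration: sketch_attribute, sketch_start_document and sketch_self_check are hypothetical names, not functions from xml.cc, and the sketch builds a std::string instead of writing to Nmap's XML output. It assumes the attribute order version, encoding, standalone, which is the order the XML 1.0 grammar requires for the declaration.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in, not Nmap's xml.cc API: formats one attribute as
// ` name="value"`. The values used here are fixed literals, so this sketch
// does no escaping.
static std::string sketch_attribute(const std::string &name,
                                    const std::string &value) {
  return " " + name + "=\"" + value + "\"";
}

// Hypothetical stand-in for the document prologue described in the commit:
// the XML declaration (version, encoding, standalone, in that order),
// a newline, then a DOCTYPE declaration naming the root element.
std::string sketch_start_document(const std::string &rootnode) {
  std::string out = "<?xml";
  out += sketch_attribute("version", "1.0");
  out += sketch_attribute("encoding", "UTF-8");
  out += sketch_attribute("standalone", "yes");
  out += "?>\n";
  out += "<!DOCTYPE " + rootnode + ">";
  return out;
}

// Self-check: the sketch reproduces the text shown in the updated comments.
void sketch_self_check() {
  assert(sketch_start_document("elem") ==
         "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
         "<!DOCTYPE elem>");
}
```

Calling sketch_start_document("elem") gives the same two lines that the updated header comment lists for xml_start_document("elem"). The standalone="yes" attribute signals, as the added comment says, that parsers do not need to fetch external entity declarations.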