Description
Running the test suite on Rubinius results in consistent crashes. From what I've gathered, the assumptions about what _private contains in an xmlNode are not always correct.
Here a VALUE is stored inside _private:
https://github.com/sparklemotion/nokogiri/blob/master/ext/nokogiri/xml_node.c#L1441-L1442
The crash happens when calling the mark function for an XML node:
#6 0x00007ffff424636c in mark (node=0x3be03b0) at ../../../../ext/nokogiri/xml_node.c:17
https://github.com/sparklemotion/nokogiri/blob/master/ext/nokogiri/xml_node.c#L17
This uses the DOC_RUBY_OBJECT macro. This macro assumes that _private inside the document for this node is a nokogiriTuplePtr.
https://github.com/sparklemotion/nokogiri/blob/master/ext/nokogiri/xml_document.h#L18
In this case, however, node->doc is not a document but an XML_DOCUMENT_FRAG_NODE, so it is actually an xmlNode that stores a VALUE in _private:
(gdb) p node
$16 = (xmlNodePtr) 0x3be03b0
(gdb) p *node
$17 = {_private = 0x28779a0, type = XML_COMMENT_NODE, name = 0x7fffef8f3704 "comment", children = 0x0, last = 0x0,
parent = 0x0, next = 0x0, prev = 0x0, doc = 0x2603850, ns = 0x0, content = 0x3be0430 "moo", properties = 0x0,
nsDef = 0x0, psvi = 0x0, line = 0, extra = 0}
(gdb) p *node->doc
$18 = {_private = 0x2877ca0, type = XML_DOCUMENT_FRAG_NODE, name = 0x0, children = 0x3161fc0, last = 0x3161fc0,
parent = 0x0, next = 0x0, prev = 0x0, doc = 0x5085590, compression = 0, standalone = 0, intSubset = 0x0,
extSubset = 0x0, oldNs = 0x0, version = 0x0, encoding = 0x0, ids = 0x41, refs = 0x2603850,
URL = 0x2603850 "�|\207\002", charset = 39860304, dict = 0x0, psvi = 0x3be04b0, parseFlags = 0, properties = 0}
This can also be seen from the _private pointers of node and node->doc: they are very similar, and both lie in the memory area where Rubinius allocates the storage behind a VALUE.
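To make the mismatch concrete, here is a minimal, self-contained sketch of checking the type of node->doc before interpreting its _private as a tuple. It does not use libxml2 or Nokogiri: mock_node, mock_tuple, mock_node_type, mock_is_document and mock_doc_tuple are all hypothetical names invented for this illustration, and the real struct layouts differ. It only shows the idea, not a proposed patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for libxml2's node kinds; not the real xmlElementType. */
typedef enum {
    MOCK_COMMENT_NODE,
    MOCK_DOCUMENT_NODE,
    MOCK_HTML_DOCUMENT_NODE,
    MOCK_DOCUMENT_FRAG_NODE
} mock_node_type;

/* Hypothetical stand-in for nokogiriTuple; the real layout may differ. */
typedef struct {
    unsigned long doc; /* plays the role of the Ruby VALUE for the document */
} mock_tuple;

/* Hypothetical stand-in for xmlNode/xmlDoc: only the fields used here. */
typedef struct mock_node {
    void *_private;        /* a tuple for documents, a VALUE for other nodes */
    mock_node_type type;
    struct mock_node *doc; /* expected to be a document, but may not be */
} mock_node;

/* Returns 1 only if n is non-NULL and really is a document node. */
int mock_is_document(const mock_node *n)
{
    return n != NULL &&
           (n->type == MOCK_DOCUMENT_NODE || n->type == MOCK_HTML_DOCUMENT_NODE);
}

/*
 * Guarded lookup: only treat doc->_private as a tuple when doc is a document.
 * Returns NULL when node->doc is missing or is something else, such as a
 * document fragment, whose _private holds a VALUE instead.
 */
mock_tuple *mock_doc_tuple(const mock_node *node)
{
    if (node == NULL || !mock_is_document(node->doc)) {
        return NULL;
    }
    return (mock_tuple *)node->doc->_private;
}
```

With such a guard, a mark function could skip the document lookup when mock_doc_tuple returns NULL instead of dereferencing a VALUE as if it were a tuple. Whether skipping is the right behaviour for Nokogiri is a separate question that this sketch does not answer.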
I have no idea why this doesn't also crash on MRI, but I suspect it's the conservative garbage collection: inside rb_gc_mark, MRI doesn't try to mark something that doesn't look like a valid heap pointer. If so, this only works as a side effect of MRI's implementation and is still a bug.