GumboParseFlags

Parse flags. We track the reasons for parser insertion of nodes and store them in a bitvector in the node itself. This lets client code optimize out nodes that are implied by the HTML structure of the document, or flag constructs that may not be allowed by a style guide, or track the prevalence of incorrect or tricky HTML code.
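Because the flags form a bitvector, individual reasons are tested with a bitwise AND rather than with equality. The following is a minimal sketch under stated assumptions: the enum is a local mirror of the values listed in this section so that the snippet compiles on its own (with the real library you would include the Gumbo header instead), and `has_all_flags` and `is_normal` are hypothetical helper names, not part of the library.

```c
#include <stdbool.h>

/* Local mirror of the documented values (assumption: matches the real header).
   Note that 1 << 2 does not appear in the documented list. */
typedef enum {
  GUMBO_INSERTION_NORMAL = 0,
  GUMBO_INSERTION_BY_PARSER = 1 << 0,
  GUMBO_INSERTION_IMPLICIT_END_TAG = 1 << 1,
  GUMBO_INSERTION_IMPLIED = 1 << 3,
  GUMBO_INSERTION_CONVERTED_FROM_END_TAG = 1 << 4,
  GUMBO_INSERTION_FROM_ISINDEX = 1 << 5,
  GUMBO_INSERTION_FROM_IMAGE = 1 << 6,
  GUMBO_INSERTION_RECONSTRUCTED_FORMATTING_ELEMENT = 1 << 7,
  GUMBO_INSERTION_ADOPTION_AGENCY_CLONED = 1 << 8,
  GUMBO_INSERTION_ADOPTION_AGENCY_MOVED = 1 << 9,
  GUMBO_INSERTION_FOSTER_PARENTED = 1 << 10
} GumboParseFlags;

/* Hypothetical helper: true if every bit in `mask` is also set in `flags`. */
static bool has_all_flags(unsigned int flags, unsigned int mask) {
  return (flags & mask) == mask;
}

/* Hypothetical helper: GUMBO_INSERTION_NORMAL is 0, so it cannot be tested
   with a bitwise AND; compare the whole bitvector against 0 instead. */
static bool is_normal(unsigned int flags) {
  return flags == GUMBO_INSERTION_NORMAL;
}
```

In client code, the `flags` argument would be the bitvector read from the node. The design point worth noting is the separate `is_normal` check: masking with 0 always succeeds, so `has_all_flags(flags, GUMBO_INSERTION_NORMAL)` would be true for every node.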

Values

Value and meaning
GUMBO_INSERTION_NORMAL = 0

A normal node: both start and end tags appear in the source, and nothing has been reparented.

GUMBO_INSERTION_BY_PARSER = 1 << 0

A node inserted by the parser to fulfill some implicit insertion rule. This is usually set in addition to some other flag giving a more specific insertion reason; it's a generic catch-all term meaning "The start tag for this node did not appear in the document source".

GUMBO_INSERTION_IMPLICIT_END_TAG = 1 << 1

A flag indicating that the end tag for this node did not appear in the document source. Note that in some cases, you can still have parser-inserted nodes with an explicit end tag: for example, "Text</html>" has GUMBO_INSERTION_BY_PARSER set on the <html> node, but GUMBO_INSERTION_IMPLICIT_END_TAG is unset, as the </html> tag actually exists. This flag will be set only if the end tag is completely missing; in some cases, the end tag may be misplaced (e.g. a </body> tag with text afterwards), which will leave this flag unset and require clients to inspect the parse errors for that case.
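The "Text</html>" case shows that the two flags are independent: one describes the start tag, the other the end tag. A minimal sketch of how a client might separate the two questions follows. The enum values mirror the ones documented here, and `start_tag_missing` and `end_tag_missing` are hypothetical helper names introduced only for illustration.

```c
#include <stdbool.h>

/* Local mirror of the two documented values used in this sketch
   (assumption: matches the real header). */
enum {
  GUMBO_INSERTION_BY_PARSER = 1 << 0,
  GUMBO_INSERTION_IMPLICIT_END_TAG = 1 << 1,
  GUMBO_INSERTION_IMPLIED = 1 << 3
};

/* Hypothetical helper: the start tag did not appear in the source. */
static bool start_tag_missing(unsigned int flags) {
  return (flags & GUMBO_INSERTION_BY_PARSER) != 0;
}

/* Hypothetical helper: the end tag is completely absent from the source.
   A misplaced end tag leaves this false, so this check alone does not
   prove that the end tag was well placed. */
static bool end_tag_missing(unsigned int flags) {
  return (flags & GUMBO_INSERTION_IMPLICIT_END_TAG) != 0;
}
```

For the <html> node of "Text</html>", a bitvector with GUMBO_INSERTION_BY_PARSER set and GUMBO_INSERTION_IMPLICIT_END_TAG unset makes `start_tag_missing` return true and `end_tag_missing` return false. Detecting a misplaced end tag still requires inspecting the parse errors, as described above.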

GUMBO_INSERTION_IMPLIED = 1 << 3

A flag for nodes that are inserted because their presence is implied by other tags, e.g. <html>, <head>, <body>, <tbody>, etc.

GUMBO_INSERTION_CONVERTED_FROM_END_TAG = 1 << 4

A flag for nodes that are converted from their end tag equivalents. For example, </p> when no paragraph is open implies that the parser should create a <p> tag and immediately close it, while </br> means the same thing as <br>.

GUMBO_INSERTION_FROM_ISINDEX = 1 << 5

A flag for nodes that are converted from the parse of an <isindex> tag.

GUMBO_INSERTION_FROM_IMAGE = 1 << 6

A flag for <image> tags that are rewritten as <img>.

GUMBO_INSERTION_RECONSTRUCTED_FORMATTING_ELEMENT = 1 << 7

A flag for nodes that are cloned as a result of the reconstruction of active formatting elements. This is set only on the clone; the initial portion of the formatting run is an otherwise normal node that carries only GUMBO_INSERTION_IMPLICIT_END_TAG.

GUMBO_INSERTION_ADOPTION_AGENCY_CLONED = 1 << 8

A flag for nodes that are cloned by the adoption agency algorithm.

GUMBO_INSERTION_ADOPTION_AGENCY_MOVED = 1 << 9

A flag for nodes that are moved by the adoption agency algorithm.

GUMBO_INSERTION_FOSTER_PARENTED = 1 << 10

A flag for nodes that have been foster-parented out of a table (or should've been foster-parented, if verbatim mode is set).
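Since several flags can be set on one node at once, a tool that tracks the prevalence of tricky HTML may want to print every reason recorded for a node. The sketch below is one way to do that, under stated assumptions: the enum is a local mirror of the values listed above, and `describe_parse_flags` with its shortened labels is a hypothetical helper, not a library function.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Local mirror of the documented values (assumption: matches the real header). */
typedef enum {
  GUMBO_INSERTION_NORMAL = 0,
  GUMBO_INSERTION_BY_PARSER = 1 << 0,
  GUMBO_INSERTION_IMPLICIT_END_TAG = 1 << 1,
  GUMBO_INSERTION_IMPLIED = 1 << 3,
  GUMBO_INSERTION_CONVERTED_FROM_END_TAG = 1 << 4,
  GUMBO_INSERTION_FROM_ISINDEX = 1 << 5,
  GUMBO_INSERTION_FROM_IMAGE = 1 << 6,
  GUMBO_INSERTION_RECONSTRUCTED_FORMATTING_ELEMENT = 1 << 7,
  GUMBO_INSERTION_ADOPTION_AGENCY_CLONED = 1 << 8,
  GUMBO_INSERTION_ADOPTION_AGENCY_MOVED = 1 << 9,
  GUMBO_INSERTION_FOSTER_PARENTED = 1 << 10
} GumboParseFlags;

/* Hypothetical helper: writes a '|'-separated list of the set flags into
   `out`, or "NORMAL" when no flag is set. Output is truncated to fit. */
static void describe_parse_flags(unsigned int flags, char *out, size_t out_size) {
  static const struct {
    unsigned int bit;
    const char *name;
  } kNames[] = {
    {GUMBO_INSERTION_BY_PARSER, "BY_PARSER"},
    {GUMBO_INSERTION_IMPLICIT_END_TAG, "IMPLICIT_END_TAG"},
    {GUMBO_INSERTION_IMPLIED, "IMPLIED"},
    {GUMBO_INSERTION_CONVERTED_FROM_END_TAG, "CONVERTED_FROM_END_TAG"},
    {GUMBO_INSERTION_FROM_ISINDEX, "FROM_ISINDEX"},
    {GUMBO_INSERTION_FROM_IMAGE, "FROM_IMAGE"},
    {GUMBO_INSERTION_RECONSTRUCTED_FORMATTING_ELEMENT, "RECONSTRUCTED_FORMATTING_ELEMENT"},
    {GUMBO_INSERTION_ADOPTION_AGENCY_CLONED, "ADOPTION_AGENCY_CLONED"},
    {GUMBO_INSERTION_ADOPTION_AGENCY_MOVED, "ADOPTION_AGENCY_MOVED"},
    {GUMBO_INSERTION_FOSTER_PARENTED, "FOSTER_PARENTED"},
  };
  size_t i;

  if (out_size == 0) return;
  out[0] = '\0';
  if (flags == GUMBO_INSERTION_NORMAL) {
    snprintf(out, out_size, "NORMAL");
    return;
  }
  for (i = 0; i < sizeof kNames / sizeof kNames[0]; ++i) {
    if (flags & kNames[i].bit) {
      size_t used = strlen(out);
      /* Prepend a separator for every name after the first. */
      snprintf(out + used, out_size - used, "%s%s", used ? "|" : "", kNames[i].name);
    }
  }
}
```

A caller would pass the node's flag bitvector and a buffer, then log or count the resulting string. Names are emitted in ascending bit order, so the same combination of flags always produces the same label.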

Meta