Customizing the semantic HTML comparison
This library includes comparison and assert helpers that use the AngleSharp Diffing library to perform semantic HTML comparison.
On this page we will go through how the comparison works, and what options you have to affect the comparison process.
Why semantic comparison is needed for stable tests
A plain string comparison of two strings containing HTML markup breaks easily, even when the two markup strings are semantically equivalent. Changes that can cause a regular string comparison to fail include:
- Formatting of markup, e.g. extra line breaks or indentation, and other changes to insignificant whitespace.
- Reordering of attributes. The order of attributes does not matter.
- Reordering of classes defined in the class="..." attribute. The order of classes does not matter.
- Change of boolean attributes to be implicit or explicit, e.g. from required="required" to required.
- Changes to insignificant whitespace inside <style> tags.
- Changes to HTML comments and comments inside <style> tags.
The AngleSharp Diffing library handles all those cases, so your tests are more stable.
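The difference between the two kinds of comparison can be sketched in a small test. This is a minimal illustration, assuming xUnit as the test framework and bUnit's MarkupMatches() extension for plain strings; the class name, method name, and markup are made up for the example.

```csharp
using Bunit;
using Xunit;

// Hypothetical test class, for illustration only.
public class SemanticComparisonSketch
{
    [Fact]
    public void EquivalentMarkupDiffersAsStringsButMatchesSemantically()
    {
        var expected = @"<p class=""lead intro"" id=""p1"">Hello world</p>";

        // Same element, but with reordered attributes and classes,
        // extra whitespace, and an HTML comment.
        var actual = @"<p   id=""p1""
    class=""intro lead"">
    Hello world
    <!-- rendered by a component -->
</p>";

        // A plain string comparison sees two different strings...
        Assert.NotEqual(expected, actual);

        // ...while the semantic comparison treats them as equivalent.
        // MarkupMatches throws if the markup does not match.
        actual.MarkupMatches(expected);
    }
}
```

The expected (control) markup is always the argument to MarkupMatches(), and the markup under test is the value it is called on.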
Customizing Options
The AngleSharp Diffing library also allows us to customize the comparison process by adding special attributes to the "control" markup, i.e. the expected markup we want to use in verification.
All the customization options below will match with the following markup:
<header>
  <h1 id="head-1">
    Hello <em>world</em>
  </h1>
</header>
Here are the customization options you have available to you:
- Ignore comments (enabled by default): Comments in markup and inside <style> tags are automatically ignored and are not part of the comparison process.

- Ignore element: Use the diff:ignore attribute to ignore an element, all its attributes, and its child nodes. For example, to ignore the h1 element, do the following:

    <header>
      <h1 diff:ignore></h1>
    </header>

- Ignore attribute: To ignore an attribute during comparison, add the :ignore modifier to the attribute (no value is needed). For example, to ignore the id attribute:

    <header>
      <h1 id:ignore>Hello <em>world</em></h1>
    </header>

- Ignore children: Use the diff:ignoreChildren attribute (no value is needed) to ignore all child nodes/elements of an element. This does not include attributes. For example, to ignore all child nodes of the h1 element, do the following:

    <header>
      <h1 id="head-1" diff:ignoreChildren></h1>
    </header>

- Ignore all attributes: Use the diff:ignoreAttributes attribute (no value is needed) to ignore all attributes of an element. For example:

    <header>
      <h1 diff:ignoreAttributes>Hello <em>world</em></h1>
    </header>

  Note: The diff:ignoreChildren and diff:ignoreAttributes attributes can be combined to ignore all child nodes/elements and attributes of an element, but still verify that the element itself exists. For example:

    <header>
      <h1 diff:ignoreChildren diff:ignoreAttributes></h1>
    </header>

- Configure whitespace handling: By default, all nodes and elements are compared using the Normalize whitespace handling option. The Normalize option will trim all text nodes and replace two or more whitespace characters with a single space character. The other options are Preserve, which will leave all whitespace unchanged, and RemoveWhitespaceNodes, which will only remove empty text nodes.

  To override the default option, use the diff:whitespace attribute, and pass one of the three options to it. For example:

    <header>
      <h1 id="head-1" diff:whitespace="preserve">
        Hello <em>world</em>
      </h1>
    </header>

  Note: The default for <pre> and <script> elements is the Preserve option. To change that, use the diff:whitespace attribute. For example:

    <header>
      <pre diff:whitespace="RemoveWhitespaceNodes">Hello<em>world</em></pre>
    </header>

- Perform case-insensitive comparison: By default, all text comparison is case sensitive, but if you want to perform a case-insensitive comparison of text inside elements or attributes, use the diff:ignoreCase attribute on elements and the :ignoreCase modifier on attributes. For example, to perform a case-insensitive comparison of the text in the following h1 element, do the following:

    <header>
      <h1 id="head-1" diff:ignoreCase>HeLLo <em>world</em></h1>
    </header>

  To perform a case-insensitive comparison of the text inside the id attribute, do the following:

    <header>
      <h1 id:ignoreCase="HeAD-1">Hello <em>world</em></h1>
    </header>

- Use RegEx during comparison: To use a regular expression when comparing the text inside an element or inside an attribute, use the diff:regex attribute on elements and the :regex modifier on attributes.

  For example, to use a regular expression during comparison of the text in the em element, add the diff:regex attribute to the element and place the regular expression in the body of the element:

    <header>
      <h1 id="head-1">Hello <em diff:regex>\w</em></h1>
    </header>

  To use a regular expression during comparison of the text inside the id attribute, add the :regex modifier to the attribute and add the regular expression in the attribute's value:

    <h1 id:regex="head-\d{1}">...</h1>

  Note: The attribute modifiers :ignoreCase and :regex can be combined, for example, as:

    attr:ignoreCase:regex="FOO-\d{4}"
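Several of the options above can be tried out against the markup shown at the start of this section. The following is a minimal sketch, assuming xUnit and bUnit's MarkupMatches() extension for plain strings, and a version of the diffing library that supports every attribute listed here. The class and method names are hypothetical.

```csharp
using Bunit;
using Xunit;

// Hypothetical test class, for illustration only.
public class ControlMarkupOptionsSketch
{
    // The markup from the start of this section, standing in for rendered output.
    private const string Actual =
        @"<header><h1 id=""head-1"">Hello <em>world</em></h1></header>";

    // Each case is a control (expected) markup string that uses one option.
    [Theory]
    [InlineData(@"<header><h1 diff:ignore></h1></header>")]
    [InlineData(@"<header><h1 id:ignore>Hello <em>world</em></h1></header>")]
    [InlineData(@"<header><h1 id=""head-1"" diff:ignoreChildren></h1></header>")]
    [InlineData(@"<header><h1 diff:ignoreAttributes>Hello <em>world</em></h1></header>")]
    [InlineData(@"<header><h1 id=""head-1"" diff:ignoreCase>HeLLo <em>world</em></h1></header>")]
    [InlineData(@"<header><h1 id:regex=""head-\d{1}"">Hello <em>world</em></h1></header>")]
    public void ControlMarkupWithOptionMatches(string expected)
    {
        // Throws if the actual markup does not match the control markup.
        Actual.MarkupMatches(expected);
    }
}
```

The special attributes only have an effect when they appear in the control markup, which is why they are placed in the expected strings and never in Actual.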
Examples
Let’s look at a few examples where we use the semantic comparison options listed above to modify the comparison. In tests, we have the MarkupMatches() methods we can use to perform semantic comparison of the output from a rendered component. For example, we may have a component, <Heading>, that renders the following markup:
<h3 id="heading-1337" required>
  Heading text
  <small class="text-muted mark">
    Secondary text
  </small>
</h3>
In this case, we want to verify that the markup is rendered correctly, using a regular expression to verify the id attribute (it might be generated) and ignoring the <small> element. In tests, we can do this with the MarkupMatches() method:
[Fact] // assumes xUnit; the test method name below is illustrative
public void HeadingRendersExpectedMarkup()
{
    // Arrange: render the component under test (CUT).
    var cut = RenderComponent<Heading>();

    // Assert
    // Here we specify expected HTML from CUT.
    var expectedHtml = @"<h3 id:regex=""heading-\d{4}"" required>
                           Heading text
                           <small diff:ignore></small>
                         </h3>";

    // Here we use the HTML diffing library to assert that the rendered HTML
    // from CUT is semantically the same as the expected HTML string above.
    cut.MarkupMatches(expectedHtml);
}
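The same check can be tried without the <Heading> component by comparing two markup strings directly, which also makes it easy to see what a failed comparison looks like. This is a sketch under a few assumptions: xUnit as the test framework, bUnit's MarkupMatches() extension for plain strings, and HtmlEqualException (from the Bunit namespace) as the exception thrown on a mismatch. The class and method names are hypothetical.

```csharp
using Bunit;
using Xunit;

// Hypothetical test class, for illustration only.
public class HeadingMarkupSketch
{
    // Stand-in for the markup rendered by the <Heading> component.
    private const string Rendered = @"<h3 id=""heading-1337"" required>
  Heading text
  <small class=""text-muted mark"">
    Secondary text
  </small>
</h3>";

    [Fact]
    public void MatchesWhenIdFitsPatternAndSmallIsIgnored()
    {
        var expected = @"<h3 id:regex=""heading-\d{4}"" required>
  Heading text
  <small diff:ignore></small>
</h3>";

        Rendered.MarkupMatches(expected);
    }

    [Fact]
    public void FailsWhenHeadingTextDiffers()
    {
        // Only the heading text differs from the rendered markup.
        var expected = @"<h3 id:regex=""heading-\d{4}"" required>
  Another heading
  <small diff:ignore></small>
</h3>";

        // Assumption: a mismatch surfaces as Bunit.HtmlEqualException.
        Assert.Throws<HtmlEqualException>(() => Rendered.MarkupMatches(expected));
    }
}
```

Ignoring the <small> element and matching the id by pattern keeps the test focused on the heading text, so unrelated changes to the secondary text or the generated id do not break it.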