author: Paul Duncan <pabs@pablotron.org> 2022-03-09 23:16:16 -0500
committer: Paul Duncan <pabs@pablotron.org> 2022-03-09 23:16:16 -0500
commit: 28c1f310bc45f98f27b59bb7890989e63d684d8e
tree: ed44de0bc0c7196d7e02b7ccb7ecc3d117e6bbcf /content
parent: 6d992bf825699d836859a031eff96fdb020b6a2a
add posts/2022-03-09-fastest-js-html-escape.md
Diffstat (limited to 'content')
-rw-r--r-- content/posts/2022-03-09-fastest-js-html-escape.md | 194
1 files changed, 194 insertions, 0 deletions
diff --git a/content/posts/2022-03-09-fastest-js-html-escape.md b/content/posts/2022-03-09-fastest-js-html-escape.md
new file mode 100644
index 0000000..ee28920
--- /dev/null
+++ b/content/posts/2022-03-09-fastest-js-html-escape.md
@@ -0,0 +1,194 @@
+---
+slug: fastest-js-html-escape
+title: "Fastest JavaScript HTML Escape"
+date: "2022-03-09T21:42:57-04:00"
+---
+What is the fastest [JavaScript][js] [HTML][] escape implementation? To
+answer that question I did the following:
+
+1. Wrote [10 different JavaScript HTML escape implementations][impls].
+2. Created a web-based benchmarking tool which uses [web workers][] and the
+ [Performance API][] to test them with a variety of string sizes and
+ generates a downloadable [CSV][] of results.
+3. Wrote a set of scripts to aggregate and plot the results.
+
+## Results
+
+The times are from 64-bit [Chrome 99][chrome] running in [Debian][] on a
+[Lenovo Thinkpad X1 Carbon (9th Gen)][laptop]; the specific timing
+results may vary for your system, but the relative results should be
+comparable.
+
+The first chart shows implementations' mean call time (95% [CI][]) as
+the string length varies:
+
+{{< figure
+ src="/files/posts/fastest-js-html-escape/sizes.svg"
+ class=image
+ caption="String Size vs. HTML Escape Function Call Time (&mu;s)"
+>}}
+
+The second chart compares implementations' mean call time (95% [CI][])
+for 3000 character strings:
+
+{{< figure
+ src="/files/posts/fastest-js-html-escape/times.svg"
+ class=image
+ caption="HTML Escape Function Call Times"
+>}}
+
+The red, blue, and green bars in this chart indicate the slow, medium,
+and fast functions, respectively.
+
+### Slow Functions
+
+Anything that uses a capturing [regular expression][re].
+
+#### Example: h2
+
+```js
+const h2 = (() => {
+  // characters to match
+  const M = /([&<>'"])/g;
+
+  // map of char to entity
+  const E = {
+    '&': '&amp;',
+    '<': '&lt;',
+    '>': '&gt;',
+    "'": '&apos;',
+    '"': '&quot;',
+  };
+
+  // build and return escape function
+  return (v) => v.replace(M, (_, c) => E[c]);
+})();
+```
+&nbsp;
+
+The capture is definitely at fault, because the call times for otherwise
+identical non-capturing implementations (example: `h4`) are comparable
+to everything else.
+
+### Medium Functions
+
+Except for the capturing [regular expression][re] implementations in the
+previous section, the remaining implementations' call times were comparable
+with one another. This includes:
+
+* Reducing an array of string literals and calling `replace()`.
+* Several variants of reducing an array of non-capturing [regular
+ expressions][re] with `replace()`.
+
+#### Example: h4
+
+```js
+const h4 = (() => {
+  // characters to match
+  const M = /[&<>'"]/g;
+
+  // map of char to entity
+  const E = {
+    '&': '&amp;',
+    '<': '&lt;',
+    '>': '&gt;',
+    "'": '&apos;',
+    '"': '&quot;',
+  };
+
+  // build and return escape function
+  return (v) => v.replace(M, (c) => E[c]);
+})();
+```
+
+### Fast Functions
+
+Three implementations are slightly faster than the others. They all use
+`replaceAll()` and match on string literals. Their call times are
+indistinguishable from one another:
+
+* h7: Reduce, Replace All
+* h8: Reduce, Replace All, Frozen
+* h9: Replace All Literal
+
+#### Example: h7
+
+```js
+const h7 = (() => {
+  const E = [
+    ['&', '&amp;'],
+    ['<', '&lt;'],
+    ['>', '&gt;'],
+    ["'", '&apos;'],
+    ['"', '&quot;'],
+  ];
+
+  return (v) => E.reduce((r, e) => r.replaceAll(e[0], e[1]), v);
+})();
+```
+&nbsp;
+
+## The Winner: h9
+
+Even though the call times for `h7`, `h8`, and `h9` are
+indistinguishable, I actually prefer `h9` because:
+
+* The most legible. It is the easiest implementation to read for
+ beginning developers and developers who are uncomfortable with
+ functional programming.
+* The simplest to parse (probably).
+* Slightly easier for browsers to optimize (probably).
+
+Here it is:
+
+```js
+// html escape (replaceall explicit)
+const h9 = (v) => {
+  return v.replaceAll('&', '&amp;')
+    .replaceAll('<', '&lt;')
+    .replaceAll('>', '&gt;')
+    .replaceAll("'", '&apos;')
+    .replaceAll('"', '&quot;');
+};
+```
+&nbsp;
+
+## Notes
+
+* The benchmarking interface, aggregation and plotting scripts, and
+ additional information are available in the [companion GitHub
+ repository][repo].
+* I also wrote a [DOM][]/`textContent` implementation, but I couldn't
+ compare it with the other implementations because [web workers][]
+ don't have [DOM][] access. I would be surprised if it was as fast as
+ the fast functions above.
+* `Object.freeze()` doesn't appear to help, at least not in
+ [Chrome][].
+
+
+[repo]: https://github.com/pablotron/fastest-js-html-escape
+ "Fastest JavaScript HTML Escape"
+[js]: https://en.wikipedia.org/wiki/ECMAScript
+ "JavaScript programming language."
+[html]: https://en.wikipedia.org/wiki/HTML
+ "HyperText Markup Language"
+[impls]: https://github.com/pablotron/fastest-js-html-escape/blob/main/public/common.js
+ "Variety of JavaScript HTML escape implementations."
+[web workers]: https://en.wikipedia.org/wiki/Web_worker
+ "JavaScript that runs in a background thread and communicates via messages with HTML page."
+[performance api]: https://developer.mozilla.org/en-US/docs/Web/API/Performance
+ "Web performance measurement API."
+[csv]: https://en.wikipedia.org/wiki/Comma-separated_values
+ "Comma-Separated Value file."
+[chrome]: https://www.google.com/chrome/
+ "Google Chrome web browser."
+[debian]: https://debian.org/
+ "Debian Linux distribution."
+[laptop]: https://en.wikipedia.org/wiki/ThinkPad_X1_series#X1_Carbon_(9th_Gen)
+ "Lenovo Thinkpad X1 Carbon (9th Gen)"
+[re]: https://en.wikipedia.org/wiki/Regular_expression
+ "Regular expression."
+[ci]: https://en.wikipedia.org/wiki/Confidence_interval
+ "Confidence interval."
+[dom]: https://en.wikipedia.org/wiki/Document_Object_Model
+ "Document Object Model"
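
As a usage sketch for the `h9` function in the post above: the sample input and the `wrongOrder` helper below are hypothetical, added only to illustrate one property of the chained `replaceAll()` calls. The `&` replacement has to come first. Otherwise the ampersands introduced by the other entities would be escaped a second time.

```javascript
// h9 as given in the post
const h9 = (v) => {
  return v.replaceAll('&', '&amp;')
    .replaceAll('<', '&lt;')
    .replaceAll('>', '&gt;')
    .replaceAll("'", '&apos;')
    .replaceAll('"', '&quot;');
};

// hypothetical counter-example: escaping '&' after '<' double-escapes
const wrongOrder = (v) => v.replaceAll('<', '&lt;').replaceAll('&', '&amp;');

const escaped = h9(`<a href="x">Tom & Jerry's</a>`);
console.log(escaped);
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&apos;s&lt;/a&gt;

console.log(wrongOrder('<'));
// &amp;lt;
```

Note that `replaceAll()` is a relatively recent addition to the language, so this assumes a runtime that supports it.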
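
The post attributes the slow results to the capture group. The sketch below puts the `h2` and `h4` styles side by side to show that the only differences are the parentheses in the pattern and the callback arguments: with a capture group, the callback receives the full match first and the captured text second. The names `capturing`, `nonCapturing`, and `sample` are hypothetical, and the sketch only checks that the outputs agree. It does not measure speed.

```javascript
// shared entity map, same contents as in h2 and h4
const E = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  "'": '&apos;',
  '"': '&quot;',
};

// h2 style: capture group, callback gets (match, group 1)
const capturing = (v) => v.replace(/([&<>'"])/g, (_, c) => E[c]);

// h4 style: no capture group, callback's first argument is the match
const nonCapturing = (v) => v.replace(/[&<>'"]/g, (c) => E[c]);

const sample = `<b>"R&D"</b>`;
console.log(capturing(sample) === nonCapturing(sample)); // true
```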
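
The methodology in the post relies on the Performance API. The following is a minimal sketch of timing one escape function with `performance.now()`, assuming a runtime where `performance` is a global (browsers, web workers, recent Node.js). The `meanCallTime` helper, the iteration count, and the test string are all hypothetical. This is not the benchmark from the companion repository, which runs in web workers and reports confidence intervals.

```javascript
// h9 as given in the post
const h9 = (v) => {
  return v.replaceAll('&', '&amp;')
    .replaceAll('<', '&lt;')
    .replaceAll('>', '&gt;')
    .replaceAll("'", '&apos;')
    .replaceAll('"', '&quot;');
};

// hypothetical helper: mean time per call, in microseconds
const meanCallTime = (fn, input, iterations = 1000) => {
  const t0 = performance.now();
  for (let i = 0; i < iterations; i++) {
    fn(input);
  }
  const t1 = performance.now();

  // performance.now() returns milliseconds; convert to microseconds
  return ((t1 - t0) * 1000) / iterations;
};

// 3000-character string, the size used in the post's second chart
const input = `<&>'"`.repeat(600);

const mean = meanCallTime(h9, input);
console.log(`${mean.toFixed(3)} us per call`);
```

A single run like this is noisy, so treat the number as a rough indication rather than a result comparable to the charts.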