1 files changed, 194 insertions, 0 deletions
diff --git a/content/posts/2022-03-09-fastest-js-html-escape.md b/content/posts/2022-03-09-fastest-js-html-escape.md
new file mode 100644
index 0000000..ee28920
--- /dev/null
+++ b/content/posts/2022-03-09-fastest-js-html-escape.md
@@ -0,0 +1,194 @@
+---
+slug: fastest-js-html-escape
+title: "Fastest JavaScript HTML Escape"
+date: "2022-03-09T21:42:57-04:00"
+---
+What is the fastest [JavaScript][js] [HTML][] escape implementation?  To
+answer that question I did the following:
+
+1. Wrote [10 different JavaScript HTML escape implementations][impls].
+2. Created a web-based benchmarking tool which uses [web workers][] and the
+   [Performance API][] to test them with a variety of string sizes and
+   generates a downloadable [CSV][] of results.
+3. Wrote a set of scripts to aggregate and plot the results.
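+
+The timing in step 2 can be sketched roughly as follows.  This is a
+minimal illustration, not the code from the benchmarking tool: the
+`meanCallTime` helper, the test string, and the iteration count are all
+hypothetical, and the real tool runs inside a web worker and records
+its results as CSV.
+
+```js
+// hypothetical helper: mean call time (in ms) of fn(s) over n calls
+const meanCallTime = (fn, s, n = 1000) => {
+  const t0 = performance.now();
+  for (let i = 0; i < n; i++) {
+    fn(s);
+  }
+  return (performance.now() - t0) / n;
+};
+
+// example: time a simple escape function on a 3000 character string
+const esc = (v) => v.replaceAll('&', '&amp;').replaceAll('<', '&lt;');
+const ms = meanCallTime(esc, '<a&b>'.repeat(600));
+console.log(`mean call time: ${(1000 * ms).toFixed(3)} μs`);
+```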
+
+## Results
+
+The times are from 64-bit [Chrome 99][chrome] running in [Debian][] on a
+[Lenovo ThinkPad X1 Carbon (9th Gen)][laptop]; the specific timing
+results may vary for your system, but the relative results should be
+comparable.
+
+The first chart shows implementations' mean call time (95% [CI][]) as
+the string length varies:
+
+{{< figure
+  src="/files/posts/fastest-js-html-escape/sizes.svg"
+  class=image
+  caption="String Size vs. HTML Escape Function Call Time (&mu;s)"
+>}}
+
+The second chart compares implementations' mean call time (95% [CI][])
+for 3000 character strings:
+
+{{< figure
+  src="/files/posts/fastest-js-html-escape/times.svg"
+  class=image
+  caption="HTML Escape Function Call Times"
+>}}
+
+The red, blue, and green bars in this chart indicate the slow, medium,
+and fast functions, respectively.
+
+### Slow Functions
+
+Anything that uses a capturing [regular expression][re].
+
+#### Example: h2
+
+```js
+const h2 = (() => {
+  // characters to match
+  const M = /([&<>'"])/g;
+
+  // map of char to entity
+  const E = {
+    '&': '&amp;',
+    '<': '&lt;',
+    '>': '&gt;',
+    "'": '&apos;',
+    '"': '&quot;',
+  };
+
+  // build and return escape function
+  return (v) => v.replace(M, (_, c) => E[c]);
+})();
+```
+&nbsp;
+
+The capture is definitely at fault, because the call times for otherwise
+identical non-capturing implementations (example: `h4`) are comparable to
+everything else.
+
+### Medium Functions
+
+Except for the capturing [regular expression][re] implementations in the
+previous section, the remaining implementations' call times were comparable
+with one another.  This includes:
+
+* Reducing an array of string literals and calling `replace()`.
+* Several variants of reducing an array of non-capturing [regular
+  expressions][re] with `replace()`.
+
+#### Example: h4
+
+```js
+const h4 = (() => {
+  // characters to match
+  const M = /[&<>'"]/g;
+
+  // map of char to entity
+  const E = {
+    '&': '&amp;',
+    '<': '&lt;',
+    '>': '&gt;',
+    "'": '&apos;',
+    '"': '&quot;',
+  };
+
+  // build and return escape function
+  return (v) => v.replace(M, (c) => E[c]);
+})();
+```
+
+### Fast Functions
+
+Three implementations are slightly faster than the others.  They all use
+`replaceAll()` and match on string literals.  Their call times are
+indistinguishable from one another:
+
+* h7: Reduce, Replace All
+* h8: Reduce, Replace All, Frozen
+* h9: Replace All Literal
+
+#### Example: h7
+
+```js
+const h7 = (() => {
+  const E = [
+    ['&', '&amp;'],
+    ['<', '&lt;'],
+    ['>', '&gt;'],
+    ["'", '&apos;'],
+    ['"', '&quot;'],
+  ];
+
+  return (v) => E.reduce((r, e) => r.replaceAll(e[0], e[1]), v);
+})();
+```
+&nbsp;
+
+## The Winner: h9
+
+Even though the call times for `h7`, `h8`, and `h9` are
+indistinguishable, I prefer `h9` because it is:
+
+* The most legible.  It is the easiest implementation to read for
+  beginning developers and developers who are uncomfortable with
+  functional programming.
+* The simplest to parse (probably).
+* Slightly easier for browsers to optimize (probably).
+
+Here it is:
+
+```js
+// html escape (replaceall explicit)
+const h9 = (v) => {
+  return v.replaceAll('&', '&amp;')
+    .replaceAll('<', '&lt;')
+    .replaceAll('>', '&gt;')
+    .replaceAll("'", '&apos;')
+    .replaceAll('"', '&quot;');
+};
+```
+&nbsp;
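+
+One detail worth noting about `h9` (and the table-driven variants): `&`
+is replaced first.  If it were replaced after the other characters, the
+`&` in entities that had already been emitted would be escaped a second
+time.  A small sketch, where `good` is a trimmed-down version of `h9`
+and `bad` is a hypothetical misordered variant, both written only for
+illustration:
+
+```js
+// ampersand first (same order as h9)
+const good = (v) => v.replaceAll('&', '&amp;').replaceAll('<', '&lt;');
+
+// hypothetical misordered variant: ampersand last
+const bad = (v) => v.replaceAll('<', '&lt;').replaceAll('&', '&amp;');
+
+console.log(good('a < b & c')); // a &lt; b &amp; c
+console.log(bad('a < b & c'));  // a &amp;lt; b &amp; c
+```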
+
+## Notes
+
+* The benchmarking interface, aggregation and plotting scripts, and
+  additional information are available in the [companion GitHub
+  repository][repo].
+* I also wrote a [DOM][]/`textContent` implementation, but I couldn't
+  compare it with the other implementations because [web workers][]
+  don't have [DOM][] access.  I would be surprised if it were as fast as
+  the fast functions above.
+* `Object.freeze()` doesn't appear to help, at least not in
+  [Chrome][].
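+
+For reference, a DOM/`textContent` escape usually looks something like
+the sketch below.  This is an assumption about the general technique,
+not the implementation from the repository; `makeDomEscape` is a
+hypothetical name, and it takes the document as a parameter so the
+dependency on the DOM is explicit.  Note that serializing a text node
+escapes `&`, `<`, and `>` but leaves quotes alone, so the output is not
+identical to `h9`.
+
+```js
+// hypothetical DOM-based escape: build an escape function from a document
+const makeDomEscape = (doc) => {
+  // reuse a single detached element across calls
+  const el = doc.createElement('div');
+
+  return (v) => {
+    el.textContent = v;  // store the value as plain text
+    return el.innerHTML; // read it back serialized as HTML
+  };
+};
+
+// in a page (not in a web worker): const hDom = makeDomEscape(document);
+```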
+
+
+[repo]: https://github.com/pablotron/fastest-js-html-escape
+  "Fastest JavaScript HTML Escape"
+[js]: https://en.wikipedia.org/wiki/ECMAScript
+  "JavaScript programming language."
+[html]: https://en.wikipedia.org/wiki/HTML
+  "HyperText Markup Language"
+[impls]: https://github.com/pablotron/fastest-js-html-escape/blob/main/public/common.js
+  "Variety of JavaScript HTML escape implementations."
+[web workers]: https://en.wikipedia.org/wiki/Web_worker
+  "JavaScript that runs in a background thread and communicates via messages with HTML page."
+[performance api]: https://developer.mozilla.org/en-US/docs/Web/API/Performance
+  "Web performance measurement API."
+[csv]: https://en.wikipedia.org/wiki/Comma-separated_values
+  "Comma-Separated Value file."
+[chrome]: https://www.google.com/chrome/
+  "Google Chrome web browser."
+[debian]: https://debian.org/
+  "Debian Linux distribution."
+[laptop]: https://en.wikipedia.org/wiki/ThinkPad_X1_series#X1_Carbon_(9th_Gen)
+  "Lenovo ThinkPad X1 Carbon (9th Gen)"
+[re]: https://en.wikipedia.org/wiki/Regular_expression
+  "Regular expression."
+[ci]: https://en.wikipedia.org/wiki/Confidence_interval
+  "Confidence interval."
+[dom]: https://en.wikipedia.org/wiki/Document_Object_Model
+  "Document Object Model"