#800 bug
Juriy Zaytsev

hasClassName/removeClassName to escape regex meta characters

Reported by Juriy Zaytsev | September 15th, 2009 @ 06:49 AM | in After 1.7

Comments and changes to this ticket

  • Juriy Zaytsev

    Juriy Zaytsev September 16th, 2009 @ 03:17 PM

    • State changed from “new” to “doc”
    • Tag changed from needs:doc, needs:failing_testcase, needs:patch, needs:tests, section:dom to needs:doc, section:dom

    Thinking more about this (and after discussion with JDD) I realized that it's probably best to leave things as is for perf. reasons. It's already possible to remove class with special chars simply by escaping them - someElement.hasClassName('foo\\[bar\\]').

    Marking as "Doc" to mention this peculiarity in docs.

  • Juriy Zaytsev

    Juriy Zaytsev September 26th, 2009 @ 07:43 PM

    • Assigned user changed from “Juriy Zaytsev” to “T.J. Crowder”
  • T.J. Crowder

    T.J. Crowder September 27th, 2009 @ 11:05 AM

    • Tag changed from needs:doc, section:dom to needs:patch, section:dom
    • State changed from “doc” to “bug”

    Can't say I'm comfortable with telling them to do their own regex escaping because our implementation uses regexes; our implementation isn't really their concern (and what if we change our implementation later so it doesn't?).

    Sometimes the old ways are the best:

    hasClassName: function(element, className) {
      if (!(element = $(element))) return;
      var elementClassName = element.className;
      return elementClassName == className
             || elementClassName.startsWith(className + ' ')
             || elementClassName.endsWith(' ' + className)
             || elementClassName.indexOf(' ' + className + ' ') >= 0;
    }
    

    Not only does it not have the bug, but (unoptimised) it's markedly faster on Chrome and Firefox than the original in all cases (matching out of several, matching out of one, no match out of several, no match out of one). On IE it's faster in the matching cases, but slightly slower in the non-matching cases. Optimisation may (or may not) improve it.

    I also tried this:

    hasClassName: function(element, className) {
      if (!(element = $(element))) return;
      return element.className.split(' ').include(className);
    }
    

    But it's slower in most cases than the original (even if I replace the include call with a boring old-fashioned loop).

    Barring objection, I'll take this as a bug and do a patch for both hasClassName and removeClassName, which also has this problem.

  • Tobie Langel

    Tobie Langel September 28th, 2009 @ 12:01 AM

    Have you benchmarked this:

    var WS = ' ';
    function hasClassName(element, className) {
      if (!(element = $(element))) return;
      return (WS + element.className + WS).indexOf(WS + className + WS) > -1;
    }
    
  • Juriy Zaytsev

    Juriy Zaytsev September 28th, 2009 @ 12:35 AM

    I'm pretty sure having variable instead of string inline will be slower.

  • Juriy Zaytsev

    Juriy Zaytsev September 28th, 2009 @ 01:02 AM

    T.J., escaping those characters is somewhat consistent with, for example, our selector implementation (and not only ours, but, IIRC, others too). It is also consistent with the way you write CSS rules - special characters such as ":", "[" and "]" - those that denote something special in CSS - are always escaped.

    Using plain whitespace in hasClassName is one way to avoid escaping, but then we have to understand that we replace HTML (4.01 or 5) defined whitespace character set with only one "\u0020" character. This means we'll fail to match classNames separated by, say, newlines or tabs (which HTML 4.01 and 5 allows).

    Note that HTML4.01 [1] and HTML5 [2] don't have identical whitespace definitions :)

    Whether we should try to be compliant in this regard or if it's an overkill is another question.

    I'm somewhat OK with using "\u0020" only, as long as we mention it in the docs. It would 1) avoid escaping, 2) make implementation faster, and 3) allow us to avoid choosing between HTML 4.01 or 5 definitions.

    I also know that some browsers normalize multiple whitespace into one, so maybe they normalize all whitespace characters to "\u0020" as well. I can't be sure about that though.

    [1] http://www.w3.org/TR/html401/struct/text.html#h-9.1 [2] http://www.w3.org/TR/html5/infrastructure.html#space-character

  • T.J. Crowder

    T.J. Crowder November 16th, 2009 @ 04:50 PM

    • Assigned user cleared.

    [responsible:none bulk edit command]

  • Andrew Dupont

    Andrew Dupont October 17th, 2010 @ 11:14 PM

    • Milestone changed from 1.7 to After 1.7
    • Importance changed from “” to “Low”

    Not gonna change behavior between now and 1.7. Pushing it back.

Please Sign in or create a free account to add a new ticket.

With your very own profile, you can contribute to projects, track your activity, watch tickets, receive and update tickets through your email and much more.

New-ticket Create new ticket

Create your profile

Help contribute to this project by taking a few moments to create your personal profile. Create your profile ยป

The Prototype JavaScript library.

Shared Ticket Bins

Pages