#79 bug
frank kootte

ObjectRange isn't inclusive when used on strings

Reported by frank kootte | May 6th, 2008 @ 12:40 PM | in After 1.7

As mentioned on the api pages one should be carefull with the ObjectRange object considering strings.

Even though the behaviour should be considered undesirable.

$A($R('a','b')) -> delivers as expected, an array containing the values [ 'a', 'b']

$A($R('aa', 'bb')) -> delivers unexpected

one would expect an array containing all following values

['aa', 'ab', 'ba', 'bb' ].

as described in the api the objectrange iterates over all characters in the current table but the method does return an array not containing the value 'bb' which is the least i would expect as it does behave like that when working with a numeric range.

i've got some suggestions on how to handle this, so if any inspiration / information needed please email me.

Comments and changes to this ticket

  • Tobie Langel

    Tobie Langel May 6th, 2008 @ 01:24 PM

    • State changed from “new” to “bug”
    • Title changed from “ObjectRange returning unusable results” to “ObjectRange isn't inclusive when used on strings ”
  • John-David Dalton

    John-David Dalton May 6th, 2008 @ 02:29 PM

    frank kootte -> why not just post your patch to this ticket?

  • frank kootte

    frank kootte May 6th, 2008 @ 03:36 PM

    i've created a piece of code which indeed does as i suggested, but it is more "proof-of-concept" code then code which can be added to prototype, i will rewrite it to fit the prototype branch.

  • frank kootte

    frank kootte May 6th, 2008 @ 04:03 PM

    ok, some suggestions :

    • we only work with positive ranges, so if we encounter a negative one, we swap the start and the end.
    • we need to distinguish between numerical and string based ranges as the string range needs some recursive / while computation that the numeric range does not need
    • we only work with pure string or numerical ranges as it just akward to have a range that is from for example 'a1' -> 'dd', so we only work with 11 -> 99 || 'aa' -> 'zz'
    • we only work with string ranges where the start and end have the same length.

    does anybody have an objection against changes to be made based on these suggestions ?

  • Juriy Zaytsev

    Juriy Zaytsev May 6th, 2008 @ 11:10 PM

    I might be missing something but ObjectRange simply delegates iteration logic to #succ method of its "start" object. #succ method is currently defined for both - Number.prototype and String.prototype, so the behavior is indeed different. I also could not reproduce "inclusiveness" bug:

    $R('a', 'd').toArray(); // => ['a','b','c','d']
    $R('a', 'd', true).toArray(); // => ['a','b','c']
    

    The "expected outcome" is imo somewhat subjective in this case. Current implementation increments character based on its ASCII code, while you are proposing to output a collection of possible combinations (or is it permutations?).

    I think the proposed change could actually make more sense.

  • Juriy Zaytsev

    Juriy Zaytsev May 7th, 2008 @ 05:40 AM

    Ok, there is a bug (#succ keeps incrementing ASCII code and everything fails once 65535 is reached).

    I'll try to make a patch as soon as I can.

  • Juriy Zaytsev

    Juriy Zaytsev May 7th, 2008 @ 06:08 AM

    A patch that allows #succ to work with strings that have any number of characters

  • Juriy Zaytsev

    Juriy Zaytsev May 7th, 2008 @ 06:08 AM

    • Milestone set to 1.7
    • Assigned user set to “Juriy Zaytsev”
  • frank kootte

    frank kootte May 7th, 2008 @ 09:12 AM

    You're correct about the problem when reaching the boundaries of the ASCII table, but i'm still in favor of changing the general behaviour of ranges with strings.

    When i want a range like the [a,b][a,b], i expect the following result.

    $R ('aa', 'bb') // -> ['aa', 'ab', 'ba', 'bb']
    

    but what ObjectRange returns is an huge array by iterating over the ASCII table until it will find 'b', but it will never get there.

    perhaps this term expected is not suitable, but i think the change i mentioned would give back more usable results.

  • frank kootte

    frank kootte May 7th, 2008 @ 10:20 AM

    I wrote a quick fix to show you my train of though, it's not completely like i want it, e.g. not all border exceptions are taken care of but it most surely does the trick.

    ObjectRange._each = function (iterator)
    {
        if(typeof (this.start) == 'string')
        {
        	var size = this.start.length;
        	
    		(function ( start, end, prefix  )
    		{
    			var length = start.length;
    				
    			for ( var index = 0; index < length; index++ )
    			{
    				var current = start [ index ];
    				var stop = end [ index ];
    					
    				while ( this.exclusive ? current.charCodeAt () < stop.charCodeAt () : current.charCodeAt () <= stop.charCodeAt () )
    				{					
    					if ( start [ index + 1 ] != undefined && end [ index + 1 ] != undefined )
    					{
    						arguments.callee ( start.slice ( 1 ), end.slice( 1 ), prefix + current );
    					}
    					else if ( ( prefix + current ).length == size )
    					{
    						iterator ( prefix + current );						
    					}
    					else
    					{
    						break;
    					}
    					
    					current = current.succ ();
    				}
    			}	
    		
    		}).bind(this)( this.start.toArray(), this.end.toArray(), '' );
        }
        else
        {
        	var value = this.start;
        	while (this.include(value)) 
        	{
          		iterator(value);
          		value = value.succ();
        	}
        }
      };
    
  • frank kootte

    frank kootte May 7th, 2008 @ 10:22 AM

    // replaced tabs with double spaced
    if(typeof (this.start) == 'string')
        {
          var size = this.start.length;
        	
    	  (function ( start, end, prefix  )
    	  {
    	   	var length = start.length;
    				
    		for ( var index = 0; index < length; index++ )
    		{
    		  var current = start [ index ];
    		  var stop = end [ index ];
    				
    		  while ( this.exclusive ? current.charCodeAt () < stop.charCodeAt () : current.charCodeAt () <= stop.charCodeAt () )
    		  {					
    		    if ( start [ index + 1 ] != undefined && end [ index + 1 ] != undefined )
    			{
    			  arguments.callee ( start.slice ( 1 ), end.slice( 1 ), prefix + current );
    			}
    			else if ( ( prefix + current ).length == size )
    			{
    			  iterator ( prefix + current );						
    			}
    			else
    			{
    			  break;
    			}
    					
    		    current = current.succ ();
    		  }
    		}	
    	  }).bind(this)( this.start.toArray(), this.end.toArray(), '' );
        }
        else
        {
          var value = this.start;
          while (this.include(value)) 
          {
            iterator(value);
          	value = value.succ();
          }
        }
    
  • Samuel Lebeau

    Samuel Lebeau December 4th, 2008 @ 07:42 AM

    • Tag set to lang, needs_tests, string

    @franch, the behavior your describing corresponds to an hypothetical Enumerable#permutations(count)which would give all permuations [1] of count elements in the given Enumerable.

    I don't this is the desired behavior for Range, as it was originally designed to mimic Ruby's Range.

    By the way, in Ruby ("aa".."zz").to_a.size # => 676 and "z".succ # => "a" whereas in Prototype $R('aa', 'bb').toArray().length // => 65439 and "z".succ() // => "{"

    [1] http://en.wikipedia.org/wiki/Per...

  • Lo0kBeHiNdU

    Lo0kBeHiNdU March 1st, 2009 @ 03:35 PM

    
    var $R = function(start,end) {
    	ret = [];
    	var words = "a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z".split(",");
    	var combinations = [];
    	
    	for(i = 0; i < words.length; i++) {
    		var cur = words[i];
    		for(j = 0; j < words.length; j++)
    			combinations.push( cur + words[j]);
    	}
    							
    	if( Object.isNumber(start ))
    		for( i = start; i < end + 1; i++)
    			ret[ret.length] = i;
    	else
    		for( n = combinations.indexOf(start); n < combinations.indexOf(end) + 1; n++)
    			ret[ret.length] = combinatons[n];
    	
    	return ret;
    }
    

    Combinations is an array with all the possible alphabet letter combinations in the correct order. This will make this $R("ad","ba") to return "ad","ae","af","ag"......."az","ba"

  • Kit Goncharov

    Kit Goncharov April 17th, 2009 @ 02:48 AM

    @Lo0kBeHiNdU The problem with using that method is that it covers too few permutations: uppercase characters, for example are not included. Attempting to create such a range results in either an empty array (i.e., $R('AD', 'BA'), or an unusable range (i.e., $R('aD', 'ba') simply ignores the uppercase character).

    While the method could be potentially extended to include all uppercase characters, performance then begins to suffer. Enumerable#permutations also seems like a good idea although, again, I'm not sure if there is an effective way to implement this in JavaScript without affecting either performance or function in some way. :(

  • Lo0kBeHiNdU

    Lo0kBeHiNdU May 16th, 2009 @ 11:50 AM

    function $R(start,end,filter) {
        var ret = [],
            letters = "A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z,a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z".split(","),
            combinations = [];
        
        if( toString.call(filter) === "[object String]" )
            filter = window[filter];
        
        filter = filter || function(set) { return set; };
    
        for(var i = 0, l = letters.length; i < l; ++i) {
            combinations[combinations.length] = letters[i];
            for(var j = 0; j < l; ++j)
                combinations[combinations.length] = letters[i] + letters[j];
        }
        
        if(filter === window.LETTERS_ONLY)
            combinations = letters;
            
        if( toString.call(start) === "[object Number]" )
            for(var i = start; i < end + 1; ++i)
                ret[ret.length] = i;
        else {
            for(var set = combinations, i = set.indexOf(start); i < set.indexOf(end) + 1; ++i) {
                ret[ret.length] = set[i];
            }
        }
                
        ret = filter(ret);
        
        return ret;
    }
    
    // Allow words whose all letters are uppercased
    function ONLY_UPPER(set) {
        return set.grep(function(l) { return l.split("").every(function(val) { return /[A-Z]/.test(val); }); });
    }
    
    // Allow words whose all letters are lowercased
    function ONLY_LOWER(set) {
        return set.grep(function(l) { return l.split("").every(function(val) { return /[a-z]/.test(val); }); });
    }
    
    // Allow words which at least have a lowercase character
    function LET_LOWER(set) {
        return set.grep(function(l) { return /[a-z]/.test(l); });
    }
    
    // Allow words which at least have an uppercase character
    function LET_UPPER(set) {
        return set.grep(function(l) { return /[A-Z]/.test(l); });
    }
    
    // Get only the alphabet letters from the range
    function LETTERS_ONLY(set) {
        return set;
    }
    

    I think this covers all issues.

    PS: I tested the performace with the max range length "2704" and the time to complete was 129 ms. 1/10 of a second.

  • Samuel Lebeau

    Samuel Lebeau June 1st, 2009 @ 11:40 AM

    • Tag changed from lang, needs_tests, string to lang, patched, string, tested

    @kangax: As your fix was quite a lot of code I decided to try something.

    Here is my take, with missing unit tests.

  • Samuel Lebeau

    Samuel Lebeau June 5th, 2009 @ 03:49 PM

    BTW, the current behavior of String#succ is not so useful at application level.
    Maybe we should consider sticking to its Ruby equivalent : http://github.com/evanphx/rubinius/blob/master/spec/frozen/core/str...

  • Tobie Langel

    Tobie Langel June 5th, 2009 @ 04:20 PM

    Samuel: agreed. However, wouldn't that be extremely costly ?

  • Samuel Lebeau

    Samuel Lebeau June 5th, 2009 @ 04:43 PM

    I guess it would be costly in terms of LOC but not necessarily in terms of computation (Rubinius implementation is 53 lines of Ruby...).

    Currently, assuming String#succ is fixed by one of the patches we submitted with kangax, computing $A($R('aa', 'bb')) for instance is extremely costly and the result is quite useless...

  • Juriy Zaytsev

    Juriy Zaytsev June 5th, 2009 @ 05:00 PM

    Can we agree that Range is not something that should be in Prototype's core? Does anyone really use alpha-based ranges in web development (unless it is a word puzzles web application or something, in which case one would probably need a more specialized solution anyway).

    I don't think it's reasonable to spend time on this. I would rather see Range as a separate, officially supported plugin. Then it could be implemented in a more through way.

    Does this make sense?

  • Samuel Lebeau

    Samuel Lebeau June 5th, 2009 @ 07:15 PM

    Range itself is rather useful in real applications and in the framework itself, and should stay in core IMHO. The String#succ addition I was suggesting may not be indeed.

  • Juriy Zaytsev

    Juriy Zaytsev June 5th, 2009 @ 07:22 PM

    Samuel, I can see usefulness of numeric Ranges, but employing a bunch of code to make $R('aa', 'zz') work seems like an unnecessary burden.

  • Tobie Langel

    Tobie Langel July 24th, 2009 @ 01:59 AM

    • Tag changed from lang, patched, string, tested to patched, section:lang

    [not-tagged:"lang" tagged:"section:lang" bulk edit command]

  • T.J. Crowder

    T.J. Crowder November 16th, 2009 @ 04:50 PM

    • Assigned user cleared.

    [responsible:none bulk edit command]

  • Tobie Langel

    Tobie Langel February 25th, 2010 @ 10:44 PM

    • Tag changed from patched, section:lang to patched, priority:low, section:lang
  • Tobie Langel
  • Tobie Langel
  • Tobie Langel

    Tobie Langel March 1st, 2010 @ 12:20 AM

    • Tag changed from patched, priority:low, section:lang to patched, section:lang
  • Andrew Dupont

    Andrew Dupont October 17th, 2010 @ 10:21 PM

    • Milestone changed from 1.7 to After 1.7
    • Importance changed from “” to “Low”

    Pushing this back one milestone because of its low priority.

Please Sign in or create a free account to add a new ticket.

With your very own profile, you can contribute to projects, track your activity, watch tickets, receive and update tickets through your email and much more.

New-ticket Create new ticket

Create your profile

Help contribute to this project by taking a few moments to create your personal profile. Create your profile ยป

The Prototype JavaScript library.

Shared Ticket Bins

Attachments

Pages