lua-users home
lua-l archive

[Date Prev][Date Next][Thread Prev][Thread Next] [Date Index] [Thread Index]


After reviewing the code, I realized that it suffers from a couple of
problems: (1) it doesn't handle the numbers that exceed integer on
your platform (as %d overflows), (2) it doesn't correctly deal with
numbers with decimal points (although nobody would probably expect it
to).

It's possible that for natural sorting none of the two matters, but I
wanted to improve the algorithm and came up with the version that
doesn't seem to have these problems: it can handle numbers with up to
999 digits, handle leading zeroes, and handle digits after a decimal
point. It is only a tad longer than the original one:

function alphanumsort(o)
  local function padnum(d) local t, z, r = string.match(d, "(%.?)(0*)(.+)")
    return #t == 1 and ("%s%02d%03d%s"):format(t, 99-#z, #r, r) --
after decimal point
                    or ("%s%03d%s%02d"):format(t, #r, r, 99-#z) end --
no decimal point
  table.sort(o, function(a,b)
    return tostring(a):gsub("%.?%d+",padnum)
         < tostring(b):gsub("%.?%d+",padnum) end)
  return o
end

If you don't care about sorting 1.01 before 1.1, then you can use a
simpler version:

function alphanumsort(o)
  local function padnum(d) local z, r = string.match(d, "(0*)(.+)")
    return ("%03d%s%02d"):format(#r, r, 99-#z) end
  table.sort(o, function(a,b)
    return tostring(a):gsub("%d+",padnum) < tostring(b):gsub("%d+",padnum) end)
  return o
end

I updated the post with the results of this new script:
http://notebook.kulchenko.com/algorithms/alphanumeric-natural-sorting-for-humans-in-lua.

Paul.

On Sat, Aug 25, 2012 at 1:37 PM, Paul K <paulclinger@yahoo.com> wrote:
> Hi Sergey,
>
>> This won't work in case of "a012" vs "a12". It will think the strings are
>> identical, but they're not. Windows places "a012" first. It's easy to fix by
>> simply returning s1 < s2 in case padded strings are the same. The limitation
>
> I'm not sure if this is by coincidence or not (as it may be just
> treated as the same value, which means that the sort order will be
> arbitrarily picked between 012 and 12), but it should be easy to fix
> the logic if you want a specific order; just add the length:
>
> function alphanumsort(o)
>   local function padnum(d) return
> ("%012d"):format(d)..("%02d"):format(12-#d) end
>   table.sort(o, function(a,b)
>     return tostring(a):gsub("%d+",padnum) < tostring(b):gsub("%d+",padnum) end)
>   return o
> end
>
> Use format(12-#d) if you want 012 before 12 or format(#d) if you want
> 12 before 012 (this is assuming the length is 12 or less). I'll update
> the blog post to reflect this.
>
> Paul.
>
> On Sat, Aug 25, 2012 at 7:52 AM, GrayFace <sergroj@mail.ru> wrote:
>> On 25.08.2012 6:03, Ignacio Burgueño wrote:
>>>
>>> Something like natural sort may suit you, perhaps?
>>>
>>>
>>> http://notebook.kulchenko.com/algorithms/alphanumeric-natural-sorting-for-humans-in-lua
>>
>>
>> This won't work in case of "a012" vs "a12". It will think the strings are
>> identical, but they're not. Windows places "a012" first. It's easy to fix by
>> simply returning s1 < s2 in case padded strings are the same. The limitation
>> of 12 digits is a harder one.
>>
>> Here's an implementation I've just made, I really like it:
>> -- return s1 < s2, but with natural comparison of numbers:"2" < "12"
>> function alnumcmp(s1, s2)  -- return s1 < s2
>>     if s1 == s2 then  return false  end
>>     local n1, n2 = 1, 1
>>     repeat
>>         local i1, i2, c1, c2
>>         i1, c1, n1 = s1:match("0*(%d*)([^%d]*)()", n1)
>>         i2, c2, n2 = s2:match("0*(%d*)([^%d]*)()", n2)
>>         if #i1 ~= #i2 then  return #i1 < #i2  end
>>         if i1 ~= i2 then  return i1 < i2  end
>>         if c1 ~= c2 then  return (n1 > #s1 and c1 or c1.."0") < (n2 > #s2
>> and c2 or c2.."0")  end
>>     until n1 > #s1 and n2 > #s2
>>     return s1 < s2
>> end
>>
>> --
>> Best regards,
>> Sergey Rozhenko                 mailto:sergroj@mail.ru
>>
>>