On 3/25/2012 9:49 AM, David Manura wrote:
> On Thu, Mar 22, 2012 at 2:43 PM, KHMan wrote:
>> (a) load data, parse, dump into two arrays
>> [...]
>> lua-5.2.0          0.261  0.274  0.285
>> lua-5.2.1wk1
>>   shortlen=16      0.180  0.188  0.204
>>   shortlen=32      0.224  0.201  0.220
>
> No difference in word count (wc): http://www.lua.org/pil/21.2.1.html .
> [snip]
Yeah, I don't think there will be a lot of difference for most workloads. I am also curious about how well the random hash seed is working compared to 5.2.0.
The samples I posted are extreme examples that use strings mostly of a particular type (crypto hash hex keys and relative paths) and the large differences in short/long string usage are amplified through reps. Mainly, I wanted to get a feel of the tradeoffs... though I rather like shortlen=64.
Would be nice to see actual apps that use extreme levels of long string lookups or longstr==longstr compares, but they may be a rarity. On the other hand, for long-running apps, slightly faster loading of text due to less string interning does not matter that much.
Intern-on-first-compare is not easy to implement; there are a lot of complications. A quick-and-dirty attempt at using long string hash values to avoid extra memcmp() work didn't really help (a few % slower than default lua-5.2.1wk1) because in the examples I am using, most compares (76%) are equal, which means a matching hash still has to be confirmed with memcmp(). Short string compares are still much faster.
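To illustrate the hash pre-check idea, here is a rough sketch in C. It is not the actual patch and not Lua's real string layout: the LongStr struct, the FNV-1a hash, and the memcmp_calls counter are all made up for the example. The point is only that the hash can reject unequal strings early, while equal strings still fall through to memcmp().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical long-string record; NOT Lua's actual TString layout. */
typedef struct {
    const char *data;
    size_t      len;
    uint32_t    hash;      /* cached hash, valid only when has_hash != 0 */
    int         has_hash;
} LongStr;

/* FNV-1a, used here only as a stand-in hash (an assumption). */
static uint32_t fnv1a(const char *s, size_t len) {
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)s[i];
        h *= 16777619u;
    }
    return h;
}

/* Compute the hash lazily on first use and cache it in the record. */
static uint32_t longstr_hash(LongStr *s) {
    if (!s->has_hash) {
        s->hash = fnv1a(s->data, s->len);
        s->has_hash = 1;
    }
    return s->hash;
}

/* Counts how often the full byte compare was actually needed. */
static unsigned long memcmp_calls = 0;

/* Equality test: identity, then length, then hash, then memcmp(). */
static int longstr_equal(LongStr *a, LongStr *b) {
    if (a == b) return 1;                              /* same object */
    if (a->len != b->len) return 0;                    /* cheap reject */
    if (longstr_hash(a) != longstr_hash(b)) return 0;  /* hash reject */
    memcmp_calls++;                                    /* hashes match: must confirm */
    return memcmp(a->data, b->data, a->len) == 0;
}
```

When most compares turn out equal, the hash check rarely rejects anything, so nearly every call still ends in memcmp(), and the first compare of each string also pays for a full hashing pass. That is consistent with the small slowdown I saw, though the sketch itself proves nothing about timing.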
--
Cheers,
Kein-Hong Man (esq.)
Kuala Lumpur, Malaysia