David Kastrup wrote:
> Oh, actually I was not concerned about large strings so much as about many smaller ones that are repeatedly getting interned into Lua.

most small strings that we deal with are indexes in (huge) font tables and constants; there, hashed strings as well as keys save a lot of space and time, since lots of string comparisons take place and many identical strings occur all over the place (not unlike tex's control sequences, which are also hashed)
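
To make the point about repeated small strings concrete, here is a minimal sketch, assuming stock Lua behaviour (short strings are interned, so equal strings share one internal object and table keys are found by hash). The glyph names and the `count_names` helper are made up for illustration and are not taken from any real font table.

```lua
-- hypothetical glyph names as they might recur in a font table
local input = { "space", "a", "b", "a", "space", "a" }

-- count occurrences; each distinct string ends up as a single key,
-- no matter how often the same name shows up in the input
local function count_names(list)
    local seen = {}
    for _, name in ipairs(list) do
        seen[name] = (seen[name] or 0) + 1
    end
    return seen
end

local seen = count_names(input)
print(seen.a, seen.space, seen.b)
```

Six input strings give only three keys, and every lookup goes through the hash part of the table instead of comparing strings character by character.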

> For some callbacks getting information from TeX, Taco switched the data passed into Lua from strings to integers and reportedly got a speedup of about 10 from that.

indeed, and returning values instead of a table in the associated functions may also speed up the process; but simplicity and staying close to lua's way of doing things is also worth a price

concerning the tokenizer callback (which is what this is about): using numbers instead of strings (be they hashed or simple strings) makes sense anyway, since in tex they are numbers; there are related functions that map them to symbolic names (strings)
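
The two ideas here (numbers with a separate name mapping, and returning values instead of a table) can be sketched as follows. This is only an illustration: the command codes, the `names`/`codes` tables and both helper functions are hypothetical and are not the actual LuaTeX interface.

```lua
-- hypothetical command codes; the real numbers are defined by the engine
local names = { [1] = "relax", [2] = "left_brace", [3] = "right_brace" }

-- reverse map: symbolic name -> number
local codes = {}
for code, name in pairs(names) do
    codes[name] = code
end

-- variant 1: return a fresh table per token (one allocation per call)
local function token_as_table(cmd, chr)
    return { cmd, chr }
end

-- variant 2: return plain values (no table is created)
local function token_as_values(cmd, chr)
    return cmd, chr
end

local t = token_as_table(codes.left_brace, 123)
local cmd, chr = token_as_values(codes.left_brace, 123)

-- comparisons happen on numbers; the name is looked up only when needed
if cmd == codes.left_brace then
    print(names[cmd], chr)
end
```

The table variant is closer to how lua code is usually written, while the multiple-value variant avoids creating a table for every token; which one wins in practice would have to be measured.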
since in lua numbers are floating point numbers, integer base types may speed things up, but since the cpus that tex runs on have fp support it's no big deal either; i'm also pretty sure that this code is rather optimized in lua
concerning tex ... we may lose some time doing some things in lua, but also gain a lot using lua; for instance, replacing the web2c file handling by lua file handling speeds this part up by 50%; in due time we will report on such matters in journals and manuals.

> Not really especially: control sequence names are very much what TeX hashes. There is another hash table for hyphenation exceptions, and the hyphenation trie also works with some sort of hashing, but those hashes are not really interesting to pass into Lua as far as I can see.

eventually hyphenation patterns will be visible to lua too, which is quite interesting in some respects; but that part is under reconstruction at the tex side as well
as said, currently we're satisfied with the way lua works, and optimizations will only be considered when lua itself is drastically revised; in that respect, the experimental lpeg features will have more impact than extensions to the string model

Hans