Hey folks,
I'm going over some crash reports for Hammerspoon (so we're talking about macOS here) and I've got several where the crash is happening inside Lua (5.4.2) in a way I can't explain. I would love it if someone has some ideas as to why. Unfortunately, because it's just a crash report uploaded from a user's machine with no real context, I have no idea how to reproduce it.
Firstly, here is the raw crash dump:
Exception Type: EXC_BAD_ACCESS (SIGBUS)
Exception Codes: BUS_NOOP at 0x0000000000000002
Crashed Thread: 0
Application Specific Information:
Exception 1, Code 2, Subcode 8 >
Attempted to dereference garbage pointer 0x2.
Thread 0 Crashed:
0 libsystem_kernel.dylib 0xfffe4076670e read$NOCANCEL
1 libsystem_c.dylib 0xfffe405b4589 __srefill1
2 libsystem_c.dylib 0xfffe405af0a9 __fread
3 libsystem_c.dylib 0xfffe405aef16 fread
4 LuaSkin 0x108dfb84f [inlined] read_all (liolib.c:540)
5 LuaSkin 0x108dfb84f g_read (liolib.c:591)
6 LuaSkin 0x108dfdd1b luaD_precall (ldo.c:519)
7 LuaSkin 0x108df6200 luaV_execute (lvm.c:1615)
8 LuaSkin 0x108dfddb9 ccall (ldo.c:564)
9 LuaSkin 0x108debc18 luai_objcttry (lobjectivec_exceptions.m:84)
10 LuaSkin 0x108dfe26b [inlined] luaD_rawrunprotected (ldo.c:160)
11 LuaSkin 0x108dfe26b luaD_pcall (ldo.c:800)
12 LuaSkin 0x108e0164a lua_pcallk (lapi.c:1031)
13 LuaSkin 0x108de0d1b -[LuaSkin protectedCallAndTraceback:nresults:] (Skin.m:368)
14 internal.so 0x1176427f2 eventtap_callback (internal.m:48)
Starting from the bottom, eventtap_callback is called by the OS when a relevant HID event happens, and it asks our Lua wrapper (LuaSkin) to call a Lua function that the user supplied and that we hold a Lua reference to. That happens in protectedCallAndTraceback (essentially just a lua_pcall() wrapper with debug.traceback on the stack as the message handler).
Then we see the pcall bits, then luai_objcttry, which comes from the only patch we apply to Lua. It is taken from Eric Wing's LuaCocoa and allows Lua errors to fall nicely back into Objective-C land.
Then we see ccall and luaV_execute, and tracing through to luaD_precall at ldo.c:519 we get to:
n = (*f)(L);
and f seemingly points to g_read, since that is the next frame in the backtrace. This is where I get confused, because the declaration of g_read() is:
static int g_read (lua_State *L, FILE *f, int first);
so the second and third parameters aren't being passed any values. I believe calling a function through a pointer of the wrong type like this is undefined behaviour in C, but the "garbage pointer 0x2" suggests to me that clang is passing them as NULL or zero or something along those lines.
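To make the mismatch concrete, here is a minimal sketch of what I mean. It does not use Lua's real headers: state_t, cfunction_t, read_helper, read_wrapper and call_through_pointer are all hypothetical stand-ins I made up for illustration. The pointer type takes a single argument, like the call site above, while the g_read-shaped helper takes three, so as far as I can tell it should only be reachable through a one-argument wrapper that fills in the rest.

```c
#include <stdio.h>

/* Hypothetical stand-in for lua_State; NOT Lua's real definition. */
typedef struct state { int top; } state_t;

/* Same shape as the pointer called at the call site: one parameter. */
typedef int (*cfunction_t)(state_t *L);

/* Three-parameter helper, shaped like g_read. It cannot be stored in a
   cfunction_t without a cast. */
static int read_helper(state_t *L, FILE *f, int first) {
    (void)L;
    /* Return the starting index if a stream was supplied, else -1. */
    return (f != NULL) ? first : -1;
}

/* One-parameter wrapper matching cfunction_t. It supplies the extra
   arguments itself and forwards to the helper. */
static int read_wrapper(state_t *L) {
    return read_helper(L, stdin, 1);
}

/* Mirrors the call site n = (*f)(L): only L is passed. */
static int call_through_pointer(cfunction_t f, state_t *L) {
    return (*f)(L);
}
```

Called this way, the helper always receives a stream and an index from the wrapper, which is why a direct call through the pointer with only L looks wrong to me.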
From there it's not surprising that something is going to crash, but my real question is: how could g_read() ever be the function pointer that Lua is calling? Does that suggest some other kind of memory corruption in my code somewhere, with the fact that it ends up pointing at g_read() being a coincidence? I'm finding that a little hard to believe, because I've had 3 reports of this crash in the last week. Surely this means I'm misreading the backtrace somehow?
So there we are, I am lost and confused, and a little disappointed that this particular crash report didn't come through with a register dump for some reason.
Cheers,
--