- Subject: Re: [ANN] lua-pb Lua Protocol Buffers
- From: "Robert G. Jakabosky" <bobby@...>
- Date: Fri, 24 Jun 2011 22:03:28 -0700
On Friday 24, Josh Haberman wrote:
> Robert G. Jakabosky <bobby <at> sharedrealm.com> writes:
> > -- person.proto:
> > message Person {
> >
> > required string name = 1;
> > required int32 id = 2;
> > required string email = 3;
> >
> > }
> >
> > -- example_person.lua:
> > require"pb" -- first load lua-pb
> > require"person" -- load the above .proto file
> > msg = person.Person() -- create a Person message
>
> I'm all for magic where there's a benefit, but what is
> the benefit over:
No real benefit, I just thought it would be cool.
> person = pb.load("person.proto")
This is still supported as:
person = pb.require"person"
or as an embedded .proto:
person = pb.load_proto([[
message Person { ... }
]])
>
> This seems clearer to me.
>
> If you're going the magical route, I think it makes more
> sense to load things into the top-level namespace, based
> on the package name specified in the .proto file:
>
> // person.proto
> package foo.bar
> message Person { ... }
>
> -- test.lua
> require "person"
> msg = foo.bar.Person()
Yes, this seems like a good idea. If the .proto has a defined package name,
then that will be used instead of what is passed to require().
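
A minimal sketch of how a dotted package name could be mapped onto nested Lua
tables. The helper name get_namespace and the stand-in table env are
assumptions for illustration; this is not lua-pb's actual loader code.

```lua
-- Hypothetical helper (not part of lua-pb): walk a dotted package name and
-- create any missing tables along the way, starting from `root`.
local function get_namespace(root, package_name)
  local ns = root
  for part in package_name:gmatch("[^.]+") do
    if ns[part] == nil then
      ns[part] = {}
    end
    ns = ns[part]
  end
  return ns
end

-- `env` stands in for _G so the example has no global side effects.
local env = {}
local ns = get_namespace(env, "foo.bar")

-- Placeholder constructor; a real loader would build this from the .proto.
ns.Person = function()
  return { name = "", id = 0, email = "" }
end

local msg = env.foo.bar.Person()
```

Calling get_namespace again with the same package name returns the existing
table, so several .proto files could share one package.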
> > I am planning on adding these methods to the message interface:
> > msg:MergeFrom(msg1)
> > msg:CopyFrom(msg1)
> > msg:Clear()
> > msg:IsInitialized()
> > msg:MergeFromString(str)
> > msg:ParseFromString(str)
> > msg:SerializeToString()
> > msg:SerializePartialToString()
> > msg:ByteSize()
>
> I'm thinking more and more that it makes sense to separate
> the in-memory representation of a protobuf from its
> serializations (text format, binary format, JSON, etc),
> both code-wise and API-wise.
>
> I have plans to write a C-based protobuf extension for
> Lua, and my plans were to have something like:
I am very interested to see how you implement the interface to nested C data
structures. I have done this for a private project and it is an interesting
problem (reference counting, nested structure, arrays of structures).
> -- These are as you mentioned, because they are not
> -- specific to any one serialization format.
> msg:Clear()
> msg:IsInitialized()
> msg:CopyFrom(msg1)
This one should be included too:
msg:MergeFrom(msg1) -- CopyFrom() clears the message first.
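
To illustrate the difference, here is a rough sketch using plain tables of
scalar fields. The function names are stand-ins, not the lua-pb methods, and
repeated or nested fields (which a real MergeFrom has to handle) are ignored.

```lua
-- Remove every field from the message table.
local function clear(msg)
  for k in pairs(msg) do
    msg[k] = nil  -- clearing existing fields during traversal is allowed
  end
end

-- Copy every field set in `other` into `msg`, keeping msg's other fields.
local function merge_from(msg, other)
  for k, v in pairs(other) do
    msg[k] = v
  end
end

-- CopyFrom is Clear followed by MergeFrom.
local function copy_from(msg, other)
  clear(msg)
  merge_from(msg, other)
end

local a = { name = "Alice", id = 1 }
local b = { email = "bob@example.com", id = 2 }
merge_from(a, b)  -- a keeps its name, gains the email, and id is overwritten

local c = { name = "Carol", id = 3 }
copy_from(c, b)   -- c loses its name and ends up with only b's fields
```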
> -- These are specific to one or more serializations:
> pb.Serialize(msg)
> pb.SerializeText(msg)
> pb.SerializeJSON(msg)
> msg = pb.Parse(str)
> msg = pb.ParseText(str)
> msg = pb.ParseJSON(str)
Why not pass the format as a parameter?
msg:Serialize(format, ...)
msg:SerializePartial(format, ...) -- this doesn't check required fields.
msg:Parse(str, format, ...)
and use the binary format as the default if 'format' is not provided. New
formats can be 'registered' with the library.
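
A small sketch of what such a registry might look like. The names
register_format and serialize, and the shape of the codec table, are
assumptions; the two toy codecs below do not produce real protobuf output.

```lua
local formats = {}

-- Register a codec table under a format name.
local function register_format(name, codec)
  formats[name] = codec
end

-- Serialize with the named format, falling back to "binary" when omitted.
local function serialize(msg, format, ...)
  local codec = formats[format or "binary"]
  if not codec then
    error("unknown format: " .. tostring(format), 2)
  end
  return codec.encode(msg, ...)
end

-- Toy codecs for illustration only.
register_format("binary", {
  encode = function(msg) return "<binary:" .. msg.name .. ">" end,
})
register_format("text", {
  encode = function(msg) return 'name: "' .. msg.name .. '"' end,
})

local out_default = serialize({ name = "x" })
local out_text = serialize({ name = "x" }, "text")
```

Extra arguments after the format are passed through to the codec, which
leaves room for per-format options.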
> Don't worry, I don't plan to use the "pb" namespace --
> I'm planning to put everything under "upb", since that's
> the name of my project:
> https://github.com/haberman/upb/wiki
I have been keeping an eye on your 'upb' project for a long time (I think for
more than a year) waiting for it to become usable. I still can't wait to see
it finished. One of the reasons I started lua-pb is that I wanted to see how
close LuaJIT could get to the speed of your JIT'ed decoder (haven't optimized
the project for LuaJIT yet, so it is not even close right now).
> Some other things to consider:
>
> - do you plan to allow reparenting of nested messages? eg.
> msg.foo = Foo(). I ask because you say you're emulating
> Python proto, which does not allow this AFAIK and instead
> uses the C++ convention of: msg.mutable_foo. I've always
> thought this was awkward for a dynamic language, so plan
> to allow reparenting, but in that case you have to watch
> out for cycles that the user may create.
I don't plan on emulating everything from the Python/C++ interface, and
msg.mutable_foo is something I don't want to do.
I was planning on allowing a message to be referenced by multiple parent
messages, but restricting messages to one parent will allow invalidating the
cached "byte size" of the parent message when a field is changed. Maybe I
will just add a "msg:Duplicate()" method.
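
A rough sketch of the single-parent idea, assuming a toy Node type whose
"byte size" is just the length of each field's string form. None of these
names come from lua-pb, and the sketch does not guard against cycles.

```lua
local Node = {}
Node.__index = Node

local function new_node()
  return setmetatable({ fields = {} }, Node)
end

-- Set a field; a child message may only ever be attached to one parent.
function Node:set(name, value)
  if getmetatable(value) == Node then
    assert(value.parent == nil, "message already has a parent")
    value.parent = self
  end
  self.fields[name] = value
  -- Invalidate the cached size on this message and every ancestor.
  local n = self
  while n do
    n.cached_size = nil
    n = n.parent
  end
end

-- Compute the (toy) size once and cache it until a field changes.
function Node:byte_size()
  if self.cached_size == nil then
    local size = 0
    for _, v in pairs(self.fields) do
      if getmetatable(v) == Node then
        size = size + v:byte_size()
      else
        size = size + #tostring(v)
      end
    end
    self.cached_size = size
  end
  return self.cached_size
end

local root, child = new_node(), new_node()
root:set("child", child)
child:set("name", "ab")
local size_before = root:byte_size()
child:set("name", "abcd")  -- clears the cache on child and on root
local size_after = root:byte_size()
```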
As for cycles, for now I will just let Lua throw a stack overflow error.
Cyclic references can only be made by the programmer using the Lua interface;
a decoded message can't have cyclic references. Later I might add a max depth
setting for the encoder/decoder.
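
A minimal sketch of such a depth limit, assuming messages are plain nested
tables. The function check_depth and the limit of 64 are made up for the
example.

```lua
-- Walk nested tables and raise an error once the nesting gets too deep.
-- A cyclic message hits the limit instead of overflowing the stack.
local function check_depth(msg, max_depth, depth)
  depth = depth or 1
  if depth > max_depth then
    error("message nesting exceeds max depth of " .. max_depth)
  end
  for _, v in pairs(msg) do
    if type(v) == "table" then
      check_depth(v, max_depth, depth + 1)
    end
  end
end

local nested = { child = { child = {} } }
local cyclic = {}
cyclic.self = cyclic

local nested_ok = pcall(check_depth, nested, 64)
local cyclic_ok, cyclic_err = pcall(check_depth, cyclic, 64)
```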
> - watch out for 64-bit integers, which lua_Number can't
> fully represent in its default configuration (since
> a double only has 52 bits of precision. Probably the best
> you can do is just warn your users about the loss of
> precision.
Yup, right now lua-pb doesn't round-trip large 64-bit integers (i.e. precision
is lost). I am not sure what would be the best way to handle this in standard
Lua. Even if a message internally stored 64-bit integers as binary 8-byte
strings (packing/unpacking every time the field is accessed), Lua wouldn't be
able to work on the full value. I should at least add support for preserving
64-bit integer fields that are not changed by the Lua code.
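
A small sketch of both points: the precision loss with double-valued numbers,
and one possible way to preserve untouched fields by keeping the raw bytes
next to the approximate number. The field layout and the `pack` callback are
hypothetical, not lua-pb's representation.

```lua
-- With double-precision numbers, 2^53 + 1 cannot be represented exactly.
local big = 2^53
local lost_precision = (big + 1 == big)

-- Keep the original 8 raw bytes alongside the (possibly rounded) number.
local function new_int64_field(raw_bytes, approx_value)
  return { raw = raw_bytes, value = approx_value, dirty = false }
end

-- Changing the value from Lua marks the raw bytes as stale.
local function set_value(field, n)
  field.value = n
  field.dirty = true
end

-- Re-emit the raw bytes unless Lua code changed the value.
-- `pack` is a caller-supplied function that turns a number into 8 bytes.
local function encode_int64(field, pack)
  if not field.dirty then
    return field.raw
  end
  return pack(field.value)
end

local raw = "\1\0\0\0\0\0\32\0"  -- example 8-byte payload
local field = new_int64_field(raw, 2^53)
local untouched = encode_int64(field, function() return "repacked" end)
```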
--
Robert G. Jakabosky