LuaPlus Callback Dispatcher 1.00

Written by Joshua Jensen (jjensen@workspacewhiz.com)

http://wwhiz.com/LuaPlus/index.html


LuaPlus contains many useful Lua enhancements.  In the past, the advanced callback dispatching to C++ functions provided by LuaPlus was only available as part of the LuaPlus distribution.  This release marks the first version of the LuaPlus Callback Dispatcher that works directly with the original Lua 5 code base.

Callback Dispatching

Typical C-style Lua callbacks come in the form:

int Callback(lua_State* state)

Callbacks must be static or global.  They cannot be C++ class member functions.  For C++ users, this approach is limiting, as the programmer has to write the non-member callback and then dispatch through to the C++ member function that actually does the work:

int TheGlobalCallback(lua_State* L)
{
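    // Retrieve the class's 'this' pointer, usually through an upvalue, but possibly as the calling userdata or table.
    // Here it is assumed to have been stored as a light userdata in the first upvalue.
    MyClass* myClass = (MyClass*)lua_touserdata(L, lua_upvalueindex(1));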
    return myClass->TheMemberCallback(L);
}

int MyClass::TheMemberCallback(lua_State* L)
{
    const char* str = lua_tostring(L, 1);
    printf("%s\n", str);
    return 0;
}

The LuaPlus Callback Dispatcher provides a means whereby C++ member functions, either virtual or non-virtual, may be called directly.  These "functor" objects are even capable of calling global and static functions, just as before.  In the example above, TheMemberCallback() can be registered and called directly, without the need for TheGlobalCallback to exist.

Additionally, the LuaPlus Callback Dispatcher allows direct calling of C or C++ functions that know nothing about a lua_State.  The TheMemberCallback() function above could be represented more naturally:

void MyClass::TheDirectMemberCallback(const char* str)
{
    printf("%s\n", str);
}

In an ideal environment, the callback shim would automatically be generated for us.  With some C++ template metaprogramming, we are able to accomplish this with ease.  Best of all, the performance of the call to either style of callback function is as high or higher than if C-style methodologies had been employed to accomplish the same feat.

Finally, the LuaPlus Callback Dispatcher can call the same member functions with differing 'this' pointers.
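
The idea of one member function serving several 'this' pointers can be seen in plain C++ before any Lua is involved.  The following is a minimal sketch only; the Counter class and the CallOn() helper are hypothetical names invented for this illustration and are not part of the LuaPlus Callback Dispatcher.

```cpp
#include <cassert>

// Hypothetical class used only for this illustration.
class Counter
{
public:
    explicit Counter(int start) : m_value(start) {}
    int Next() { return ++m_value; }

private:
    int m_value;
};

// One member function pointer, invoked on whichever object is supplied.
inline int CallOn(Counter& target, int (Counter::*func)())
{
    return (target.*func)();
}
```

The same pointer, &Counter::Next, can be handed to CallOn() together with any Counter instance; only the object changes between calls.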

Basic Callbacks

There are few differences between pushing a standard Lua C-closure to the stack and pushing a functor to the Lua stack.  The overloaded function lua_pushfunctorclosure() is used to push functor closures for global/static functions, non-virtual member functions, and virtual member functions. 

An example follows:

static int LS_LOG(lua_State* L)
{
    printf("In static function\n");
    return 0;
}


class Logger
{
public:
    int LS_LOGMEMBER(lua_State* L)
    {
        printf("In member function.  Message: %s\n", lua_tostring(L, 1));
        return 0;
    }

    virtual int LS_LOGVIRTUAL(lua_State* L)
    {
        printf("In virtual member function\n");
        return 0;
    }
};

lua_pushstring(L, "LOG");
lua_pushfunctorclosure(L, LS_LOG, 0);
lua_settable(L, LUA_GLOBALSINDEX);

lua_dostring(L, "LOG()");

Logger logger;
lua_pushstring(L, "LOGMEMBER");
lua_pushfunctorclosure(L, logger, &Logger::LS_LOGMEMBER, 0);
lua_settable(L, LUA_GLOBALSINDEX);

lua_dostring(L, "LOGMEMBER('The message')");

lua_pushstring(L, "LOGVIRTUAL");
lua_pushfunctorclosure(L, logger, &Logger::LS_LOGVIRTUAL, 0);
lua_settable(L, LUA_GLOBALSINDEX);

lua_dostring(L, "LOGVIRTUAL()");

Object Dispatch Functors

Even though lua_pushfunctorclosure() can dispatch to C++ member functions, it uses a 'this' pointer as provided by the second argument passed to the function.  The 'this' pointer is constant, and lua_pushfunctorclosure() is not suited for mirroring class hierarchies in Lua.

The solution to the 'this' pointer issue is through lua_pushobjectfunctorclosure().  It is a specialized form of lua_pushfunctorclosure() where a 'this' pointer isn't provided during the closure registration.  Instead, it is retrieved from either the calling userdata or the calling table's __object member, which must be a full or light userdata.

As an example, we want to mirror a class called MultiObject:

class MultiObject
{
public:
    MultiObject(int num) :
        m_num(num)
    {
    }

    int Print(lua_State* state)
    {
        printf("%d\n", m_num);
        return 0;
    }

    void Print2(int num)
    {
        printf("%d %d\n", m_num, num);
    }

protected:
    int m_num;
};

The best way to implement C++ objects mirrored in Lua is through metatables.  We'll start by creating a metatable and adding the MultiObject::Print() function to it.

// Create the metatable.
lua_newtable(L);
lua_pushstring(L, "__index");
lua_pushvalue(L, -2);
lua_settable(L, -3);

// Add the Print function.
lua_pushstring(L, "Print");
lua_pushobjectfunctorclosure(L, &MultiObject::Print, 0);
lua_settable(L, -3);

Now, we'll give two C++ objects representations in Lua called obj1 and obj2.  Each is a boxed-pointer userdata whose metatable is set to the metatable we created above:

MultiObject obj1(10);
lua_pushstring(L, "obj1");
lua_boxpointer(L, &obj1);
lua_pushvalue(L, -3);
lua_setmetatable(L, -2);
lua_settable(L, LUA_GLOBALSINDEX);

MultiObject obj2(20);
lua_pushstring(L, "obj2");
lua_boxpointer(L, &obj2);
lua_pushvalue(L, -3);
lua_setmetatable(L, -2);
lua_settable(L, LUA_GLOBALSINDEX);

Everything is set up in Lua to handle proper dispatching now.  To illustrate, a few lua_dostring() calls will dispatch to the correct objects:

lua_dostring(L, "obj1:Print()");
lua_dostring(L, "obj2:Print()");

obj1 and obj2 were both created as userdata objects with metatables.  The other approach involves assigning a full or light userdata representing the C++ object to a table's __object member.

lua_pushstring(L, "table1");
lua_newtable(L);
lua_pushstring(L, "__object");
lua_pushlightuserdata(L, &obj1);
lua_settable(L, -3);
lua_pushvalue(L, -3);
lua_setmetatable(L, -2);
lua_settable(L, LUA_GLOBALSINDEX);

lua_pushstring(L, "table2");
lua_newtable(L);
lua_pushstring(L, "__object");
lua_pushlightuserdata(L, &obj2);
lua_settable(L, -3);
lua_pushvalue(L, -3);
lua_setmetatable(L, -2);
lua_settable(L, LUA_GLOBALSINDEX);

lua_dostring(L, "table1:Print()");
lua_dostring(L, "table2:Print()");

Above, two Lua tables called table1 and table2 are created and their __object members are assigned to the C++ obj1 and obj2 objects respectively.  After the assignments are done, two lua_dostring() calls are run to illustrate the correct callback dispatching.

Direct Calling of C++ Functions

The LuaPlus Callback Dispatcher supports direct registration of C++ functions through the overloaded function lua_pushdirectclosure().  lua_pushdirectclosure() is capable of registering global/static functions, non-virtual member functions, and virtual member functions.

A simple example follows:

float Add(float num1, float num2)
{
    return num1 + num2;
}

lua_pushstring(L, "Add");
lua_pushdirectclosure(L, Add, 0);
lua_settable(L, LUA_GLOBALSINDEX);

lua_dostring(L, "print(Add(10, 5))");

Any functions registered in this fashion automatically receive built-in type checking for the incoming arguments.  Each argument is checked with luaL_argassert; if an argument is not valid, luaL_argerror() is called.  For instance, in the above example, if Add was called with a non-numeric string, there would be a failure.

lua_dostring(L, "print(Add(10, 'Hello'))");  // Error!

Just as global functions can be registered, member functions can be registered, also.

void LOG(const char* message)
{
    printf("In global function: %s\n", message);
}


class Logger
{
public:
    void LOGMEMBER(const char* message)
    {
        printf("In member function: %s\n", message);
    }

    virtual void LOGVIRTUAL(const char* message)
    {
        printf("In virtual member function: %s\n", message);
    }
};


lua_pushstring(L, "LOG");
lua_pushdirectclosure(L, LOG, 0);
lua_settable(L, LUA_GLOBALSINDEX);

Logger logger;
lua_pushstring(L, "LOGMEMBER");
lua_pushdirectclosure(L, logger, &Logger::LOGMEMBER, 0);
lua_settable(L, LUA_GLOBALSINDEX);

lua_pushstring(L, "LOGVIRTUAL");
lua_pushdirectclosure(L, logger, &Logger::LOGVIRTUAL, 0);
lua_settable(L, LUA_GLOBALSINDEX);

lua_dostring(L, "LOG('Hello')");
lua_dostring(L, "LOGMEMBER('Hello')");
lua_dostring(L, "LOGVIRTUAL('Hello')");

The implementation built into the LuaPlus Callback Dispatcher supports up to 7 parameters in the registered function.

Note: This direct registration of C++ functions is based around techniques presented in LuaBind (http://luabind.sourceforge.net/).  LuaBind provides a much more powerful function dispatching mechanism, including inheritance for classes.  It also inherits the baggage of both STL and Boost and runs slower than the LuaPlus Callback Dispatcher due to its ability to handle class hierarchies and function overloading.  Depending on your needs, you may find LuaBind a much more suitable library to use.

Object Dispatch to Directly Called C++ Member Functions

Even though lua_pushdirectclosure() can dispatch to C++ member functions, it uses a 'this' pointer as provided by the second argument passed to the function.  The 'this' pointer is constant, and lua_pushdirectclosure() is not suited for mirroring class hierarchies in Lua.

The solution to the 'this' pointer issue is through lua_pushobjectdirectclosure().  It is a specialized form of lua_pushdirectclosure() where a 'this' pointer isn't provided during the closure registration.  Instead, it is retrieved from either the calling userdata or the calling table's __object member, which must be a full or light userdata.  The techniques presented in this section mirror closely the lua_pushobjectfunctorclosure() description above.

Using the above MultiObject sample, we'll add support for a directly called C++ member function to the metatable.

// Add the Print2 direct function.
lua_pushstring(L, "Print2");
lua_pushobjectdirectclosure(L, (MultiObject*)0, &MultiObject::Print2, 0);
lua_settable(L, -3);

lua_pushobjectdirectclosure() has a slightly strange syntax in the second argument.  The null pointer cast is there only so that the template expansion of the Callee object type is correct.  It is possible to retrieve the Callee type from the third argument, but the current version of the LuaPlus Callback Dispatcher does not do so.  A future version may.

All that needs to be done at this point is to make some calls to lua_dostring():

lua_dostring(L, "obj1:Print2(5)");
lua_dostring(L, "obj2:Print2(15)");
lua_dostring(L, "table1:Print2(5)");
lua_dostring(L, "table2:Print2(15)");

Functor Technical Design

Before we can go much further into this technique, it is important to understand how an arbitrary function pointer to a C global or static function or a C++ member function (either virtual or non-virtual) is stored.  Past experience with C function pointers tells us the address of a function is the same size as a pointer (4 bytes on 32-bit hardware).  Member function pointers consume anywhere from 8 to 16 bytes on the same hardware, depending on the virtual-ness of the base class.  Under C++'s strict type system, these function pointers can't easily be mixed and matched, so a simple typedef won't do.  Some C++ compilers have support for delegates built in, but this support is based around compiler extensions operating outside the realm of the C++ standard.  We can solve this problem generically and in a cross-platform way by employing templates to handle the varying types of function pointers the user may request.
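
The size differences described above can be measured directly.  The following is a minimal sketch; the sample classes, the typedef names and MeasurePointerSizes() are all hypothetical and exist only for this illustration.  The actual numbers depend on the compiler and platform, so none are quoted here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sample classes, used only to form pointer types.
struct Plain { int Method(int) { return 0; } };
struct VirtualBase { virtual ~VirtualBase() {} };
struct Derived : virtual VirtualBase { int Method(int) { return 0; } };

typedef int (*FreeFunction)(int);              // ordinary C function pointer
typedef int (Plain::*PlainMember)(int);        // member of a simple class
typedef int (Derived::*VirtualMember)(int);    // member of a class with a virtual base

struct PointerSizes
{
    std::size_t freeFunction;
    std::size_t plainMember;
    std::size_t virtualMember;
};

// Collect the three sizes so they can be printed or compared.
inline PointerSizes MeasurePointerSizes()
{
    PointerSizes sizes = { sizeof(FreeFunction), sizeof(PlainMember), sizeof(VirtualMember) };
    return sizes;
}
```

Because the three pointer types are distinct and may differ in size, no single typedef can hold all of them, which is why the dispatcher is templated on the function pointer type.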

To limit the number of cases we need to examine, we're simply going to assume we're registering a member function.  The global/static function case is even simpler, and if the member function registration is understood, the simpler case will be easily understood.

First, we need the storage space for the callback.  This storage space is obtained via lua_newuserdata().  When calling a member function, its size is the size of the Callee (the object the member function is called on, which supplies the 'this' pointer) plus the size of the passed in function pointer; as the code below shows, sizeof(Callee) bytes are reserved.  Both the Callee (callee) and the function pointer (func) are passed into lua_pushfunctorclosure().

template <typename Callee>
inline void lua_pushfunctorclosure(lua_State* L, Callee& callee, int (Callee::*func)(lua_State*), unsigned int nupvalues)
{
    unsigned char* buffer = (unsigned char*)lua_newuserdata(L, sizeof(Callee) + sizeof(func));

Now that there is space to store the function pointer, we need a way to generically insert the function calling data into buffer.

    memcpy(buffer, &callee, sizeof(Callee));
    memcpy(buffer + sizeof(Callee), &func, sizeof(func));

Finally, after storing down the function calling information, a standard Lua C closure is created, with the buffer userdata as an upvalue.

    lua_pushcclosure(L, LPCD::lua_StateMemberDispatcher<Callee>, nupvalues + 1);
}

When lua_pushfunctorclosure() is called, it creates a standard C closure pointing at a global templatized C function LPCD::lua_StateMemberDispatcher<Callee>.  A userdata upvalue comprised of sizeof(Callee) + sizeof(func) bytes is stored with the closure.  The userdata buffer contains the copied Callee followed by the function callback information itself.

LPCD::lua_StateMemberDispatcher<> is declared as follows:

template <typename Callee>
inline int lua_StateMemberDispatcher(lua_State* L)

It is perfectly acceptable to register global C template functions in Lua.  As can be seen above, the function signature for lua_StateMemberDispatcher matches with Lua's expectation of an int (*)(lua_State*) signature. 

For simplicity, a helper typedef called Functor is created:

{
    typedef int (Callee::*Functor)(lua_State*);    // Helper typedef.
    unsigned char* buffer = GetFirstUpValueAsUserData(L);
    Callee& callee = *(Callee*)buffer;
    Functor& f = *(Functor*)(buffer + sizeof(Callee));
    return (callee.*f)(L);
}

Removed from this version: GetFirstUpValueAsUserData() is a helper function coming in two implementations.  The first uses standard Lua functionality and is slower than the LPCD_FAST_DISPATCH version.  The LPCD_FAST_DISPATCH version requires an additional .cpp file that specifically returns a userdata as the upvalue.  No checking is performed, and it is very fast.  The default version does not use LPCD_FAST_DISPATCH and calls lua_touserdata(L, -1) directly.  The returned value is the functor buffer created from lua_pushfunctorclosure().

Both the callee and f local variables are just aliases into the buffer to make the function calling more manageable (it is very difficult to get right and understand without them).  In the case of the object functor dispatch, callee isn't a reference into the buffer, but a retrieval of the self pointer.
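
The pack-then-dispatch idea can be tried without Lua at all.  The following is a minimal sketch under stated assumptions: FakeState is a hypothetical stand-in for lua_State, a std::vector stands in for the userdata buffer, and PackFunctor(), DispatchFunctor() and Reporter are invented names.  Unlike the code above, this sketch stores a pointer to the callee rather than a copy of it, and it copies the values back out with memcpy instead of casting into the buffer.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

struct FakeState { int lastValue; };   // hypothetical stand-in for lua_State

class Reporter
{
public:
    explicit Reporter(int id) : m_id(id) {}
    int Report(FakeState* state) { state->lastValue = m_id; return 0; }

private:
    int m_id;
};

// Pack a Callee pointer followed by the member function pointer into raw bytes.
template <typename Callee>
std::vector<unsigned char> PackFunctor(Callee* callee, int (Callee::*func)(FakeState*))
{
    std::vector<unsigned char> buffer(sizeof(callee) + sizeof(func));
    std::memcpy(&buffer[0], &callee, sizeof(callee));
    std::memcpy(&buffer[0] + sizeof(callee), &func, sizeof(func));
    return buffer;
}

// Unpack the two pieces again and make the member function call.
template <typename Callee>
int DispatchFunctor(const std::vector<unsigned char>& buffer, FakeState* state)
{
    typedef int (Callee::*Functor)(FakeState*);   // Helper typedef.
    Callee* callee;
    Functor func;
    std::memcpy(&callee, &buffer[0], sizeof(callee));
    std::memcpy(&func, &buffer[0] + sizeof(callee), sizeof(func));
    return (callee->*func)(state);
}
```

The dispatcher only needs the Callee type as a template argument; everything else it requires travels in the buffer, which is the role the userdata upvalue plays in the real closure.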

Template Trickery

lua_pushdirectclosure() is identical to the lua_pushfunctorclosure() description above, with the exception of the registered C function.  Instead of lua_StateMemberDispatcher being the called C function, LPCD::DirectCallMemberDispatcherHelper<Callee, F>::DirectCallMemberDispatcher is used.

Let's study DirectCallMemberDispatcher in detail:

template <typename Callee, typename F>
class DirectCallMemberDispatcherHelper
{
public:
    static inline int DirectCallMemberDispatcher(lua_State* L)
    {
        unsigned char* buffer = GetFirstUpValueAsUserData(L);
        return Call(*(Callee*)buffer, *(F*)(buffer + sizeof(Callee)), L, 1);
    }
};

A simpler implementation of the above combination class and member function would be a template function.  Unfortunately, many C++ compiler implementations do not support template functions well (such as Visual C++ 6).  Therefore, the more verbose implementation is used.

Just as in the lua_StateMemberDispatcher above, DirectCallMemberDispatcher retrieves the first upvalue as userdata.  The functor information, both callee and function data, is passed to a function called Call().

For our purposes, let's examine LOGMEMBER() above.  If the dispatcher for LOGMEMBER() is expanded:

int DirectCallMemberDispatcherHelper<Logger, void (Logger::*)(const char*)>::DirectCallMemberDispatcher(lua_State* L)
{
    unsigned char* buffer = GetFirstUpValueAsUserData(L);
    return Call(*(Logger*)buffer, *(void (Logger::**)(const char*))(buffer + sizeof(Logger)), L, 1);
}

We can see the function calling information data in buffer is cast out to the same function signature describing Logger::LOGMEMBER().  This same behavior is performed for any other function signature.

Even though DirectCallMemberDispatcher() is a static member function of a class template, it is still possible to take the address of it as we would any other C function.
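
Taking the address of a static member function of a class template can be sketched in isolation.  This is an illustration only; FakeState, CFunction, DispatcherHelper and the two getter functions are hypothetical names and do not come from the library.

```cpp
#include <cassert>

struct FakeState { int calls; };          // hypothetical stand-in for lua_State
typedef int (*CFunction)(FakeState*);     // same shape as int (*)(lua_State*)

template <typename Tag>
class DispatcherHelper
{
public:
    // Each instantiation of the class yields its own ordinary function.
    static int Dispatch(FakeState* state)
    {
        ++state->calls;
        return static_cast<int>(sizeof(Tag));
    }
};

// The address of each instantiation converts to a plain function pointer.
inline CFunction GetCharDispatcher()   { return &DispatcherHelper<char>::Dispatch; }
inline CFunction GetDoubleDispatcher() { return &DispatcherHelper<double>::Dispatch; }
```

Each set of template arguments produces a separate function with the same signature, so any of them can be stored wherever a plain function pointer of that signature is expected.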

You're probably wondering why we need both a pointer to the DirectCallMemberDispatcher<> and a functor.  This will become clear below.

What is Call()?

Call() is a templated overloaded function.  To keep the discussion simple, we'll only discuss the 0 and 1 parameter versions of the Call() template function.

template <typename Callee, typename RT>
int Call(Callee& callee, RT (Callee::*func)(), lua_State* L, int index)
{
    return ReturnSpecialization<RT>::Call(callee, func, L, index);
}


template <typename Callee, typename RT, typename P1>
int Call(Callee& callee, RT (Callee::*func)(P1), lua_State* L, int index)
{
    return ReturnSpecialization<RT>::Call(callee, func, L, index);
}

Notice the second argument of the Call() function, func.  func looks very much like a pointer to a function, and in fact, it is: a pointer to a member function of Callee.  The return type of func is deduced by the compiler and given the template typename RT.  Using the LOGMEMBER() example above, which has a return type of void, the type of RT would be void also.

In the second Call() function above, an additional parameter is handled by the function pointer func.  Using LOGMEMBER() again as our example, typename P1 would be const char*.
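
The deduction of Callee, RT and P1 from a member function pointer can be observed directly.  The following is a minimal sketch assuming a C++11 compiler (for <type_traits>); Signature, Inspect() and the Count() member are hypothetical names added for the illustration, and the Logger class here is a cut-down copy of the one above.

```cpp
#include <cassert>
#include <type_traits>

struct Signature
{
    int parameterCount;
    bool returnsVoid;
    bool firstIsCString;
};

class Logger
{
public:
    void LOGMEMBER(const char*) {}
    int Count() { return 0; }          // hypothetical extra member
};

// Zero-parameter overload: Callee and RT are deduced from the pointer type.
template <typename Callee, typename RT>
Signature Inspect(RT (Callee::*)())
{
    Signature s = { 0, std::is_void<RT>::value, false };
    return s;
}

// One-parameter overload: Callee, RT and P1 are all deduced.
template <typename Callee, typename RT, typename P1>
Signature Inspect(RT (Callee::*)(P1))
{
    Signature s = { 1, std::is_void<RT>::value, std::is_same<P1, const char*>::value };
    return s;
}
```

Overload resolution picks the version whose parameter list matches, which is the same mechanism that selects among the Call() overloads.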

So, it still feels like we haven't got anywhere.  Why do we need this Call() function at all?  And what is this secondary dispatch to ReturnSpecialization<RT>::Call()?

Let's start with ReturnSpecialization<RT>::Call().

template<class RT>
struct ReturnSpecialization
{
    template <typename Callee>
    static int Call(Callee& callee, RT (Callee::*func)(), lua_State* L, int index)
    {
        RT ret = (callee.*func)();
        Push(L, ret);
        return 1;
    }

    template <typename Callee, typename P1>
    static int Call(Callee& callee, RT (Callee::*func)(P1), lua_State* L, int index)
    {
        luaL_argassert(1, index + 0);

        RT ret = (callee.*func)(
            Get(TypeWrapper<P1>(), L, index)
        );
        Push(L, ret);
        return 1;
    }
};

Now we're getting somewhere.  The previously mentioned Call() function calls one of the ReturnSpecialization<>::Call() functions above.  In the case of the LOGMEMBER() function, it calls the Call() function taking 1 parameter.  We'll get into the specifics of the argument checking and retrieval in the next section.

Actually, for LOGMEMBER() the compiler can't generate code for the second Call() function above.  The reason for this is that RT would be void.  If RT is void, then the function call line reads:

void ret = (callee.*func)( ... );

It isn't valid C++ to assign a return value from a function to a void variable.  The compiler will choke.

The way to get around this is to specialize ReturnSpecialization so its Call() functions handle void return values:

template<>
struct ReturnSpecialization<void>
{
    template <typename Callee>
    static int Call(Callee& callee, void (Callee::*func)(), lua_State* L, int index)
    {
        (callee.*func)();
        return 0;
    }

    template <typename Callee, typename P1>
    static int Call(Callee& callee, void (Callee::*func)(P1), lua_State* L, int index)
    {
        luaL_argassert(1, index + 0);

        (callee.*func)(
             Get(TypeWrapper<P1>(), L, index + 0)
        );
        return 0;
    }
};

The ReturnSpecialization<void>::Call() function taking one parameter in the function callback is the one LOGMEMBER() will dispatch through.
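
The effect of the void specialization can be checked with a stand-in for the Lua stack.  The following is a minimal sketch: FakeStack is a hypothetical replacement for the Lua stack, ReturnCount is a renamed, cut-down imitation of ReturnSpecialization handling only the zero-parameter case, and Sample is an invented class.

```cpp
#include <cassert>
#include <vector>

typedef std::vector<int> FakeStack;   // hypothetical stand-in for the Lua stack

// Generic case: call, push the single result, report one return value.
template <typename RT>
struct ReturnCount
{
    template <typename Callee>
    static int Call(Callee& callee, RT (Callee::*func)(), FakeStack& stack)
    {
        RT ret = (callee.*func)();
        stack.push_back(static_cast<int>(ret));
        return 1;
    }
};

// void case: call, push nothing, report zero return values.
template <>
struct ReturnCount<void>
{
    template <typename Callee>
    static int Call(Callee& callee, void (Callee::*func)(), FakeStack&)
    {
        (callee.*func)();
        return 0;
    }
};

// Front end that deduces RT and forwards to the matching specialization.
template <typename Callee, typename RT>
int Call(Callee& callee, RT (Callee::*func)(), FakeStack& stack)
{
    return ReturnCount<RT>::Call(callee, func, stack);
}

class Sample
{
public:
    Sample() : m_touched(false) {}
    int Seven() { return 7; }
    void Touch() { m_touched = true; }
    bool Touched() const { return m_touched; }

private:
    bool m_touched;
};
```

The caller writes the same Call() expression for both members; the choice between pushing a result and pushing nothing is made entirely at compile time.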

Back to the question we asked above.  Why do we need the Call<Callee, RT, P1> function at all?  The answer, if it isn't apparent by now, is we need a template to break apart our function pointer into its components, the return type (RT) and the parameters (P1).  The ReturnSpecialization<>::Call() functions don't actually use the return type RT, but in order to choose the appropriate ReturnSpecialization<RT>::Call() function, we require the type RT.  RT will either get sent to ReturnSpecialization<void> or the more generic ReturnSpecialization<RT>.

luaL_argassert, Match, and Get

luaL_argassert() is used for type checking each parameter before dispatching to the callback function itself.  If the type check fails, luaL_argerror() is called and the Lua virtual machine fires a lua_error(), allowing the last lua_pcall() to trap the error.

luaL_argassert() is defined thus:

#define luaL_argassert(arg, _index_) if (!Match(TypeWrapper<P##arg>(), L, _index_)) \
            luaL_argerror(L, _index_, "bad argument")

The final part of the function dispatcher is encompassed in the type retrieval functions, Match() and Get().  These functions are heavily overloaded, providing both basic type retrieval and more advanced type retrieval as specified by the user.

luaL_argassert calls the Match() function to determine if the C++ argument is a match with some value.  That value can be anything handled by the overloads of Match().  The function signature passes in a TypeWrapper<> structure based on the template typename P# (where # is the argument number being checked), the lua_State pointer, and the index of the argument to be looked up.  Any of the Match() arguments may be ignored by the overloaded function.
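
The P##arg token pasting in the macro is what ties the argument number to the matching template typename.  The following is a minimal sketch of just that mechanism; CHECK_ARG, CheckBoth() and the two Match() overloads taking a double are hypothetical and only imitate the shape of luaL_argassert.

```cpp
#include <cassert>

template <class T> struct TypeWrapper {};

// Hypothetical checks on a plain double instead of a Lua stack slot.
inline bool Match(TypeWrapper<int>, double value)  { return value == static_cast<int>(value); }
inline bool Match(TypeWrapper<bool>, double value) { return value == 0.0 || value == 1.0; }

// CHECK_ARG(2, x) expands to Match(TypeWrapper<P2>(), x).
#define CHECK_ARG(arg, value) Match(TypeWrapper<P##arg>(), value)

// P1 and P2 must be the names of the template typenames for the paste to work.
template <typename P1, typename P2>
bool CheckBoth(double first, double second)
{
    return CHECK_ARG(1, first) && CHECK_ARG(2, second);
}
```

The macro only works inside a template whose parameters follow the P1, P2, ... naming convention, which is why the Call() functions above all use those names.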

The Get() function retrieves the Lua value and translates it into a C++ argument.  It will perform the proper casts on the value.

Both Match() and Get() functions take a TypeWrapper<> parameter.  TypeWrapper<> is a templated structure that allows any type to be overloaded in the Match()/Get() functions without taking a performance hit.  Its only purpose is for function overload matching.  TypeWrapper<> is defined as:

template<class T> struct TypeWrapper {};

The function definition of a Match() function for an integer would be:

inline bool Match(TypeWrapper<int>, lua_State* L, int idx);

Note that there isn't actually an integer passed for the first argument.  It is a TypeWrapper<int> empty structure.  Passing an integer directly wouldn't be harmful for the overload performance, but TypeWrapper<> guards against heavier-weight objects being passed by value.

So why not pass by reference?  After all, this Match() function declaration would be sufficient for the compiler to do the proper overloading:

inline bool Match(HeavyWeightObject&, lua_State* L, int idx);

The reason this won't work is we don't have an actual instance of HeavyWeightObject to pass in.  In order to pass a reference to an object, the object has to exist (although technically this could be faked by the illegal *(HeavyWeightObject*)0 cast).  Note the second Call() function above, the one taking one parameter, P1.  It first calls Match() to make sure the argument is the one we want and then it calls Get().  The object isn't available to us until Get() is called.
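
The way TypeWrapper<> selects an overload without ever creating the wrapped type can be sketched without Lua.  In this illustration, FakeValue is a hypothetical stand-in for a value on the Lua stack, and the Match() and Get() overloads shown take it in place of the lua_State pointer and index; they are not the library's own overloads.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a value sitting on the Lua stack.
struct FakeValue
{
    enum Kind { Number, String };
    Kind kind;
    double number;
    std::string text;
};

template <class T> struct TypeWrapper {};

struct HeavyWeightObject { char payload[4096]; };   // never instantiated below

// The first parameter carries no data; it only steers overload resolution.
inline bool Match(TypeWrapper<int>, const FakeValue& v)            { return v.kind == FakeValue::Number; }
inline bool Match(TypeWrapper<const char*>, const FakeValue& v)    { return v.kind == FakeValue::String; }
inline bool Match(TypeWrapper<HeavyWeightObject>, const FakeValue&) { return false; }

inline int Get(TypeWrapper<int>, const FakeValue& v)                 { return static_cast<int>(v.number); }
inline const char* Get(TypeWrapper<const char*>, const FakeValue& v) { return v.text.c_str(); }
```

Match(TypeWrapper<HeavyWeightObject>(), value) compiles and runs even though no HeavyWeightObject exists anywhere, which is exactly the situation the dispatcher is in before Get() has produced the argument.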

Adding Match and Get functions for your own types is easy.  Assume you have the following function declaration:

bool DoSomething(MyObject* obj)

A translation is needed from a Lua light user data to the MyObject*.  Both the Match and Get functions must reside in the LPCD namespace:

namespace LPCD
{
    inline bool Match(TypeWrapper<MyObject*>, lua_State* L, int idx)
    {
         return lua_isuserdata(L, idx);  // Might want a dynamic_cast<> here, too, on lua_touserdata(L, idx);
    }

Finally, we need to overload the Get() function so it can translate a userdata to a MyObject*:

    inline MyObject* Get(TypeWrapper<MyObject*>, lua_State* L, int idx)
    {
        return reinterpret_cast<MyObject*>(lua_touserdata(L, idx));
    }
} // namespace LPCD

Now, we register DoSomething() as a direct closure:

lua_pushstring(L, "DoSomething");
lua_pushdirectclosure(L, DoSomething, 0);
lua_settable(L, LUA_GLOBALSINDEX);

And finally, we call it:

MyObject o;
lua_getglobal(L, "DoSomething");
lua_pushlightuserdata(L, &o);
lua_pcall(L, 1, 1, 0);

And voila!  A direct call to DoSomething() using a light user data of MyObject has been accomplished with minimal effort.

In Brief

Even though all this sounded complicated, it really isn't.  In a nutshell, for the member function declaration:

void LOGMEMBER(const char* str)

the following things happen:

  1. The function DirectCallMemberDispatcherHelper<Logger, void (Logger::*)(const char*)>::DirectCallMemberDispatcher(L) is called.
  2. The first upvalue of the closure is retrieved as userdata.
  3. The userdata is translated into the function call Call<Logger, void, const char*>(loggerReference, func, L, 1), where func has the type void (Logger::*)(const char*).
  4. From there, ReturnSpecialization<void>::Call<Logger, const char*>(callee, func, L, 1) is called.
  5. Next, Match(TypeWrapper<const char*>(), L, 1) is called.  Internally, it checks lua_type(L, 1) == LUA_TSTRING and returns the result.  If the passed in argument is not a string, luaL_argerror is called.
  6. If Match() succeeds, Get(TypeWrapper<const char*>(), L, 1) is called.  Internally, it calls lua_tostring(L, 1) and returns the const char*.
  7. This argument is passed to (callee.*func)(const char*).  callee is the Logger reference.  func is LOGMEMBER().

Performance

This all sounds heavyweight.  Amazingly enough, the Visual C++ 7 compiler, in an optimized build, optimizes it by completely removing step 3 (described above) from the equation.  Further, step 4 is translated into a simple assembly jmp instruction.

All said, the use of the functor techniques is slightly more expensive than a simple C function pointer, both in performance and memory.  The final cost, though, is probably not of consequence, since the straight C function being called will more than likely do an operation that is heavier weight than the function call.  The end result will likely not show up in a profiler (and if it does, you are doing a whole lot of Lua->C function calls).