Lua and C/C++ Interop Series: Translated Notes on Userdata


1. From the Lua 5.0 Reference Manual - The Application Program Interface - Userdata


Userdata represents C values in Lua. Lua supports two types of userdata: full userdata and light
userdata.
A full userdata represents a block of memory. It is an object (like a table): You must create
it, it can have its own metatable, and you can detect when it is being collected. A full userdata is
only equal to itself (under raw equality).
A light userdata represents a pointer. It is a value (like a number): You do not create it, it has
no metatables, it is not collected (as it was never created). A light userdata is equal to “any” light
userdata with the same C address.
In Lua code, there is no way to test whether a userdata is full or light; both have type userdata.
In C code, lua_type returns LUA_TUSERDATA for full userdata, and LUA_TLIGHTUSERDATA for light
userdata.
You can create a new full userdata with the following function:
        void *lua_newuserdata (lua_State *L, size_t size);
This function allocates a new block of memory with the given size, pushes on the stack a new userdata with the block address, and returns this address.
To push a light userdata into the stack you use lua_pushlightuserdata.
lua_touserdata retrieves the value of a userdata. When applied on a full userdata,
it returns the address of its block; when applied on a light userdata, it returns its pointer; when
applied on a non-userdata value, it returns NULL.
When Lua collects a full userdata, it calls the userdata’s gc metamethod, if any, and then it
frees the userdata’s corresponding memory.
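Translation notes:
Userdata represents C values in Lua. Lua supports two kinds of userdata: full userdata and light userdata.
        A full userdata represents a block of memory. It is an object (like a table): it must be created through the C API, it can have its own metatable, and you can detect when it is being collected. Under raw equality, a full userdata is equal only to itself.
        A light userdata represents a pointer. It is a value (like a number): you do not create it, it has no metatable, and it is not collected, because it was never created. A light userdata is equal to any light userdata with the same C address.
        In Lua code there is no way to tell whether a userdata is full or light; type(v) returns "userdata" for both.
In C code, however, int lua_type (lua_State *L, int index); distinguishes them: it returns LUA_TUSERDATA for full userdata and LUA_TLIGHTUSERDATA for light userdata.
        In C code, lua_newuserdata() creates a full userdata and pushes it onto the virtual stack.
It allocates a block of memory of the given size, pushes a new userdata holding the block address onto the virtual stack, and returns that address.
        In C code, lua_pushlightuserdata() pushes a light userdata onto the virtual stack.
        In C code, lua_touserdata() converts a userdata to a C value. For a full userdata it returns the address of its block.
For a light userdata it returns the pointer. For a non-userdata value it returns NULL.
        When the Lua garbage collector collects a full userdata, it calls the userdata's __gc metamethod, if any, and then frees the memory belonging to the userdata.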
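
The equality and lua_touserdata rules above can be illustrated with a toy model in plain C. This is only a sketch under stated assumptions: it does not use the Lua API, and ToyValue, toy_newuserdata, toy_light, toy_number, toy_touserdata and toy_rawequal are hypothetical names invented here, not Lua's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: hypothetical types, not Lua's real value representation. */
typedef enum { T_NUMBER, T_LIGHTUSERDATA, T_USERDATA } ToyType;

typedef struct {
    ToyType type;
    double n;   /* used when type == T_NUMBER */
    void *p;    /* light: the C pointer itself; full: address of the owned block */
} ToyValue;

/* Full userdata: a fresh block is allocated on every call. */
ToyValue toy_newuserdata(size_t size) {
    ToyValue v = { T_USERDATA, 0.0, malloc(size) };
    return v;
}

/* Light userdata: nothing is allocated, the pointer is just wrapped. */
ToyValue toy_light(void *p) {
    ToyValue v = { T_LIGHTUSERDATA, 0.0, p };
    return v;
}

ToyValue toy_number(double n) {
    ToyValue v = { T_NUMBER, n, NULL };
    return v;
}

/* Block address for full, wrapped pointer for light, NULL for anything else. */
void *toy_touserdata(ToyValue v) {
    if (v.type == T_USERDATA || v.type == T_LIGHTUSERDATA) return v.p;
    return NULL;
}

/* Raw equality: the types must match, then addresses (or numbers) are compared. */
int toy_rawequal(ToyValue a, ToyValue b) {
    if (a.type != b.type) return 0;
    if (a.type == T_NUMBER) return a.n == b.n;
    return a.p == b.p;
}
```

In this model, two light values wrapping the same address compare equal, while two calls to toy_newuserdata always yield values that differ from each other, and a light value that wraps a full userdata's block address is still not equal to that full userdata because the types differ.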

2. From The Evolution of Lua - Feature evolution - Userdata

Original English text:

Userdata's evolution
Since its first version, an important feature of Lua has been its ability to manipulate C data, which is provided by a
special Lua data type called userdata. This ability is an essential component in the extensibility of Lua.
For Lua programs, the userdata type has undergone no changes at all throughout Lua’s evolution:
although userdata are first-class values, userdata is an opaque type and its only valid operation in Lua is equality test. 
Any other operation over userdata (creation, inspection, modification) must be provided by C functions.
For C functions, the userdata type has undergone several changes in Lua’s evolution.
In Lua 1.0, a userdata value was a simple void* pointer.
The main drawback of this simplicity was that a C library had no way to check whether a userdata was valid. 
Although Lua code cannot create userdata values, it can pass userdata created by one library to another
library that expects pointers to a different structure. 
Because C functions had no mechanisms to check this mismatch, the result of this pointer mismatch was usually fatal to the application. 
We have always considered it unacceptable for a Lua program to be able to crash the host application. Lua should be a safe language.
To overcome the pointer mismatch problem, Lua 2.1 introduced the concept of tags (which would become the seed for tag methods in Lua 3.0). 
A tag was simply an arbitrary integer value associated with a userdata. A userdata’s tag could only be set once, when the userdata was created.
Provided that each C library used its own exclusive tag, C code could easily ensure that a userdata had the expected type by checking its tag. 
(The problem of how a library writer chose a tag that did not clash with tags from other libraries remained open. It was only solved in Lua 3.0, which provided tag management via lua_newtag.)
A bigger problem with Lua 2.1 was the management of C resources. 
More often than not, a userdata pointed to a dynamically allocated structure in C, which had to be freed when its corresponding userdata was collected in Lua.
However, userdata were values, not objects. As such, they were not collected (in the same way that numbers are not collected). 
To overcome this restriction, a typical design was to use a table as a proxy for the C structure in Lua, storing the actual userdata in a predefined field of the proxy table.
When the table was collected, its finalizer would free the corresponding C structure.
This simple solution created a subtle problem. Because the userdata was stored in a regular field of the proxy table, a malicious user could tamper with it from within Lua. 
Specifically, a user could make a copy of the userdata and use the copy after the table was collected. 
By that time, the corresponding C structure had been destroyed, making the userdata a dangling pointer, with disastrous results. 
To improve the control of the life cycle of userdata, Lua 3.0 changed userdata from values to objects, subject to garbage collection. 
Users could use the userdata finalizer (the garbage-collection tag method) to free the corresponding C structure.
The correctness of Lua’s garbage collector ensured that a userdata could not be used after being collected.
However, userdata as objects created an identity problem.
Given a userdata, it is trivial to get its corresponding pointer, but frequently we need to do the reverse:
given a C pointer, we need to get its corresponding userdata.
In Lua 2, two userdata with the same pointer and the same tag would be equal; equality was based on their values. So, given the pointer and the tag, we had the userdata. 
In Lua 3, with userdata being objects, equality was based on identity: two userdata were equal only when they were the same userdata (that is, the same object). 
Each userdata created was different from all others. Therefore, a pointer and a tag would not be enough to get the corresponding userdata.
To solve this difficulty, and also to reduce incompatibilities with Lua 2, 
Lua 3 adopted the following semantics for the operation of pushing a userdata onto the stack:
if Lua already had a userdata with the given pointer and tag, then that userdata was pushed on the stack;
otherwise, a new userdata was created and pushed on the stack.
So, it was easy for C code to translate a C pointer to its corresponding userdata in Lua. (Actually, the C code could be the same as it was in Lua 2.)
However, Lua 3 behavior had a major drawback: it combined into a single primitive (lua_pushuserdata) two basic operations: 
userdata searching and userdata creation. For instance, it was impossible to check whether a given C pointer had a corresponding userdata without creating that userdata. 
Also, it was impossible to create a new userdata regardless of its C pointer. If Lua already had a userdata with that value, no new userdata would be created.
Lua 4 mitigated that drawback by introducing a new function, lua_newuserdata. 
Unlike lua_pushuserdata, this function always created a new userdata. Moreover, what was more important at that time, those userdata were able to store
arbitrary C data, instead of pointers only. The user would tell lua_newuserdata the amount of memory to be allocated and lua_newuserdata returned a pointer to the allocated area.
By having Lua allocate memory for the user, several common tasks related to userdata were simplified.
For instance, C code did not need to handle memory-allocation errors, because they were handled by Lua. 
More important, C code did not need to handle memory deallocation: memory used by such userdata was released by Lua automatically, when the userdata was collected.
However, Lua 4 still did not offer a nice solution to the search problem (i.e., finding a userdata given its C pointer).
So, it kept the lua_pushuserdata operation with its old behavior, resulting in a hybrid system. 
It was only in Lua 5 that we removed lua_pushuserdata and dissociated userdata creation and searching. 
Actually, Lua 5 removed the searching facility altogether. Lua 5 also introduced light userdata, which store plain C pointer values, exactly like regular userdata in Lua 1.
A program can use a weak table to associate C pointers (represented as light userdata) to its corresponding “heavy” userdata in Lua.
As is usual in the evolution of Lua, userdata in Lua 5 is more flexible than it was in Lua 4;
it is also simpler to explain and simpler to implement. 
For simple uses, which only require storing a C structure, userdata in Lua 5 is trivial to use. 
For more complex needs, such as those that require mapping a C pointer back to a Lua userdata, Lua 5 offers the mechanisms (light userdata and weak tables) for users to implement strategies suited to their applications.

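Translation notes:

    Since its first version, Lua has provided a special data type called userdata. The ability to manipulate C data through it is an important feature of Lua, and an essential component of Lua's extensibility.
    For Lua programs, the userdata type has not changed at all throughout Lua's evolution.
Although userdata are first-class values, userdata is an opaque type whose only valid operation in Lua is the equality test. Every other operation on userdata (creation, inspection, modification) must be provided by C functions.
For C functions, the userdata type has changed several times. In Lua 1.0, a userdata value was a simple void* pointer.
The main drawback of this simplicity was that a C library had no way to check whether a userdata was valid. Although Lua code cannot create userdata values, it can pass a userdata created by one library to another library that expects a pointer to a different structure.
Because C functions had no mechanism to detect this mismatch, the result was usually fatal to the application. Lua should be a safe language, and it was considered unacceptable for a Lua program to be able to crash the host application.
To overcome the pointer mismatch problem, Lua 2.1 introduced tags. A tag was an arbitrary integer associated with a userdata, set once when the userdata was created. As long as each C library used its own exclusive tag, C code could check the tag to confirm that a userdata had the expected type. How a library should choose a tag that did not clash with other libraries was only solved in Lua 3.0, through lua_newtag.
A bigger problem in Lua 2.1 was the management of C resources. A userdata usually pointed to a dynamically allocated structure in C, which had to be freed when the corresponding userdata was collected in Lua.
However, userdata were values, not objects, so they were not collected, just as numbers are not collected. The typical workaround was to use a table as a proxy for the C structure in Lua and to store the actual userdata in a predefined field of the proxy table.
When the proxy table was collected, its finalizer freed the corresponding C structure.
This simple solution created a subtle problem. Because the userdata sat in a regular field of the proxy table, a malicious user could tamper with it from within Lua.
Specifically, a user could copy the userdata and use the copy after the proxy table had been collected. By then the C structure had been destroyed, so the userdata was a dangling pointer.
Lua 3.0 changed userdata from values to objects, subject to garbage collection. Users could use the userdata finalizer (the garbage-collection tag method) to free the corresponding C structure.
The correctness of Lua's garbage collector ensured that a userdata could not be used after being collected.
  However, userdata as objects created an identity problem. Given a userdata, it is trivial to get its pointer, but often the reverse is needed: given a C pointer, get the corresponding userdata.
In Lua 2, two userdata with the same pointer and the same tag were equal, because equality was based on their values. So, given the pointer and the tag, we had the userdata.
In Lua 3, with userdata being objects, equality was based on identity: two userdata were equal only when they were the same object. Each userdata created was different from all others, so a pointer and a tag were no longer enough to get the corresponding userdata.
    To solve this difficulty, and also to reduce incompatibilities with Lua 2, Lua 3 adopted the following semantics for pushing a userdata onto the stack:
if Lua already had a userdata with the given pointer and tag, that userdata was pushed; otherwise, a new userdata was created and pushed. (This is no longer the case: Lua 5 removed lua_pushuserdata.)
So it was easy for C code to translate a C pointer into its corresponding userdata in Lua.
   However, the Lua 3 behavior had a major drawback: it combined two basic operations, userdata searching and userdata creation, into the single primitive lua_pushuserdata.
It was impossible to check whether a given C pointer had a corresponding userdata without creating that userdata, and impossible to create a new userdata regardless of its C pointer.
Lua 4 mitigated this by introducing lua_newuserdata, which always created a new userdata and could store arbitrary C data rather than pointers only. Lua allocated the memory, handled allocation errors, and released the memory automatically when the userdata was collected. Lua 4 still kept lua_pushuserdata with its old behavior, resulting in a hybrid system.
Lua 5 removed lua_pushuserdata, separated userdata creation from searching, and in fact removed the searching facility altogether. In Lua 5, C code creates userdata with lua_newuserdata() and can validate them with luaL_checkudata(). Lua 5 also introduced light userdata, which store plain C pointer values.
With weak tables and light userdata, a program can map C pointers (represented as light userdata) back to their corresponding full userdata in Lua.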
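
The tag mechanism described for Lua 2.1 can be illustrated with a small plain-C sketch. It does not use any Lua API: TaggedUserdata, TAG_FILE, TAG_SOCKET, ToyFile and check_file are hypothetical names made up for illustration, and the code only shows the idea of rejecting a pointer that carries another library's tag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a Lua 2.1-style userdata: a raw pointer plus an
   integer tag that is fixed when the value is created. Not a real Lua API. */
typedef struct {
    void *ptr;
    int tag;
} TaggedUserdata;

/* Assumption: each library reserves its own exclusive tag. */
enum { TAG_FILE = 1, TAG_SOCKET = 2 };

typedef struct { int fd; } ToyFile;

/* Return the ToyFile behind u, or NULL if u carries another library's tag. */
ToyFile *check_file(TaggedUserdata u) {
    return (u.tag == TAG_FILE) ? (ToyFile *)u.ptr : NULL;
}
```

A library function would call check_file on each userdata argument and raise an error on NULL instead of dereferencing a pointer to the wrong structure.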


3. The userdata data structure in Lua 5:

/*
** Header for userdata; memory area follows the end of this structure
*/
typedef union Udata 
{
  union { double u; void *s; long l; } dummy;  /* ensures maximum alignment for `local' udata */
  struct {
    GCObject *next; lu_byte tt; lu_byte marked;
    struct Table *metatable;
    struct Table *env;
    size_t len;  /* number of bytes */
  } uv;
} Udata;
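
The comment "memory area follows the end of this structure" can be illustrated with a simplified sketch: the header and the user's block live in one allocation, and the address handed to the user is the first byte after the header. ToyUdata, toy_newudata, toy_header and toy_freeudata are hypothetical names, and the header keeps only the len field, so this is an assumption-laden model rather than Lua's actual allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified, hypothetical header; the Udata shown above has more fields. */
typedef union ToyUdata {
    union { double u; void *s; long l; } dummy;  /* forces maximum alignment */
    struct {
        size_t len;  /* number of payload bytes */
    } uv;
} ToyUdata;

/* Allocate header and payload in one block; return the payload address. */
void *toy_newudata(size_t size) {
    ToyUdata *u = malloc(sizeof(ToyUdata) + size);
    if (u == NULL) return NULL;
    u->uv.len = size;
    return u + 1;  /* payload starts right after the header */
}

/* Recover the header from a payload address returned by toy_newudata. */
ToyUdata *toy_header(void *payload) {
    return (ToyUdata *)payload - 1;
}

/* Free header and payload together. */
void toy_freeudata(void *payload) {
    if (payload != NULL) free(toy_header(payload));
}
```

Because the dummy member pads the header to the strictest of the listed alignments, the payload address is suitably aligned for a double, a pointer, or a long stored directly in the block.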

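That concludes the translation of the material on the evolution of userdata. It only covers how the design of userdata itself evolved and does not explain how userdata works together with the __gc metamethod, which is a pity. Still, it makes clear that userdata can be used to manage C structures from Lua.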
