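Exploring Erlang Abstract Form: Dynamically Generating and Modifying a Module
In the first article, "Exploring Erlang Abstract Form: Generating and Retrieving", we noted that there are two ways to obtain the Abstract Form: reading the debug_info stored in a beam file, or parsing the source code directly.
Supplying source text
The most useful way to modify a module is to add new functions to it. We can get the Abstract Form of an existing module from its beam file, but to add a function dynamically, the most obvious approach is to supply the function's source text. Parsing source code normally takes two tools, a scanner and a parser. The basic scanner that Erlang provides is erl_scan, and the parser is erl_parse. Let's look at their documentation.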
MODULE
erl_scan
MODULE SUMMARY
The Erlang Token Scanner
DESCRIPTION
This module contains functions for tokenizing characters into Erlang tokens.
EXPORTS
string(CharList, StartLine) -> {ok, Tokens, EndLine} | Error
string(CharList) -> {ok, Tokens, EndLine} | Error
Types:
CharList = string()
StartLine = EndLine = Line = integer()
Tokens = [{atom(),Line}|{atom(),Line,term()}]
Error = {error, ErrorInfo, EndLine}
Takes the list of characters CharList and tries to scan (tokenize) them. Returns {ok, Tokens, EndLine}, where Tokens are the Erlang tokens from CharList. EndLine is the last line where a token was found.
StartLine indicates the initial line when scanning starts. string/1 is equivalent to string(CharList,1).
{error, ErrorInfo, EndLine} is returned if an error occurs. EndLine indicates where the error occurred.
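erl_scan:string scans the text of a string. If no error occurs, the result tuple contains all the tokens; otherwise it contains the error information and the line on which the error occurred.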
Eshell V5.5 (abort with ^G)
1> c(simplest, [debug_info]).
{ok,simplest}
2> {ok, Tokens, EndLine} = erl_scan:string("test() -> ok.").
{ok,[{atom,1,test},{'(',1},{')',1},{'->',1},{atom,1,ok},{dot,1}],1}
3>
With these Tokens in hand, we can use erl_parse to produce the Abstract Form. The erl_parse documentation says:
MODULE
erl_parse
MODULE SUMMARY
The Erlang Parser
DESCRIPTION
This module is the basic Erlang parser which converts tokens into the abstract form of either forms (i.e., top-level constructs), expressions, or terms. The Abstract Format is described in the ERTS User's Guide. Note that a token list must end with the dot token in order to be acceptable to the parse functions (see erl_scan).
EXPORTS
parse_form(Tokens) -> {ok, AbsForm} | {error, ErrorInfo}
Types:
Tokens = [Token]
Token = {Tag,Line} | {Tag,Line,term()}
Tag = atom()
AbsForm = term()
ErrorInfo = see section Error Information below.
This function parses Tokens as if it were a form. It returns:
{ok, AbsForm} - The parsing was successful. AbsForm is the abstract form of the parsed form.
{error, ErrorInfo} - An error occurred.
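erl_parse can parse several kinds of token lists, including expressions, terms and forms. What we need is parse_form, the function that parses a complete form. As before, if parsing succeeds, the returned tuple contains the Abstract Form that the tokens represent; otherwise it contains the syntax error information.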
3> erl_parse:parse_form(Tokens).
{ok,{function,1,test,0,[{clause,1,[],[],[{atom,1,ok}]}]}}
As we can see, the result contains the Abstract Form of the function.
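The two steps above can be wrapped in one small helper. This is only a minimal sketch: the module name form_from_text and the function parse_function/1 are hypothetical names chosen for illustration, and the only library calls it relies on are erl_scan:string/1 and erl_parse:parse_form/1 as documented above.

```erlang
-module(form_from_text).
-export([parse_function/1]).

%% Scan and parse the source text of a single form.
%% Source must be a string that ends with a dot, e.g. "test() -> ok.".
%% Returns {ok, AbsForm} on success or {error, ErrorInfo} otherwise.
parse_function(Source) ->
    case erl_scan:string(Source) of
        {ok, Tokens, _EndLine} ->
            %% parse_form/1 already returns {ok, AbsForm} | {error, ErrorInfo}
            erl_parse:parse_form(Tokens);
        {error, ErrorInfo, _EndLocation} ->
            %% Scanner errors are reported in the same shape as parser errors
            {error, ErrorInfo}
    end.
```

Calling form_from_text:parse_function("test() -> ok.") should give the same function form as the shell session above, while text that does not parse comes back as an error tuple instead of crashing the caller.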
Compiling the Abstract Form
Once we have the Abstract Form, we can compile it and load it. The documentation of the compile module says:
forms(Forms)
Is the same as forms(Forms, [verbose,report_errors,report_warnings]).
forms(Forms, Options) -> CompRet
Types:
Forms = [Form]
CompRet = BinRet | ErrRet
BinRet = {ok,ModuleName,BinaryOrCode} | {ok,ModuleName,BinaryOrCode,Warnings}
BinaryOrCode = binary() | term()
ErrRet = error | {error,Errors,Warnings}
Analogous to file/1, but takes a list of forms (in the Erlang abstract format representation) as first argument. The option binary is implicit; i.e., no object code file is produced. Options that would ordinarily produce a listing file, such as 'E', will instead cause the internal format for that compiler pass (an Erlang term; usually not a binary) to be returned instead of a binary.
In fact, we have been using compile:file all along. Typing c(File, Options) in the erl shell compiles a file:
c(File) -> {ok, Module} | error
c(File, Options) -> {ok, Module} | error
Types:
File = Filename | Module
Filename = string() | atom()
Options = [Opt] -- see compile:file/2
Module = atom()
c/1,2 compiles and then purges and loads the code for a file. Options defaults to []. Compilation is equivalent to:
compile:file(File, Options ++ [report_errors, report_warnings])
Note that purging the code means that any processes lingering in old code for the module are killed without warning. See code/3 for more information.
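To compile an Abstract Form, we have to supply the complete list of forms for the whole module. That means we also need to provide the module attribute, the export attribute, and so on.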
1> c(simplest,[debug_info]).
{ok,simplest}
2> {ok, Tokens, EndLine} = erl_scan:string("test() -> ok.").
{ok,[{atom,1,test},{'(',1},{')',1},{'->',1},{atom,1,ok},{dot,1}],1}
3> {ok, Forms} = erl_parse:parse_form(Tokens).
{ok,{function,1,test,0,[{clause,1,[],[],[{atom,1,ok}]}]}}
4>
4> NewForms = [{attribute, 1, module, simplest},{attribute, 2, export, [{test,0}]}, Forms].
[{attribute,1,module,simplest},
{attribute,2,export,[{test,0}]},
{function,1,test,0,[{clause,1,[],[],[{atom,1,ok}]}]}]
5> compile:forms(NewForms).
{ok,simplest,
<<70,79,82,49,0,0,1,188,66,69,65,77,65,116,111,109,0,0,0,55,0,0,0,6,7,...>>}
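Instead of writing the attribute tuples by hand, the attributes can also be given as source text, because erl_parse:parse_form/1 accepts attribute forms such as -module(...). and -export([...]). as well as functions. The following is a rough sketch under that assumption; forms_from_text and compile_text/1 are hypothetical names, and each string is assumed to hold exactly one form ending with a dot.

```erlang
-module(forms_from_text).
-export([compile_text/1]).

%% FormSources is a list of strings, each holding one complete form
%% (attribute or function) that ends with a dot.
%% Returns whatever compile:forms/1 returns, e.g. {ok, Module, Binary}.
compile_text(FormSources) ->
    Forms = [parse_form(Source) || Source <- FormSources],
    compile:forms(Forms).

%% Scan and parse one form; crashes with badmatch on invalid text,
%% which is acceptable for an experiment in the shell.
parse_form(Source) ->
    {ok, Tokens, _EndLine} = erl_scan:string(Source),
    {ok, Form} = erl_parse:parse_form(Tokens),
    Form.
```

For example, forms_from_text:compile_text(["-module(simplest).", "-export([test/0]).", "test() -> ok."]) builds the same three forms as NewForms above and compiles them in one call.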
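Loading the newly compiled object code
Another benefit of reading the documentation above is that it tells us how to load code once it is compiled. The code module defines the functions we need. purge removes the code that is marked as old; if any processes are still running that old code, they are killed. Note that although purge always succeeds, it returns true only when some process had to be killed.
purge(Module) -> true | false
Types:
Module = atom()
Purges the code for Module, that is, removes code marked as old. If some processes still linger in the old code, these processes are killed before the code is removed.
Returns true if successful and any process needed to be killed, otherwise false.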
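load_binary loads compiled object code so that the module becomes available to the program. If the object code resides in a sticky directory, replacing it may fail. Sticky directories are the ones that belong to Erlang's own runtime system, including kernel, stdlib and compiler. To keep Erlang running correctly, these directories are protected by default and treated as sticky.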
load_binary(Module, Filename, Binary) -> {module, Module} | {error, What}
Types:
Module = atom()
Filename = string()
What = sticky_directory | badarg | term()
This function can be used to load object code on remote Erlang nodes. It can also be used to load object code where the file name and module name differ. This, however, is a very unusual situation and not recommended. The parameter Binary must contain object code for Module. Filename is only used by the code server to keep a record of from which file the object code for Module comes. Accordingly, Filename is not opened and read by the code server.
Returns {module, Module} if successful, or {error, sticky_directory} if the object code resides in a sticky directory, or {error, badarg} if any argument is invalid. Also if the loading fails, an error tuple is returned. See erlang:load_module/2 for possible values of What.
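Compiling, purging and loading can be chained into a single function. The sketch below is one possible way to do it, not a definitive implementation: hot_load and load_forms/1 are hypothetical names, and the file name passed to code:load_binary/3 is simply made up from the module name, since the code server only records it.

```erlang
-module(hot_load).
-export([load_forms/1]).

%% Compile a complete list of module forms and load the result.
%% Returns {module, Module} on success, or an error tuple.
load_forms(Forms) ->
    case compile:forms(Forms) of
        {ok, Module, Binary} ->
            %% Remove code marked as old, if any; the boolean result
            %% only says whether a process had to be killed.
            _ = code:purge(Module),
            %% The file name is only recorded, never opened.
            Filename = atom_to_list(Module) ++ ".erl",
            code:load_binary(Module, Filename, Binary);
        Error ->
            %% compile:forms/1 returned error or {error, Errors, Warnings}
            {error, Error}
    end.
```

Purging before loading mirrors what c/1,2 does, so any process still running old code of the module is killed without warning.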
Putting it together
Using everything discussed so far, we can run a complete experiment.
Suppose we have the following program, simplest.erl:
-module(simplest).
-export([foo/0]).
foo() ->
io:format("foo~n").
We run the experiment step by step in erl.
1> c(simplest,[debug_info]).
{ok,simplest}
2> simplest:foo().
foo
ok
3> simplest:test().
=ERROR REPORT==== 18-Aug-2006::15:06:17 ===
Error in process <0.32.0> with exit value: {undef,[{simplest,test,[]},{erl_eval,do_apply,5},{shell,exprs,6},{shell,eval_loop,3}]}
** exited: {undef,[{simplest,test,[]},
{erl_eval,do_apply,5},
{shell,exprs,6},
{shell,eval_loop,3}]} **
4> {ok, Tokens, EndLine} = erl_scan:string("test() -> ok.").
{ok,[{atom,1,test},{'(',1},{')',1},{'->',1},{atom,1,ok},{dot,1}],1}
5> {ok, Forms} = erl_parse:parse_form(Tokens).
{ok,{function,1,test,0,[{clause,1,[],[],[{atom,1,ok}]}]}}
6> NewForms = [{attribute, 1, module, simplest},{attribute, 2, export, [{test,0}]}, Forms].
[{attribute,1,module,simplest},
{attribute,2,export,[{test,0}]},
{function,1,test,0,[{clause,1,[],[],[{atom,1,ok}]}]}]
7> {ok,simplest,Binary} = compile:forms(NewForms).
{ok,simplest,
<<70,79,82,49,0,0,1,188,66,69,65,77,65,116,111,109,0,0,0,56,0,0,0,6,8,...>>}
8> code:purge(simplest).
false
9> code:load_binary(simplest,"simplest.erl",Binary).
{module,simplest}
10> simplest:foo().
=ERROR REPORT==== 18-Aug-2006::15:08:13 ===
Error in process <0.40.0> with exit value: {undef,[{simplest,foo,[]},{erl_eval,do_apply,5},{shell,exprs,6},{shell,eval_loop,3}]}
** exited: {undef,[{simplest,foo,[]},
{erl_eval,do_apply,5},
{shell,exprs,6},
{shell,eval_loop,3}]} **
11> simplest:test().
ok
12>
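The result is that a new exported function test/0 has been added, but the original foo/0 is gone, because the newly loaded module contains only the forms we compiled.
Next steps
Of course, our goal is not to overwrite the original module entirely, but to be able to add, remove and replace its functions. To do that, we need to read and keep the old forms with beam_lib:chunks, as shown in the earlier article, then add or replace function forms depending on the operation, and compile everything together. This is what Smerl does.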
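To make the idea concrete, here is a rough sketch of adding (or replacing) one function while keeping the old forms. It is not Smerl's actual code. The names add_function and add/2 are hypothetical, and the sketch assumes a recent OTP release in which beam_lib:chunks/2 returns the abstract code tagged as raw_abstract_v1 and the beam was compiled with debug_info.

```erlang
-module(add_function).
-export([add/2]).

%% Beam is a .beam file name or a beam binary compiled with debug_info.
%% Source is the text of one function, ending with a dot.
%% Returns the result of compile:forms/2, e.g. {ok, Module, NewBinary}.
add(Beam, Source) ->
    {ok, {_Module, [{abstract_code, {raw_abstract_v1, OldForms}}]}} =
        beam_lib:chunks(Beam, [abstract_code]),
    {ok, Tokens, _EndLine} = erl_scan:string(Source),
    {ok, {function, _, Name, Arity, _} = NewFun} = erl_parse:parse_form(Tokens),
    %% Drop an old definition of the same function and any eof marker.
    Kept = [F || F <- OldForms, not is_same(F, Name, Arity), not is_eof(F)],
    %% Attributes such as export must come before the first function.
    {Attrs, Funs} =
        lists:splitwith(fun(F) -> element(1, F) =/= function end, Kept),
    Export = {attribute, 1, export, [{Name, Arity}]},
    compile:forms(Attrs ++ [Export] ++ Funs ++ [NewFun], [debug_info]).

is_same({function, _, Name, Arity, _}, Name, Arity) -> true;
is_same(_, _, _) -> false.

is_eof({eof, _}) -> true;
is_eof(_) -> false.
```

The returned binary can then be loaded with code:purge/1 and code:load_binary/3 as before. Compiling with debug_info again keeps the Abstract Form in the new binary, so the same operation can be repeated on it.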