[Coroutine Internals] - Coroutines in Java


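For a long time I naively believed that Python, and especially libraries in the style of gevent, was the promised land of coroutines. I assumed coroutines could not be implemented in Java at all, let alone coroutines that can be pickled the way Stackless Python's can. Bong! I really was a frog at the bottom of a well.
Not only can coroutines be implemented in Java, there are in fact many implementations. A non-exhaustive list: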

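There is also kilim, which (according to its author) is the most awesome of them all (https://github.com/kilim/kilim).
These coroutine libraries are all implemented in a similar way: they rely on JVM bytecode generation to achieve pause/resume. In the following article, the author of RIFE explains the approach very clearly: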
http://www.artima.com/lejava/articles/continuations.html

Geert Bevin: At the byte-code level, when the method pops variables
from the stack, exactly the same operation will be performed on the
parallel stack. We just add a piece of byte code that mimics what goes
on exactly. Then, when a continuation has to be resumed, there's a bit
of added byte-code that interacts with the state of the stack, and
puts all the variables in the correct location so that they are
present when the executing thread passes that point. At this point,
it's as if nothing happened—execution resumed as if nothing happened.

The second problem is how to arrive at the point to resume a
continuation. You want to skip over code that you don't want to
execute. That's easily done in byte code, because you can maintain a
tableswitch in byte code that allows you to jump to a particular
location. You can create a unique label for each continuation point,
jump to that label in the switch table, and then you know exactly what
the stack for that label of variables was. You can restore that stack,
and continue executing from that point on.
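
To make the two ideas in the quote concrete, here is a small hand-written sketch in plain Java. It is not RIFE's generated bytecode; the names `ParallelStackSketch`, `Continuation` and `step` are made up for illustration. The `Deque` stands in for the "parallel stack" that mirrors the local variables, and the `switch` on `label` stands in for the tableswitch that jumps to the continuation point.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ParallelStackSketch {
    /** Hypothetical holder for one suspended method activation. */
    static final class Continuation {
        int label = 0;                                   // continuation point to jump to on resume
        final Deque<Object> saved = new ArrayDeque<>();  // the "parallel stack" of local variables
    }

    /**
     * Adds 1, 2, 3 to a running sum, suspending after each addition.
     * Returns true once the computation has finished.
     */
    static boolean step(Continuation c, StringBuilder out) {
        int i;
        int sum;
        switch (c.label) {            // plays the role of the tableswitch
            case 0:                   // fresh call: initialise locals normally
                i = 1;
                sum = 0;
                break;
            case 1:                   // resume: put the locals back where they were
                sum = (Integer) c.saved.pop();
                i = (Integer) c.saved.pop();
                break;
            default:
                throw new IllegalStateException("unknown label " + c.label);
        }
        if (i > 3) {
            out.append("done:").append(sum);
            c.label = -1;             // finished; a further call is an error
            return true;
        }
        sum += i;
        out.append(sum).append(' ');
        i++;
        c.saved.push(i);              // mirror the locals onto the parallel stack
        c.saved.push(sum);
        c.label = 1;                  // remember where to resume
        return false;                 // "pause": unwind the real Java stack
    }

    public static void main(String[] args) {
        Continuation c = new Continuation();
        StringBuilder out = new StringBuilder();
        while (!step(c, out)) {
            // the caller is free to do other work between resumptions
        }
        System.out.println(out);      // prints "1 3 6 done:6"
    }
}
```

A real bytecode weaver would do this bookkeeping automatically for every local and operand-stack slot; the sketch only tracks two locals by hand.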

The author of kilim explains it even more clearly with code (http://www.malhar.net/sriram/kilim/thread_of_ones_own.pdf):

// original
void a() throws Pausable {
    x = ...
    b(); // b is pausable
    print (x);
}

After bytecode enhancement it becomes:

// transformed code (pseudocode: goto exists in bytecode, not in Java source)
void a(Fiber f) {
    switch (f.pc) { // prelude
        case 0: goto START;
        case 1: goto CALL_B;
    }
    START:
    x = ...
    CALL_B: // pre_call
    f.down()
    b(f);
    f.up() // post-call
    switch (f.status) {
        case NOT_PAUSING_NO_STATE:
            goto RESUME
        case NOT_PAUSING_HAS_STATE:
            restore state
            goto RESUME
        case PAUSING_NO_STATE :
            capture state, return
        case PAUSING_HAS_STATE:
            return
    }
    RESUME:
    print (x);
}
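
Since Java source has no `goto`, the listing above cannot be compiled as written. The following is a rough, runnable approximation written by hand. The `Fiber` and `Frame` classes here are hypothetical, heavily simplified stand-ins and not Kilim's real classes. The sketch also takes two shortcuts compared with the listing: the four statuses are collapsed into a single `pausing` flag, and `x` is restored eagerly in the prelude rather than after the call returns.

```java
import java.util.ArrayList;
import java.util.List;

class FiberSketch {
    /** Saved state of one method activation: where to resume, plus one local. */
    static final class Frame {
        int pc = 0;
        Object local;
    }

    /** Hypothetical simplified fiber: one Frame per call depth and a pause flag. */
    static final class Fiber {
        final List<Frame> frames = new ArrayList<>();
        final StringBuilder out = new StringBuilder();
        int depth = 0;
        boolean pausing = false;

        Frame frame() {                       // state slot for the current depth
            while (frames.size() <= depth) frames.add(new Frame());
            return frames.get(depth);
        }
        void down() { depth++; }              // entering a callee
        void up() { depth--; }                // back in the caller
    }

    static void a(Fiber f) {
        Frame me = f.frame();
        String x;
        switch (me.pc) {                      // prelude
            case 0: x = "hello"; break;               // START
            case 1: x = (String) me.local; break;     // resuming: restore state
            default: throw new IllegalStateException("bad pc " + me.pc);
        }
        f.down();                             // CALL_B: pre-call
        b(f);
        f.up();                               // post-call
        if (f.pausing) {                      // capture state, return
            me.pc = 1;
            me.local = x;
            return;
        }
        me.pc = 0;                            // activation finished
        f.out.append(x);                      // RESUME: print(x)
    }

    static void b(Fiber f) {
        Frame me = f.frame();
        if (me.pc == 0) {
            f.out.append("b-before ");
            me.pc = 1;
            f.pausing = true;                 // ask every caller to unwind
            return;
        }
        f.out.append("b-after ");             // resumed past the pause point
        me.pc = 0;
    }

    public static void main(String[] args) {
        Fiber f = new Fiber();
        a(f);                                 // runs until b pauses
        f.pausing = false;                    // the scheduler decides to resume
        a(f);                                 // re-enters a, which jumps back into b
        System.out.println(f.out);            // prints "b-before b-after hello"
    }
}
```

The point to notice is that resuming means calling `a` again from the top: the prelude skips the code that already ran and descends straight back into `b`, which in turn skips to just after its pause point.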

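Because these frameworks all store the call stack state and the program counter as Java objects, serializing an in-flight execution state (a continuation) is not difficult at all. RIFE makes heavy use of cloneable state in its web framework to implement complex wizards (for example, going back to an arbitrary moment in the past and re-executing from there).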
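
As a rough illustration of why this is easy once the state lives in ordinary objects, the sketch below snapshots a hand-written continuation with standard Java serialization and later resumes from the snapshot. The class `Wizard` and the helpers `freeze` and `thaw` are invented for this example; this is not how RIFE or kilim expose the feature.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SerializableContinuationSketch {
    /** Hypothetical continuation: a program counter plus one "local" kept as fields. */
    static final class Wizard implements Serializable {
        private static final long serialVersionUID = 1L;
        int pc = 0;
        int x;

        Wizard(int x) { this.x = x; }

        /** Runs until the next pause point and reports where it stopped. */
        String resume() {
            switch (pc) {
                case 0:
                    pc = 1;
                    return "step1:" + x;
                case 1:
                    x = x + 1;
                    pc = 2;
                    return "step2:" + x;
                default:
                    return "done";
            }
        }
    }

    /** Serializes the whole continuation to bytes. */
    static byte[] freeze(Serializable state) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(state);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Rebuilds an independent copy of the continuation from bytes. */
    static Wizard thaw(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Wizard) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Wizard w = new Wizard(10);
        System.out.println(w.resume());        // prints "step1:10"
        byte[] snapshot = freeze(w);           // remember this moment
        System.out.println(w.resume());        // prints "step2:11"
        Wizard past = thaw(snapshot);          // go back to the remembered moment
        System.out.println(past.resume());     // prints "step2:11" again
    }
}
```

The copy restored from the snapshot replays the second step independently of the original, which is the "go back and re-execute" behaviour described above in miniature.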
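After looking at these implementations, I can't help feeling that persisting a continuation is nothing special. So why is it regarded as the exclusive trick of Stackless Python and PyPy? Can the two CPython-based coroutine implementations, greenlet and fibers, persist their state as well? And could that persistence be implemented with a more efficient serializer?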