Copyright 2014 Oliver Kowalke
Distributed under the Boost Software License, Version 1.0. (See accompanying
file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
C++ library for switching between different user contexts
Context
Overview
Boost.Context is a foundational library that
provides a sort of cooperative multitasking on a single thread. By providing
an abstraction of the current execution state in the current thread, including
the stack (with local variables) and stack pointer, all registers and CPU flags,
and the instruction pointer, an execution context represents a specific point
in the application's execution path. This is useful for building higher-level
abstractions, like coroutines, cooperative threads
(userland threads) or an equivalent of the C#
keyword yield in C++.
callcc()/continuation
provides the means to suspend the current execution path and to transfer execution
control, thereby permitting another context to run on the current thread. This
stateful transfer mechanism enables a context to suspend execution from within
nested functions and, later, to resume from where it was suspended. While the
execution path represented by a continuation
only runs on a single thread, it can be migrated to another thread at any given
time.
A context switch
between threads requires system calls (involving the OS kernel), which can
cost more than a thousand CPU cycles on x86 CPUs. By contrast, transferring control
via callcc()/continuation
requires only a few CPU cycles because it does not involve system calls, as it
is done within a single thread.
All functions and classes are contained in the namespace boost::context.
This library requires C++11!
Windows using fcontext_t: turn off global program optimization (/GL) and
change /EHsc (the compiler assumes that functions declared as extern "C"
never throw a C++ exception) to /EHs (the compiler assumes that functions
declared as extern "C" may throw an exception).
Requirements
If Boost.Context uses fcontext_t (the default)
as its implementation, it must be built for the particular compiler(s) and
CPU architecture(s) being targeted. Using fcontext_t,
Boost.Context includes assembly code and,
therefore, requires GNU as and GNU preprocessor for supported POSIX systems,
MASM for Windows/x86 systems and ARMasm for Windows/arm systems.
MASM64 (ml64.exe) is a part of Microsoft's Windows Driver Kit.
Please note that address-model=64 must be
given on the bjam command line on 64-bit Windows for a 64-bit build; otherwise 32-bit
code will be generated.
For cross-compiling the library you must specify certain additional properties
on the bjam command line: target-os, abi, binary-format,
architecture and address-model.
Windows using fcontext_t: for safe SEH the property 'asmflags=\safeseh' must
be specified on the bjam command line.
Windows using fcontext_t: turn off global program optimization (/GL) and
change /EHsc (the compiler assumes that functions declared as extern "C"
never throw a C++ exception) to /EHs (the compiler assumes that functions
declared as extern "C" may throw an exception).
Because this library uses C++11 extensively, it requires a compatible compiler.
Known minimum working versions are as follows: Microsoft Visual Studio 2015
(msvc-14.0), GCC 4.8 (with -std=c++11), Clang 3.4 (with -std=c++11). Other
compilers may work if they support the following language features: auto declarations,
constexpr, defaulted functions, final, the <thread> header, the <tuple> header, lambdas, noexcept,
nullptr, rvalue references, template aliases, thread_local, variadic templates.
Context switching with fibers
fiber is the reference implementation of C++ proposal
P0876R0:
fibers without scheduler.
A fiber represents the state of the control flow of a
program at a given point in time. Fibers can be suspended and resumed later
in order to change the control flow of a program.
Modern microprocessors are register machines; the content of the processor registers
represents a fiber of the executed program at a given point in time. Operating
systems simulate parallel execution of programs on a single processor by switching
between programs (context switch), preserving and restoring the fiber, i.e.
the content of all registers.
fiber
fiber captures the current fiber
(the rest of the computation; code after fiber)
and triggers a context switch. The context switch is achieved by preserving
certain registers (including instruction and stack pointer), defined by the
calling convention of the ABI, of the current fiber and restoring those registers
of the resumed fiber. The control flow of the resumed fiber continues. The
current fiber is suspended and passed as argument to the resumed fiber.
fiber expects a context-function
with signature 'fiber(fiber && f)'.
The parameter f represents
the current fiber from which this fiber was resumed (i.e. the one that called
fiber).
On return, the context-function of the current fiber has
to specify a fiber to which
the execution control is transferred after termination of the current fiber.
If an instance with valid state goes out of scope and the context-function
has not yet returned, the stack is traversed in order to access the control
structure (address stored at the first stack frame) and the fiber's stack is deallocated
via the StackAllocator.
Segmented stacks are
supported by fiber using
ucontext_t.
fiber represents a fiber;
it contains the content of preserved registers and manages the associated stack
(allocation/deallocation). fiber
is a one-shot fiber: it can be used only once; after calling fiber::resume()
or fiber::resume_with() it is invalidated.
fiber is only move-constructible
and move-assignable.
As a first-class object, fiber
can be passed to and returned from a function, assigned to a variable or stored
in a container.
A fiber is continued by calling resume()/resume_with().
Usage
    namespace ctx = boost::context;
    int a;
    ctx::fiber source{[&a](ctx::fiber&& sink) {
        a = 0;
        int b = 1;
        for (;;) {
            sink = std::move(sink).resume();
            int next = a + b;
            a = b;
            b = next;
        }
        return std::move(sink);
    }};
    for (int j = 0; j < 10; ++j) {
        source = std::move(source).resume();
        std::cout << a << " ";
    }

    output:
        0 1 1 2 3 5 8 13 21 34
This simple example demonstrates the basic usage of fiber
as a generator. The fiber sink
represents the main-fiber (function main()). sink
is captured (current-fiber) by invoking fiber
and passed as parameter to the lambda.
Because the state is invalidated (one-shot fiber) by each call of fiber::resume(),
the new state of the fiber,
returned by fiber::resume(), needs to be assigned
to sink after each call. In
order to express the invalidation of the resumed fiber, the member functions
resume()
and resume_with()
are rvalue-ref qualified. Both functions bind only to rvalues. Thus an lvalue
fiber must be cast to an rvalue via std::move().
The lambda that calculates the Fibonacci numbers is executed inside the fiber
represented by source. Calculated
Fibonacci numbers are transferred between the two fibers via variable a (lambda capture reference).
The local variables b and
next retain their values during
each context switch. This is possible because source
has its own stack and the stack is exchanged by each context switch.
Parameter passing
Data can be transferred between two fibers via global pointers, calling wrappers
(like std::bind) or lambda captures.
    namespace ctx = boost::context;
    int i = 1;
    ctx::fiber f1{[&i](ctx::fiber&& f2) {
        std::printf("inside f1,i==%d\n", i);
        i += 1;
        return std::move(f2).resume();
    }};
    f1 = std::move(f1).resume();
    std::printf("i==%d\n", i);

    output:
        inside f1,i==1
        i==2

f1.resume()
enters the lambda in the fiber represented by f1
with lambda capture reference i=1. The expression
f2.resume()
resumes the fiber f2. On return
of f1.resume(),
the variable i has the value
of i+1.
Exception handling
If the function executed inside a context-function emits
an exception, the application is terminated by calling std::terminate(). std::exception_ptr
can be used to transfer exceptions between different fibers.
Do not jump from inside a catch block and then re-throw the exception in
another fiber.
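The capture-and-rethrow pattern behind std::exception_ptr can be sketched with the standard library alone. The example below deliberately contains no fiber, so it compiles without Boost; capture_failure and rethrow_and_describe are hypothetical helper names. In real code the first function's body would run inside the context-function before control is handed back, and the second would run in the fiber that was resumed.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// "Producer" side: catch the exception where it occurs and store it instead
// of letting it escape (an escaping exception would end in std::terminate()).
std::exception_ptr capture_failure() {
    try {
        throw std::runtime_error("parse error");
    } catch (std::exception const&) {
        // Catching std::exception rather than (...) leaves exceptions used
        // internally for stack unwinding untouched.
        return std::current_exception();
    }
    return nullptr;  // not reached here; kept for the no-error case
}

// "Consumer" side: re-throw the stored exception in the receiving context
// and handle it there.
std::string rethrow_and_describe(std::exception_ptr ep) {
    if (!ep) {
        return "no error";
    }
    try {
        std::rethrow_exception(ep);
    } catch (std::exception const& e) {
        return e.what();
    }
    return "unreachable";  // rethrow_exception never returns normally
}
```

Because the exception is fully caught before any switch takes place and re-thrown only afterwards, no context switch happens from inside a catch block, which is the situation the warning above rules out.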
Executing function on top of a fiber
Sometimes it is useful to execute a new function on top of a resumed fiber.
For this purpose fiber::resume_with() has to be
used. The function passed as argument must accept an rvalue reference to fiber and return a fiber.
    namespace ctx = boost::context;
    int data = 0;
    ctx::fiber f1{[&data](ctx::fiber&& f2) {
        std::cout << "f1: entered first time: " << data << std::endl;
        data += 1;
        f2 = std::move(f2).resume();
        std::cout << "f1: entered second time: " << data << std::endl;
        data += 1;
        f2 = std::move(f2).resume();
        std::cout << "f1: entered third time: " << data << std::endl;
        return std::move(f2);
    }};
    f1 = std::move(f1).resume();
    std::cout << "f1: returned first time: " << data << std::endl;
    data += 1;
    f1 = std::move(f1).resume();
    std::cout << "f1: returned second time: " << data << std::endl;
    data += 1;
    // resume_with() is rvalue-ref qualified, so f1 must be moved
    f1 = std::move(f1).resume_with([&data](ctx::fiber&& f2) {
        std::cout << "f2: entered: " << data << std::endl;
        data = -1;
        return std::move(f2);
    });
    std::cout << "f1: returned third time" << std::endl;

    output:
        f1: entered first time: 0
        f1: returned first time: 1
        f1: entered second time: 2
        f1: returned second time: 3
        f2: entered: 4
        f1: entered third time: -1
        f1: returned third time
The expression f1.resume_with(...)
executes a lambda on top of fiber f1,
i.e. an additional stack frame is allocated on top of the stack. This lambda
assigns -1
to data and returns to the
second invocation of f1.resume().
Another option is to execute a function on top of the fiber that throws an
exception.
    namespace ctx = boost::context;
    struct my_exception : public std::runtime_error {
        ctx::fiber f;
        my_exception(ctx::fiber&& f_, std::string const& what) :
            std::runtime_error{what},
            f{std::move(f_)} {
        }
    };

    ctx::fiber f{[](ctx::fiber&& f) -> ctx::fiber {
        std::cout << "entered" << std::endl;
        try {
            f = std::move(f).resume();
        } catch (my_exception& ex) {
            std::cerr << "my_exception: " << ex.what() << std::endl;
            return std::move(ex.f);
        }
        return {};
    }};
    f = std::move(f).resume();
    f = std::move(f).resume_with([](ctx::fiber&& f) -> ctx::fiber {
        throw my_exception(std::move(f), "abc");
        return {};
    });

    output:
        entered
        my_exception: abc
In this example my_exception
is thrown from a function invoked on top of fiber f
and caught inside the try block of the context-function.
Stack unwinding
On construction of fiber a stack
is allocated. If the context-function returns, the stack
will be destructed. If the context-function has not yet
returned and the destructor of a valid fiber
instance (i.e. fiber::operator bool() returns true) is called, the stack will be destructed
too.
Code executed by the context-function must not prevent the
propagation of the detail::forced_unwind exception.
Absorbing that exception will cause stack unwinding to fail. Thus, any code
that catches all exceptions must re-throw any pending detail::forced_unwind
exception.
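The re-throw rule boils down to ordering the handlers so that the unwinding exception is matched first and passed on. The sketch below shows that ordering with standard C++ only; unwind_signal is a hypothetical stand-in for detail::forced_unwind so that the example compiles without Boost, and action and guarded are likewise illustrative names.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for detail::forced_unwind; NOT the Boost type.
struct unwind_signal {};

enum class action { fail, unwind };

// Code running inside a context-function that wants to catch "everything".
std::string guarded(action a) {
    std::string outcome = "completed";
    try {
        if (a == action::unwind) {
            throw unwind_signal{};  // simulates the library unwinding the stack
        }
        throw std::runtime_error("ordinary failure");
    } catch (unwind_signal const&) {
        throw;  // never absorb the unwinding exception: pass it on unchanged
    } catch (...) {
        outcome = "handled";  // all other exceptions may be handled locally
    }
    return outcome;
}
```

With this ordering guarded(action::fail) returns "handled", while guarded(action::unwind) lets the stand-in exception propagate to the caller, which is what the library needs in order to finish unwinding the stack.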
Allocating control structures on top of stack
Allocating control structures on top of the stack requires allocating the
stack_context and creating the control structure with placement
new before fiber is created.
The user is responsible for destructing the control structure at the top
of the stack.
    namespace ctx = boost::context;
    // stack-allocator used for (de-)allocating stack
    fixedsize_stack salloc(4048);
    // allocate stack space
    stack_context sctx(salloc.allocate());
    // reserve space for control structure on top of the stack
    void* sp = static_cast<char*>(sctx.sp) - sizeof(my_control_structure);
    std::size_t size = sctx.size - sizeof(my_control_structure);
    // placement new creates control structure on reserved space
    my_control_structure* cs = new (sp) my_control_structure(sp, size, sctx, salloc);
    ...
    // destructing the control structure
    cs->~my_control_structure();
    ...
    struct my_control_structure {
        // captured fiber
        ctx::fiber f;

        template <typename StackAllocator>
        my_control_structure(void* sp, std::size_t size, stack_context sctx, StackAllocator salloc) :
            // create captured fiber
            f{std::allocator_arg, preallocated(sp, size, sctx), salloc, entry_func} {
        }
        ...
    };

Inverting the control flow
    namespace ctx = boost::context;
    /*
     * grammar:
     *   P ---> E '\0'
     *   E ---> T {('+'|'-') T}
     *   T ---> S {('*'|'/') S}
     *   S ---> digit | '(' E ')'
     */
    class Parser {
        char next;
        std::istream& is;
        std::function<void(char)> cb;

        char pull() {
            return std::char_traits<char>::to_char_type(is.get());
        }

        void scan() {
            do {
                next = pull();
            } while (isspace(next));
        }

    public:
        Parser(std::istream& is_, std::function<void(char)> cb_) :
            next(), is(is_), cb(cb_) {
        }

        void run() {
            scan();
            E();
        }

    private:
        void E() {
            T();
            while (next == '+' || next == '-') {
                cb(next);
                scan();
                T();
            }
        }

        void T() {
            S();
            while (next == '*' || next == '/') {
                cb(next);
                scan();
                S();
            }
        }

        void S() {
            if (isdigit(next)) {
                cb(next);
                scan();
            } else if (next == '(') {
                cb(next);
                scan();
                E();
                if (next == ')') {
                    cb(next);
                    scan();
                } else {
                    throw std::runtime_error("parsing failed");
                }
            } else {
                throw std::runtime_error("parsing failed");
            }
        }
    };

    std::istringstream is("1+1");
    // user-code pulls parsed data from parser
    // invert control flow
    char c;
    bool done = false;
    // execute parser in new fiber
    ctx::fiber source{[&is, &c, &done](ctx::fiber&& sink) {
        // create parser with callback function
        Parser p(is, [&sink, &c](char c_) {
            // resume main fiber
            c = c_;
            sink = std::move(sink).resume();
        });
        // start recursive parsing
        p.run();
        // signal termination
        done = true;
        // resume main fiber
        return std::move(sink);
    }};
    source = std::move(source).resume();
    while (!done) {
        printf("Parsed: %c\n", c);
        source = std::move(source).resume();
    }

    output:
        Parsed: 1
        Parsed: +
        Parsed: 1
In this example a recursive descent parser uses a callback to emit a newly
parsed symbol. Using fiber the
control flow can be inverted, i.e. the user code pulls parsed symbols from
the parser instead of having them pushed by the parser (via callback).
The data (character) is transferred between the two fibers.
Implementations: fcontext_t, ucontext_t and WinFiber
fcontext_t
The implementation uses fcontext_t by default. fcontext_t
is based on assembler and is not available for all platforms. It provides
much better performance than ucontext_t (the context
switch takes two orders of magnitude fewer CPU cycles; see section performance)
and WinFiber.
Because the TIB (thread information block on Windows) is not fully described
in the MSDN, it is possible that not all required TIB parts are swapped.
Using the WinFiber implementation might be an alternative.
ucontext_t
As an alternative, ucontext_t
can be used by compiling with BOOST_USE_UCONTEXT
and b2 property context-impl=ucontext.
ucontext_t might be available on a broader range of
POSIX platforms but has some disadvantages
(for instance, it has been deprecated since POSIX.1-2003 and does not conform to C99).
fiber supports segmented
stacks only with ucontext_t as its
implementation.
WinFiber
With BOOST_USE_WINFIB and
b2 property context-impl=winfib,
Win32 fibers are used as the implementation for fiber.
The first call of fiber
converts the thread into a Windows fiber by invoking ConvertThreadToFiber(). If desired, ConvertFiberToThread() has to be called by the user explicitly
in order to release resources allocated by ConvertThreadToFiber() (e.g. after using Boost.Context).
Class fiber
    #include <boost/context/fiber.hpp>

    class fiber {
    public:
        fiber() noexcept;

        template <typename Fn>
        fiber(Fn&& fn);

        template <typename StackAlloc, typename Fn>
        fiber(std::allocator_arg_t, StackAlloc&& salloc, Fn&& fn);

        ~fiber();

        fiber(fiber&& other) noexcept;
        fiber& operator=(fiber&& other) noexcept;

        fiber(fiber const& other) noexcept = delete;
        fiber& operator=(fiber const& other) noexcept = delete;

        fiber resume() &&;

        template <typename Fn>
        fiber resume_with(Fn&& fn) &&;

        explicit operator bool() const noexcept;
        bool operator!() const noexcept;

        bool operator==(fiber const& other) const noexcept;
        bool operator!=(fiber const& other) const noexcept;
        bool operator<(fiber const& other) const noexcept;
        bool operator>(fiber const& other) const noexcept;
        bool operator<=(fiber const& other) const noexcept;
        bool operator>=(fiber const& other) const noexcept;

        template <typename charT, class traitsT>
        friend std::basic_ostream<charT, traitsT>&
        operator<<(std::basic_ostream<charT, traitsT>& os, fiber const& other);

        void swap(fiber& other) noexcept;
    };
Constructor
    fiber() noexcept;
Effects:
Creates an invalid fiber.
Throws:
Nothing.
Constructor
    template <typename Fn>
    fiber(Fn&& fn);

    template <typename StackAlloc, typename Fn>
    fiber(std::allocator_arg_t, StackAlloc&& salloc, Fn&& fn);
Effects:
Creates a new fiber and prepares the context to execute fn. fixedsize_stack
is used as the default stack allocator (stack size == fixedsize_stack::traits::default_size()).
The constructor with argument type preallocated
is used to create user-defined data (for
instance additional control structures) on top of the stack.
Destructor
    ~fiber();
Effects:
Destructs the associated stack if *this is a valid fiber, i.e. fiber::operator
bool() returns true.
Throws:
Nothing.
Move constructor
    fiber(fiber&& other) noexcept;
Effects:
Moves the underlying captured fiber to *this.
Throws:
Nothing.
Move assignment operator
    fiber& operator=(fiber&& other) noexcept;
Effects:
Moves the state of other
to *this
using move semantics.
Throws:
Nothing.
Member functions resume() and resume_with()
    fiber resume() &&;

    template <typename Fn>
    fiber resume_with(Fn&& fn) &&;
Effects:
Captures the current fiber and resumes *this. The function resume_with
is used to execute function fn
in the execution context of *this (i.e. the stack frame of fn is allocated on the stack of *this).
Returns:
The fiber representing the fiber that has been suspended.
Note:
Because *this
gets invalidated, resume() and resume_with() are rvalue-ref qualified and bind
only to rvalues.
Note:
Function fn needs to
return fiber.
Note:
The returned fiber indicates whether the suspended fiber has terminated
(return from context-function) via operator bool().
Member function operator bool()
    explicit operator bool() const noexcept;
Returns:
true if *this
points to a captured fiber.
Throws:
Nothing.
Member function operator!()
    bool operator!() const noexcept;
Returns:
true if *this
does not point to a captured fiber.
Throws:
Nothing.
Member function operator==()
    bool operator==(fiber const& other) const noexcept;
Returns:
true if *this
and other represent
the same fiber, false
otherwise.
Throws:
Nothing.
Member function operator!=()
    bool operator!=(fiber const& other) const noexcept;
Returns:
!(other == *this)
Throws:
Nothing.
Member function operator<()
    bool operator<(fiber const& other) const noexcept;
Returns:
true if *this != other
is true and the implementation-defined total order of fiber values places *this
before other, false
otherwise.
Throws:
Nothing.
Member function operator>()
    bool operator>(fiber const& other) const noexcept;
Returns:
other < *this
Throws:
Nothing.
Member function operator<=()
    bool operator<=(fiber const& other) const noexcept;
Returns:
!(other < *this)
Throws:
Nothing.
Member function operator>=()
    bool operator>=(fiber const& other) const noexcept;
Returns:
!(*this < other)
Throws:
Nothing.
Non-member function operator<<()
    template <typename charT, class traitsT>
    std::basic_ostream<charT, traitsT>&
    operator<<(std::basic_ostream<charT, traitsT>& os, fiber const& other);
Effects:
Writes the representation of other
to stream os.
Returns:
os
Context switching with call/cc
call/cc is the reference implementation of C++ proposal
P0534R3:
call/cc (call-with-current-continuation): A low-level API for stackful context
switching.
call/cc (call with current continuation) is a universal
control operator (well-known from the programming language Scheme) that captures
the current continuation as a first-class object and passes it as an argument
to another continuation.
A continuation (abstract concept of functional programming languages) represents
the state of the control flow of a program at a given point in time. Continuations
can be suspended and resumed later in order to change the control flow of a
program.
Modern microprocessors are register machines; the content of the processor registers
represents a continuation of the executed program at a given point in time.
Operating systems simulate parallel execution of programs on a single processor
by switching between programs (context switch), preserving and restoring
the continuation, i.e. the content of all registers.
callcc()
callcc() is the C++ equivalent
to Scheme's call/cc operator. It captures the current
continuation (the rest of the computation; code after callcc())
and triggers a context switch. The context switch is achieved by preserving
certain registers (including instruction and stack pointer), defined by the
calling convention of the ABI, of the current continuation and restoring those
registers of the resumed continuation. The control flow of the resumed continuation
continues. The current continuation is suspended and passed as argument to
the resumed continuation.
callcc() expects a context-function
with signature 'continuation(continuation &&
c)'. The parameter c
represents the current continuation from which this continuation was resumed
(i.e. the one that called callcc()).
On return, the context-function of the current continuation
has to specify a continuation
to which the execution control is transferred after termination of the current
continuation.
If an instance with valid state goes out of scope and the context-function
has not yet returned, the stack is traversed in order to access the control
structure (address stored at the first stack frame) and the continuation's stack
is deallocated via the StackAllocator.
Segmented stacks are
supported by callcc() using
ucontext_t.
continuation
continuation represents a continuation;
it contains the content of preserved registers and manages the associated stack
(allocation/deallocation). continuation
is a one-shot continuation: it can be used only once; after calling continuation::resume()
or continuation::resume_with() it is invalidated.
continuation is only move-constructible
and move-assignable.
As a first-class object, continuation
can be passed to and returned from a function, assigned to a variable or stored
in a container.
A continuation is continued by calling resume()/resume_with().
Usage
    namespace ctx = boost::context;
    int a;
    ctx::continuation source = ctx::callcc([&a](ctx::continuation&& sink) {
        a = 0;
        int b = 1;
        for (;;) {
            sink = sink.resume();
            int next = a + b;
            a = b;
            b = next;
        }
        return std::move(sink);
    });
    for (int j = 0; j < 10; ++j) {
        std::cout << a << " ";
        source = source.resume();
    }

    output:
        0 1 1 2 3 5 8 13 21 34
This simple example demonstrates the basic usage of call/cc
as a generator. The continuation sink
represents the main-continuation (function main()).
sink is captured (current-continuation)
by invoking callcc() and passed
as parameter to the lambda.
Because the state is invalidated (one-shot continuation) by each call of continuation::resume(),
the new state of the continuation,
returned by continuation::resume(), needs to be assigned
to sink after each call.
The lambda that calculates the Fibonacci numbers is executed inside the continuation
represented by source. Calculated
Fibonacci numbers are transferred between the two continuations via variable
a (lambda capture reference).
The local variables b and
next retain their values during
each context switch. This is possible because source
has its own stack and the stack is exchanged by each context switch.
Parameter passing
Data can be transferred between two continuations via global pointers, calling
wrappers (like std::bind) or lambda captures.
    namespace ctx = boost::context;
    int i = 1;
    ctx::continuation c1 = ctx::callcc([&i](ctx::continuation&& c2) {
        std::printf("inside c1,i==%d\n", i);
        i += 1;
        return c2.resume();
    });
    std::printf("i==%d\n", i);

    output:
        inside c1,i==1
        i==2

callcc(<lambda>)
enters the lambda in the continuation represented by c1
with lambda capture reference i=1. The expression
c2.resume()
resumes the continuation c2.
On return of callcc(<lambda>),
the variable i has the value
of i+1.
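The difference between a calling wrapper and a lambda capture lies only in how the extra data is attached to a callable that must keep a fixed one-argument signature. The sketch below illustrates this without Boost: handle is a hypothetical stand-in for continuation, invoke_context_function merely imitates the library calling the context-function, and add_one, via_bind and via_lambda are illustrative names.

```cpp
#include <functional>
#include <utility>

// Hypothetical stand-in for a continuation; NOT the Boost type.
struct handle {
    int id;
};

// Imitates the library side: it knows only the one-argument signature
// handle(handle&&) and has no way to pass additional data.
template <typename Fn>
handle invoke_context_function(Fn&& fn) {
    return std::forward<Fn>(fn)(handle{1});
}

// A free function that needs an extra parameter besides the handle.
handle add_one(int& counter, handle&& h) {
    counter += 1;
    return std::move(h);
}

// Variant 1: std::bind fixes the extra argument, std::ref keeps it a reference.
int via_bind() {
    int counter = 1;
    auto fn = std::bind(add_one, std::ref(counter), std::placeholders::_1);
    invoke_context_function(fn);
    return counter;
}

// Variant 2: the lambda captures the variable by reference.
int via_lambda() {
    int counter = 1;
    invoke_context_function([&counter](handle&& h) {
        counter += 1;
        return std::move(h);
    });
    return counter;
}
```

Both via_bind() and via_lambda() return 2, mirroring the value of i in the example above. Without std::ref, std::bind would store a copy of counter and the caller would not observe the change.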
Exception handling
If the function executed inside a context-function emits
an exception, the application is terminated by calling std::terminate(). std::exception_ptr
can be used to transfer exceptions between different continuations.
Do not jump from inside a catch block and then re-throw the exception in
another continuation.
Executing function on top of a continuation
Sometimes it is useful to execute a new function on top of a resumed continuation.
For this purpose continuation::resume_with() has to be
used. The function passed as argument must accept an rvalue reference to continuation and return a continuation.
    namespace ctx = boost::context;
    int data = 0;
    ctx::continuation c = ctx::callcc([&data](ctx::continuation&& c) {
        std::cout << "f1: entered first time: " << data << std::endl;
        data += 1;
        c = c.resume();
        std::cout << "f1: entered second time: " << data << std::endl;
        data += 1;
        c = c.resume();
        std::cout << "f1: entered third time: " << data << std::endl;
        return std::move(c);
    });
    std::cout << "f1: returned first time: " << data << std::endl;
    data += 1;
    c = c.resume();
    std::cout << "f1: returned second time: " << data << std::endl;
    data += 1;
    c = c.resume_with([&data](ctx::continuation&& c) {
        std::cout << "f2: entered: " << data << std::endl;
        data = -1;
        return std::move(c);
    });
    std::cout << "f1: returned third time" << std::endl;

    output:
        f1: entered first time: 0
        f1: returned first time: 1
        f1: entered second time: 2
        f1: returned second time: 3
        f2: entered: 4
        f1: entered third time: -1
        f1: returned third time
The expression c.resume_with(...)
executes a lambda on top of continuation c,
i.e. an additional stack frame is allocated on top of the stack. This lambda
assigns -1
to data and returns to the
second invocation of c.resume().
Another option is to execute a function on top of the continuation that throws
an exception.
    namespace ctx = boost::context;
    struct my_exception : public std::runtime_error {
        ctx::continuation c;
        my_exception(ctx::continuation&& c_, std::string const& what) :
            std::runtime_error{what},
            c{std::move(c_)} {
        }
    };

    ctx::continuation c = ctx::callcc([](ctx::continuation&& c) {
        for (;;) {
            try {
                std::cout << "entered" << std::endl;
                c = c.resume();
            } catch (my_exception& ex) {
                std::cerr << "my_exception: " << ex.what() << std::endl;
                return std::move(ex.c);
            }
        }
        return std::move(c);
    });
    c = c.resume_with([](ctx::continuation&& c) {
        throw my_exception(std::move(c), "abc");
        return std::move(c);
    });

    output:
        entered
        my_exception: abc
In this example my_exception
is thrown from a function invoked on top of continuation c
and caught inside the for-loop.
Stack unwinding
On construction of continuation
a stack is allocated. If the context-function returns,
the stack will be destructed. If the context-function
has not yet returned and the destructor of a valid continuation
instance (i.e. continuation::operator bool() returns
true) is called, the stack will
be destructed too.
Code executed by the context-function must not prevent the
propagation of the detail::forced_unwind exception.
Absorbing that exception will cause stack unwinding to fail. Thus, any code
that catches all exceptions must re-throw any pending detail::forced_unwind
exception.
Allocating control structures on top of stack
Allocating control structures on top of the stack requires allocating the
stack_context and creating the control structure with placement
new before continuation is created.
The user is responsible for destructing the control structure at the top
of the stack.
    namespace ctx = boost::context;
    // stack-allocator used for (de-)allocating stack
    fixedsize_stack salloc(4048);
    // allocate stack space
    stack_context sctx(salloc.allocate());
    // reserve space for control structure on top of the stack
    void* sp = static_cast<char*>(sctx.sp) - sizeof(my_control_structure);
    std::size_t size = sctx.size - sizeof(my_control_structure);
    // placement new creates control structure on reserved space
    my_control_structure* cs = new (sp) my_control_structure(sp, size, sctx, salloc);
    ...
    // destructing the control structure
    cs->~my_control_structure();
    ...
    struct my_control_structure {
        // captured continuation
        ctx::continuation c;

        template <typename StackAllocator>
        my_control_structure(void* sp, std::size_t size, stack_context sctx, StackAllocator salloc) :
            // create captured continuation
            c{} {
            c = ctx::callcc(std::allocator_arg, preallocated(sp, size, sctx), salloc, entry_func);
        }
        ...
    };

Inverting the control flow
    namespace ctx = boost::context;
    /*
     * grammar:
     *   P ---> E '\0'
     *   E ---> T {('+'|'-') T}
     *   T ---> S {('*'|'/') S}
     *   S ---> digit | '(' E ')'
     */
    class Parser {
        char next;
        std::istream& is;
        std::function<void(char)> cb;

        char pull() {
            return std::char_traits<char>::to_char_type(is.get());
        }

        void scan() {
            do {
                next = pull();
            } while (isspace(next));
        }

    public:
        Parser(std::istream& is_, std::function<void(char)> cb_) :
            next(), is(is_), cb(cb_) {
        }

        void run() {
            scan();
            E();
        }

    private:
        void E() {
            T();
            while (next == '+' || next == '-') {
                cb(next);
                scan();
                T();
            }
        }

        void T() {
            S();
            while (next == '*' || next == '/') {
                cb(next);
                scan();
                S();
            }
        }

        void S() {
            if (isdigit(next)) {
                cb(next);
                scan();
            } else if (next == '(') {
                cb(next);
                scan();
                E();
                if (next == ')') {
                    cb(next);
                    scan();
                } else {
                    throw std::runtime_error("parsing failed");
                }
            } else {
                throw std::runtime_error("parsing failed");
            }
        }
    };

    std::istringstream is("1+1");
    // execute parser in new continuation
    ctx::continuation source;
    // user-code pulls parsed data from parser
    // invert control flow
    char c;
    bool done = false;
    source = ctx::callcc([&is, &c, &done](ctx::continuation&& sink) {
        // create parser with callback function
        Parser p(is, [&sink, &c](char c_) {
            // resume main continuation
            c = c_;
            sink = sink.resume();
        });
        // start recursive parsing
        p.run();
        // signal termination
        done = true;
        // resume main continuation
        return std::move(sink);
    });
    while (!done) {
        printf("Parsed: %c\n", c);
        source = source.resume();
    }

    output:
        Parsed: 1
        Parsed: +
        Parsed: 1
In this example a recursive descent parser uses a callback to emit a newly
parsed symbol. Using call/cc the control flow can be inverted,
i.e. the user code pulls parsed symbols from the parser instead of having them pushed
by the parser (via callback).
The data (character) is transferred between the two continuations.
Implementations: fcontext_t, ucontext_t and WinFiber
fcontext_t
The implementation uses fcontext_t by default. fcontext_t
is based on assembler and is not available for all platforms. It provides
much better performance than ucontext_t (the context
switch takes two orders of magnitude fewer CPU cycles; see section performance)
and WinFiber.
Because the TIB (thread information block on Windows) is not fully described
in the MSDN, it is possible that not all required TIB parts are swapped.
Using the WinFiber implementation might be an alternative.
ucontext_t
As an alternative, ucontext_t
can be used by compiling with BOOST_USE_UCONTEXT
and b2 property context-impl=ucontext.
ucontext_t might be available on a broader range of
POSIX platforms but has some disadvantages
(for instance, it has been deprecated since POSIX.1-2003 and does not conform to C99).
callcc() supports segmented stacks only with
ucontext_t as its implementation.
WinFiber
With BOOST_USE_WINFIB and
b2 property context-impl=winfib,
Win32 fibers are used as the implementation for callcc().
The first call of callcc()
converts the thread into a Windows fiber by invoking ConvertThreadToFiber(). If desired, ConvertFiberToThread() has to be called by the user explicitly
in order to release resources allocated by ConvertThreadToFiber() (e.g. after using Boost.Context).
Class continuation
    #include <boost/context/continuation.hpp>

    class continuation {
    public:
        continuation() noexcept = default;

        ~continuation();

        continuation(continuation&& other) noexcept;
        continuation& operator=(continuation&& other) noexcept;

        continuation(continuation const& other) noexcept = delete;
        continuation& operator=(continuation const& other) noexcept = delete;

        continuation resume();

        template <typename Fn>
        continuation resume_with(Fn&& fn);

        explicit operator bool() const noexcept;
        bool operator!() const noexcept;

        bool operator==(continuation const& other) const noexcept;
        bool operator!=(continuation const& other) const noexcept;
        bool operator<(continuation const& other) const noexcept;
        bool operator>(continuation const& other) const noexcept;
        bool operator<=(continuation const& other) const noexcept;
        bool operator>=(continuation const& other) const noexcept;

        template <typename charT, class traitsT>
        friend std::basic_ostream<charT, traitsT>&
        operator<<(std::basic_ostream<charT, traitsT>& os, continuation const& other);

        void swap(continuation& other) noexcept;
    };
Constructor
    continuation() noexcept;
Effects:
Creates an invalid continuation.
Throws:
Nothing.
Destructor
    ~continuation();
Effects:
Destructs the associated stack if *this is a valid continuation, i.e.
continuation::operator bool() returns true.
Throws:
Nothing.
Move constructor
    continuation(continuation&& other) noexcept;
Effects:
Moves the underlying captured continuation to *this.
Throws:
Nothing.
Move assignment operator
    continuation& operator=(continuation&& other) noexcept;
Effects:
Moves the state of other
to *this
using move semantics.
Throws:
Nothing.
Member functions resume() and resume_with()
    continuation resume();

    template <typename Fn>
    continuation resume_with(Fn&& fn);
Effects:
Captures the current continuation and resumes *this. The function resume_with
is used to execute function fn
in the execution context of *this (i.e. the stack frame of fn is allocated on the stack of *this).
Returns:
The continuation representing the continuation that has been suspended.
Note:
Function fn needs to
return continuation.
Note:
The returned continuation indicates whether the suspended continuation has
terminated (return from context-function) via operator bool().
Member function operator bool()

    explicit operator bool() const noexcept;

Returns:
true if *this points to a captured continuation.
Throws:
Nothing.
Member function operator!()

    bool operator!() const noexcept;

Returns:
true if *this does not point to a captured continuation.
Throws:
Nothing.
Member function operator==()

    bool operator==(continuation const& other) const noexcept;

Returns:
true if *this and other represent the same continuation, false otherwise.
Throws:
Nothing.
Member function operator!=()

    bool operator!=(continuation const& other) const noexcept;

Returns:
!(other == *this)
Throws:
Nothing.
Member function operator<()

    bool operator<(continuation const& other) const noexcept;

Returns:
true if *this != other
is true and the implementation-defined total order of continuation values places *this
before other, false otherwise.
Throws:
Nothing.
Member function operator>()

    bool operator>(continuation const& other) const noexcept;

Returns:
other < *this
Throws:
Nothing.
Member function operator<=()

    bool operator<=(continuation const& other) const noexcept;

Returns:
!(other < *this)
Throws:
Nothing.
Member function operator>=()

    bool operator>=(continuation const& other) const noexcept;

Returns:
!(*this < other)
Throws:
Nothing.
Non-member function operator<<()

    template<typename charT, class traitsT>
    std::basic_ostream<charT, traitsT> &
    operator<<(std::basic_ostream<charT, traitsT> & os, continuation const& other);

Effects:
Writes the representation of other
to stream os.
Returns:
os
Call with current continuation

    #include <boost/context/continuation.hpp>

    template<typename Fn>
    continuation callcc(Fn && fn);

    template<typename StackAlloc, typename Fn>
    continuation callcc(std::allocator_arg_t, StackAlloc salloc, Fn && fn);

    template<typename StackAlloc, typename Fn>
    continuation callcc(std::allocator_arg_t, preallocated palloc, StackAlloc salloc, Fn && fn);

Effects:
Captures the current continuation and creates a new continuation prepared
to execute fn. fixedsize_stack is used as the default
stack allocator (stack size == fixedsize_stack::traits_type::default_size()).
The overload taking an argument of type preallocated
is used to place user-defined data (for
instance additional control structures) on top of the stack.
Returns:
The continuation representing the continuation that has been
suspended.
Note:
The returned continuation indicates whether the suspended continuation has
terminated (returned from the context-function) via operator bool().
Stack allocation
The memory used by the stack is allocated/deallocated via a StackAllocator
which is required to model a stack-allocator concept.
stack-allocator concept
A StackAllocator must satisfy the stack-allocator
concept requirements shown in the following table, in which a is an object of a StackAllocator
type, sctx is a stack_context, and size
is a std::size_t:

    expression            return type      notes
    a(size)                                creates a stack allocator
    a.allocate()          stack_context    creates a stack
    a.deallocate(sctx)    void             deallocates the stack created by a.allocate()

The implementation of allocate() might include logic to protect against
exceeding the context's available stack size rather than leaving it as undefined
behaviour.
Calling deallocate()
with a stack_context not
set by allocate()
results in undefined behaviour.
Depending on the architecture allocate() stores an address from the top of the stack
(growing downwards) or the bottom of the stack (growing upwards).
Class protected_fixedsize_stack
Boost.Context provides the class protected_fixedsize_stack
which models the stack-allocator concept. It appends
a guard page at the end of each stack to protect against exceeding the stack.
If the guard page is accessed (read or write operation) a segmentation fault/access
violation is generated by the operating system.
Allocating a stack with protected_fixedsize_stack is expensive. That
is, launching a new coroutine with a new stack is expensive; once allocated, the
stack is just as efficient to use as any other stack.
The appended guard page
is not mapped to physical memory; only
virtual addresses are used.

    #include <boost/context/protected_fixedsize_stack.hpp>

    template<typename traitsT>
    struct basic_protected_fixedsize_stack {
        typedef traitsT traits_type;

        basic_protected_fixedsize_stack(std::size_t size = traits_type::default_size());

        stack_context allocate();

        void deallocate(stack_context &);
    };

    typedef basic_protected_fixedsize_stack<stack_traits> protected_fixedsize_stack;

    stack_context allocate()

Preconditions:
traits_type::minimum_size() <= size
and !traits_type::is_unbounded() && (traits_type::maximum_size() >= size).
Effects:
Allocates memory of at least size
bytes and stores a pointer to the stack and its actual size in sctx. Depending on the architecture
(the stack grows downwards/upwards) the stored address is the highest/lowest
address of the stack.

    void deallocate(stack_context & sctx)

Preconditions:
sctx.sp is valid, traits_type::minimum_size() <= sctx.size and !traits_type::is_unbounded() && (traits_type::maximum_size() >= sctx.size).
Effects:
Deallocates the stack space.
Class pooled_fixedsize_stack
Boost.Context provides the class pooled_fixedsize_stack
which models the stack-allocator concept. In contrast
to protected_fixedsize_stack it does not append a guard
page at the end of each stack. The memory is managed internally by boost::pool<>.

    #include <boost/context/pooled_fixedsize_stack.hpp>

    template<typename traitsT>
    struct basic_pooled_fixedsize_stack {
        typedef traitsT traits_type;

        basic_pooled_fixedsize_stack(std::size_t stack_size = traits_type::default_size(),
                                     std::size_t next_size = 32,
                                     std::size_t max_size = 0);

        stack_context allocate();

        void deallocate(stack_context &);
    };

    typedef basic_pooled_fixedsize_stack<stack_traits> pooled_fixedsize_stack;

    basic_pooled_fixedsize_stack(std::size_t stack_size, std::size_t next_size, std::size_t max_size)

Preconditions:
!traits_type::is_unbounded() && (traits_type::maximum_size() >= stack_size)
and 0 < next_size.
Effects:
Allocates memory of at least stack_size
bytes and stores a pointer to the stack and its actual size in sctx. Depending on the architecture
(the stack grows downwards/upwards) the stored address is the highest/lowest
address of the stack. Argument next_size
determines the number of stacks to request from the system the first
time that *this
needs to allocate system memory. The third argument max_size
controls how much memory might be allocated for stacks; a value of
zero means no upper limit.

    stack_context allocate()

Preconditions:
!traits_type::is_unbounded() && (traits_type::maximum_size() >= stack_size).
Effects:
Allocates memory of at least stack_size
bytes and stores a pointer to the stack and its actual size in sctx. Depending on the architecture
(the stack grows downwards/upwards) the stored address is the highest/lowest
address of the stack.

    void deallocate(stack_context & sctx)

Preconditions:
sctx.sp is valid, !traits_type::is_unbounded() && (traits_type::maximum_size() >= sctx.size).
Effects:
Deallocates the stack space.
Class fixedsize_stack
Boost.Context provides the class fixedsize_stack
which models the stack-allocator concept. In contrast
to protected_fixedsize_stack it does not append a guard
page at the end of each stack. The memory is simply managed by std::malloc() and std::free().

    #include <boost/context/fixedsize_stack.hpp>

    template<typename traitsT>
    struct basic_fixedsize_stack {
        typedef traitsT traits_type;

        basic_fixedsize_stack(std::size_t size = traits_type::default_size());

        stack_context allocate();

        void deallocate(stack_context &);
    };

    typedef basic_fixedsize_stack<stack_traits> fixedsize_stack;

    stack_context allocate()

Preconditions:
traits_type::minimum_size() <= size
and !traits_type::is_unbounded() && (traits_type::maximum_size() >= size).
Effects:
Allocates memory of at least size
bytes and stores a pointer to the stack and its actual size in sctx. Depending on the architecture
(the stack grows downwards/upwards) the stored address is the highest/lowest
address of the stack.

    void deallocate(stack_context & sctx)

Preconditions:
sctx.sp is valid, traits_type::minimum_size() <= sctx.size and !traits_type::is_unbounded() && (traits_type::maximum_size() >= sctx.size).
Effects:
Deallocates the stack space.
Class segmented_stack
Boost.Context supports usage of a segmented_stack, i.e. the
size of the stack grows on demand. The coroutine is created with a minimal
stack size, which is increased as required. Class segmented_stack
models the stack-allocator concept. In contrast to
protected_fixedsize_stack and fixedsize_stack
it creates a stack which grows on demand.
Segmented stacks are currently only supported by gcc
from version 4.7 and clang
from version 3.4 onwards. In order to
use a segmented_stack, Boost.Context
must be built with property segmented-stacks,
e.g. toolset=gcc segmented-stacks=on, and
BOOST_USE_SEGMENTED_STACKS must be applied
at the b2/bjam command line.
Segmented stacks can only be used with callcc()
(using ucontext_t).

    #include <boost/context/segmented_stack.hpp>

    template<typename traitsT>
    struct basic_segmented_stack {
        typedef traitsT traits_type;

        basic_segmented_stack(std::size_t size = traits_type::default_size());

        stack_context allocate();

        void deallocate(stack_context &);
    };

    typedef basic_segmented_stack<stack_traits> segmented_stack;

    stack_context allocate()

Preconditions:
traits_type::minimum_size() <= size
and !traits_type::is_unbounded() && (traits_type::maximum_size() >= size).
Effects:
Allocates memory of at least size
bytes and stores a pointer to the stack and its actual size in sctx. Depending on the architecture
(the stack grows downwards/upwards) the stored address is the highest/lowest
address of the stack.

    void deallocate(stack_context & sctx)

Preconditions:
sctx.sp is valid, traits_type::minimum_size() <= sctx.size and !traits_type::is_unbounded() && (traits_type::maximum_size() >= sctx.size).
Effects:
Deallocates the stack space.
If the library is compiled for segmented stacks, segmented_stack
is the only available stack allocator.
Class stack_traits
stack_traits models a stack-traits
providing a way to access certain properties defined by the environment. Stack
allocators use stack-traits to allocate stacks.

    #include <boost/context/stack_traits.hpp>

    struct stack_traits {
        static bool is_unbounded() noexcept;

        static std::size_t page_size() noexcept;

        static std::size_t default_size() noexcept;

        static std::size_t minimum_size() noexcept;

        static std::size_t maximum_size() noexcept;
    };

    static bool is_unbounded()

Returns:
Returns true if the environment
defines no limit for the size of a stack.
Throws:
Nothing.

    static std::size_t page_size()

Returns:
Returns the page size in bytes.
Throws:
Nothing.

    static std::size_t default_size()

Returns:
Returns a default stack size, which may be platform specific. If the
stack is unbounded then the present implementation returns the maximum
of 64kB
and minimum_size().
Throws:
Nothing.

    static std::size_t minimum_size()

Returns:
Returns the minimum size in bytes of a stack defined by the environment
(Win32 4kB/Win64 8kB, defined by rlimit on POSIX).
Throws:
Nothing.

    static std::size_t maximum_size()

Preconditions:
is_unbounded()
returns false.
Returns:
Returns the maximum size in bytes of a stack defined by the environment.
Throws:
Nothing.
Class stack_context
Boost.Context provides the class stack_context
which contains the stack pointer and the size of the stack. In case of
a segmented_stack,
stack_context contains some extra control structures.

    struct stack_context {
        void * sp;
        std::size_t size;

        // might contain additional control structures
        // for segmented stacks
    };

    void * sp

Value:
Pointer to the beginning of the stack.

    std::size_t size

Value:
Actual size of the stack.
Support for valgrind
Running programs that switch stacks under valgrind causes problems. Property
(b2 command-line) valgrind=on lets
valgrind treat the memory regions as stack space, which suppresses the errors.
Users must define BOOST_USE_VALGRIND
before including any Boost.Context headers when linking against Boost binaries
compiled with valgrind=on.
Support for sanitizers
Sanitizers (GCC/Clang) are confused by the stack switches. The library
must be compiled with property (b2 command-line) context-impl=ucontext and the compiler's sanitizer options.
Users must define BOOST_USE_ASAN
before including any Boost.Context headers when linking against Boost binaries.
Struct preallocated

    struct preallocated {
        void * sp;
        std::size_t size;
        stack_context sctx;

        preallocated(void * sp, std::size_t size, stack_context sctx) noexcept;
    };

Constructor

    preallocated(void * sp, std::size_t size, stack_context sctx) noexcept;

Effects:
Creates an object of preallocated.
Performance
Performance measurements were taken using std::chrono::high_resolution_clock,
with overhead corrections. The code was compiled with gcc-6.3.1, using build
options: variant = release, optimization = speed. Tests were executed on dual
Intel XEON E5 2620v4 2.2GHz, 16C/32T, 64GB RAM, running Linux (x86_64).
Performance of context switch

    implementation                           cost per context switch
    callcc()/continuation (fcontext_t)       9 ns / 19 CPU cycles
    callcc()/continuation (ucontext_t)       547 ns / 1130 CPU cycles
    callcc()/continuation (Windows-Fiber)    49 ns / 98 CPU cycles

Architectures
Boost.Context, using fcontext_t,
supports the following architectures:
If the architecture is not supported but the platform provides ucontext_t,
Boost.Context should be compiled with BOOST_USE_UCONTEXT and b2 property context-impl=ucontext.
Cross compiling
Cross compiling the library requires specifying the build properties <architecture>,
<address-model>, <binary-format> and <abi> at the b2 command
line.
Rationale
No inline-assembler
Some newer compilers (for instance MSVC 10 for x86_64 and itanium) do not support
inline assembler (see the MSDN article
'Inline Assembler'). Inlined assembler also causes code bloat, which is not welcome
on embedded systems.
fcontext_t
Boost.Context provides the low level API fcontext_t
which is implemented in assembler to provide context swapping operations. fcontext_t
is the part to port to new platforms.
Context switches do not preserve the signal mask on UNIX systems.
fcontext_t is an opaque pointer.
Other APIs
setjmp()/longjmp()
C99 defines setjmp()/longjmp()
to provide non-local jumps but it does not require that longjmp()
preserves the current stack frame. Therefore, jumping into a function which
was exited via a call to longjmp() is undefined
(ISO/IEC 9899:1999, 2005, 7.13.2.1:2).
ucontext_t
Since POSIX.1-2004 ucontext_t
is deprecated and was removed in POSIX.1-2008! The function signature of
makecontext()
is:

    void makecontext(ucontext_t * ucp, void (*func)(), int argc, ...);

The third argument of makecontext() specifies the number of integer arguments
that follow. If func accepts those arguments, a function pointer cast is
required, which is undefined in C99
(ISO/IEC 9899:1999, 2005, J.2).
The arguments in the var-arg list are required to be integers. Passing pointers
in the var-arg list is not guaranteed to work; in particular it fails on architectures
where pointers are larger than integers.
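A common workaround for the integer-only argument list is to split a pointer into two int-sized halves and to reassemble it inside the entry function. The sketch below illustrates that idiom only; it assumes a POSIX system that still ships <ucontext.h> (for instance Linux with glibc) and a 32-bit int, the names worker_entry and run_worker_with_pointer are invented, and it still relies on the function pointer cast described above.

```cpp
#include <ucontext.h>
#include <cstdint>
#include <vector>

namespace {

ucontext_t caller_ctx;   // saved state of the calling code
ucontext_t worker_ctx;   // state prepared by makecontext()

// The pointer arrives as two halves because only int-sized arguments are passed.
void worker_entry(unsigned int high, unsigned int low) {
    std::uint64_t bits = (static_cast<std::uint64_t>(high) << 32) | low;
    int* out = reinterpret_cast<int*>(static_cast<std::uintptr_t>(bits));
    *out = 42;
    // Returning from this function continues at uc_link, i.e. caller_ctx.
}

} // namespace

int run_worker_with_pointer() {
    int result = 0;
    std::vector<char> stack(64 * 1024);   // memory used as the worker's stack

    getcontext(&worker_ctx);
    worker_ctx.uc_stack.ss_sp = stack.data();
    worker_ctx.uc_stack.ss_size = stack.size();
    worker_ctx.uc_link = &caller_ctx;

    // Widen to 64 bits first so the shift is well defined on 32-bit targets too.
    std::uint64_t bits = reinterpret_cast<std::uintptr_t>(&result);
    makecontext(&worker_ctx, reinterpret_cast<void (*)()>(worker_entry), 2,
                static_cast<unsigned int>(bits >> 32),
                static_cast<unsigned int>(bits & 0xffffffffu));

    swapcontext(&caller_ctx, &worker_ctx);   // run worker_entry, then come back
    return result;
}
```

The extra packing and unpacking is exactly the kind of friction that fcontext_t avoids by transferring a pointer-sized value directly.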
ucontext_t preserves the signal
mask between context switches, which involves system calls consuming a lot
of CPU cycles (ucontext_t is slower; a context switch takes two
orders of magnitude more CPU cycles than with fcontext_t).
Windows fibers
A drawback of the Windows Fiber API is that CreateFiber() does not accept a pointer to user allocated
stack space, preventing the reuse of stacks for other context instances. In addition,
the Windows Fiber API requires ConvertThreadToFiber() to be called before SwitchToFiber() is called on a thread which has not been
converted to a fiber. For the same reason ConvertFiberToThread() must be called after return from SwitchToFiber()
if the thread was forced to be converted to a fiber before (which is inefficient).

    // is_a_fiber() stands for a check whether the thread was already converted
    if (!is_a_fiber()) {
        ConvertThreadToFiber(0);
        SwitchToFiber(ctx);
        ConvertFiberToThread();
    }

If the condition _WIN32_WINNT >= _WIN32_WINNT_VISTA
is met, function IsThreadAFiber() is provided in order to detect whether the current
thread was already converted. Unfortunately Windows XP + SP 2/3 defines
_WIN32_WINNT >= _WIN32_WINNT_VISTA without providing
IsThreadAFiber().
x86 and floating-point env
i386
"The FpCsr and the MxCsr register must be saved and restored before
any call or return by any procedure that needs to modify them ..."
'Calling Conventions', Agner Fog.
x86_64
Windows
MxCsr - "A callee that modifies any of the non-volatile fields within
MxCsr must restore them before returning to its caller. Furthermore, a caller
that has modified any of these fields must restore them to their standard
values before invoking a callee ..." MSDN
article 'MxCsr'.
FpCsr - "A callee that modifies any of the fields within FpCsr must
restore them before returning to its caller. Furthermore, a caller that has
modified any of these fields must restore them to their standard values before
invoking a callee ..." MSDN
article 'FpCsr'.
"The MMX and floating-point stack registers (MM0-MM7/ST0-ST7) are preserved
across context switches. There is no explicit calling convention for these
registers." MSDN
article 'Legacy Floating-Point Support'.
"The 64-bit Microsoft compiler does not use ST(0)-ST(7)/MM0-MM7".
'Calling Conventions', Agner Fog.
"XMM6-XMM15 must be preserved" MSDN
article 'Register Usage'
SysV
"The control bits of the MxCsr register are callee-saved (preserved
across calls), while the status bits are caller-saved (not preserved). The
x87 status word register is caller-saved, whereas the x87 control word (FpCsr)
is callee-saved."
SysV ABI AMD64 Architecture Processor Supplement Draft Version 0.99.4,
3.2.1.
Reference
ARM
AAPCS ABI: Procedure Call Standard for the ARM Architecture
AAPCS/LINUX: ARM GNU/Linux Application Binary Interface Supplement
MIPS
O32 ABI: SYSTEM V APPLICATION BINARY INTERFACE, MIPS RISC Processor Supplement
PowerPC32
SYSV ABI: SYSTEM V APPLICATION BINARY INTERFACE PowerPC Processor Supplement
PowerPC64
SYSV ABI: PowerPC User Instruction Set Architecture, Book I
X86-32
SYSV ABI: SYSTEM V APPLICATION BINARY INTERFACE, Intel386TM Architecture
Processor Supplement
MS PE: Calling Conventions
X86-64
SYSV ABI: System V Application Binary Interface, AMD64 Architecture Processor
Supplement
MS PE: x64 Software Conventions
Acknowledgments
I'd like to thank Adreas Fett, Artyom Beilis, Daniel Larimer, David Deakins,
Evgeny Shapovalov, Fernando Pelliccioni, Giovanni Piero Deretta, Gordon Woodhull,
Helge Bahmann, Holger Grund, Jeffrey Lee Hellrung (Jr.), Keith Jeffery, Martin
Husemann, Phil Endecott, Robert Stewart, Sergey Cheban, Steven Watanabe, Vicente
J. Botet Escriba, Wayne Piekarski.