Coroutines are functions or procedures that save control state between calls (as opposed to, but very similar to, Generators, such as Random Number Generators, that save data state between calls). The most common example is a lexical analyser that behaves differently on different calls because it is tracking, e.g., whether it is currently inside or outside a function definition.
Coroutines are very important as one of the very few examples of a form of concurrency that is useful, yet constrained enough to completely avoid the typical difficulties (race conditions, deadlock, etc.). Synchronization is built into the paradigm. It therefore cannot in general replace more general, unconstrained forms of concurrency, but for some things it appears to be the ideal solution.
Lex and GNU Flex implement coroutine-based lexical analyzers, as can be seen by their support for entering and leaving user-defined states that allow for state-dependent pattern recognition.
In their most intuitive implementation (e.g. in languages that directly support them), they return control (and optionally a value) to the caller with a Yield statement, similarly to a return from a function, but when called (e.g. with "resume coroutine_name"), they resume execution at the point immediately following the previous yield.
For instance, here's a Coroutine that implements a simple Generator, using a fictitious extension to C:
 int coroutine Generate123()
 {
     yield 1; /* Execution begins here upon first call to Generate123 */
     yield 2; /* execution continues here upon "resume Generate123" */
     yield 3; /* execution continues here upon second "resume Generate123" */
 }

 main()
 {
     printf("%d\n", Generate123());        /* prints "1" */
     printf("%d\n", resume Generate123()); /* prints "2" */
     printf("%d\n", resume Generate123()); /* prints "3" */
 }
As someone noted below, they can be used to simulate cooperative multitasking, but this is not their major point. Misunderstanding of coroutines as being a bad version of multithreading has hampered their widespread adoption.
It's worth noting that any direct implementation of a state machine simulates coroutines, or you could say that coroutines directly implement state machines, indicating that they are theoretically both fundamental and well-behaved.
Continuations can be viewed as a generalization of coroutines, in that the behavior of coroutines is typically static and defined at compile time, whereas continuations do not necessarily have such restrictions.
It is nonstandard to spell the word with camel case.
A well-known example of a problem that requires coroutines to implement efficiently has been moved to SameFringeProblem.
EwDijkstra originally conceived the OddWordProblem to demonstrate coroutines.
If you understand continuations (see ContinuationExplanation) then coroutines are just two or more continuations that call each other whenever they block, end, or just feel like being polite. They can simulate a very simple and unstructured form of CooperativeThreading and are explained e.g. in the first volume of TheArtOfComputerProgramming. See ContinuationsAndCoroutines and SchemeCoroutineExample.
Let me see if I understand this idea by stating it in other words.
A coroutine is somewhat like a system process that, at some stage of execution, yields explicitly to another process, and then, when yielded to explicitly by some other process, picks up where it left off.
(A major difference being that the process is blocked until resumed; another being that coroutines can communicate through a shared context of some kind much more freely than processes can.)
"Somewhat like" that, sure, that's a good analogy, as long as you're careful not to take it too literally. People usually think that that's exactly what coroutines are, which leads to disdain for them as nothing but cooperative multitasking.
For certain kinds of problems, there is no really good replacement for them. If you've ever set a state flag in a function (Simple case, FirstTime?), so as to influence later behavior, then likely what you really wanted was a coroutine. Once one is aware of how much cleaner they make such things, it hurts not to have them in such contexts. I have simulated them using communicating threads, and that really sucks compared with having them directly supported by the language.
Of course, people who love higher level functions (and, of course, use languages that support them) just get used to passing around different functions to be executed next time around, so they don't miss not having coroutines, but that's kind of overkill. It's more general, but it doesn't always capture the intent quite as transparently as coroutines, for the things that coroutines do well.
...If you've ever set a state flag in a function, so as to influence later behavior,...
Would an example be that often encountered 'firstTime' flag?
Hmmm. Never used it for that, but actually sure, that too:
 myfunction()
 {
     int value = longCalculation();
     while (1) {
         yield value;
     }
 }
Usually in C you'd have to make "value" static, and have a firstTime flag too, so this is a bit cleaner. The real value comes up when the structure of the function is significantly more complex, though. A lexical analyzer would show this clearly, but ...not right now. :-)
Thanks Doug! When I was in school (SlideRules) coroutines were mentioned, more as a definition than something that we would actually use. I'll have to ponder on my experience to see if the technique was used ignorantly (Oh! so that's the name for what I did!) or would have helped crack some tough nut that we ended up kludging around. Again, thanks for the time you've taken to explain this.
After looking at your example, a light came on. The first-time flag is not sufficient to make it a coroutine. Coroutines must number at least two. They must require alternating execution (in the case of exactly two) to accomplish their task. Maybe this could be added to the definition above?
A pair (at least) of FiniteStateMachines that call each other is one way to implement coroutines. Perhaps it is the 'model' implementation.
I really don't like the "call each other" description - it violates my sense that calls are naturally nested. IMO a much better description is in terms of suspend/resume. My guess is that the "call each other" image was a part of why coroutines got so little attention for so long.
-- BobBockholt
This wasn't the KillerApplication; that was in the middle of the page, solving the SameFringeProblem, maybe you overlooked it?
A single finite state machine with outputs on each transition is a coroutine all by itself, because it remembers its state, so long as you assume that something else is requesting its outputs and doing work before nudging the FSM to make a transition and yield another output.
Coroutines are an excellent way to implement a FSM that is called repeatedly as a subsystem, which is why they're handy for lexers.
I didn't understand your comment about the first time flag, and I don't agree that there must be at least two coroutines. A coroutine can be implemented in C by giving it a separate stack and getting to it and from it using setjmp/longjmp, the same hack that is used to implement user space threads.
Implementation-wise, coroutines are extremely similar to, or even identical to, threads. But the abstraction is a bit different.
If you haven't seen the samefringe example above, perhaps that will help.
-- DougMerritt
I really liked the samefringe example, because it is not contrived - I can easily imagine needing to do this very thing.
It was the fact that each instance was also recursive that made it clear that these two instances were completely independent of each other.
I think we're inadvertently arguing over different definitions of the word 'routine'. Too often this word refers to a segment of source code, and, by extension, the object which is executed at run time. This view is not sufficient in this case. We must use the execution environment view to see what's happening, as we do when considering recursion.
So, I don't exclude the case where the 'two or more routines' are different instances of the same function. As you said, each instance has its own 'stack' or state and the durations of the instances overlap so they really can interact cooperatively.
FSMs came into my mind because I was trying to think of an example where I may have used coroutines unconsciously. Several times I've implemented nice-guy multitasking systems for microcontrollers with limited stack space (three words), no interrupts, and real-time response constraints. The best solution (given that the hardware was already chosen for us) was to allocate a time slice and a state variable to each task, and have all the developers make sure no state in their task exceeded the time slice. On the surface this looked like a number of coroutines, because each task maintained state, yielded, and resumed. But they didn't necessarily interact with one another to accomplish a single goal.
It wasn't much of a leap from that to the idea that each time a coroutine yields it's at a state boundary. Hence, FSMs are a valid underlying model yet again. At least for me.
The existence of a first time flag doesn't necessarily indicate or require coroutines because there need not be another instance of the same routine. A static local variable, the usual solution, precludes having more than one instance because all of them would share that same private global variable. (static local === private global in the CeeLanguage and its derivatives.)
-- BobBockholt
No, Doug, I don't have anything better to do than follow you around today. It's Saturday and we're snowed in here. -- BobBockholt
Heh! That's funny. Hmm, I'm in California; I'll leave the weather here to your imagination, and you can still hate me. ;-)
You can have SoCa, but I absolutely loved living up around the Bay Area when I was a kid, and it feels like home every time I get to go there. So, yeah, there's a little envy here.
This is alluded to above; however, it may be useful to divide coroutines into two categories (I just love creating AdHocTaxonomies?)
I am glad to see so much interest in CoRoutines! I maintain that FlowBasedProgramming (FBP) components are coroutines - in fact, one of the products that we developed at IBM Canada based on these concepts did refer to the asynchronously executing components as "coroutines". (When we produced the Japanese documentation, this came out as "koruchin"). FBP components also can be thought of as individual mainline programs - they just have "get"s and "put"s where a mainline might have I/O. This means that a conventional single-thread program is an FBP program consisting of a single component. So conventional single-thread programs are a subset of FBP programs - FBP does not remove any capabilities, but in fact adds some rather interesting ones. End of rant! This also means that they provide a natural way to implement the MichaelJackson Methodology (ASIN 0123790506). -- PaulMorrison
More specifically, FlowBasedProgramming components can be coroutines. They can also be threads, processes, etc.
Sorry, what is the exact distinction you are making? I agree FBP components seem to share the nature of several different things one runs into in the literature. One early term that seems a good fit is EwDijkstra's "[cooperating] sequential processes".
That FBP components are not always coroutines.
Doug, maybe you are using a more exact definition of coroutine than I was...? In your view, do coroutines have a different weight from that of threads (and processes)?
Always lighter than processes, sometimes lighter than threads. The lightest of threads are the same thing as coroutines, but with their running controlled by a scheduler, whereas coroutines run when they are called by each other (no scheduler involved).
Processes can often be considered to be threads in separate address spaces.
BTW over the decades there has been considerable variation in terminology concerning the usage of "thread", "process", and "task" (which I don't see anymore, but used to be used to mean either thread or process in some older OS environments). But I am under the impression that terminology has settled down considerably in recent years.
I believe that "coroutine" was defined unambiguously when it was introduced in 1964, and that its meaning hasn't changed since, but that it has sometimes been used loosely rather than exactly, which confuses things.
Thanks, you're right, I was using "coroutine" in a loose sense. BTW "task" is still used in the mainframe environment (IBM's MVS operating system), and refers to a fairly heavy-weight process using preemptive multithreading.
Ah, no wonder; it's been a long while since I've been within a stone's throw of MVS.
I am currently doing some experimenting trying to do FBP in that environment - it could be interesting as a way to include processes that are I/O-bound but not FBP-aware (e.g. SQL) in a network without hanging the whole job while SQL does its thing.
That makes sense. So much so that I'm surprised it hadn't come before in your previous work.
Yes, it's odd enough that I needed to think about it... Possible reasons:
"The word 'coroutine' was coined by M.E. Conway in 1958, after he had developed the concept, and he first applied it to the construction of an assembly program. Coroutines were independently studied by J. Erdwinn and J. Merner, at about the same time [...] The first published explanation of the coroutine concept appeared much later in Conway's article 'Design of a Separable Transition-Diagram Compiler', CACM 6 (1963), 396-408. Actually a primitive form of coroutine linkage had already been noted briefly as a 'programming tip' in an early UNIVAC publication [...] A suitable notation for coroutines in ALGOL-like languages was introduced in Dahl and Nygaard's SIMULA I [...] and several excellent examples of coroutines (including replicated coroutines) appear in the book Structured Programming [by Dahl, Dijkstra, and Hoare]." (The Art of Computer Programming, volume 1, pp 229-230)
See also GeneratorsAreNotCoroutines
Related: ContinuationsAndCoroutines
For an implementation in CeePlusPlus see BoostCoroutine.