Tech Blog
Flow types for generators and coroutines in JavaScript
Since ECMAScript 6 introduced the yield keyword, coroutines have become more common. The best known example is probably the async/await framework for concurrency, but coroutines also form the backbone of redux-saga and have made their way into bluebird.
There seems to be little documentation on how to add Flow types to generators or coroutines. This post tries to fill that gap with a range of examples of typed generators. The code examples are on GitHub.
Generators vs. coroutines
Coroutines and generators are very different concepts. Generators let you create functions that look like iterators to the consumer.
Coroutines are an extension of the concept of traditional functions. A function will hand control back to its caller once, through the return statement. A coroutine can hand control back any number of times by using yield. When the coroutine yields, its internal state is preserved.
The consumer of a generator will pull data from the generator as needed. The caller of a coroutine will push data into the coroutine.
Generators and coroutines are very often lumped together because they use the same underlying machinery. For instance, in Python and JavaScript, both use the yield keyword to hand over control. Both generators and coroutines can be paused and resumed later.
Despite these similarities, they are used in different contexts: generators are used to easily create iterators; coroutines are used to introduce concurrency or complex control flows.
If you are unfamiliar with coroutines and generators, I have found this chapter of Exploring JavaScript very useful.
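The pull/push distinction can be sketched in a few lines of plain JavaScript. The names countdown and receiver are made up for this illustration: the first is consumed like an iterator, the second has values sent into it.

```javascript
// Pull: the consumer decides when to ask for the next value.
function* countdown(from) {
  for (let n = from; n > 0; n--) {
    yield n;
  }
}

const pulled = [];
for (const n of countdown(3)) {
  pulled.push(n); // each loop iteration pulls one value out
}

// Push: the caller decides when to send a value in.
const received = [];
function* receiver() {
  while (true) {
    const item = yield; // pauses here until the caller sends a value
    received.push(item);
  }
}

const sink = receiver();
sink.next();    // run to the first yield so the coroutine can receive
sink.next('a'); // 'a' becomes the value of the yield expression
sink.next('b');

console.log(pulled, received);
```

Both functions are written with function* and yield; only the direction in which data flows differs.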
One type to rule them all
Because coroutines and generators use the same underlying machinery, there is a single generic type in Flow:
Generator<Yield, Return, Next>
Here:
- Yield is the type of data that is yielded by the generator. If you have a statement like yield "my-string" in your generator/coroutine, Yield will be string.
- Return is the type of the generator return statement. If you have a statement like return "my-string", the type of Return will be string.
- Next is the type of the values injected into the function through yield, that is, the values the caller passes to next(). If you have a statement like const nextItem: string = yield, the type of Next will be string.
function* example(): Generator<string, boolean, number> {
  const toYield = 'this is a string'
  const received: number = yield toYield
  return false
}
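To see where each type parameter shows up at runtime, here is a small sketch of driving this function by hand. The types are written in Flow's comment syntax so that the snippet runs in plain Node without a build step; the injected variable is only there so we can inspect what the generator received.

```javascript
let injected;

function* example() /*: Generator<string, boolean, number> */ {
  // Yield = string (what we hand out), Next = number (what comes back in)
  const received /*: number */ = yield 'this is a string';
  injected = received;
  return false; // Return = boolean
}

const gen = example();
const first = gen.next();    // runs up to the yield
const second = gen.next(42); // 42 becomes the value of the yield expression

console.log(first);    // { value: 'this is a string', done: false }
console.log(second);   // { value: false, done: true }
console.log(injected); // 42
```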
Typing generators
Generators publish data. To start simple (and boring), let’s create a generator that publishes even numbers:
function* evens(): Generator<number, void, void> {
  let current = 0;
  while (true) {
    yield current;
    current += 2;
  }
}
In this example, we:
- yield numbers
- do not return anything
- do not expect anything to be injected when we yield (there is no variable bound to the left of yield).
Therefore, the type of our generator is Generator<number, void, void>. With generators, the Next type parameter is very often void: it is rare to inject values back into the generator (though we do see lots of contrived examples where people try to restart a Fibonacci sequence).
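Since evens never terminates, a consumer has to decide when to stop pulling. One way to do that is a small helper; take is a hypothetical name introduced for this sketch, and the types are in Flow's comment syntax so the code runs under plain Node.

```javascript
function* evens() /*: Generator<number, void, void> */ {
  let current = 0;
  while (true) {
    yield current;
    current += 2;
  }
}

// Hypothetical helper: collect at most `count` items from any iterable.
function take/*:: <T> */(
  count /*: number */,
  items /*: Iterable<T> */
) /*: Array<T> */ {
  const result = [];
  if (count <= 0) return result;
  for (const item of items) {
    result.push(item);
    if (result.length >= count) break; // leaving the loop also closes the generator
  }
  return result;
}

const firstFive = take(5, evens());
console.log(firstFive); // [ 0, 2, 4, 6, 8 ]
```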
Let’s try something slightly more complex. We can create a generator that recursively walks a file tree:
import fs from 'fs'
import path from 'path'

function* walkDirectories(root: string): Generator<string, void, void> {
  for (const name of fs.readdirSync(root)) {
    const filePath = path.join(root, name)
    const stat = fs.lstatSync(filePath)
    if (stat.isFile()) {
      yield filePath
    } else if (stat.isDirectory()) {
      yield* walkDirectories(filePath)
    }
  }
}
We read the contents of a root directory and, for each item, publish it if it is a file, or recursively publish its contents if it is a directory. We still do not return anything, or bind to the yield statement, so our generator will still be Generator<?, void, void>. We either yield strings directly, or whatever walkDirectories on a subdirectory yields, which is also strings. The type parameter for Yield is therefore string, making the overall type Generator<string, void, void>.
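The same yield* delegation can be tried without touching the file system. The sketch below uses a made-up in-memory tree, where null marks a file and a nested object marks a subdirectory; walkTree and this representation are assumptions for illustration only.

```javascript
/*::
type Tree = { [name: string]: Tree | null };
*/

function* walkTree(
  tree /*: Tree */,
  prefix /*: string */
) /*: Generator<string, void, void> */ {
  for (const name of Object.keys(tree)) {
    const entry = tree[name];
    const entryPath = prefix + '/' + name;
    if (entry === null) {
      yield entryPath;                   // a file: yield a string directly
    } else {
      yield* walkTree(entry, entryPath); // a directory: forward whatever the inner walk yields
    }
  }
}

const tree = {
  'a.txt': null,
  src: { 'index.js': null, lib: { 'util.js': null } },
};

const paths = Array.from(walkTree(tree, ''));
console.log(paths); // [ '/a.txt', '/src/index.js', '/src/lib/util.js' ]
```

Because yield* forwards the inner generator's values unchanged, the Yield type of the outer generator is the same as that of the inner one.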
Typing functions that consume generators
We now have types for our walkDirectories function. What about consumers of our generator? Let’s write a function that groups files by their file extension and counts the number of files for each extension:
function countFilesByExtension(files): {[string]: number} {
  const total = {};
  for (const f of files) {
    const extension = path.extname(f)
    const currentCount = total[extension] || 0
    total[extension] = currentCount + 1
  }
  return total
}
Here, the argument files is our generator. What is the type of files? We could, of course, type it as:
// overly specific
function countFilesByExtension(files: Generator<string, void, void>): {[string]: number} {
or, somewhat better, as:
// still overly specific
function countFilesByExtension(files: Generator<string, any, any>): {[string]: number} {
But really, we only care that files can be iterated over. The recommended type to use is therefore Iterator:
function countFilesByExtension(files: Iterator<string>): {[string]: number} {
This will work with our generator because Generator<T, any, any> implements the Iterator<T> interface. Note that an array is an Iterable rather than an Iterator (it has no next method), so if you also want to accept arrays, widen the argument to Iterable<string>, which generators satisfy as well.
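One point worth checking by hand is what else the consumer can accept. At runtime, for...of only needs an iterable, so the same function works with a generator and with a plain array; in Flow, the annotation that covers both is Iterable<string>. The sketch below uses Flow's comment syntax so it runs under plain Node; extensionOf is a simplified, hypothetical stand-in for path.extname, and countByExtension and someFiles are names made up for this example.

```javascript
// Simplified stand-in for path.extname, to keep the sketch dependency-free.
function extensionOf(fileName /*: string */) /*: string */ {
  const dot = fileName.lastIndexOf('.');
  return dot <= 0 ? '' : fileName.slice(dot);
}

function countByExtension(
  files /*: Iterable<string> */
) /*: {[string]: number} */ {
  const total = {};
  for (const f of files) {
    const extension = extensionOf(f);
    total[extension] = (total[extension] || 0) + 1;
  }
  return total;
}

function* someFiles() /*: Generator<string, void, void> */ {
  yield 'a.js';
  yield 'b.js';
  yield 'notes.md';
}

const fromGenerator = countByExtension(someFiles());
const fromArray = countByExtension(['a.js', 'b.js', 'notes.md']);
console.log(fromGenerator, fromArray);
```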
We can use the whole pipeline as follows:
import os from 'os'

const rootDirectory = os.homedir()
console.log(
  countFilesByExtension(
    walkDirectories(rootDirectory)
  )
)
Typing coroutines
In programs built around generators, information is pulled by the consumers. In programs built around coroutines, information is pushed to the consumer. Let’s create a pipeline that prints out the files in a given directory. We will wrap the asynchronous, callback-based functions in Node’s fs module. The examples in this section are loosely based on the Generators section of Exploring JavaScript.
Commonly, stages in a coroutine pipeline take a target argument that identifies the next stage in the pipeline. Therefore, if we have a function that pushes data of type T down the pipeline, the type signature for that function will be:
function (target: Generator<any, any, T>): void { /* ... */ }
Let’s start by writing our source function. The source itself is not typically a coroutine, but it takes a coroutine as target.
We will just use Node’s fs.readdir to push a list of all the entries in a given directory to the target:
const pushFiles = function(
  directory: string,
  target: Generator<any, any, string>
): void {
  fs.readdir(
    directory,
    { encoding: 'utf-8' },
    (error, fileNames) => {
      if (error) {
        throw error
      } else {
        for (const fileName of fileNames) {
          const filePath = path.join(directory, fileName)
          target.next(filePath) // push file paths to a coroutine
        }
      }
    }
  )
}
Here, the target must be a Generator<any, any, string>: it must accept strings on the left-hand side of the yield statement. The target will include a statement like:
const file: string = yield
Next, let’s create a coroutine that just logs everything passed into it:
const log = function* (): Generator<void, void, any> {
  while (true) {
    const item: any = yield
    console.log(item)
  }
}
The type of our coroutine is Generator<void, void, any>:
- Yield is void since it does not yield anything
- Return is void since the function does not return
- Next is any since it accepts any type from upstream.
Somewhat annoyingly, we cannot use our log coroutine directly without initialising it, because the coroutine needs to progress to the first yield. We need to write:
const logCoroutine = log()
logCoroutine.next() // initialize coroutine
// Read files in home directory and push them to logCoroutine
pushFiles(os.homedir(), logCoroutine)
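To see what goes wrong without that initial next() call, here is a small sketch. collectInto is a hypothetical coroutine that stores what it receives in an array instead of logging it, so the effect can be inspected; types are in Flow's comment syntax so the code runs under plain Node.

```javascript
function* collectInto(
  sink /*: Array<string> */
) /*: Generator<void, void, string> */ {
  while (true) {
    const item /*: string */ = yield;
    sink.push(item);
  }
}

// Without initialisation: the first next() only starts the body and runs
// to the first yield, so the value passed to it is silently dropped.
const unprimedSink = [];
const unprimed = collectInto(unprimedSink);
unprimed.next('a');
unprimed.next('b');

// With initialisation: the coroutine is already paused at a yield,
// so every pushed value is received.
const primedSink = [];
const primed = collectInto(primedSink);
primed.next();
primed.next('a');
primed.next('b');

console.log(unprimedSink); // [ 'b' ]
console.log(primedSink);   // [ 'a', 'b' ]
```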
To reduce this boilerplate, it is common to write a helper function that creates the generator, initialises it by calling its next method and returns it:
function coroutine(generatorFunction) {
  return function(...args) {
    const generator = generatorFunction(...args)
    generator.next()
    return generator
  }
}
Typing this helper function is a little tricky. What we really want to tell Flow is that we return a function with the same type as generatorFunction. We also need to specify that generatorFunction is a function that accepts variadic types and returns a generator. We can, for example, write:
function coroutine<G: Generator<any, any, any>>(
  generatorFunction: ((...args: Array<any>) => G)
) {
  return function(...args: Array<any>): G {
    const generator = generatorFunction(...args)
    generator.next()
    return generator
  }
}
Here, we specify that generatorFunction must be a function that returns any generator, and that our coroutine function will itself return a function returning that generator. Unfortunately, there is no good way to type polymorphic variadic functions in Flow (see this issue), so we lose type safety in the arguments to generatorFunction. We can avoid this leaking out to the rest of our program by explicitly typing the coroutines. Our log function now becomes:
const log: () => Generator<void, void, any> = coroutine(function* () {
  while (true) {
    const item: any = yield
    console.log(item)
  }
})
Therefore, by putting the types on log, rather than directly on the argument of coroutine, we can use log in a type-safe way in the rest of our program.
We now don’t have to initialise our coroutine to use it any more:
pushFiles(os.homedir(), log())
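For readers who want to try the helper without a Flow build step, here is a sketch of the same idea with the annotations moved into Flow's comment syntax, which plain Node ignores. The collect coroutine and the collected array are made up for this example; they stand in for log so that the result can be inspected.

```javascript
function coroutine/*:: <G: Generator<any, any, any>> */(
  generatorFunction /*: (...args: Array<any>) => G */
) /*: (...args: Array<any>) => G */ {
  return function (...args /*: Array<any> */) /*: G */ {
    const generator = generatorFunction(...args);
    generator.next(); // advance to the first yield
    return generator;
  };
}

const collected = [];

// The explicit annotation on `collect` is what callers are checked against.
const collect /*: () => Generator<void, void, string> */ = coroutine(function* () {
  while (true) {
    const item /*: string */ = yield;
    collected.push(item);
  }
});

const target = collect(); // already initialised, ready to receive
target.next('first');
target.next('second');

console.log(collected); // [ 'first', 'second' ]
```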
Finally, let’s create an intermediate step in our pipeline that filters out anything that isn’t a simple file:
const isFile: (Generator<any, any, string> => Generator<void, void, string>) =
  coroutine(function* (target) {
    while (true) {
      const fileMaybe: string = yield
      fs.lstat(fileMaybe, (error, stat) => {
        if (error) {
          throw error
        } else if (stat.isFile()) {
          target.next(fileMaybe)
        }
      })
    }
  })
Our function accepts as its target any coroutine that accepts strings (since that is what it pushes out). Its return type is Generator<void, void, string>:
- Yield is void since it does not yield anything
- Return is void since the function does not return
- Next is string since it expects to be fed strings from upstream.
Generator<void, void, string> will satisfy the constraint Generator<any, any, string> that we specified on the target argument in pushFiles.
We can now build our coroutine pipeline:
pushFiles(os.homedir(), isFile(log()))
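The shape of this pipeline (source, filter, sink) can also be tried synchronously on in-memory data. In the sketch below, pushAll, endsWith and collect are hypothetical stand-ins for pushFiles, isFile and log; they avoid the file system so the result is deterministic. Types are in Flow's comment syntax so the code runs under plain Node.

```javascript
function coroutine(generatorFunction) {
  return function (...args) {
    const generator = generatorFunction(...args);
    generator.next(); // advance to the first yield
    return generator;
  };
}

// Source: not a coroutine itself, it just pushes into its target.
function pushAll(
  items /*: Array<string> */,
  target /*: Generator<any, any, string> */
) /*: void */ {
  for (const item of items) {
    target.next(item);
  }
}

// Intermediate stage: forwards only the strings that end with `suffix`.
const endsWith /*: (string, Generator<any, any, string>) => Generator<void, void, string> */ =
  coroutine(function* (suffix, target) {
    while (true) {
      const item /*: string */ = yield;
      if (item.endsWith(suffix)) {
        target.next(item);
      }
    }
  });

// Sink: stores everything it receives.
const output = [];
const collect /*: () => Generator<void, void, string> */ = coroutine(function* () {
  while (true) {
    const item /*: string */ = yield;
    output.push(item);
  }
});

pushAll(['a.js', 'b.md', 'c.js'], endsWith('.js', collect()));
console.log(output); // [ 'a.js', 'c.js' ]
```

As in the original pipeline, each stage only needs to know that its target accepts strings; it does not care what the target does with them.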
Conclusion
Flow is great. It catches a whole slew of errors that unit tests or manual QA will miss. It also makes the code much more readable to new users.
Unfortunately, there are few good examples for more complex types. If you struggle to add Flow types to a construct, do write it up so the community can benefit from it!
More!
The Flow documentation on generators is useful further reading.
I have already mentioned Exploring JavaScript. Code samples are available on GitHub.
Finally, for a deeper understanding of coroutines in general, David Beazley has some great slides. These are aimed at Python but the concepts are very transferable.