eio/doc/rationale.md

This document collects some of the reasons behind various API choices in Eio.

## Scheduling Order

When forking a new fiber, there are several reasonable scheduling behaviours:

1. Append the new fiber to the run-queue and then continue running the parent.
2. Append the parent fiber to the run-queue and start the child immediately.
3. Append both old and new fibers to the run-queue (in some order), then schedule the next task at the queue's head.
4. Prepend the old and new fibers to the *head* of the run-queue and resume one of them immediately.

And several considerations:

- Especially for `Fiber.both`, putting both at the start or both at the end of the run-queue seems more consistent
  than starting one before everything in the queue and the other after.
- Adding both to the head of the queue is the most flexible, since the other behaviours can then be achieved by yielding.
- Putting both at the head may lead to starvation of other fibers.
- Running the child before the parent allows the child to e.g. create a switch and store it somewhere atomically.
- Scheduling new work to run next can make better use of the cache in some cases (domainslib works this way).

Therefore, `Fiber.fork f` runs `f` immediately and pushes the calling fiber to the head of the run-queue.
After making this change, the examples in the README seemed a bit more natural too.
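
The chosen behaviour can be sketched with a toy model. This is not Eio's scheduler: the run-queue is a plain list of thunks, fibers are written in continuation-passing style, and the names `push_head`, `push_tail`, `fork`, `run` and `demo` are all hypothetical.

```ocaml
(* Toy model only: the head of the list is the next task to run. *)
let run_queue : (unit -> unit) list ref = ref []
let trace : string list ref = ref []

let say msg = trace := msg :: !trace
let push_head k = run_queue := k :: !run_queue
let push_tail k = run_queue := !run_queue @ [k]

(* Run the child immediately; the parent's continuation goes to the
   head of the run-queue, ahead of anything already waiting. *)
let fork ~child ~parent =
  push_head parent;
  child ()

(* Run queued tasks until none are left. *)
let rec run () =
  match !run_queue with
  | [] -> ()
  | k :: rest ->
    run_queue := rest;
    k ();
    run ()

let demo () =
  run_queue := [];
  trace := [];
  push_tail (fun () -> say "other");   (* a fiber that was already queued *)
  fork ~child:(fun () -> say "child") ~parent:(fun () -> say "parent");
  run ();
  List.rev !trace
```

In this model `demo ()` returns `["child"; "parent"; "other"]`: the child runs first, the parent resumes next, and the fiber that was already queued runs last.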

## Indicating End-of-File

Functions for reading from a byte-stream need a way to indicate that the stream has ended.
There are various common ways to do this:

- Raise the `End_of_file` exception, as `Stdlib.input_line` does.
- Return `None`, as `Stdlib.In_channel.input_line` does.
- Return that 0 bytes were read, as `Stdlib.input` does.
- Return `Error End_of_file`, or similar.

Desirable features:

- A program that forgets to handle end-of-file should not hang.
- The programmer should not forget to handle end-of-file when they need to.
- The programmer should not be forced to handle it when they don't.
- Ideally, reading should not allocate on the heap.
- The meaning of the code should be obvious.

Returning 0 makes infinite loops easy to write.
For example, a function to parse a number from a stream might keep appending to a buffer until it sees a non-digit byte.
If the file ends in the middle of a number, the parser will hang, even though it works correctly in every other case.
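
A minimal sketch of this failure mode, using a hypothetical in-memory `source` type and a `read` function that returns 0 at end-of-file (neither is part of Eio). The `fuel` parameter exists only so that the demonstration terminates; a real parser written this way would spin forever.

```ocaml
(* Hypothetical in-memory byte source, for illustration only. *)
type source = { data : string; mutable pos : int }

(* Returns the number of bytes copied into [buf]; 0 signals end-of-file. *)
let read src buf =
  let n = min (Bytes.length buf) (String.length src.data - src.pos) in
  Bytes.blit_string src.data src.pos buf 0 n;
  src.pos <- src.pos + n;
  n

let is_digit c = c >= '0' && c <= '9'

(* Naive parser that never checks for a 0 return.
   [fuel] caps the number of reads purely so that this demo terminates. *)
let parse_number ~fuel src =
  let digits = Buffer.create 8 in
  let byte = Bytes.make 1 ' ' in
  let rec loop fuel =
    if fuel = 0 then `Gave_up
    else begin
      let n = read src byte in
      if n = 1 && not (is_digit (Bytes.get byte 0)) then
        `Number (Buffer.contents digits)
      else begin
        (* With n = 0 nothing is appended and nothing stops the loop. *)
        Buffer.add_subbytes digits byte 0 n;
        loop (fuel - 1)
      end
    end
  in
  loop fuel
```

With the input `"12,"` the parser stops at the comma. With the input `"12"` it keeps reading 0 bytes until the fuel runs out.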

Returning a result or option type forces an allocation even for successful reads.
Mirage's `FLOW` type even wraps the data as ``Ok (`Data buf)`` (with ``Ok `Eof`` for end-of-file), requiring a double allocation on every successful read.
However, it is likely that a read will allocate for other reasons (such as allocating a continuation when performing an effect),
so this is not a serious problem (for unbuffered reads at least).

Returning `None` makes the code unclear: you need to check the documentation to discover whether it applies only to end-of-file or to other kinds of error too.
It may also encourage programmers to discard error information and return `None` in all cases.

Raising `End_of_file` or returning 0 allows the programmer to forget to handle the error with no compile-time warning.

Eio chooses to use `End_of_file`:

- If you forget to handle it, the cause of the problem is at least obvious (unlike returning 0).
- It makes it easy to handle the error in one place, rather than throughout a parser.
  A backtracking parser is likely to be using exceptions for errors anyway.
- With typed effects it will be possible to track the exception in the type system (unlike returning 0).
- For simple code, with a single read in a loop, you will immediately notice if you don't handle end-of-file.
- For complex code, with multiple reads, you will likely use a parser library that hides this anyway.
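
As a rough illustration of handling the error in one place, here is the number parser again, written against a hypothetical `read_char` that raises `End_of_file`. The `source` type and both functions are assumptions made for this sketch, not Eio's API.

```ocaml
(* Hypothetical in-memory byte source, for illustration only. *)
type source = { data : string; mutable pos : int }

(* Returns the next byte, or raises End_of_file when none are left. *)
let read_char src =
  if src.pos >= String.length src.data then raise End_of_file;
  let c = src.data.[src.pos] in
  src.pos <- src.pos + 1;
  c

(* Collects digits until a non-digit byte or the end of the input. *)
let parse_number src =
  let digits = Buffer.create 8 in
  let rec loop () =
    match read_char src with
    | '0' .. '9' as c -> Buffer.add_char digits c; loop ()
    | _ -> ()
  in
  (* End-of-file is handled once, here, rather than at every read. *)
  (try loop () with End_of_file -> ());
  int_of_string_opt (Buffer.contents digits)
```

If the `try` were omitted, a truncated input would end the program with an uncaught `End_of_file` rather than a hang, which makes the cause easy to find.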

## Dynamic Dispatch

Code is easier to understand when the target of a function call is known statically.
However, this is not always possible.
For example, there are many ways to provide a stream of bytes (from a file, TCP socket, HTTP encoding, TLS encoding, etc).
Often this choice is determined by the user at runtime, for example by providing a URL giving the scheme to use.
We may even need to choose a completely different Eio backend at runtime.
For example, `Eio_main.run` will use the io_uring backend if the Linux kernel is new enough,
but fall back to `Eio_luv` if not.
For these reasons, Eio needs to use dynamic dispatch.

A resource whose implementation isn't known until runtime can be represented in many ways, including:

- As an object.
- As a record, with one field for each method.
- As a first-class module along with one of its values, packed in a GADT.

OCaml modules have the nice property that they can be used from fully static to fully dynamic situations:

- If a library author knows which concrete module they want,
  they can just call that module directly.

- If the library can be used with different modules,
  but the application using the library will decide which one at compile time,
  the library can use a functor.

- If the module will only be known at runtime, a first-class module can be passed as an argument.
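
The three situations can be sketched as follows. The `SOURCE` signature and the `String_source` module are hypothetical and much simpler than anything in Eio; they only show the same module being used statically, through a functor, and as a first-class value.

```ocaml
(* A hypothetical signature for byte sources; not Eio's actual API. *)
module type SOURCE = sig
  type t
  val read_all : t -> string
end

module String_source = struct
  type t = string
  let read_all s = s
end

(* 1. Fully static: call the concrete module directly. *)
let static_len s = String.length (String_source.read_all s)

(* 2. Chosen by the application at compile time: a functor. *)
module Make_counter (S : SOURCE) = struct
  let len (src : S.t) = String.length (S.read_all src)
end

module Counter = Make_counter (String_source)

(* 3. Known only at runtime: a first-class module passed as an argument. *)
let dynamic_len (type a) (module S : SOURCE with type t = a) (src : a) =
  String.length (S.read_all src)
```

The dynamic version can be called as `dynamic_len (module String_source : SOURCE with type t = string) "hello"`.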

For Eio, we also need good support for sub-typing because different platforms provide different features,
and because different operating system resources have overlapping features.
For example:

- Some flows are read-only, some write-only, and some read-write.
- Most can be closed.
- Some two-way flows support shutting down one side of the connection.
- Some flows are backed by a Unix file descriptor which we may want to extract.

The OCaml standard library provides separate `close_in` and `close_out` functions, but cannot handle two-way flows.
Eio instead provides a single `Flow.close` that works with all flows that can be closed.

Users of Eio can choose how specific to make their code.
For example, calling `Eio_main.run` will get you a basic Unix-like environment,
whereas using `Eio_linux.run` provides extra features specific to Linux's io_uring API.
This can then all be tracked in the type system
(dynamic checks are also possible, for more complex code that wants to use specific features only when available).
In contrast, OCaml's `Unix` module provides several functions that simply fail at runtime
on platforms where the function isn't available.

For dynamic dispatch with subtyping, objects seem to be the best choice:

- Records and modules require explicit casts when used.
  Objects avoid this problem using row-polymorphism.

- Using records or first-class modules requires frequent allocation.
  For example, when you have a `TWO_WAY` module and you want to use it as a `SOURCE`,
  OCaml has to make a copy of the module with the fields in the right order.
  For records, you have to write the code to do this copying yourself.
  Objects don't change their in-memory representation when used at different types.

- Using records means storing all the methods on every instance, which is wasteful.
  Using a GADT adds an extra level of indirection to the value's fields.
  An object uses a single block to store the object's fields and a pointer to the shared method table.

- First-class modules and GADTs are advanced features of the language.
  The new users we hope to attract to OCaml 5.00 are likely to be familiar with objects already.

- It is possible to provide base classes with default implementations of some methods.
  This can allow adding new operations to the API in future without breaking existing providers.
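
A small sketch of two of these points, using hypothetical types rather than Eio's real ones: `read_all` accepts any object that has a `read` method, so a two-way flow can be passed without a cast, and `base_source` supplies a default `close` that providers inherit.

```ocaml
(* Hypothetical base class: providers must define [read], and get a
   default [close] for free. Not Eio's real definition. *)
class virtual base_source = object
  method virtual read : Bytes.t -> int   (* raises End_of_file at the end *)
  method close = ()                      (* default: nothing to release *)
end

(* A two-way flow backed by a string: readable and writable. *)
let string_flow data = object
  inherit base_source
  val mutable pos = 0
  val outbox = Buffer.create 16
  method read buf =
    let n = min (Bytes.length buf) (String.length data - pos) in
    if n = 0 then raise End_of_file;
    Bytes.blit_string data pos buf 0 n;
    pos <- pos + n;
    n
  method write s = Buffer.add_string outbox s
  method sent = Buffer.contents outbox
end

(* Row polymorphism: any object with a suitable [read] method will do. *)
let read_all (src : < read : Bytes.t -> int; .. >) =
  let buf = Bytes.create 4 in
  let out = Buffer.create 16 in
  (try
     while true do
       let n = src#read buf in
       Buffer.add_subbytes out buf 0 n
     done
   with End_of_file -> ());
  Buffer.contents out
```

Here `read_all (string_flow "hello world")` type-checks even though the object also has `write`, `sent` and `close` methods, and no copy of the object is made.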

In general, simulating objects using other features of the language leads to worse performance
and worse ergonomics than using the language's built-in support.

In Eio, we split the provider and consumer APIs:

- To *provide* a flow, you implement an object type.
- To *use* a flow, you call a function (e.g. `Flow.close`).

The functions mostly just call the corresponding method on the object.
If you call object methods directly in OCaml then you tend to get poor compiler error messages.
This is because OCaml can only refer to the object types by listing the methods you seem to want to use.
Using functions avoids this, because the function signature specifies the type of its argument,
allowing type inference to work as for non-object code.
In this way, users of Eio can be largely unaware that objects are being used at all.

The function wrappers can also provide extra checks that the API is being followed correctly,
such as asserting that a read does not return 0 bytes,
or add extra convenience functions without forcing every implementor to add them too.
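
The split might look roughly like this. `Toy_flow`, `read_into`, `of_string` and `broken` are hypothetical names invented for this sketch, and the check raises `Invalid_argument` rather than using `assert` so that it still applies when assertions are compiled out.

```ocaml
(* Provider side: implement this (hypothetical) object type. *)
class type source = object
  method read_into : Bytes.t -> int
end

(* Consumer side: plain functions whose signatures name the argument type. *)
module Toy_flow = struct
  (* Calls the method and checks that the provider followed the API. *)
  let read (src : #source) buf =
    let got = src#read_into buf in
    if got <= 0 || got > Bytes.length buf then
      invalid_arg "read_into returned an invalid byte count";
    got

  (* Convenience function written once, for every provider. *)
  let read_exact (src : #source) buf =
    let rec fill off =
      if off < Bytes.length buf then begin
        let chunk = Bytes.create (Bytes.length buf - off) in
        let got = read src chunk in
        Bytes.blit chunk 0 buf off got;
        fill (off + got)
      end
    in
    fill 0
end

(* A well-behaved provider that returns at most two bytes per read. *)
let of_string data : source = object
  val mutable pos = 0
  method read_into buf =
    let n = min 2 (min (Bytes.length buf) (String.length data - pos)) in
    if n = 0 then raise End_of_file;
    Bytes.blit_string data pos buf 0 n;
    pos <- pos + n;
    n
end

(* A provider that breaks the rule by returning 0 bytes. *)
let broken : source = object
  method read_into _ = 0
end
```

Calling `Toy_flow.read broken buf` raises `Invalid_argument` instead of silently passing a 0 back to the caller, and `Toy_flow.read_exact` works with `of_string` without that provider implementing it.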

Note that the use of objects in Eio is not motivated by the use of the "Object Capabilities" security model.
Despite the name, that is not specific to objects at all.