
Commit 9cefcb6

Add intro, goals and preferred option
1 parent 13182bc commit 9cefcb6


OfflineAudioContext/explainer.md

Lines changed: 82 additions & 21 deletions
@@ -15,17 +15,17 @@
WebAudio `OfflineAudioContext.startRendering()` allocates an `AudioBuffer` large enough to hold the entire rendered output of the WebAudio graph before returning. For example, a 4 hour graph at 48 kHz with 4 channels creates roughly 4 × 3600 s × 48,000 frames/s × 4 channels × 4 bytes ≈ 11 GB of in-memory float32 data in the `AudioBuffer`. This behaviour makes the API unsuitable for very long offline renders or very large channel/length combinations. There is no simple way to chunk the output or consume it as a stream.

The [spec](https://webaudio.github.io/web-audio-api/#dom-offlineaudiocontext-startrendering) explicitly states at step 5: "Create a new AudioBuffer ... with ... length and sampleRate ... Assign this buffer to an internal slot", which means the API design currently mandates the full buffer allocation.

The participants in the [GitHub discussion](https://github.com/WebAudio/web-audio-api/issues/2445) agree that incremental delivery of data is necessary: either streaming chunks of rendered audio, or dispatching data in pieces rather than everything at once, so that memory usage is bounded and the data can be processed/consumed as it is produced.

## User-Facing Problem

The user in this context is the web developer using the WebAudio API to perform media processing workflows. Ideally, developers could use the feature-rich WebAudio API for both realtime and faster-than-realtime processing without taking a dependency on a 3rd party library. In reality, however, the current WebAudio `OfflineAudioContext` API is not suitable for faster-than-realtime processing, so the developer needs to create a WASM audio processing library or use an existing 3rd party dependency to achieve this goal.

### Goals

- Allow streaming data out of a WebAudio `OfflineAudioContext.startRendering()` call when rendering large WebAudio graphs faster-than-realtime

### Non-goals

@@ -36,17 +36,26 @@ The user in this context is the web developer using the WebAudio API. Their goal
The preferred approach is to add an options parameter with an output `mode` to `startRendering()`, allowing consumers to define the behavior of the offline rendering.

```typescript
interface StartRenderingOptions {
  mode: "audiobuffer" | "stream"
}

interface OfflineAudioContext {
  startRendering(options?: StartRenderingOptions): Promise<AudioBuffer | ReadableStream>;
}
```

If `startRendering` is passed `{ mode: "stream" }`, it will render the audio graph in quantums (e.g., 128 frames at a time) and enqueue chunks onto the returned `ReadableStream`, rather than rendering the whole graph into an `AudioBuffer` up front as it does currently. `startRendering` will return a promise that resolves to a [ReadableStream](https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream/getReader). A reader can be retrieved from the stream for reading chunks: `reader.read()` resolves each chunk to an `AudioBuffer` until the render is done, at which point it sets `done = true`.

In this mode, the user can read chunks as they arrive and consume them for storage, transcoding via WebCodecs, sending to a server, etc. An alternative is to allow BYOB reading, in which case `reader.read()` returns a `Float32Array`.

Memory usage is bounded by the size of each chunk plus the backlog of unhandled buffers.
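
As an illustration, rendered chunks could be fed directly into a WebCodecs `AudioEncoder`. This is a minimal sketch, assuming the proposed `{ mode: "stream" }` option and that each chunk arrives as an `AudioBuffer`; the codec configuration values are arbitrary examples:

```js
const encoder = new AudioEncoder({
  output: (encodedChunk) => { /* store, mux, or upload the encoded chunk */ },
  error: (e) => console.error(e),
});
encoder.configure({ codec: "opus", sampleRate: 48000, numberOfChannels: 2 });

const stream = await offlineContext.startRendering({ mode: "stream" });
const reader = stream.getReader();
let framesWritten = 0;
while (true) {
  const { done, value: chunk } = await reader.read();
  if (done) break;

  // Repack the AudioBuffer chunk as planar float32 data for AudioData
  const data = new Float32Array(chunk.length * chunk.numberOfChannels);
  for (let ch = 0; ch < chunk.numberOfChannels; ch++) {
    data.set(chunk.getChannelData(ch), ch * chunk.length);
  }
  encoder.encode(new AudioData({
    format: "f32-planar",
    sampleRate: chunk.sampleRate,
    numberOfFrames: chunk.length,
    numberOfChannels: chunk.numberOfChannels,
    timestamp: (framesWritten / chunk.sampleRate) * 1e6, // microseconds
    data,
  }));
  framesWritten += chunk.length;
}
await encoder.flush();
```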

### Questions

- What should the value returned by `reader.read()` be: a `Float32Array` (with BYOB) or an `AudioBuffer`?

### Pros

- Aligns well with other web streaming APIs, similar to [WebCodecs](https://streams.spec.whatwg.org/#readablestream)
@@ -55,42 +64,94 @@ Memory usage is bounded by the size of each chunk plus the backlog of unhandled

### Cons

- Requires spec change
- Need to define sensible chunk sizes, backpressure, error handling, and end-of-stream

### Implementing `OfflineAudioContext.startRendering()` streaming behaviour with this approach

#### Option 1: AudioBuffer stream

```js
const offlineContext = new OfflineAudioContext(...);

// ... build up WebAudio graph

const stream = await offlineContext.startRendering({ mode: "stream" });
const reader = stream.getReader();
while (true) {
  // get the next chunk of data from the stream
  const result = await reader.read();

  // the reader returns done = true when there are no more chunks to consume
  if (result.done) {
    break;
  }

  // result.value is an AudioBuffer holding the next rendered chunk
  const buffer = result.value;
}
```
#### Option 2: BYOB reading with Float32Array stream

```js
/**
 * New API
 */
const offlineContext = new OfflineAudioContext(...);

// ... build up WebAudio graph

const stream = await offlineContext.startRendering({ mode: "stream" });
const reader = stream.getReader({ mode: "byob" });
let buffer = new ArrayBuffer(...);
while (true) {
  const result = await reader.read(new Float32Array(buffer));

  // the reader returns done = true when there are no more chunks to consume
  if (result.done) {
    break;
  }

  // process result...

  // reuse the transferred backing buffer for the next read
  buffer = result.value.buffer;
}
```

In both cases, the existing API remains unchanged for backwards compatibility:

```js
/**
 * Existing API unchanged
 */
const offlineContext = new OfflineAudioContext(...);

// ... build up WebAudio graph

// Full AudioBuffer is allocated
const renderedBuffer = await offlineContext.startRendering();
```

## Alternatives considered

### Alternative 1: Event-based chunk delivery

Keep the current `startRendering()` API but do not allocate the full `AudioBuffer`. After rendering starts, periodically emit events, on the context or on a new interface, such as `ondataavailable(chunk: AudioBuffer)`.

The user can subscribe and collect chunks for processing.

At the end, the API may optionally still provide a full `AudioBuffer`.
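
A minimal sketch of what this could look like, assuming a hypothetical `dataavailable` event on the context and a hypothetical `processChunk` consumer (neither is part of any spec):

```js
const offlineContext = new OfflineAudioContext(...);

// ... build up WebAudio graph

// Hypothetical event: fires with each rendered chunk as an AudioBuffer
offlineContext.addEventListener("dataavailable", (event) => {
  processChunk(event.data);
});

// Rendering completes once all chunks have been delivered; the promise
// could optionally still resolve with the full AudioBuffer
await offlineContext.startRendering();
```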

#### Pros

- Simple to integrate with existing event-driven patterns.

#### Cons

- Chunking semantics need to be specified
- Memory benefit only materialises if the user discards chunks, but at least this is in the user's control

#### Concerns

- Browser vendors may implement the chunking API but still allocate the full buffer internally, defeating the memory reduction goal, unless the spec mandates avoiding the full allocation.
