OfflineAudioContext/explainer.md

WebAudio `OfflineAudioContext.startRendering()` allocates an `AudioBuffer` large enough to hold the rendered output of the entire WebAudio graph before returning. For example, a 4 hour audio graph at 48 kHz with 4 channels will create gigabytes of in-memory float32 data in the `AudioBuffer`. This behaviour makes the API unsuitable for very long offline renders or very large channel/length combinations. There is no simple way to chunk the output or consume it as a stream.
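
For concreteness, here is the back-of-the-envelope arithmetic behind the "gigabytes" figure above (a float32 sample is 4 bytes):

```ts
// Size of the upfront AudioBuffer allocation for a 4 hour, 48 kHz, 4 channel render.
const seconds = 4 * 60 * 60; // 4 hours
const sampleRate = 48_000;   // frames per second
const channels = 4;
const bytesPerSample = 4;    // float32

const totalBytes = seconds * sampleRate * channels * bytesPerSample;
console.log(`${(totalBytes / 1e9).toFixed(1)} GB`); // "11.1 GB" of raw sample data
```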

The [spec](https://webaudio.github.io/web-audio-api/#dom-offlineaudiocontext-startrendering) explicitly states at step 5: "Create a new AudioBuffer ... with ... length and sampleRate ... Assign this buffer to an internal slot", which means the API design currently mandates the full buffer allocation.

The participants in the [GitHub discussion](https://github.com/WebAudio/web-audio-api/issues/2445) agree that incremental delivery of data is necessary: either streaming chunks of rendered audio or dispatching data in pieces rather than everything at once, so that memory usage is bounded and the data can be processed or consumed as it is produced.
## User-Facing Problem
The user in this context is the web developer using the WebAudio API to perform media processing workflows. Ideally, developers could use the feature-rich WebAudio API for both realtime and faster-than-realtime processing without taking a dependency on a 3rd party library. In reality, however, the current WebAudio `OfflineAudioContext` API is not suitable for faster-than-realtime processing, so the developer needs to create a WASM audio processing library or use an existing 3rd party dependency to achieve this goal.
### Goals
- Allow streaming data out of a WebAudio `OfflineAudioContext.startRendering()` for rendering large WebAudio graphs faster-than-realtime
### Non-goals
The preferred approach is to add an `outputMode` option to `startRendering()` to allow consumers to define the behaviour of the offline rendering.
If `startRendering()` is passed `{ outputMode: "stream" }`, it will render the audio graph in quantums (e.g., 128 frames at a time) and enqueue chunks onto a returned `ReadableStream`, rather than rendering the whole graph into an `AudioBuffer` up front as it does currently. `startRendering()` will return a promise that resolves to a [ReadableStream](https://developer.mozilla.org/en-US/docs/Web/API/ReadableStream/getReader). A reader can be retrieved from the `ReadableStream` for reading chunks. `reader.read()` will resolve each chunk to an `AudioBuffer` until rendering completes; when no more data is available it will set `done = true`.
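
As a sketch only, the extended surface could be declared roughly as follows in TypeScript; the `outputMode` name, its values, and the return type follow the proposal above and are not shipped API:

```ts
// Proposal sketch, not shipped API. The implied default, "buffer",
// preserves today's behaviour of resolving to one full AudioBuffer.
interface OfflineRenderOptions {
  outputMode?: "buffer" | "stream";
}

interface OfflineAudioContext {
  // "stream" resolves to a ReadableStream that enqueues rendered chunks
  // (e.g. 128 frames at a time) instead of allocating the result up front.
  startRendering(
    options?: OfflineRenderOptions
  ): Promise<AudioBuffer | ReadableStream<AudioBuffer>>;
}
```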

In this mode, the user can read chunks as they arrive and consume them for storage, transcoding via WebCodecs, sending to a server, etc. An alternative is to allow BYOB reading, in which case `reader.read()` would return a `Float32Array`.
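
A minimal consumption sketch, assuming the declarations above; `consumeChunk` is a hypothetical placeholder for application logic such as a WebCodecs encoder or an upload:

```ts
declare function consumeChunk(chunk: AudioBuffer): void; // placeholder: store, transcode, send, etc.

const ctx = new OfflineAudioContext({
  numberOfChannels: 4,
  length: 4 * 60 * 60 * 48_000, // 4 hours of frames at 48 kHz
  sampleRate: 48_000,
});
// ... build the audio graph on `ctx` ...

// Proposed behaviour: resolves to a stream of chunks, not one giant AudioBuffer.
const rendered = await ctx.startRendering({ outputMode: "stream" });
const stream = rendered as ReadableStream<AudioBuffer>;
const reader = stream.getReader();

while (true) {
  const { value, done } = await reader.read();
  if (done) break; // rendering finished and the stream is closed
  consumeChunk(value); // one rendered chunk, e.g. 128 frames
}
```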
Memory usage is bounded by the size of each chunk plus the backlog of unhandled buffers.
### Questions
- What should the value resolved by `reader.read()` be: an `AudioBuffer`, or a `Float32Array` via a BYOB reader? (A sketch of the BYOB variant follows below.)
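
A hedged sketch of the BYOB variant, continuing from the example above. It assumes the rendered audio would be exposed as a readable byte stream, since standard BYOB readers fill a caller-supplied `ArrayBufferView`; `processSamples` is a hypothetical placeholder:

```ts
declare function processSamples(samples: Float32Array): void; // placeholder

// BYOB readers require a readable byte stream, so this assumes the proposal
// exposed the rendered audio that way rather than as a stream of AudioBuffers.
const byobReader = stream.getReader({ mode: "byob" });

// One 128-frame render quantum of 4 channels, reused across reads.
let view = new Float32Array(128 * 4);

while (true) {
  const { value, done } = await byobReader.read(view);
  if (done) break;
  processSamples(value); // the Float32Array filled by this read
  view = value;          // the old buffer was transferred; reuse the returned view
}
```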
### Pros
- Aligns well with other web streaming APIs: the result is a standard [ReadableStream](https://streams.spec.whatwg.org/#readablestream), the same primitive used in streaming pipelines around WebCodecs
### Cons
- Requires spec change
- Need to define sensible chunk sizes, backpressure, error handling, and end-of-stream
### Implement `OfflineAudioContext.startRendering()` streaming behaviour with this approach
### Alternative 1: Event-based chunk delivery
Keep the current `startRendering()` API but do not allocate the full `AudioBuffer`. After rendering starts, periodically emit events on the context, or on a new interface, such as `ondataavailable(chunk: AudioBuffer)`.

The user can subscribe and collect chunks for processing.

At the end, the API may optionally still provide a full `AudioBuffer`.
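
A sketch of how this alternative might look to the developer; the `dataavailable` event name and the `AudioDataEvent` payload shape are illustrative only, not shipped API:

```ts
declare function consumeChunk(chunk: AudioBuffer): void; // placeholder, as above

// Hypothetical payload for this alternative: one rendered chunk per event.
interface AudioDataEvent extends Event {
  readonly data: AudioBuffer;
}

const ctx = new OfflineAudioContext({
  numberOfChannels: 4,
  length: 4 * 60 * 60 * 48_000,
  sampleRate: 48_000,
});
// ... build the audio graph on `ctx` ...

ctx.addEventListener("dataavailable", (event) => {
  consumeChunk((event as AudioDataEvent).data);
});

// Rendering starts as today; chunks arrive via events while it runs.
void ctx.startRendering();
```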
#### Pros
- Simple to integrate with existing event-driven patterns.
#### Cons
- Chunking semantics (chunk size, backpressure, end-of-stream) would still need to be specified
- The memory benefit is only realized if the user discards chunks after handling them, though at least this is under the user's control
#### Concerns
- Browser vendors may implement the chunking API but still allocate the full buffer internally, defeating the memory-reduction goal, unless the spec mandates avoiding the full allocation.