
Conversation

@mattbirman

Explainer for improving WebAudio's `OfflineAudioContext.startRendering()` by adding streaming output.

@mattbirman force-pushed the mattbirman/webaudio-offlineaudiocontext-explainer branch from 9aa7b65 to 9cefcb6 on October 31, 2025 11:31
@mattbirman changed the title from "[WIP] WebAudio OfflineAudioContext startRendering() explainer" to "WebAudio OfflineAudioContext startRendering() explainer" on Nov 1, 2025
mattbirman and others added 4 commits on November 1, 2025 11:45
Extraneous 'is'

Co-authored-by: Nishitha Burman Dey <54861497+nishitha-burman@users.noreply.github.com>
@mattbirman (Author):

Thanks all, I've updated the PR incorporating the feedback. Looking forward to your thoughts.

@mattbirman marked this pull request as ready for review on November 3, 2025 03:24
@mattbirman changed the title from "WebAudio OfflineAudioContext startRendering() explainer" to "WebAudio OfflineAudioContext streaming API" on Nov 3, 2025
@gabrielsanbrito (Contributor) left a comment

I like the detailed "user-facing problem" section. Just one minor comment, otherwise LGTM!


### Output format

There is an open question of which data format `startRenderingStream()` should return. The options under consideration are `AudioBuffer`, planar `Float32Array`, or interleaved `Float32Array`.
Contributor:

How is the size of each audio chunk delivered through the audio stream determined? Do we need to give developers control of this parameter?
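
One hypothetical shape such a control could take (the `chunkLength` option below is purely illustrative and not part of the proposal):

```js
// Hypothetical sketch only: `chunkLength` is not in the proposal.
// It would let callers choose how many frames each delivered chunk holds.
const stream = context.startRenderingStream({
  format: "f32-planar",
  chunkLength: 1024, // frames per chunk; a default could be the 128-frame render quantum
});
```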


```js
// From https://developer.mozilla.org/en-US/docs/Web/API/AudioData/format
enum AudioFormat {
```
Contributor:

I'm new to this issue, but it seems like two separate feature requests:

(1) OfflineAudioContext streaming.
(2) WebAudio/WebCodecs interop, or adding more audio formats to Web Audio.

It might be more tractable to solve (1) first with AudioBuffers and then tackle (2) afterwards. Then again, looking at past discussions, maybe bundling these is the way to go. Either way, I like how this explainer goes through all of the different options. Even if we tackle (1) first, we'll want to do so in a way that's forward-looking to support (2).


I don't think we should add the idea of sample formats to WebAudio. All of it is already designed to be f32 from start to finish and I think it would be a lot to change that.

@mattbirman (Author) on Nov 4, 2025:

The code comment above the interface explains that it comes from an existing part of the spec; I don't propose adding these formats to WebAudio. The only addition is f32-interleaved, which leaves the question of how we add this format if we want to use it as the output format.

@mattbirman (Author):

I've removed this section.

```js
const reader = context.startRenderingStream({ format: "f32-planar" }).getReader();
while (true) {
  // get the next chunk of data from the stream
  const result = await reader.read();
```
Contributor:

Can we expand on this example code to actually use result so we can see what the result contains?

```js
  break;
}
```

```js
const buffers = result.value;
```
Contributor:

Does the value contain multiple buffers?

@mattbirman (Author):

If we go with f32-interleaved, then this value will contain interleaved Float32Array values.
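
For illustration, a minimal sketch of consuming one such chunk, assuming a stereo context where `result.value` is a single interleaved `Float32Array` (samples ordered L0, R0, L1, R1, ...):

```js
// Assumption: stereo output, `result.value` is one interleaved Float32Array.
const interleaved = result.value;
const frameCount = interleaved.length / 2; // two channels
const left = new Float32Array(frameCount);
const right = new Float32Array(frameCount);
for (let i = 0; i < frameCount; i++) {
  left[i] = interleaved[2 * i];      // even indices: left channel
  right[i] = interleaved[2 * i + 1]; // odd indices: right channel
}
```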


- Simple to integrate with existing event-driven patterns

### Cons
Contributor:

I think the main con is that the web platform already has a pattern for streaming (ReadableStreams). Using this provides interop with the rest of the platform that uses ReadableStreams. You implicitly mentioned this above with the BYOB reader. ReadableStreams also provides a pattern that many developers are familiar with. One other potential alternative would be to use a MediaStream instead of a ReadableStream. The non-offline AudioContext provides MediaStreams through the MediaStreamAudioDestinationNode.
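
For reference, a BYOB read loop over the proposed stream might look like the sketch below, assuming `startRenderingStream()` returns a byte stream (`type: "bytes"`); the BYOB mechanics are standard Streams API, while `processChunk` is a placeholder for application code:

```js
// Assumes the returned ReadableStream is a byte stream, so a BYOB
// (bring-your-own-buffer) reader is available. Runs in an async context.
const stream = context.startRenderingStream({ format: "f32-planar" });
const reader = stream.getReader({ mode: "byob" });

let buffer = new ArrayBuffer(4096); // room for 1024 f32 samples
while (true) {
  // Lend our buffer to the stream; it comes back (detached and re-wrapped)
  // as `value`, filled with as much data as was available.
  const { value, done } = await reader.read(new Float32Array(buffer));
  if (done) break;
  processChunk(value);   // placeholder for application code
  buffer = value.buffer; // recycle the buffer for the next read
}
```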

@matanui159 left a comment

Sorry for all the nitpicks 😅

**Pros**

- semantically closest to the `startRendering()` API
- does not add a new type to the WebAudio spec


WebAudio already uses `Float32Array` in its `AudioBuffer`s and as the inputs/outputs of `AudioWorkletProcessor.process()`.
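
For reference, a minimal passthrough `AudioWorkletProcessor` showing that existing planar `Float32Array` surface:

```js
// process() receives planar Float32Arrays: inputs[0] and outputs[0] are
// arrays with one Float32Array plane per channel.
class PassthroughProcessor extends AudioWorkletProcessor {
  process(inputs, outputs) {
    for (let channel = 0; channel < outputs[0].length; channel++) {
      if (inputs[0][channel]) {
        outputs[0][channel].set(inputs[0][channel]);
      }
    }
    return true; // keep the processor alive
  }
}
registerProcessor("passthrough-processor", PassthroughProcessor);
```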


**Cons**

- requires `startRenderingStream()` to return an array of `Float32Array`s in planar format, one for each output channel


To clarify this point more: we could technically have each chunk of Float32Array be planar (e.g. with 128 left samples followed by 128 right samples), but it makes more sense for a stream of samples (with no explicit separation like AudioBuffer provides) to be interleaved (it also makes BYOB easier, since you can read as many samples as you need).

The better alternative, if we wanted to stream planar Float32Array audio data, is an array of ReadableStreams, each with their own Float32Array output (not an array of Float32Array). This would still allow BYOB reading, but with explicitly separate planes for each channel.
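
A sketch of that alternative, assuming a hypothetical variant of `startRenderingStream()` that returns one `ReadableStream` per output channel (`handleChannelChunk` is a placeholder):

```js
// Hypothetical: an array of ReadableStreams, one per output channel,
// each yielding Float32Array chunks for just that channel's plane.
const channelStreams = context.startRenderingStream({ format: "f32-planar" });
await Promise.all(
  channelStreams.map(async (stream, channel) => {
    for await (const chunk of stream) {
      handleChannelChunk(channel, chunk); // placeholder for application code
    }
  })
);
```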

**Pros**

- allows for streaming a single stream of data, rather than one for each channel
- interoperates with WebCodecs APIs because they operate with interleaved streams of data


WebCodecs supports both planar AND interleaved audio, so this point isn't true (and since you can get the individual planes from an AudioBuffer, it can also easily be used in WebCodecs).
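
To illustrate, a minimal sketch of wrapping an `AudioBuffer`'s planes for WebCodecs using the standard `AudioData` constructor (the helper name here is made up):

```js
// Copies an AudioBuffer's channel planes into a single backing buffer
// and wraps it as a WebCodecs AudioData in "f32-planar" format.
function audioBufferToAudioData(buffer, timestampUs = 0) {
  const planes = new Float32Array(buffer.length * buffer.numberOfChannels);
  for (let c = 0; c < buffer.numberOfChannels; c++) {
    planes.set(buffer.getChannelData(c), c * buffer.length);
  }
  return new AudioData({
    format: "f32-planar",
    sampleRate: buffer.sampleRate,
    numberOfFrames: buffer.length,
    numberOfChannels: buffer.numberOfChannels,
    timestamp: timestampUs, // microseconds
    data: planes,
  });
}
```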


**Cons**

- introduces a new type to the WebAudio spec, `f32-interleaved`, which does not exist at the moment


To clarify more on this point too: most of WebAudio uses AudioBuffer for raw data, with the only exception (afaik?) being the AudioWorkletProcessor, which uses planes of Float32Array. Even though a single stream of interleaved Float32Array makes the most sense for a BYOB stream, it would be inconsistent with the only other usage of Float32Array in WebAudio, which could be confusing.


#### Recommendation

`f32-interleaved` as it is the most interoperable with other media APIs, like WebCodecs, and simplifies processing with other data streams such as video.


Again, WebCodecs supports both. I don't know what other web APIs there are that only support interleaved?

@mattbirman (Author):

So what should we put here for the format? `f32` or `f32-planar`?

@SteveBeckerMSFT (Contributor):

Thanks for the updates! LGTM!
