WebAudio OfflineAudioContext streaming API #1183

Conversation
Force-pushed from 9aa7b65 to 9cefcb6
Extraneous 'is'

Co-authored-by: Nishitha Burman Dey <54861497+nishitha-burman@users.noreply.github.com>
Thanks all, I've updated the PR incorporating the feedback. Looking forward to your thoughts.
I like the detailed "user-facing problem" section. Just one minor comment, otherwise LGTM!
> ### Output format
>
> There is an open question of what data format `startRenderingStream()` should return. The options under consideration are `AudioBuffer`, `Float32Array` planar, or `Float32Array` interleaved.
How is the size of each audio chunk delivered through the audio stream determined? Do we need to give developers control of this parameter?
OfflineAudioContext/explainer.md (outdated)

```js
// From https://developer.mozilla.org/en-US/docs/Web/API/AudioData/format
enum AudioFormat {
```
I'm new to this issue, but it seems like two separate feature requests:
(1) OfflineAudioContext streaming.
(2) WebAudio/WebCodecs interop, or adding more audio formats to Web Audio.
It might be more tractable to solve (1) first with AudioBuffers and then tackle (2) afterwards. Then again, looking at past discussions, maybe bundling these is the way to go. Either way, I like how this explainer goes through all of the different options. Even if we tackle (1) first, we'll want to do so in a way that's forward-looking to support (2).
I don't think we should add the idea of sample formats to WebAudio. All of it is already designed to be f32 from start to finish and I think it would be a lot to change that.
The code comment above the interface explains that it comes from an existing part of the spec; I don't propose adding these formats to WebAudio. The only addition is f32-interleaved, which leaves the question of how we add this format if we want to use it as the output format.
I've removed this section.
OfflineAudioContext/explainer.md (outdated)

```js
const reader = context.startRenderingStream({ format: "f32-planar" }).reader();
while (true) {
  // get the next chunk of data from the stream
  const result = await reader.read();
```
Can we expand on this example code to actually use result so we can see what the result contains?
OfflineAudioContext/explainer.md (outdated)

```js
  break;
}

const buffers = result.value;
```
Does the value contain multiple buffers?
If we go with f32-interleaved, then this value will contain interleaved Float32Array values.
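To make that concrete, here is a minimal sketch of consuming such a chunk, assuming a stereo context and assuming `result.value` is a single interleaved `Float32Array`; `startRenderingStream()` is still a proposal, so the exact shape is an assumption:

```js
// Hypothetical continuation of the read loop above: de-interleaving one
// chunk, assuming result.value is [L0, R0, L1, R1, ...] for stereo.
const interleaved = result.value;
const channelCount = 2; // known from the OfflineAudioContext's configuration
const frames = interleaved.length / channelCount;
const left = new Float32Array(frames);
const right = new Float32Array(frames);
for (let i = 0; i < frames; i++) {
  left[i] = interleaved[i * channelCount];
  right[i] = interleaved[i * channelCount + 1];
}
```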
> - Simple to integrate with existing event-driven patterns
>
> ### Cons
I think the main con is that the web platform already has a pattern for streaming (ReadableStreams). Using this provides interop with the rest of the platform that uses ReadableStreams. You implicitly mentioned this above with the BYOB reader. ReadableStreams also provides a pattern that many developers are familiar with. One other potential alternative would be to use a MediaStream instead of a ReadableStream. The non-offline AudioContext provides MediaStreams through the MediaStreamAudioDestinationNode.
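For reference, a minimal sketch of the realtime pattern mentioned above; these are existing Web Audio APIs, though the source node here is just a placeholder:

```js
// A realtime AudioContext can already expose its output as a MediaStream
// through MediaStreamAudioDestinationNode.
const audioCtx = new AudioContext();
const source = audioCtx.createBufferSource(); // placeholder source node
const dest = audioCtx.createMediaStreamDestination();
source.connect(dest);
const mediaStream = dest.stream; // consumable by MediaRecorder, WebRTC, etc.
```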
Sorry for all the nitpicks 😅
OfflineAudioContext/explainer.md (outdated)

> **Pros**
>
> - semantically closest to the `startRendering()` API
> - does not add a new type to the WebAudio spec
WebAudio already uses Float32Array in its AudioBuffers and as the inputs and outputs of AudioWorkletProcessor.process.
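To illustrate the existing usage referred to here, a minimal pass-through sketch of an AudioWorkletProcessor, which already works on planar Float32Array channel data (the processor name is made up):

```js
// Runs inside an AudioWorklet module. process() receives and fills
// planes of Float32Array, one per channel, typically 128 samples each.
class PassthroughProcessor extends AudioWorkletProcessor {
  process(inputs, outputs) {
    const input = inputs[0];   // array of Float32Array planes
    const output = outputs[0];
    for (let ch = 0; ch < input.length; ch++) {
      output[ch].set(input[ch]); // copy each channel plane through
    }
    return true; // keep the processor alive
  }
}
registerProcessor("passthrough-processor", PassthroughProcessor);
```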
> **Cons**
>
> - requires the output of `startStreamingRendering()` to return an array of `Float32Array` in planar format for each output channel
To clarify on this point more, we could technically have each chunk of Float32Array be planar (e.g. with 128 left samples followed by 128 right samples), but it makes more sense for a stream of samples (with no explicit separation like AudioBuffer provides) to be interleaved (this also makes BYOB easier, since you can read as many samples as you need).

The better alternative, if we wanted to stream planar Float32Array audio data, is an array of ReadableStreams, each with its own Float32Array output (not a single stream of Float32Array arrays). This would still allow BYOB reading, but with explicitly separate planes for each channel.
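A minimal sketch of the interleaved BYOB read described above; `startRenderingStream()` and its options are assumptions from this proposal, not a shipped API, and the sketch assumes the stream is a readable byte stream:

```js
// Hypothetical: read arbitrary-sized chunks of interleaved f32 samples
// with a BYOB reader; frame boundaries fall every channelCount floats.
// (Runs inside an async function.)
const stream = context.startRenderingStream({ format: "f32-interleaved" });
const reader = stream.getReader({ mode: "byob" });

let buffer = new ArrayBuffer(4096 * Float32Array.BYTES_PER_ELEMENT);
while (true) {
  const { value, done } = await reader.read(new Float32Array(buffer));
  if (done) break;
  buffer = value.buffer;   // BYOB transfers the buffer back on each read
  handleChunk(value);      // placeholder consumer for the samples read
}
```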
OfflineAudioContext/explainer.md (outdated)

> **Pros**
>
> - allows for streaming a single stream of data, rather than one for each channel
> - interoperates with WebCodecs APIs because they operate with interleaved streams of data
WebCodecs supports both planar AND interleaved audio, so this point isn't true (and since you can get the individual planes from an AudioBuffer, it can also easily be used with WebCodecs).
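For example, a sketch of handing an AudioBuffer's planar data to WebCodecs via AudioData (a real WebCodecs constructor; the helper name and timestamp handling are just illustrative):

```js
// AudioData accepts "f32-planar", so an AudioBuffer's channel planes
// can be concatenated and passed through directly.
function audioBufferToAudioData(audioBuffer, timestampUs = 0) {
  const { numberOfChannels, length, sampleRate } = audioBuffer;
  const planar = new Float32Array(numberOfChannels * length);
  for (let ch = 0; ch < numberOfChannels; ch++) {
    planar.set(audioBuffer.getChannelData(ch), ch * length);
  }
  return new AudioData({
    format: "f32-planar",
    sampleRate,
    numberOfFrames: length,
    numberOfChannels,
    timestamp: timestampUs, // microseconds
    data: planar,
  });
}
```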
OfflineAudioContext/explainer.md (outdated)

> **Cons**
>
> - introduces a new type to the WebAudio spec, `f32-interleave`, which does not exist at the moment
To clarify more on this point too, most of WebAudio uses AudioBuffer for raw data with the only exception (afaik?) being the AudioWorkletProcessor which uses planes of Float32Array. Even though a single stream of interleaved Float32Array makes the most sense for a BYOB stream, it would be inconsistent with the only other usage of Float32Array in WebAudio which could be confusing.
OfflineAudioContext/explainer.md (outdated)

> #### Recommendation
>
> `f32-interleaved`, as it is the most interoperable with other media APIs, like WebCodecs, and simplifies processing with other data streams such as video.
Again, WebCodecs supports both. I don't know what other web APIs there are that only support interleaved.
So what should we put here for the format? f32 or f32-planar?
Thanks for the updates! LGTM!
Explainer for improving the WebAudio `OfflineAudioContext.startRendering()` API by adding streaming output.