Mixing is done via the ma_mixer API. You can use this if you want to mix multiple sources of audio together and play them all at the same time, layered on top
of each other. This is a mid-level procedural API. Do not confuse this with a high-level data-driven API. You do not "attach" and "detach" sounds, but instead
write raw audio data directly into an accumulation buffer procedurally. High-level data-driven APIs will be coming at a later date.
Below are the features of the ma_mixer API:
* Mixing to and from any data format with seamless conversion when necessary.
* Initialize the `ma_mixer` object using whatever format is convenient, and then mix audio in any other format with seamless data conversion.
* Submixing (mix one `ma_mixer` directly into another `ma_mixer`, with volume and effect control).
* Volume control.
* Effects (via the `ma_effect` API).
* Mix directly from raw audio data in addition to `ma_decoder`, `ma_waveform`, `ma_noise`, `ma_pcm_rb` and `ma_rb` objects.
Mixing sounds together is as simple as summing their samples. As samples are summed together they are stored in a buffer called the accumulation buffer. In
order to ensure there is enough precision to store the intermediary results, the accumulation buffer needs to be at a higher bit depth than the sample format
being mixed, with the exception of floating point. Below is a mapping of the sample format and the data type of the accumulation buffer:
+---------------+------------------------+
| Sample Format | Accumulation Data Type |
+---------------+------------------------+
| ma_format_u8  | ma_int16               |
| ma_format_s16 | ma_int32               |
| ma_format_s24 | ma_int64               |
| ma_format_s32 | ma_int64               |
| ma_format_f32 | float                  |
+---------------+------------------------+
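To illustrate why the accumulation buffer is wider than the sample format, the sketch below sums 16-bit samples into a 32-bit accumulator. It is a standalone
illustration and not miniaudio code: `sketch_accumulate_s16()` is a hypothetical helper, and plain `<stdint.h>` types stand in for `ma_int16` and `ma_int32`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: adds 16-bit input samples onto a 32-bit accumulation buffer.
// Accumulation always starts at the beginning of the buffer.
static void sketch_accumulate_s16(int32_t* pAccum, const int16_t* pIn, size_t sampleCount)
{
    size_t i;
    assert(pAccum != NULL && pIn != NULL);
    for (i = 0; i < sampleCount; i += 1) {
        pAccum[i] += pIn[i]; // Widened to 32 bits, so the sum cannot wrap at 16 bits.
    }
}
```

Two loud sources with a sample value of 30000 each accumulate to 60000, which is outside the 16-bit range but fits comfortably in the 32-bit accumulator.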
The size of the accumulation buffer is fixed and must be specified at initialization time. When you initialize a mixer you need to also specify a sample format
which will be the format of the returned data after mixing. The format is also what's used to determine the bit depth to use for the accumulation buffer and
how to interpret the data contained within it. You must also specify a channel count in order to support interleaved multi-channel data. The sample rate is not
required by the mixer as it only cares about raw sample data.
The mixing process involves three main steps:
1) Clearing the accumulation buffer to zero
    ma_mixer_begin()
2) Accumulating all audio sources
    ma_mixer_mix_pcm_frames()
    ma_mixer_mix_data_source()
    ma_mixer_mix_rb()
    ma_mixer_mix_pcm_rb()
3) Volume, clipping, effects and final output
    ma_mixer_end()
At the beginning of mixing the accumulation buffer will be cleared to zero. When you begin mixing you need to specify the number of PCM frames you want to
output at the end of mixing. If the requested number of output frames exceeds the capacity of the internal accumulation buffer, it will be clamped and returned
back to the caller. An effect can be applied at the end of mixing (after volume and clipping). Effects can do resampling which means the number of input frames
required to generate the requested number of output frames may be different. Therefore, another parameter is required which will receive the input frame count.
When mixing audio sources, you must do so based on the input frame count, not the output frame count (usage examples are in the next section).
After the accumulation buffer has been cleared to zero (the first step), you can start mixing audio data. When you mix audio data you should do so based on the
required number of input frames returned by ma_mixer_begin() or ma_mixer_begin_submix(). You can specify audio data in any data format in which case the data
will be automatically converted to the format required by the accumulation buffer. Input data can be specified in multiple ways:
- A pointer to raw PCM data
- A data source (ma_data_source, ma_decoder, ma_audio_buffer, ma_waveform, ma_noise)
- A ring buffer (ma_rb, ma_pcm_rb)
Once you've finished accumulating all of your audio sources you need to perform a post process step which performs the final volume adjustment, clipping,
effects and copying to the specified output buffer in the format specified when the mixer was initialized. Volume is applied before clipping, which is applied
before the effect, which is done before final output. In between these steps is all of the necessary data conversion, so for performance it's important to be
mindful of where and when data will be converted.
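The ordering of the post process step can be sketched for a 32-bit accumulator producing 16-bit output. This is only an illustration under simplifying
assumptions: `sketch_end_s16()` is a hypothetical helper, the effect stage is left out, and no data conversion beyond the final narrowing is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical post process for a 32-bit accumulator producing 16-bit output.
// The effect stage is omitted; only volume, clipping and output are shown.
static void sketch_end_s16(const int32_t* pAccum, int16_t* pOut, size_t sampleCount, float volume)
{
    size_t i;
    assert(pAccum != NULL && pOut != NULL);
    for (i = 0; i < sampleCount; i += 1) {
        float s = (float)pAccum[i] * volume;    // 1) Volume.
        if (s >  32767.0f) { s =  32767.0f; }   // 2) Clipping.
        if (s < -32768.0f) { s = -32768.0f; }
        pOut[i] = (int16_t)s;                   // 3) Output.
    }
}
```

The order matters: an accumulated value of 60000 at a volume of 0.5 comes out as 30000 without any clipping, whereas clipping first would have reduced it to
32767 and then to about 16383 after the volume was applied.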
The mixing API in miniaudio supports seamless data conversion at all stages of the mixing pipeline. If you're not mindful about the data formats used by each
of the different stages of the mixing pipeline you may introduce unnecessary inefficiency. For maximum performance you should use a consistent sample format,
channel count and sample rate for as much of the mixing pipeline as possible. As soon as you introduce a different format, the mixing pipeline will perform the
necessary data conversion.
Before you can initialize a mixer you need to specify its configuration via a `ma_mixer_config` object. This can be created with `ma_mixer_config_init()`
which requires the mixing format, channel count, size of the intermediary buffer in PCM frames and an optional pointer to a pre-allocated accumulation buffer.
Once you have the configuration set up, you can call `ma_mixer_init()` to initialize the mixer. If you passed in NULL for the pre-allocated accumulation buffer
this will allocate it on the heap for you, using custom allocation callbacks specified in the `allocationCallbacks` member of the mixer config.
Below is an example for mixing two decoders together:
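```c
ma_mixer_begin(&mixer, NULL, &frameCountOut, &frameCountIn); // On input, frameCountOut holds the desired number of output frames.
// At this point, frameCountIn contains the number of frames we should be mixing in this iteration, whereas frameCountOut contains the number of output
// frames that ma_mixer_end() will write. Mix each decoder here with ma_mixer_mix_data_source(), using frameCountIn as the frame count.
ma_mixer_end(&mixer, NULL, pFinalMix, 0); // pFinalMix must be large enough to store frameCountOut frames in the mixer's format (specified at initialization time).
```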
When you want to mix sounds together, you need to specify how many output frames you would like to end up with by the end. This depends on the size of the
accumulation buffer, however, which is of a fixed size. Therefore, the number of output frames you ask for is not necessarily what you'll get. In addition, an
effect can be applied at the end of mixing, and since that may perform resampling, the number of input frames required to generate the desired number of output
frames may differ which means you must also specify a pointer to a variable which will receive the required input frame count. In order to avoid glitching you
should write all of these input frames if they're available.
The ma_mixer API uses a sort of "immediate mode" design. The idea is that you "begin" and "end" mixing. When you begin mixing a number of frames you need to
call `ma_mixer_begin()`. This will initialize the accumulation buffer to zero (silence) in preparation for mixing. Next, you can start mixing audio data which
can be done in several ways, depending on the source of the audio data. In the example above we are using a `ma_decoder` as the input data source. This will
automatically convert the input data to an appropriate format for mixing.
Each call to ma_mixer_mix_*() accumulates from the beginning of the accumulation buffer.
Once all of your input data has been mixed you need to call `ma_mixer_end()`. This is where the data in the accumulation buffer has volume applied, is clipped
and has the effect applied, in that order. Finally, the data is output to the specified buffer in the format specified when the mixer was first initialized,
overwriting anything that was previously contained within the buffer, unless it's a submix in which case it will be mixed with the parent mixer. See the
submixing section below for more details.
The mixing API also supports submixing. This is where the final output of one mixer is mixed directly into the accumulation buffer of another mixer. A common
example is a game with a music submix and an effects submix, which are then combined to form the master mix. Example:
```c
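ma_uint64 frameCountIn;
ma_uint64 frameCountOut = desiredOutputFrameCount; // <-- Must be set to the desired number of output frames. Upon returning, will contain the actual number of output frames.
ma_mixer_begin(&masterMixer, NULL, &frameCountOut, &frameCountIn);
// Between the master begin/end pair: begin each submix with &masterMixer as its parent, mix its sources, then end it with the same parent.
ma_mixer_end(&masterMixer, NULL, pFinalMix, 0); // pFinalMix must be large enough to store frameCountOut frames in the mixer's format (specified at initialization time).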
```
If you want to use submixing, you need to ensure the accumulation buffers of the mixers are large enough to accommodate each other. That is, the accumulation
buffer of the sub-mixer needs to be large enough to store the required number of input frames returned by the parent call to `ma_mixer_begin()`. If you are not
doing any resampling you can just make the accumulation buffers the same size and you will be fine. If you want to submix, you can only call `ma_mixer_begin()`
between the begin and end pairs of the parent mixer, which can be a master mix or another submix.
Implementation Details and Performance Guidelines
-------------------------------------------------
There are two main factors which affect mixing performance: data conversion and data movement. This section will detail the implementation of the ma_mixer API
and hopefully give you a picture on how best to identify and avoid potential performance pitfalls.
TODO: Write me.
Below is a summary of some things to keep in mind for high performance mixing:
* Choose a sample format at compile time and use it for everything. Optimized pipelines will be implemented for ma_format_s16 and ma_format_f32. The most
common format is ma_format_f32 which will work in almost all cases. If you're building a game, ma_format_s16 may also work. Professional audio work will
likely require ma_format_f32 for the added precision for authoring work. Do not use ma_format_s24 if you have high performance requirements as it is not
nicely aligned and thus requires an inefficient conversion to 32-bit.
* If you're building a game, try to use a consistent sample format, channel count and sample rate for all of your audio files, or at least all of your
audio files for a specific category (same format for all sfx, same format for all music, same format for all voices, etc.)
* Be mindful of when you perform resampling. Most desktop platforms output at a sample rate of 48000Hz or 44100Hz. If your input data is, for example,
22050Hz, consider doing your mixing at 22050Hz, and then doing a final resample to the playback device's output format. In this example, resampling all
of your data sources to 48000Hz before mixing may be unnecessarily inefficient because it'll need to perform mixing on a greater number of samples.
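The arithmetic behind the last point can be sketched with a small helper. `sketch_samples_mixed_per_second()` is a hypothetical name used only for this
illustration; it counts how many samples are accumulated for each second of audio and ignores the cost of the resampling itself.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: number of samples accumulated per second of audio when
// mixing sourceCount sources at the given sample rate and channel count.
static uint64_t sketch_samples_mixed_per_second(uint32_t sourceCount, uint32_t sampleRate, uint32_t channels)
{
    assert(channels > 0);
    return (uint64_t)sourceCount * sampleRate * channels;
}
```

With 8 stereo sources, mixing at 22050Hz accumulates 352800 samples per second, whereas mixing at 48000Hz accumulates 768000, roughly 2.2 times as many. Mixing
at the lower rate also means a single resample of the final mix instead of one per source.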
    ma_format format;   /* This will be the format output by ma_mixer_end(). */
    ma_uint32 channels;
    ma_uint64 accumulationBufferSizeInFrames;
    void* pAccumulationBuffer;  /* In the accumulation format. */
    ma_allocation_callbacks allocationCallbacks;
    ma_bool32 ownsAccumulationBuffer;
    float volume;
    ma_effect* pEffect; /* The effect to apply after mixing input sources. */
    struct
    {
        ma_uint64 frameCountIn;
        ma_uint64 frameCountOut;
        ma_bool32 isInsideBeginEnd;
    } mixingState;
} ma_mixer;
/*
Initialize a mixer.
A mixer is used to mix/layer/blend sounds together.
Parameters
----------
pConfig (in)
A pointer to the mixer's configuration. Cannot be NULL. See remarks.
pMixer (out)
A pointer to the mixer object being initialized.
Return Value
------------
MA_SUCCESS if successful; any other error code otherwise.
Thread Safety
-------------
Unsafe. You should not be trying to initialize a mixer from one thread, while at the same time trying to use it on another.
Callback Safety
---------------
This is safe to call in the data callback, but if you do so, keep in mind that if you do not supply a pre-allocated accumulation buffer it will allocate
memory on the heap for you.
Remarks
-------
The mixer can be configured via the `pConfig` argument. The config object is initialized with `ma_mixer_config_init()`. Individual configuration settings can
then be set directly on the structure. Below are the members of the `ma_mixer_config` object.
format
The sample format to use for mixing. This is the format that will be output by `ma_mixer_end()`.
channels
The channel count to use for mixing. This is the number of channels that will be output by `ma_mixer_end()`.
accumulationBufferSizeInFrames
A mixer uses a fixed size buffer for its entire lifetime. This specifies the size in PCM frames of the accumulation buffer. When calling
`ma_mixer_begin()`, the requested output frame count will be clamped based on the value of this property. You should not use this property to
determine how many frames to mix at a time with `ma_mixer_mix_*()` - use the value returned by `ma_mixer_begin()`.
pPreAllocatedAccumulationBuffer
A pointer to a pre-allocated buffer to use for the accumulation buffer. This can be null in which case a buffer will be allocated for you using the
specified allocation callbacks, if any. You can calculate the size in bytes of the accumulation buffer like so:
Note that you should _not_ use `ma_get_bytes_per_frame()` when calculating the size of the buffer because the accumulation buffer requires a higher bit
depth for accumulation in order to avoid wrapping.
allocationCallbacks
Memory allocation callbacks to use for allocating memory for the accumulation buffer. If all callbacks in this object are NULL, `MA_MALLOC()` and
`MA_FREE()` will be used.
volume
The default output volume in linear scale. Defaults to 1. This can be changed after initialization with `ma_mixer_set_volume()`.
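For the `pPreAllocatedAccumulationBuffer` member described above, the size calculation can be sketched from the accumulation data type table in the overview.
The names below (`sketch_format`, `sketch_accumulation_bytes_per_sample()`, `sketch_accumulation_buffer_size_in_bytes()`) are hypothetical and exist only for
this illustration; they are not part of the miniaudio API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical mirror of the sample formats listed in the accumulation table.
typedef enum { sketch_format_u8, sketch_format_s16, sketch_format_s24, sketch_format_s32, sketch_format_f32 } sketch_format;

// Size in bytes of one accumulated sample (u8 -> int16, s16 -> int32, s24 and s32 -> int64, f32 -> float).
static size_t sketch_accumulation_bytes_per_sample(sketch_format format)
{
    switch (format) {
        case sketch_format_u8:  return sizeof(int16_t);
        case sketch_format_s16: return sizeof(int32_t);
        case sketch_format_s24: return sizeof(int64_t);
        case sketch_format_s32: return sizeof(int64_t);
        case sketch_format_f32: return sizeof(float);
    }
    return 0; // Unknown format.
}

// Size in bytes of an accumulation buffer holding sizeInFrames interleaved frames.
static size_t sketch_accumulation_buffer_size_in_bytes(sketch_format format, uint32_t channels, uint64_t sizeInFrames)
{
    assert(channels > 0);
    return sketch_accumulation_bytes_per_sample(format) * channels * (size_t)sizeInFrames;
}
```

For a stereo 16-bit mixer with a 1024-frame accumulation buffer this gives 8192 bytes, twice what a 16-bit output buffer of the same length needs.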
Unsafe. You should not be uninitializing a mixer while using it on another thread.
Callback Safety
---------------
If you did not specify a pre-allocated accumulation buffer, this will free it.
Remarks
-------
If you specified a pre-allocated buffer it will be left as-is. Otherwise it will be freed using the allocation callbacks specified in the config when the mixer
was initialized.
*/
MA_API void ma_mixer_uninit(ma_mixer* pMixer);
/*
Marks the beginning of a mix of a specified number of frames.
When you begin mixing, you must specify how many frames you want to mix. You specify the number of output frames you want, and upon returning you will receive
the number of output frames you'll actually get. When an effect is attached, there may be a chance that the number of input frames required to output the given
output frame count differs. The input frame count is also returned, and this is the number of frames you must use with the `ma_mixer_mix_*()` APIs, provided that
number of input frames are available to you at mixing time.
Each call to `ma_mixer_begin()` must be matched with a call to `ma_mixer_end()`. In between these you mix audio data using the `ma_mixer_mix_*()` APIs. When
you call `ma_mixer_end()`, the number of frames that are output will be equal to the output frame count. When you call `ma_mixer_mix_*()`, you specify a frame
count based on the input frame count.
Parameters
----------
pMixer (in)
A pointer to the relevant mixer.
pParentMixer (in, optional)
A pointer to the parent mixer. Set this to non-NULL if you want the output of `pMixer` to be mixed with `pParentMixer`. Otherwise, if you want to output
directly to a buffer, set this to NULL. You would set this to NULL for a master mixer, and non-NULL for a submix. See remarks.
pFrameCountOut (in, out)
On input, specifies the desired number of output frames to mix in this iteration. The requested number of output frames may not be able to fit in the
internal accumulation buffer which means on output this variable will receive the actual number of output frames. On input, this will be ignored if
`pParentMixer` is non-NULL because the output frame count of a submix must be compatible with the parent mixer.
pFrameCountIn (out)
A pointer to the variable that will receive the number of input frames to mix with each call to `ma_mixer_mix_*()`. This will usually equal the
output frame count, but will be different if an effect is applied and that effect performs resampling. See remarks.
Return Value
------------
MA_SUCCESS if successful; any other error code otherwise.
Thread Safety
-------------
This can be called from any thread so long as you perform your own synchronization against the `pMixer` and `pParentMixer` object.
Callback Safety
---------------
Safe.
Remarks
-------
When you call `ma_mixer_begin()`, you need to specify how many output frames you want. The number of input frames required to generate those output frames can
differ, however. This will only happen if you have an effect attached (see `ma_mixer_set_effect()`) and if one of the effects in the chain performs resampling.
The input frame count will be returned by the `pFrameCountIn` parameter, and this is how many frames should be used when mixing with `ma_mixer_mix_*()`. See
examples below.
The mixer API supports the concept of submixing which is where the output of one mixer is mixed with that of another. A common example from a game:
Master
    SFX
    Music
    Voices
In the example above, "Master" is the master mix and "SFX", "Music" and "Voices" are submixes. When you call `ma_mixer_begin()` for the "Master" mix, you would
set `pParentMixer` to NULL. For the "SFX", "Music" and "Voices" you would set it to a pointer to the master mixer, and you must call `ma_mixer_begin()` and
`ma_mixer_end()` between the begin and end pairs of the parent mixer. If you want to perform submixing, you need to pass the same parent mixer (`pParentMixer`)
to `ma_mixer_end()`. See example 2 for an example on how to do submixing.
Example 1
---------
This example shows a basic mixer without any submixing.
```c
ma_uint64 frameCountIn;
ma_uint64 frameCountOut = desiredFrameCount; // <-- On input specifies what you want, on output receives what you actually got.
ma_mixer_begin(&masterMixer, NULL, &frameCountOut, &frameCountIn);
// Mix each source here with ma_mixer_mix_*(), using frameCountIn as the frame count.
ma_mixer_end(&masterMixer, NULL, pFramesOut, 0); // <-- pFramesOut must be large enough to receive frameCountOut frames in mixer.format/mixer.channels format.
```
Applies volume, performs clipping, applies the effect (if any) and outputs the final mix to the specified output buffer or mixed with another mixer.
Parameters
----------
pMixer (in)
A pointer to the mixer.
pParentMixer (in, optional)
A pointer to the parent mixer. If this is non-NULL, the output of `pMixer` will be mixed with `pParentMixer`. It is an error for `pParentMixer` and
`pFramesOut` to both be non-NULL. If this is non-NULL, it must have also been specified as the parent mixer in the prior call to `ma_mixer_begin()`.
pFramesOut (in, optional)
A pointer to the buffer that will receive the final mixed output. The output buffer must be in the format specified by the mixer's configuration that was
used to initialize it. The required size in frames is defined by the output frame count returned by `ma_mixer_begin()`. It is an error for `pFramesOut`
and `pParentMixer` to both be non-NULL.
outputOffsetInFrames (in)
The offset in frames to start writing the output data to the destination buffer.
Return Value
------------
MA_SUCCESS if successful; any other error code otherwise.
Remarks
-------
It is an error for `pParentMixer` and `pFramesOut` to both be NULL or both be non-NULL. You must specify exactly one of them.
When outputting to a parent mixer (`pParentMixer` is non-NULL), the output is mixed with the parent mixer. Otherwise (`pFramesOut` is non-NULL), the output
will overwrite anything already in the output buffer.
When calculating the final output, the volume will be applied before clipping, which is done before applying the effect (if any).
See documentation for `ma_mixer_begin()` for an example on how to use `ma_mixer_end()`.
Mixes audio data from a buffer containing raw PCM data.
Parameters
----------
pMixer (in)
A pointer to the mixer.
pFramesIn (in)
A pointer to the buffer containing the raw PCM data to mix with the mixer. The data contained within this buffer is assumed to be of the same format as the
mixer, which was specified when the mixer was initialized. Use `ma_mixer_mix_pcm_frames_ex()` to mix data of a different format.
frameCountIn (in)
The number of frames to mix. This cannot exceed the number of input frames returned by `ma_mixer_begin()`. If it does, an error will be returned. If it is
less, silence will be mixed to make up the excess.
formatIn (in)
The sample format of the input data.
channelsIn (in)
The channel count of the input data.
Return Value
------------
MA_SUCCESS if successful; any other error code otherwise.
Remarks
-------
Each call to this function will start mixing from the start of the internal accumulation buffer.
This will automatically convert the data to the mixer's native format. The sample format will be converted without dithering. Channels will be converted based
A pointer to the data source to read input data from.
frameCountIn (in)
The number of frames to mix. This cannot exceed the number of input frames returned by `ma_mixer_begin()`. If it does, an error will be returned. If it is
less, silence will be mixed to make up the excess.
pFrameCountOut (out)
Receives the number of frames that were processed from the data source.
formatIn (in)
The sample format of the input data.
channelsIn (in)
The channel count of the input data.
Return Value
------------
MA_SUCCESS if successful; any other error code otherwise.
Remarks
-------
Each call to this function will start mixing from the start of the internal accumulation buffer.
This will automatically convert the data to the mixer's native format. The sample format will be converted without dithering. Channels will be converted based
MA_API ma_result ma_mixer_mix_rb(ma_mixer* pMixer, ma_rb* pRB, ma_uint64 offsetInFrames, ma_uint64 frameCountIn, ma_uint64* pFrameCountOut, float volume, ma_effect* pEffect, ma_format formatIn, ma_uint32 channelsIn); /* Caller is the consumer. */
MA_API ma_result ma_mixer_mix_pcm_rb(ma_mixer* pMixer, ma_pcm_rb* pRB, ma_uint64 offsetInFrames, ma_uint64 frameCountIn, ma_uint64* pFrameCountOut, float volume, ma_effect* pEffect); /* Caller is the consumer. */
result = ma_data_source_unmap(pDataSource, framesMapped);   /* Do this last because the result code is used below to determine whether or not we need to loop. */
result = MA_SUCCESS;    /* Make sure we don't return MA_AT_END which will happen if we coincidentally hit the end of the data source at the same time as we finish outputting. */
} else {
    break;  /* We've reached the end and we're not looping. */
We can now read some data from the callback. We should never read more input frames than will be consumed. If the format of the callback is the same as the effect's input
format we can save ourselves a copy and run on a slightly faster path.
*/
if (preEffectConversionRequired == MA_FALSE) {
/* Fast path. No need for conversion between the callback and the */
/* Ring buffer mixing can be implemented in terms of a memory mapped data source. */
ma_rb_data_source ds;
ma_rb_data_source_init(pRB, formatIn, channelsIn, &ds); /* Will never fail and does not require an uninit() implementation. */
return ma_mixer_mix_data_source(pMixer, &ds, offsetInFrames, frameCountIn, pFrameCountOut, volume, pEffect, MA_TRUE);   /* Ring buffers always loop, but the loop parameter will never actually be used because ma_rb_data_source__on_unmap() will never return MA_AT_END. */