
Contributions:AudioExtension: Difference between revisions

From BCI2000 Wiki
===Authors===
Griffin Milsap (griffin.milsap@gmail.com)
Jordan Powell (jpow7@outlook.com)
===Version History===
* 2012/06/11: Initial public release;


===Source Code Revisions===
*Initial development: 4095
*Tested under: 8516
*Known to compile under: 8516
*Broken since: --


===Todo===
* Fix Known Issues
* Add per-sample resolution to envelopes


===Known Issues===
* When using DirectSound, suspending and resuming can cause the recorded file to drop samples. Suspending and resuming again until the audio clears up can fix this. Because the AudioExtension plays back what has been recorded, the problem is easy to detect when it happens: restart the trial, or use ASIO, for which no issues are known.
* When compiled in Debug mode, the audio clips and some data may be lost. This '''DOES NOT''' occur in Release mode.


==Functional Description==
Experiments which require audio input or real-time audio synthesis based on system state are now possible with the AudioExtension.  This extension is capable of recording multiple channels of audio input, synthesizing tones or noise, and reading encoded audio files.  These channels are input to a mixing matrix which mixes those inputs to multiple channels of audio output.  Both input and output are run through a simple filterbank, then they have their envelope extracted and logged into states via the bcievent interface.  Audio input and output channels can be recorded into audio files losslessly and can be resynchronized offline.  The mixing matrix is a matrix of expressions which can be used to dynamically change audio mixing based on the system state.


==Integration into BCI2000==
Compile the extension into your source module by enabling contributed extensions in your CMake configuration.  You can do this by going into your root build folder and deleting <code>CMakeCache.txt</code> and re-running the project batch file, or by running <code>cmake -i</code> and enabling '''BUILD_AUDIOEXTENSION'''.  Once the extension is built into the source module, enable it by starting the source module with the <code>--EnableAudioExtension=1</code> command line argument.
 
===Building with ASIO support===
ASIO is an audio driver interface that allows for recording from devices with up to four input channels. It can also provide lower latency than other audio drivers. To compile with ASIO support, visit https://www.steinberg.net/en/company/developers.html and download the ASIO SDK. Extract the downloaded SDK zip file to <code>src/extlib/portaudio</code> and rename the extracted folder to <code>asio</code>. Enable the AudioExtension in CMake and click "Configure". Make sure the "Advanced" option is checked in the CMake GUI and enable <code>PORTAUDIO_ENABLE_ASIO</code>. Click "Generate" and recompile BCI2000. ASIO will now appear as an option under the <code>AudioExtensionHostAPI</code> parameter when BCI2000 is run with the AudioExtension enabled.
 
==Block Diagram==
 
[[Image:AudioExtensionBlockDiagram.png]]
 
==Parameters==
The AudioExtension is configured in the Source tab within the AudioExtension section.  The configurable parameters are:
 
===EnableAudioExtension===
Enables/Disables the AudioExtension.
 
===AudioExtensionHostAPI===
This parameter is an audio host API selector.  The following values of this parameter are valid.  NOTE: Not all audio APIs are available on all platforms.
**[0] - auto
**[1] - DirectSound
**[2] - MME
**[3] - ASIO
**[4] - SoundManager
**[5] - CoreAudio
**[6] - Disabled
**[7] - OSS
**[8] - ALSA
**[9] - AL
**[10] - BeOs
**[11] - WDMKS
**[12] - JACK
**[13] - WASAPI
**[14] - AudioScienceHPI
**[15] - AudioIO
**[16] - PulseAudio
**[17] - Sndio
 
When set to 0 (auto), the audio extension chooses a preferred host API. On Windows, DirectSound is preferred over WASAPI: although WASAPI allows for lower latency, DirectSound is compatible with a wider range of input and output devices.
 
===AudioBufferSize===
The size of the audio buffer, in audio frames. When set to "auto", defaults to 2048 (roughly 46 ms at 44100 Hz).
Enlarge in steps of 1024 if you experience audio frame loss.
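
The relationship between buffer size and buffer duration can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the 44100 Hz sampling rate mentioned elsewhere on this page; the function names are hypothetical and not part of BCI2000.

```python
def buffer_duration_ms(frames, sample_rate_hz=44100):
    """Duration of an audio buffer holding `frames` frames, in milliseconds."""
    if frames <= 0 or sample_rate_hz <= 0:
        raise ValueError("frames and sample_rate_hz must be positive")
    return 1000.0 * frames / sample_rate_hz


def enlarged_sizes(start=2048, step=1024, count=3):
    """Candidate buffer sizes, starting at the default and growing by `step`."""
    return [start + step * i for i in range(count)]


# Print each candidate size with the duration it corresponds to.
for size in enlarged_sizes():
    print(size, "frames =", round(buffer_duration_ms(size), 1), "ms")
```

Each step of 1024 frames adds a little over 23 ms of buffering at this sampling rate, so larger buffers trade latency for robustness against frame loss.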
 
===AudioMixer===
 
The Audio Mixer is represented as an '''N x M''' Matrix, where '''N''' is the number of input channels on the selected input device, and
'''M''' is the number of output channels on the selected output device.
 
If the input device has 2 inputs and the output device has 2 outputs, the user must open the '''AudioMixer''' and set the matrix size to 2 x 2. To specify which input will be mapped to a specific output you place a <code>1</code> at the intersection of the row (input) and column (output).
 
''For the simplest configuration set the number of inputs and outputs and place a <code>1</code> in a diagonal line from the top left hand corner to the bottom right hand corner. ''
    row:1, column:1; row:2, column:2; row:3, column:3; ... , row:(N-1), column:(N-1); row:N, column:N;
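
As an illustration of what the diagonal configuration does, the sketch below builds such a matrix and applies it to one frame of audio. It assumes that every cell is a constant gain and that each output is the weighted sum of the inputs in its column; the real mixer cells are expressions evaluated against the system state, and the function names here are hypothetical.

```python
def diagonal_mixer(n_inputs, n_outputs):
    """N x M matrix with 1.0 where row == column and 0.0 elsewhere."""
    return [[1.0 if row == col else 0.0 for col in range(n_outputs)]
            for row in range(n_inputs)]


def mix_frame(inputs, mixer):
    """Mix one frame: output[col] = sum over rows of inputs[row] * mixer[row][col]."""
    if len(inputs) != len(mixer):
        raise ValueError("one input sample is needed per mixer row")
    n_outputs = len(mixer[0])
    return [sum(inputs[row] * mixer[row][col] for row in range(len(mixer)))
            for col in range(n_outputs)]


# With a 2 x 2 diagonal mixer, each input passes straight to its own output.
mixer = diagonal_mixer(2, 2)
print(mix_frame([0.25, -0.5], mixer))
```

Placing a <code>1</code> in more than one row of the same column would sum those inputs into a single output.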
 
 
By default, the matrix has numeric values for all labels. To specify a different label, double-click the label and type the desired input type.

Below is a list of valid input labels:

*<code>X</code> - This is automatically interpreted as INPUT[X], where X is the input channel on the device.
*<code>INPUT[X]</code> - This input will come from channel X on the sound capturing device.
*<code>FILE[X]</code> - This input will come from channel X in the specified ''AudioInputFile'' listed in the ''Source'' Tab of BCI2000 Config.
*<code>TONE[X]</code> - This input will be a synthesized sine wave with the frequency of X Hz.
*<code>NOISE[X]</code> - This input will be generated white noise at X Hz.  ''NOTE: NOISE[] is white noise at the audio sampling rate (which defaults to 44100)''
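
The label grammar above is small enough to express as a single pattern. The following sketch is only an illustration of that grammar, not the extension's own parser; the function name, the tolerance for surrounding whitespace, and the case-sensitive matching are assumptions.

```python
import re

# Either KIND[number] with one of the four documented kinds, or a bare channel number.
_LABEL = re.compile(
    r"^\s*(?:(INPUT|FILE|TONE|NOISE)\s*\[\s*([0-9]+(?:\.[0-9]+)?)\s*\]|([0-9]+))\s*$"
)


def parse_mixer_label(label):
    """Return (kind, value) for a row label; a bare number means INPUT[number]."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError("not a recognized mixer label: %r" % label)
    kind, bracketed, bare = match.groups()
    if bare is not None:
        return ("INPUT", float(bare))
    return (kind, float(bracketed))


for text in ["3", "INPUT[1]", "FILE[2]", "TONE[440]"]:
    print(text, "->", parse_mixer_label(text))
```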
 
===AudioInputDevice===
Requires a number, which corresponds to an input device ID. Each Audio Recording Device connected to the computer has an associated number. To select a specific device, enter the number into the corresponding box. To view a list of detected Audio Input Devices in BCI2000 click on <code>Set Config</code> and the devices will be listed below 'Audio Extension Enabled' in the operator log.
 
 
'''Format:'''
              Audio Input Device ID [ '''i''' ] : ''[Name of Audio Device] supports  '''N''' Input Channels''
 
 
Where '''i''' is a number that corresponds to the Name of the Audio Device. A value of -1 selects the default input device on this host API.
 
Where '''N''' is the number of input channels that can capture audio. This is also used as the number to set up the ''AudioMixer'' during configuration.
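
If the operator log has been copied to text, the device lines can be picked out with a regular expression. The sketch below assumes the log lines follow the format shown above literally; the device names in the example are made up, and the function names are hypothetical.

```python
import re

# Matches the documented format for both input and output device lines.
_DEVICE_LINE = re.compile(
    r"Audio (Input|Output) Device ID \[\s*(-?\d+)\s*\]\s*:\s*(.+?)\s+[Ss]upports\s+"
    r"(\d+) (?:Input|Output) Channels"
)


def parse_device_line(line):
    """Return a dict describing the device, or None if the line does not match."""
    match = _DEVICE_LINE.search(line)
    if match is None:
        return None
    direction, device_id, name, channels = match.groups()
    return {"direction": direction.lower(), "id": int(device_id),
            "name": name, "channels": int(channels)}


def devices_with_channels(lines, minimum, direction="input"):
    """IDs of devices of the given direction offering at least `minimum` channels."""
    parsed = (parse_device_line(line) for line in lines)
    return [d["id"] for d in parsed
            if d and d["direction"] == direction and d["channels"] >= minimum]


log = [
    "Audio Input Device ID [ 0 ] : Example Microphone supports  2 Input Channels",
    "Audio Input Device ID [ 2 ] : Example USB Interface supports  4 Input Channels",
]
print(devices_with_channels(log, minimum=4))
```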
 
===AudioOutputDevice===
Requires a number, which corresponds to an output device ID. Each Audio Playback Device connected to the computer has an associated number. To select a specific device, enter the number into the corresponding box. To view a list of detected Audio Output Devices in BCI2000 click on <code>Set Config</code> and the devices will be listed below 'Audio Extension Enabled' in the operator log.


'''Format:'''
              Audio Output Device ID [ '''i''' ] : ''[Name of Audio Device] Supports N Output Channels''


Where '''i''' is a number that corresponds to the Name of the Audio Device. A value of -1 selects the default output device on this host API.


Where '''N''' is the number of output channels available for audio playback. This is also used as the number to set up the ''AudioMixer'' during configuration.

===AudioInputFile===
Audio file to use as audio input to AudioMixer.  The selected file can have any non-zero number of channels and be encoded in almost any format (except MP3), but MUST be encoded at 44100 Hz.
===AudioRecordInput===
Enables/Disables recording of audio input data to a file in the DataDirectory.
===AudioRecordOutput===
Enables/Disables recording of audio output data to a file in the DataDirectory.
===AudioRecordingFormat===
Changes the file format and encoding options of the recorded output files.  This parameter has the following three options:
*Raw - Records to 16 bit Microsoft formatted WAV files with no compression.  These files can be opened directly in MATLAB.
*Lossless - Records to FLAC formatted files.  These files are slightly smaller than Raw files, but have no quality loss.
*Lossy - Records to Ogg Vorbis files.  These files are similar to MP3 but do not have the associated licensing issues.  They are compressed using a lossy algorithm, so the resulting files are very small but sound slightly worse than lossless encoding.  This format is good for long recordings where perfect quality is not necessary.


Output files are located in the current BCI2000 output directory, and bear the <tt>.dat</tt> file's name, with its extension replaced with <tt>_in.wav</tt> or <tt>_out.wav</tt>, respectively (<tt>.flac</tt> and <tt>.ogg</tt> for the remaining two file format options).
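
For offline analysis scripts, the naming rule above can be applied mechanically to locate the recordings that belong to a given data file. This is a minimal sketch of that rule; the function name and the example file name are hypothetical.

```python
from pathlib import Path

# File extension used by each AudioRecordingFormat option.
_EXTENSIONS = {"Raw": ".wav", "Lossless": ".flac", "Lossy": ".ogg"}


def recording_paths(dat_path, recording_format="Raw"):
    """Return (input_recording, output_recording) paths for a .dat file."""
    dat = Path(dat_path)
    extension = _EXTENSIONS[recording_format]
    return (dat.with_name(dat.stem + "_in" + extension),
            dat.with_name(dat.stem + "_out" + extension))


in_path, out_path = recording_paths("data/ExampleS001R01.dat", "Lossless")
print(in_path.name, out_path.name)
```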
===AudioInputFilterbank, AudioOutputFilterbank===
A filterbank which filters audio input and output before rectification/smoothing for envelope extraction.  These Butterworth filters are not applied to the audible signal.  The format of the filter bank is as follows:
*Type - The characteristic of the filter.  The following values are valid.
**Lowpass - Creates a low pass filter
**Highpass - Creates a high pass filter
**Bandpass - Creates a band pass filter [[Contributions:AudioExtension#Known_Issues|*See Known Issues*]]
**Bandstop - Creates a band stop, or notch filter
*Order - The order of the filter model.  Higher order filters have a steeper roll-off but are more expensive computationally.
*Cutoff1 - The cutoff frequency for Lowpass and Highpass filters, and the cut-on frequency for Bandpass and Bandstop filters.
*Cutoff2 - The cut-off frequency for Bandpass and Bandstop filters.
The matrix can have as many rows as necessary to filter the signal.  Filters can be applied in any order and their transfer functions are multiplied before filtering occurs.
===AudioEnvelopeSmoothing===
The cutoff frequency for the low pass filter which is applied to the filtered and full-wave rectified audio data.  This should be set to the highest frequency you want to see in the resulting audio envelope.
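
The rectify-then-smooth idea can be sketched in a few lines. Note that this example substitutes a simple one-pole low pass for whatever smoothing filter the extension actually uses, and it skips the filterbank stage entirely; it only illustrates how the smoothing cutoff shapes the envelope. The function name is hypothetical.

```python
import math


def envelope(samples, cutoff_hz, sample_rate_hz=44100.0):
    """Full-wave rectify `samples`, then smooth with a one-pole low pass."""
    # Smoothing coefficient of a one-pole low pass with the given cutoff.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    level = 0.0
    result = []
    for sample in samples:
        rectified = abs(sample)              # full-wave rectification
        level += alpha * (rectified - level)  # move toward the rectified value
        result.append(level)
    return result


# One second of a 1 kHz tone with amplitude 0.5, smoothed with a 20 Hz cutoff.
tone = [0.5 * math.sin(2.0 * math.pi * 1000.0 * n / 44100.0) for n in range(44100)]
smoothed = envelope(tone, cutoff_hz=20.0)
print(round(smoothed[-1], 2))
```

A lower cutoff gives a smoother but slower envelope; a higher cutoff follows fast amplitude changes at the cost of more ripple from the carrier.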


===ComputeEnvelopes===
When set to 0 (false), the relatively CPU intensive envelope computation for input and output signal is omitted. This may be useful if you experience audio sample loss.


==State Variables==
The AudioExtension outputs the following state variables:
 
===Audio[In/Out]Envelope[0-3]===
These are the envelope values of each channel (up to channel 4) of the audio inputs and outputs (in the AudioMixer matrix).  These 16 bit unsigned values correspond to the resulting envelope after the audio envelope extraction.  For architectural reasons, it is not possible to publish states after system startup, so you are limited to four channels of input and output.  The AudioExtension can be easily modified to change the number of channels by editing the <code>#define NUM_INPUT_ENVELOPES 4</code> and <code>#define NUM_OUTPUT_ENVELOPES</code> lines in AudioExtension.cpp, and recompiling your source module.
 
===AudioFrame===
This 32 bit unsigned number corresponds to the current frame of audio data in the recorded output files.  This can be used to resynchronize the lossless audio to the resulting .dat file offline.  Audio is sampled internally at 44100 Hz, so this number will roll over once every 27 hours or so.
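
The rollover figure follows directly from the counter width and the sampling rate, and the same arithmetic is useful when resynchronizing offline. The sketch below assumes the state values have already been read from the .dat file into a list of integers; how they are read is outside its scope, and the function names are hypothetical.

```python
SAMPLE_RATE_HZ = 44100
COUNTER_MODULUS = 2 ** 32  # a 32 bit unsigned counter wraps after this many frames


def rollover_hours(sample_rate_hz=SAMPLE_RATE_HZ):
    """Hours of audio after which the 32 bit frame counter wraps around."""
    return COUNTER_MODULUS / sample_rate_hz / 3600.0


def unwrap_frames(frames):
    """Undo counter wrap-around so that frame numbers never decrease."""
    unwrapped = []
    offset = 0
    previous = None
    for value in frames:
        if previous is not None and value < previous:
            offset += COUNTER_MODULUS  # the counter wrapped since the last value
        unwrapped.append(value + offset)
        previous = value
    return unwrapped


def frame_to_seconds(frame, sample_rate_hz=SAMPLE_RATE_HZ):
    """Position of a frame in the recorded audio file, in seconds."""
    return frame / sample_rate_hz


print(round(rollover_hours(), 2), "hours until rollover")
```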
==Configuring AudioExtension for 4-Channel Recording==
===Overview===
The following instructions describe how to configure the '''AudioExtension''' in BCI2000 to record from '''four separate audio input channels'''. This configuration requires building the extension with '''ASIO''' support for multi-channel low-latency audio.
 
===Building with ASIO Support===
 
Download the '''ASIO SDK''' from [https://www.steinberg.net/en/company/developers.html Steinberg’s developer site].
Extract the downloaded archive into:
 
<code>src/extlib/portaudio/asio</code>
 
Run CMake and ensure the following options are set:
 
<code>BUILD_AUDIOEXTENSION</code> = ON
 
<code>PORTAUDIO_ENABLE_ASIO</code> = ON (requires "Advanced" view)
 
Click '''Generate''' and recompile BCI2000.
After compilation, ASIO will appear as an available audio host API when running the source module with the AudioExtension enabled.
 
===Enabling the Extension===
Start your source module with the following command-line argument:
 
<pre> --EnableAudioExtension=1 </pre>
 
===Parameter Configuration===
In the '''Source → AudioExtension''' section of the BCI2000 Configuration dialog, set the following parameters:
 
===EnableAudioExtension===
<code>EnableAudioExtension = 1</code>
 
===AudioExtensionHostAPI===
Select the audio host API. For 4-channel recording, choose ASIO:
 
<pre> AudioExtensionHostAPI = 3 </pre>
 
===AudioInputDevice===
Each connected audio device is listed in the Operator Log after clicking '''Set Config'''.
Locate the entry similar to:
 
<pre> Audio Input Device ID [ i ] : [Device Name] supports N Input Channels </pre>
 
Choose a device with '''N ≥ 4''' and set:
 
<pre> AudioInputDevice = i </pre>
 
Use -1 to select the default device.
 
===AudioOutputDevice===
Select an output device (optional, for monitoring).
If unused, you may set:
 
<pre> AudioOutputDevice = -1 </pre>
 
===AudioBufferSize===
Defines the audio buffer length in frames:
 
<pre> AudioBufferSize = auto </pre>
 
If you experience frame loss, increase this in steps of 1024 (e.g., 3072, 4096).
 
===AudioRecordInput===
Enable input recording:
 
<pre> AudioRecordInput = 1 </pre>
 
===AudioRecordingFormat===
Select the desired file format:
 
<code>Raw</code> — 16-bit WAV, uncompressed.
 
<code>Lossless</code> — FLAC, smaller size, no quality loss.
 
<code>Lossy</code> — compressed.
 
Example:
 
<pre> AudioRecordingFormat = Lossless </pre>
 
Recorded files are saved to the data directory as:
 
'''_in.wav / .flac / .ogg''' — input recording
 
'''_out.wav / .flac / .ogg''' — output recording
 
===ComputeEnvelopes===
If envelope states are not required or CPU load is high:
 
<pre> ComputeEnvelopes = 0 </pre>
 
Otherwise, leave set to <code>1</code> (default) to compute <code>AudioInEnvelope[0–3]</code>.
 
===AudioMixer===
The AudioMixer defines input–output routing as an '''N × M''' matrix, where:
 
'''N''' = number of input channels (4)
 
'''M''' = number of output channels
 
For a 4-channel input and stereo output:
 
<pre> Matrix size: 4 × 2 </pre>
 
For simple monitoring, label each row with its input and place a <code>1</code> in the column of the output it should be routed to:

<pre>
INPUT[1] → OUTPUT[1]
INPUT[2] → OUTPUT[2]
INPUT[3] → OUTPUT[1]
INPUT[4] → OUTPUT[2]
</pre>
 
If you do not need to hear playback, set all matrix values to 0.
 
===State Variables===
 
<code>AudioInEnvelope0..3</code> — Input channel envelopes (0–3)
 
<code>AudioOutEnvelope0..3</code> — Output channel envelopes (0–3)
 
<code>AudioFrame</code> — Current frame index (44.1 kHz)
 
===Notes===


Use '''WASAPI''' for devices supporting ≥4 inputs.

'''Debug builds''' may drop audio; use '''Release''' builds for deployment.

Increasing <code>AudioBufferSize</code> can reduce dropouts.

====Troubleshooting====
If only two input channels are available, ensure the device driver and host API are WASAPI.

For stability issues, increase buffer size or disable envelope computation.



==See also==

Latest revision as of 15:27, 28 October 2025

Synopsis

An environment extension which manages multichannel, low latency audio I/O.

Location

http://www.bci2000.org/svn/trunk/src/contrib/Extensions/AudioExtension

Versioning

Authors

Griffin Milsap (griffin.milsap@gmail.com)

Jordan Powell (jpow7@outlook.com)

Version History

  • 2012/06/11: Initial public release;

Source Code Revisions

  • Initial development: 4095
  • Tested under: 8516
  • Known to compile under: 8516
  • Broken since: --

Todo

  • Fix Known Issues
  • Add per-sample resolution to envelopes

Known Issues

  • Using DirectSound when suspending and resuming states can cause an issue where the file recorded drops samples, this can be fixed by suspending and resuming until the audio clears up. Luckily, AudioExtension plays back what has been recorded so its easy to detect when this issue happens, just restart the trial to fix or use ASIO where no known issues exist.
  • When compiling in Debug mode the audio clips and some data may be lost, this DOES NOT occur in release mode.

Functional Description

Experiments which require audio input or real-time audio synthesis based on system state are now possible with the AudioExtension. This extension is capable of recording multiple channels of audio input, synthesizing tones or noise, and reading encoded audio files. These channels are input to a mixing matrix which mixes those inputs to multiple channels of audio output. Both input and output are run through a simple filterbank, then they have their envelope extracted and logged into states via the bcievent interface. Audio input and output channels can be recorded into audio files losslessly and can be resynchronized offline. The mixing matrix is a matrix of expressions which can be used to dynamically change audio mixing based on the system state.

Integration into BCI2000

Compile the extension into your source module by enabling contributed extensions in your CMake configuration. You can do this by going into your root build folder and deleting CMakeCache.txt and re-running the project batch file, or by running cmake -i and enabling BUILD_AUDIOEXTENSION. Once the extension is built into the source module, enable it by starting the source module with the --EnableAudioExtension=1 command line argument.

Building with ASIO support

ASIO is a driver that allows for recording from devices with up to four input channels. It also can provide lower latency than other audio drivers. To compile with ASIO support, visit https://www.steinberg.net/en/company/developers.html and download the ASIO SDK. Extract the downloaded SDK zip file to src/extlib/portaudio and rename it asio. Enable the AudioExtension in CMake and click "Configure". Make sure the "Advanced" option is checked in the CMake GUI and enable PORTAUDIO_ENABLE_ASIO. Click "Generate" and recompile BCI2000. ASIO will now appear as an option under the EnableAudioExtension parameter when BCI2000 is run with the AudioExtension enabled.

Block Diagram

Parameters

The AudioExtension is configured in the Source tab within the AudioExtension section. The configurable parameters are:

EnableAudioExtension

Enables/Disables the AudioExtension.

AudioExtensionHostAPI

This parameter is an audio host API selector. The following values of this parameter are valid. NOTE: Not all audio APIs are available on all platforms.

    • [0] - auto
    • [1] - DirectSound
    • [2] - MME
    • [3] - ASIO
    • [4] - SoundManager
    • [5] - CoreAudio
    • [6] - Disabled
    • [7] - OSS
    • [8] - ALSA
    • [9] - AL
    • [10] - BeOs
    • [11] - WDMKS
    • [12] - JACK
    • [13] - WASAPI
    • [14] - AudioScienceHPI
    • [15] - AudioIO
    • [16] - PulseAudio
    • [17] - Sndio

When set to 0 (auto), the audio extension will choose a preferred host API. On Windows, DirectSound is chosen over WASAPI although WASAPI allows for lower latency because DirectSound is more compatible across multiple devices for input and output.

AudioBufferSize

The size of the audio buffer, in audio frames. When set to "auto", defaults to 2048 (50ms). Enlarge in steps of 1024 if you experience audio frame loss.

AudioMixer

The Audio Mixer is represented as an N x M Matrix, where N is the number of input channels on the selected input device, and M is the number of output channels on the selected output device.

If the input device has 2 inputs and the output device has 2 outputs, the user must open the AudioMixer and set the matrix size to 2 x 2. To specify which input will be mapped to a specific output you place a 1 at the intersection of the row (input) and column (output).

For the simplest configuration set the number of inputs and outputs and place a 1 in a diagonal line from the top left hand corner to the bottom right hand corner.

   row:1, column:1; row:2, column:2; row:3, column:3; ... , row:(N-1), column:(N-1); row:N, column:N;


By Default the Matrix will have numeric values for all the labels. To specify a different label, double click on the label and type the specified input type.

Below are a list of valid input labels:

  • X - This is automatically interpreted as INPUT[X], where x is the input channel on the device.
  • INPUT[X] - This input will come from channel X on the sound capturing device.
  • FILE[X] - This input will come from channel X in the specified AudioInputFile listed in the Source Tab of BCI2000 Config.
  • TONE[X] - This input will be a synthesized sine wave with the frequency of X Hz.
  • NOISE[X] - This input will be generated white noise at X Hz. NOTE: NOISE[] is white noise at the audio sampling rate (which defaults to 44100 Hz).
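
To make the label forms concrete, here is a minimal Python sketch that classifies a row label. It is not the extension's own parser: the function name, the upper-case-only matching, and the error handling are assumptions made for illustration.

```python
import re

# Assumed label grammar: a bare channel number, or KIND[value] in upper case.
_LABEL = re.compile(r"^(INPUT|FILE|TONE|NOISE)\[([^\]]+)\]$")

def parse_mixer_label(label):
    """Return (kind, value) for an AudioMixer row label."""
    text = label.strip()
    if text.isdigit():
        # A bare number X is interpreted as INPUT[X].
        return ("INPUT", int(text))
    match = _LABEL.match(text)
    if match is None:
        raise ValueError("unrecognized mixer label: " + repr(label))
    kind, value = match.groups()
    if kind in ("INPUT", "FILE"):
        return (kind, int(value))   # channel index
    return (kind, float(value))     # TONE/NOISE argument in Hz

for label in ("2", "INPUT[1]", "FILE[2]", "TONE[440]"):
    print(label, "->", parse_mixer_label(label))
```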

AudioInputDevice

Requires a number, which corresponds to an input device ID. Each Audio Recording Device connected to the computer has an associated number. To select a specific device, enter the number into the corresponding box. To view a list of detected Audio Input Devices in BCI2000, click on Set Config; the devices will be listed below 'Audio Extension Enabled' in the operator log.


Format:

             Audio Input Device ID [ i ] : [Name of Audio Device] supports  N Input Channels


Where i is a number that corresponds to the Name of the Audio Device. A value of -1 selects the default input device on this host API.

Where N is the number of input channels that can capture audio. This is also used as the number to set up the AudioMixer during configuration.

AudioOutputDevice

Requires a number, which corresponds to an output device ID. Each Audio Playback Device connected to the computer has an associated number. To select a specific device, enter the number into the corresponding box. To view a list of detected Audio Output Devices in BCI2000, click on Set Config; the devices will be listed below 'Audio Extension Enabled' in the operator log.


Format:

             Audio Output Device ID [ i ] : [Name of Audio Device] Supports N Output Channels


Where i is a number that corresponds to the Name of the Audio Device. A value of -1 selects the default output device on this host API.

Where N is the number of output channels available for audio playback. This is also used as the number to set up the AudioMixer during configuration.

AudioInputFile

Audio file to use as audio input to AudioMixer. The selected file can have any non-zero number of channels and be encoded in almost any format (except MP3), but MUST be encoded at 44100 Hz.

AudioRecordInput

Enables/Disables recording of audio input data to a file in the DataDirectory.

AudioRecordOutput

Enables/Disables recording of audio output data to a file in the DataDirectory.

AudioRecordingFormat

Changes the file format and encoding options of the recorded output files. This parameter has the following three options:

  • Raw - Records to 16-bit Microsoft-formatted WAV files with no compression. These files open directly in MATLAB.
  • Lossless - Records to FLAC formatted files. These files are slightly smaller than Raw files, but have no quality loss.
  • Lossy - Records to Ogg Vorbis files. These files are similar to MP3 but do not have the associated licensing issues. They are compressed using a lossy algorithm, so the resulting files are very small but sound slightly worse than lossless encoding. This format is good for long recordings where perfect quality is not necessary.

Output files are located in the current BCI2000 output directory, and bear the .dat file's name, with its extension replaced with _in.wav or _out.wav, respectively (.flac and .ogg for the remaining two file format options).
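
The naming rule can be sketched in Python as below. This is an illustration of the rule as stated, not code from the extension; the function name and the example file name data/run01.dat are hypothetical.

```python
from pathlib import PurePosixPath

# File extension for each AudioRecordingFormat option, as described above.
EXTENSIONS = {"Raw": ".wav", "Lossless": ".flac", "Lossy": ".ogg"}

def recording_paths(dat_file, recording_format="Raw"):
    """Return the (input, output) recording paths for a given .dat file."""
    dat_path = PurePosixPath(dat_file)
    ext = EXTENSIONS[recording_format]
    # Replace the ".dat" extension with "_in<ext>" and "_out<ext>".
    return (dat_path.with_name(dat_path.stem + "_in" + ext),
            dat_path.with_name(dat_path.stem + "_out" + ext))

in_path, out_path = recording_paths("data/run01.dat", "Lossless")
print(in_path)
print(out_path)
```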

AudioInputFilterbank, AudioOutputFilterbank

A filterbank which filters audio input and output before rectification/smoothing for envelope extraction. These Butterworth filters will not be applied to the audible signal. The format of the filter bank is as follows:

  • Type - The characteristic of the filter. The following values are valid.
    • Lowpass - Creates a low pass filter
    • Highpass - Creates a high pass filter
    • Bandpass - Creates a band pass filter *See Known Issues*
    • Bandstop - Creates a band stop, or notch filter
  • Order - The order of the filter model. Higher order filters are more accurate but more expensive computationally.
  • Cutoff1 - The cutoff frequency for Lowpass and Highpass filters, and the cut-on frequency for Bandpass and Bandstop filters.
  • Cutoff2 - The cut-off frequency for Bandpass and Bandstop filters.

The matrix can have as many rows as necessary to filter the signal. Filters can be applied in any order and their transfer functions are multiplied before filtering occurs.

AudioEnvelopeSmoothing

The cutoff frequency for the low pass filter which is applied to the filtered and full-wave rectified audio data. This should be set to the highest frequency you want to see in the resulting audio envelope.
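
The rectify-then-smooth step can be illustrated with a short Python sketch. It omits the filterbank stage and uses a simple one-pole low-pass filter as a stand-in, since the filter design the extension uses for smoothing is not stated here; the function name and the 20 Hz cutoff are assumptions.

```python
import math

def envelope(samples, cutoff_hz, sample_rate_hz=44100):
    """Full-wave rectify the samples, then smooth with a one-pole low-pass."""
    # One-pole coefficient (a stand-in; the extension's own filter may differ).
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    smoothed = 0.0
    result = []
    for sample in samples:
        rectified = abs(sample)                      # full-wave rectification
        smoothed += alpha * (rectified - smoothed)   # low-pass smoothing
        result.append(smoothed)
    return result

# One second of a 440 Hz tone with constant amplitude 1.0.
tone = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(44100)]
env = envelope(tone, cutoff_hz=20)
print(round(env[-1], 2))
```

For a constant-amplitude sine, the smoothed value settles close to the mean of the rectified signal, 2/π of the peak amplitude. A lower cutoff gives a smoother but slower-reacting envelope.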

ComputeEnvelopes

When set to 0 (false), the relatively CPU intensive envelope computation for input and output signal is omitted. This may be useful if you experience audio sample loss.

State Variables

The AudioExtension outputs the following state variables:

Audio[In/Out]Envelope[0-3]

These are the envelope values of each channel (up to channel 4) of the audio inputs and outputs (in the AudioMixer matrix). These 16 bit unsigned values correspond to the resulting envelope after the audio envelope extraction. For architectural reasons, it is not possible to publish states after system startup, so you are limited to four channels of input and output. The AudioExtension can be easily modified to change the number of channels by editing the #define NUM_INPUT_ENVELOPES 4 and #define NUM_OUTPUT_ENVELOPES lines in AudioExtension.cpp, and recompiling your source module.

AudioFrame

This 32 bit unsigned number corresponds to the current frame of audio data in the recorded output files. This can be used to resynchronize the lossless audio to the resulting .dat file offline. Audio is sampled internally at 44100 Hz, so this number will roll over once every 27 hours or so.
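
The rollover figure follows from the counter width and the sampling rate, as this short Python calculation shows. The helper names are illustrative only.

```python
SAMPLE_RATE_HZ = 44100
FRAME_COUNTER_BITS = 32

def rollover_hours(bits=FRAME_COUNTER_BITS, sample_rate_hz=SAMPLE_RATE_HZ):
    """Hours until an unsigned frame counter of the given width wraps around."""
    return (2 ** bits) / sample_rate_hz / 3600.0

def frame_to_seconds(frame, sample_rate_hz=SAMPLE_RATE_HZ):
    """Convert an AudioFrame value to a time offset in the recorded audio."""
    return frame / sample_rate_hz

print(round(rollover_hours(), 2))   # about 27 hours
print(frame_to_seconds(44100))      # one second of audio
```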

Configuring AudioExtension for 4-Channel Recording

Overview

The following instructions describe how to configure the AudioExtension in BCI2000 to record from four separate audio input channels. This configuration requires building the extension with ASIO support for multi-channel low-latency audio.

Building with ASIO Support

Download the ASIO SDK from Steinberg’s developer site. Extract the downloaded archive into:

src/extlib/portaudio/asio

Run CMake and ensure the following options are set:

BUILD_AUDIOEXTENSION = ON

PORTAUDIO_ENABLE_ASIO = ON (requires "Advanced" view)

Click Generate and recompile BCI2000. After compilation, ASIO will appear as an available audio host API when running the source module with the AudioExtension enabled.

Enabling the Extension

Start your source module with the following command-line argument:

 --EnableAudioExtension=1 

Parameter Configuration

In the Source → AudioExtension section of the BCI2000 Configuration dialog, set the following parameters:

EnableAudioExtension

EnableAudioExtension = 1

AudioExtensionHostAPI

Select the audio host API. For 4-channel recording, choose ASIO:

 AudioExtensionHostAPI = 3 

AudioInputDevice

Each connected audio device is listed in the Operator Log after clicking Set Config. Locate the entry similar to:

 Audio Input Device ID [ i ] : [Device Name] supports N Input Channels 

Choose a device with N ≥ 4 and set:

 AudioInputDevice = i 

Use -1 to select the default device.

AudioOutputDevice

Select an output device (optional, for monitoring). If unused, you may set:

 AudioOutputDevice = -1 

AudioBufferSize

Defines the audio buffer length in frames:

 AudioBufferSize = auto 

If you experience frame loss, increase this in steps of 1024 (e.g., 3072, 4096).

AudioRecordInput

Enable input recording:

 AudioRecordInput = 1 

AudioRecordingFormat

Select the desired file format:

  • Raw - 16-bit WAV, uncompressed.
  • Lossless - FLAC, smaller size, no quality loss.
  • Lossy - Ogg Vorbis, lossy compression, smallest files.

Example:

 AudioRecordingFormat = Lossless 

Recorded files are saved to the data directory as:

_in.wav / .flac / .ogg — input recording

_out.wav / .flac / .ogg — output recording

ComputeEnvelopes

If envelope states are not required or CPU load is high:

 ComputeEnvelopes = 0 

Otherwise, leave set to 1 (default) to compute AudioInEnvelope[0–3].

AudioMixer

The AudioMixer defines input–output routing as an N × M matrix, where:

N = number of input channels (4)

M = number of output channels

For a 4-channel input and stereo output:

 Matrix size: 4 × 2 

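For simple monitoring, label each input and place a 1 at each of the following input/output intersections, so that inputs 1 and 3 are heard on output 1 and inputs 2 and 4 on output 2:

 INPUT[1] → OUTPUT[1]
 INPUT[2] → OUTPUT[2]
 INPUT[3] → OUTPUT[1]
 INPUT[4] → OUTPUT[2]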

If you do not need to hear playback, set all matrix values to 0.
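
The 4 × 2 routing can be written out as a matrix and applied to one frame of input samples. This Python sketch assumes that each output is the sum of the inputs weighted by the matrix entries; that mixing rule and the function name are assumptions, not confirmed behaviour of the extension.

```python
# Rows are inputs, columns are outputs.
ROUTING = [
    [1, 0],  # INPUT[1] -> OUTPUT[1]
    [0, 1],  # INPUT[2] -> OUTPUT[2]
    [1, 0],  # INPUT[3] -> OUTPUT[1]
    [0, 1],  # INPUT[4] -> OUTPUT[2]
]

def mix_frame(inputs, matrix):
    """Mix one frame: each output is the weighted sum of all inputs."""
    n_outputs = len(matrix[0])
    return [sum(inputs[n] * matrix[n][m] for n in range(len(inputs)))
            for m in range(n_outputs)]

# Inputs 1 and 3 add up on output 1; inputs 2 and 4 on output 2.
print(mix_frame([1, 2, 3, 4], ROUTING))
```

With every matrix value set to 0, each output in this sketch is 0, which corresponds to the silent-playback case.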

State Variables

AudioInEnvelope0..3 — Input channel envelopes (0–3)

AudioOutEnvelope0..3 — Output channel envelopes (0–3)

AudioFrame — Current frame index (44.1 kHz)

Notes

Use ASIO for devices supporting ≥4 inputs, as configured above.

Debug builds may drop audio; use Release builds for deployment.

Increasing AudioBufferSize can reduce dropouts.


Troubleshooting

If only two input channels are available, ensure that the ASIO driver for the device is installed and that ASIO is the selected host API.

For stability issues, increase buffer size or disable envelope computation.


See also

User Reference:Logging Input, Contributions:Extensions