The tutorials on the web don't say much about how to write an effect processor using ALSA. They cover a recording program and a playback program, but combining the two seems to be a kind of heresy. One tutorial even explains that writing a full duplex program is fairly involved and suggests forgetting about the whole thing and using jack instead. Gee, thanks. A "go away" message in a tutorial was exactly what I wanted to hear. Well, jack support is probably in the future for gnuitar, but right now ALSA will have to do.
So, I struggled with it a while and managed to make it work at least on 3 soundcards, all using ALSA. It brings no great confidence to report that each time I managed to make it work on one, it seemed to break on the others... I suspect there are still bugs I don't know about.
This is fairly subtle, as is everything else. First of all, do not use the raw hw:0,0 device, but use plughw:0,0. (Except use surround40:0,0 if you need 4-channel output.) This is so that you get automatic adaptation of hardware to your requirements. For instance, you can use sample formats not directly supported by hardware, or do mono output on systems that only support stereo output in hardware. (I have such a system, and I was not amused to discover that mono output was "impossible".)
So, firstly, open up devices and handle errors:
/* you allocate these in heap */
int err;
int restarting;
int nchannels = 1;
int buffer_size = 512;
int sample_rate = 48000;
int bits = 16;
char *snd_device_in = "plughw:0,0";
char *snd_device_out = "plughw:0,0";
snd_pcm_t *playback_handle;
snd_pcm_t *capture_handle;

if ((err = snd_pcm_open(&playback_handle, snd_device_out, SND_PCM_STREAM_PLAYBACK, 0)) < 0) {
    fprintf(stderr, "cannot open output audio device %s: %s\n", snd_device_out, snd_strerror(err));
    exit(1);
}
if ((err = snd_pcm_open(&capture_handle, snd_device_in, SND_PCM_STREAM_CAPTURE, 0)) < 0) {
    fprintf(stderr, "cannot open input audio device %s: %s\n", snd_device_in, snd_strerror(err));
    exit(1);
}
/* configure the opened handles, not the device name strings */
configure_alsa_audio(capture_handle, nchannels);
configure_alsa_audio(playback_handle, nchannels);
restarting = 1;
I wrote a small configuring function that applies an identical setup to both handles. I copy-pasted most of it from the tutorials at the ALSA project site, in the developer manuals.
int
configure_alsa_audio(snd_pcm_t *device, int channels)
{
    snd_pcm_hw_params_t *hw_params;
    int err;
    unsigned int tmp;            /* the _near calls take unsigned int pointers */
    snd_pcm_uframes_t frames;
    unsigned int fragments = 2;

    /* allocate memory for hardware parameter structure */
    if ((err = snd_pcm_hw_params_malloc(&hw_params)) < 0) {
        fprintf(stderr, "cannot allocate parameter structure (%s)\n",
                snd_strerror(err));
        return 1;
    }
    /* fill structure with the full configuration space of the device */
    if ((err = snd_pcm_hw_params_any(device, hw_params)) < 0) {
        fprintf(stderr, "cannot initialize parameter structure (%s)\n",
                snd_strerror(err));
        return 1;
    }
    /* set access type, sample rate, sample format, channels */
    if ((err = snd_pcm_hw_params_set_access(device, hw_params, SND_PCM_ACCESS_RW_INTERLEAVED)) < 0) {
        fprintf(stderr, "cannot set access type: %s\n",
                snd_strerror(err));
        return 1;
    }
    /* bits = 16 */
    if ((err = snd_pcm_hw_params_set_format(device, hw_params, SND_PCM_FORMAT_S16_LE)) < 0) {
        fprintf(stderr, "cannot set sample format: %s\n",
                snd_strerror(err));
        return 1;
    }
    tmp = sample_rate;
    if ((err = snd_pcm_hw_params_set_rate_near(device, hw_params, &tmp, 0)) < 0) {
        fprintf(stderr, "cannot set sample rate: %s\n",
                snd_strerror(err));
        return 1;
    }
    if (tmp != (unsigned int) sample_rate) {
        fprintf(stderr, "Could not set requested sample rate, asked for %d got %u\n", sample_rate, tmp);
        sample_rate = tmp;
    }
    if ((err = snd_pcm_hw_params_set_channels(device, hw_params, channels)) < 0) {
        fprintf(stderr, "cannot set channel count: %s\n",
                snd_strerror(err));
        return 1;
    }
Now, at this point of setup we are mostly done, but here comes the important bit that I discovered: many soundcards have weird buffer sizes. For instance, I have a Hoontech Digital XG-I card which refuses to use 2^n buffer sizes. The buffers had crazy sizes like 608 or 1346 bytes or something like that. What is even more interesting is that it doesn't support operating with just 2 fragments. (Fragments are the equal-sized chunks that the full audio buffer gets divided into, and ALSA calls them periods.) The code that reads will get fragment-sized chunks. The code that writes needs to write a fragment-sized chunk. Many devices support 2-fragment operation, but not this one. So, I have to use the _near variant of the function, which settles on the closest fragment count the soundcard hardware can do, and use that instead. So:
    if ((err = snd_pcm_hw_params_set_periods_near(device, hw_params, &fragments, 0)) < 0) {
        fprintf(stderr, "Error setting # fragments to %d: %s\n", fragments,
                snd_strerror(err));
        return 1;
    }
The buffer_size variable holds the size of a fragment in bytes. It needs to be converted to frames, a frame being one sample for every channel involved. If we are in 16-bit 2-channel operation, a frame is 4 bytes wide. So, I multiply the number of frames per fragment by the number of fragments supported by the hardware, and hope that the hardware can give me that buffer size. Or if it can't, I certainly hope it chooses a buffer size that yields equal-sized fragments... I don't really care, I'll just update my variable with whatever the sound card saw fit to give.
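The byte-to-frame arithmetic described here can be tried out in isolation. The following is only a sketch; the helper names (frame_bytes, fragment_frames, ring_frames, granted_fragment_bytes) are made up for illustration and do not appear in gnuitar.

```c
#include <stddef.h>

/* Bytes in one frame: one sample for every channel. */
static size_t frame_bytes(unsigned channels, unsigned bits)
{
    return (size_t) channels * (bits / 8);
}

/* Number of frames that fit in one fragment of fragment_bytes bytes. */
static size_t fragment_frames(size_t fragment_bytes, unsigned channels, unsigned bits)
{
    return fragment_bytes / frame_bytes(channels, bits);
}

/* Frames to request for the whole buffer: frames per fragment times fragments. */
static size_t ring_frames(size_t fragment_bytes, unsigned channels,
                          unsigned bits, unsigned fragments)
{
    return fragment_frames(fragment_bytes, channels, bits) * fragments;
}

/* Reverse direction: the fragment size in bytes implied by the buffer
 * size (in frames) that the hardware actually granted. */
static size_t granted_fragment_bytes(size_t granted_frames, unsigned channels,
                                     unsigned bits, unsigned fragments)
{
    return granted_frames * frame_bytes(channels, bits) / fragments;
}
```

With the defaults used earlier (512-byte fragments, mono, 16 bits, 2 fragments), ring_frames gives 512 frames. If a card instead settled on 3 fragments and granted 576 frames, granted_fragment_bytes would report 384 bytes per fragment.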
    int frame_size = channels * (bits / 8);
    frames = buffer_size / frame_size * fragments;
    if ((err = snd_pcm_hw_params_set_buffer_size_near(device, hw_params, &frames)) < 0) {
        /* snd_pcm_uframes_t is an unsigned long, so print it as one */
        fprintf(stderr, "Error setting buffer_size %lu frames: %s\n", (unsigned long) frames,
                snd_strerror(err));
        return 1;
    }
    /* fragment size in bytes that the hardware actually gave us */
    int got = (int) (frames * frame_size / fragments);
    if (buffer_size != got) {
        fprintf(stderr, "Could not set requested buffer size, asked for %d got %d\n", buffer_size, got);
        buffer_size = got;
    }
The final part: just set the parameters:
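    err = snd_pcm_hw_params(device, hw_params);
    /* the parameter structure is no longer needed once it has been applied */
    snd_pcm_hw_params_free(hw_params);
    if (err < 0) {
        fprintf(stderr, "Error setting HW params: %s\n",
                snd_strerror(err));
        return 1;
    }
    return 0;
}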
The code enters a while loop that looks like this:
int frames, inframes, outframes, frame_size;
int i;
while (! exit_program) {
    frame_size = nchannels * (bits / 8);
    frames = buffer_size / frame_size;
    if (restarting) {
        restarting = 0;
        /* drop any output we might have got and stop */
        snd_pcm_drop(capture_handle);
        snd_pcm_drop(playback_handle);
        /* prepare for use */
        snd_pcm_prepare(capture_handle);
        snd_pcm_prepare(playback_handle);
        /* fill the whole output buffer; fragments is the count negotiated
           in configure_alsa_audio(), assumed here to be visible to the loop */
        for (i = 0; i < fragments; i += 1)
            snd_pcm_writei(playback_handle, rdbuf, frames);
    }
    while ((inframes = snd_pcm_readi(capture_handle, rdbuf, frames)) < 0) {
        if (inframes == -EAGAIN)
            continue;
        // by the way, writing to terminal emulator is costly if you use
        // bad emulators like gnome-terminal, so don't do this.
        fprintf(stderr, "Input buffer overrun\n");
        restarting = 1;
        snd_pcm_prepare(capture_handle);
    }
    if (inframes != frames)
        fprintf(stderr, "Short read from capture device: %d, expecting %d\n", inframes, frames);
    /* now process the frames */
    do_something(rdbuf, inframes);
    while ((outframes = snd_pcm_writei(playback_handle, rdbuf, inframes)) < 0) {
        if (outframes == -EAGAIN)
            continue;
        fprintf(stderr, "Output buffer underrun\n");
        restarting = 1;
        snd_pcm_prepare(playback_handle);
    }
    if (outframes != inframes)
        fprintf(stderr, "Short write to playback device: %d, expecting %d\n", outframes, inframes);
}
There's nothing especially strange about this -- this is a read-process-write loop... apart from one thing: the restarting variable. Whenever a buffer overrun or underrun occurs, or when we are initializing, restarting is set to 1 and that kicks the sound device back into orderly shape. I prefer shock treatment to accomplish this: first I tell it to dump all audio data it was carrying, then I prepare it back for operation with the prepare() call, and now the important bit: I prefill 2 fragments worth of data with some gibberish that happened to already linger in the buffers. (Yes, I'll init the buffers with silence one day.)
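The decision the loop makes on each read or write result can be written down as a tiny pure function. This is a sketch only: the enum and the name classify_io are hypothetical, and it simply mirrors the policy of the loop (retry on -EAGAIN, restart on any other error), not anything taken from gnuitar.

```c
#include <errno.h>

enum io_action {
    IO_OK,      /* n >= 0: that many frames were transferred */
    IO_RETRY,   /* -EAGAIN: nothing ready yet, just call again */
    IO_RESTART  /* any other error: treat as over/underrun, set restarting */
};

/* Map the return value of a readi/writei style call to an action. */
static enum io_action classify_io(long n)
{
    if (n >= 0)
        return IO_OK;
    if (n == -EAGAIN)
        return IO_RETRY;
    return IO_RESTART;
}
```

ALSA reports an overrun or underrun as -EPIPE, which lands in the IO_RESTART branch here along with every other error. A more careful program might want to tell those cases apart.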
This last part is essential. When entering a read-process-write loop, the output buffers must be prefilled with data. A 1-fragment write would not do for my SB Live!; it had to be 2 fragments. (I have no idea whether the count should be scaled up to a greater number of fragments in case the sound hardware doesn't support 2-fragment operation. The soundcard that doesn't support it seems not to rattle with a 2-fragment write, though.)
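Prefilling with silence instead of leftover gibberish is cheap for the signed 16-bit format used here, because silence is simply all-zero samples. A minimal sketch, assuming interleaved S16 data; the function names are hypothetical and this is not what gnuitar does today:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Allocate one fragment of interleaved signed 16-bit samples.
 * calloc zero-fills, and zero is silence for signed PCM. */
static int16_t *alloc_silent_fragment(size_t frames, unsigned channels)
{
    return calloc(frames * channels, sizeof(int16_t));
}

/* Reset an existing fragment buffer to silence before a restart. */
static void silence_fragment(int16_t *buf, size_t frames, unsigned channels)
{
    memset(buf, 0, frames * channels * sizeof(int16_t));
}
```

This only holds for signed formats. Unsigned 8-bit audio, for example, uses 0x80 as its silence value, and alsa-lib offers snd_pcm_format_set_silence() to handle the general case.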
ALSA is a bit too difficult to use. Too many function calls, too low level. A few utility functions to open a device and set the most common parameters in one swoop would be great.
There seem to be no good manuals. No wonder audio software tends to be a bit crappy; it's extremely hard to puzzle this stuff out.
Why do I need to write 2 fragments before I start my read-process-write loop? Why doesn't the audio driver even seem to notice it's outputting garbage?
Why, with SB Live!, if I set the system to use 128-frame fragments and it tells me it can only do 192-frame fragments, does the driver still use 128-frame fragments for the read and write loops? Is it a bug? What's going on?
For updates to the code introduced in the blog entry, check out gnuitar's src/main.c. There's the full glory with locking and threading details that I chose to skip.