Illegitimi non carborundum


Sound on Linux is Confusing: Defuzzing Part 1: ALSA

So I often hear the phrase: "Sound on Linux is Confusing". While I don't totally disagree with this statement, as with everything on Linux the sound system is pretty logical and if you follow through the steps you can demystify things pretty quickly. So this article will explain how things work on Mandriva and should ensure users are more comfortable with "how things work".

First of all we should probably explain a little bit about how this works. ALSA is the Advanced Linux Sound Architecture. It is a replacement for OSS (Open Sound System), which had several problems on Linux and which ALSA has ultimately supplanted as the only sound system in the kernel. Despite a relaxation of license terms and further development on OSSv4, OSS is unlikely to replace ALSA in the mainline kernel.

ALSA also has a userspace component, libasound, that acts as the primary interface to the driver layer. Most complaints about the complexity of the ALSA API actually relate to this userspace component, not the kernel layer which, as you'd expect, is much more rigorously controlled. The kernel layer does not suffer from the same need to remain backwards compatible with various userspace applications: the ALSA kernel drivers only need to work with libasound, which can obviously be developed in parallel. This leads to a cleaner design than the userspace layer, which does have to remain backwards compatible.

In addition to interfacing with the kernel layer and interacting with physically installed sound hardware, ALSA also has a plugin architecture. This plugin system allows devices to be faked/emulated in various interesting ways. For example, it allows you to create a null device that sends all audio to /dev/null, it allows Bluetooth headsets to be used (note that this is considered a legacy approach these days), and it allows all audio to be routed through PulseAudio, which I'll discuss further in a subsequent article.
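
As a rough illustration of the plugin system, the fragment below sketches how a discard-everything device might be declared in a per-user ~/.asoundrc using ALSA's null plugin. The device name "discard" is made up for this example and is not something Mandriva ships.

```
# Hypothetical device name "discard"; "type null" selects ALSA's null plugin,
# which throws away everything written to it.
pcm.discard {
    type null
}
```

An application can then be pointed at it by name, for example with aplay -D discard somefile.wav.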

So, let's talk about ALSA configuration files. Now most of the ALSA config files live in:


Arguably these are not really "config" files in the classic sense, i.e. you are not meant to change them as a user according to your own preferences and whims - they are really more like source files and define the structure of various ALSA plugins and multichannel configurations. Unless you are developing/hacking on ALSA, you probably shouldn't tweak the vast majority of the files in this folder.

The main file


defines a list of additional files to parse and the order in which to parse them. In order to incorporate PulseAudio in an elegant and configurable way, we make a couple changes to the list of additional files:

--- alsa-lib-1.0.15rc3.lennart/src/conf/alsa.conf	2007-10-17 18:28:03.000000000 -0400
+++ alsa-lib-1.0.15rc3/src/conf/alsa.conf	2007-10-17 18:33:10.000000000 -0400
@@ -8,6 +8,8 @@
 		func load
 		files [
+			"/usr/share/alsa/pcm/pulseaudio.conf"
+			"/etc/alsa/pulse-default.conf"

The first file allows us to define a named "device" for ALSA called "pulse". This is always present, even if the user ultimately decides not to use PulseAudio by default on their machine. This would allow them to e.g. define a remote PulseAudio server (via environment variables or client.conf setup - see the next article in this series, linked below) and tell ALSA apps to use this "device" specifically. Arguably this is a corner case, but there is no reason not to support it all the same 🙂
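
To give a feel for what such a definition looks like, here is a minimal sketch of a "pulse" device declared through the PulseAudio plugin for ALSA. This is an assumption about the general shape of the file, not a copy of what Mandriva actually ships in pulseaudio.conf.

```
# Sketch only: a named PCM and control device backed by the PulseAudio plugin.
pcm.pulse {
    type pulse
}

ctl.pulse {
    type pulse
}
```

With a definition like this in place, an ALSA app can be told to use the device explicitly, e.g. aplay -D pulse somefile.wav, and the PULSE_SERVER environment variable can point it at a remote server.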

The second file allows us to turn on PulseAudio by default. This is pretty important, as offering users a simple way to enable/disable PulseAudio is highly desirable. The PulseAudio rollout has not been completely free from problems, and giving users the ability to quickly and easily disable it is essential. I firmly believe that everyone will ultimately use PulseAudio, but it'll take time for every app to fully support it (by removing non-"safe" ALSA API usage) and for driver issues to be worked out properly.

In this second file, we either comment out the contents completely to disable PulseAudio by default, or leave them uncommented to enable PulseAudio. This is handled by draksound, so users just see a simple GUI, which is a simple but effective solution. That said, I'll probably change this for Mandriva 2010.0, switching to an "Alternatives" driven system, although this doesn't really matter in the overall scheme of things!
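
A minimal sketch of what such a switch file could contain is shown below. It assumes the usual way of making the PulseAudio plugin the "default" device and is not the verbatim Mandriva file. Commenting out every line (with a leading #) leaves the default device untouched, which is the "disabled" state.

```
# Sketch only: redirect the "default" PCM and control devices to PulseAudio.
# The "!" tells ALSA to override any earlier definition of "default".
pcm.!default {
    type pulse
}

ctl.!default {
    type pulse
}
```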

So when an ALSA application starts up, it parses all these files and ultimately works out how to route your audio. 99% of the time, the "default" device is used (and I use "device" in the loosest possible sense). For a standard Mandriva install, the default device is actually PulseAudio (via the PulseAudio plugin for ALSA).

So what happens next? In the next article, I'll go on to talk about how an ALSA app (or any PulseAudio client) works when connecting to PulseAudio.

  • 505

    Just ditch PulseAudio & OSS and everything would be good.

    • Colin

      I presume you mean just using plain ALSA? If so, I’m afraid this approach just won’t work on a modern desktop platform and here’s why.

  • Well, you’re making it look more simple by ignoring all the different libraries which can be used to feed sound into the kernel via the ALSA interface. Or the Pulse interface. Or the OSS interface (via ALSA emulation). Or the esd interface (via PulseAudio emulation). Or the arts interface (does that still work at all?)

    And also ignoring the wigginess that Phonon interjects, if you’re using KDE 4. And…


  • wilson

    this is more confusing than the topic itself.. 🙂

    • Colin

      If you’ve got any advice on how to make it better, please let me know 🙂