In an earlier article, I described how the low-level ALSA configuration allowed us to route all applications using the ALSA API via PulseAudio. In this article we'll take a look at the various configuration files and variables that control this side of the audio path.
Let's walk through what happens when an application tries to play sound.
So first off, an application using the ALSA API tries to open the "default" device. Assuming we've configured this default device to be the PulseAudio plugin for ALSA, it will behave like any other PulseAudio client application. We're now into the land of the PulseAudio configuration.
PulseAudio adopts a client/server model that is very similar in principle to that of the X11 system. It is the server that actually outputs the audio and the client app that tells the server what to play. While this approach can be inefficient, because audio data may have to be copied around, PulseAudio goes to great lengths to keep data copying and other latency-prone operations to a minimum. In the common use case of both client and server running on the same machine, PulseAudio uses SHM (shared memory) so that data sent from the client to the server does not have to be copied between the two processes. The core of the PulseAudio server itself is "zero copy", meaning that references to the data are passed around without actually copying the data itself.
The first thing a PulseAudio client has to do is connect to a server. In order to do this, it checks various variables and configuration files to determine precisely to which server it will connect!
Initially, the PulseAudio client library looks for a PULSE_SERVER environment variable. If found, this variable can define a list of servers to which the client should connect. These servers can be specified as local UNIX sockets or DNS names/IP addresses for a TCP connection.
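As a rough illustration of the formats involved, the variable might be set like this before launching an application. The host name and socket path are made-up placeholders, and the unix:/tcp: prefixes and the whitespace-separated list reflect the server string syntax as I understand it rather than anything stated above.

```shell
# Placeholder host name and socket path; substitute your own.

# A single TCP server (4713 is PulseAudio's usual TCP port):
export PULSE_SERVER="tcp:audiobox.example.net:4713"

# A whitespace-separated list, tried in order: a local UNIX socket first,
# then the TCP server as a fallback.
export PULSE_SERVER="unix:/tmp/pulse-example/native tcp:audiobox.example.net:4713"
```

Any PulseAudio client started from the same shell afterwards would pick up the last value exported.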
If this variable does not exist or is empty, PulseAudio will then check for X11 properties on the root window. These properties are much like environment variables, but will be available remotely if you SSH to another machine with X11 forwarding. I'll speak about this more later. You can see a list of PulseAudio related properties by running:
xprop -root | grep PULSE
The variable names used are the same as those used in the environment, so PulseAudio will look for a property called PULSE_SERVER.
Assuming it still hasn't got a server to connect to, PulseAudio will check for a default-server setting in its client.conf file. This file is located in either /etc/pulse/ or ~/.pulse/. Only one of the client.conf files is parsed, so if the user has their own one, the system one will not be parsed at all (this is a point I tried to make in vain on PulseAudio bug #606 - it took quite a while for this to sink in as you can see!).
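Because a per-user file replaces the system one outright, one cautious way to customise it is to start from a copy of the system file. The following is only a sketch under that assumption: the server address is a placeholder, and the paths are simply the two locations mentioned above.

```shell
# Assumed paths, as described above; the server address is a placeholder.
user_conf="$HOME/.pulse/client.conf"
mkdir -p "$HOME/.pulse"

# Start from the system file (if there is one) so that its settings are not
# silently lost once the per-user file takes over.
if [ ! -f "$user_conf" ] && [ -f /etc/pulse/client.conf ]; then
    cp /etc/pulse/client.conf "$user_conf"
fi

# Append the per-user default server.
echo "default-server = tcp:audiobox.example.net:4713" >> "$user_conf"
```

If the copied file already contains an active default-server line, it is safer to edit that line by hand than to rely on which of the two entries wins.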
So we've tried three ways to find a server. If we've still not found one, we just resort to defaults - i.e. connecting to a local, personal daemon and then a local system-wide daemon (system-wide use is generally not recommended, but is supported for certain circumstances - typically embedded systems). If we still cannot connect, the client.conf file can specify whether or not we will try to automatically start a personal daemon. Since PulseAudio 0.9.11, this is the default behaviour and allows console applications to work out of the box without starting a PulseAudio daemon beforehand.
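The lookup order can be sketched as a small shell function. This only mimics the steps described above and is not how the client library is actually implemented. The function name is made up, and the parsing of the xprop output and of client.conf is a simplifying assumption.

```shell
# Sketch only: mimics the lookup order described above, not PulseAudio's real code.
find_pulse_server() {
    # 1. Environment variable.
    if [ -n "${PULSE_SERVER:-}" ]; then
        echo "$PULSE_SERVER"
        return 0
    fi

    # 2. X11 root window property (needs a display and the xprop tool).
    if [ -n "${DISPLAY:-}" ] && command -v xprop >/dev/null 2>&1; then
        x11_server=$(xprop -root PULSE_SERVER 2>/dev/null |
            sed -n 's/^PULSE_SERVER[^=]*= "\(.*\)"$/\1/p')
        if [ -n "$x11_server" ]; then
            echo "$x11_server"
            return 0
        fi
    fi

    # 3. default-server from client.conf. Only the first file that exists is
    #    read: a per-user file hides the system-wide one completely.
    for conf in "$HOME/.pulse/client.conf" /etc/pulse/client.conf; do
        if [ -f "$conf" ]; then
            conf_server=$(sed -n 's/^[[:space:]]*default-server[[:space:]]*=[[:space:]]*//p' "$conf" | head -n 1)
            if [ -n "$conf_server" ]; then
                echo "$conf_server"
                return 0
            fi
            break
        fi
    done

    # 4. Nothing configured: the client falls back to its local defaults.
    echo "local default"
}

find_pulse_server
```

Running it in a few different environments (plain console, X11 session, SSH with X11 forwarding) is a quick way to see which step supplies the server in each case.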
So, in the unlikely event that all that fails, we will ultimately not be able to play sound, but we've done pretty much everything we can to make it work! In order to better visualise this, let's look at a couple of typical scenarios and run through the above process.
So, under a default install, we've booted to runlevel 3 and logged into the terminal. We don't set any special variables and start an application that plays sound via ALSA. Here is what happens:
- App opens default device
- ALSA PulseAudio plugin (like any PulseAudio client) checks its config for a server and finds none.
- It tries to connect to a local server but fails as it is not running.
- It then starts a PulseAudio server automatically and then connects to it.
- The application then plays audio via the ALSA API functions and this is ultimately played by the PulseAudio daemon.
- The client application finishes doing its thing, and exits.
- The PulseAudio daemon stays around for a while just in case another app wants to play sound in the near future.
- After a while, the PulseAudio daemon will go in the huff because no one loves it and kill itself :p
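To watch this lifecycle from the console, something like the following could be run before, during and a little while after playback. It is a sketch: the daemon.conf locations are assumed to mirror the client.conf ones, and exit-idle-time is, as far as I know, the setting that governs how long the idle daemon hangs around.

```shell
# The exit status of --check is 0 when a daemon is running for this user.
if command -v pulseaudio >/dev/null 2>&1 && pulseaudio --check; then
    daemon_state="running"
else
    daemon_state="not running"
fi
echo "PulseAudio daemon: $daemon_state"

# Show any idle-timeout setting; the daemon.conf paths are assumed to
# mirror the client.conf locations mentioned earlier.
grep -s 'exit-idle-time' /etc/pulse/daemon.conf "$HOME/.pulse/daemon.conf" || true
```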
So that's it. It's quite simple. Let's have another example.
Under X11 things are a little bit different, but the same basic principles are followed.
- During X11 initialisation, modern desktops that support XDG Autostart ultimately run the script start-pulseaudio-x11. This script ensures the PulseAudio daemon is started and some extra X11 related modules are loaded into it. These modules ensure that, unlike in the console case, the X11 PulseAudio daemon will not exit after an idle timeout - it will instead stick around for as long as the X11 session exists. The script also ensures that the X11 properties mentioned earlier are set - the reasons for which will become apparent in the next example.
- When any PulseAudio client (be it an ALSA app via the ALSA PulseAudio plugin, or a native PulseAudio client) is started, it goes through its "find a server" routine. This time the search stops when it reaches the X11 properties, and the client uses the information there to connect to the server.
- The client application then plays sound as before and ultimately finishes and exits.
- The PulseAudio daemon doesn't exit/kill itself as the X11 session is still going.
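Besides xprop, the pax11publish tool that ships with PulseAudio can show what has been published on the root window. This is a minimal sketch that falls back to a message when no display or tool is available; the variable name is just for illustration.

```shell
# pax11publish -d dumps the PulseAudio properties found on the X11 root window.
if [ -n "${DISPLAY:-}" ] && command -v pax11publish >/dev/null 2>&1; then
    x11_info=$(pax11publish -d 2>&1) || x11_info="could not read the X11 root window"
else
    x11_info="no X11 display, or pax11publish is not installed"
fi
echo "${x11_info:-no PulseAudio properties set}"
```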
So again, things are actually quite simple.
Remote X11 Application
One of the handiest things with X11 is the ability to connect to another machine on your network, run GUI applications there and have them display on your local display. With PulseAudio, the sound is also heard on the local machine. Out of the box (for security reasons), remote connections are not enabled. To enable them, run paprefs and enable the option "Enable network access to local sound devices". This is the only option needed for this example. It loads an additional module into the server that listens on TCP port 4713 for incoming connections. Any firewall on the machine must, of course, allow connections to this port!
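To confirm from a terminal that the option took effect, one could look for the TCP module in the running daemon. This is a sketch that assumes the module in question is module-native-protocol-tcp, which is my understanding of what paprefs enables; the manual load command in the comment is likewise offered as something to try rather than as the equivalent of the paprefs option.

```shell
# Check whether the TCP protocol module is loaded in the running daemon.
if command -v pactl >/dev/null 2>&1 &&
    pactl list 2>/dev/null | grep -q 'module-native-protocol-tcp'; then
    tcp_state="enabled"
else
    tcp_state="not enabled"
    # To load it by hand for the current session, one could try:
    #   pactl load-module module-native-protocol-tcp
fi
echo "Network access on TCP port 4713: $tcp_state"
```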
- User starts a normal X11 session and the start-pulseaudio-x11 script ensures the X11 properties are set as in the previous example.
- User then connects to another machine on his network via SSH and starts an audio player (e.g. Rhythmbox, Amarok, etc.)
- Even though the app is running remotely, the visual display will be local.
- When the app starts playing audio, the PulseAudio client portion will find the X11 properties that have been forwarded through the SSH connection.
- The PulseAudio client will then connect over TCP to the PulseAudio server running on the user's local machine.
- The user revels in the local display and audio the application provides.
So as you can see, the use of the X11 properties has allowed us to piggyback on top of the X11 forwarding. It's not a totally clean connection: under SSH the X11 data is tunnelled over a secure link handled by SSH itself, whereas all we are doing is telling the PulseAudio client where to connect directly, outside of any SSH tunnel. This means that while the display can work across NAT, the sound will not. This is fairly easily addressed, but we'd have to teach SSH about PulseAudio for this to work. The reason it works for X11 is that SSH is aware of, and has specific support for, X11. We're just piggybacking on this. That said, the current arrangement is "good enough" for most use cases.
So I hope this article has demystified how the PulseAudio client and server interact and the various configuration files/variables that come into play. If you have any questions, please ask in the comments and I'll endeavour to update the article.