So I would like to take a few minutes to talk about audio routing in PulseAudio. This is an oft-misunderstood topic and it does sometimes seem like black magic and/or broken but, as always, it's pretty simple when you look at it properly. That's not to say it's sensible (I have several reservations about the current way of working), but the first step to improving something is understanding it, so I'll try to explain here and then say what I think is needed to improve it. This is a rather complex and in-depth post, so if this kind of stuff doesn't float your boat, it's a good candidate to skip :p
So how does PulseAudio route your audio and what the hell do you mean by "audio routing" anyway? Well, PA handles multiple audio devices ("sinks" for output, "sources" for capture). When an application wants to play sound it says "Hey PulseAudio, play this!!", and PA tries its best to comply. The application will typically not care about which device to actually output to - it delegates this job to PA (and ultimately to the user via PA's tools such as pavucontrol, gnome-volume-control and (more recently) kmix/phonon too). These tools allow you to move the stream to your preferred sink - no need for an unfamiliar GUI inside a particular application for choosing which audio device to use; users know where to look in their desktop environment to control sound settings. From an HCI/usability perspective I think this is very important (although incumbent users need to shake off their natural assumption that apps should provide a way to configure audio device settings).
With a fresh user account PA will attempt to calculate a suitable default device to use. It does this by assigning each sink an internal score of appropriateness. This is just to determine our initial defaults so it doesn't matter too much if we get this wrong, but obviously it's nice to try and get things right first time! So once our default device is chosen, when a stream is played, it will use this sink. Simple huh? Well not quite. What if I change the default while a stream is playing? My stream is moved across to the new default right? No. Setting the default device does not do this. It acts as a fallback; it's not an "active" default. If I stop my app and play it again, it will be played on the new fallback, but it won't move while the stream is "live".
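To make the "fallback, not active default" behaviour concrete, here is a toy Python model. It is not PulseAudio code: the `ToyServer` class and the sink and stream names are made up purely for illustration.

```python
class ToyServer:
    """Toy model of a sound server whose default sink is only a fallback."""

    def __init__(self, default_sink):
        self.default_sink = default_sink
        self.streams = {}  # stream name -> sink it is currently playing on

    def play(self, stream):
        # The fallback is consulted once, when the stream is created.
        self.streams[stream] = self.default_sink

    def set_default(self, sink):
        # Changing the fallback deliberately leaves live streams untouched.
        self.default_sink = sink


server = ToyServer("speakers")
server.play("amarok")              # starts on "speakers"
server.set_default("usb-headset")  # amarok keeps playing on "speakers"
server.play("paplay")              # a new stream picks up the new fallback
print(server.streams)
```

The already-running stream stays where it was; only the stream created after the change lands on the new fallback.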
Now this is just the very beginning so if you've become overexcited by now, best take a cold shower (and probably look up psychiatrists in the phone book...) as I'm about to move on to look at the "stream restore" database...
So the stream restore database is handled by module-stream-restore (m-s-r). It's part of the default PA install so 99.9% of users are likely using it. What this module does is track when a user specifically moves a stream to a new device. When the user uses e.g. pavucontrol or kmix to move a particular stream to a new sink, this triggers a mechanism inside m-s-r that causes the sink name to be saved for that application. If that application appears at any point in the future and tries to play sound, m-s-r will try its best to ensure that this sink is used. If the device is not currently available, we will ultimately use the fallback/default instead. If this saved sink becomes available at some point during the lifetime of the stream, the stream will be moved across automatically (which differs from the behaviour when setting the default sink).
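The restore behaviour described above can be sketched roughly like this. It's a simplified Python model under my own assumptions, not the module's actual code: `pick_sink`, `on_sink_added` and the rule key "amarok" are hypothetical names.

```python
def pick_sink(saved_sink, available, default_sink):
    """Use the saved sink if it is currently available, else the fallback."""
    if saved_sink is not None and saved_sink in available:
        return saved_sink
    return default_sink


def on_sink_added(streams, saved, new_sink):
    """Move every live stream whose saved sink has just appeared."""
    for rule_key in streams:
        if saved.get(rule_key) == new_sink:
            streams[rule_key] = new_sink


saved = {"amarok": "airtunes"}   # what the user chose at some point in the past
available = {"speakers"}         # the saved device is not around right now

# The stream starts on the fallback because "airtunes" is missing.
streams = {"amarok": pick_sink(saved.get("amarok"), available, "speakers")}

# Later the saved device shows up and the live stream follows it.
available.add("airtunes")
on_sink_added(streams, saved, "airtunes")
print(streams)
```

Note the asymmetry with the default sink: a saved device pulls a live stream across when it appears, whereas changing the default does not.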
Now it gets complicated. Despite my terminology above, m-s-r does not use "applications" per se when it saves the device choice. It will look at a stream and select a "stream restore id" for it based on several bits of metadata attached to the stream. Firstly it will look to see if the stream has a role. If it does, the role is used as the identifying factor, not the individual application. So e.g. for event sounds, the "stream restore id" will actually be "sink-input-by-media-role:event" regardless of which application actually produced it. As m-s-r is responsible for storing volume, mute and device preferences, this means that all event sounds will have the same volume and will be played on the same device. If the stream does not have a role, PA checks for a few other things to try and create its "stream restore id": it checks for an "application id" (a specific property set AFAIK only by a select few applications), an "application name" (which 99% of PA clients provide) and finally it checks for a media name property (which is pretty unlikely IMO, although I may have missed a use case here). Once the "stream restore id" is chosen it is saved for that stream and we will always use this rule when updating our saved volume, mute and device settings thereafter.
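The lookup order above can be written down as a small Python sketch. This is a model, not the module's C code. The property keys ("media.role", "application.id", "application.name", "media.name") are the standard PulseAudio property names, and the "by-media-role" form matches the id quoted above; the other id prefixes and the `None` return for streams with no usable metadata are assumptions on my part.

```python
def stream_restore_id(props, prefix="sink-input"):
    """Pick an identifier for a stream from its properties, role first."""
    checks = [
        ("media.role", "by-media-role"),
        ("application.id", "by-application-id"),      # prefix is an assumption
        ("application.name", "by-application-name"),  # prefix is an assumption
        ("media.name", "by-media-name"),              # prefix is an assumption
    ]
    for key, label in checks:
        value = props.get(key)
        if value:
            return f"{prefix}-{label}:{value}"
    return None  # nothing usable; the real module's behaviour here is not modelled


# An event sound is identified by its role, whichever application produced it.
print(stream_restore_id({"media.role": "event", "application.name": "kwin"}))
# A stream without a role falls through to the application name.
print(stream_restore_id({"application.name": "paplay"}))
```

Because the role wins whenever it is present, two different applications that both tag their streams as "event" end up sharing one saved volume, mute state and device.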
When a user moves the stream (using one of the aforementioned GUI apps - or via any app that makes use of the pa_context_move_sink_input_by_*() APIs (this also applies to source_outputs)), the "stream restore rule" they are ultimately changing will depend on the metadata the application has provided PA in the first place. So for example, Amarok will be tagged with a "music" role, and thus when you use kmix to move the Amarok stream to a different device you are not just moving Amarok, you are actually moving all streams tagged as "music". (Update 1) But wait! It does not actually move the currently playing "music" streams. They will only be moved the next time a PA stream is created (this could be when the track changes, or it could be when the app restarts - it depends on implementation), or when the stream restore database is written by some PA client with the "apply_immediately" flag set.
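Here is a rough Python model of that behaviour, assuming a single shared rule per stream restore id. The `StreamRestore` class and its method names are hypothetical; only the idea of an "apply_immediately" flag comes from the text above.

```python
class StreamRestore:
    """Toy model: one saved sink per stream restore id, shared by all matching streams."""

    def __init__(self):
        self.rules = {}  # stream restore id -> saved sink
        self.live = {}   # stream name -> (stream restore id, current sink)

    def new_stream(self, name, restore_id, default_sink):
        # New streams consult the saved rule, falling back to the default sink.
        self.live[name] = (restore_id, self.rules.get(restore_id, default_sink))

    def move(self, name, sink):
        # Moving one stream saves the choice for the whole rule,
        # but other live streams sharing the rule stay where they are.
        restore_id, _ = self.live[name]
        self.live[name] = (restore_id, sink)
        self.rules[restore_id] = sink

    def write_rule(self, restore_id, sink, apply_immediately=False):
        # Writing the database directly can optionally move live streams too.
        self.rules[restore_id] = sink
        if apply_immediately:
            for name, (rid, _) in list(self.live.items()):
                if rid == restore_id:
                    self.live[name] = (rid, sink)


MUSIC = "sink-input-by-media-role:music"
msr = StreamRestore()
msr.new_stream("amarok", MUSIC, "speakers")
msr.new_stream("juk", MUSIC, "speakers")
msr.move("amarok", "usb")                       # juk is still on "speakers"
msr.new_stream("juk-next-track", MUSIC, "speakers")  # picks up "usb" from the rule
```

Calling `msr.write_rule(MUSIC, "airtunes", apply_immediately=True)` afterwards moves all three live streams in one go, which is the only case in this model where a rule change reaches streams that are already playing.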
This whole approach can be summed up with two flow diagrams:
OK, so that about covers m-s-r, now I'll talk a bit about module-device-manager (m-d-m). This is a module I developed principally for integration with KDE. The GUI exposed for the Phonon preferences in KDE is such that it lists all devices that have ever been available on the system, even if they are not currently available. This allows them to be listed and arranged into a priority list. Whenever the highest priority device is available, it will be used. It's a relatively simple UI and one that users can easily understand and work with. For a degree of flexibility, the GUI allows for different priority lists to be configured for different Categories (which in PulseAudio terminology are called Roles). So m-d-m provides an API and protocol extension to implement such a routing policy directly in PA, which effectively turns the KDE GUI into a frontend to PulseAudio (internally this is abstracted via Phonon but this is not really important). So m-d-m will route your audio for you to the appropriate device. Due to how it is implemented, m-s-r will still take priority, so if you move a particular stream and the device choice is saved by m-s-r, it will take priority over the role-based priority list of m-d-m. I have now provided a way in kmix to delete any application-specific rules in m-s-r such that the m-d-m routing (and thus what is displayed in Phonon preferences) will be used again should an override have been in place.
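As a rough illustration of how the two modules interact, here is a Python sketch of the decision. It's a model only: the `route` function, the device names and the idea of passing the m-s-r choice in as `saved_override` are my own simplifications, and the "none" key stands for the list used when a stream has no role-specific list.

```python
def route(role, available, role_priorities, saved_override=None):
    """Pick a device: a saved m-s-r choice wins, then the role's priority list."""
    if saved_override in available:
        return saved_override
    # Fall back to the "none" list when the role has no list of its own.
    priorities = role_priorities.get(role, role_priorities["none"])
    for device in priorities:
        if device in available:
            return device
    return None  # nothing in the list is currently plugged in


role_priorities = {
    "none": ["usb", "speakers"],
    "music": ["airtunes", "usb", "speakers"],
}

print(route("music", {"speakers", "usb"}, role_priorities))
print(route("music", {"speakers", "usb"}, role_priorities, saved_override="speakers"))
```

The first call follows the "music" list and skips the unavailable "airtunes"; the second shows a saved per-stream choice taking priority over the list, which is the override that the kmix reset option removes.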
Right so that's how things work. Now let's rip it apart and moan a little about the shitty bits.
- A default that is not a default: First things first. This is a question we get a lot: "I've set my default device but <insert victim application here> doesn't want to use it". Now that you know how the routing works due to the excellent overview above, you can probably work out what's going on here. Either m-s-r has a specific device saved for that application's role or the application itself, and that overrides the "default" choice; or the user was playing the stream when they set the default device, and this won't be honoured until a new stream is created; or m-d-m is implementing a role-based priority list preference (which always overrides the default button, because it has a priority list for every case - one for each role and one for the default case). All of the above is flexible but it sure as hell is not clear to the user what the hell is going on. There needs to be some kind of unification and automatic reaction here. Having a simple "default" button IMO makes sense. It's a very easy concept for users to grasp and we should try to make it work in most cases and explain more clearly in the various GUIs why it may not work in certain circumstances.
- Why, Why, WHY?: It is not clear to the user what is going on. In a typical setup we have m-s-r and it remembers one thing. Unless you know how it works, you cannot configure a scenario where you have three or more devices and have it "just work" for you. For me three devices is not uncommon. At work I have a set of USB speakers I use and at home I have Airtunes and network-based PA servers I use too. I want them to "just work" without me having to fire up a GUI every time I want to use them. A primary problem just now is that unless you've taken the time to read and understand the above descriptions, it's very unclear to users how the default device works, what m-s-r does and how it operates. Clarity and transparency are needed here, or some way of making it "just work" without needing such an explanation.
- You gotta role with it?: Yes that is a rather contrived "problem", but the point is valid. As we go down the path of encouraging applications to include as much metadata as possible in their streams, assuming we ever reach such a zenith, then the ultimate end result is that all streams have their role specified. When this happens a fundamentally useful part of PulseAudio's design is effectively crippled in the default setup: m-s-r will always pick a stream restore id of the form "sink-input-by-media-role:fooo". It will no longer be possible to choose an individual audio device for each application; you will only be able to pick the device for that whole class of application. All music players must go to one device. All video players to another, but nothing in between. Some applications set their role generically but can be used for other purposes e.g. totem/dragon are video players but may be being used for just music in a given instance - I'd like to use my preferred music device, not my preferred video device, but I don't want to affect all video players with this choice. Obviously it's better if a multi-purpose app can work out its role dynamically based on content but the principal point still stands - you cannot move streams anymore (with the exception of the aforementioned caveat for currently playing streams when the move is initiated), only whole classes of streams. Some may argue that this is OK/acceptable. Personally I don't think so.
- I don't want an override anymore: Say you've picked an override for your stream/role with m-s-r. Say you've now decided you don't want that override and just want to use the default device now? None of the GUI tools out there just now allow you to reset this (with the exception of the latest kmix in my Git Repository at the time of writing).
How do you solve a problem like Myrouting?
So the above problems basically make things a bit too much like a black box. It's not clear what's going on and it doesn't offer enough flexibility in many cases either. How do we solve this? I'll now outline what I think is the best possible implementation of a routing system and how it will work.
Firstly we need to remove the internal concept of a single "default" device. As we've discussed before within the PA community, we will move to a priority list concept. In a significant difference from the KDE approach of showing the device priority list for users to tweak, we will ensure that exposing this list is not a strict requirement for operation. In order to service this requirement we will still offer the existing UIs for setting the default device, but the internal behaviour will change. Setting a device as default would simply move it to the top of the priority list. This will ensure that simpler UIs can still operate and whenever the user prefers a device above others they just set it as default. In order to get the perfect priority list, they may have to click the "default" button a couple of times with a given setup, but it will stabilise for them very quickly (for reference, this default priority list will be the equivalent of the current m-d-m priority list with the pseudo-role: "none").
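The "set as default means move to the top" idea is small enough to sketch in a few lines of Python. This is only an illustration of the proposal, with a hypothetical `set_default` helper and made-up device names.

```python
def set_default(priority_list, device):
    """Return a new list with `device` first and the other devices in their old order."""
    return [device] + [d for d in priority_list if d != device]


devices = ["speakers", "usb", "airtunes"]
devices = set_default(devices, "airtunes")  # ['airtunes', 'speakers', 'usb']
devices = set_default(devices, "usb")       # ['usb', 'airtunes', 'speakers']
print(devices)
```

Two clicks on "default" were enough here to end up with the order usb, airtunes, speakers, which is the "click a couple of times and it stabilises" behaviour described above.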
As well as the default priority list, we will also implement similar lists for each role (again very similar to the current m-d-m role-priority lists). When a stream has a role, we will simply use this list rather than the overall default one. If a given role does not have a priority list of its own, the default one will be used (in KDE the phonon GUI will likely enforce that a list exists for each role, but has GUI techniques to keep it in sync).
In addition to the above, each individual application will also have its own priority list. Again this is fully optional. If such a list does not exist, the role-based list will be used instead, ultimately falling back to the default list if no role is present.
When a new device appears it will be added to the default priority list only. It is open for debate as to where this device should appear in this list - top or bottom - but this could ultimately be a user preference. The routing logic should check the most appropriate list first and try to find the highest priority device in that list that is currently available. If none are available it will use the next most appropriate list, ultimately falling back to the default list, which will always contain an available device.
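The proposed lookup order (application list, then role list, then default list) could look something like this in Python. It's a sketch of the proposal rather than anything that exists in PulseAudio: the function names, the `at_top` switch and the device names are all hypothetical.

```python
def pick_device(app, role, available, app_lists, role_lists, default_list):
    """Walk the most specific list first and return the first available device."""
    for priority_list in (app_lists.get(app), role_lists.get(role), default_list):
        if not priority_list:
            continue  # this list does not exist, try the next most appropriate one
        for device in priority_list:
            if device in available:
                return device
    return None  # only reachable if the default list has no available device


def add_device(default_list, device, at_top=False):
    """A newly seen device only ever enters the default list."""
    if device in default_list:
        return
    if at_top:
        default_list.insert(0, device)
    else:
        default_list.append(device)


default_list = ["speakers"]
role_lists = {"music": ["airtunes"]}
app_lists = {"rhythmbox": ["usb"]}
add_device(default_list, "usb")  # new hardware lands in the default list only

available = {"speakers", "usb"}
print(pick_device("rhythmbox", "music", available, app_lists, role_lists, default_list))
print(pick_device("amarok", "music", available, app_lists, role_lists, default_list))
```

In the second call the "music" list only contains the unavailable "airtunes", so the lookup falls through to the default list rather than giving up.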
New APIs will be made available (either as extensions or baked in - details TBC). These APIs will allow the querying and editing of these priority lists (thus facilitating GUIs such as the one in KDE). When editing these lists the changes they reflect will always be immediately applied to any running streams. APIs will also be exposed to delete a role-based list or an application specific list (NB the application list could potentially also be cleared via the existing APIs by passing PA_INVALID_INDEX or an empty or NULL device name).
The existing APIs for setting the default device and moving a stream to a given device will still be available as they are now, but will behave differently internally.
As mentioned previously, setting a default device will update the overall default priority list. This change will be propagated to any currently running stream immediately. Thus, for a fresh user account with no priority lists saved, the default priority list will be used for all streams. Clicking on the default device will move any running streams to that new device. And thus the very basic use case will operate in an intuitive way.
If a user moves a specific stream to a new device (via the pa_context_move_sink_input_by_*() style APIs) this will trigger the update of the application-specific priority list. This is a stark change to the existing behaviour, which will update the details for the currently used device choice mechanism. If the priority list for a particular role is to be updated by a given GUI application, then the new APIs above should be used to achieve this result. If a stream is moved and it does not yet have its own priority list, one will be created for it automatically (containing only the chosen device initially).
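A toy Python version of that move logic might look like the following. It is only a sketch of the proposal: `move_stream` and the data layout are hypothetical, and putting the chosen device at the top of an already existing application list is my assumption by analogy with the "default" button, as the text only specifies what happens when the list is first created.

```python
def move_stream(app_lists, live_streams, stream, device):
    """Record a user-initiated move in the application's own priority list."""
    app = live_streams[stream]["app"]
    current = app_lists.get(app, [])  # created on first move if it does not exist
    app_lists[app] = [device] + [d for d in current if d != device]
    # Changes apply immediately, so every live stream of that application follows.
    for info in live_streams.values():
        if info["app"] == app:
            info["device"] = device


app_lists = {}
live_streams = {
    "stream-1": {"app": "paplay", "device": "speakers"},
    "stream-2": {"app": "paplay", "device": "speakers"},
    "stream-3": {"app": "amarok", "device": "speakers"},
}
move_stream(app_lists, live_streams, "stream-1", "usb")
print(app_lists)
print(live_streams)
```

Both paplay streams end up on "usb" while the amarok stream is left alone, because in this model the list belongs to the application and not to the individual stream.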
So what can the GUIs show? Well there are a lot of options now:
- A very simple GUI which just shows a list of devices and allows a single default to be chosen.
- A more complex GUI that shows all devices in the default list and allows the user to order them.
- One that shows both the default and the known roles and allows a default to be chosen for each (or expose the full list for user ordering)
- etc. etc.
(FWIW although it's possible, I don't think it would ever be necessary to expose the per-application list, although there should be a method for indicating that such a list exists and is in use, and for allowing the user to delete it and use a higher-level priority list.)
With regard to event sounds specifically, we want to make sure that these are handled differently to a regular stream produced by an application. For example I may want to use Rhythmbox to output sound to my Airtunes device and implement this via an application-specific priority list. Any event sounds for this should not be pushed to this device. I'm not sure what the best solution to this conundrum is, but I suspect that a simple property set on the stream that says "do not allow per-application priority list updating" would be sufficient. This still allows streams to be moved to the device, but this won't be saved for later restoration (obviously moving an event sound is really hard due to their duration, so this logic is more for the principle than the practicality).
(Update 1) As I propose to apply changes whenever they happen, if you run e.g. two paplay processes at the same time and change one to use a specific device, my logic dictates that both streams would be moved. This is a regression on the current behaviour, where the stream will only be moved when it is restarted, but to be honest I don't think it's much of a problem. If it is desired to have specific control over a given instance of an application, then we need either a public API to not save the move (much like our internal one), or support for a proplist property that suppresses this (i.e. no API change but a loose "standard" for performing this type of operation). If you want to run an application in a specific way, then you can always set a property such as the "application id" mentioned above, which would make it appear as a unique application compared to running it normally. So flexibility still exists here (albeit rather complex), but we probably do want to deal with the default cases well and make the more niche ones possible via some degree of extra work.
OK, so this is quite an exhaustive proposal and I've not nailed down the exact best way to implement it, but I think that this approach is ultimately the most flexible possible that supports all of the interfaces desired and still exposes the flexibility of the PulseAudio system. I have not given much thought to the storage of volume/mute status (which m-s-r currently tracks along with its single device preference). My gut feeling here is a fairly similar approach where you should be allowed to set the default volume, a per-device volume and a per-application volume. The former two would be used as defaults and, as soon as the per-application volume is changed via any API call, we save it with that specific application for future reuse. APIs can set the default volume and the per-role volumes and this will be propagated to applications that do not have their own specific override in place.
So in summary, this is how I think things should work. I don't think it's actually as complicated to implement as I've made it sound (it would probably only take about three or four times as long as it's taken me to write about it!!) and I believe it keeps the power and flexibility of the system, while still allowing for minimal control interfaces as well as more exposed and complex ones if required.
 16th Feb: Add notes about how moving a stream with a role does not move any currently live streams and how the proposed solution would prevent running two instances of the same app and moving the device independently for each (as one would follow the other) unless the application id was overridden for one of the instances.