Colin.Guthr.ie Illegitimi non carborundum

24 May 2010

Qt Multimedia/Mobility vs. Phonon: FIGHT!!!

Well, it's not really like that, but I guess those involved can think about it a bit like that at times! For some background, Phonon is a multimedia framework that was included in Qt 4. As far as I understand it, it was developed outside Qt but was later adopted (please correct me if my history is incorrect here). It was designed to give application developers easy access to media playback systems, be it MP3 music or new-fangled WebM video! Rather than implement any of the complex stuff itself, Phonon hands off the actual decoding and playback parts to existing media frameworks. Originally Qt wrote a GStreamer "backend" for Phonon and this was the only available backend on Linux in the early stages (others were available for other platforms too). I personally think that GStreamer was a good choice. I think it is a very powerful system, but it's not for the faint-hearted. I won't begin to pretend that I understand it (although I have hacked my way through some GST code!), but the principle of its operation seemed to fit the needs of the Phonon project very nicely.

Sadly the GStreamer backend was plagued with some problems. I won't go into the details (primarily because I don't really know them all!) but many KDE developers felt that it just wasn't appropriate and decided to write a Xine backend for Phonon instead (NOTE: This is not actually correct. Please read the comment by Kevin Kofler below). This matured quickly, but it always had its own set of problems that resulted in many hacks being introduced at the application level. This was very bad as it meant that those of us (Mandriva included) who stuck with and helped fix the GStreamer backend were sometimes left a little out in the cold due to these hacks. Ultimately this story has concluded with the Xine backend being more or less abandoned now, and renewed focus being put on the new VLC backend. VLC, or more accurately "VideoLAN", is a multimedia framework, very similar to GStreamer. It has a very well maintained code base and support for numerous codecs and formats. It is also totally cross-platform, working with Windows, OSX and of course Linux and several embedded OSs to boot. Having a single Phonon backend with uniform capabilities and support across all platforms is very desirable, so progress with this is quite rapid and an official release of the Phonon-VLC backend will be available in a matter of weeks.

But this is all about Phonon, not Qt Multimedia or Qt Mobility. So where does this all fit in? Well, Phonon has ultimately not really been developed much by Qt. That's not to say they have totally forgotten about it. I am assured that bugs reported will be looked at, time permitting, and that patches from downstream will be tested and merged, but it's fairly clear that it will not be actively developed further and no new features will come from the Qt side. Their new babies are Qt Multimedia and Qt Mobility.

Qt Multimedia and, arguably to a larger extent, Qt Mobility have been developed with a strong focus on mobile usage. Now that Qt is owned by Nokia this is not surprising! So what was the motivation for this and how do they differ from Phonon? Well, like Phonon, they also make use of existing projects to provide the core media decode capabilities. They will continue to use GStreamer for most "linuxy" systems and other frameworks for e.g. Windows and OSX. They differ, however, with regard to focus: they encompass several additional features covering the needs of a typical mobile environment, such as still image and video capture (think cameras on a mobile phone).

There are several other key differences too but I wouldn't like to go into detail just now (I'm not really familiar enough with either Qt MM or Qt Mob to comment in any great depth), but suffice to say that this general abandonment of innovation in Phonon in favour of a new project has left several people in the KDE multimedia community feeling rather uncertain about what to do next; which horse to back!

Here at the KDE Multimedia Sprint in Randa, Switzerland, we were lucky enough to discuss the various issues with the Qt Multimedia/Mobility guys in Brisbane, Australia via a video conference. Well, it was supposed to be a video conference, but despite it being between two groups of multimedia geeks, video completely failed us (various firewall issues for one solution and the lack of admin rights at the remote end for another!), which meant we had to make do with a normal phone call! Such is life! Anyway, they were kind enough to give us an overview of Qt Multimedia/Mobility and let us know their approximate roadmap.

The outcome of this meeting was more or less that the KDE Multimedia focus for the short term will remain with Phonon, but the Qt guys will try to be more open about their plans and try to work better with the community longer term. At present, this is an uphill struggle as the VLC project has been very active in the KDE community for the last little while and seems to fit the needs of KDE in both the short and relatively long term, albeit in a way that is not currently exposed via the Phonon API. This previous lack of communication is something that will be hard to overcome, but using a Qt integrated and supported solution definitely has advantages in the future - if they really mean it this time! The fact that the code is also used on Nokia mobile platforms gives a fairly good reassurance that this will indeed be the case at least!

The rather obvious question of "Why not extend Phonon?" was also raised. While it was difficult to hear the responses exactly, the general reasoning seemed to be that there were some design mistakes in Phonon that basically meant that each individual feature that needed to be added actually needed to be done at the backend level first (for each and every backend) and then exposed out through the API. With the Qt Mobility approach, more of the core features and functionality can be implemented once, with it only reaching out to the underlying platform-specific system for specific and defined operations.
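
To make that distinction a little more concrete, here is a minimal sketch in Python. It is not Phonon or Qt Mobility code; the `Backend` class, its `set_volume` primitive and the `fade_out` feature are all hypothetical names invented for illustration. The idea shown is simply that a feature written once against a small, well-defined backend primitive works for every backend, rather than having to be written separately inside each one.

```python
class Backend:
    """Hypothetical platform backend exposing one small, defined operation."""

    def __init__(self, name):
        self.name = name
        self.volume = 1.0
        self.log = []  # records every volume value the backend was asked to set

    def set_volume(self, value):
        # In a real system this would call into GStreamer, VLC, etc.
        self.volume = value
        self.log.append(round(value, 2))


def fade_out(backend, steps=4):
    """Feature implemented once in the shared layer, using only set_volume."""
    start = backend.volume
    for i in range(1, steps + 1):
        backend.set_volume(start * (1 - i / steps))


# The same fade_out works unchanged for any backend that offers set_volume.
for backend in (Backend("gstreamer"), Backend("vlc")):
    fade_out(backend)
    print(backend.name, backend.log)
```

In the per-backend design described above, the equivalent of `fade_out` would instead have to live inside every backend before it could be exposed through the API.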

We enquired about the possibility of having a single Qt Multimedia backend written for Phonon, one that allowed the Phonon API to continue in use at the application level while using the underlying systems/framework provided by Qt Mobility and/or Multimedia and deprecating all other Phonon backends. This is attractive in the sense that applications do not need to change horses midstream and can gradually move over to using a pure Qt Multimedia API if indeed that is determined to be the most desirable outcome. That said, it is also unattractive in some ways. This adds yet another layer of abstraction to a system that many people argue is a layer of abstraction too far in itself! i.e. Application uses Phonon which uses Qt Multimedia which uses GStreamer; why not just Application uses GStreamer? (There are of course many reasons to add a layer of abstraction, cross-platform support being a primary one, but two layers is arguably not the best situation, especially on mobile platforms where wasted CPU cycles really hurt.) That said, it would still be an acceptable stepping stone for most people. Regardless, there was no actual commitment to this eventuality from the Qt side. It was seen as a good idea but it's likely not something that there will be time (aka budget) for developing in the near term, which is understandable.

Many of the problems with Phonon still remain however. Qt Mobility will still require the setup of different underlying support libraries on different platforms. Just as before, GStreamer will be required for it to work on Linux and downstream (distributions and, to a lesser extent, users) will need to ensure that this is all configured and working correctly. As with the original Phonon, this backend will be different on different platforms, which is not ideal for the downstream.

So in the short to medium term (e.g. the next one to two years at least), Phonon will continue to be the primary media framework on KDE, and development on the Phonon-VLC backend will be seen as the best way forward to provide a standard experience across all platforms. GStreamer as a phonon backend will also continue, although the mplayer and xine backends should be considered deprecated.

With regards to Qt Multimedia and Mobility development itself, the Qt guys will try to be more open; opening a private mailing list to more access and generally being more communicative.

Those of you who regularly read this blog will no doubt be wondering where the PulseAudio connection is. Well, from what I gather, this is a primary difference between Qt Mobility and Qt Multimedia. Qt Multimedia seems to use ALSA directly and thus is not ideally suited for mobile situations (PulseAudio's timer-based scheduling now being pretty much universally accepted as the preferred approach on mobile platforms). Qt Mobility does actually use PulseAudio indirectly via GStreamer, but it does not seem to do anything special with regards to the "buffer-time" or "latency-time" attributes of the pulsesink (not that pulsesink is actually referred to directly in the appropriate place anyway from what I can tell). These attributes map directly to the tlength and minreq attributes of the buffer metrics in a PulseAudio stream. These are very important when it comes to mobile environments as the general aim is to provide as high a latency as possible. Higher latency means lower power usage (and for those wondering, this does not necessarily affect A/V sync on video playback - it's just about how much data you pump into the audio buffers before sleeping until you know it'll be almost empty). For a system that deals with audio playback on a mobile device, this is very important.
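
To put some rough numbers on this, here is a minimal sketch (Python, purely illustrative) of how a time-based setting such as "buffer-time" turns into a byte count of the kind used for tlength. The function names, the 44.1 kHz 16-bit stereo format and the example durations are assumptions made for illustration, not values taken from the GStreamer or PulseAudio sources, and the refill estimate is deliberately crude (it assumes one refill per buffer length).

```python
USEC_PER_SEC = 1_000_000


def usec_to_bytes(usec, rate=44100, channels=2, bytes_per_sample=2):
    """Convert a duration in microseconds to a PCM byte count.

    Assumed format: 44.1 kHz, stereo, 16-bit samples (4 bytes per frame).
    """
    frames = usec * rate // USEC_PER_SEC
    return frames * channels * bytes_per_sample


def refills_per_second(buffer_usec):
    """Crude estimate: the buffer is topped up about once per buffer length."""
    return USEC_PER_SEC / buffer_usec


# Compare a short and a long buffer (both durations are illustrative).
for buffer_usec in (200_000, 2_000_000):
    print(buffer_usec, "us ->", usec_to_bytes(buffer_usec), "bytes,",
          refills_per_second(buffer_usec), "refills/s")
```

With these assumed values, a 200 ms buffer is 35280 bytes and needs topping up about 5 times a second, while a 2 s buffer is 352800 bytes and needs it roughly once every two seconds. Fewer refills means fewer wakeups, which is where the power saving comes from.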

Now Phonon is also similarly ignorant of these properties when using PulseAudio. With the help of folk from Intel we're going to look at increasing the defaults in Phonon when used with the GStreamer pulsesink to see if higher latencies would bring benefits without any drawbacks at the application level (I don't think any of the uses of Phonon will have problems with this approach), but we'll also have to consider how best to expose this via the Phonon API to allow the application level to state more explicitly if this matters. When capture APIs become possible on Phonon this will become important. The whole question of how much thought has gone into this kind of power saving in Qt Mobility is still very much at the forefront of my mind. I hope to be able to discuss things in more depth with the developers in due course, hopefully influencing them to extend the API in a similar way as described above.

More generally, those of us in the KDE community will try to get involved with Qt Multimedia/Mobility but until such times as it is easily configurable on the Desktop or there is a Phonon backend based on it, it will be hard to get involved too deeply.

Overall I think everyone has an open mind, but with the current focus on VLC and the functionality it provides, it will be the most interesting bits for most KDE Multimedia guys for the short term.

With regards to all of the above, I hope I've made fair and levelled comments. If you feel this is not the case, please tell me so in the comments below. I will happily retract or rephrase or reconsider any point if a suitable argument is made. I do not intend to represent anyone else in the KDE community in a way they do not feel is accurate so I'll be happy to both comment and edit the article with suitable annotation if this is deemed necessary.

Edits:

  1. 27th May: Clarification of my incorrect statement of when/why Xine backend was introduced. Thanks to those who pointed it out.
Comments (24) Trackbacks (2)
  1. So KDE will now continue use QT Phonon framework, which replaced ARTS, which will talk to PulseAudio with help of VLC backend, which replaces Xine backend, which is broken for some time now, which replaced GStreamer backend, which was always broken; all of them being originally written for ALSA, which replaced OSS.

    And at the same time, we’ll have additional QT Multimedia and QT Mobility frameworks, which will get to upstream QT at some point, but will either use ALSA or GStreamer directly.

    And yet at the same time, nor Phonon nor QT Multimedia nor QT Mobility will use PulseAudio features which were developed for the PulseAudio usage in mobile environments.

    Did I got it right? :)

    • Almost perfect! The only bit that is wrong is the fact that Qt Multimedia is already part of Qt 4.6 and Qt Mobility is actually a separate module for Qt that will likely remain separate (as far as I understand it, they are trying to be more modular, not unlike X11 – although I did hear some rumours about X11 going back to a monolithic structure but not sure how credible they are!)

  2. Oh I have been trying so hard to avoid anything PulseAudio on my machine. On one side the developers say that the distros never implement it correctly, and on the other side the distros and users always say Pulse is broken. Add to that the overly complex volume systems and the various GNOME/GTk dependencies that it seems to have… :(

    Xine has been working great for me, can’t we just fix that?? When is this new implementation supposed to start, anyways?

    • With PA there are no mandatory GNOME/Gtk deps. There is currently a gconf based module loader module but I plan to factor this out eventually…. I’ve just been lazy. The UIs in KDE were lacking but I’ve mostly implemented all the UIs needed now (not all landed in trunk yet tho’). Many problems around PA, ALSA, distro integration etc. have combined to create a bad impression. Particularly bad implementations in popular distros like Ubuntu have not helped, but they are making a lot more effort these days and have developers actively involved with upstream PA community, and working on interesting projects relating to usability etc. so please don’t let opinions formed in the past cloud your judgement going forward :)

      Xine is a dead project. It’s no longer developed upstream and it has so many bugs that applications have to work around them, even with a phonon wrapper. While it may work great for you, this is in part testament to the fact that the developers have added all these various workarounds to deal with it. This is not practical longer term. Work on Phonon VLC has been going on for a long time now and it’s progressing very well. When VLC 1.1 is released (soon) it will become more mainstream.

  3. To confuse things just that little bit more, Qt Multimedia has been removed from the upcoming Qt 4.7 release. It is, however, still part of Qt Mobility and should reappear in the core of Qt 4.8 (or should I say as a module for 4.8, as they are aiming for complete modularity of Qt in that release).

    Explained here by more knowledgeable people than me: http://labs.trolltech.com/blogs/2010/05/06/qt-47-scope-change-regarding-qt-multimedia/

  4. Xine came first, btw., although it was apparently originally meant as a stop-gap measure.

    Qt came and kind of “dumped” in the three additional backends when they decided to adopt Phonon as their de facto multimedia framework, but development was done behind closed doors and without much involvement from the community. And when Qt discontinued Qtopia (the original mobile version of Qt), and the Brisbane office got the responsibility for multimedia from the guys (who had decided to include Phonon) in the Oslo office, they apparently found Phonon lacking and started work on QtMultimedia.

    Also, Phonon already has for example low-level access to video frames and audio frames (though not finalized yet, with room for improvement…), and as you know will get av-capture and low-level PCM-IO APIs as GSoC projects this summer, which should collide nicely with QtMultimedia… :/

  5. that was very … diplomatic. :)
    thank god we have phonon… applications can just keep writing to that, and users can just use whatever backend works for them at the time, and both are spared the pain of the audio wars until (if ever?) they’re settled ;)

  6. /me is confused like hell now :-/

    So, currently I hack a lot on plasma mediacenter which uses the Phonon::VideoWidget to display videos currently. It only works with GStreamer and stays black with Xine.
    The bad thing is, red/blue channels are swapped for the GStreamer backend making for strange effects. Would a VLC backend solve this problem?

    • Oh that’s easy to answer: “maybe” :p

      In all seriousness, I don’t know but I think that would be the aim. Assuming that your app is not doing something that is ultimately undefined in the API (i.e. it just works by luck in GStreamer) then the aim should certainly be to make it work in VLC too. If you can try it out that’s the best option then ask on #phonon on Freenode if it doesn’t work as expected.

  7. Actually, the xine backend was the first backend for phonon. Then the project was integrated in Qt, and Qt Software created 3 backends: GStreamer, DirectShow and one for Mac (quartz??). When they were released, the xine backend was already working fine.

  8. I think you got the Phonon history slightly wrong. Phonon was developed as part of KDE 4.0 (and part of kdelibs at that time). The xine-lib backend was part of that. There was no working GStreamer backend at that time, 2 separate attempts had been started, but weren’t finished. In the KDE 4.1 timeframe, it was decided that Trolltech would adopt Phonon for Qt, so it got moved to kdesupport and Qt started importing it as one of the “3rdparty” modules shipped together with Qt in Qt 4.4 (which went out before KDE 4.1, so we had KDE 4.0 shipping with Phonon 4.0, Qt 4.4 with Phonon 4.1 and KDE 4.1 with Phonon 4.2; Phonon version numbers went back into sync with KDE ones later, when KDE 4.3 was released without a new Phonon). This was when the GStreamer backend we know now was written, as well as DirectShow and QuickTime backends. So the Xine backend is actually the one which was there first. Trolltech refused to support xine-lib because it’s under the GPL, not the LGPL, which is a problem for their commercial customers. The Phonon imports into Qt always have the Xine backend ripped out for those licensing reasons.

    • Thanks Kevin. That’s the most complete response (a couple others have pointed this out too). I’ll reference this in the main article for completeness. Thanks.

  9. So, is Phonon fundamentally broken? Is it doomed in the long run? Are developers of new software to be advised not to use it?

    • I’d very much say “no” it’s not fundamentally broken. It does what it does perfectly well. Where it is not appropriate (according to Qt folks) is for the Qt Mobility stuff.

      At present, the Qt Mobility system is a far worse option for app developers as it is simply not available anywhere on any distro etc. etc. As I said in the summary at the end, Phonon + VLC backend is very much the solution to continue focusing on for the short and medium term. With regards to long term, if/when Qt Mobility matures and there is reliable support on a wide variety of platforms, then it will potentially be possible to port things across to it. The API is actually quite similar to Phonon and such a porting effort won’t be that difficult if/when that happens. As a stopgap I’m sure a phonon compatibility wrapper will be possible, but I did mention that this would cause two layers of abstraction where one would suffice; which is not ideal (and especially so for mobile systems).

      Nothing lasts forever and while the Phonon API may be around for quite some time, there may be a better, more appropriate route that is a realistic option in a year or two. Ultimately, this is not really news to anyone if you break it down to the fundamental principles :)

      • “there may be a better, more appropriate route that is a realistic option in a year or two”

        we need to do better than that with our API commitments. the only reason to toss out Phonon that quickly is if it is really broken. sometimes we toss things out because “we can do it better if we start again” and not because what we are tossing is really broken.

        sometimes, things are really broken (i can name a number of things we tossed from KDE3, including aRts), but we need to keep ourselves from feeling too cavalier about such decisions.

        Phonon actually works very well, has a nice API and, as we can see here (and as Chani already pointed out), shields app devs from the utter insanity of multimedia framework development (from an app developer POV). that’s invaluable.

        please, as media developers inside of KDE, produce some commitment and resolve to the framework. let’s move from “ok, for the next year or two” towards something more concrete and confident ….. even if that means we’ll need to do some work on the framework to keep it viable. which will be true of _any_ solution that comes along.

        • That’s quoted a little out of context. The bit immediately before was: “… while the Phonon API may be around for quite some time, “, which was meant to imply this commitment to the API that you’re worried about. Just because there is a “better, more appropriate” API available, doesn’t mean the old one immediately stops working.

          While I would love to say, yes this is the “final API and it will last for ever”, it’s not really 100% up to us (the KDE media developers) to make these decisions. We are at the behest of Qt in many regards and while we can do our best to keep things stable, if change is genuinely merited (for either technical “pureness” or maintainability reasons) then this should certainly be considered and in many cases encouraged.

          But as I said, I do not see Phonon API going anywhere (incompatible) for quite some time.

  10. Hi Colin !
    Because of its design, I always thought gstreamer was the ultimate multimedia framework, although it isn’t very easy to understand nor to use…
    Now, I’m reading everywhere (even on your amazing blog ! :) ) that VLC is the next, and that its amazingly flexible design lets it suit phonon well… Woowh, I, as a developer of some portable multimedia player, currently working on an engine based upon gstreamer, ask myself how such a VLC backend can compete with gstreamer in terms of flexibility…
    Well could you explain in just a few words for the rest of us – the “not-really-aware-of-VLC” people – why “[you] think of it differently now (in very much a good way – and for clarity I am really referring to the scope of the project here – I wasn’t fully aware of how flexible it is until JB’s presentation!).” (from your latest blog post, I guess it’s a better place to speak about there.. ?)
    Thanks ;) !

    • Sorry for the late reply here…. somehow missed this and I don’t log in too often to check!

      With regards to VLC vs. GStreamer, I have to say that I really rather like the concepts in both of them. Both of them are really rather similar as far as I can gather. The real trick is getting the right “graph” for any given job. If you want to play back video the graph will include the input, demux, decode and output stages (possibly others too, like OSD, subtitles etc.). Both systems offer some kind of automatic graph generation (in GST this is the playbin and decodebin components I believe). In the past I was rather ignorant as to how VLC worked, but after speaking with JB I see that they too have a pretty flexible graph generation system that can wire up all the modular components automatically for any given job.

      Now I won’t begin to talk about the internals. I really only have a high level understanding of both GST and VLC, but from what I can see, they are not all that different. I guess I was just previously assuming that VLC was a media player rather than a media framework, and that prejudice is what formed my general opinion (and it wasn’t a bad opinion, I just didn’t realise that it was much more than a media player).

  11. Hello Collin,

    What is actually the shape of the gstreamer phonon backend (thus I mean the code that is part of phonon, not gstreamer itself), afaik not much development went into it last years/months? Do you expect the vlc and the gstreamer backends soon to be on the same quality/featureset or is the vlc backend already in a much better state and is this fact alone a reason to switch to vlc?

    .. especially if one (me) uses pulseaudio…

    • Well, this is where it is tricky. It’s true that the GST backend has not had much love of late. As we use this by default in Mandriva, we try to push our bug fixes upstream, but there hasn’t been much focus on anything other than bug fixes. Qt/Nokia has not done too much with it either. On the flip side, the VLC backend has had a lot more love recently and it’s in pretty good shape. That said, the PA support in GST is, at the moment, a lot better than in VLC. I do intend to change that, but I’ve simply not had time to look into it. Don’t get me wrong, it’s not terrible in VLC, the author has done a good job, but I’d like to see proper support for more advanced buffer metrics (for power savings on mobile devices etc), for proper corking/pausing for streams (to allow for instant pause), and proper pass through of volume control. There are also potentially some situations when seeking where the stream is just “lost” in VLC that I’ve not been able to debug yet.

      So the answer, as it invariably is, is somewhat inconclusive! I’m currently using GST (tho’ you have to turn off crossfade in Amarok), but will try and use VLC more and more as I find time to hack on the PA layer inside it.

  12. Hello Colin,

    thanks for your fast answer. Just for clarification why I’m asking this: I’m currently using gstreamer/pulse in kde4 as well and I actually like this combination. I’m using ubuntu as base distribution which come with gstreamer/pulse as default and use my own vanilla compiled kde on top of it. Since I would like to keep just one multimedia backend I’m a little bit worried that everyone in kde land jumps onto the vlc wagon and GST as phonon backend will be forgotten completely.

    (On a side note, yesterday I gave kde45 trunk a try and the pulse integration into kmix is just perfect, though it took me some time to find out how to move a stream :), really cool work!)

  13. After reading your article, I chose to use Qt Mobility instead of Phonon. However, this ended up not being the right choice at all for the reasons mentioned here: http://hammertechengineering.com/qt-mobility-qtmultimediakit-vs-phonon/. I just want to make sure people have both sides of the story.

  14. Hi all,
    I am trying to use QtMobility in my Qt application on an embedded Linux system. When I use GStreamer directly (gst-launch playbin2 uri=file://video-file.avi) the CPU usage is ~20%,
    but when I use QtMobility (the Video element of QtMultimediaKit) the CPU usage is ~100%.
    I see from the output that Qt uses GStreamer, but it’s not efficient.
    Does anyone have an idea why this happens?

