Colin.Guthr.ie Illegitimi non carborundum

12Jun/050

MetaLibrarian

My proposal for a system wide Meta Data Manager for your media content...

NB: After some discussion with users on the Amarok mailing list, Jamie McCracken pointed me towards Tracker. It appears to be designed to do pretty much what I was thinking of, so it's probably a good idea to check it out if you are interested in some of the things I suggest here.

 


Outlined here is a proposal for a project which I think will bring
great benefits to PC based music players/managers. I'll try and outline
it as clearly as possible and hopefully you'll either agree, or be able
to shoot the idea down in flames from a learned position!!

The Problem

The problem is I use Linux. Linux is not the problem, the problem is
that the one thing Linux ensures is choice. I almost envy the Mac. You
want tunes on your Mac, you use iTunes. It's pretty much that simple.
Now there are a million reasons why iTunes is evil etc. but that's
beside the point.

OK, so here is the scenario: Personally, I use Amarok to listen to my music on my PC. I use MythMusic to listen to my music in my living room. Sometimes I use XMMS just for fun (the XMMS Weasel still rocks!!). I've also toyed with Domo, JuK, Prokyon3 and MadMan. The list could easily go on (forgive me if I've missed your favourite music app!).

Some of these programs just work on music files in your file system and
rely on you naming your directories nicely. Others use an inbuilt
database that keeps the data locked away "inside" the application.
Others still use a standard relational database like MySQL
which theoretically allows you to access the data from different
locations/applications, but each application uses its own database
schema, so they are pretty much incompatible.

So I've got, say, 10,000 audio files in my collection. I fire up
MythMusic and it needs to scan all my music and cache the metadata.
Myth also allows me to configure how I want to organise the
hierarchical view of my music: genre/artist/album/track;
artist/album/track etc. It will also track playcounts/scores and other
usage data to "learn" my preferences and musical tastes such that it
can intelligently help me out with my music choice. It also allows me
to store playlists and create "smart" playlists. So I've done a whole
lot of scanning, organising and configuring there and eventually got it
to be something like how I want to work.

Great. That's my living room sorted.

Now I fire up Amarok..... here we go again... it scans my music and caches the metadata.
Amarok also allows me to configure how I want to organise the
hierarchical view of my music: genre/artist/album/track;
artist/album/track etc. It will also track playcounts/scores and other
usage data to "learn" my preferences and musical tastes such that it
can intelligently help me out with my music choice. It also allows me
to store playlists and create "smart" playlists. OK, you are perhaps noticing some similarities here.

Add in Prokyon3, Domo etc. etc. and you've got the same setups and the
same data being cached and edited in many different places, under many
different systems, and stored in many different formats etc., all
totally independently. What's more, say I use Amarok to edit my metadata
to rename an album etc., that makes Myth's cache out of date. Same goes
for the other way round. There are all manner of problems/undesirables
with this kind of disparity.

OK, so you may not want to mix Prokyon3/Amarok etc. anyway (pick one
and stick with it), but some programs are designed to be used for
different purposes, such as my setup where Myth is in the living room
and Amarok is on the desktop.

I think I've outlined a problem that is screaming out for a common solution.

Solution

Right. Here is what I think is an ideal solution. It's not been fully
thought out, I'll admit, but I've been mulling it over for the last
couple of weeks and bounced it off a few friends etc. who all reckon it
is a fairly sensible idea.

I'd be perfectly happy if someone finds a massive hole in the idea or
if you can tell me why you think it's really not worthwhile.

What the MetaLibrarian does

  • Maintains a common store for all the metadata
    relating to your audio collection. This would include, but is certainly
    not limited to: Album, Artist, Track Title, Track Number, Genre, Date,
    Album/Compilation Artist, Album Art, Artist Art/Photo, Artist Bio,
    Album Review, Lyrics, Composer etc. etc.
  • Updates metadata in the common store and in the files' tags.
  • Makes available, to any application that wants it, any of the cached metadata.
  • Provides interfaces to flexibly search through this metadata.
  • Provides a common preference for arranging a hierarchical structure/structures representing the total collection.
  • Provides a common point for the storage of playlists and smart playlists.
  • Provides a common point for the recording of usage data, such as playcount, last play date, score, skipcount etc. etc.
  • Manages the storage/filesystem organisation of your audio files (optional).
  • Provides a simple and easy way for applications to use the MetaLibrarian.
  • Provides advanced capabilities to applications by performing tasks such as
    grabbing album art, downloading reviews/bios etc. once and making this
    additional metadata available to client applications very easily.
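
To make the "common store" a bit more concrete, here is a rough sketch in Python using the standard sqlite3 module. Everything in it is an assumption made up for illustration: the table names (track, usage), the column set, and the helper functions open_store, add_track, record_play and playcount are not part of any existing application. It only shows the shape of the idea: one table of tags, one table of usage data keyed on a client name.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) the shared store. The schema here is hypothetical."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS track (
            id           INTEGER PRIMARY KEY,
            path         TEXT UNIQUE NOT NULL,
            title        TEXT,
            artist       TEXT,
            album        TEXT,
            genre        TEXT,
            track_number INTEGER
        );
        CREATE TABLE IF NOT EXISTS usage (
            track_id  INTEGER NOT NULL REFERENCES track(id),
            client_id TEXT NOT NULL,
            playcount INTEGER NOT NULL DEFAULT 0,
            PRIMARY KEY (track_id, client_id)
        );
    """)
    return db

def add_track(db, path, **tags):
    """Cache the tags for one file and return its row id."""
    cur = db.execute(
        "INSERT INTO track (path, title, artist, album, genre, track_number) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (path, tags.get("title"), tags.get("artist"), tags.get("album"),
         tags.get("genre"), tags.get("track_number")),
    )
    return cur.lastrowid

def record_play(db, track_id, client_id):
    """Bump the playcount for one track as seen by one client."""
    db.execute(
        "INSERT OR IGNORE INTO usage (track_id, client_id) VALUES (?, ?)",
        (track_id, client_id),
    )
    db.execute(
        "UPDATE usage SET playcount = playcount + 1 "
        "WHERE track_id = ? AND client_id = ?",
        (track_id, client_id),
    )

def playcount(db, track_id, client_id=None):
    """Return the playcount for one client, or the global total if client_id is None."""
    sql = "SELECT COALESCE(SUM(playcount), 0) FROM usage WHERE track_id = ?"
    args = [track_id]
    if client_id is not None:
        sql += " AND client_id = ?"
        args.append(client_id)
    return db.execute(sql, args).fetchone()[0]
```

With something like this, Amarok and MythMusic would both call record_play against the same store, and either of them could ask for its own count or the total across all clients.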


What the MetaLibrarian does not do

  • Play music
  • Rip CDs
  • Download music
  • Supply music data (cf. DAAP)


How does the MetaLibrarian do this?

  • It runs as a service/daemon. This can be on the same machine as
    your music application in a single desktop environment, or on a central
    server in your networked environment.
  • It is network transparent (TCP/UDP or something else?) such that
    it is scalable from a single computer setup to a multi-client network
    system.
  • It connects to common relational databases e.g. MySQL or Postgres, or uses
    its own built-in database (e.g. SQLite) to actually store and perform
    queries on the metadata.
  • It can inform connected clients of changes to the metadata (such that two apps can run concurrently on different machines etc.)
  • It runs background tasks such as downloading album art, reviews, lyrics etc. when not doing anything better!
  • It stores complex rules for organising hierarchies and caches the
    results of said rules for fast access on playback. This way people can
    organise their classical, compilation or regular albums exactly how
    they want.
  • It requests a unique "client identifier" from each client such that
    stats like playcount etc. can be aggregated on a per client basis or
    globally depending on the user's/client application's preference. This
    would also permit the ability to store client specific preferences
    (such as overrides to the global hierarchy options or private playlists
    etc).
  • It queues tasks such as writing metadata to files until a later date.
  • It makes available a simple API such that a client application can
    link against this API and gain access to the MetaLibrarian. As some
    applications will want the ability to work without the MetaLibrarian
    (understandably), this API should permit another linking option such
    that the same API (with the exception of some initialisation routines)
    can be used for a 100% local operation, with metadata extraction,
    storage and querying all built in.
  • In network mode, it provides a method for identifying equivalent file
    paths (as MetaLibrarian does not supply audio data, it should allow for
    different mount paths for various file systems e.g. NFS, Samba etc. on
    the client than on the librarian server).
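
The equivalent file paths point is probably the easiest one to show in code. Below is a minimal sketch, assuming each client hands the librarian a table of "server prefix -> client prefix" mount mappings. The function name translate_path and the mapping format are my own invention for illustration, and it only handles POSIX style paths.

```python
from pathlib import PurePosixPath

def translate_path(server_path, mounts):
    """Map a path as the librarian server sees it to the client's view.

    mounts is a dict of server-side prefix -> client-side prefix
    (a hypothetical per-client setting). The longest matching prefix wins.
    Returns None if the file is not under any mapped prefix.
    """
    path = PurePosixPath(server_path)
    for server_prefix in sorted(mounts, key=len, reverse=True):
        try:
            relative = path.relative_to(server_prefix)
        except ValueError:
            continue  # not under this prefix, try the next one
        return str(PurePosixPath(mounts[server_prefix]) / relative)
    return None
```

So if the server keeps music under /srv/music and my desktop mounts that over NFS at /mnt/nfs/music, then translate_path("/srv/music/rock/a.ogg", {"/srv/music": "/mnt/nfs/music"}) gives the path the desktop player should actually open.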


What does the MetaLibrarian facilitate

  • All music programs to work on a common set of metadata, rather
    than reinvent the wheel every time. Therefore a common ability to
    read/write metadata to different file types.
  • The same setup/playlists/statistics etc. to be available to every
    application you use/everyone on your network/in your house uses.
  • VFS cleverness, e.g. a Gnome VFS view/views of your hierarchical music collection.
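
As a rough idea of how a hierarchy rule such as genre/artist/album/track could be turned into a browsable tree (the sort of thing a VFS view would expose), here is a small sketch. The function build_hierarchy, the dict-based track records and the "Unknown" fallback for missing tags are all assumptions for illustration; a real librarian would run this against its database and cache the result.

```python
def build_hierarchy(tracks, rule):
    """Arrange tracks into nested dicts following a rule like "genre/artist/album/track".

    tracks is a list of dicts holding tag values plus a "path" key.
    Each rule component names a tag; the last component becomes the leaf,
    which maps to the file path. Missing tags fall back to "Unknown".
    """
    levels = rule.split("/")
    tree = {}
    for track in tracks:
        node = tree
        for level in levels[:-1]:
            node = node.setdefault(track.get(level) or "Unknown", {})
        # If two tracks share a leaf name, the first one seen is kept.
        node.setdefault(track.get(levels[-1]) or "Unknown", track["path"])
    return tree
```

The same list of tracks can then be presented under "genre/artist/album/track" for one client and "artist/album/track" for another, just by changing the rule string.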


Some thoughts

  • May want/need user based preferences as well as client based
    preferences. Perhaps even user based authentication for an environment
    that requires security.
  • Could be extended to Images for the same reasons. DigiKam rocks,
    but it would be nice if MythGallery could tap into the same information
    repository, or a Web Gallery (similar to Menalto) if you host a simple website from within your network (I do!)

Conclusion

Right. I've no doubt missed out some of the ideas I've thought of over
the last few days, but I think it's all there. Now I'd really like some
feedback on this!!

Questions

  1. Is this just plain stupid?
  2. Are there any similar systems out there?
  3. Do you think the Librarian should do more? (e.g. supply music data/rip CDs?)
  4. Do you think this is half covered by another technology that
    could be embraced and extended (I'm thinking mainly of DAAP here, would
    it just be better to make DAAP do all this? Can it do most of this
    already? Should it just be a matter of creating a kick ass DAAP daemon?)
  5. If the answers to the above are No, No, Maybe, No... does anyone want to help?

You can mail me on: metalibrarian (aht) colin (d0t) guthr (d0t) ie

Thanks for reading.

Col.
