Saturday, May 14, 2011

Beginning of the GSoC 2011 multichannel project

It's time to start the project again, and since I'm afraid some personal things might keep me busy in the summer, I'm getting a head start. The idea of the project is to get multichannel audio working WELL in the XMMS2 music player. There are several patches around, but they partly conflict and need cleaning up. I will start with the multichannel support; the higher quality resampling that is also included will be handled later.

One of the most difficult issues in multichannel handling is the channel order. As weird as it sounds, there is no standard for channel order and everyone basically uses what they wish. After looking through a lot of options I've come to the conclusion that we should definitely be using the channel order described in the Microsoft documentation, combined with a channel mask. As the Microsoft website already says, "Several external standards define parts of the following master channel layout". The motivation is further explained below.
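
To make the channel mask idea concrete, here is a minimal Python sketch (not XMMS2 code). The bit values are the first six speaker positions of the Microsoft master layout; the short labels follow the abbreviations used in this post, and the function name `channels_from_mask` is only a hypothetical helper for illustration.

```python
# First six speaker-position bits of the Microsoft master channel layout.
# Labels follow this post's abbreviations; SL/SR correspond to the
# "back left" / "back right" positions in Microsoft's naming.
SPEAKER_BITS = [
    ("FL", 0x01),
    ("FR", 0x02),
    ("FC", 0x04),
    ("LFE", 0x08),
    ("SL", 0x10),
    ("SR", 0x20),
]

MASK_5_1 = 0x3F   # all six bits set
MASK_QUAD = 0x33  # FL, FR, SL, SR: center and LFE masked out


def channels_from_mask(mask):
    """Return the channels present in a stream, in master-layout order."""
    return [name for name, bit in SPEAKER_BITS if mask & bit]


print(channels_from_mask(MASK_5_1))
print(channels_from_mask(MASK_QUAD))
```

The point of the design is that the order is always fixed by the master layout, and the mask only says which positions are actually present in the stream.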

First let's take a look at the most common multichannel audio formats and their 5.1 channel assignments (FL=front left, FR=front right, FC=front center, LFE=low frequency effects, SL=surround/back left, SR=surround/back right):
  1. WAV/MP3/FLAC: FL, FR, FC, LFE, SL, SR
  2. DTS/AAC: FC, FL, FR, SL, SR, LFE
  3. AC3/Vorbis: FL, FC, FR, SL, SR, LFE
  4. AIFF: FL, SL, FC, FR, SR, LFE
The most popular formats for surround files, in my own experience, are FLAC, AC3 and DTS. The suggested channel layout works with FLAC 5.1 files without any changes, but for FLAC 4.0 files, for example, the center and LFE channels need to be masked out with the channel mask. The DTS and AC3 channel assignments are not supported by any known output and require remapping in any case. This should be easy to do in the decoder anyway.
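
As a rough illustration of the decoder-side remapping, the following Python sketch reorders interleaved frames from a format's native order into the suggested order. It is only a sketch under assumptions: the helper names `remap_table` and `remap_interleaved` are made up for this post, and plain Python lists stand in for real sample buffers.

```python
# Suggested (Microsoft) order, as listed for WAV/MP3/FLAC above.
CANONICAL = ["FL", "FR", "FC", "LFE", "SL", "SR"]

# Native 5.1 orders taken from the format list above.
DECODER_ORDER = {
    "DTS/AAC": ["FC", "FL", "FR", "SL", "SR", "LFE"],
    "AC3/Vorbis": ["FL", "FC", "FR", "SL", "SR", "LFE"],
}


def remap_table(src_order, dst_order=CANONICAL):
    """For each destination slot, the index of that channel in a source frame."""
    return [src_order.index(ch) for ch in dst_order]


def remap_interleaved(samples, src_order, dst_order=CANONICAL):
    """Reorder every frame of an interleaved sample list."""
    n = len(src_order)
    if len(samples) % n != 0:
        raise ValueError("sample count is not a whole number of frames")
    table = remap_table(src_order, dst_order)
    out = []
    for start in range(0, len(samples), n):
        frame = samples[start:start + n]
        out.extend(frame[i] for i in table)
    return out


# One AC3 frame: FL=10, FC=20, FR=30, SL=40, SR=50, LFE=60
ac3_frame = [10, 20, 30, 40, 50, 60]
print(remap_interleaved(ac3_frame, DECODER_ORDER["AC3/Vorbis"]))
```

The table is computed once per stream, so the per-frame work is just six indexed copies.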

Then we need to have a look at the known audio systems. Their default channel assignments for 5.1 channel audio are as follows:
  1. ALSA: FL, FR, SL, SR, FC, LFE
  2. OSS4: FL, FR, FC, LFE, SL, SR
  3. MacOSX: FL, FR, FC, LFE, SL, SR
  4. Win32: FL, FR, FC, LFE, SL, SR
PulseAudio can easily handle all of these configurations and has channel remapping built in, so we can leave it out of the comparison. It is quite clear that ALSA is different from all the others, and there are quite good reasons for that. First of all, ALSA, like most outputs, supports a user-configured channel order, so the order can be changed in the ALSA configuration. More importantly, with the ALSA channel mapping it is possible to output 4.0 quadraphonic audio without any channel masks, because the first four channels of the 4.0 and 5.1 layouts are identical. However, going with the rather non-standard (although de facto standard on Linux) ALSA channel mapping is not worth it if we want to do things cleanly.
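
The output side can be sketched the same way. This Python snippet (again an illustration, with the hypothetical helpers `needs_remap` and `to_output_frame`) uses the default orders from the list above to show that only ALSA needs a reorder from the suggested layout, and that the ALSA order starts with the four quadraphonic channels.

```python
# Suggested (Microsoft) order for 5.1 audio.
CANONICAL = ["FL", "FR", "FC", "LFE", "SL", "SR"]

# Default 5.1 orders of the outputs listed above.
OUTPUT_ORDER = {
    "ALSA": ["FL", "FR", "SL", "SR", "FC", "LFE"],
    "OSS4": ["FL", "FR", "FC", "LFE", "SL", "SR"],
    "MacOSX": ["FL", "FR", "FC", "LFE", "SL", "SR"],
    "Win32": ["FL", "FR", "FC", "LFE", "SL", "SR"],
}

QUAD = ["FL", "FR", "SL", "SR"]


def needs_remap(output):
    """True if the output's default order differs from the suggested order."""
    return OUTPUT_ORDER[output] != CANONICAL


def to_output_frame(frame, output):
    """Reorder one frame from the suggested order to the output's order."""
    by_name = dict(zip(CANONICAL, frame))
    return [by_name[ch] for ch in OUTPUT_ORDER[output]]


for name in OUTPUT_ORDER:
    print(name, "needs remap:", needs_remap(name))

# The first four ALSA channels are exactly the 4.0 layout.
print(OUTPUT_ORDER["ALSA"][:4] == QUAD)
```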

If we compare the two lists, audio format channel order and audio output channel order, it is easy to see that the suggested channel order matches the most entries in both, and we can do without channel masking. Note, though, that this is only true for 5.1 audio; if we want to include the 4.0 audio mentioned above, we need support for a channel mask. That means that some channels can be masked out of the audio stream and therefore ignored.
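
The comparison can also be done mechanically. The short Python sketch below simply tallies how many rows of the two lists above use each 5.1 order; the `tally` helper is a made-up name, and each list row counts once even when it groups several formats.

```python
from collections import Counter

# Rows of the format list above.
FORMATS = {
    "WAV/MP3/FLAC": ("FL", "FR", "FC", "LFE", "SL", "SR"),
    "DTS/AAC": ("FC", "FL", "FR", "SL", "SR", "LFE"),
    "AC3/Vorbis": ("FL", "FC", "FR", "SL", "SR", "LFE"),
    "AIFF": ("FL", "SL", "FC", "FR", "SR", "LFE"),
}

# Rows of the output list above.
OUTPUTS = {
    "ALSA": ("FL", "FR", "SL", "SR", "FC", "LFE"),
    "OSS4": ("FL", "FR", "FC", "LFE", "SL", "SR"),
    "MacOSX": ("FL", "FR", "FC", "LFE", "SL", "SR"),
    "Win32": ("FL", "FR", "FC", "LFE", "SL", "SR"),
}


def tally(*tables):
    """Count how many rows across the given tables use each channel order."""
    counts = Counter()
    for table in tables:
        counts.update(table.values())
    return counts


counts = tally(FORMATS, OUTPUTS)
best_order, best_count = counts.most_common(1)[0]
print(best_order, best_count)
```

With these rows the suggested order is shared by four of the eight entries, while every other order appears only once.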

I hope this is a good enough introduction to multichannel audio. I will get back to it later, when I describe how it affects the converter and what should be done there.
