Writing an audio decoder with OS X Core Audio


This article covers how to start writing an audio decoder that can be invoked by QuickTime or other client applications. Writing an audio encoder involves very similar steps, but this article specifically covers decoders (simply because I have experience writing them). Unlike QuickTime 7 codecs, a Core Audio codec is written in C++ and glued to the C interface by boilerplate code provided by Apple, which makes codec development very convenient. There is still a learning curve, and there are pitfalls that any newcomer will encounter, some of them unexpected. This article discusses those ‘gotchas’.

Sample codecs for the IMA4 format can be found on the Apple Developer website.

The sample code provided by Apple includes boilerplate that implements all the C interfaces required to interact with the Component Manager and the ACPlugin architecture. The Component Manager is a deprecated plugin architecture, so it is supported only for backward compatibility; ACPlugin must be implemented for newer versions of OS X. The C++ class hierarchy is as follows:

Core Audio Class Hierarchy

If you are writing only an encoder or only a decoder, you can modify the ‘Codec’ class directly without deriving a new class from it. If you are implementing both, two separate classes (one for the encoder and one for the decoder) make the design much simpler, as the sample code shows.

Some terminology

  • Codec – Refers to both encoder and decoder.
  • Linear Pulse Code Modulation (LPCM) – Refers to uncompressed audio data whose amplitude values are linear.
  • AudioStreamBasicDescription (ASBD) – Refers to a C struct which carries information about a format, such as channel count, bytes per packet, frames per packet etc.
  • Sample – Refers to one audio data value for one channel.
  • Frame – Refers to one group of samples containing one sample for each channel.
  • Packet – In a compressed format, one packet contains many frames, but in uncompressed LPCM, one packet has exactly one frame. In addition, a compressed format packet may carry header information such as presentation timestamps.
  • Magic Cookie – Opaque data sent by the client. An example is information about the audio taken from the file container header.
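
To make the sample/frame/packet relationship concrete, here is a minimal sketch of the arithmetic for interleaved LPCM. ‘FormatSketch’ and ‘MakeInterleavedLPCM’ are hypothetical names invented for this illustration; they only mirror the size-related idea of an ASBD and are not Apple's struct or API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the size-related fields of an ASBD.
// The real AudioStreamBasicDescription comes from Apple's headers.
struct FormatSketch {
    double   sampleRate;        // frames per second
    uint32_t channelsPerFrame;  // number of samples in one frame
    uint32_t bitsPerChannel;    // size of one sample in bits
    uint32_t framesPerPacket;   // always 1 for uncompressed LPCM
    uint32_t bytesPerFrame;     // one sample per channel
    uint32_t bytesPerPacket;    // frames per packet times bytes per frame
};

// Describe interleaved LPCM: one packet holds exactly one frame,
// and one frame holds one sample for every channel.
FormatSketch MakeInterleavedLPCM(double sampleRate, uint32_t channels,
                                 uint32_t bitsPerSample) {
    assert(bitsPerSample % 8 == 0);  // this sketch assumes whole-byte samples
    FormatSketch f{};
    f.sampleRate       = sampleRate;
    f.channelsPerFrame = channels;
    f.bitsPerChannel   = bitsPerSample;
    f.framesPerPacket  = 1;
    f.bytesPerFrame    = channels * (bitsPerSample / 8);
    f.bytesPerPacket   = f.framesPerPacket * f.bytesPerFrame;
    return f;
}
```

For 16-bit stereo, ‘MakeInterleavedLPCM(44100.0, 2, 16)’ yields 4 bytes per frame and, because LPCM has one frame per packet, 4 bytes per packet.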

Xcode project type

The project type must be a ‘bundle’ type, and the final build must contain both a compiled .rsrc resource file (for the Component Manager) and a .plist (for ACPlugin) detailing the codec name, the company name, whether it is a decoder and/or encoder etc. Look for the .r (uncompiled resource file) and .plist files in Apple’s codec sample.

Functions to implement

The following are the main functions that you will need to implement:

  1. GetPropertyInfo(): This function is called by the client to query the size of the buffer (in bytes) it must provide to your codec to read a specific property value. After calling this function, the client allocates that amount of space and calls ‘GetProperty()’ to query that specific property.
  2. GetProperty(): This function allows clients to read a property from your codec. Properties of a codec include information like initialization state, supported input/output audio formats, which output formats are supported for a specific input format, maximum audio packet size for input, number of frames in an output packet, whether the frame bit rate is variable etc. If you want to return the default value, call your parent class’s GetProperty() function.
  3. SetProperty(): This function allows clients to set a property. A codec may not support setting a specific property, or may reject any attempt to set any property, by throwing an unsupported exception.
  4. SetCurrentInputFormat(): This function allows clients to tell your codec what audio format they will be giving it. The format information is provided in an ‘AudioStreamBasicDescription’ struct. You may reject this format by returning an error code.
  5. SetCurrentOutputFormat(): This function allows clients to tell your codec what audio format they expect out of it. For decoders, this is usually the Linear Pulse Code Modulation (LPCM) format. Again, an ‘AudioStreamBasicDescription’ struct is used to communicate this information. You may reject this format by returning an error code.
  6. Initialize()/Uninitialize(): ‘Initialize()’ is called by the client to put your codec in the initialized state. This means the client will no longer alter the input and output formats agreed earlier (set using the SetCurrentInputFormat()/SetCurrentOutputFormat() functions). The client can put the codec back in the uninitialized state by calling ‘Uninitialize()’. Do not assume that the client will destroy the class instance after calling ‘Uninitialize()’; it is free to call ‘Initialize()’ again. The parameters of ‘Initialize()’ are the input format, the output format and a magic cookie (i.e. opaque data sent by a client such as an importer), if any. Be advised that these parameters are pointers and hence optional. In some invocations you may be given input and output formats but no magic cookie. The client may then call ‘Uninitialize()’ and call ‘Initialize()’ again with NULL pointers for the input and output formats but a valid magic cookie pointer. Hence, you are expected to copy whatever you receive into your class member variables.
  7. AppendInputData(): This function is called by the client to provide input packets (compressed audio packets in the case of a decoder). You are expected to save a copy of each packet in your circular buffer (implemented by the SimpleCodec class in the sample code). The client may provide more than one packet at a time; the ‘ioNumberPackets’ argument gives the number provided. You must report how many packets you actually added to your queue through the same ‘ioNumberPackets’ reference argument. Be advised that once all packets have been provided, this function may be called with ‘ioNumberPackets’ set to ‘0’. In that case, return from this function without doing anything and return ‘kAudioCodecProduceOutputPacketAtEOF’ from ‘ProduceOutputPackets()’. If you specified ‘1’ for ‘kAudioCodecPropertyHasVariablePacketByteSizes’ and ‘kAudioCodecPropertyRequiresPacketDescription’ in ‘GetProperty()’, the client will provide a valid pointer for ‘inPacketDescription’, which points to a list of ‘AudioStreamPacketDescription’ structs. The number of elements in this list is given by the ‘ioNumberPackets’ parameter.
  8. ProduceOutputPackets(): This function is called by the client to ask your codec to process the packets it provided earlier through ‘AppendInputData()’. The ‘ioNumberPackets’ argument contains the number of packets you are expected to process from your circular buffer. The ‘outOutputData’ pointer specifies the memory location where you are expected to write your output data, and the ‘ioOutputDataByteSize’ parameter specifies the amount of space provided, in bytes. This space is usually the number of frames per packet you reported in ‘GetProperty()’ multiplied by the output format channel count multiplied by the output format sample size in bytes. If you are given insufficient space, store the number of bytes you need in the ‘ioOutputDataByteSize’ reference variable and throw an insufficient space exception. If you have successfully processed a packet and written the output data, you must tell the client how many frames were written and how much of the provided buffer space was used. The number of frames written must always be less than or equal to the number of frames per packet you reported in ‘GetProperty()’, and ‘ioOutputDataByteSize’ must return the number of frames written multiplied by the output format channel count multiplied by the output format sample size in bytes. If you have more frames than the maximum value, you can return ‘kAudioCodecProduceOutputPacketSuccessHasMore’ to notify the client that you want to keep working on the same packet because you have more data to output.
  9. Reset(): This is usually called by the client to ask your codec to discard the contents of its circular buffer and start again with an empty buffer.
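
The append/produce contract described above can be sketched without any Apple headers. Everything below is a hypothetical stand-in: ‘DecoderSketch’, ‘ProduceStatus’ and ‘PcmFormatSketch’ are names made up for this illustration, the "decode" step just writes silence, one packet is produced per call, and insufficient space is reported through a status value instead of an exception to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <deque>
#include <vector>

// Hypothetical status values; the real SDK defines its own constants.
enum class ProduceStatus { Success, NeedsMoreInput, AtEOF, NotEnoughSpace };

struct PcmFormatSketch { uint32_t channels; uint32_t bytesPerSample; };

class DecoderSketch {
public:
    explicit DecoderSketch(uint32_t framesPerPacket)
        : mFramesPerPacket(framesPerPacket) {}

    // Every pointer is optional, so copy whatever this call supplies.
    void Initialize(const PcmFormatSketch* outputFormat,
                    const void* cookie, uint32_t cookieSize) {
        if (outputFormat) mOutput = *outputFormat;
        if (cookie && cookieSize > 0) {
            const uint8_t* bytes = static_cast<const uint8_t*>(cookie);
            mCookie.assign(bytes, bytes + cookieSize);
        }
        mInitialized = true;
    }
    void Uninitialize() { mInitialized = false; }  // Initialize() may follow
    bool IsInitialized() const { return mInitialized; }
    std::size_t CookieSize() const { return mCookie.size(); }

    // Queue up to ioNumberPackets packets and report how many were accepted.
    void AppendInputData(const std::vector<std::vector<uint8_t>>& packets,
                         uint32_t& ioNumberPackets) {
        assert(ioNumberPackets <= packets.size());
        if (ioNumberPackets == 0) { mSawEndOfInput = true; return; }
        uint32_t accepted = 0;
        while (accepted < ioNumberPackets && mQueue.size() < kMaxQueued) {
            mQueue.push_back(packets[accepted]);
            ++accepted;
        }
        ioNumberPackets = accepted;
    }

    // Produce one packet of output, or say why that is not possible.
    ProduceStatus ProduceOutputPacket(void* outOutputData,
                                      uint32_t& ioOutputDataByteSize,
                                      uint32_t& outFrames) {
        outFrames = 0;
        const uint32_t needed =
            mFramesPerPacket * mOutput.channels * mOutput.bytesPerSample;
        if (mQueue.empty()) {
            ioOutputDataByteSize = 0;
            return mSawEndOfInput ? ProduceStatus::AtEOF
                                  : ProduceStatus::NeedsMoreInput;
        }
        if (ioOutputDataByteSize < needed) {
            ioOutputDataByteSize = needed;  // tell the client what is required
            return ProduceStatus::NotEnoughSpace;
        }
        std::memset(outOutputData, 0, needed);  // placeholder for real decoding
        mQueue.pop_front();
        ioOutputDataByteSize = needed;
        outFrames = mFramesPerPacket;
        return ProduceStatus::Success;
    }

    void Reset() { mQueue.clear(); mSawEndOfInput = false; }

private:
    static constexpr std::size_t kMaxQueued = 8;  // arbitrary queue capacity
    uint32_t mFramesPerPacket;
    PcmFormatSketch mOutput{2, 2};
    std::vector<uint8_t> mCookie;
    std::deque<std::vector<uint8_t>> mQueue;
    bool mInitialized = false;
    bool mSawEndOfInput = false;
};
```

Note how a second ‘Initialize()’ call with a null format pointer leaves the previously saved format untouched; that is the reason for copying each argument into a member variable.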

For a decoder, a QuickTime client may call your decoder class functions in the following order:

  1. Class constructor()
  2. GetPropertyInfo() with kAudioCodecPropertyFormatList
  3. GetProperty() with kAudioCodecPropertyFormatList: Return an ‘AudioStreamBasicDescription’ struct for each format supported by your codec
  4. Class destructor()
  5. Class constructor()
  6. GetProperty() with kAudioCodecPropertyNameCFString: Return the name of your codec
  7. Class destructor()
  8. Class constructor()
  9. GetPropertyInfo() with kAudioCodecPropertyOutputFormatsForInputFormat
  10. GetProperty() with kAudioCodecPropertyOutputFormatsForInputFormat: Return a list of ‘AudioStreamBasicDescription’ structs, one for each output format supported for the given input format. (NOTE: I could not get my QuickTime 7 client to accept LPCM in a planar/non-interleaved audio format. I got ‘AppendInputData()’ calls, but ‘ProduceOutputPackets()’ was never called. If anyone knows why, do let me know in the comments section.)
  11. GetProperty() with kAudioCodecPropertyIsInitialized: Return ‘0’ to show that the codec hasn’t been initialized
  12. Initialize() with valid input and output format parameters but NULL for magic cookie pointer
  13. GetProperty() with kAudioCodecPropertyIsInitialized: Return ‘1’ to show that the codec has been initialized
  14. GetProperty() with kAudioCodecPropertyCurrentInputFormat: Return the currently set input format
  15. GetProperty() with kAudioCodecPropertyCurrentOutputFormat: Return the currently set output format
  16. GetProperty() with undocumented property ‘grdy’: Pass to lower base class which throws unknown property error
  17. GetProperty() with kAudioCodecPropertyInputBufferSize: Return the maximum size of your circular buffer in bytes
  18. GetProperty() with undocumented property ‘pakx’: Pass to lower base class which throws unknown property error
  19. GetProperty() with kAudioCodecPropertyMaximumPacketByteSize: Return the maximum size of input packet you can handle in bytes
  20. GetProperty() with kAudioCodecPropertyPacketFrameSize: Return the number of frames contained in one packet of the input format. If the number of frames per packet is variable, return the maximum.
  21. GetProperty() with kAudioCodecPropertyMinimumNumberOutputPackets: Return ‘1’ to indicate that you output at least one packet. Passing the request to the class below yours does the same thing
  22. GetProperty() with kAudioCodecPropertyIsInitialized: Return ‘1’ to show that the codec has been initialized
  23. GetPropertyInfo() with kAudioCodecPropertyCurrentOutputChannelLayout: Return the sizeof(struct AudioChannelLayout)
  24. GetProperty() with kAudioCodecPropertyCurrentOutputChannelLayout: Fill and return an ‘AudioChannelLayout’ struct specifying how audio channel data is mapped
  25. GetProperty() with kAudioCodecPropertyIsInitialized: Return ‘1’ to show that the codec has been initialized
  26. Uninitialize()
  27. Initialize() with NULL input and output format parameters but valid magic cookie pointer and size, if any
  28. The same calls as in steps 13 to 21
  29. GetPropertyInfo() with kAudioCodecPropertyPrimeInfo: Return sizeof(struct AudioCodecPrimeInfo)
  30. GetProperty() with kAudioCodecPropertyPrimeInfo: Return any leading and trailing frame information
  31. GetProperty() with kAudioCodecPropertyIsInitialized: Return ‘1’ to show that the codec has been initialized
  32. GetPropertyInfo() with kAudioCodecPropertyUsedInputBufferSize: Return 0 at this point because your circular buffer is empty
  33. Reset()
  34. GetProperty() with kAudioCodecPropertyIsInitialized: Return ‘1’ to show that the codec has been initialized
  35. GetPropertyInfo() with kAudioCodecPropertyUsedInputBufferSize: Return 0 at this point because your circular buffer is empty
  36. AppendInputData()
  37. ProduceOutputPackets()
  38. The calls in steps 36 and 37 repeat until all packets are processed
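
Most of the sequence above is the same two-step pattern: ask for the size with GetPropertyInfo(), then read the value with GetProperty(). Undocumented selectors fall through to the base class, which throws. Here is a minimal sketch of that pattern, assuming made-up names throughout: ‘PropertySketch’, ‘BaseCodecSketch’, ‘MyDecoderSketch’ and ‘QueryUInt32’ are hypothetical, and real property selectors are four-character codes from Apple's headers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical property IDs, used only in this sketch.
enum PropertySketch : uint32_t {
    kIsInitializedSketch   = 1,
    kPacketFrameSizeSketch = 2,
    kUndocumentedSketch    = 99
};

// Stand-in for the lowest class in the hierarchy: it rejects everything.
class BaseCodecSketch {
public:
    virtual ~BaseCodecSketch() = default;
    virtual void GetPropertyInfo(uint32_t /*id*/, uint32_t& /*outSize*/) const {
        throw std::invalid_argument("unknown property");
    }
    virtual void GetProperty(uint32_t /*id*/, uint32_t& /*ioSize*/,
                             void* /*outData*/) const {
        throw std::invalid_argument("unknown property");
    }
};

class MyDecoderSketch : public BaseCodecSketch {
public:
    void Initialize()   { mInitialized = true; }
    void Uninitialize() { mInitialized = false; }

    // Step 1: report how many bytes the client must allocate.
    void GetPropertyInfo(uint32_t id, uint32_t& outSize) const override {
        switch (id) {
            case kIsInitializedSketch:
            case kPacketFrameSizeSketch:
                outSize = sizeof(uint32_t);
                break;
            default:  // not handled here: let the base class reject it
                BaseCodecSketch::GetPropertyInfo(id, outSize);
        }
    }

    // Step 2: write the value into the client's buffer.
    void GetProperty(uint32_t id, uint32_t& ioSize, void* outData) const override {
        uint32_t value = 0;
        switch (id) {
            case kIsInitializedSketch:   value = mInitialized ? 1 : 0; break;
            case kPacketFrameSizeSketch: value = 64; break;  // arbitrary value
            default:
                BaseCodecSketch::GetProperty(id, ioSize, outData);
                return;
        }
        if (ioSize < sizeof(value)) throw std::length_error("buffer too small");
        std::memcpy(outData, &value, sizeof(value));
        ioSize = sizeof(value);
    }

private:
    bool mInitialized = false;
};

// What a client does: size query, allocation, then value query.
uint32_t QueryUInt32(const BaseCodecSketch& codec, uint32_t id) {
    uint32_t size = 0;
    codec.GetPropertyInfo(id, size);
    std::vector<uint8_t> buffer(size);
    codec.GetProperty(id, size, buffer.data());
    assert(size == sizeof(uint32_t));
    uint32_t value = 0;
    std::memcpy(&value, buffer.data(), sizeof(value));
    return value;
}
```

Because the object can survive an ‘Uninitialize()’ and be initialized again, the initialized flag is looked up on every query rather than cached by the client.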

Installing your codec

To install your codec bundle folder, copy it to:

  • For system wide installation: /Library/Audio/Plug-Ins/Components
  • For user specific installation: ~/Library/Audio/Plug-Ins/Components