Implementing a custom captions renderer for Android VideoView

Starting with Android KitKat (API level 19), developers can call ‘addSubtitleSource()’ on VideoView to add a WebVTT format subtitle track. That’s it. Nothing else to do. The native subtitle handler even listens for the system accessibility caption settings and changes its drawing style to match. (FYI – System captions settings are under Settings->Accessibility->Captions) Awesome!

Except that it isn’t. Most application workflows have a captions button that lets the user turn captions on or off. Unfortunately, VideoView has no public member functions that let the developer, on behalf of the app user, control captions visibility from within the app. All the developer can do is redirect the user to the system accessibility page using an ‘Intent’. Even then, the captions settings are global. Sure, if the user has a permanent hearing disability, this is no problem. They would want to turn on captions for all apps with one switch. Unfortunately, this is not always the case. Even a user without a hearing impairment may want to mute the sound and watch the video with captions in a quiet environment. For that, the app needs its own button for toggling captions. Another, though minor, quirk is that ‘addSubtitleSource()’ works with the WebVTT format, while the ubiquitous captions format is SRT. This requires you to wire in an intermediate converter (a simple SRT to WebVTT converter that I wrote for testing can be downloaded here). Hence, there are at least two strong cases for implementing a custom captions handler.
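As a rough illustration of what such an intermediate converter involves (this is not the converter linked above, and the class name ‘SrtToWebVtt’ is made up for this sketch), the two formats differ mainly in the ‘WEBVTT’ header line and in the decimal separator used in the timing lines. The sketch assumes well-formed SRT input with two-digit hours and ignores SRT styling tags:

```java
import java.util.regex.Pattern;

/** Hypothetical helper: a minimal SRT to WebVTT text conversion. */
final class SrtToWebVtt {
    // Matches an SRT timing line such as "00:00:01,200 --> 00:00:02,000".
    private static final Pattern TIMING = Pattern.compile(
            "(?m)^(\\d{2}:\\d{2}:\\d{2}),(\\d{3}) --> (\\d{2}:\\d{2}:\\d{2}),(\\d{3})[ \\t]*$");

    static String convert(String srt) {
        // Drop a byte order mark if present and normalize line endings to "\n".
        String text = srt.replace("\uFEFF", "")
                         .replace("\r\n", "\n")
                         .replace('\r', '\n');
        // WebVTT uses '.' instead of ',' before the milliseconds.
        String cues = TIMING.matcher(text).replaceAll("$1.$2 --> $3.$4");
        // A WebVTT file starts with the "WEBVTT" line followed by a blank line.
        return "WEBVTT\n\n" + cues.trim() + "\n";
    }
}
```

The SRT sequence numbers are left in place, since WebVTT allows a cue identifier line before each timing line. Commas inside the caption text are untouched because only lines shaped like a timing line are rewritten.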

Android Custom Captions Renderer

1. Get an instance of internal MediaPlayer

VideoView holds a private MediaPlayer instance and implements ‘addSubtitleSource()’ on top of it. So, the first plan of attack is to get hold of this MediaPlayer object. One way of doing this is to use Java’s reflection capabilities to call hidden member functions in the VideoView class – i.e. ‘VideoView.class.getDeclaredMethod(“<Method name>”).invoke(<Params>)’. This, of course, is very ‘hacky’ and we want to stay away from it as much as we can in production code. Fortunately, there’s another way. If we derive a new class from ‘VideoView’ and handle its prepared event (the listener registered with ‘setOnPreparedListener()’), the ‘onPrepared()’ callback receives the internal MediaPlayer object as its function parameter. When this event fires, the video player has loaded the video and is ready to accept a captions track. Using MediaPlayer’s ‘addTimedTextSource()’, we can add an external captions track. This can be in SRT format.

2. Handle special condition where caption file is on the network

If your captions file is on the network, though, we have another hurdle to jump. ‘addTimedTextSource()’ accepts only a local file path or a local file descriptor, so we have to download the captions to a local file in a temp directory and then add code to clean the file up when it is no longer needed. If your code fails to clean up properly, you will be cluttering the user’s precious storage space. Alternatively, we could take another path and use ‘MemoryFile’. A MemoryFile is a file mapped into memory that is used for inter-process communication. The nice thing about this is that the system takes care of the file maintenance while we just read from and write to memory. Using Java’s reflection capabilities, we can call the hidden member ‘getFileDescriptor()’ on MemoryFile to get a ‘FileDescriptor’ to use with ‘addTimedTextSource()’. I would argue that though ‘getFileDescriptor()’ is hidden, it is safe to call. It ties directly to the underlying Linux API and there isn’t any reason for the Android framework to change or remove this function in the future.
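For the first option (download to a temp file, then clean up), a minimal sketch in plain Java might look like the following. The class name ‘CaptionTempFile’ is hypothetical, the input stream is assumed to come from whatever network code the app already has, and ‘java.nio.file’ is assumed to be available (on Android that means API level 26 or later; older targets would need ‘java.io.File’ and the app’s cache directory instead):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: owns a temporary captions file and deletes it on close(). */
final class CaptionTempFile implements AutoCloseable {
    private final Path path;

    private CaptionTempFile(Path path) {
        this.path = path;
    }

    /** Copies the whole stream (e.g. a downloaded SRT file) into a new temp file. */
    static CaptionTempFile fromStream(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("captions", ".srt");
        try {
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            // Do not leave a half-written file behind if the download fails.
            Files.deleteIfExists(tmp);
            throw e;
        }
        return new CaptionTempFile(tmp);
    }

    /** Local path that can be handed to an API that only accepts local files. */
    Path path() {
        return path;
    }

    @Override
    public void close() throws IOException {
        Files.deleteIfExists(path);
    }
}
```

Tying the file’s lifetime to an object with ‘close()’ makes the cleanup step hard to forget: the activity that owns the player can close it when playback ends.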

3. Rendering captions at specific times

Setting a listener with ‘MediaPlayer.setOnTimedTextListener()’ only asks MediaPlayer to notify us, through the ‘onTimedText’ event, when the start or end time of a caption is reached during playback. We still have to put a TextView on top of the VideoView to show the captions. This can be done easily in the video activity. When the listener is invoked, it receives a ‘TimedText’ object as a function parameter. The ‘getText()’ method of this parameter returns the caption text to be rendered at this playback time. If the end time for a caption is reached, this returns a null string. Then, we just hide the captions TextView. There is also a ‘getBounds()’ method in the ‘TimedText’ object, which may optionally return a non-null ‘Rect’ object. It is a valid Rect when the input captions file contains positioning information. A rigorous implementation will use these bounds to move the TextView accordingly. One case where this is useful is when an object or person important to the video’s subject matter is in the bottom part of the video, which is usually occluded by the captions text.

Pitfalls

The native SRT parser is confused by subtitle information blocks where there is sequence and timing information but the captions text is empty. For instance, as in caption sequence 2 below:

1
00:00:00,000 --> 00:00:00,100
Caption text 1

2
00:00:01,200 --> 00:00:02,000


3
00:00:04,000 --> 00:00:06,000
Caption text 3

One way to fix this is to replace the ‘\r\n\r\n\r\n’ (assuming Windows style line breaks) after the end time in sequence 2 with ‘\r\n{Any non-white space character here}\r\n\r\n’.
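A line-based variant of this fix can be sketched as below. The class name ‘EmptyCueFixer’ and the choice of placeholder are assumptions made for illustration. The sketch walks the file line by line and, whenever a timing line is not followed by any caption text, puts the placeholder where the text should have been:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper: gives empty SRT cues a placeholder text line. */
final class EmptyCueFixer {
    // A line that starts with "hh:mm:ss,mmm --> hh:mm:ss,mmm".
    private static final Pattern TIMING = Pattern.compile(
            "\\d{2}:\\d{2}:\\d{2},\\d{3} --> \\d{2}:\\d{2}:\\d{2},\\d{3}.*");

    static String fill(String srt, String placeholder) {
        // Keep whichever line break style the input already uses.
        String eol = srt.contains("\r\n") ? "\r\n" : "\n";
        String[] lines = srt.split("\r?\n", -1);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < lines.length; i++) {
            out.add(lines[i]);
            boolean timing = TIMING.matcher(lines[i]).matches();
            boolean noText = i + 1 >= lines.length || lines[i + 1].trim().isEmpty();
            if (timing && noText) {
                out.add(placeholder);
                // An empty text line followed by the blank separator line:
                // the placeholder replaces the empty text line.
                if (i + 2 < lines.length
                        && lines[i + 1].trim().isEmpty()
                        && lines[i + 2].trim().isEmpty()) {
                    i++;
                }
            }
        }
        return String.join(eol, out);
    }
}
```

Applied to the sample above with ‘-’ as the placeholder, only sequence 2 changes: it gains a ‘-’ line between its timing line and the blank separator.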

QR code generator console application

QR Code Generator

Source code: https://github.com/sanje2v/QRCodeGenerator

QR code generators are nothing new – they are all over the internet. Yet, I decided to write one as a console application, since I couldn’t find any (not that it would be groundbreaking in any way, but just to learn how QR code is implemented). This application implements QR code version 1 and encodes only in alphanumeric mode (hence it is limited to uppercase English letters, digits and a few symbols).
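To give an idea of what alphanumeric mode does, here is a small Java sketch (not taken from the project’s source; the class name ‘QrAlphanumeric’ is made up). Each character maps to a value from 0 to 44, pairs of characters are packed into 11 bits as 45 × first + second, and a trailing single character takes 6 bits. The sketch covers only the data bits and leaves out the mode indicator and character count that precede them:

```java
/** Hypothetical helper: QR alphanumeric mode data bits as a string of 0s and 1s. */
final class QrAlphanumeric {
    // The 45 characters of alphanumeric mode, in value order (0..44).
    private static final String CHARSET =
            "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:";

    static String encode(String text) {
        StringBuilder bits = new StringBuilder();
        for (int i = 0; i < text.length(); i += 2) {
            int first = valueOf(text.charAt(i));
            if (i + 1 < text.length()) {
                // A pair of characters becomes one 11-bit number.
                int pair = first * 45 + valueOf(text.charAt(i + 1));
                appendBits(bits, pair, 11);
            } else {
                // A leftover single character becomes a 6-bit number.
                appendBits(bits, first, 6);
            }
        }
        return bits.toString();
    }

    private static int valueOf(char c) {
        int value = CHARSET.indexOf(c);
        if (value < 0) {
            throw new IllegalArgumentException("Not an alphanumeric mode character: " + c);
        }
        return value;
    }

    private static void appendBits(StringBuilder out, int value, int width) {
        for (int bit = width - 1; bit >= 0; bit--) {
            out.append((value >> bit) & 1);
        }
    }
}
```

For example, ‘HE’ has the values 17 and 14, so the pair is 17 × 45 + 14 = 779, which is ‘01100001011’ in 11 bits.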

This project was not intended to cover all implementations of QR code versions and is not likely to be developed further in the future – except perhaps for a decoder, if I feel like it. Rather, I wanted to understand how a QR code is generated. One of the most important (and interesting) components is error correction, which covers topics like number theory, Galois fields and polynomial arithmetic. I had a lot of fun implementing these. An unexpected area I got to explore was masking, where the specification tries to remove unwanted patterns – i.e. patterns which may confuse decoders. The source code for this project will be helpful to anyone starting to learn QR code encoding.

Aside from the official QR code specification document, http://www.thonky.com/qr-code-tutorial/ provides a good overview with walk-through samples for various steps of the encoding process. As I am not a math buff, the guide was especially helpful to me in understanding binary polynomial division in a Galois field.
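The Galois field arithmetic and polynomial division mentioned above can be sketched in Java as follows. This is an illustrative reimplementation, not the project’s code, and the class name ‘Gf256’ is made up. It assumes the field QR codes use, GF(256) with the primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D), and builds the error correction codewords as the remainder of the data polynomial divided by a generator polynomial:

```java
/** Hypothetical helper: GF(256) arithmetic for Reed-Solomon error correction. */
final class Gf256 {
    private static final int PRIMITIVE = 0x11D; // x^8 + x^4 + x^3 + x^2 + 1
    private static final int[] EXP = new int[512]; // EXP[i] = alpha^i
    private static final int[] LOG = new int[256]; // LOG[alpha^i] = i

    static {
        int x = 1;
        for (int i = 0; i < 255; i++) {
            EXP[i] = x;
            LOG[x] = i;
            x <<= 1;                         // multiply by alpha
            if (x >= 256) x ^= PRIMITIVE;    // reduce modulo the primitive polynomial
        }
        // Repeat the table so that mul() needs no modulo 255.
        for (int i = 255; i < 512; i++) EXP[i] = EXP[i - 255];
    }

    static int mul(int a, int b) {
        if (a == 0 || b == 0) return 0;
        return EXP[LOG[a] + LOG[b]];
    }

    static int log(int a) {
        if (a == 0) throw new IllegalArgumentException("log(0) is undefined");
        return LOG[a];
    }

    /** Coefficients, highest degree first, of (x - alpha^0)...(x - alpha^(n-1)). */
    static int[] generator(int n) {
        int[] g = {1};
        for (int i = 0; i < n; i++) {
            int[] next = new int[g.length + 1];
            for (int j = 0; j < g.length; j++) {
                next[j] ^= g[j];                  // g[j] * x
                next[j + 1] ^= mul(g[j], EXP[i]); // g[j] * alpha^i (minus is XOR here)
            }
            g = next;
        }
        return g;
    }

    /** Remainder of data(x) * x^n divided by generator(n): the n error correction codewords. */
    static int[] remainder(int[] data, int n) {
        int[] gen = generator(n);
        int[] rem = new int[n];
        for (int d : data) {
            int factor = d ^ rem[0];
            System.arraycopy(rem, 1, rem, 0, n - 1); // shift left by one term
            rem[n - 1] = 0;
            for (int j = 0; j < n; j++) {
                rem[j] ^= mul(gen[j + 1], factor);
            }
        }
        return rem;
    }
}
```

A handy sanity check follows from the math: if the error correction codewords are appended to the data and the result is divided again, the remainder is all zeros.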

In order for the console application to display the square QR code correctly, the program changes the console’s font size and type in the function ‘SetConsoleAttributes()’. The font index number used in this function is undocumented and was determined by changing the Windows 10 console properties and watching the index number change in the Windows API. If you are using a non-Windows operating system, please change this function to set your console font to any monospaced font. Also, the function forces the console to a black background with a bright white foreground, as required by the specification.

Pellucid Icons – Transparent Windows Desktop Icons

Pellucid Icons – allows your desktop icons to be semi/transparent so that it doesn’t occlude your wallpaper

So, you have a beautiful wallpaper on your desktop but there’s a barrage of programs and document icons occluding it – ’cause all play and no work makes Jack a dull boy. What to do? Well, here’s a solution.

Pellucid Icons is a shell extension for Explorer that makes your icons transparent until you perform an activity such as mouse move, mouse move to one-third of the screen or double click. Scroll to the bottom for installer.

The extension is implemented with two COM shell extensions in it. One is an icon overlay handler. We (ab)use this handler to make Explorer load our DLL into its address space as soon as possible. The other is a context menu handler. We use this handler to add sub menu items to the context menu when users right click on the desktop.

Once Explorer has mapped our DLL into its address space, we can find the ListView control responsible for desktop icons using ‘FindWindow()’ and set its transparency using ‘SetLayeredWindowAttributes()’. Of course, we need to set the ‘WS_EX_LAYERED’ attribute on this ListView before we can set transparency. This attribute can be applied to child windows only if the application is running in the operating system context of Windows 8 or greater. Notice that I didn’t say ‘running in’ but ‘running in the operating system context of’. Even though you may be running your application on Windows 8, it may be running in the context of Windows Vista. For normal applications, you need to embed a manifest with a ‘supportedOS’ tag to make them run in the Windows 8 context. The Windows Explorer executable doesn’t have this manifest. Yet, using Task Manager (in the Details tab, right click the header->Select columns… and check Operating system context), we can see that the OS has decided to run it in the context of Windows 8.1 (my test machine’s OS). Phew, we wouldn’t want to make changes to the Explorer image file in any way, but I wonder how this works. Is this hard coded into the OS? If you have any info regarding this, please do comment.

In order to make Windows Explorer invoke our context menu for the desktop, we need to register it at ‘HKEY_CLASSES_ROOT\DesktopBackground\shellex\ContextMenuHandlers’. Easy peasy. But our context menu handler is also invoked when the user right clicks in an Explorer folder window showing the files in the user’s Desktop folder. This is a minor annoyance, but fortunately we can detect whether the user right clicked on the actual Desktop or in a folder window by checking the flags parameter of ‘IContextMenu::QueryContextMenu(…)’ for ‘CMF_EXPLORE’ (0x4L). If this flag is set, it’s a folder window, so we exit the method without adding anything; otherwise, we add our menu.

Minimum supported OS: Windows 8

License: Freeware, Open Source, Author disclaims any liability
Credit: Icon from FlatIcons.com, function ‘Utility::Create32BitHBITMAP()’ is lifted from TortoiseSVN project, context menu handler code template is from Microsoft and installer from InnoSetup

Source code

Download installer
Supported architecture for installer: 64-bit machines; for 32-bit machines you’ll have to build it yourself
NOTE: This program is installed for all users on your computer but the installer activates it only for the user who ran the installer. Other users on the computer can activate this program by right clicking on their desktop and then checking the option ‘Pellucid icons->are enabled’.

Windows Explorer Property Page Extension for Portable Executables

After years of procrastinating on this project, I finally managed to complete ‘PEPropPageExt’ – yes, I could have named it better. Now that C++11 has been officially released, the old C++98 code seemed to do a lot of unnecessary memory copying. Hence, parts of this project have been rewritten to take advantage of the language’s new features. The code has been cleaned up, more features have been added and, more importantly, the code is now fault tolerant.

This project creates a property page extension (property pages are shown when users right click on a file in Explorer and select ‘Properties’ from the context menu) for Microsoft Portable Executable files – EXE and DLL files. The extension shows various pieces of information embedded in binary form in these files. This information is valuable for developers who are interested in learning how the compiler has built their application executables.

This build has been given release version number 1.0. The project is still under a freeware license for personal or research usage but is restricted for commercial use. Please refer to this project’s readme page for more details on license agreement and disclaimer.

For PEPropPageExt source

Acknowledgements

Before I go about bragging to you about features, I feel some people must be credited whose work has been used in this project:

  • udis86, Disassembler Library by Vivek Thampi
  • Simple Layout Manager by Daniel Horn
  • Rich Signature by Daniel Pistelli

Pages

This section discusses a few significant property pages.

MS-DOS Header

This page shows you information about the header for the old MS-DOS loader. It is followed by a 16-bit disassembly of the code whose sole purpose is to display a message such as “This program cannot be run in DOS mode.” when the executable is run on an MS-DOS only machine.

MSDOS Header Page

Rich Data dialog

Some executables have ‘Rich’ data stored between their MS-DOS and PE headers. This is known to be done by Visual C++ compilers. If data of this kind is embedded, you will see the following dialog.

Rich Data dialog

PE Headers

PE Headers is probably the most important dialog among all others. It shows you flags associated with your executable, which minimum version of Windows is being targeted, data directories etc.

PE Headers Page

Imports

This page shows you all the modules that this file needs and the symbols imported from each of them. Both static and delay-loaded modules are shown here. Unmangling of both Microsoft and GCC C++ style symbol names is supported. For GCC unmangling specifically, though, the DLL files ‘LIBSTDC++-6.DLL’ and ‘LIBGCC_S_SEH-1.DLL’ are required in the ‘System32’ directory for the delay loader to find. These GCC DLL files are distributed with MinGW installations.

Imports Page

Overview

This page shows you an overview of what the virtual address space of the image will look like when the Windows loader has finished mapping the file from disk to memory.

Overview Page

Tools

This page gives you an address converter, a hash verifier and a Hex Viewer/Disassembler.

Tools Page

CLR Data

For .NET developers, this page shows you the Common Language Runtime header and its associated data.

CLR Data Page

Resources

This page shows you information about both native and managed resources. Previewing some types of resources is also supported. They include icons, bitmaps, string tables, manifests, XML and dialog boxes. Some types of managed resources can also be viewed. If an unknown data format is encountered, it will be shown in hex view.

Resources Page

Frequently Asked Questions

1. How do I install/uninstall this extension?

For installation, first make sure that you have installed the Visual C++ 2013 redistributables, then copy the DLL files ‘PEPropPageExt.dll’ and ‘ManagedFuncs.dll’ to a convenient location. Open Command Prompt with administrative privileges and navigate to the DLL folder. Enter ‘regsvr32 PEPropPageExt.dll’ to install the product.

To uninstall, enter ‘regsvr32 /u PEPropPageExt.dll’. You may delete the DLL files. NOTE: The ‘ManagedFuncs.dll’ file is loaded by Common Language Runtime and subsequently unloaded by it. A computer restart may be required to unlock this file to delete it.

NOTE: For GNU C++ name unmangling, the DLL files ‘LIBSTDC++-6.DLL’ and ‘LIBGCC_S_SEH-1.DLL’ are needed in the Windows ‘System32’ folder. These files are distributed with MinGW installations.

2. I don’t need all of the tab information, can I hide some of them?

Sure. Navigate to ‘HKCU\Software\SWTBASE\PEPropPageExt\Settings’ and add a new key with the name ‘<SomeThing>’. To hide a specific tab, create a new DWORD value under the key as shown below:

Value Name            Description
Hide_AllTabs          When Explorer invokes the extension, the extension silently fails. This does not uninstall the extension; it only disables it temporarily.
Hide_MSDOSHeaderTab   Hides the MS-DOS Header page.
Hide_PEHeadersTab     Hides the PE Headers page.
Hide_SectionsTab      Hides the Sections page.
Hide_ManifestTab      Hides the Manifest page.
Hide_ImportsTab       Hides the Imports page.
Hide_ExportsTab       Hides the Exports page.
Hide_ResourcesTab     Hides the Resources page.
Hide_ExceptionTab     Hides the Exception page.
Hide_BaseRelocTab     Hides the Base Relocation page.
Hide_DebugTab         Hides the Debug page.
Hide_LoadConfigTab    Hides the Load Configuration page.
Hide_TLSTab           Hides the Thread Local Storage page.
Hide_CLRTab           Hides the Common Language Runtime page.
Hide_OverviewTab      Hides the Overview page.
Hide_ToolsTab         Hides the Tools page.

3. How safe is it to use this against broken or malicious files?

If this extension crashes, the whole ‘Explorer.exe’ parent process crashes. Obviously, this is a nuisance for users. To guard against this, the extension verifies that any pointer from the file is within the address space of the executable. The mapped executable’s memory is also marked read-only to prevent execution. There are also checks on values to make sure they are not abnormal.

Unfortunately, not everything is covered. For example, a C string has no size field, so checking every byte before reading a string would make the extension very slow. This may be tackled in future releases.
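The pointer check described above boils down to a range test that must not overflow. The extension itself is written in C++; the following is only a Java model of the idea, with made-up names (‘ImageBounds’, ‘contains’, ‘readUInt32’), which treats a pointer as an offset into the mapped file:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.OptionalLong;

/** Hypothetical helper: bounds-checked reads from a mapped executable image. */
final class ImageBounds {
    /** True when the byte range [offset, offset + size) lies inside an image of imageSize bytes. */
    static boolean contains(long imageSize, long offset, long size) {
        if (imageSize < 0 || offset < 0 || size < 0) return false;
        if (offset > imageSize) return false;
        // Written as a subtraction so that offset + size cannot overflow.
        return size <= imageSize - offset;
    }

    /** Reads a little-endian 32-bit unsigned value, or returns empty if it is out of range. */
    static OptionalLong readUInt32(ByteBuffer image, long offset) {
        if (!contains(image.limit(), offset, 4)) {
            return OptionalLong.empty();
        }
        // A read-only view: the parser never needs to modify the mapped file.
        ByteBuffer view = image.asReadOnlyBuffer().order(ByteOrder.LITTLE_ENDIAN);
        return OptionalLong.of(view.getInt((int) offset) & 0xFFFFFFFFL);
    }
}
```

Returning an empty value instead of throwing lets the caller show the field as invalid and carry on with the rest of the file, which matters when a crash would take the host process down with it.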

Custom scrollable picture control to replace Windows Static Control in ‘SS_BITMAP’ mode

When Windows Static Control’s ‘SS_BITMAP’ style is set, the control can be used to display ‘HBITMAP’ as a static image. It serves its purpose well. Unfortunately, there are two quirks about this control that made it unusable in one of my projects.

  1. The static picture is unscrollable: If the bitmap given to the control is too big, the control merely crops it. Static window class is made to not allow user interaction, so setting the scrollbars to visible won’t work either (they will be shown but will be disabled).
  2. Bitmap resource disposal is tricky: After you assign a bitmap to the control, you are responsible for destroying both the handle that you gave to the control and an internal copy that the control has maintained. I will not go into detail about this but needless to say – it is a major headache and an easy source of resource leaks.

Hence, this control. This control solves the above problems by being scrollable automatically if the given bitmap is too big and taking ownership/responsibility of destruction of the bitmap you pass to the control. In addition, the control is wrapped in a neat C++ class.

The source code is available here and is released into the public domain.

Picture Control Preview

How to use

  1. Call the static function ‘registerControlWindowClass()’ just once in your application (preferably when it starts).
  2. Call the static function ‘create()’ to create a new control with the specified parent window. Check that you received a valid pointer. If this pointer is NULL, there was an internal error.
  3. Call ‘setPosition()’ and ‘setSize()’ to position and size the control on your window.
  4. Call ‘setBitmapHandle()’ to assign an image. The control will enable scrollbars automatically if the picture is too big. The control owns this handle now so do not delete this bitmap.
  5. If you called ‘registerControlWindowClass()’, especially from a DLL, it is a good idea to call ‘unregisterControlWindowClass()’ to remove the control’s window class from registration when the DLL is unloaded.

Bonus stuff

The control consumes a bitmap handle to display an image. You can use Windows Imaging Component to load different image formats and convert them to a proper bitmap handle using this (dirty and leaky – which you should fix) test code: WIC_FileToHBitmap.cpp (link with Windowscodecs.lib)