FFmpeg LGPL v3 binaries for Windows

Cross-compiling FFmpeg for the LGPL v3 license (i.e. with non-LGPL v3 and non-free components stripped) for Windows is not only a lot of hassle but also takes the better part of a day. So this post provides links to statically linked LGPL v3 FFmpeg executables for both 32-bit and 64-bit architectures.

Download binaries with Debug information (has both 32 and 64 bit binaries)
Download binaries without Debug information (has both 32 and 64 bit binaries)

FFmpeg version: N-62439-g5e379cd

Statically linked library versions:
libavutil      52. 76.100 / 52. 76.100
libavcodec     55. 58.103 / 55. 58.103
libavformat    55. 37.100 / 55. 37.100
libavdevice    55. 13.100 / 55. 13.100
libavfilter     4.  4.100 /  4.  4.100
libswscale      2.  6.100 /  2.  6.100
libswresample   0. 18.100 /  0. 18.100

Build configuration:
--arch=x86_64 --target-os=mingw32 --pkg-config=pkg-config --enable-avisynth --enable-libmp3lame --enable-version3 --enable-zlib --enable-librtmp --enable-libvorbis --enable-libtheora --enable-libspeex --enable-libopenjpeg --enable-gnutls --enable-libgsm --enable-libfreetype --enable-libopus --disable-w32threads --enable-libvo-aacenc --enable-bzlib --extra-cflags=-DPTW32_STATIC_LIB --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libvo-amrwbenc --enable-libschroedinger --enable-libvpx --enable-libilbc --enable-static --disable-shared --enable-libsoxr --enable-fontconfig --enable-libass --enable-libbluray --enable-iconv --enable-libtwolame --extra-cflags=-DLIBTWOLAME_STATIC --enable-libcaca --enable-libmodplug --extra-libs=-lstdc++ --extra-libs=-lpng --extra-cflags= --extra-cflags= --enable-runtime-cpudetect


Adding shutdown/restart pushbutton for Raspberry Pi

This article deals with adding a push button to the Raspberry Pi’s GPIO pins and writing a daemon that handles push button events. If we press the push button for less than 2 seconds, we want the daemon to shut down the system; if the push button is still held down after 2 or more seconds, the daemon must restart the system. First we deal with the hardware part of the problem.

Things to know:

  • Raspberry Pi has a 3.3 V microcontroller which means 3.3 V is considered logic high
  • The GPIO pins do not have voltage/current protection, so working voltage must strictly be 3.3 V and current in individual pins must be less than 16 mA

Wiring up hardware

Before we can wire up the GPIO pins, we need to convert the male GPIO pins on the board to female. According to the internet, an IDE hard drive cable is a good candidate for this. Unfortunately, I couldn’t find this cable in my local electronics shop, so I settled for the following connector.

IDC 26 Way Connector

I did have to bend the pins a little on the connector so that it grips the GPIO pin well. Scroll down to see how the connector fits on the GPIO pins.

If you are new to electronics, wiring the 3.3 V source on the board directly to a GPIO pin and putting a switch between them seems like a no-brainer. Unfortunately, this has two problems. First, if an accidental short occurs, a source voltage connected directly to a pin can cause a large surge of current, damaging the hardware. Each of the Raspberry Pi’s pins is rated for a current of less than about 16 mA, and we want to keep the current well below this limit. To solve this problem, we add a resistor large enough to limit the current.

Secondly, when the switch is open, the pin is in what’s called a floating state. Because the pin is connected to neither a voltage source nor ground, the input pin is in an indeterminate state. During this project, I once incorrectly wired the circuit so that the input pin was left floating. When I pushed the button, the program would correctly read the pin value as high because it was tied to the 3.3 V source. When the button was released, though, the program would still read high. But if the program was asked to read the pin a second time, it would correctly read low. This strange behaviour is caused by a floating pin.

To avoid this problem, we need to configure our push button circuit so that the pin is driven to either ground or 3.3 V depending on the state of the push button. This is done with a ‘pull-down’ resistor, which pulls the input pin to ground whenever the push button disconnects the pin from the logic high voltage.

Hence, the following circuit diagram shows the final setup of the circuit for GPIO pins:

Raspberry Pi Push button circuit

From the wiring diagram, we can see that when the switch is open, the input pin is connected to ground. When the switch is closed, the 3.3 V source is in parallel with the 1.2 kΩ load, which means the input pin will be at 3.3 V (because the source is connected in parallel to both loads). The 10 kΩ resistor is the pull-down resistor while the 1.2 kΩ resistor is a current protection resistor. Using Ohm’s law, Voltage = Resistance × Current, we can calculate that with Voltage = 3.3 V the resistance values used above give a current below the rated limit.

Wiring view

Setting up software

‘wiringPi’ is one of the most commonly used GPIO libraries for the Raspberry Pi. Setting up this library is detailed on the library’s website.

Assuming you have correctly set up ‘wiringPi’ and the GNU C/C++ tools on your Pi, we can begin setting up the daemon. A daemon (for Windows users: a Linux daemon is the equivalent of a Windows service) is a perfect candidate for us because:

  1. It can be configured to start automatically when the Pi boots
  2. Daemons run with ‘root’ privileges which are required for some functions
  3. Daemons don’t require interaction from the user

It would be a waste of space to discuss the boilerplate code common to all daemons; comments in the source file cover it in abundance. The most important part of the daemon is handling the button push interrupt. When the push button is pressed, the voltage on the input pin jumps from logic low to high, which in turn calls an interrupt handler. The first thing this handler does is disable further interrupts. Unfortunately, ‘wiringPi’ does not support disabling an installed handler, so, as suggested by the author of the library, we use a little hack. We then sleep our thread for 2 seconds, after which we test the pin value again. If the button has been released, we initiate a shutdown; otherwise, a restart.
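The decision logic of the handler can be sketched in portable C++ as follows. This is not the daemon’s actual code: `readPin` is a hypothetical stand-in for the GPIO read, and the atomic flag is only one assumed way to ignore repeated interrupts, not necessarily the hack the library author suggested.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <thread>

enum class PowerAction { Shutdown, Restart };

// Assumed re-entrancy guard: set while a button press is being handled.
std::atomic<bool> handlerBusy{false};

// Returns true only for the first caller until leaveHandler() is called.
bool tryEnterHandler() { return !handlerBusy.exchange(true); }
void leaveHandler() { handlerBusy.store(false); }

// Waits for the hold period, then samples the button once more.
// `readPin` returns true while the button is still pressed.
PowerAction decideAction(const std::function<bool()>& readPin,
                         std::chrono::milliseconds holdTime) {
    std::this_thread::sleep_for(holdTime);
    return readPin() ? PowerAction::Restart : PowerAction::Shutdown;
}
```

In the daemon, the hold time would be 2 seconds and the returned action would be mapped to the corresponding system command.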

Source for daemon: https://github.com/sanje2v/buttonshutdown-daemon

Follow these steps to set up the daemon:

  1. Download and build the daemon source using ‘build.sh’ on your Pi. The build script also handles moving the output daemon file to ‘/usr/sbin’.
  2. Next, you will need to install the service script. This script handles starting and stopping our daemon. To install this script, just copy the ‘buttonshutdown’ file to ‘/etc/init.d’ and set appropriate permissions using ‘sudo chmod 755 /etc/init.d/buttonshutdown’.
  3. Finally, we want our daemon to start automatically at boot time. To do this, run the command ‘sudo update-rc.d buttonshutdown defaults’.
  4. At this point, the daemon is not running. We have merely registered it to run at every subsequent system start-up. To run the daemon now, use ‘sudo service buttonshutdown start’.
  5. Now pushing the push button for less than 2 secs should initiate a shutdown while anything more should restart the system.
  6. If for some reason the daemon fails to load, check ‘/var/log/syslog’ using ‘grep buttonshutdown-daemon /var/log/syslog’.

LEGO Art: Luigi

LEGO® bricks are pretty awesome, aren’t they? Have you seen the latest Australian LEGO ad yet? It’s one of the best ads I have seen in a while. Check it out:

Having just worked on an animated GIF featuring Luigi, I wondered how awesome it would be to build Luigi art out of LEGO bricks. Hence, this project. This is what I got:

LEGO Art: Luigi

Here are the steps I followed to build this:

1. Pixelate image: We need to look at images in terms of LEGO building blocks before we can begin building. Fortunately, pictures of old arcade game characters such as Luigi are already pixelated. They use a limited set of basic colours and are composed of simple blocks of pixels which we can map directly to different sizes of LEGO blocks. For this character, I found a good sprite at http://www.videogamesprites.net/SuperMarioBros1/Characters/Luigi/index.html. I opened this picture in the GIMP image editor, zoomed in to 200% and then set the grid settings under ‘Image->Configure Grid…’ to 2 pixels for both the x and y axes. To make the grid visible, the ‘View->Show Grid’ menu command must be checked. I set the grid to 2 px because I felt that mapping a LEGO block’s height (9.6 mm) to 2 px on the sprite would give me a final structure of acceptable size. If you want your LEGO structure to be bigger than this, you should increase this ratio accordingly. Keep in mind that as LEGO structures grow larger, they tend to break more easily.

If you want to build a LEGO structure out of a camera picture instead of a simple drawn picture like the one I am using, you will want to use the GIMP filter called ‘Pixelate’. There are other filters in GIMP that will help you reduce the number of colours too. Before getting too ambitious, do keep in mind that LEGO bricks come in a limited palette of colours and that the structure will quickly become too big to handle if you are not careful.

2. Building in software: This step is only necessary if you don’t have a ton of bricks on hand. Because I didn’t, and also because I wanted to buy only what was needed, designing the structure in software was a wise thing to do. LEGO provides an awesome piece of software called ‘LEGO Digital Designer’. It’s available as a free download from LEGO’s website. Here’s what my software build looked like:

LEGO Art: Software Plan Luigi

This software plan not only shows what the structure will look like in reality but also allows you to generate a list of all the bricks that are needed and to see whether the structure has any weak parts which may require reinforcement. The list of all the bricks needed can be generated using the ‘LEGO Digital Designer’ menu command ‘File->Export BOM’.

My build list can be downloaded from: https://www.dropbox.com/s/m0hvjl5jonrph7q/Luigi.xlsx
My LEGO Digital Designer Model file can be downloaded from: https://www.dropbox.com/s/jxsix2017ojheil/Luigi.lxf

3. Ordering blocks: From my experience, it is wise to go to a LEGO shop to buy your list of parts. An online order, especially from a cheaper local online store, will most likely get messed up. Then you are left with missing parts or parts you didn’t even order. This happened to me. Fortunately, I was able to change the design a little here and there without making it noticeable. If you look closely at the photograph of my build, you will see that the feet are too long and the green hair near the ear has one block missing. LEGO’s official web store does allow online buying. With this web store you are less likely to have to deal with such carelessness, but they do charge a lot for delivery. In my case, the total price of what I was buying was actually less than the AUD $25 delivery fee.

4. Building: Finally, ‘LEGO Digital Designer’ can create an easy-to-follow step-by-step building guide using the menu command ‘View->Building guide mode’.


Writing property handler for Windows Explorer/Manta Property Extension

Setup for 32-bit Windows systems: Click Here
Setup for 64-bit Windows systems: Click Here
Source: https://github.com/sanje2v/MantaPropertyExtension

There are many instances when we want to get information contained within an EXE application file. Is it a 32-bit or 64-bit application? Is it a .NET or native application? Does the application run with a window or in command line mode? One way to find the architecture without using a tool in Windows is to run the application and then look through Task Manager’s ‘Processes’ list to see whether the image name ends with ‘*32’. If it does, it’s a 32-bit process; otherwise it is not. Another cumbersome way is to download a PE tool and look for the ‘Magic Number’ field in the EXE’s Optional header (Reference: Microsoft Portable Executable Specification).
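The header lookup described above can be sketched in portable C++. This is a simplified illustration rather than the code used in Manta: it works on a file already loaded into memory, reads the Optional header’s magic number (0x10B for PE32, 0x20B for PE32+) and its subsystem field, and leaves out .NET detection. The function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Little-endian reads that return 0 instead of running past the buffer.
std::uint16_t readU16(const Bytes& d, std::size_t off) {
    if (off > d.size() || d.size() - off < 2) return 0;
    return static_cast<std::uint16_t>(d[off] | (d[off + 1] << 8));
}

std::uint32_t readU32(const Bytes& d, std::size_t off) {
    if (off > d.size() || d.size() - off < 4) return 0;
    return static_cast<std::uint32_t>(readU16(d, off)) |
           (static_cast<std::uint32_t>(readU16(d, off + 2)) << 16);
}

// Returns a short description such as "64-bit console application".
std::string describePe(const Bytes& d) {
    if (readU16(d, 0) != 0x5A4D) return "not a PE file";        // "MZ"
    const std::size_t pe = readU32(d, 0x3C);                     // e_lfanew
    if (readU32(d, pe) != 0x00004550) return "not a PE file";   // "PE\0\0"

    const std::size_t opt = pe + 4 + 20;  // skip signature and COFF header
    const std::uint16_t magic = readU16(d, opt);
    std::string text;
    if (magic == 0x10B) text = "32-bit";       // PE32
    else if (magic == 0x20B) text = "64-bit";  // PE32+
    else return "unknown optional header";

    // The Subsystem field sits 68 bytes into the optional header.
    const std::uint16_t subsystem = readU16(d, opt + 68);
    if (subsystem == 2) text += " GUI";           // Windows GUI
    else if (subsystem == 3) text += " console";  // Windows console
    return text + " application";
}
```

A real handler would also have to look at the CLR data directory to tell .NET images from native ones.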

Currently, when selecting an EXE file in Windows Explorer, the only information we get is that it’s an ‘Application’ as shown below.

Details view without Manta

It would be great if we could develop a property handler for Windows Explorer which read the above-mentioned properties from within an EXE file and replaced the ‘Application’ text with a more useful description like ‘32-bit .NET GUI Application’. Taking this further, we could add support for the DLL, OBJ, O, LIB and A file extensions and display their important properties inside the file manager without having to use any external tool.

Unfortunately, Explorer provides no way for property handlers to modify this text. The source of this text is tied to the registry value:

HKEY_LOCAL_MACHINE\SOFTWARE\Classes\exefile : ‘FriendlyTypeName’ Value

This value can be either a plain string or a redirection string that points to a Portable Executable file and references a string in its String Table resource. Nor is the property handler invoked when the user clicks on a PE file in the file manager. This means there is no way to control this value programmatically. Instead, we can control the hundreds of properties already provided by Microsoft. Even though the registered schema for a file extension may list only a limited number of supported properties, when the user right-clicks the file manager’s list view header and adds additional properties, Explorer queries the property handler for the selected property, whether or not the registered schema states that the property is supported. I selected the ‘System.Comment’ property to take over because the property title seemed fit for my purpose. Also, developers do not use this property for their EXE files because the original property handler that comes with Windows does not handle it (according to behavior seen in Windows 7).

The project was named ‘Manta’ (after the manta ray) and I started writing a C++ based COM in-process property handler DLL. I followed the MSDN documentation’s recommendation to implement the ‘IInitializeWithStream’ interface rather than ‘IInitializeWithFile’. If I needed the file name associated with the stream, I could call the ‘Stat()’ function on the stream (the file name is returned without its path). This worked fine on Windows 7 SP1, but when deployed to Windows 8 the property handler crashed immediately. It turns out that calling ‘Stat()’ on the stream returned a ‘STATSTG’ structure with only the type, size and grfMode fields holding correct values. All the other fields, including ‘pwcsName’ (a C-string pointer to the file name), were set to NULL. This meant I had to convert the ‘IInitializeWithStream’ implementation to ‘IInitializeWithFile’. Fortunately, using the ‘SHCreateStreamOnFileEx()’ function I could create a stream from a file path, so the code already written did not need to change.

The property handler was now able to respond to Explorer’s request for the ‘System.Comment’ property. But the default property handler implemented properties such as Copyright, Product Name, Version etc. This information is obtained from the Version Information blocks contained within the Version resource of a Portable Executable. I had two choices: either implement the version information extraction routine myself using Win API functions, or find a way to make use of the original property handler for the other properties. The second choice seemed more obvious. The Windows Property System provides no way of subclassing an existing property handler, so I used a COM technique called delegation. In this technique, my property handler creates a COM object using the original handler’s CLSID. When Explorer asks for natively supported properties, my handler forwards the request to the original handler and simply passes back whatever the original handler returns. When Explorer asks for the ‘System.Comment’ property, my handler handles this itself and does not delegate to the original handler. To implement this properly, my handler had to have access to the original handler’s CLSID. I decided that the installer program would be responsible for saving the original handler’s CLSID in a convenient location where my handler could find it when needed. I built an installer using ‘InnoSetup’ to do this easily.
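The shape of the delegation technique can be shown without any COM plumbing. The sketch below is a simplified, portable stand-in and not Manta’s code: `PropertySource` is a hypothetical interface playing the role of a property store, `OriginalSource` plays the role of the handler that ships with Windows, and properties are plain strings.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, much simplified stand-in for a property store interface.
struct PropertySource {
    virtual ~PropertySource() = default;
    // Returns true and fills `out` when the property is available.
    virtual bool getValue(const std::string& key, std::string& out) const = 0;
};

// Plays the role of the original handler with its natively supported properties.
class OriginalSource : public PropertySource {
public:
    explicit OriginalSource(std::map<std::string, std::string> values)
        : values_(std::move(values)) {}
    bool getValue(const std::string& key, std::string& out) const override {
        const auto it = values_.find(key);
        if (it == values_.end()) return false;
        out = it->second;
        return true;
    }
private:
    std::map<std::string, std::string> values_;
};

// Answers "System.Comment" itself and forwards everything else unchanged.
class DelegatingSource : public PropertySource {
public:
    DelegatingSource(std::unique_ptr<PropertySource> original, std::string comment)
        : original_(std::move(original)), comment_(std::move(comment)) {}
    bool getValue(const std::string& key, std::string& out) const override {
        if (key == "System.Comment") {
            out = comment_;
            return true;
        }
        return original_->getValue(key, out);  // pass the result straight back
    }
private:
    std::unique_ptr<PropertySource> original_;
    std::string comment_;
};
```

In the real handler, the inner object would be created from the saved CLSID rather than constructed directly.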

After installing Manta, the file explorer shows the information as follows:

Details view with Manta Property Extension

As you can see, not only have I added a new property but the old properties are still shown too. Even though we discussed only EXE files, the handler supports the DLL, OBJ, O, LIB and A file extensions. Unlike the default handler, my handler implements the ‘IPropertyStoreCapabilities’ interface so that the ‘System.Comment’ property is made read-only, removing the annoying faulty ‘Comment’ property handling for EXE and DLL files.

One unexpected consequence of implementing this property handler is that I can now use Windows Search to find all PEs which are .NET or which are 32-bit binaries. By typing ‘Comments:32-bit’ in the search box, I can search for all the 32-bit PEs in a folder. To make sure the search index (which Windows Search uses to perform speedy indexed searches) has collected this new property for your files, rebuild the index by going to ‘Indexing Options->Advanced->Rebuild’ (just type ‘Indexing Options’ in your start menu search box to find the indexing options).

Finally, if you are making a property handler of your own, make sure that on 64-bit Windows you implement and install both 32-bit and 64-bit property handler DLLs under their appropriate registry keys. Windows Explorer is a 64-bit process and will use the 64-bit DLL, but without the 32-bit DLL a 32-bit process browsing the properties of a file programmatically will use the default handler and see only the default properties. On 32-bit Windows, a 32-bit DLL alone will suffice.

Registry value for property handler installation:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\PropertySystem\PropertyHandlers\.<File extension> : (Default) = <CLSID>

Registry value for 32-bit property handler installation in 64-bit Windows:
HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows\CurrentVersion\PropertySystem\PropertyHandlers\.<File extension> :  (Default) = <CLSID>

Registry key to change what properties to display in the file manager for what view:
HKEY_CLASSES_ROOT\SystemFileAssociations\.<File extension>


CPU vs GPU: Blurring image using DirectCompute

A GPU, with its SIMD (Single Instruction Multiple Data) type architecture, offers massive parallelism across thousands of cores, compared to typically just 8 cores in a CPU (some of which may be logical rather than physical cores). GPU threads are also implemented in hardware, making their context switching almost instantaneous and less expensive than on a CPU. Give a GPU large floating point calculations and it will put CPU performance to shame.

Of course, the CPU excels at general purpose tasks. Not to mention, writing CPU programs requires next to no thought on the part of the programmer about the underlying hardware. Programmers are even discouraged from hand-optimizing the binaries produced by their compilers; it is accepted that the compiler “knows best” when it comes to optimizations. Cache memory is implemented in several levels, and the underlying cache control hardware (which cannot be controlled by the programmer) tries to predict the memory usage of your code. If your code requests memory according to the predicted usage model, your program will keep the CPU busy and run at an optimal level.

The grass isn’t so green on the GPU side. GPU programming requires extensive knowledge of the underlying hardware. There is a high reward, in the form of an amazing performance boost, if you use the correct configuration, and harsh punishment, with no gain (or even a loss), if your configuration cannot make use of the hardware. In GPU programming, programmers must actually plan their code around the architecture. You as a programmer are given full control of the cache. With three memories (local, shared and global) of varying speeds and sizes to choose from, you must decide where your program data is kept. By profiling your code, you must make sure that the configuration you chose and the kernel you wrote (GPU programs are called kernels) make full use of the hardware’s capability. This may sometimes mean that a serial algorithm must be converted to a parallel version and/or that you must write different kernels targeting different models of hardware. Finally, there’s the obvious hassle of having to copy data back and forth between CPU and GPU memory.

If there are so many requirements to get good performance on GPU, is it really worth it? I wanted to see how fast my NVIDIA GeForce GT 425M GPU could perform a box blur using a 9×9 size blur kernel (matrix used for blurring is also called kernel, not to be confused with GPU programs) over an image of 2560×1600 size. I got the following results:

Box Blur

Device   Housekeeping code time   Blurring code time
CPU      n/a                      3-4 secs
GPU      About 0.8-2 secs         ~122 milliseconds
NOTE: The CPU program had optimizations on. The GPU blurring time is as reported by the GPU, while the other timings are rough. The GPU configuration and GPU kernels used may not be optimal.
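For reference, a straightforward CPU box blur of the kind compared above can be sketched as follows. This is not the program that was timed: it assumes a single-channel image stored row by row, and it averages only the pixels that fall inside the image at the borders, which is one of several possible edge treatments.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Box-blurs a single-channel, row-major image.
// `radius` = 4 gives the 9x9 window; windows are clipped at the image border.
std::vector<float> boxBlur(const std::vector<float>& src,
                           int width, int height, int radius) {
    std::vector<float> dst(src.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const int y0 = std::max(y - radius, 0);
            const int y1 = std::min(y + radius, height - 1);
            const int x0 = std::max(x - radius, 0);
            const int x1 = std::min(x + radius, width - 1);

            // Sum every pixel in the clipped window around (x, y).
            float sum = 0.0f;
            for (int yy = y0; yy <= y1; ++yy) {
                for (int xx = x0; xx <= x1; ++xx) {
                    sum += src[static_cast<std::size_t>(yy) * width + xx];
                }
            }
            const int count = (y1 - y0 + 1) * (x1 - x0 + 1);
            dst[static_cast<std::size_t>(y) * width + x] =
                sum / static_cast<float>(count);
        }
    }
    return dst;
}
```

Every output pixel depends only on the input image, which is what makes the same computation easy to spread across GPU threads, one thread per pixel.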
 

The main obstacle in getting started with DirectCompute is learning how to get HLSL code (High Level Shader Language, the programming language used for GPU programming) and data into the GPU and how to copy the result back from the GPU into CPU memory. Using the ‘BasicCompute’ example that comes with the DirectX SDK, I wrote a neat little class called ‘ComputeShader’ which wraps this cleanly, so that the programmer can focus on writing GPU code rather than on the trivial task of dealing with DirectX COM interfaces and copying data back and forth.

Download ‘ComputeShader’ class files: Click Here
Download sample program source using the class: Click Here

For new users of this class, please download the sample to see how to use the class. In summary, here’s what you need to know to use it:

1. Call class functions in this order:

  • CompileShader(<filename of HLSL code>, <Entry point function>, <No. of X threads in a block>, <No. of Y threads in a block>, <No. of Z threads in a block>)
  • RunShader(<No. of X blocks>, <No. of Y blocks>, <No. of Z blocks>, <Vector of Input data>, <Vector specifying sizes of Output data>, <Vector of Constant data>)
  • Result<Type you want to be returned>(<Index of output data item>)

2. For Input data, specify vector of tuple as ‘make_tuple(Pointer to Input data, Size of each element, Total elements)’. The order in which you push elements in the vector maps to the order of register in GPU.

3. For Output data, specify vector of tuple as ‘make_tuple(Size of each element, Total elements)’.

4. For Constant data, specify vector of tuple just as in the case of Input data. Make sure that constant data is 32-bit aligned, otherwise you will have problems.

5. Using already compiled HLSL object file is not supported in the version of the code published with this article. This may change in future.

6. Any error during compilation of shader code is written to the Immediate Window of your IDE.

7. Profiling GPU execution time can be done using the ‘GetExecutionTime()’ function. Sometimes the timing returned by this function cannot be trusted because the GPU execution frequency changed while your code was executing (one cause of such a change is the GPU being powered down because the computer is running on batteries). This function takes an optional bool variable pointer with which you can determine the value’s trustworthiness.

8. Class assumes a ComputeShader 5.0 supported hardware.

9. To control the no. of threads from your source .cpp file, make sure you use the ‘NUM_OF_THREADS_X’, ‘NUM_OF_THREADS_Y’ and ‘NUM_OF_THREADS_Z’ macros in your HLSL file. See sample.

10. You are free to use the class or sample code in any way you like. Having said so, I disclaim any liability from its use.
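To make items 1 to 3 more concrete, here is a small sketch of preparing the arguments. The tuple layouts follow the descriptions above, but the exact types used by the ‘ComputeShader’ class are an assumption here, and `blocksFor` and `describeInput` are hypothetical helpers, not part of the class.

```cpp
#include <cstddef>
#include <tuple>
#include <vector>

// Assumed layouts: (pointer to data, size of each element, total elements)
// for input and constant data, (size of each element, total elements) for output.
using InputItem = std::tuple<const void*, std::size_t, std::size_t>;
using OutputItem = std::tuple<std::size_t, std::size_t>;

// Blocks needed so that `threadsPerBlock` threads per block cover `items`
// work items along one axis (rounded up).
constexpr unsigned blocksFor(unsigned items, unsigned threadsPerBlock) {
    return (items + threadsPerBlock - 1) / threadsPerBlock;
}

// Describes a vector as an input item without copying it; the vector must
// stay alive for as long as the item is in use.
template <typename T>
InputItem describeInput(const std::vector<T>& data) {
    return std::make_tuple(static_cast<const void*>(data.data()),
                           sizeof(T), data.size());
}
```

For example, with a hypothetical 16 threads per block on both axes, a 2560×1600 image needs 160 blocks in X and 100 blocks in Y.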
