Seeing and believing: 2013

Sunday, 29 December 2013

OpenCVKinect: Acquiring Kinect Data Streams in OpenCV

_______________________________________________________________________________________________

Edit (26/05/2016): I have updated OpenCVKinect to fix some bugs and to use a different visualization for depth maps. Details of this newer version can be seen here. All other details mentioned in this blog post still apply to the updated version.
_______________________________________________________________________________________________

The holiday season is here, and I have finally found some time to write something for this blog. Some time back I wrote a guide for compiling OpenCV with OpenNI support. A lot has changed since that post: the newer OpenNI 2.x SDK brings many improvements for using the Microsoft Kinect sensor. One of the major changes is the object-oriented implementation, as opposed to the pure C implementation in the previous OpenNI 1.x versions. I have had many requests asking me to write a tutorial for compiling OpenCV with the new OpenNI 2.x support. Although this is not possible with the current SDK and OpenCV, this blog post introduces an implementation for acquiring Kinect data streams in OpenCV.

After a long time, I recently started working with the Kinect sensor again, and I wanted to access its data streams in a more efficient manner. While it looked straightforward to use the built-in function calls in OpenNI 2.x, I was more comfortable working with the OpenCV format. I therefore wanted to hide all the detail of the OpenNI objects and their methods behind a simple and convenient interface. To achieve this, I wrote my own implementation, and this blog post officially releases the code as open source for anyone to use, modify or update. Right now it has some basic methods, but in future I am thinking of adding simple image processing and data recording methods too (to make it easier for beginners).
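The OpenCVKinect code itself is not reproduced in this excerpt. As a rough sketch of the general idea only, the snippet below assumes the sensor hands over each depth frame as a raw buffer of little-endian 16-bit values in millimetres, and wraps it in a NumPy array, which is the form OpenCV's Python bindings use for images. The function names, the frame layout and the 4000 mm display range are all assumptions made for illustration, not part of OpenCVKinect or OpenNI.

```python
import numpy as np


def depth_buffer_to_image(raw, width, height):
    """View a raw little-endian 16-bit depth buffer as a (height, width) array."""
    depth = np.frombuffer(raw, dtype="<u2")  # no copy, just a typed view
    if depth.size != width * height:
        raise ValueError("buffer size does not match the frame dimensions")
    return depth.reshape(height, width)


def depth_to_display(depth, max_depth_mm=4000):
    """Scale depth values (assumed millimetres) to an 8-bit grayscale image.

    max_depth_mm is an assumed display range; anything farther is clipped.
    """
    clipped = np.clip(depth.astype(np.float32), 0, max_depth_mm)
    return (clipped / max_depth_mm * 255.0).astype(np.uint8)
```

An array produced this way can be handed to any OpenCV function that accepts a single-channel 8-bit image.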

One Image hiding over eight thousand different stories...

Working with large datasets has its own pros and cons. Whatever the implementation or field might be, there is always a need to train a machine learning algorithm to recognize the pattern in the data. We often discuss this "pattern" in many different contexts, and a big chunk of the literature addresses this recognition problem. However, it is often not considered important to know what this pattern looks like, or why it is even called a "pattern" in the first place.

Interestingly, the answer lies in the above image, which shows a collection of 8000 different samples arranged in columns. The first thing to notice is that there actually is a repeating pattern in the data. This is the exact pattern we are trying to learn. It may not make sense just by looking at it; however, with the correct labels, each sample can be used to build a model that identifies each class with high accuracy.
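How the original image was produced is not described in this excerpt. One simple way to get a picture of this kind, assuming every sample is a fixed-length feature vector, is to place each sample in its own column and rescale the values to grayscale. The function name below is hypothetical.

```python
import numpy as np


def samples_to_image(samples):
    """Place each sample in its own column and rescale all values to 0..255."""
    matrix = np.column_stack([np.asarray(s, dtype=np.float64) for s in samples])
    lo, hi = matrix.min(), matrix.max()
    if hi == lo:
        # A constant dataset has no contrast to show.
        return np.zeros(matrix.shape, dtype=np.uint8)
    return np.round((matrix - lo) / (hi - lo) * 255).astype(np.uint8)
```

With 8000 samples the result is an image 8000 pixels wide, where repeating structure across samples shows up as horizontal bands.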

Tuesday, 10 December 2013

Implementing SnakesGame using OpenCV

A long time back I used to have a Nokia phone which came with this awesome and simple game. I still love Nokia phones for this, and have a couple of them just to play it (no kidding, haha). A while back I decided to write my own snake game implementation, so I did. What is interesting about this post is that I did not use any graphics engine or OpenGL rendering at all. Instead, the whole game was implemented using OpenCV function calls, which let me write my own rendering functions and display buffers.
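The game's source is not shown in this excerpt, so the following is only a sketch of what a hand-rolled display buffer can look like: the snake lives on a grid of cells, and each frame is drawn by filling blocks of pixels in a plain image array. The cell size, the function names and the head-first `deque` representation are assumptions for illustration.

```python
from collections import deque

import numpy as np

CELL = 10  # pixels per grid cell (assumed value)


def step(snake, direction, grow=False):
    """Move the snake one cell; snake is a deque of (row, col), head first."""
    head_r, head_c = snake[0]
    dr, dc = direction
    snake.appendleft((head_r + dr, head_c + dc))
    if not grow:
        snake.pop()  # drop the tail so the length stays the same
    return snake


def render(snake, rows, cols):
    """Draw the snake into a fresh grayscale display buffer."""
    frame = np.zeros((rows * CELL, cols * CELL), dtype=np.uint8)
    for r, c in snake:
        frame[r * CELL:(r + 1) * CELL, c * CELL:(c + 1) * CELL] = 255
    return frame
```

A buffer like this is an ordinary single-channel image, so it could be shown in a window with OpenCV's `cv2.imshow` inside the game loop.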

Google's Christmas surprise....!!

I have an Android phone which automatically syncs all the photos I take to Google+ Photos.

So today we had our own Christmas tree in Uni, and I took a picture to post on Instagram.

After a while I checked and saw that Google's Auto Awesome feature had added a very cool Christmassy effect to make the picture come alive.

What a nice surprise. It is interesting how Google is using different image processing algorithms.

Feels like it's holiday season already!!

Tuesday, 19 November 2013

Implementing Noisy TV in OpenCV...wait WHAT?!?

If you are a 90s kid like me, you must admit that back then there was nothing more annoying than losing the signal on your TV. Before the age of the internet, TV was the main source of entertainment. When the signal was lost, all you could see was an infinite race of millions of flies on your screen (a noisy image), and what made it even more intolerable was the loud hiss that came with it.
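The post's own implementation is not part of this excerpt. As a minimal sketch of the effect, each frame of "TV snow" can be modelled as an image whose pixels are drawn independently and uniformly from the full grayscale range; the function name and frame size below are assumptions.

```python
import numpy as np


def noise_frame(height, width, rng):
    """Return one frame of TV-style static: every pixel uniform in 0..255."""
    return rng.integers(0, 256, size=(height, width), dtype=np.uint8)


# A seeded generator makes the sequence of frames reproducible.
rng = np.random.default_rng(0)
frames = [noise_frame(120, 160, rng) for _ in range(3)]
```

Displaying a fresh frame on every iteration of a loop gives the familiar flickering static.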

Mind == Blown!

So some time back I saw this video presentation of a new and, as I like to call it, novel method for extracting 3D structures from a single image. Part of the reason why this blows my mind is that the approach is well defined for a specific scenario and it combines the best of both the human brain and the computer's processing power.

We have a great sense of depth perception. Our brains are well trained to construct an object's three-dimensional model just by looking at pictures. This, however, is a non-trivial and highly challenging task for computer algorithms. On the other hand, computers are capable of computing and interpolating data at a much faster rate than humans, provided that the task is simple and fairly straightforward.

Computer Vision is everywhere...

Like most Android developers, I am a big fan of Google Nexus tablets and smartphones. I have been using a Google Nexus 4 for a while now and I am impressed by all kinds of cool stuff you can do with it. A number of cool applications are based on different computer vision techniques. In this post I will be discussing these applications.

To list just a few obvious ones, Android smartphones have face-recognition-based unlocking, a camera app that can pick up faces, panorama creation, photo editing, and Photo Sphere, which uses readings from a number of inertial sensors to stitch multiple pictures into one 3D picture.

OUT-A-TIME: What is the fourth dimension?

I have been doing my research using three-dimensional datasets acquired by both real and synthetic methods. During my past research I used the Microsoft Kinect to capture real-world objects in three-dimensional space. In contrast, I have also used computer graphics to generate such three-dimensional datasets. Some other projects I have worked on also revolved around concepts loosely related to multiple dimensions.

Working with these multi-dimensional datasets, I have always been interested in finding out how these extra dimensions would exist in reality (if they ever did). Here I was more interested in the question of the physical space we live in. Annoyingly, this has always confused me. I simply could not comprehend more than three dimensions.

For those of you who are familiar with the picture below, this post is going to be as interesting for you to read as it was for me to write.

Setting up freeglut and GLTools with Visual Studio 2010

It's good to be writing a tutorial after a long time, and there are a number of reasons for the gap. First of all, I have been really busy with a lot of work and research (well, actually I still am!). Secondly, it is only recently that I have been struggling with a setup that has few documented tutorials, while there seem to be a lot of beginner developers facing the same problem as I am.

This particular tutorial deals with setting up a Microsoft Visual C++ 2010 project for use with the examples found in the OpenGL SuperBible, 5th Edition. The book has a section which details the same process for a Visual C++ 2008 project, which is completely different from the one in this tutorial. As always, I have tried to keep everything simple and straightforward, so even a person who has no knowledge of these settings can make them work.

Information - How much do we need?

The good thing about doing research is that you can come across a new technique or method every day. Recently I have been working with a lot of information from different datasets and trying to extract the useful information content, which can eventually be used to describe the major trend in the dataset.

A major hurdle in this is quantifying how much of the information is actually informative. One widely used method for this is Singular Value Decomposition (SVD). As it is so widely used, the details of the method can easily be found across the web. This blog post visualizes the information content and tries to work out how much is required for computer vision applications. Here I am using one of the applications of SVD, which is to compress the content of an image. Although better approaches exist for compression, the content used can be quantified easily using SVD.

Below is a sample grayscale image that I have used for this blog post. This is the original image, without any compression using SVD. Notice that there are a lot of regions in the image that look similar.
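The SVD-based compression described above can be sketched with NumPy as follows. The idea is to keep only the `k` largest singular values and measure how much of the image's "energy" (the sum of squared singular values) they retain. The function name and the energy measure are choices made for this illustration, not necessarily what the post uses.

```python
import numpy as np


def svd_compress(image, k):
    """Rank-k approximation of a grayscale image plus the fraction of energy kept.

    Assumes the image is a 2D array that is not all zeros.
    """
    u, s, vt = np.linalg.svd(image.astype(np.float64), full_matrices=False)
    # Scale the first k left singular vectors by their singular values,
    # then project back with the first k right singular vectors.
    approx = (u[:, :k] * s[:k]) @ vt[:k, :]
    retained = float(np.sum(s[:k] ** 2) / np.sum(s ** 2))
    return approx, retained
```

For an image of size m x n, a rank-k approximation needs about k(m + n + 1) numbers instead of m·n, so images with many similar-looking regions tend to need only a small k before `retained` gets close to 1.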

Programming Inception - Function within a function

This post is about a small but vital part in one of my projects in the past. This specific part of my project dealt with interpolation of quantized silhouette images, using a simple averaging based recursive interpolation.

Okay, I know it's difficult to understand, but don't stop reading just yet. The most interesting part is that it can be related to the concept of "programming inception". You might be wondering whether this is something very difficult or deep. Don't concentrate too hard like Cobb here, from the movie Inception. This concept is as simple as it can get; it's just that at first it's difficult to grasp.

If you are like me and have watched "Inception" several times, then recursion in programming can be thought of as having a dream within a dream. The more levels of dream you go into, the more specific the details get. Likewise, a recursive function calls itself with its own input and uses the outputs across multiple levels to get deeper and closer to the answer, just like Cobb did in the movie to plant an idea in a deep dream level.
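The project's actual interpolation code is not shown here. As a minimal sketch of an averaging-based recursive scheme, the function below inserts the average of two frames between them and then calls itself on each half, going one "dream level" deeper each time. The function name and the `depth` parameter are assumptions; the frames can be plain numbers or NumPy image arrays.

```python
def interpolate(a, b, depth):
    """Return the frames from a to b inclusive, adding averaged midpoints recursively."""
    if depth == 0:
        return [a, b]  # base case: nothing left to insert
    mid = (a + b) / 2
    left = interpolate(a, mid, depth - 1)
    right = interpolate(mid, b, depth - 1)
    return left + right[1:]  # drop the duplicated midpoint


print(interpolate(0.0, 8.0, 2))  # prints [0.0, 2.0, 4.0, 6.0, 8.0]
```

Each extra level doubles the number of gaps, so a call with `depth` levels returns 2**depth + 1 frames.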

The moment you know you are going in the right direction..

I have been working lately on a project, which has taken up a lot of time up till now.
I admit it is really difficult to do research, and I do admit that sometimes it gets you to the point of total frustration. But the reward is truly the most awesome thing.

Earlier this week, I received the result of the review process for the conference paper I had submitted to one of the best conferences for computer vision. And to my surprise and excitement, it got accepted.

This might be the beginning, but I still have a long way to go. However, it's good to know once in a while that you are heading in the right direction. It certainly gets your frustration down and motivation high.

I will be posting more about the topic related to this publication. Stay tuned.

Wednesday, 10 April 2013

The Universe is in us!

I have been away from this blog for a while, and there are a number of reasons for that. Mostly I have been really really lazy with lots of work and sleep. The good news is that I am back and I have quite a few things to post about.

While reading this, you might be wondering what this post is about. Well, it's about surprising similarities between two totally different worlds. The first one involves the microscopic world of DNA. The data I used is specifically cancer-mutated DNA that I was provided with when I went to a GameJam for Cancer Research UK. One of the problems we tried to address in this GameJam was to identify the regions in DNA with cancer mutations. Being a Computer Vision Engineer, I have been really interested in representing the data in a visual way. While I might not have succeeded in creating something useful, what I found was quite interesting.

Game hackathon for Cancer Research UK

Over the weekend I was at the Google Campus London for the hackathon for Cancer Research UK. The objective of this hackathon was to convert DNA data into an interactive and social game to help accelerate cancer research. The data provided by Cancer Research UK had different mutations in DNA, which could be identified by sudden shifts in the data points. Approximately 40 developers and gamers spent 48 hours designing different games that utilized this data, offered a social gaming experience and, above all, provided some feedback for easy identification of the DNA mutations resulting in cancer.
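The format of the Cancer Research UK data is not given in this post, so the following is only a toy sketch of what "a sudden shift in the data points" can mean computationally: compare the mean of the points just before a position with the mean of the points just after it, and flag positions where the two differ by more than a threshold. The function name, window size and threshold are all assumptions for illustration.

```python
def find_shifts(values, window=3, threshold=1.0):
    """Return indices where the local mean jumps by more than `threshold`.

    For each index i, the mean of the `window` points before i is compared
    with the mean of the `window` points starting at i.
    """
    shifts = []
    for i in range(window, len(values) - window + 1):
        before = sum(values[i - window:i]) / window
        after = sum(values[i:i + window]) / window
        if abs(after - before) > threshold:
            shifts.append(i)
    return shifts


print(find_shifts([0, 0, 0, 0, 0, 5, 5, 5, 5, 5], window=3, threshold=4))  # prints [5]
```

The games built at the hackathon relied on the human eye for this step; a rule like the one above is simply the most basic automated counterpart.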

The outcome was a number of games with diverse ideas, each one focusing on one thing: analysing the data using the human eye. More details on this event are coming soon, as we are all waiting to hear about it from Cancer Research UK.

Report on this event can be found here: bit.ly/WsTB1B
City University Press Release on this event: http://tinyurl.com/cmxdbyc

Friday, 1 March 2013

Computer Vision in Matlab

Over the course of my research I have found that it is really difficult to get access to the actual code that different authors use in the implementations behind their publications.

I just found out about a very good link where you can find basic algorithms and implementations of different image processing techniques, which can be useful for anyone doing research in this field.

Here is the link ( Peter's Functions for Computer Vision ):
http://www.csse.uwa.edu.au/~pk/research/matlabfns/

Kudos to the University of Western Australia for putting this online for other researchers.

Update 07/03/2013: Since I wrote this post, I have found numerous MATLAB implementation pages online. I am sharing a list below.

Right now I have just listed all the available toolboxes and code. I will be sorting and updating this list soon.

Future of Music == Gesture + AI + Singing

While doing some research, I found this talk in which a musician talks about how she was able to use different gestures to compose a song. The actual talk can be seen below:

What is the first thing that comes to your mind after watching this? Yes, it is amazing to see such a performance if you are a music fanatic. But for me it is even more interesting to see the different aspects of data fusion involved. Looking at her performance, a number of things come to my mind:

1. Hand gesture recognition using data gloves.
2. Body posture recognition using Kinect Sensor.
3. Localization of the person on stage using Kinect Sensor.

You might have noticed that there are different hand gestures which are used to start the editing or instrument-playing sequence. While the hand gestures are used to play specific notes as well, body posture selects the different after effects/post-processing. Similarly, the location of the singer on stage is mapped to different music effects.

This really shows the potential of natural interaction technology, and what might be achieved if new ideas are integrated into these natural interaction methods.

Reference:
http://www.kinecthacks.com/imogen-heap-talks-ableton-controlling-gloves/

Wednesday, 6 February 2013

OpenCV-2.4.X Binaries pre-compiled with Visual C++ support

Update (07/05/2013):
Below are the different versions of OpenCV binaries I compiled from source on Windows 7 using Visual C++ 2010:

1. OpenCV-2.4.3VC
2. OpenCV-2.4.5VC

You can use either one of the above binaries and set up as follows:

To set these binaries up, extract the contents of the folder. Edit the SYSTEM ENVIRONMENT VARIABLES to add the path to the \bin folder, similar to what has been done in my previous post.

Please post any problems in comments below.

Seeing and believing

Sunday, 29 December 2013

OpenCVKinect: Acquiring Kinect Data Streams in OpenCV

Sunday, 15 December 2013

One Image hiding over eight thousand different stories...

Tuesday, 10 December 2013

Implementing SnakesGame using OpenCV

Friday, 6 December 2013

Google's Christmas surprise....!!

Tuesday, 19 November 2013

Implementing Noisy TV in OpenCV...wait WHAT?!?

Tuesday, 8 October 2013

Mind == Blown!

Tuesday, 10 September 2013

Computer Vision is everywhere...

Monday, 26 August 2013

OUT-A-TIME: What is the fourth dimension?

Monday, 12 August 2013