
Showing posts with label Computer Vision.

Tuesday, 20 September 2016

Hiding data in Images

Here is a still from my favourite movie:
The Martian
Now, what if I told you that I have hidden a big message inside this image? Would you be able to find out what it is?
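The post keeps the method secret, but one common way to hide text inside an image is least-significant-bit (LSB) embedding, where each bit of the message overwrites the lowest bit of a pixel value so the picture looks unchanged. The sketch below is only a minimal illustration of that general idea in OpenCV/C++ (file names and message are made up), not necessarily the technique used in this post.

#include <opencv2/opencv.hpp>
#include <string>

// Embed a text message into the least significant bit of the blue channel.
void hideMessage(cv::Mat& image, const std::string& msg)
{
    CV_Assert(image.type() == CV_8UC3 &&
              (int)msg.size() * 8 <= image.rows * image.cols);

    for (size_t bit = 0; bit < msg.size() * 8; ++bit)
    {
        int row = (int)(bit / image.cols);
        int col = (int)(bit % image.cols);
        uchar bitValue = (msg[bit / 8] >> (7 - bit % 8)) & 1;

        cv::Vec3b& px = image.at<cv::Vec3b>(row, col);
        px[0] = (px[0] & 0xFE) | bitValue;   // overwrite only the lowest bit
    }
}

int main()
{
    cv::Mat cover = cv::imread("martian_still.png");   // hypothetical file name
    if (cover.empty()) return -1;
    hideMessage(cover, "Hello from Mars");
    cv::imwrite("martian_hidden.png", cover);           // PNG is lossless, so the bits survive
    return 0;
}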

Tuesday, 2 August 2016

Particles explained using Gifs!!

Over the past few months, I have been reading, understanding and implementing a number of existing algorithms in the Computer Vision domain. Implementing particles and particle-based algorithms has really had me excited and almost on the edge of my seat. One may ask what makes particles so interesting? Let me try to get the concept across.

Particles, like many algorithms in computer science, are inspired by nature. Have you ever seen a beam of sunlight come through a window and illuminate a cloud of floating dust particles (a rare sight in London, though I have seen it)? When you watch these tiny particles, you notice that they are suspended in air and that it is very difficult to predict their motion unless you disturb the surrounding air. This simple concept is vital for many computer algorithms that model the motion/dynamics of an object.

Particles, along with their randomness, can be simulated inside a computer program. The simplest such algorithm is called a Random Walk, where a particle is modelled by its current position/state alone and a random displacement/jump determines its next position in time. Here I have shown one Random Walk particle:

A Single Random Walk Particle
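To make the idea concrete, here is a minimal sketch of a single Random Walk particle in OpenCV/C++: the particle's state is just its position, and every time step adds a zero-mean Gaussian jump (canvas size and step size are arbitrary choices for illustration).

#include <opencv2/opencv.hpp>

int main()
{
    cv::Mat canvas(400, 400, CV_8UC3, cv::Scalar::all(0));
    cv::Point2f particle(200.f, 200.f);          // current state: position only
    cv::RNG rng(cv::getTickCount());

    for (int t = 0; t < 1000; ++t)
    {
        // Next state = current state + a random displacement (zero-mean Gaussian jump).
        particle.x += static_cast<float>(rng.gaussian(3.0));
        particle.y += static_cast<float>(rng.gaussian(3.0));

        cv::circle(canvas, cv::Point(cvRound(particle.x), cvRound(particle.y)),
                   1, cv::Scalar(0, 255, 0), -1);
        cv::imshow("Random Walk", canvas);
        if (cv::waitKey(10) == 27) break;        // press Esc to stop early
    }
    return 0;
}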

Monday, 6 June 2016

Expectation Maximization for Gaussian Mixture Model in OpenCV

I recently wrote code for Gaussian Mixture Model (GMM) based clustering in C++. As always, I found it much more convenient to use OpenCV for manipulating matrices. Although an Expectation Maximization-based GMM implementation already exists, I tried to understand it by writing my own implementation.

The basic idea of GMM clustering is to first randomly assign each sample to a cluster, which provides an initial mixture model. This model is then optimized by alternating between the Expectation step (computing the probability/score of assigning each sample to each component of the GMM) and the Maximization step (updating the parameters of each mixture component using those probabilities/scores). An attractive property of GMM is its ability to cluster data that does not have clear cluster boundaries, which is achieved by keeping a probability/score for each sample under each cluster component.
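For reference, the existing implementation mentioned above can be driven with only a few calls. The sketch below assumes the OpenCV 3.x cv::ml::EM interface and toy 2-D data; the probs output is exactly the per-sample, per-component probability/score discussed here.

#include <opencv2/core.hpp>
#include <opencv2/ml.hpp>
#include <iostream>

int main()
{
    // Toy data: two clouds of 2-D points drawn around different means.
    cv::Mat samples(200, 2, CV_32F);
    cv::randn(samples.rowRange(0, 100),   cv::Scalar(2.0, 2.0), cv::Scalar(0.5, 0.5));
    cv::randn(samples.rowRange(100, 200), cv::Scalar(6.0, 6.0), cv::Scalar(0.8, 0.8));

    // Fit a 2-component Gaussian mixture with the built-in EM implementation.
    cv::Ptr<cv::ml::EM> em = cv::ml::EM::create();
    em->setClustersNumber(2);
    em->setCovarianceMatrixType(cv::ml::EM::COV_MAT_DIAGONAL);

    cv::Mat logLikelihoods, labels, probs;
    em->trainEM(samples, logLikelihoods, labels, probs);

    // 'probs' holds the soft assignment (responsibility) of every sample to every component.
    std::cout << "Estimated means:\n" << em->getMeans() << std::endl;
    return 0;
}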

Saturday, 16 April 2016

OpenCVKinect 2.0 - Acquiring Kinect depth stream in OpenCV

It has been almost two years since I first wrote the code for OpenCVKinect. It has been really good to know that it has been used by a number of other students/developers on GitHub for collecting and analysing Kinect depth streams in OpenCV. I have had some feedback about a possible bug, and some students have asked how they can visualize the depth maps in a better way. So today, after a long time, I am releasing the first official update to this project.


Thursday, 14 April 2016

Particle Filtering - Survival of the fittest

I recently studied dynamic system models such as the Kalman Filter and the Particle Filter.
For the Kalman Filter I followed a Matlab demo that can be found here.

In this demo, the simple problem of tracking a ball is addressed using a Kalman Filter. The input sequence shows a ball travelling at varying velocity, occluded in some frames by a box. I think this is a great example to demonstrate the power of dynamic system models, especially since the occluded frames can be used to test how good a dynamic model is. Here is the actual sequence:


As you can see, the ball goes underneath the box and comes out the other side. If our dynamic model is accurate, it will be able to predict the state of the ball even when it is not visible, and its prediction should match the position where the ball re-emerges.
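For reference, a constant-velocity Kalman Filter in OpenCV/C++ looks roughly like the sketch below. The measurement values and the occlusion interval are made up for illustration; the point is simply that during occluded frames we skip the correction step and rely on the prediction alone.

#include <opencv2/opencv.hpp>

int main()
{
    // State: [x, y, vx, vy], measurement: [x, y] (constant-velocity model).
    cv::KalmanFilter kf(4, 2, 0);
    kf.transitionMatrix = (cv::Mat_<float>(4, 4) <<
        1, 0, 1, 0,
        0, 1, 0, 1,
        0, 0, 1, 0,
        0, 0, 0, 1);
    cv::setIdentity(kf.measurementMatrix);
    cv::setIdentity(kf.processNoiseCov, cv::Scalar::all(1e-4));
    cv::setIdentity(kf.measurementNoiseCov, cv::Scalar::all(1e-1));
    cv::setIdentity(kf.errorCovPost, cv::Scalar::all(1.0));

    for (int frame = 0; frame < 100; ++frame)
    {
        cv::Mat prediction = kf.predict();               // best guess, even when occluded

        bool ballVisible = (frame < 40 || frame > 60);   // pretend frames 40-60 are occluded
        if (ballVisible)
        {
            // In a real tracker this measurement would come from detecting the ball.
            cv::Mat measurement = (cv::Mat_<float>(2, 1) << frame * 3.0f, 200.0f);
            kf.correct(measurement);                     // fuse prediction with observation
        }
        // During occlusion we simply trust the prediction and wait for the ball to reappear.
    }
    return 0;
}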

Sunday, 20 December 2015

Long Exposure Shots with a GoPro and Matlab

I recently got a GoPro. You know, to get cool selfies, videos and all :D I am very impressed by all the cool things you can do with it, but I was specifically impressed by the fact that one can create a time lapse video.

After giving time lapse videos a couple of tries, I wanted to go beyond. I had always seen photographers take long exposure shots using specific DSLR cameras, and I wanted to create just that using the only camera I had, a GoPro. However, I had something much more than the camera: I knew how to write code that deals with a number of images (I am a Computer Vision Engineer).
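The post itself uses Matlab, but the core idea, averaging a whole stack of time-lapse frames so that stationary light stays sharp while moving elements smear out, can be sketched in a few lines of OpenCV/C++ as well (file names below are placeholders).

#include <opencv2/opencv.hpp>
#include <string>
#include <vector>

int main()
{
    // Placeholder names for the time-lapse frames captured by the GoPro.
    std::vector<std::string> files = { "frame_0001.jpg", "frame_0002.jpg", "frame_0003.jpg" };

    cv::Mat accumulator;
    for (const std::string& f : files)
    {
        cv::Mat frame = cv::imread(f, cv::IMREAD_COLOR);
        if (frame.empty()) continue;

        if (accumulator.empty())
            accumulator = cv::Mat::zeros(frame.size(), CV_32FC3);

        cv::accumulate(frame, accumulator);      // running sum of all frames
    }

    // Average the stack: light trails persist, moving clutter averages away.
    accumulator /= static_cast<double>(files.size());

    cv::Mat longExposure;
    accumulator.convertTo(longExposure, CV_8UC3);
    cv::imwrite("long_exposure.jpg", longExposure);
    return 0;
}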

Sunday, 24 May 2015

What if I told you, you can use OpenCV code with Matlab mex!!



Matlab is probably one of the best tools for quickly prototyping and testing your research ideas. As quick and flexible as it is, Matlab code can sometimes consume a lot of execution time, which becomes a big hurdle when multiple experiments need to be run. A faster alternative is to implement Matlab-compatible C++ code and compile it with the mex compiler. While this works most of the time, it is well known that ideas cannot be prototyped as quickly in C++.
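As a rough illustration of what a Matlab-callable C++ routine looks like, here is a minimal mex gateway that runs an OpenCV Gaussian blur on a double matrix. The file name is hypothetical and the column-major handling is deliberately simplified; it only works here because the Gaussian kernel is symmetric.

// hypothetical file: cv_blur_mex.cpp
// compile with:  mex cv_blur_mex.cpp -I<opencv includes> -L<opencv libs> -lopencv_core -lopencv_imgproc
#include "mex.h"
#include <opencv2/imgproc/imgproc.hpp>

// Usage from Matlab:  out = cv_blur_mex(in)  where 'in' is a real 2-D double matrix.
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    if (nrhs != 1 || !mxIsDouble(prhs[0]) || mxIsComplex(prhs[0]))
        mexErrMsgTxt("Expected a single real double matrix as input.");

    const mwSize rows = mxGetM(prhs[0]);
    const mwSize cols = mxGetN(prhs[0]);

    // Matlab stores matrices column-major, so wrapping the raw pointer gives a
    // transposed view; that is acceptable here because the blur kernel is symmetric.
    cv::Mat in(static_cast<int>(cols), static_cast<int>(rows), CV_64FC1, mxGetPr(prhs[0]));

    plhs[0] = mxCreateDoubleMatrix(rows, cols, mxREAL);
    cv::Mat out(static_cast<int>(cols), static_cast<int>(rows), CV_64FC1, mxGetPr(plhs[0]));

    cv::GaussianBlur(in, out, cv::Size(5, 5), 1.5);   // runs as native C++ inside Matlab
}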

Saturday, 27 September 2014

Saying hello to the Internet of Things!

A while back I signed up for the Microsoft Developer Program for the Internet of Things (see #iot for more info). As much as I love exploring new things, this was an extremely exciting opportunity for me.

I have always had the curiosity to know more and to try to hack things my own way. Even as a kid I had an investigative mind that always tried to discover how everything works. You can imagine the extent of this curiosity from the fact that I got a severe electric shock as a kid when I tried to cut the live wire of a clothes iron. This curiosity grew and grew, to the point that I did an engineering degree (yes, I was born with an engineer's mind). I have always been interested in hacking different devices to make something more useful out of them.

Monday, 31 March 2014

Compiling OpenCV-3.0 with Matlab Support

A big uppercase HELLO to everyone! I am back, and after a long time (yet again) I am going to write a tutorial. What I have managed to achieve here is awesome for us computer vision researchers. Yes, you heard it correctly: exciting stuff.

I have been using OpenCV for quite some time now. As good as it is for real-time computer vision applications, it can also be time consuming when it comes to exploring and implementing new research designs. Matlab, on the other hand, has always been a flexible and quick way to achieve my research goals. The only problem with Matlab is that it is not real-time; even worse, if you plan to implement your code in OpenCV for a real-time application, you have to write the algorithms all over again, because using Matlab toolboxes is quite different from using the equivalent methods in OpenCV.

Now comes the fun part: what if you could access OpenCV function calls within Matlab code? What if your code could be easily transferred from Matlab to C++? This is all possible now with the OpenCV 3.0 dev branch, which includes Matlab mex wrappers, a really good step in the right direction. So let's start compiling the code.

Tuesday, 18 February 2014

Creating "Mood lights" animation with OpenCV

The other day I went on a typical London walk near the Thames, and as always loved the lights, reflections and the view. It was amazing! One thing I really liked was the RGB mood lights on the bridge, which transitioned through every possible color in a way that made the whole experience amazing!! Here is a glimpse from my Instagram.



Since there was a sequence of colors involved, I thought I would at least try to replicate these mood lights using OpenCV. It turns out it's not very difficult to make this animation at all. I wrote an algorithm using some clever tricks that kept it simple and interesting. Here is a gif showing how cool the animation looks when you execute the code.
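One simple way to approximate such an effect (not necessarily the exact tricks used in the post) is to sweep the hue channel in HSV space and convert back to BGR every frame; a minimal sketch:

#include <opencv2/opencv.hpp>

int main()
{
    cv::Mat hsv(200, 400, CV_8UC3), bgr;

    for (int hue = 0; ; hue = (hue + 1) % 180)    // OpenCV's hue range is 0-179
    {
        hsv.setTo(cv::Scalar(hue, 255, 255));     // full saturation and brightness
        cv::cvtColor(hsv, bgr, cv::COLOR_HSV2BGR);

        cv::imshow("Mood light", bgr);
        if (cv::waitKey(30) == 27) break;         // press Esc to quit
    }
    return 0;
}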



Thursday, 16 January 2014

Reading a Kinect Depth Image in OpenCV

While reading a captured Kinect Depth Image, it is important to read it in a way that retains the original data. Using a regular cv::imread function call can significantly modify the data stored in a Kinect Depth Image, because a regular cv::imread call uses the default flags, which assume:
  1. The input image is a color image (three channels: RGB)
  2. The depth (number of bits per pixel) of the input image is UCHAR (CV_8U), i.e. 8 bits/pixel
This is obviously not true for a Kinect Depth Image, which is a special type of grayscale image. A Kinect Depth Image contains only one channel (like any other grayscale image); however, its depth is actually 16-bit unsigned (UINT16, or CV_16UC1) rather than UCHAR. This is because a Kinect Depth Image stores a much larger range of values and therefore needs more bits per pixel (i.e. 16 bits/pixel). Now that we know what makes it different, let's see how it can be read in OpenCV code.

To read a Depth Image, use the cv::imread function with the CV_LOAD_IMAGE_UNCHANGED flag. This will not change the data and reads it in its original state.
e.g. cv::imread("DepthInput.png", CV_LOAD_IMAGE_UNCHANGED);
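Putting it together, a short example; the normalization step is my own addition purely for display, the stored 16-bit data itself is untouched (in newer OpenCV versions the same flag is called cv::IMREAD_UNCHANGED):

#include <opencv2/opencv.hpp>

int main()
{
    // Read the 16-bit depth image without any conversion.
    cv::Mat depth = cv::imread("DepthInput.png", CV_LOAD_IMAGE_UNCHANGED);
    CV_Assert(!depth.empty() && depth.type() == CV_16UC1);

    // For display only: scale the 16-bit range down to 8 bits.
    cv::Mat depthVis;
    cv::normalize(depth, depthVis, 0, 255, cv::NORM_MINMAX, CV_8UC1);

    cv::imshow("Depth (visualization only)", depthVis);
    cv::waitKey(0);
    return 0;
}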

I will be updating this post later to include details on how to capture and store a data stream from Kinect using OpenCV.


Sunday, 29 December 2013

OpenCVKinect: Acquiring Kinect Data Streams in OpenCV

Click here to go to code download step directly
_______________________________________________________________________________________________

Edit (26/05/2016): I have updated OpenCVKinect to fix some bugs and to use a different visualization for depth maps. Details of this newer version can be seen here. All other details mentioned in this blog post still apply to the updated version.
_______________________________________________________________________________________________

The holiday season is here, and I have finally found some time to write something for this blog. Some time back I wrote a guide for compiling OpenCV with OpenNI support. Since that post, a lot has changed in the newer OpenNI 2.x SDK, and there have been many improvements for using the Microsoft Kinect sensor. One of the major changes is the object-oriented implementation, as opposed to the pure C implementation of the earlier OpenNI 1.x versions. I have had many requests asking me to write a tutorial for compiling OpenCV with the new OpenNI 2.x support. Although this is not possible with the current SDK and OpenCV, this blog post introduces an implementation for acquiring Kinect data streams in OpenCV.

After a long time, I recently started working with the Kinect sensor again, and I wanted to access the data streams in a more efficient manner. While it looked straightforward to use the built-in function calls in OpenNI 2.x, I was more comfortable working with the OpenCV format instead, so I wanted to hide all the detail of OpenNI objects and their methods behind a simple and convenient interface. To achieve this, I wrote my own implementation, and this blog post officially releases that code as open source, for anyone to use/modify/update. Right now it has some basic methods, but in future I am thinking of adding simple image processing/data recording methods too (to make it easier for beginners).
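To give a flavour of what the wrapper hides, here is a rough sketch of the underlying pattern: grabbing an OpenNI 2.x depth frame and wrapping it in a cv::Mat. This shows the general idea only, not the OpenCVKinect API itself, and error handling is kept to a minimum.

#include <OpenNI.h>
#include <opencv2/opencv.hpp>

int main()
{
    openni::OpenNI::initialize();

    openni::Device device;
    if (device.open(openni::ANY_DEVICE) != openni::STATUS_OK) return -1;

    openni::VideoStream depth;
    depth.create(device, openni::SENSOR_DEPTH);
    depth.start();

    openni::VideoFrameRef frame;
    while (cv::waitKey(30) != 27)                      // press Esc to quit
    {
        depth.readFrame(&frame);

        // Wrap the raw 16-bit OpenNI buffer in a cv::Mat header (no copy).
        cv::Mat depthMat(frame.getHeight(), frame.getWidth(),
                         CV_16UC1, (void*)frame.getData());

        cv::Mat vis;
        cv::normalize(depthMat, vis, 0, 255, cv::NORM_MINMAX, CV_8UC1);
        cv::imshow("Kinect depth", vis);
    }

    depth.stop();
    depth.destroy();
    device.close();
    openni::OpenNI::shutdown();
    return 0;
}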

Sunday, 15 December 2013

One Image hiding over eight thousand different stories...


Working with large datasets has its own pros and cons. Whatever the implementation or field might be, there is always a need to train a machine learning algorithm to recognize the pattern in that data. We often discuss this "pattern" in many different contexts, and a big chunk of the literature addresses this recognition problem. However, it is rarely considered important to ask what this pattern actually looks like, or why it is even called a "pattern" in the first place.

Interestingly, the answer lies in the above image, which shows a collection of 8000 different samples arranged as columns. The first thing to notice is that there actually is a repeating pattern in the data; this is the exact pattern we are trying to learn. It may not make much sense to the eye, but with the correct label representation, each sample can be used to build a model that identifies each class with high accuracy.

Tuesday, 10 December 2013

Implementing SnakesGame using OpenCV


A long time back I used to have a Nokia phone which came with this awesome and simple game. I still love Nokia phones for this, and I keep a couple of them just to play it (no kidding, haha). A while back I decided that I would write my own Snakes game implementation, so I did. What is interesting about this post is that I did not use any graphics engine or OpenGL rendering at all. Instead, the whole game was implemented using OpenCV function calls, which let me write my own rendering functions and display buffers.
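To illustrate the kind of "rendering" involved, here is a stripped-down sketch (not the actual game code) that draws a moving snake straight into a cv::Mat frame buffer using nothing but OpenCV drawing calls:

#include <opencv2/opencv.hpp>
#include <deque>

int main()
{
    const int cell = 20;
    cv::Mat frame(400, 400, CV_8UC3);
    std::deque<cv::Point> snake = { {10, 10}, {9, 10}, {8, 10} };   // grid coordinates
    cv::Point dir(1, 0);                                            // moving right

    while (true)
    {
        // Move: add a new head in the current direction, drop the tail.
        snake.push_front(snake.front() + dir);
        snake.pop_back();

        // "Render" by drawing filled rectangles directly into the cv::Mat buffer.
        frame.setTo(cv::Scalar::all(0));
        for (const cv::Point& p : snake)
            cv::rectangle(frame, cv::Rect(p.x * cell, p.y * cell, cell, cell),
                          cv::Scalar(0, 255, 0), -1);               // -1 = filled

        cv::imshow("Snake", frame);
        if (cv::waitKey(150) == 27) break;                          // Esc to quit
        // (A full game would read arrow keys here to change 'dir',
        //  grow the snake on food and check collisions.)
    }
    return 0;
}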

Friday, 6 December 2013

Google's Christmas surprise....!!

I have an Android phone which automatically syncs all the photos I take to Google+ Photos.

So today we had our own Christmas tree at uni, and I took a picture to post on Instagram.

After a while I checked and saw that Google's Auto Awesome feature had applied a very cool Christmassy effect to make the picture come alive.


What a nice surprise; it is interesting how Google uses different image processing algorithms.

Feels like it's the holiday season already!!

Tuesday, 19 November 2013

Implementing Noisy TV in OpenCV...wait WHAT?!?

If you are a 90s kid like me, you must admit that back then there was nothing more annoying than losing the signal on your TV. Before the age of the internet, TV was the main source of entertainment, and when the signal was lost, all you could see was an endless race of millions of flies across the screen (a noisy image), made even more intolerable by the loud hiss that came with it.
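Recreating that "no signal" screen takes only a few lines in OpenCV/C++: just redraw a matrix of uniformly random grey values every frame. A minimal sketch (window size is arbitrary):

#include <opencv2/opencv.hpp>

int main()
{
    cv::Mat noise(480, 640, CV_8UC1);

    // Every frame is simply uniformly random grey values, redrawn each iteration.
    while (cv::waitKey(30) != 27)               // press Esc to quit
    {
        cv::randu(noise, cv::Scalar(0), cv::Scalar(255));
        cv::imshow("No signal", noise);
    }
    return 0;
}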


Tuesday, 8 October 2013

Mind == Blown!


So some time back I saw a video presentation of a new and, what I like to call, novel method for extracting 3D structures from a single image. Part of the reason this blows my mind is that the approach is well defined for a specific scenario and it uses the best of both the human brain and the computer's processing power.

We have a great sense of depth perception. Our brains are well trained to construct an object's three-dimensional model just by looking at pictures. This, however, is anything but trivial for computer algorithms; it is a highly challenging task. On the other hand, computers are capable of computing and interpolating data at a much faster rate than humans, provided the task is simple and fairly straightforward.

Tuesday, 10 September 2013

Computer Vision is everywhere...

Like most Android developers, I am a big fan of Google's Nexus tablets and smartphones. I have been using a Google Nexus 4 for a while now and I am impressed by all the cool stuff you can do with it. A number of cool applications are based on different computer vision techniques, and in this post I will discuss these applications.
 
 
To list just a few obvious ones: Android smartphones offer face recognition based unlocking, a camera app that can pick out faces, create panoramas and edit photos, and that uses readings from a number of inertial sensors to stitch multiple pictures into one 3D picture called a Photo Sphere.
 

Monday, 10 June 2013

Programming Inception - Function within a function

This post is about a small but vital part of one of my past projects. That specific part dealt with the interpolation of quantized silhouette images, using a simple averaging-based recursive interpolation.
 
Okay, I know it's difficult to understand, but don't stop reading just yet. The most interesting part is that it can be related to the concept of "programming inception". You might be thinking this is something very difficult or deep. Don't concentrate too hard like Cobb here, from the movie Inception. The concept is as simple as it can get; it's just that at first it's difficult to grasp.
 

If you are like me and have watched "Inception" several times, then recursion in programming can be thought of as having a dream within a dream. The more levels of dreams you go into, the more specific the details become. Likewise, a recursive function calls itself with its own input and uses the outputs from multiple levels to get deeper and closer to the answer, just like Cobb did in the movie to plant an idea at a deep dream level.
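The project's exact code is not shown here, but a minimal sketch of averaging-based recursive interpolation between two silhouette frames might look like the following (file names and recursion depth are placeholders); each recursive call is one dream level deeper:

#include <opencv2/opencv.hpp>
#include <vector>

// Recursively fill the gap between two silhouettes with their averages:
// insert the midpoint frame, then interpolate each half again until 'depth' runs out.
void interpolate(const cv::Mat& a, const cv::Mat& b, int depth, std::vector<cv::Mat>& out)
{
    if (depth == 0) return;

    cv::Mat mid;
    cv::addWeighted(a, 0.5, b, 0.5, 0.0, mid);    // simple average of the two frames

    interpolate(a, mid, depth - 1, out);           // go one level deeper on the left half
    out.push_back(mid);
    interpolate(mid, b, depth - 1, out);           // and on the right half
}

int main()
{
    cv::Mat first = cv::imread("silhouette_first.png", cv::IMREAD_GRAYSCALE);
    cv::Mat last  = cv::imread("silhouette_last.png",  cv::IMREAD_GRAYSCALE);
    if (first.empty() || last.empty()) return -1;

    std::vector<cv::Mat> inbetween;
    interpolate(first, last, 3, inbetween);        // depth 3 gives 2^3 - 1 = 7 new frames
    return 0;
}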

Wednesday, 10 April 2013

The Universe is in us!

I have been away from this blog for a while, and there are a number of reasons for that; mostly I have been really, really lazy, with lots of work and sleep. The good news is that I am back and I have quite a few things to post about.

While reading this, you might be wondering what this post is about. Well, it's about surprising similarities between two totally different worlds. The first one involves the microscopic world of DNA; the data I used is cancer-mutated DNA that I was given when I went to a GameJam for Cancer Research UK. One of the problems we tried to address at this gamejam was identifying the regions of DNA with cancer mutations. Being a Computer Vision Engineer, I have been really interested in representing the data in a visual way. While I might not have succeeded in creating something useful, what I found was quite interesting.