A Multi-touch Application Development Framework

Friday, February 15, 2008

The Detector

With Fouad done with the input module, the next step was to detect the fingers' positions (blobs) in the video frames so they can be processed later on .. That's where the detector comes in ..

Basically, we wanted a fast algorithm that could run on every frame at the lowest cost, so our choice was the Component-Labeling Algorithm Using Contour Tracing Technique. That algorithm runs in linear time & was reported to be faster than the alternatives (see the table below)

#CC : Number of connected components

How does the Algorithm work?

Just as a polygon can be determined by a set of vertices, a component can be determined by the set of points lying on its contour .. Mainly, the Algorithm scans a binary image & the process is divided into four steps (as illustrated in the figure):

1- When an external contour point, say A, is encountered for the first time, we make a complete trace of the contour until we return to A. We assign a label to A and to all points of that contour.

2- When a labeled external contour point A' is encountered, we follow the scan line to find all subsequent black pixels & assign them the same label as A'.

3- When an internal contour point, say B, is encountered for the first time, we assign B the same label as the external contour of the same component. We then trace the internal contour containing B & assign all of its contour points the same label as B.

4- When a labeled internal contour point, say B', is encountered, we follow the scan line to find all subsequent black pixels & assign them the same label as B'.

When implementing the above procedure, we first add a dummy row of white (background) pixels at the top & associate the image with an array carrying a label for each pixel (initially all pixels are unlabeled).
We start with label C = 1. Following the four steps described above, we are able to assign a label to each pixel; the value of C is incremented whenever a new unlabeled external contour is encountered.
Each time C is incremented, a new blob has been found: a new blob object is created, given a new id & added to the list of blobs (which is later handled by the tracker).

As for the contour-tracing algorithm itself, it is a boundary-following procedure: starting from a contour pixel, it checks the pixel's 8 neighbors, in an order based on the direction we arrived from, searching for the next black (object) pixel. This ordering makes the search faster and guarantees that each pixel is examined only a small, bounded number of times during the whole procedure.

The Algorithm is now implemented & tested successfully with the other modules of the project.


Updates:

For further processing requirements, some additions were made to the algorithm, including calculating the center of mass of each blob (along with its total mass, i.e. its pixel count). In addition, blobs are now inserted in sorted order. (Refer to the Tracking post.)

Further work:

Investigating how to make the thresholding value that determines the blobs more dynamic, depending on the illumination of the received frames.

References

F. Chang, C.-J. Chen, and C.-J. Lu, "A Linear-Time Component-Labeling Algorithm Using Contour Tracing Technique," Computer Vision and Image Understanding, 2004.

Tracking Demo

Well, I'll be discussing the TrackingDemo application with you.

This demo aims to test the tracking module "twTracker". TrackingDemo is a C# application: each frame, it receives a list of updated blobs from the Tracker and draws that list using GDI+, with a different color for each blob.

TrackingDemo contains one main class, "TrackingSimulator", which holds a Hashtable of the current blobs, keyed by blob ID. Each frame, a sequence of messages is sent from the Touchwork application to the TrackingDemo application:
- A start message of ID "5002", which contains the number of blobs in this frame and is used for initialization.
- N messages of ID "5000", each containing: blobID, blobPosition (x, y) and blobPressure (the number of pixels in the blob), where N is the number of blobs in this frame.
- An end message of ID "5001", which signifies the end of the frame, so processing of the input can start and the blobs can be drawn.

Now I have a list of blobs from the tracking module, and I have the Hashtable of my current blobs, so I compare by ID to check the state of each blob. There are 3 cases:
1)- New blob: the ID is not found in the Hashtable, so this is a new blob; I generate a random color for it.
2)- Updated blob: the ID exists in the Hashtable, so I update the position of the blob, and it maintains its color.
3)- Removed blob: the remaining blobs in the Hashtable that don't match any blob in the list of new blobs are removed from the Hashtable.

Now that I've updated my list of blobs, I draw each of them at its position with its given color, and the blobPressure is used to determine the radius of the drawn circle.

That's it! The TrackingDemo is finished and worked well; it also proved that the tracking module is working well, thank God.

Wednesday, February 13, 2008

Too much to do ..

Well, this is too much for a day ..

Fouad and I went to college today to try to fix the hardware. Fouad went first, I followed. We were finally able to fix it temporarily by trying the new circuit boards and connecting the IR LEDs in parallel. It worked, but the light gets weaker the farther you move away from the power source: we need an electrician!
We also found a small problem with the light and the threshold for detecting the blobs. We'll postpone fixing it till we fix the hardware.
Conclusion: we had to fix the hardware to make a live demo. The live demo was meant to test whether twInput, the input module, now captures frames from a live camera feed, so the rest of the modules could carry on with their work. Thanks to Fouad, it's working great now. But the live demo itself? Not that sweet!

Summary of today's decisions:

  • Roles have been distributed among us:
    • Developers: All four.
    • Quality Engineers (QE): All four - We'll be testing other team members' modules.
    • Engagement Manager (EM): Mahmoud - Responsible for interacting with clients (professors!!)
    • Minute Keeper: Roaa - Times, schedules, notes, meeting minutes, etc. (the yellow notebook!!)
    • Configuration Manager & Information Systems: Fouad
    • Project Leader (PL): Alaa
  • This blog will be public from now on.
  • Roaa will fix a way to send us updates via emails rather than RSS feeds.
  • Tasks will be kept in an online task list, containing the WBS (Work Breakdown Structure) along with the Issues List, containing the issues and bugs found and their status.
  • Meeting with ITWorx supervisors will be bi-monthly or whenever needed. Their time changes to Wednesdays from 4.30pm till 5.30pm.
What's next?!

  1. Fix the hardware - Waleed Abdel Wahab from CriticalSites (a former FCIS graduate) has contacted a girl whom I'll be meeting tomorrow isA to help us with the hardware ..
  2. Another Live Demo - After fixing the threshold bugs and the hardware!
  3. twAgent - Fouad and Mahmoud
  4. twGesture - Alaa and Roaa
  5. Finish Documents.
  6. Seminars:
    • Saturday: Our third seminar at FCIS - Scope: Input and Detection modules.
    • Tuesday: Seminar before the QAAP Committee.

Saturday, February 2, 2008

Track Me ..

OK, I'll discuss the tracking module: twTracker ..

twTracker has mainly one class, BlobTracker, responsible for tracking blobs (obviously!).
It works using a simple match-and-mark algorithm, nothing sophisticated so far .. I had to ask Roaa to make some code changes in her detection module, so that blobs are stored in a sorted list as I'll be describing shortly. I also had to make some further modifications to the Blob class ..

Now each blob mainly has the following attributes:
  • int m_id: Carries the blob ID.
  • Point m_center: Holds the x- and y-coordinates of the center of mass of the blob.
  • float m_dist: Represents the actual distance between the origin (0,0) and m_center, the center of the blob (x,y).
  • bool m_checked: Is a flag used in the tracking algorithm.

The tracker is based on comparing two lists of blobs each frame to find the closest pair of blobs within a constant distance range (a threshold) and updating a static list of "current" blobs. To ease the comparing, or "searching", process, I suggested keeping the blobs in a sorted list, where the sorting is based on the value of m_dist, the absolute distance between the origin and the center of the blob. It's somewhat like putting all blobs on one straight line, where the first is the closest to the origin and the last is the farthest from it (toward the opposite corner).

This is shown in the following diagram. It shows four sample blobs: A, B, C and D. Each blob is at a different distance from the origin as shown.


In the diagram, you can see that blob C is wrapped in a thick belt. This belt represents the threshold along that imaginary straight line (obtained simply by adding and subtracting the threshold value from the m_dist variable). Any blob outside that belt (having an m_dist value that differs from blob C's by more than the threshold) is excluded from the search process, increasing performance and avoiding extra checks and comparisons. Notice also how blobs A and B can have very close values of m_dist, as shown; this causes a problem: if blob A moves, both the new A and B will match the old A.

Now I keep a static list of blobs, currentBlobs, where I keep the updated positions of the available blobs only. Each frame, I receive a list of blobs present in the frame. I compare them to the current list and try finding the closest one within the previously mentioned threshold based on the m_dist variable. If two new blobs match one old one, I pick the closest of them to the older one. Matched blobs are marked using the flag m_checked. We have three cases that could happen:
  1. Appearance of a new blob: It will be added to the currentBlobs list.
  2. Change in an old blob position: It will be updated and marked in currentBlobs list.
  3. Disappearance of an old blob: It will be removed from the currentBlobs list at the end, as it won't be marked.
I also keep the pressure (the number of pixels in each blob) so I can simulate the size of the visual circle representing the blob - a demo will be uploaded soon isA.

More Input

DirectShow, the component that your media player is built upon, consists of Filters, which you may have noticed while configuring that media player.

Each filter has input pins, output pins, or both.

A compressed image enters at the input pin of a decoder filter and comes out uncompressed at its output pin, heading to the next filter in the chain, the Video Renderer filter, which displays it. Since the renderer has no output pin (it is a Sink), it drops the image and sends a ready message upstream to the decoder filter (the intermediate Transform filter), telling it to send the next image; that filter in turn tells the File filter (the Source) to push another image downstream. That whole chain is called a DirectShow graph.

OK, about twInput:
we have 3 main stages in the twGraph :)
Source...
1- a Capture filter ("the camera driver") or a video file reader
2- any necessary decoding/resizing filters to make the images ("media samples") understandable to the next filters
Transform..
3- a Sample Grabber: it reads the samples (images) and sends them to a callback where we can process them; you can either get a copy of the data or the original data that will be sent downstream again.
Sink...
4- again, some decoders so the next filter understands the samples
5- a Video Renderer ("now you have your media player") or a Null filter: it displays the received images, then frees the space allocated for them.
...

Each filter can have preferred media types, so when DS tries to connect an output pin to an input pin, it finds the most favorable media type. If the pins don't prefer the same type, they can never connect directly; DS resolves this by inserting some filters in between (decoders/converters) in an action called "Intelligent Connect".
...
All this ado is wrapped in a simple interface (start/capture/stop/readFromFile), so whenever a decision is made to support other platforms/devices, it should be somewhat easy ("dreaming").
...
Note: we did not use ATL in order to convince DS to get programmed, maybe because even if we used it, it wouldn't look like a C# COM interface ("did they call that Interop?"),
or maybe because ATL is not a walk in the park ("I think"),
or maybe because we didn't need any Events or Sinks or Monks or any of the other crazy COM stuff that I'm not aware of and that ATL was built for.
Just wanted to remind myself of that.

Touchwork Input

In this post I will discuss the twInput component and try to explain some decisions made during its development.

Facts:

1- twInput is developed from scratch on top of DirectShow.
So why not use a library?
  • Input is the main dependency of the framework: if it is not flexible enough, it will force the framework to take a certain shape, i.e. we would think "how to write TW and still fit the input library" instead of "how to write TW". And even once it is done, we can change it at any time to adapt to changes in the main framework.
  • We needed the input to come from a camera or a video file, so we can easily test the code even when the hardware is not available; most of the libraries lacked this.
  • Imagine spending a day learning a library's interface, then a month later having to edit the library itself, so you move on to learning the interface of the library it depends on, and then to understanding how the top library uses the lower-level one ..
    just too much ..
2- There are 2 main low-level alternatives to DirectShow.
Why not use them?
  • ActiveMovie: outdated; it is now called DirectShow :D
  • Video for Windows (VFW): much easier than DS, but outdated; it was mostly made for VxD devices, which went out of production around 2000 (it's WDM now) .. more details in the proposal.
Now I will move on to the design of twInput in another post.

Friday, February 1, 2008

My World of Silicon ..

In a desperate search for a good compliant surface, I had to start experimenting at home. Most people abroad used Rosco sheets, silicone sheets, milk paper, caulk paper, and even silk. Silicone sheets gave the best results so far; unfortunately we don't have those here in Egypt ..

Just a reminder: Why do we need a compliant surface?
  1. It makes "moving" your fingers over the surface easier, which provides better tracking. The motion is not usually "smooth" enough over bare acrylic. It also holds fewer marks and stains compared with the acrylic.
  2. It works as a projection surface. Projecting light onto a transparent medium shows barely anything.
  3. It dims the back-light on the other side of the acrylic sheet; it's really great working in a well-lit room.
So, I decided to try making my own silicone sheet at home ..
I bought a silicone tube, emptied it into a tray, and left it to dry till the next day (I'll add photos ASAP).

Results:
  • Silicone is very thick; spreading it evenly on a surface is nearly impossible. A solvent is needed to make it thinner; it could then even be sprayed.
  • The thickness also creates millions of air bubbles, which will cause noise later in the detection process.
  • In a live test over the hardware, it's PERFECT!!! :D
I contacted two companies to help us out; one hung up, and the other knows nothing about silicone. Therefore, the next step will be finding a solvent that will help me reduce the thickness of the silicone so I can spread it evenly across the sheet.

Wednesday, January 30, 2008

Hope ..

I thought of trying out some crazy experiment that might help us find a good compliant surface for the hardware. We decided to meet in college before our first meeting at ITWorx to integrate our tasks and try them out, also a chance for me to see the results of my experiment. The shock was to find that the hardware doesn't work! All circuit connections were totally messed up. The camera didn't work. It was horrible!

The camera issue was fixed, thank God (!!!). The hardware will need a proper reconstruction. Fouad and Mahmoud paid El-Nekheily a visit and bought a few slim electronic circuit boards to hold the LEDs. We have tried one of them so far. Using them seems promising.

The good news is: my experiment worked, thank God!!! (I'll discuss that in a later post isA)

After the continuous failure to fix things, we had to hurry to ITWorx. Mihan and Essam were so nice, they are so helpful as well. They both mentioned two hardware stores where we might find some parts for our hardware construction. They sent me details to check RS and Maamoun stores; we should be doing that in the coming few days, or after we're back isA.

Meeting Summary:
  • Assign roles between us - to be discussed.
  • Create a task-list - suggestions: Google Notebook or GoogleDocs (Excel Spreadsheet).
  • Start writing the requirements document till we get the templates from ITWorx.
  • Start writing the WBS (Work Breakdown Structure) of the project.
  • Source Control.
  • Meeting timing will be changed - hopefully to Tuesday isA.
For source control, I will be hosting the source code at home on my desktop PC, where I have VisualSVN installed and a dynamic DNS account to access it. We will be using TortoiseSVN on each client - tested and working now, thank God; just finished that!

Regarding the tasks due last Sunday:
  • Fouad: Done - awaiting a live test.
  • Roaa: Had a killer bug in the detection code; solved now, thank God - tested .. We still have some memory leaks though (undeleted pointers).
  • Alaa: Tracker is done, thank God! There were some surprising special cases that are handled now - awaiting two live tests: with Mahmoud's application and with the hardware.
  • Mahmoud: Done - awaiting a live test.
We're kinda behind schedule, and we've got lots to do for the acmASCIS Student Chapter when the next semester starts isA. We will be traveling next week isA. Working on the project will be resumed after that isA.

Saturday, January 26, 2008

Back on track ..

We're done with the exams, thank God, & the so-called vacation has started .. It doesn't seem like anything close to the word "vacation", except that we can finally sleep with no worries about having to wake up & study the next day .. lol

So, next Sunday isA is our first meeting @ ITWorx. We planned a few tasks to finish till then:
  • Fouad: Get done with the bottom architecture and the camera issues and fixes ..
  • Roaa: Kill all memory leaks and change the data structure for storing the blobs into a linked list ..
  • Alaa: The blob tracker!!!!
  • Mahmoud: Demo application for visualizing tracked-blobs in terms of (x,y) coordinates for every blob (with ID for each) ..
I have suggested a change to the way we store blobs and discussed it with Fouad; the sorted linked list will save us much effort in tracking and will end up with better performance isA (I'll discuss that in a later post) ..

I also read, again, about how many people convert from FTIR (Frustrated Total Internal Reflection) to DI (Diffused Illumination) - used in Microsoft's Surface Computing btw!

The most obvious conclusion was that FTIR provides better output (fine, clear, well-illuminated blobs) than DI .. yet it provides worse tracking of "moving" blobs, as it's really hard to move your fingers around while pressing them hard against the acrylic sheet - with DI you don't even have to touch that hard!

DI is also much easier to "get-it-to-work", even though a practical, long-lived setup is usually hard to accomplish .. the setup of FTIR, on the other hand, has only one difficulty: applying the theory - a yes/no question ..

FTIR would definitely beat DI if the "tracking" story were solved - in other words: finding a proper compliant surface, & that is still under heavy research so far .. We have several candidates for a good compliant surface; we're waiting to experiment with them .. (I started a small secret test at home, hope it works!) .. More details about that will be discussed in later posts isA ..

Tim Roth, from nuigroup, has a blog post discussing FTIR vs DI .. Check it out for more details ..

Reminder: We need to find a different time for the weekly meeting (either Tuesday or Wednesday) coz Sunday's causing a conflict with the 2nd semester's schedule!

Wednesday, January 16, 2008

Sample Website & ITWorx's weekly meeting

ITWorx required that we start a blog (already done; you're here, as long as you're reading this), a user group (we created one) and a website (which is currently under construction).

They have also scheduled our weekly meeting to be on Sunday, 4.00pm to 5.30pm ..
Seems suitable for the first week of vacation, as we shall start on the 27th of Jan isA. We won't make it on the second week for we'll be going on the faculty's trip isA.
There's a chance that we might need to change the time for the weekly meeting if there's a conflict with the second semester's schedule (each two of us are from a different department).

Saturday, January 5, 2008

Timeout ..

We're having a break till February due to the increasing intensity of the exams, which will leave Touchwork asleep for a while ..

Here are the news updates for the past, unblogged time:
  • We now have the project sponsored by Microsoft and ITWorx.
  • We have two supervisors from ITWorx: Essam Salah and Mihan Samy, who both seem to be so nice, helpful and interested in our work.
  • Project proposal has been updated.
  • Our third seminar, which was supposed to be on the 29th of Dec., was postponed to next term .. thank God!!!!!!!
  • So far, we're on track according to our time-plan, which reminds me to add the legend to the proposal, as I added the time-plan schedule without one (A).
See ya after the exams .. Ciao!