Monday, April 15, 2013

ShortStraw: A Simple and Effective Corner Finder for Polylines



A. Wolin, B. Eoff, and T. Hammond
Texas A&M University
Department of Computer Science
College Station, TX 77843-3112


Howdy!
             In this blog post, I will give a brief summary of the algorithm presented in the above-mentioned paper and state my opinion on it.

             The authors propose a novel method to find corners in a polyline. Their method is very simple, easy to implement and highly accurate. For each window of resampled stroke points, the Euclidean distance (the "straw") between the window's endpoints is calculated, and the shortest straws indicate corners. Following is a description of the algorithm:

The stroke points are first resampled at a fixed interspacing, calculated as the length of the diagonal of the bounding box divided by 40, an empirical constant. Increasing this constant introduces noise, while decreasing it over-smooths the stroke. Using the following algorithm, the corners are determined:
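A minimal sketch of this resampling step (the function and variable names are my own, not the paper's):

```python
import math

def resample(points, diagonal):
    """Resample a stroke so consecutive points sit a fixed distance apart.

    `points` is a list of (x, y) tuples; the interspacing is the length of
    the bounding-box diagonal divided by the paper's empirical constant 40.
    """
    spacing = diagonal / 40.0
    pts = list(points)
    resampled = [pts[0]]
    accumulated = 0.0
    i = 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if accumulated + d >= spacing:
            # interpolate a new point exactly `spacing` along the path
            t = (spacing - accumulated) / d
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            resampled.append(q)
            pts.insert(i, q)  # q becomes the previous point for the next step
            accumulated = 0.0
        else:
            accumulated += d
        i += 1
    return resampled
```

On a straight 40-unit stroke whose bounding-box diagonal is 40, this yields points spaced one unit apart.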


The main algorithm terminates when i > |points|. In a post-processing step, each pair of consecutive corners is put through a line test: the ratio of the Euclidean distance to the path distance between them must be greater than 0.95, i.e. close to 1. If a pair fails the line test, there must be a corner in between them, and that corner is approximated as the midpoint of the path. Then triplets of consecutive corners are tested to see whether they form a line. If they do form a line, the middle corner is a false positive and is removed.
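The straw computation and the line test can be sketched as follows (a hedged reconstruction: the window size 3 and the 0.95 thresholds are the paper's, the helper names are mine):

```python
import math
from statistics import median

W = 3  # straw half-window from the paper

def straw_corners(points):
    """Bottom-up pass: short straws mark candidate corners."""
    straws = {i: math.dist(points[i - W], points[i + W])
              for i in range(W, len(points) - W)}
    threshold = median(straws.values()) * 0.95
    corners = [0]
    i = W
    while i < len(points) - W:
        if straws[i] < threshold:
            # take the local minimum of this below-threshold run
            min_straw, min_i = straws[i], i
            while i < len(points) - W and straws[i] < threshold:
                if straws[i] < min_straw:
                    min_straw, min_i = straws[i], i
                i += 1
            corners.append(min_i)
        else:
            i += 1
    corners.append(len(points) - 1)
    return corners

def is_line(points, a, b):
    """Line test: chord length over path length must exceed 0.95."""
    path = sum(math.dist(points[k], points[k + 1]) for k in range(a, b))
    return math.dist(points[a], points[b]) / path > 0.95
```

On an L-shaped stroke resampled at unit spacing, `straw_corners` returns the two endpoints plus the elbow, while `is_line` rejects the pair spanning the elbow, so the top-down pass would go looking for a corner between them.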

The algorithm performed really well in the evaluation phase, outperforming most of the existing algorithms. The authors use two accuracy measures. The first is the number of correctly found corners divided by the number of corners a human would perceive as correct, but this doesn't account for false positives. So they also use an all-or-nothing measure, which counts a stroke as correct only if all of its corners, and no extra ones, are found.

I like the approach that the authors use. It is simple, fast and easy to implement. Also, the accuracy is very high, so the algorithm can be used even by people who don't have a background in advanced math.


I used the following sources for this blog post:
[1] A. Wolin, B. Eoff, and T. Hammond. ShortStraw: A Simple and Effective Corner Finder for Polylines. In Proceedings of the Fifth Eurographics Conference on Sketch-Based Interfaces and Modeling (SBM '08), pages 33-40. Eurographics Association, Aire-la-Ville, Switzerland, 2008.

Thanks for reading my blog! Have a great and blessed day!

Gig'em!!!

Monday, April 8, 2013

A Domain-Independent System for Sketch Recognition


Bo Yu, Shijie Cai
State Key Laboratory for Novel Software Technology, Nanjing University, China
mikeyucn@hotmail.com

Howdy!
             In this blog post, I would like to briefly summarize the above paper and provide my view on it.

             Freehand sketching is one of the most effective ways to communicate. In this paper, the authors develop a system that can recognize sketches independent of the domain, freeing the user from non-intuitive mechanisms like menus and toolbars. It also allows the user to draw shapes using multiple strokes.

             The system creates a direction graph and a curvature graph for the input stroke and uses various methods to identify the gesture and reconstruct it.
Figure 1

They use various methods to recognize a line, a circle, an arc, an ellipse and other shapes. One of the tools they use is the feature area. The system calculates the feature area relative to a line, to a point and to an arc as follows:
Figure 2
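As an illustration, one common reading of the feature area to a point is the summed area of the triangles that each stroke segment forms with a reference point (the function name and data model below are my own, not the paper's):

```python
def feature_area_to_point(points, ref):
    """Sum the areas of triangles (p_i, p_{i+1}, ref) via cross products.

    `points` is the stroke as a list of (x, y) tuples and `ref` is the
    reference point, e.g. the centre of a fitted circle.
    """
    area = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        # twice the signed triangle area is the 2-D cross product
        area += abs((x1 - ref[0]) * (y2 - ref[1])
                    - (x2 - ref[0]) * (y1 - ref[1])) / 2.0
    return area
```

For a stroke that traces a circle, the feature area to the fitted centre approaches pi * r^2, which is one way such a test can confirm a circle.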

The vertices are detected by looking at the sharpest corners of the direction graph, and the authors use various methods to eliminate false positives. Once the entire diagram has been identified, the system performs post-processing, which consists of Simple Relation Retrieval, Cleanup and Basic Object Recognition (identifying objects that are independent of the domain). The Cleanup phase, which removes useless elements at the beginning and the end of the stroke, has 3 rules:
1.) If a line segment (arc) with at least one extreme point open, i.e. not connected to other elements, is very short compared with the mean length of all the elements in its original stroke, then remove this line segment (arc).
2.) Given a line segment (arc) with each extreme point connected to some other element, if it is very short compared with its neighbor elements' lengths and those two neighbors are neither symmetrical nor parallel, then shorten this line segment (arc) to its midpoint.
3.) If two line segments (arcs) with similar angles (centers and radii) are connected or overlapped, then merge them into a new one.
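Rule 1 might be sketched like this (the 0.2 "very short" ratio and the data model are my guesses; the paper only says "very short"):

```python
def cleanup_rule_1(lengths, both_ends_connected, ratio=0.2):
    """Drop open-ended elements much shorter than their stroke's mean length.

    `lengths` maps an element id to its length; `both_ends_connected` is the
    set of ids whose endpoints are both attached to other elements.
    """
    mean_len = sum(lengths.values()) / len(lengths)
    return {eid: ln for eid, ln in lengths.items()
            if eid in both_ends_connected or ln >= ratio * mean_len}
```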

The user can select objects by doing a circle gesture around them. There are other commands that the user can provide using gestures:
Figure 3

The evaluation of the system suggests that the users liked it. I think it's a great idea, as it provides good recognition independent of the domain and can be tweaked to suit certain requirements and made domain-specific.

I used the following sources for this blog post:
[1] Bo Yu and Shijie Cai. A Domain-Independent System for Sketch Recognition. In Proceedings of the 1st International Conference on Computer Graphics and Interactive Techniques in Australasia and South East Asia (GRAPHITE '03), pages 141-146. ACM, New York, NY, USA, 2003.

Thanks for reading my blog! Have a great and blessed day!

Gig'em!!!

Sketch Recognition Algorithms for Comparing Complex and Unpredictable Shapes




Howdy!
             I would like to talk a little bit about the above paper and state my opinion on it.

             The paper talks about a sketch-recognition system, Mechanix, which can be used to teach statics problems. The system recognizes the drawn sketch as a truss or a free body diagram and then reasons about the mechanics behind it. It allows students to draw diagrams as if they were drawing with pen and paper and then gives them feedback by comparing the student's drawing with the correct one. This helps professors and TAs teaching classes with a lot of students.

The system uses 3 metrics to compare the user's response with the instructor's stored correct answer: the Hausdorff distance, the modified Hausdorff distance and the Tanimoto coefficient. The directed Hausdorff distance from A to B is

Da = max over a in Pa of ( min over b in Pb of d(a, b) )

where Pa is the set of points of A and Pb is the set of points of B. Db is calculated the same way in the other direction, and the Hausdorff distance is

H(A, B) = max(Da, Db)

The modified Hausdorff distance replaces the maximum over points with the mean of the nearest-neighbour distances, which makes it less sensitive to a single stray point.

From the Hausdorff distance, the probability of the two shapes being similar is calculated as P = 1 - H(A, B)/20. The constant 20 is chosen as half of the side-length of the bounding box. If P comes out less than zero, the shapes are really dissimilar, and hence it is treated as a zero probability.

The Tanimoto coefficient is calculated as

T(A, B) = nAB / (nA + nB - nAB)

where nAB is the number of points common to A and B, and nA and nB are the number of points in A and B respectively.
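The three metrics are straightforward to sketch in code (a direct transcription of the definitions above, not the authors' implementation):

```python
import math

def directed_hausdorff(A, B):
    """Da: the farthest any point of A is from its nearest point of B."""
    return max(min(math.dist(a, b) for b in B) for a in A)

def hausdorff(A, B):
    """Symmetric Hausdorff distance: the larger of the two directions."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))

def modified_hausdorff(A, B):
    """Mean nearest-neighbour distance instead of the max, which is less
    sensitive to a single stray point."""
    mean_nn = lambda S, T: (sum(min(math.dist(s, t) for t in T) for s in S)
                            / len(S))
    return max(mean_nn(A, B), mean_nn(B, A))

def tanimoto(n_a, n_b, n_ab):
    """Tanimoto coefficient from the point counts described in the text."""
    return n_ab / (n_a + n_b - n_ab)
```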

The paper also discusses the methods used for truss identification and free body diagram identification. Identifying these two shapes is very important, as they form the main components of any such diagram.

Their experiments and results suggest that students like the Mechanix system and they find it helpful. I personally think that this is a very cool design as the instructors can leverage it to compensate for the lack of individual attention to students in huge classes. 

I used the following sources for this blog post:
[1] Sketch Recognition Algorithms for Comparing Complex and Unpredictable Shapes. In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence (IJCAI '11), Volume Three, pages 2436-2441. AAAI Press, 2011.

Thanks for reading my blog! Have a great and blessed day!

Gig'em!!!


Wednesday, April 3, 2013

The Word-Gesture Keyboard: Reimagining Keyboard Interaction



Shumin Zhai and Per Ola Kristensson

Howdy!
             I am going to present a short summary of the above-mentioned article and give my opinion on it.

             In the article, the authors talk about a new text-input system that they developed, called ShapeWriter. It is a gesture-based text-input system: instead of entering text in the conventional keyboard style, the authors present a novel method that relies on word gestures.

Figure 1

             To type in a word, the user makes a single-stroke gesture that passes over all the letters in the word, as demonstrated in the picture above. The stroke's features are compared with each word's gesture features, and the word with the highest match is returned. The system also returns the k next-best matches in case the recognition is not correct. Bayes' rule is used to calculate the probabilities:

P(W|G) = P(G|W) * P(W) / P(G)

where P(G|W) is the probability of gesture G given word W, P(W) is the probability of the word W and P(G) is the probability of the gesture G.

Using Fitts' law, the index of difficulty (ID) in bits and the time to move between two consecutive letters are calculated with the following equations:

ID = log2(D/S + 1)
t = a + b * log2(D/S + 1)

where t is the time to go from letter k to letter k+1, D is the distance between letters k and k+1, S is the size of the (k+1)th key, and a and b are empirically determined constants.

The total time to write a word is

T = sum for k = 1 to N-1 of t(k)

where N is the number of letters in the word.
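A toy calculation under these equations (the key positions, key size and Fitts constants below are all made up for illustration; a real model would use the full keyboard geometry):

```python
import math

KEY_POS = {'c': (0.0, 0.0), 'a': (2.0, 0.0), 't': (2.0, 2.0)}  # hypothetical
KEY_SIZE = 1.0                   # key side length S
A_CONST, B_CONST = 0.083, 0.127  # illustrative Fitts constants a and b

def word_time(word):
    """Sum the Fitts-law movement times between consecutive letters."""
    total = 0.0
    for k in range(len(word) - 1):
        d = math.dist(KEY_POS[word[k]], KEY_POS[word[k + 1]])
        total += A_CONST + B_CONST * math.log2(d / KEY_SIZE + 1)
    return total
```

Longer words with more distant keys accumulate more time, which is how the model predicts expert gesture speed.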

It is also possible to input commands by pressing the command key followed by the command shortcut as follows:
Figure 2


The authors have tested the program with many users and the general consensus about the product has been positive. 

I think that the idea of specifying words by gestures is truly amazing and can be very helpful in increasing text-input speed and accuracy, as there is error correction too. There is a lot of room for making advances in this area.

I used the following sources for this blog post:
[1] Shumin Zhai and Per Ola Kristensson. The Word-Gesture Keyboard: Reimagining Keyboard Interaction. Communications of the ACM, Volume 55 Issue 9, September 2012, pages 91-101. ACM, New York, NY, USA.

Thanks for reading my blog! Have a great and blessed day!

Gig'em!!!

Monday, March 25, 2013

Tapping into the “folk knowledge” needed to advance machine learning applications



Pedro Domingos

Howdy!
             I am going to briefly summarize and express my opinion on the above-mentioned article.

             In this article, the author talks about Machine Learning and gives a simplified overview of the field. He introduces the topic of machine learning, talks about different ways it can be achieved and discusses various related issues.

              Machine learning algorithms automatically extract knowledge from input data. A learner infers rules that distinguish between the different classes of data and acts as a classifier when new data is presented to it. Learning is a combination of representation, evaluation and optimization. The classifier has to be represented in a formal language that the computer can handle; this decides what kind of hypotheses the learner can learn, and the set of hypotheses the learner can learn is called the hypothesis space. An evaluation function (or objective function) is used to distinguish good classifiers from bad ones. Finally, an optimization process searches the hypothesis space for the classifier that scores best.

             Generalization is the main goal of machine learning algorithms. The "goodness" of a machine learning algorithm depends on how well the learned classifier can classify new input instances. Using too many features can introduce noise, and when the training examples do not cover all the variety present in the full data, the learner may fit quirks of the sample and produce a bad classifier; this problem is known as overfitting. Bias is a learner's tendency to learn the same wrong thing consistently. Variance is the tendency of the learner to learn random things irrespective of the real signal.


Figure 1


Our intuition, which works pretty well in a 3-D world, may not work at all in high dimensions. Data fed to a machine learner often has a lot of dimensions, which makes it hard to design algorithms for learners. There are some theoretical guarantees about machine learning algorithms, but they are probabilistic guarantees. A lot of feature engineering goes into machine learning projects, and it cannot easily be automated, so the actual learning may take a very short time compared to the time that goes into the preprocessing stages. Recently, the focus has shifted from using just one machine learning algorithm to combining many: different algorithms are weighted differently, and their results are combined based on those weights to get the final output. This results in a very good classifier.
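The combination step can be as simple as a weighted vote (a generic sketch of the ensemble idea, not a method from the article):

```python
from collections import defaultdict

def weighted_vote(labels, weights):
    """Return the label with the highest total weight across classifiers.

    `labels` holds each classifier's predicted label and `weights` holds
    the corresponding classifier weights.
    """
    scores = defaultdict(float)
    for label, w in zip(labels, weights):
        scores[label] += w
    return max(scores, key=scores.get)
```

A single strongly weighted classifier can outvote two weak ones, which is the point of weighting in the first place.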

The knowledge presented in the paper about machine learning is very useful. It explains the basics of machine learning in a very lucid manner. Machine learning seems to be the next big thing in the world of computing. 

I used the following sources for this blog post:
[1] Pedro Domingos. Tapping into the "folk knowledge" needed to advance machine learning applications. Communications of the ACM, Volume 55 Issue 10, October 2012, pages 78-87. ACM, New York, NY, USA.

Thanks for reading my blog! Have a great and blessed day!

Gig'em!!!



Tuesday, March 19, 2013

Sketch Based Interfaces: Early Processing for Sketch Understanding




Howdy!
            In this blog post, I am going to briefly summarize the above-mentioned research paper and express my opinion on it.

           In the paper, the authors propose a hybrid way to combine knowledge from different sources to recognize a sketch, giving the user as much flexibility as drawing with a pencil and paper. The process consists of 3 steps: approximation, beautification, and basic recognition.

         Approximation consists of detecting the corners (vertices) and curved segments of the gesture. For the input gesture, a direction graph, a curvature graph and a speed graph are constructed. From the curvature graph, all maxima above the mean value are collected into a set Fd. From the speed graph, all minima below 90% of the mean are collected into a set Fs.
Curvature graph

Speed graph


Now, an iterative algorithm is used to select points from both the sets that have a high confidence value of being a vertex (corner). Once that is done, curve segments are detected using another algorithm. 
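A much-simplified sketch of the candidate-selection idea (the paper's actual algorithm iteratively grows the vertex set by fit error; here I only intersect the two candidate sets, which are the highest-confidence vertices):

```python
def hybrid_vertex_candidates(curvature, speed):
    """Indices that are curvature maxima above the mean AND speed minima
    below 90% of the mean speed."""
    mean_c = sum(curvature) / len(curvature)
    mean_s = sum(speed) / len(speed)
    # Fd: local curvature maxima above the mean curvature
    Fd = {i for i in range(1, len(curvature) - 1)
          if curvature[i] > mean_c
          and curvature[i - 1] <= curvature[i] >= curvature[i + 1]}
    # Fs: local speed minima below 90% of the mean speed
    Fs = {i for i in range(1, len(speed) - 1)
          if speed[i] < 0.9 * mean_s
          and speed[i - 1] >= speed[i] <= speed[i + 1]}
    return sorted(Fd & Fs)
```

A point where the pen both turns sharply and slows down is a very likely corner, which is the intuition behind combining the two graphs.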

Beautification is the rendering of the drawing as the user intended it, but in a more presentable way. Here is an example from the paper:
Beautification example


Recognition of basic shapes is done using some simple predefined rules and procedures.

The system performs very well, and most of the users in the user study said that they would prefer this system to the other ones available at the time.

I find the hybrid approach very interesting. It combines the data available during the gesture making stage with the data available at the end. This can be very useful in drawing sketches in the mechanical and civil engineering domain.

I used the following sources for this blog post:

Thanks for reading my blog! Have a great and blessed day!

Gig'em!!!


Protractor: A Fast and Accurate Gesture Recognizer


Yang Li 
Google Research 
1600 Amphitheatre Parkway 
Mountain View, CA 94043 
yangli@acm.org

Howdy! 
            This blog post is a short summary of the above-mentioned paper. It also contains my views on the Protractor system proposed in this paper.

            Protractor is a gesture recognizer that can be easily implemented and customized for different users. It is very fast and the customization doesn't take a very long time either. Protractor uses the idea of nearest neighbors based on the cosine similarity between two gestures.


           The users can create their own gestures and provide samples for them. When the user makes a gesture, it is compared with the stored templates, and for each stored template a cosine similarity score is calculated:

similarity(vt, vg) = (vt . vg) / (|vt| |vg|)

where vt and vg are the preprocessed template and gesture vectors.
Due to the orientation adjustment done in the pre-processing step, some noise is introduced. To overcome this, the input gesture is rotated by an optimal angle

theta = arctan(b/a)

where b sums the cross terms of the two vectors, b = sum of (x_ti * y_gi - y_ti * x_gi), and 'a' is the dot product of 'vt' and 'vg'.
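Both quantities have a neat closed form; here is a hedged sketch (vectors are flat [x1, y1, x2, y2, ...] lists of the preprocessed points, but the function names are mine):

```python
import math

def _a_b(vt, vg):
    """The dot product a and the cross-term sum b of the two vectors."""
    a = sum(t * g for t, g in zip(vt, vg))
    b = sum(vt[i] * vg[i + 1] - vt[i + 1] * vg[i]
            for i in range(0, len(vt), 2))
    return a, b

def optimal_angle(vt, vg):
    """Rotation of the gesture that maximises its cosine similarity with
    the template (the sign convention depends on rotation direction)."""
    a, b = _a_b(vt, vg)
    return math.atan2(b, a)

def similarity(vt, vg):
    """Cosine similarity after the optimal rotation: rotating by
    atan2(b, a) turns the numerator into sqrt(a^2 + b^2)."""
    a, b = _a_b(vt, vg)
    nt = math.sqrt(sum(x * x for x in vt))
    ng = math.sqrt(sum(x * x for x in vg))
    return math.sqrt(a * a + b * b) / (nt * ng)
```

A gesture that is an exact rotated copy of a template scores a perfect 1.0, which is why the closed-form rotation matters.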

The Protractor system performs almost as well as the $1 algorithm, but as the number of distinct gestures increases, Protractor's performance is less affected than that of the $1 algorithm.

In my opinion, Protractor serves as a very good tool to recognize gestures for systems intended for personal use, in the sense that every user has their own system. It is very easy to add custom gestures, which users would prefer over having a fixed set of gestures.

I used the following sources for this blog post:
[1] Yang Li. Protractor: A Fast and Accurate Gesture Recognizer. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '10), pages 2169-2172. ACM, New York, NY, USA, 2010.

Thanks for reading my blog! Have a great and blessed day!

Gig'em!!!