Tuesday, December 14, 2010

Reading #30 Tahuti

Summary
Tahuti is a sketch-based framework for creating UML diagrams, another application domain for sketch recognition. It also gives the option to type in text. In the user study, participants were more satisfied using Tahuti than other UML diagram creation software.

Discussion
I believe UML diagrams should be relatively easy to recognize, so I expect recognition to work with high accuracy. Is there any paper left to read?

Reading #29 Scratch Input

Comment:
Johnatan

Summary
This paper proposes a system that recognizes a sketch using only the sound of the sketching device. A stethoscope mic is attached to the surface on which the sketching is done, and the audio during the sketching process is recorded. The audio file is then processed, and recognition decisions are made by looking at the amplitude data. The author reports about 90% accuracy for some simple shapes.
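Out of curiosity, here is a toy Python sketch of how amplitude-based recognition might look. This is just my guess at the flavor of the idea, not the paper's actual pipeline; the burst counting and the burst-to-shape mapping are entirely made up:

```python
# Toy sketch of amplitude-based scratch recognition (my reading of the idea,
# not the paper's method). A stroke made of N straight segments produces
# roughly N bursts of high-amplitude audio separated by near-silence, so
# counting bursts in the envelope hints at which shape was drawn.

def count_bursts(envelope, threshold=0.2):
    """Count contiguous runs of samples above the threshold."""
    bursts, inside = 0, False
    for a in envelope:
        if a >= threshold and not inside:
            bursts += 1
            inside = True
        elif a < threshold:
            inside = False
    return bursts

def classify(envelope):
    # Hypothetical mapping: 1 burst -> line, 3 bursts -> triangle, 4 -> square.
    return {1: "line", 3: "triangle", 4: "square"}.get(count_bursts(envelope), "unknown")

# Fake envelopes (amplitude over time) standing in for stethoscope-mic audio.
line = [0.0, 0.8, 0.9, 0.7, 0.0]
triangle = [0.0, 0.8, 0.0, 0.7, 0.0, 0.9, 0.0]

print(classify(line))      # -> line
print(classify(triangle))  # -> triangle
```

Even this toy version hints at why the real problem is hard: everything hangs on picking a threshold that separates scratching from silence.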

Discussion
Having worked on the same idea, I can say that the problem is not an easy one. There are too many variations in the recordings of the same shape. Only very, very simple shapes can be recognized effectively. And I wonder whether there could be any application domain for this idea.

Reading #28 iCanDraw

Summary
iCanDraw is a master's thesis by Daniel Dixon that assists people in drawing a human face by recognizing how well they are doing. It provides useful interface tools, such as rulers and markers, to help users pre-adjust the proportions of face elements. It provides the user a headshot to draw and can check the user's level of correctness by comparing the sketch to a contour set extracted from the photo.

Discussion
Dan is a cool friend of mine and actually I was a test user for this. It was a fun experience. I remember that I did better with the aid of the system. Can u hear me Dan??

Reading #27 K-Sketch

Summary
K-Sketch proposes an easy-to-use sketch interface for making simple 2D animations; in that sense, K-Sketch is to animation what Teddy is to modeling. Users can draw animation paths as strokes, and sketched figures are animated by translation along those paths. K-Sketch is intended mostly for entertainment among amateur users.
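The path-following part is simple enough to sketch in a few lines. Below is a minimal version of the idea as I understand it (the names and the linear-interpolation scheme are mine, not from the paper): at time t in [0, 1], the figure is translated so its anchor sits a fraction t of the way along the drawn path.

```python
import math

def point_on_path(path, t):
    """Linearly interpolate a position a fraction t along a polyline."""
    lengths = [math.dist(a, b) for a, b in zip(path, path[1:])]
    target = t * sum(lengths)
    for (a, b), seg in zip(zip(path, path[1:]), lengths):
        if target <= seg:
            f = target / seg
            return (a[0] + f * (b[0] - a[0]), a[1] + f * (b[1] - a[1]))
        target -= seg
    return path[-1]

def animate(figure, path, t):
    """Translate the figure so its first point rides along the path."""
    x, y = point_on_path(path, t)
    ax, ay = figure[0]
    return [(px - ax + x, py - ay + y) for px, py in figure]

path = [(0, 0), (10, 0), (10, 10)]   # an L-shaped animation stroke
figure = [(0, 0), (1, 0), (1, 1)]    # a tiny triangle to animate
print(animate(figure, path, 0.5))    # halfway: anchor lands at (10, 0)
```

A real system would also replay the timing of the drawn stroke instead of moving at constant speed, which I believe matters a lot for how natural the animation feels.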

Discussion
I have seen a video of K-Sketch; it looked premature but promising. Also, my advisor knows the person who developed the system and told me that he actually applied to A&M for a faculty position once but did not get selected :/

Reading #26

Summary
This paper proposes a multi-player game to collect sketch data for researchers, the same deal as in reading #24. The game has drawing and describing modes. In drawing mode, a player sketches something based on a description text. In description mode, the player is expected to do the inverse: describe a given sketch in text. The description of one player is sent to another player to draw, their drawing can be sent to yet another player to describe, and the message may change significantly along the way.

Discussion
This is just an extended version of paper #24.

Reading #25 Image Retrieval

Summary
This paper proposes a sketch-based query system to search an image database of millions of images. The backbone of the search system is the image descriptors, essentially data obtained by performing edge detection on the images. The descriptor of the sketch is obtained in the same way an image descriptor is.
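To make the descriptor idea concrete, here is a rough illustration in Python (my simplification, not the paper's actual descriptor): rasterize edges into a coarse occupancy grid, then rank database images by how well their grids overlap the sketch's grid. A real system would use edge orientations and an inverted index to scale to millions of images.

```python
def grid_descriptor(edge_points, size=4, extent=100):
    """Mark which cells of a size x size grid contain edge pixels."""
    cells = set()
    for x, y in edge_points:
        cells.add((min(int(x * size / extent), size - 1),
                   min(int(y * size / extent), size - 1)))
    return cells

def similarity(d1, d2):
    """Jaccard overlap between two cell sets."""
    return len(d1 & d2) / len(d1 | d2) if d1 | d2 else 0.0

sketch = grid_descriptor([(10, 10), (90, 10)])  # roughly a horizontal stroke
database = {
    "horizon": grid_descriptor([(5, 12), (50, 8), (95, 14)]),
    "tower":   grid_descriptor([(50, 5), (50, 50), (50, 95)]),
}
best = max(database, key=lambda k: similarity(sketch, database[k]))
print(best)  # the image with horizontal edges wins
```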

Discussion
This is another application domain for sketching, where words are not enough to describe things. It reminded me of another work, where a sketch-based query system was used to retrieve 3D models from a mesh database.

Reading #24 Games for Sketch Data Collection

Comment:
Drew

Summary
This paper came up with two simple game ideas to make collecting sketch data from users fun. In the first game, Picturephone, one or more players draw a sketch based on a description read by another player. In the second one, Stellasketch, the players individually label a sketch drawn by a user. At the end of the day, plenty of strokes should have been collected.

Discussion
This is a cool idea indeed, if you need some sketch data. However, I am not sure how they can find a context for the data collected.

Reading #23 InkSeine

Summary
InkSeine is a note-taking app for tablet PCs that uses a gesture-based interface for functions such as searching, linking, and gathering. The interface promotes linking and interaction between existing notes.

Discussion
This is actually more of an HCI paper than a sketch recognition one, though sketch recognition could be thought of as a subset of HCI. The system sounds neat, but I would need to see it in action for a good evaluation.

Reading #22 Plushie

Comment:
Sam

Summary
Plushie is a system that allows inexperienced users to design their own plush toys using the Teddy system. The gesture-based interface provides tools tailored for manipulating the mesh for sewing. The system can also run a physics simulation on the mesh to simulate the dynamics of the real toy.

Discussion
I think this is a cool way to commercialize Teddy; Teddy has an almost grandma-friendly interface, and the models designed using the Teddy system look like nothing but plush toys. I'd like to know how well it did in the market.

Reading #20 MathPad2

Comment:
Chris

Summary
The paper presents a prototype application for sketching math equations and diagrams. The application allows handwriting of regular math notation and free-form diagrams. The relations between the two can be established implicitly by the application or by explicit user gestures; this process is called association. The application can also solve some complex equations.

Discussion
I think this is a very important domain in sketch recognition. Writing equations and diagrams for papers using mark-up languages has always been painful for scholars. I would not care much about the equation solver, but if this tool just worked well at converting handwritten equations and diagrams to a mark-up language, it would be a useful one.

Reading #21 Teddy

Summary
Teddy is pretty much the paper that started it all in sketch-based modeling. The idea is that, given a closed free-form curve, you find its medial axis and place evenly distributed circles on the medial axis, perpendicular to the projection plane. These circles are then combined by triangulation, and a mesh is generated. Using the same interface, the user can modify the mesh and create handles or holes.
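Since I implemented parts of this myself, here is a crude numeric sketch of the inflation step. This is only an approximation of the effect, not Teddy's actual chordal-axis construction: sample points inside the closed curve and raise each one by an amount tied to its distance from the silhouette, so points near the spine bulge out the most.

```python
import math

def inflate(boundary, samples):
    """Elevation at each interior sample = sqrt of its distance to the boundary."""
    heights = []
    for p in samples:
        d = min(math.dist(p, b) for b in boundary)
        heights.append(math.sqrt(d))
    return heights

# A circle of radius 10 as the closed stroke; the center should inflate highest.
circle = [(10 * math.cos(a / 16 * 2 * math.pi),
           10 * math.sin(a / 16 * 2 * math.pi)) for a in range(16)]
hs = inflate(circle, [(0, 0), (5, 0), (9, 0)])
print(hs)  # heights fall off toward the rim
```

The sqrt profile is what gives the models their characteristic rounded, balloon-like look.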

Discussion
I actually implemented a fair number of ideas from this paper for my own research. At the time it was published, it was a revolutionary approach to modeling. However, it has not found real-life applications beyond being a digital toy.

Monday, December 13, 2010

Reading #18 Spatial Recognition Text & Graphics

Summary

This paper is another one using relation graphs, like #16, but it uses the graph to build spatial relationships between strokes. After building the graph, they use a classifier to classify the known subgraphs.

Discussion

It was interesting to me, again since it used graphs, but I did not quite get how their classifier worked. If someone can comment on that, I'd be happy.

Reading #17 Distinguishing Text vs Graphics

Comment: sampath

Summary

This is another text-distinguishing paper. It uses a feature set for classification that includes gaps between strokes, their relations to each other, and some characteristic features, about 9 in total.

Discussion
The classifier they use sounds like overkill to me. The entropy paper was a lot simpler and still had similar accuracy.

Reading #16 Graph Based Symbol Recognizer

Comment:
Johnatan.

Summary

As the name suggests, this paper recognizes symbols by building a relational graph and matching it against existing graph templates.

Graph matching is a hard problem in general (subgraph isomorphism is NP-complete). This paper uses 4 methods to find the closest match: stochastic matching, greedy matching, sort matching, and error-driven matching.

The method scored about 90% accuracy when tested on about 20 shapes.
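To give a feel for the greedy variant, here is a toy matcher in Python. The scoring scheme is hypothetical, just to show the flavor: nodes are primitives with a type, edges carry a relation label, and we greedily pair input nodes with template nodes, rewarding type agreement plus agreement of relations to already-matched pairs.

```python
def greedy_match(g, t):
    """g, t: (nodes, edges) with nodes = {id: type}, edges = {(a, b): relation}."""
    nodes_g, edges_g = g
    nodes_t, edges_t = t
    match, score = {}, 0
    for _ in range(min(len(nodes_g), len(nodes_t))):
        best = None
        for a in nodes_g:
            if a in match:
                continue
            for b in nodes_t:
                if b in match.values():
                    continue
                s = 1 if nodes_g[a] == nodes_t[b] else 0  # type agreement
                for a2, b2 in match.items():              # relation agreement
                    r1, r2 = edges_g.get((a, a2)), edges_t.get((b, b2))
                    if r1 is not None and r1 == r2:
                        s += 1
                if best is None or s > best[0]:
                    best = (s, a, b)
        score += best[0]
        match[best[1]] = best[2]
    return match, score

corner = ({1: "line", 2: "line"},
          {(1, 2): "perpendicular", (2, 1): "perpendicular"})
template = ({"e1": "line", "e2": "line"},
            {("e1", "e2"): "perpendicular", ("e2", "e1"): "perpendicular"})
m, s = greedy_match(corner, template)
print(m, s)
```

Greedy matching can commit to a bad early pairing, which is presumably why the paper also tries stochastic and error-driven strategies.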

Discussion
This idea is particularly interesting to me since I have been using graphs for my recognition assignments, and actually our final project was almost entirely based on this idea. Graph isomorphism is a hard problem, but you can reduce the complexity by exploiting the geometry data, as both we and they did.

Saturday, December 11, 2010

Reading #15 Image Based Recognizer

Summary
This paper introduces an image-based approach to sketch recognition, as the title suggests. In this method, a sketch is compared to bitmap templates as a 48x48 bitmap image. Rotation invariance is achieved using polar coordinate analysis. The method scored an overall 90% accuracy.
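Here is a small Python illustration of why the polar transform buys rotation invariance (my own toy, not the paper's exact recognizer): in polar coordinates around the shape's centroid, rotating the sketch only shifts the angle coordinate, so a histogram over radius alone is unchanged by rotation.

```python
import math

def radius_histogram(points, bins=4):
    """Histogram of point distances from the centroid, normalized by the max."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    rs = [math.hypot(x - cx, y - cy) for x, y in points]
    rmax = max(rs) or 1.0
    hist = [0] * bins
    for r in rs:
        hist[min(int(r / rmax * bins), bins - 1)] += 1
    return hist

def rotate(points, angle):
    return [(x * math.cos(angle) - y * math.sin(angle),
             x * math.sin(angle) + y * math.cos(angle)) for x, y in points]

square = [(1, 1), (1, -1), (-1, -1), (-1, 1), (1, 0), (0, 1), (-1, 0), (0, -1)]
h1 = radius_histogram(square)
h2 = radius_histogram(rotate(square, 0.7))
print(h1, h2)  # identical despite the rotation
```

The actual paper compares full 48x48 templates, of course; this only shows the invariance trick in isolation.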

Discussion
This paper essentially discusses a grid-based method. The idea is similar to the "electronic napkin". I am actually surprised to see their 90% accuracy; I'd expect it to be lower than that.

Reading #14 Shape vs Text

Summary
This paper brings up the shape vs. text discussion again, handling it from the entropy side, entropy being a measure of the randomness in a source. Text strokes generally have higher entropy than shape strokes, and the paper is based on this idea. It introduces an entropy model alphabet, which stores the degree of curvature each letter goes through, and measures the density based on the bounding box of the letters. The method had 92% overall accuracy in distinguishing text.
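A back-of-the-envelope version of the idea can be written in a few lines (the quantization scheme below is mine, not the paper's alphabet): quantize the turning angle at each stroke point into a few symbols and compute the Shannon entropy. Wiggly handwriting visits many symbols; a clean shape stroke visits very few.

```python
import math

def turning_entropy(points, bins=8):
    """Shannon entropy of quantized turning angles along a stroke."""
    symbols = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        a = math.atan2(y2 - y1, x2 - x1) - math.atan2(y1 - y0, x1 - x0)
        a = (a + math.pi) % (2 * math.pi) - math.pi   # wrap to [-pi, pi)
        symbols.append(min(int((a + math.pi) / (2 * math.pi) * bins), bins - 1))
    counts = {s: symbols.count(s) for s in set(symbols)}
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

line = [(i, 0) for i in range(10)]                           # straight stroke
squiggle = [(i, (-1) ** i * (1 + i % 3)) for i in range(10)]  # jittery, text-like
print(turning_entropy(line), turning_entropy(squiggle))      # low vs high
```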

Discussion
I believe this paper does a better job handling this problem than Ink Features. I think entropy should be the way to go when it comes to distinguishing text from shapes.

Reading #13 Ink Features

Summary
This paper experimented with a method to distinguish shapes from text using a feature set that includes curvature, speed, intersections, etc., adding up to 46 features. The experiment was tested on 26 participants, and only 8 of these features were found to be helpful in the recognition.

Discussion:
The problem addressed in this paper is an important one but the contribution of the paper seems to be nothing substantial.

Reading #12 Constellation Models

Summary
This paper talks about a pictorial approach for recognizing strokes of certain classes. The approach describes the parts of the big picture in relation to other parts in the scene. The lower-level parts are recognized by a very simple feature set: bounding boxes, slopes, diagonals, etc.

Each model is trained with labeled data, and a probability distribution is computed for each object. The recognition process then turns out to be running a maximum-likelihood (ML) search for the given case. The model was tested on facial recognition.
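A stripped-down flavor of this in Python (toy model with made-up parameters, not the paper's): each face part gets a Gaussian over its position, and recognition is a maximum-likelihood search over assignments of observed stroke centers to parts.

```python
import math
from itertools import permutations

# Hypothetical trained model: part -> mean position, with a shared variance.
model = {"left_eye": (0.3, 0.3), "right_eye": (0.7, 0.3), "mouth": (0.5, 0.8)}
VAR = 0.02

def log_likelihood(assignment):
    """Sum of per-part Gaussian log-likelihoods (constants dropped)."""
    ll = 0.0
    for part, (x, y) in assignment.items():
        mx, my = model[part]
        ll += -((x - mx) ** 2 + (y - my) ** 2) / (2 * VAR)
    return ll

def best_assignment(stroke_centers):
    """Brute-force ML search over all part-to-stroke assignments."""
    parts = list(model)
    return max(
        (dict(zip(parts, perm)) for perm in permutations(stroke_centers)),
        key=log_likelihood,
    )

# Observed stroke centers from a sketched face, in scrambled order.
strokes = [(0.52, 0.78), (0.72, 0.28), (0.31, 0.33)]
print(best_assignment(strokes))
```

Brute force over permutations is obviously only viable for a handful of parts; the real search has to be smarter.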

Discussion
The paper reminded me of LADDER, where constraints were declared between lower-level shapes to recognize higher-level shapes. For sure, this is a common recognition approach, but it can fail drastically if recognition of the lower-level shapes fails.

Reading #11 LADDER

Summary
This paper describes LADDER, a cool language developed for describing shapes for recognition purposes. Using LADDER, one can define shapes by describing geometric constraints between primitives: lines, curves, arcs, rectangles, polygons, etc. These constraints define how the primitives should interact with each other to form a meaningful shape. The constraints are based on human perception rather than precise distance measures: parallel, above, perpendicular, etc.

Using LADDER, it is also possible to construct high-level shapes out of lower-level constructed shapes, so it works like nesting matryoshka dolls in that sense. It is also possible to specify options to override recognition and allow beautification.
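Here is a tiny LADDER-flavored checker in Python (my own encoding, not LADDER's actual syntax): a shape is a list of constraints over line primitives, and each constraint is perceptual, i.e. tested with a loose tolerance rather than an exact measure.

```python
import math

def angle(line):
    (x0, y0), (x1, y1) = line
    return math.atan2(y1 - y0, x1 - x0)

def perpendicular(l1, l2, tol=0.3):
    """Roughly 90 degrees apart, with a generous perceptual tolerance."""
    d = abs(angle(l1) - angle(l2)) % math.pi
    return abs(d - math.pi / 2) < tol

def coincident(p, q, tol=0.15):
    """Endpoints that are 'close enough' count as touching."""
    return math.dist(p, q) < tol

# "Corner" shape description: two lines, perpendicular, meeting at an endpoint.
corner_description = [
    lambda a, b: perpendicular(a, b),
    lambda a, b: coincident(a[1], b[0]),
]

def matches(description, *primitives):
    return all(c(*primitives) for c in description)

h = ((0, 0), (1, 0))             # roughly horizontal stroke
v = ((1.05, 0.02), (1.0, 1.0))   # roughly vertical stroke, touching h's end
print(matches(corner_description, h, v))  # -> True
```

The tolerances are exactly where the "human perception rather than precise measures" philosophy lives: a sloppy hand-drawn corner still matches.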

Discussion
LADDER is a first; I think it is pretty cool in that sense. I really like that it uses relaxed constraints based on human cognition rather than exact scientific measures. One challenge is that it becomes harder to define shapes programmatically as complexity and detail increase. However, it still works fine most of the time.