Font recognition from vector paths

Bug #171453 reported by anatoly techtonik
Affects    Status     Importance  Assigned to  Milestone
Inkscape   Confirmed  Wishlist    Unassigned

Bug Description

Problem: editing text that was previously converted into
paths. To make this possible, the reverse operation is
needed: select an area and attempt to recognize the font
and letters in the selection.

Horkana-users (horkana-users) wrote :

This idea has been mentioned in passing on the mailing list;
thanks for filing a request.

This is not a feature that SVG includes, but there might be
some accessibility feature (an SVG equivalent of alternative
text on an image?) that would make this easier. If you are
familiar with the SVG specification and have suggestions on
how best to achieve this feature, please do add them.

Failing that, it does seem like information that could be
stored in the special extra hidden information contained in
Inkscape SVG files.

Patches welcome of course.

Bug Importer (bug-importer) wrote :

Just to be clear, I am only suggesting that it would be
great if Inkscape could:
- convert text to path
- then convert that same path back to text

Converting generic paths created by programs other than
Inkscape to text would be practically impossible, and is a
task better accomplished by OCR (optical character
recognition) software, the kind people use with scanners.

anatoly techtonik (techtonik) wrote :

There should be a separate feature request for preserving
the SVG text when converting to a path. I am more interested
in the scenario where the path for a specific text was
imported from another application.

I do not know much about OCR, but it seems that the problem
I've described is a little bit different and much easier. I
can name it VCR (vector character recognition).

The logic is simple. Restoring vector text can be rather
difficult if the text path was transformed further after
being converted from font glyphs. The simplest case is when
a plain font without any effects or transformations was used
for the text, which was then converted to a path (probably
by some external application, so no initial text information
is available). The path contains the same vector information
as the font itself, but the font also has a name, a file,
and plenty of other data. The first step is to guess the
font.

The brute-force method is to build a database containing the
vector information for all symbols of all fonts and to
compare this information with the given path. A more
practical approach would be to give a hint about which
symbols the path actually renders; the database then needs
to be built only for that subset of symbols, and matching
would be faster. Because font-to-path conversion may not
always produce identical output, a common path normalization
(optimization) algorithm is important for achieving an exact
match. Additional statistical data about font symbols can be
gathered to provide means of matching other than direct
comparison. Statistical data may also help to guess the font
from paths of symbols that have additional effects applied.
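
The matching idea above can be sketched in a few lines of Python. This is only an illustration under strong assumptions, not an existing Inkscape feature: the glyph database GLYPHS and the font names in it are made up, outlines are plain polygons, and two outlines are comparable only if they have the same number of points in the same order. A real tool would have to flatten curves and resample the outlines first.

```python
from math import hypot

# Hypothetical glyph database: (font name, character) -> outline points.
# The fonts and shapes are invented for illustration only.
GLYPHS = {
    ("DemoSans", "I"): [(0, 0), (1, 0), (1, 10), (0, 10)],
    ("DemoSans", "L"): [(0, 0), (6, 0), (6, 1), (1, 1), (1, 10), (0, 10)],
    ("DemoSerif", "I"): [(0, 0), (2, 0), (2, 10), (0, 10)],
}


def normalize(points):
    """Move the outline to the origin and scale its larger side to 1."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    min_x, min_y = min(xs), min(ys)
    size = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    return [((x - min_x) / size, (y - min_y) / size) for x, y in points]


def distance(a, b):
    """Mean point-to-point distance; infinite if point counts differ."""
    if len(a) != len(b):
        return float("inf")
    total = sum(hypot(ax - bx, ay - by) for (ax, ay), (bx, by) in zip(a, b))
    return total / len(a)


def match_glyph(path_points, database, hint=None):
    """Return (distance, font, char) of the closest glyph, or None.

    `hint` optionally restricts the search to the given characters,
    which is the "retype the text" shortcut described above.
    """
    target = normalize(path_points)
    best = None
    best_d = float("inf")
    for (font, char), outline in database.items():
        if hint is not None and char not in hint:
            continue
        d = distance(target, normalize(outline))
        if d < best_d:
            best_d = d
            best = (d, font, char)
    return best


# An "L" from DemoSans, scaled by 2 and moved to (100, 200).
path = [(100, 200), (112, 200), (112, 202), (102, 202), (102, 220), (100, 220)]
result = match_glyph(path, GLYPHS)
```

Normalizing by bounding box handles only translation and uniform scaling; rotation, skew, or different starting points would need additional normalization steps or the statistical descriptors mentioned above.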

If OCR tries to guess "what is written", VCR rather answers
the question "how is it written". So it is OK to retype the
text for VCR: give it a path and ask about font properties.

The second task is to restore the text. Once the font is
matched successfully, everything else is trivial.

I've posted this previously at potrace tracker.
https://sourceforge.net/tracker/?func=detail&atid=583851&aid=1536439&group_id=87635

Horkana-users (horkana-users) wrote :

Retaining the textual information as long as possible after
Inkscape has converted text to path should be possible, even
though it is not part of SVG.

Converting generalised vector paths into characters would
not be easy and would fall outside the scope of Inkscape.
Inkscape's primary goals include implementing the SVG
standard, which is rather large and includes complex
features such as animation. Full printing support also
requires that Inkscape implement a huge amount of PDF
support.

Certainly, if functionality such as you suggest were
available it would be incorporated into Inkscape, as potrace
was for tracing bitmaps, but it is extremely unlikely that
the current developers would work on such a feature, since
it is not their area and is beyond the scope of Inkscape.

You might have more luck if you could find developers of
handwriting recognition software, font creation software,
or OCR software who were interested in extending their
functionality in the direction you suggest. It is a very
interesting suggestion. Thanks for your feedback.

(I expect the only significant difference from generic
Optical Character Recognition (OCR) software would be
separating out the relevant vectors into pieces and feeding
them in smaller chunks for greater accuracy, a little bit
like how Inkscape uses the SIOX foreground extraction tool
to improve the results from potrace. Both SIOX and potrace
were applications developed externally and later integrated
into Inkscape.)
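
As a rough illustration of "separating the vectors into pieces", the sketch below splits the `d` attribute of an SVG path into subpaths at each moveto command. The function name split_subpaths is hypothetical. It assumes absolute `M` commands; a subpath starting with a relative `m` depends on where the previous subpath ended and cannot be used on its own without further processing. Note also that a single letter may consist of several subpaths (the outer contour and the hole of an "O", for example), so the pieces would still need grouping before recognition.

```python
import re


def split_subpaths(d):
    """Split SVG path data into chunks, one per moveto command.

    Uses a zero-width lookahead so the M/m command stays attached
    to the chunk it starts (requires Python 3.7 or later).
    """
    parts = re.split(r"(?=[Mm])", d.strip())
    return [p.strip() for p in parts if p.strip()]


pieces = split_subpaths("M0 0 L1 0 L1 10 L0 10 Z M5 0 L11 0 L11 1 L5 1 Z")
```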

vonHalenbach (lustik)
Changed in inkscape:
importance: Low → Wishlist
status: New → Confirmed

theAdib (theadib) wrote :

Interesting request.
I remember that in my company we stored some relevant data in an XML comment attached to the XML element. Thus:
- some additional data is stored with the XML element,
- the XML DTD is not broken.

Inkscape also uses XML namespaces, so we could store the text in a new element attached to the path.

We still need the UI to do the "undo text to path".

We should either write a wiki page or create a blueprint to summarize.
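
A minimal sketch of the namespace idea, using Python's standard xml.etree.ElementTree. For brevity it stores the original text in namespaced attributes on the path rather than in a new child element. The namespace URI, the `textbackup` prefix, and the attribute names are all made up for this example; they are not something Inkscape defines.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Hypothetical namespace for the backup data; not an Inkscape namespace.
DEMO_NS = "http://example.org/namespaces/text-backup"

ET.register_namespace("", SVG_NS)
ET.register_namespace("textbackup", DEMO_NS)


def attach_text(path_elem, text, font_family):
    """Remember the original text and font on the converted path."""
    path_elem.set(f"{{{DEMO_NS}}}text", text)
    path_elem.set(f"{{{DEMO_NS}}}font-family", font_family)


def read_text(path_elem):
    """Return (text, font_family), or None if nothing was stored."""
    text = path_elem.get(f"{{{DEMO_NS}}}text")
    if text is None:
        return None
    return text, path_elem.get(f"{{{DEMO_NS}}}font-family")


# Build a tiny document, store the text, and read it back after a
# serialize/parse round trip.
svg = ET.Element(f"{{{SVG_NS}}}svg")
path = ET.SubElement(
    svg, f"{{{SVG_NS}}}path", {"d": "M0 0 L6 0 L6 1 L1 1 L1 10 L0 10 Z"}
)
attach_text(path, "L", "DemoSans")
serialized = ET.tostring(svg, encoding="unicode")
restored = ET.fromstring(serialized).find(f"{{{SVG_NS}}}path")
```

Because the extra attributes live in their own namespace, SVG viewers that do not know about them should simply ignore them, while an "undo text to path" command could call something like read_text to restore the text object.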

anatoly techtonik (techtonik) wrote :

There is enough text for two blueprints here. The first is for storing text data together with path data, and the second is about guessing the font and letters associated with a path. I've just stumbled upon a service that does the latter: http://www.myfonts.com/WhatTheFont/
