Hi,

I designed it as an external library, where raw image data (or, I guess, a library path referencing the training data) can be passed in. The principle behind it is this:

- Normalise the training data
- Generate the eigenvectors and eigenvalues of the covariance matrix of the data
- Discard the eigenvectors whose eigenvalues are too low
- Project the training data onto 'facespace'

(This is essentially PCA.)

To see where an input image lies, it reshapes it, normalises it and multiplies it by the pre-calculated eigenfaces matrix, then computes Euclidean distances to see which training face it is closest to. (This depends on some threshold values which need to be determined by trial and error, which is a bit sucky.)

So there is no per-person fingerprint as such, as everything is combined into a single matrix for all people, but each column of the projected image matrix uniquely identifies a person. All you need is a table that matches a person's name to that column ID. You then need to store the mean vector and the eigenfaces, as well as the projected images, since you don't want to be recomputing them every time. An interesting bit is how to update the matrix; I was thinking of taking a mean image across all images for a person, but I don't know how good that is in practice.

An additional feature in digiKam itself could be to identify a face manually by drawing a box around it, then use that to add to or check against face recognition. It is the nature of all classifiers that they won't pick up certain things (I have just been using Apple's iPhoto, and that's what they do as well).

Face detection is fairly straightforward using Haar features in OpenCV. That classifier has been trained extensively and you can't really do better without reinventing the wheel. So the workflow looks something like this:

digiKam library -> use Haar features to detect faces in an image -> run face recognition on each face identified by the classifier -> output to digiKam for confirmation

Then update the eigenfaces, mean vector and projected matrix to be reused. Hopefully the update makes it better as more data is used.
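For concreteness, the PCA steps above could be sketched in NumPy along these lines. This is just a rough illustration of the idea, not the actual library code: the function names and the `threshold` parameter are mine, I take the SVD of the centred data rather than eigen-decomposing the covariance matrix explicitly (they are equivalent), and I store one projection per row rather than per column.

```python
import numpy as np

def train_eigenfaces(images, k):
    """Build an Eigenfaces model.

    images: (n, h*w) array, one flattened, normalised face per row.
    k: number of eigenfaces to keep (the low-variance ones are discarded).
    """
    mean = images.mean(axis=0)
    centered = images - mean
    # SVD of the centred data gives the eigenvectors of the covariance
    # matrix directly, without forming the (h*w x h*w) matrix itself.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    eigenfaces = vt[:k]                    # (k, h*w)
    projections = centered @ eigenfaces.T  # (n, k): one row per training face
    return mean, eigenfaces, projections

def recognize(face, mean, eigenfaces, projections, threshold):
    """Project a new face into facespace and return the index of the
    nearest training face, or None if even the closest one is further
    away than the (trial-and-error) threshold."""
    w = (face - mean) @ eigenfaces.T
    dists = np.linalg.norm(projections - w, axis=1)
    best = int(np.argmin(dists))
    return best if dists[best] <= threshold else None
```

Updating the model with new confirmed faces then just means recomputing (or incrementally adjusting) `mean`, `eigenfaces` and `projections`, which is why those three are worth persisting.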
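The detection step of that workflow is only a couple of calls in OpenCV's Python bindings. A minimal sketch, assuming the stock frontal-face cascade that ships with OpenCV (the exact file name and tuning parameters here are my guesses, not anything digiKam-specific):

```python
import cv2

# Load one of the pre-trained Haar cascades bundled with OpenCV.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_faces(gray_image):
    """Return the (x, y, w, h) rectangles of faces found in a grayscale
    image; each crop can then be resized and fed to the recogniser."""
    return cascade.detectMultiScale(
        gray_image, scaleFactor=1.1, minNeighbors=5)
```

Each returned rectangle is what you would draw in digiKam for the user to confirm before the face is added to the training set.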
The whole thing works with a single face as input per person, but obviously the more variation you have, the better it should be.

With regards to databases: apart from a table linking each column of the projected matrix to a person, I don't see a need to store the matrices in one. They are constantly reused and changed, and they are going to be OpenCV classes, so serialising them back and forth is unnecessary. I would propose using a flat file in the background; hopefully this doesn't add any more complexity to digiKam itself, and the whole thing stays self-contained.

With regards to face detection, I don't know how familiar you are with the Haar classifier in OpenCV, but it's good. It does detect multiple faces at the same time; it's not perfect, but nothing is. I think it has been trained on all openly accessible image databases. Alternatively you can train your own classifier using the same principle, but there is no easy, straightforward way to update it. I have previously toyed with the idea of having a cascade classifier for each person, but there is pretty much no way of updating one once it is trained, short of retraining the whole thing.

Hopefully this answers at least some of your questions. Sorry for the long post :)

Alex