Luminoso

Show top concepts for canonical docs

Bug #524508 reported by Rob Speer on 2010-02-19

This bug affects 1 person

Affects     Status      Importance  Assigned to  Milestone
Luminoso    Confirmed   Medium      Unassigned   Luminoso 1.1

Bug Description

When a canonical document is selected, the info pane should show a list of commonly occurring concepts in the documents that are most similar to it.

These should probably be pre-SVD counts, not post-SVD similarity scores. For example, we could count the number of times the words "chinese" and "thai" occur, weighted by which documents they are in, but not weighted by anything involving the "chinese" and "thai" concept vectors themselves. This would reassure users that the data reflects reality, even if the SVD comes out kind of weird.

Probably the best way to report these values would be as percentages -- this is what Dentsu does in their existing reports.
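As a rough illustration of the idea above, here is a minimal sketch in Python. It is not Luminoso code: the function name `top_concepts`, the `doc_terms` and `similarity` inputs, and the choice to report each term as a share of the total weighted count are all assumptions made for the example. Each term occurrence is weighted only by how similar its document is to the selected canonical document, and the concept vectors are never consulted.

```python
from collections import Counter


def top_concepts(doc_terms, similarity, n=5):
    """Return the n most common terms as (term, percentage) pairs.

    doc_terms:  dict mapping document name -> list of terms in that document
                (raw, pre-SVD occurrences).
    similarity: dict mapping document name -> similarity of that document to
                the selected canonical document (hypothetical input).
    """
    weighted = Counter()
    for doc, terms in doc_terms.items():
        # Documents with no score or a negative score contribute nothing.
        weight = max(similarity.get(doc, 0.0), 0.0)
        for term in terms:
            weighted[term] += weight

    total = sum(weighted.values())
    if total == 0:
        return []
    # Assumption: a "percentage" is the term's share of all weighted counts.
    return [(term, 100.0 * count / total)
            for term, count in weighted.most_common(n)]


docs = {
    "a.txt": ["chinese", "thai", "chinese"],
    "b.txt": ["thai", "pizza"],
}
scores = {"a.txt": 1.0, "b.txt": 0.5}
print(top_concepts(docs, scores))
```

Other definitions of the percentage are possible, for example the weighted fraction of similar documents that mention the term at all; the bug report does not say which one Dentsu uses.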

Rob Speer (rspeer) on 2010-02-19

Changed in luminoso:
importance:	Undecided → Medium
milestone:	none → 1.1
status:	New → Confirmed

sgt101 (simonthompson) wrote on 2010-02-20:

Related to this, it would be very useful to put the info on canonical documents into a structured file such as .xls or .csv, separate from the general concepts file that is currently in the results.

I would like to see the following records produced:

filename.txt,polarity_value, concept 1, concept 2, concept 3

The concepts should not be word occurrences but generalised terms from the analogy space...
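A sketch of how such records could be written with Python's standard `csv` module follows. The function name `write_canonical_report`, the shape of `rows`, and the header row are assumptions for illustration only; how the polarity value and the generalised concept labels are computed is left to the caller.

```python
import csv
import io


def write_canonical_report(rows, out, n_concepts=3):
    """Write one CSV record per canonical document.

    rows: iterable of (filename, polarity_value, concepts) tuples, where
          concepts is a list of concept labels (hypothetical input shape).
    out:  any writable text file object.
    """
    writer = csv.writer(out, lineterminator="\n")
    # Assumption: a header row is wanted; the request only shows data records.
    header = ["filename", "polarity_value"]
    header += ["concept %d" % (i + 1) for i in range(n_concepts)]
    writer.writerow(header)
    for filename, polarity, concepts in rows:
        # Pad or truncate so every record has exactly n_concepts columns.
        padded = (list(concepts) + [""] * n_concepts)[:n_concepts]
        writer.writerow([filename, polarity] + padded)


buf = io.StringIO()
write_canonical_report([("review1.txt", 0.42, ["food", "service"])], buf)
print(buf.getvalue())
```

Keeping the column count fixed makes the file easy to open in a spreadsheet, at the cost of dropping any concepts beyond the first `n_concepts`.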
