Account for weighted trees in data ind

Bug #1404162 reported by Jon Hill
This bug affects 1 person
Affects            Status     Importance  Assigned to  Milestone
Supertree Toolkit  Confirmed  Medium      Unassigned

Bug Description

When down-weighting data to deal with identical sources, take that weighting into account when recalculating data ind.

Jon Hill (jon-hill) wrote :

This is actually part of a wider issue. The output from data ind is a bit confusing. For example, say tree_1 == tree_2 == tree_3 (i.e. all three have the same taxa and characters, but were constructed using different algorithms). These are non-independent and identical, but you would get:

tree_2 == tree_1
tree_3 == tree_2

in the present output. It is therefore not clear that tree_1 == tree_2 == tree_3. It also makes it harder to weight the trees automatically (i.e. you need to work out that each tree should be down-weighted by 1/3, not the more obvious 1/2).

I propose altering the output of data_independence to include:

a list of lists of identical trees, e.g.
    [[tree_1, tree_2, tree_3]]
a list of lists of subsets, where item 0 is the larger tree, e.g.
    [[tree_1, tree_2, tree_3]]
    would mean tree_2 and tree_3 are subsets of tree_1
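
As a rough sketch of how the first list could be built from the current pairwise output, the pairs can be merged into groups with a small union-find. The function name group_identical and the (name, name) tuple input are assumptions for illustration only, not the existing data_independence code:

```python
def group_identical(pairs):
    """Merge pairwise 'a == b' results into groups of mutually identical trees.

    `pairs` is assumed to be a list of (name, name) tuples; this input
    format is hypothetical, not the toolkit's actual return value.
    """
    parent = {}

    def find(x):
        # Return the representative of x's group, registering x if new.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shorten the chain as we walk it
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # join the two groups

    groups = {}
    for name in parent:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# The pairwise output from the example above.
pairs = [("tree_2", "tree_1"), ("tree_3", "tree_2")]
groups = group_identical(pairs)
print(groups)
```

With the two pairs from the example, all three trees end up in a single group, which is the information the pairwise output hides.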

The code that generates the new phyml can then also down-weight the identical trees easily.
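
To illustrate the down-weighting, here is a minimal sketch that assumes the grouped list-of-lists format proposed above and gives each tree in a group of n identical trees a weight of 1/n. The function name identical_tree_weights is hypothetical, and how the weights would be written into the phyml is not shown:

```python
from fractions import Fraction


def identical_tree_weights(identical_groups):
    """Map each tree name to 1/n, where n is the size of its identical group.

    `identical_groups` is assumed to be the proposed list of lists,
    e.g. [["tree_1", "tree_2", "tree_3"]].
    """
    weights = {}
    for group in identical_groups:
        for name in group:
            # Each group contributes a total weight of one tree.
            weights[name] = Fraction(1, len(group))
    return weights


weights = identical_tree_weights([["tree_1", "tree_2", "tree_3"]])
for name, weight in sorted(weights.items()):
    print(name, weight)
```

Using Fraction keeps the weights exact, so each identical group sums to exactly one tree's worth of weight; converting to float before writing them out would be a separate decision.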

In future we also want to be able to select and remove trees individually (see https://bugs.launchpad.net/supertree-toolkit/+bug/1404157) but I think this might be easier with the new output format.
