Supertree Toolkit

Account for weighted trees in data ind

Bug #1404162 reported by Jon Hill on 2014-12-19

Affects            Status     Importance  Assigned to  Milestone
Supertree Toolkit  Confirmed  Medium      Unassigned   Supertree Toolkit 2.15

Bug Description

When data are down-weighted to deal with identical sources, take those weights into account when recalculating data ind.

Jon Hill (jon-hill) wrote on 2015-05-20:

This is actually part of a wider issue. The output from data ind is a bit confusing. For example, say tree_1 == tree_2 == tree_3 (i.e. all have the same taxa and characters, but were constructed using different algorithms). These are non-independent and identical, but you would get:

tree_2 == tree_1
tree_3 == tree_2

in the present output. It is therefore not clear that tree_1 == tree_2 == tree_3. It also makes it harder to weight the trees automatically (i.e. you need to figure out to downweight by 1/3, not the more obvious 1/2).

Propose altering the output of data_independence to include:
a list of lists of identical trees:
[[tree_1, tree_2, tree_3]]
a list of lists of subsets, where item 0 is the larger tree:
[[tree_1, tree_2, tree_3]]
which would mean tree_2 and tree_3 are subsets of tree_1
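
As a rough illustration of the first part of the proposal, the pairwise results from the present output could be merged into lists of identical trees along these lines. This is only a sketch: group_identical is a hypothetical helper, not an existing STK function, and it assumes the pairwise results are available as (name, name) tuples.

```python
def group_identical(pairs):
    """Merge pairwise 'a == b' results into groups of mutually identical trees.

    `pairs` is assumed to be an iterable of (tree_name, tree_name) tuples;
    this input format is an assumption made for the sketch.
    """
    parent = {}

    def find(name):
        # Union-find lookup with path halving.
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for name in list(parent):
        groups.setdefault(find(name), []).append(name)
    # Sort for a stable, readable result.
    return sorted(sorted(group) for group in groups.values())


identical = group_identical([("tree_2", "tree_1"), ("tree_3", "tree_2")])
```

With the two pairs from the example above, this gives a single group containing tree_1, tree_2 and tree_3, which makes the three-way identity explicit.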

Generating the new phyml can then also down-weight the identical trees easily.
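
A minimal sketch of how the weights might be derived from the proposed list of lists, assuming each tree in a group of n identical trees gets weight 1/n. identical_tree_weights is a hypothetical name, and how the weights would be written into the phyml is not shown here.

```python
from fractions import Fraction


def identical_tree_weights(identical_groups):
    """Return {tree_name: weight}, giving each tree in a group of n a weight of 1/n.

    `identical_groups` is assumed to be the proposed list of lists of
    identical trees, e.g. [["tree_1", "tree_2", "tree_3"]].
    """
    weights = {}
    for group in identical_groups:
        if not group:
            continue  # nothing to weight in an empty group
        share = Fraction(1, len(group))
        for name in group:
            weights[name] = share
    return weights


weights = identical_tree_weights([["tree_1", "tree_2", "tree_3"]])
```

Fractions are used so that 1/3 stays exact; a float would do equally well if that is what the weighting code expects.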

In future we also want to be able to select and remove trees individually (see https://bugs.launchpad.net/supertree-toolkit/+bug/1404157) but I think this might be easier with the new output format.
