Your assessment is unduly harsh, as you say yourself. Granted, you are right that essentially no TH test to date has met the rigorous margin-of-error standards that, for example, scientific communities require, but that doesn't mean data samples that fall short of that standard are useless. It would be very wasteful and silly to ignore them completely. Remember, you don't always have to choose between "I believe in X" and "I don't believe in X"; it is perfectly intelligent, and probably advisable, to provisionally accept that "X is more probable than not" based on the evidence. If you want to do colibri parses, though, by all means go ahead. A sample of that size could do nothing but help.