Statistics function incorrectly computing median
Ben Abbott
bpabbott at mac.com
Tue Jan 22 06:39:40 CST 2008
On Jan 22, 2008, at 5:50 AM, Miguel Garcia-Blanco wrote:
>> ... So what I plan to do is to leave discrete_???.m as it is, and have
>> empirical_???.m work as R's method #7. I'll likely have the latter
>> optionally support other algorithms as well (Matlab's for sure).
>>
>> From there other Matlab style compatible functions can be added
>> (quantile, prctile, etc).
>>
>> I'd change discrete_inv.m to work as R's method #2, but I'm not sure it
>> will give the proper result to hygeinv.m ... thoughts?
>
> I don't think there is any need to change empirical_inv. As far as I can
> tell, it's working just as it should (*). Likewise, I think discrete_inv can
> probably stay unchanged.
>
> I think the best thing to do would be to write quantile/prctile functions
> implementing method 7 (R's default), and have statistics.m call quantile
> instead of empirical_inv. (I think it would be nice to implement the other
> methods too, so that the user has a choice, but there's no need to do them
> all at once.)
>
> The point I made about the importance of the discrete/continuous nature of
> the parent distribution is nonsense, so ignore it. (I misunderstood the help
> file for R's quantile function. The methods themselves are
> discrete/continuous, not the parent distributions. Sorry about the
> confusion.)
>
> (*) I have noticed that empirical_inv returns -Inf when X = 0, whereas R
> returns min(DATA). I can see why R's behaviour could be useful, but
> returning -Inf is not inconsistent with the definition of the EDF, so I
> don't think this is necessarily a major problem.
>
> -Miguel
Given the need for consistency across the empirical_inv/pdf/cdf/rnd group
of functions, I'm leaning your way.
That said, I'm still unhappy that discrete_inv does not return a median
consistent with the commonly accepted definition, so I'll try to resolve
that as well.
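To make the median concern concrete, here is a minimal sketch in Python (used only as a neutral illustration, not the Octave code under discussion). It compares a plain inverse-EDF rule (R's method #1) with the averaging-at-discontinuities rule (R's method #2). The function name and the `average_ties` flag are hypothetical, and whether the plain rule matches discrete_inv exactly is an assumption.

```python
import math

def quantile_inverse_edf(data, p, average_ties=False):
    """Inverse-EDF quantile of `data` at probability p (0 <= p <= 1).

    average_ties=False: step function, as in R's method #1.
    average_ties=True:  average the two neighbours where n*p is an
                        integer, as in R's method #2.
    Hypothetical helper for illustration; assumes non-empty data.
    """
    x = sorted(data)
    n = len(x)
    h = n * p
    j = math.floor(h)
    g = h - j                           # fractional part of n*p
    lo = x[min(max(j, 1), n) - 1]       # x[j] (1-based), clamped to 1..n
    hi = x[min(j + 1, n) - 1]           # x[j+1] (1-based), clamped to n
    if g > 0:
        return hi                       # strictly between two steps
    # Exactly on a step: the two methods differ only here.
    # Note: the exact test g == 0 ignores floating-point rounding in n*p.
    return (lo + hi) / 2 if average_ties else lo

sample = [1, 2, 3, 4]
print(quantile_inverse_edf(sample, 0.5))                     # step rule
print(quantile_inverse_edf(sample, 0.5, average_ties=True))  # averaged rule
```

For an even-sized sample such as [1, 2, 3, 4], the step rule returns the lower middle value, while the averaged rule returns the mean of the two middle values, which is the commonly accepted median. For odd-sized samples the two rules agree.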
How does the following look as a short plan?
(1) Add prctile.m
(2) Add quantile.m
(3) Change statistics.m to call quantile.m
(4) Modify discrete_inv.m to use R's method #2
(5) Verify consistency of discrete_pdf/cdf/rnd.
I have some code for (1) that works for all 9 of R's methods but
still needs to be cleaned up. I'll try to post it later today.
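For reference, R's method #7 (linear interpolation between order statistics, R's default) can be sketched as below. This is a Python illustration under stated assumptions, not the code mentioned above: the function names are hypothetical, the input is assumed non-empty with 0 <= p <= 1, and the percentile wrapper simply rescales its argument.

```python
import math

def quantile_type7(data, p):
    """Quantile by linear interpolation, as in R's method #7.

    With 0-based indexing the target position is h = (n - 1) * p;
    the result interpolates between x[floor(h)] and its right neighbour.
    Hypothetical helper for illustration only.
    """
    x = sorted(data)
    n = len(x)
    h = (n - 1) * p
    lo = math.floor(h)
    hi = min(lo + 1, n - 1)             # clamp so that p = 1 stays in range
    return x[lo] + (h - lo) * (x[hi] - x[lo])

def prctile_type7(data, pct):
    """Percentile wrapper (0..100) around quantile_type7; hypothetical name."""
    return quantile_type7(data, pct / 100.0)

sample = [1, 2, 3, 4]
print(quantile_type7(sample, 0.5))   # median of an even-sized sample
print(prctile_type7(sample, 25))     # lower quartile
```

With this rule, p = 0 returns min(DATA) and p = 1 returns max(DATA), and the median of an even-sized sample is the mean of the two middle values.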
Ben