Statistics function incorrectly computing median

Miguel Garcia-Blanco miguel.01 at ihug.com.au
Tue Jan 22 04:50:49 CST 2008


> ... So what I plan to do is to leave discrete_???.m as it is, and have
> empirical_???.m work as R's method #7. I'll likely have the latter
> optionally support other algorithms as well (Matlab's for sure).
>
> From there other Matlab style compatible functions can be added
> (quantile, prctile, etc).
>
> I'd change discrete_inv.m to work as R's method #2, but I'm not sure it
> will give the proper result to hygeinv.m ... thoughts?

I don't think there is any need to change empirical_inv. As far as I can
tell, it's working just as it should (*). Likewise, I think discrete_inv can
probably stay unchanged.

I think the best thing to do would be to write quantile/prctile functions
implementing method 7 (R's default), and have statistics.m call quantile
instead of empirical_inv. (I think it would be nice to implement the other
methods too, so that the user has a choice, but there's no need to do them
all at once.)
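
For reference, here is a rough sketch of what method 7 computes, written in Python only to keep it short and self-contained; it is not the proposed Octave code. The name quantile_type7 and its argument handling are my own assumptions. The rule itself is the one R documents for type 7: sort the sample, place the probability p at the fractional position (n - 1) * p (counting from zero), and interpolate linearly between the two neighbouring order statistics.

```python
import math


def quantile_type7(data, p):
    """Sample quantile by linear interpolation (R's type 7 rule).

    Hypothetical helper for illustration; not an Octave or R function.
    """
    if len(data) == 0:
        raise ValueError("data must not be empty")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")

    xs = sorted(data)
    n = len(xs)

    # Fractional zero-based position of the quantile in the sorted sample.
    h = (n - 1) * p
    lo = math.floor(h)
    hi = min(lo + 1, n - 1)  # clamp so that p = 1 stays in range

    # Linear interpolation between the two neighbouring order statistics.
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])
```

With the sample 1, 2, 3, 4 and p = 0.5 this gives 2.5, the usual median of an even-sized sample, which is the value statistics.m would be expected to report.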

The point I made about the importance of the discrete/continuous nature of
the parent distribution is nonsense, so ignore it. (I misunderstood the help
file for R's quantile function. The methods themselves are
discrete/continuous, not the parent distributions. Sorry about the
confusion.)

(*) I have noticed that empirical_inv returns -Inf when X = 0, whereas R
returns min(DATA). I can see why R's behaviour could be useful, but
returning -Inf is not inconsistent with the definition of the EDF, so I
don't think this is necessarily a major problem.
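
To make the footnote concrete, here is a small Python sketch of the inverse EDF, taken as the smallest sample value x with EDF(x) >= p. The function name edf_inverse and the r_style flag are hypothetical, and this is not the source of empirical_inv. It assumes a non-empty sample and 0 <= p <= 1, and it ignores floating-point rounding in n * p.

```python
import math


def edf_inverse(data, p, r_style=False):
    """Inverse of the empirical distribution function (illustrative only).

    Assumes a non-empty sample and 0 <= p <= 1.
    """
    xs = sorted(data)
    n = len(xs)

    if p == 0:
        # EDF(x) >= 0 holds for every x, so the infimum is -Inf.
        # With r_style=True, return the sample minimum instead, as R does.
        return xs[0] if r_style else -math.inf

    # Smallest k (1-based) such that k / n >= p.
    k = math.ceil(n * p)
    return xs[k - 1]
```

For the sample 1, 2, 3, 4 this returns 2 at p = 0.5, not the interpolated 2.5, which is why a median computed through the inverse EDF differs from the conventional one. At p = 0 it returns -Inf by default and 1 with r_style=True.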

-Miguel

