Polyfit with scaling

Ben Abbott bpabbott at mac.com
Sat Feb 2 16:44:45 CST 2008


On Feb 2, 2008, at 3:57 PM, Dmitri A. Sergatskov wrote:

> On Feb 2, 2008 2:28 PM, Ben Abbott <bpabbott at mac.com> wrote:
>
>> Regarding "scale x unconditionally", do you mean the scaling used
>> by wpolyfit:
>>
>>        (x - mean (x)) / std (x)
>>
>> or to Thomas' suggestion to just scale the magnitude?
>>
>>        x / max (x)
> I mean Thomas' suggestion. That is, to be precise, x / max(abs(x))
>
>> If you refer to Thomas' suggestion, the maximum value will result in
>> as much trouble/benefit as the minimum value.
>
> No. If your data are well centered, then min(abs(x)) ~ eps, so scaling
> to the minimum does not work. If the data have a large offset, then
> min(abs(x)) ~ max(abs(x)) ~ mean(abs(x)), so scaling by any of these
> numbers would be equally helpful. But scaling by max(abs(x)) is
> guaranteed to put all the data in the [-1,1] range in every case, and
> that should help with numerical precision.
>
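
To make the two cases concrete, here is a small sketch. Both data
vectors are made up for illustration and do not come from this thread.

```octave
% Illustrative data only: one well-centered set, one with a large offset.
x_centered = -5:5;              % contains 0, so min(abs(x)) is 0
x_offset   = 1e6 + (-5:5);      % large offset, narrow spread

% Centered: min(abs(x)) is 0, so it cannot serve as a scale factor.
printf ("centered: min %g  mean %g  max %g\n", ...
        min (abs (x_centered)), mean (abs (x_centered)), ...
        max (abs (x_centered)));

% Offset: min, mean and max of abs(x) nearly coincide.
printf ("offset:   min %g  mean %g  max %g\n", ...
        min (abs (x_offset)), mean (abs (x_offset)), ...
        max (abs (x_offset)));

% Dividing by max(abs(x)) keeps every value within [-1,1] in both cases.
xs_centered = x_centered / max (abs (x_centered));
xs_offset   = x_offset   / max (abs (x_offset));
```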

It has been my understanding that what matters is how far,
geometrically, the numbers are from 1.

Meaning that eps and 1/eps are equally bad.

Specifically, I thought it would be best if the geometric mean (or  
perhaps median) were at unity.
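
As a sketch of what that would mean in practice, the geometric mean of
the magnitudes can be computed through logarithms and compared with the
other candidate scale factors. The data and variable names below are
made up for illustration. Note that the geometric mean is 0 if any
element of x is 0, so a real implementation would need to handle that
case.

```octave
% Illustrative data only, spanning a couple of decades.
x = [2e3 5e3 1e4 4e4 8e4];

% Candidate scale factors.
s_max = max (abs (x));                 % x / max(abs(x))
s_geo = exp (mean (log (abs (x))));    % geometric mean of the magnitudes
s_med = median (abs (x));              % median of the magnitudes

% After dividing by s_geo, the geometric mean of abs(x) sits at unity.
xg = x / s_geo;
printf ("scale factors: max %g, geometric mean %g, median %g\n", ...
        s_max, s_geo, s_med);
printf ("geometric mean after scaling: %g\n", ...
        exp (mean (log (abs (xg)))));
```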

>>
>> Perhaps a better solution would be (a) the geometric mean of the
>> magnitudes, (b) the median of the magnitudes, (c) the mean of the
>> magnitudes, (d) consider several normalization options and select the
>> most numerically stable one.
>
> See above. I doubt that the fit will be that sensitive to the scaling
> parameter, i.e. instead of max(abs(x)) you could use
> 0.5*max(abs(x)) and you probably would not see much of a difference.
> Using a more sophisticated approximation means making some assumption
> about the data distribution, and we do not want to do that in a
> generic function.
>
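
One way to probe this sensitivity is to compare the condition number of
the Vandermonde matrix under each choice of scale. This is only a
sketch: the data and the degree are made up, and it assumes the fit is
solved from the Vandermonde matrix of x. Both scaled variants should
come out far better conditioned than the unscaled matrix. The comparison
of interest is between the last two numbers.

```octave
% Illustrative, well-centered data and polynomial degree (assumptions).
x = (-500:50:500)';
n = 3;
m = max (abs (x));

% Condition number of the Vandermonde matrix for three choices of scale.
c_raw  = cond (vander (x, n + 1));              % no scaling
c_max  = cond (vander (x / m, n + 1));          % x / max(abs(x))
c_half = cond (vander (x / (0.5 * m), n + 1));  % x / (0.5*max(abs(x)))

printf ("cond: raw %g, max %g, half-max %g\n", c_raw, c_max, c_half);
```
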
>>
>> In any event, what should be done about s.R and s.X? Are they to
>> represent the scaled independent variable?
>>
>
> I do not know.

This is the point of my greatest concern. Since I don't know how this
output is used, I have no way to determine how it should behave when
the independent variable is shifted and/or scaled.
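
For pure scaling (no shift), one relationship can at least be written
down. Assuming s.X is the Vandermonde matrix of x and s.R is the
triangular factor of its QR decomposition (an assumption here, not
something established in this thread), dividing x by a constant
multiplies each Vandermonde column by a power of that constant. The
scaled R is then the unscaled R times a diagonal matrix, up to the
signs of its rows. A sketch with made-up data, where sc and D are
illustrative names:

```octave
% Illustrative data and degree (assumptions, not from this thread).
x  = (1:10)';
n  = 2;
sc = max (abs (x));                  % scale factor

V  = vander (x, n + 1);              % columns: x.^n, ..., x, 1
Vs = vander (x / sc, n + 1);         % same matrix built from scaled x

% Dividing x by sc scales the column holding x.^k by sc^(-k).
D = diag (sc .^ (-(n:-1:0)));
printf ("max |Vs - V*D| = %g\n", max (abs (Vs - V * D)(:)));

% Triangular factors from the economy-size QR decomposition.
[~, R]  = qr (V, 0);
[~, Rs] = qr (Vs, 0);

% R*D is also upper triangular, so Rs matches it up to row signs.
printf ("max | |Rs| - |R*D| | = %g\n", ...
        max (abs (abs (Rs) - abs (R * D))(:)));
```

Both printed differences should be at rounding-error level. A shift of
x mixes the columns rather than just rescaling them, so it is not
covered by this sketch.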

Ben

