The inverse F distribution is throwing a Domain Error before reaching convergence on a few sample problems I've tried. For example, df1 = 5, df2 = 10, p = 0.75, as well as the complementary problem of df1 = 10, df2 = 5, p = 0.25. I don't think there is anything too taxing about either of these samples, and I am pretty sure that I got correct results the last time I checked them, quite a few revisions ago.

I know there have been challenges with the Newton refiner, and I wonder if things have cropped up again. I haven't run into this with the inverse normal, chi-squared, or t, though admittedly the inverse problem can seem slow at times if the underlying CDF, like the t's, is complicated to calculate each time.

This isn't a quest for obsessive accuracy. Just concern that something that was once okay now seems broken.

Les

*Edited: 17 May 2012, 3:58 p.m. *

Something has regressed badly, both here and with the device being left in double precision mode at times.

I'm sure this one used to work well previously.

- Pauli

The Newton refiner seems to be diverging slowly in this case :-(

- Pauli

This issue still lingers. I do indeed appreciate it is likely of lesser priority these days, so I will try to educate myself more about the workings of the code and see if I can come up with some ideas. If nothing else, at least I will have learned something.

I honestly don't think *lingers* is an appropriate word to use here. This was reported three and a half days ago. If the 34S were produced commercially, would it be reasonable to expect such a rapid turnaround over a weekend? This is a hobby, and it gets time allocated when/if possible, certainly not automatically.

Hmmm, that sounds a bit harsher than intended so please don't be put off -- I've actually had a miserable couple of days, I managed to put my back out working in the garden and I'm consequently feeling somewhat grumpy.

Anyway, I'm aware of the problem and will get to it eventually. If you or anyone else wants to poke around in the Newton solver (trunk/xrom/distributions.wp34s) and figure out what is going awry and how to fix it, it would be very welcome.

I suspect that the problem will be in the jump-range-limiting code or the "can't be negative" code. Of course, I could easily be, and likely am, wrong. The jump range limiting for the F distribution is still a bit coarse -- it was based on the previous bad initial estimate, which was further from the solution than it should have been.

- Pauli

Sorry, I meant it "lingers" for me, kinda tugging at my awareness. I was just trying to convey that, instead of whingeing about something, it may be more informative for me to bring myself up to speed and see what I can learn -- shining some light into the black box, if you will.

I hope your back gets better.

It isn't the Newton solver at all (well, it could be in some cases, but that's not the biggest problem).

The problem is that the F-dist PDF returns bad values, so the denominator of the Newton corrector is off and the thing doesn't converge.

For example, for the problem I presented, the initial guess is about 1.59, which is perfectly respectable. The calc's PDF returns about 6.49, whereas the real value is in the ballpark of 0.25 or so. Even without checking in Mathematica, we know 6.49 can't be right -- the area under the skewed bell of the F-dist curve is unity, and the domain is the entire nonnegative real line. No way under these circumstances will the ordinate of such a graph ever exceed 1.
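For anyone who wants to double-check that figure without Mathematica, the F density can be computed directly from its closed form with nothing but log-gamma. A quick sketch (illustrative Python, not the 34s code; 1.59 is the initial guess quoted above):

```python
from math import exp, lgamma, log

def f_pdf(x, d1, d2):
    """F-distribution density, evaluated via log-gamma for stability."""
    if x <= 0:
        return 0.0
    h1, h2 = d1 / 2.0, d2 / 2.0
    # log of the normalising constant (d1/d2)^(d1/2) / B(d1/2, d2/2)
    log_c = lgamma(h1 + h2) - lgamma(h1) - lgamma(h2) \
            + h1 * (log(d1) - log(d2))
    # log of the x-dependent kernel x^(d1/2-1) * (1 + d1*x/d2)^(-(d1+d2)/2)
    log_k = (h1 - 1.0) * log(x) - (h1 + h2) * log(1.0 + d1 * x / d2)
    return exp(log_c + log_k)

print(f_pdf(1.59, 5, 10))   # about 0.258 -- nowhere near 6.49
```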

Will eyeball the F-dist PDF to try to catch the glitch.

Les

*Edited: 21 May 2012, 5:35 a.m. *

Les, nice find.

The last change I remember making there was replacing a divide-by-two with a multiply-by-a-half. The RCL/ L at the end of the PDF code seems the likely candidate. I think the df should be divided by two, not by a half, before going into the beta function. If so, the fix is RCL[times] L instead.

Again, I'm probably off track.

- Pauli

Quote:

It isn't the Newton solver at all (well, it could be in some cases, but that's not the hugest problem).

Now that you're up to speed, how about an implementation of Halley's method instead :-)

Time for some pain killers and soon off to bed.

- Pauli

Yep, that was it.

Change the RCL/ to RCL* and all will be right again.

Halley's method should in theory be simple enough to implement. The ratio of the second derivative to the first derivative of the CDF that figures in the higher order correction is a simple enough rational function of the estimated quantile and the degrees of freedom.

I'll see what I can do. Keep in mind that Halley's method offers no guarantee that some interim guess won't enter the forbidden domain, and the cubic convergence really only applies if the initial guess is pretty good. The existing formulae for the initial guess seem to do poorly with very small probabilities, so in those circumstances I don't think Halley will confer much benefit over Newton, as frequent bisection and quite a few more iterations will be needed to keep the problem within range.
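For what it's worth, the Halley update itself is compact. Here is a stdlib-only sketch (not the 34s code) using the df1 = df2 = 1 case so the CDF stays elementary; for general df one would plug in the real CDF/PDF, and the ratio pdf'/pdf is exactly the simple rational function mentioned above:

```python
from math import atan, pi, sqrt, tan

# F(1,1) distribution: CDF(x) = (2/pi) * atan(sqrt(x)) for x > 0
def cdf(x):
    return (2.0 / pi) * atan(sqrt(x))

def pdf(x):
    return 1.0 / (pi * sqrt(x) * (1.0 + x))

def dlog_pdf(x):
    # pdf'(x)/pdf(x) -- a simple rational function of the quantile
    return -0.5 / x - 1.0 / (1.0 + x)

def halley_quantile(p, x=1.0, iters=20, tol=1e-14):
    for _ in range(iters):
        g  = cdf(x) - p          # residual
        g1 = pdf(x)              # first derivative of the CDF
        g2 = g1 * dlog_pdf(x)    # second derivative of the CDF
        dx = 2.0 * g * g1 / (2.0 * g1 * g1 - g * g2)   # Halley step
        x -= dx
        if abs(dx) < tol * abs(x):
            break
    return x

# exact quantile for df1 = df2 = 1 is tan(p*pi/2)^2
p = 0.75
print(halley_quantile(p), tan(p * pi / 2) ** 2)
```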

Les

The Newton solver is guarded against entering a forbidden zone (x<0) if such exists. There are also guards in place once the interval of search brackets the solution, to prevent Newton's method escaping.

It might be worth a try to see if Halley's is better for some of the distributions -- we can always have a flag indicating which to use depending on the distribution.

- Pauli

The good news is that today I worked up in Mathematica a Halley's method approach to the inverse F. Except in pathological cases -- usually when the probability is very tiny or one or both of the df is unity -- Halley's method converges to well over 30 digits most of the time in three iterations.

The bad news is that Mathematica crashed in the middle of my work and I lost the whole thing. No, I did not save an interim file.

I envision a couple of struggles with the inverse F. First, the initial approximation tends to be abominably bad when the probability is tiny. Moreover, I don't know (yet) whether substituting 1 for 1/(df-1) when one or both of the df's is unity is the best way to go. I wonder what Dieter makes of this. This is again another clash between theory and practice. In real life, the F quantile associated with, say, 1e-300 at a single df in the numerator and denominator is effectively zero, but it would be nice to get closer in order of magnitude, even if only a few digits are correct.

Wonder what Dieter thinks?

Les

I've had a quick look at a few other implementations. Most implement one or more inverse beta functions and these typically use Newton's method. There is mention of bisection and even Halley's method, although it only seems to be applied in some cases and not others. Likely due to the pathological cases you mentioned.

- Pauli

Les,

Quote:

Wonder what Dieter thinks?

Dieter says he's definitely not an F-distribution expert. ;-) Sorry, I think my contribution cannot be of much help here.

Quote:

In real life, the F quantile associated with, say 1e-300 at a single df in the numerator and denominator is effectively zero, but it would nice to get closer in order of magnitude, even if only a few digits are correct.

There's an easy solution to the case df1 = df2 = 1. Here, the F-quantile simplifies to tan^2(p*pi/2), which can be coded very nicely with the respective 34s conversion function:

#090
DEG->
*
TAN
x^2

For the mentioned case this will return as many digits as you like (here the result is 2.4674...E-600). For similar solutions cf. Wikipedia. Yes, that's German, but the English version does not include such a nice table, and the language of mathematics is universal. ;-)
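The shortcut is easy to sanity-check in ordinary double precision (a quick Python sketch; note that IEEE doubles underflow around 1E-308, so the 2.4674E-600 result really does need the 34s's extended range):

```python
from math import atan, pi, sqrt, tan

def f11_quantile(p):
    """Exact F(1,1) quantile: tan(p*pi/2)^2."""
    return tan(p * pi / 2.0) ** 2

q = f11_quantile(0.25)
# round-trip through the closed-form CDF (2/pi)*atan(sqrt(x)),
# which should recover p = 0.25
print(q, (2.0 / pi) * atan(sqrt(q)))

# in IEEE doubles the p = 1e-300 case simply underflows to zero,
# unlike on the 34s with its much wider exponent range
print(f11_quantile(1e-300))
```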

Anyway, maybe this can be a hint: For the Normal and Student quantile there are asymptotic expressions that approach the true value as p gets closer to zero. This, for instance, is used in the estimate for the Student quantile, and it's also the basis of the Normal estimate. Maybe there is a comparable solution for the F distribution as well?!
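To illustrate the kind of asymptotic tail estimate meant here (my own sketch of the standard leading-order inversion of Phi(-x) ~ phi(x)/x, not necessarily the exact estimate the 34s uses):

```python
from math import log, pi, sqrt

def normal_tail_quantile_estimate(p):
    """Leading-order asymptotic for |Phi^-1(p)| as p -> 0.
    From Phi(-x) ~ phi(x)/x one gets x^2 ~ u - ln(u) - ln(2*pi),
    with u = -2*ln(p)."""
    u = -2.0 * log(p)
    return sqrt(u - log(u) - log(2.0 * pi))

# for p = 1e-10 the true quantile magnitude is about 6.3613;
# the asymptotic estimate lands within about 0.1% of it
print(normal_tail_quantile_estimate(1e-10))
```

An estimate like this gets relatively better as p shrinks, which is exactly the regime where rational approximations around the median fall apart.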

In general, I think we should not neglect cases like p or x < 1E-100 or similar. Yes, for most real-life applications three digits will do, but that's not what the 34s project is designed for. The only limit should be set by the working range, so I think we should get the quantile right even for p = 1E-6140 (in DP mode). At least that's how I think the 34s should work. ;-)

By the way, my current project is a 34s program for the Normal distribution with two (user-selectable) accuracy levels. It also handles p very close to 0.5 without problems, as well as the central symmetric CDF from -x to +x with x close to 0. The quantile function uses a method that is significantly better than the Halley approach, leading to 32-34 digits (and with a slight tweak about 36) with just two refinements. I wonder how fast this will execute, so I think I will soon post the result of my humble efforts, hoping someone will try it on his "hardware 34s".

Dieter

Quote:

For similar solutions cf. Wikipedia.

Dieter, one of my favourite math papers is Hilbert's 1893 proof of the transcendence of Pi. I have never been able to find a translation. I have spent many an hour with Google Translate working my way through that one. My math-German is pretty passable, so I can get the gist of that entry.

This is important stuff so I will carry this up to a new thread later tonight when I have thought about it more.

Les

Quote:

Anyway, maybe this can be a hint: For the Normal and Student quantile there are asymptotic expressions that approach the true value as p gets closer to zero. This, for instance, is used in the estimate for the Student quantile, and it's also the basis of the Normal estimate. Maybe there is a comparable solution for the F distribution as well?!

Indeed, I'll have to crack open A&S, but I suspect the chi-square and inverse chi-square may be to the F as the normal is to the Student t. It makes sense. The normal and chi-square are special cases of the incomplete gamma, the t and F of the incomplete beta, and the incomplete gamma and incomplete beta are related to one another.
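One concrete version of that analogy can be checked numerically: as df2 grows, df1 times an F(df1, df2) variate tends to a chi-square with df1 degrees of freedom. A stdlib sketch (my own lgamma-based densities, not the 34s code):

```python
from math import exp, lgamma, log

def f_pdf(x, d1, d2):
    """F-distribution density via log-gamma."""
    h1, h2 = d1 / 2.0, d2 / 2.0
    log_c = lgamma(h1 + h2) - lgamma(h1) - lgamma(h2) \
            + h1 * (log(d1) - log(d2))
    return exp(log_c + (h1 - 1.0) * log(x)
               - (h1 + h2) * log(1.0 + d1 * x / d2))

def chi2_pdf(x, k):
    """Chi-square density with k degrees of freedom."""
    h = k / 2.0
    return exp((h - 1.0) * log(x) - x / 2.0 - h * log(2.0) - lgamma(h))

# density of d1*X where X ~ F(d1, d2), compared with chi-square(d1):
# the two columns should agree better and better as d2 grows
d1, x = 5, 3.0
for d2 in (10, 100, 10000):
    print(d2, f_pdf(x / d1, d1, d2) / d1, chi2_pdf(x, d1))
```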

I am going to bump this to its own thread later with my next comment. Will keep you posted.

Les