## 5269 Reputation

7 years, 114 days

## An existential question...

MaplePrimes

What do you think is the acceptable limit to the effort required to answer a question?

At what point does the question-and-answer game between two contributors become unreasonable?

How do you, the most highly ranked, deal with situations that last for days?

## Solving a least-squares problem with QR ...

Maple

This post is motivated by a question asked by @vs140580  ( The program is making intercept zero even though There is a intercept in regression Fit (A toy code showing the error attached) ).

The problem met by @vs140580 comes from the large magnitudes of the (two) regressors and the failure to Fit/LinearFit to find the correct solution unless an undecent value of Digits is used.
This problem has been answerd by @dharr after scaling the data (which is always, when possible, a good practice) and by
myself while using explicitely the method called "Normal Equations" (see https://en.wikipedia.org/wiki/Least_squares).

The method of "Normal Equations" relies upon the inversion of a symmetric square matrix H whose dimension is equal to the number of coefficients of the model to fit.
It's well known that this method can potentially lead to matrices H extremely ill-conditionned, and thus face severe numerical problems (the most common situation being the fit of a high degree polynomial).

• In English: http://www.math.kent.edu/~reichel/courses/intr.num.comp.1/fall09/lecture4/lecture4.pdf
• In French: https://moodle.utc.fr/pluginfile.php/24407/mod_resource/content/5/MT09-ch3_ecran.pdfI

The attached file illustrates how the QR decomposition method works.
The test case is @vs140580's.

Maybe the development team could enhance Fit/LinearFit in future versions by adding an option which specifies what method is to be used?

 > restart:
 > with(Statistics):
 > interface(version) (1)
 > Data := Matrix([[4.74593554708566, 11385427.62, 2735660038000], [4.58252830679671, 25469809.77, 12833885700000], [4.29311160501838, 1079325200, 11411813200000000], [4.24176959154225, 1428647556, 18918585950000000], [5.17263072694618, 1428647556, 18918585950000000], [4.39351114955735, 1877950416, 30746202150000000], [4.39599006758777, 1428647556, 18918585950000000], [5.79317412396815, 2448320309, 49065217290000000], [4.48293612651735, 2448320309, 49065217290000000], [4.19990181982522, 2448320309, 49065217290000000], [5.73518217699046, 1856333905, 30648714900000000], [4.67943831980476, 3071210420, 75995866910000000], [4.215240105336, 2390089264, 48670072110000000], [4.41566877563247, 3049877383, 75854074610000000], [4.77780395369828, 2910469403, 74061327950000000], [4.96617430604669, 1416936352, 18891734280000000], [4.36131111330988, 1416936352, 18891734280000000], [5.17783192063198, 1079325200, 11411813200000000], [4.998266287191, 1067513353, 11402362980000000], [4.23366152474871, 2389517120, 48661380410000000], [4.58252830679671, 758079709.3, 5636151969000000], [6.82390874094432, 1304393838, 14240754750000000], [4.24176959154225, 912963601.2, 8621914602000000], [4.52432881167557, 573965555.4, 3535351888000000], [4.84133601918601, 573965555.4, 3535351888000000], [6.88605664769316, 732571773.2, 5558875538000000], [4.35575841415627, 1203944381, 13430693320000000], [4.42527441640593, 955277678, 8795128298000000], [6.82390874094432, 997591754.9, 8968341995000000], [4.35144484433733, 143039477.1, 305355143300000]]):
 > # Direct use of LinearFit. # # As far as I know LinearFit is based on the resolution of the "Normal Equations" # (see further down), a system of equations that is known to be ill-conditioned # when regressors have large values (in particular when polynomial regression # is used). X := Data[.., [2, 3]]: Y := Data[.., 1]: LinearFit(C1+C2*v+C3*w, X, Y, [v, w]); (2)
 > # For roundoff issues the 3-by-3 matrix involved in the "Normal Equations" (NE) # appears to of rank < 3. # The rank of this matrix is rqual to 1+rank(X) and one can easily verify that # the 2 columns of X are linearly independent: LinearAlgebra:-LinearSolve(X, Vector(numelems(Y), 0)); LinearAlgebra:-Rank(X);  (3)
 > # Solve the least squares problem by using explicitely the NE. # # To account for an intercept we augment X by a vector column of "1" # traditionally put in column one. Z := `<|>`(Vector(numelems(Y), 1), X):   A := (Z^+ . Z)^(-1) . Z^+ . Y;          # Normal Equations (4)
 > # What is the rank of Z? # Due to the scale of compared to "1", Rank fails to return the good value # of rank(Z), which is obviously equal to rank(X)+1. LinearAlgebra:-LinearSolve(Z, Vector(numelems(Y), 0)); LinearAlgebra:-Rank(Z);  (5)

A WORKAROUND : SCALING THE DATA

 > model := unapply( LinearFit(C1+C2*v+C3*w, Scale(X), Scale(Y), [v, w]), [v, w] ); (6)
 > mX, sX := (Mean, StandardDeviation)(X); mY, sY := (Mean, StandardDeviation)(Y);  (7)
 > MODEL := model((x1-mX)/sX, (x2-mX)/sX) * sY + mY (8)
 > # Check that the vector of regression coefficients is almost equal to A found above # relative error lesst than 10^(-14) A_from_scaling       := < coeffs(MODEL) >: Relative_Discrepancy := (A_from_scaling - A) /~ A (9)

THE QR DECOMPOSITION  (applied on raw data)

The QR decomposition, as well as Given's rotation method, are two alternatives to the the NE method
to find the vector of regression coefficients.
Both of them are known to be less sensitive to the magnitudes of the regressors and do nt require (not
always) a scaling of the data (which can be quite complex with polynomial regression or when some
transformation is used to liearize the statistical model, for instanc Y=a*exp(b*X) --> log(Y)=log(a)+b*X).

 > N := numelems(Y); P := numelems(Z);  (10)
 > # Perform the QR decomposition of Z. Q, R := LinearAlgebra:-QRDecomposition(Z, fullspan); (11)
 > # Let C the column vector of length P defined by: C := (Q^+ . Y)[1..P]; (12)
 > # Then the vector of regression coefficients is given by: A_QR                 := (R[1..P, 1..P])^(-1) . C; Relative_Discrepancy := (A_QR - A) /~ A  (13)
 > # The matrix H = Z^+ . Z writes H                    := Z^+ . Z: H_QR                 := R^+ . Q^+ . Q . R: Relative_Discrepancy := (H_QR - H) /~ H (14)
 > # H_QR expression is required to obtain the covariance matrix of the regression coefficients.

## What is the true equation your time and...

Maple

This post is closely related to this one Centered Divided Difference approximations whose purpose is to build Finite-Difference (FD) approxmation schemes.
In this latter post I also talked, without explicitely naming it, about Truncation Error, see for instance https://en.wikiversity.org/wiki/Numerical_Analysis/Truncation_Errors.

I am foccusing here on a less known concept named "Equivalent Equation" (sometimes named "Modified Equation").
The seminal paper (no free acccess, which is surprising since it was first published 50 years ago) is this one by Warming and Hyett https://www.sciencedirect.com/science/article/pii/0021999174900114.
For a scholar example you can see here https://guillod.org/teaching/m2-b004/TD2-solution.pdf.
An alternative method to that of Warming and Hyett to derive the Equivalent Equation is given here (in French)
http://www.numdam.org/item/M2AN_1997__31_4_459_0.pdf (Carpentier et al.)

I never heard of the concept of Modified Equation applied to elliptic PDEs ; it's main domain of application is advection PDEs (parabolic PDEs in simpler cases).

Basically, any numerical scheme for solving an ODE or a PDE has a Truncation Error: this means there is some discrepancy between the exact solution of this PDE and the solution the scheme provides.
This discrepancy depends on the truncation error, in space alone for ODEs or elliptic PDEs, or in space and time for hyperbolic or advection PDEs.

One main problem with the Truncation Error is that it doesn't enable to understand the impact it will have on the solution it returns. For instance, will this sheme introduce diffusion, antidiffusion, scattering, ...
The aim of the Modified Equation is to answer these questions by constructing the continuous equation a given numerical scheme solves with a zero truncation error.
This is the original point of view Warming and Hyett developped in their 1974 paper.

This is a subject I work on 30 years ago (Maple V), and later in 2010.
It is very easy with Maple to do the painstaking development that Warming and Hyett did by hand half a century ago. And it is even so easy that the trick used by Carpentier et al. to make the calculations easier has lost some of its interest.

Two examples are given in the attched file
EquivalentEquation.mw

If a moderator thinks that this post should be merged with Centered Divided Difference approximations, he can do it freely, I won't be offended.

## Centered Divided Difference approximatio...

Maple

This code enables building Centered Divided Difference (CDD) approximations of derivatives of a univariate function.
Depending on the stencil we choose we can get arbitrary high order approximations.

The extension to bivariate functions is based upon what is often named tensorization in numerical analysis: for instance diff(f(x, y), [x, y] is obtained this way (the description here is purely notional)

1. Let CDD_x the CDD approximation of diff( f(x), x) ) .
CDD_x is a linear combination of shifted replicates of f(x)
2. Let s one of this shifted replicates
Let CDD_y(s) the CDD approximation of diff( s(y), y) ) .
3. Replace in CDD_x all shifted replicates by their corresponding expression CDD_y(s)

REMARKS:

• When I write for instance "approximation of diff(f(x), x)", this must be intended as a short for "approximation of diff(f(x), x) at point x=a"
• I agree that a notation like, for instance, diff(f(a), a) is not rigourous and that something like a Liebnitz notation would be better. Unfortunately I don't know how to get it when I use mtaylor.

 > restart:

CDDF stands for Cendered Divided Difference Formula

 > CDDF := proc(f, A, H, n, stencil)   description "f = target function,\nA = point where the derivatives are approximated,\nH = step,\nn = order of the derivative,\nstencil = list of points for the divided differenceCDDF\n";   local tay, p, T, sol, unknown, Unknown, expr:   tay := (s, m) -> convert(                      eval(                        convert(                          taylor(op(0, f)(op(1, f)), op(1, f)=A, m),                          Diff                        ),                        op(1, f)=A+s*H),                      polynom                    ):   p   := numelems(stencil):   T   := add(alpha[i]*tay(i, p+1), i in stencil):   T   := convert(%, diff):   if p > n+1 then     sol := solve([seq(coeff(T, h, i)=0, i in subsop(n+1=NULL, [\$0..p]))], [seq(alpha[i], i in stencil)])[];   else     sol := solve([seq(coeff(T, H, i)=0, i in subsop(n+1=NULL, [\$0..n]))], [seq(alpha[i], i in stencil)])[];   end if:   if `and`(is~(rhs~(sol)=~0)[]) then     WARNING("no solution found"):     return   else     unknown := `union`(indets~(rhs~(sol))[])[];     Unknown := simplify(solve(eval(T, sol) = Diff(op(0, f)(A), A\$n), unknown));     sol     := lhs~(sol) =~ eval(rhs~(sol), unknown=Unknown);     expr    := normal(eval(add(alpha[i]*op(0, f)(A+i*H), i in stencil), sol));   end if:   return expr end proc:
 > Describe(CDDF)
 # f = target function, # A = point where the derivatives are approximated, # H = # step, # n = order of the derivative, # stencil = list of points for the divided # differenceCDDF # CDDF( f, A, H, n, stencil )
 > # 2-point approximation of diff(f(x), x) | x=a CDDF(f(x), a, h, 1, [-1, 1]); convert(simplify(mtaylor(%, h=0, 4)), Diff);  (1)
 > # 3-point approximation of diff(f(x), x\$2) | x=a CDDF(f(x), a, h, 2, [-1, 0, 1]); convert(simplify(mtaylor(%, h=0)), Diff);  (2)
 > # 5-point pproximation of diff(f(x), x\$2) | x=a CDDF(f(x), a, h, 2, [\$-2..2]); convert(simplify(mtaylor(%, h=0, 8)), Diff);  (3)
 > # 7-point approximation of diff(f(x), x\$2) | x=a CDDF(f(x), a, h, 2, [\$-3..3]); # simplify(taylor(%, h=0, 10)); convert(simplify(mtaylor(%, h=0, 10)), Diff);  (4)
 > # 4-point staggered approximation of diff(f(x), x\$3) | x=a CDDF(f(x), a, h, 3, [seq(-3/2..3/2, 1)]); convert(simplify(mtaylor(%, h=0, 6)), Diff);  (5)
 > # 6-point staggered approximation of diff(f(x), x\$3) | x=a CDDF(f(x), a, h, 3, [seq(-5/2..5/2, 1)]); # simplify(taylor(%, h=0, 8)); convert(simplify(mtaylor(%, h=0, 8)), Diff);  (6)
 > # 5-point approximation of diff(f(x), x\$4) | x=a CDDF(f(x), a, h, 4, [\$-2..2]); convert(simplify(mtaylor(%, h=0, 8)), Diff);  (7)
 > # 7-point approximation of diff(f(x), x\$4) | x=a CDDF(f(x), a, h, 4, [\$-3..3]); convert(simplify(mtaylor(%, h=0, 10)), Diff);  (8)

A FEW 2D EXTENSIONS

 > # diff(f(x, y), [x, y]) approximation over a (2 by 2)-point stencil stencil := [-1, 1]: # step 1: approximate diff(f(x, y), x) over stencil < stencil > fx  := CDDF(f(x), a, h, 1, stencil): fx  := eval(% , f=(u -> f[u](y))): ix  := [indets(fx, function)[]]: # step 2: approximate diff(g(y), y) over stencil < stencil > where #         g represents any function in fx. fxy := add(map(u -> CDDF(u, b, k, 1, stencil)*coeff(fx, u), ix)): # step 3: rewrite fxy in a more convenient form [seq(u=op([0, 0], u)(op([0, 1], u), op(1, u)), u in indets(fxy, function))]: fxy := simplify( eval(fxy, %) ); convert(mtaylor(fxy, [h=0, k=0]), Diff)  (9)
 > # Approximation of diff(f(x, y), [x, x, y, y] a (3 by 3)-point stencil stencil := [-1, 0, 1]: # step 1: approximate diff(f(x, y), x) over stencil < stencil > fx  := CDDF(f(x), a, h, 2, stencil): fx  := eval(% , f=(u -> f[u](y))): ix  := [indets(fx, function)[]]: # step 2: approximate diff(g(y), y) over stencil < stencil > where #         g represents any function in fx. fxy := add(map(u -> CDDF(u, b, k, 2, stencil)*coeff(fx, u), ix)): # step 3: rewrite fxy in a more convenient form [seq(u=op([0, 0], u)(op([0, 1], u), op(1, u)), u in indets(fxy, function))]: fxy := simplify( eval(fxy, %) ); convert(mtaylor(fxy, [h=0, k=0], 8), Diff)  (10)
 > # Approximation of diff(f(x, y), [x, x, y] a (3 by 2)-point stencil stencil_x := [-1, 0, 1]: stencil_y := [-1, 1]: # step 1: approximate diff(f(x, y), x) over stencil < stencil > fx  := CDDF(f(x), a, h, 2, stencil_x): fx  := eval(% , f=(u -> f[u](y))): ix  := [indets(fx, function)[]]: # step 2: approximate diff(g(y), y) over stencil < stencil > where #         g represents any function in fx. fxy := add(map(u -> CDDF(u, b, k, 1, stencil_y)*coeff(fx, u), ix)): # step 3: rewrite fxy in a more convenient form [seq(u=op([0, 0], u)(op([0, 1], u), op(1, u)), u in indets(fxy, function))]: fxy := simplify( eval(fxy, %) ); convert(mtaylor(fxy, [h=0, k=0], 6), Diff)  (11)
 > # Approximation of the laplacian of f(x, y) stencil := [-1, 0, 1]: # step 1: approximate diff(f(x, y), x) over stencil < stencil > fx  := CDDF(f(x), a, h, 2, stencil): fy  := CDDF(f(y), b, k, 2, stencil): fxy := simplify( eval(fx, f=(u -> f(u, b))) + eval(fy, f=(u -> f(a, u))) ); convert(mtaylor(fxy, [h=0, k=0], 6), Diff)  (12)
 >

## Maple proof of Poncelet's theorem for th...

Maple

This post is inspired by a serie of questions from @JAMET.
I wondered if it was possible to prove plane geometry theorems with the geometry package.

Here is an illustration for the Poncelet's theorem for the triangle (French designation), see for instance
https://en.wikipedia.org/wiki/Poncelet%27s_closure_theorem

Are any of you interested in challenging the geometry package to proof other plane geometry theorems?

 > restart:

Poncelet's theorem for the triangle

Let ABC a triangle, c its incircle (center omega, radius r) and C its circumcircle (center Omega, radius R).
Let D the distance between omega and Omega.

then R^2 - D^2 - 2*r*R = 0

Proof

Without loss of generality one sets :

x(A) = y(A) = 0
and  y(B) = 0

ABC is a non degerated triangle provided x(B) <> 0 and y(C) <> 0

 > with(geometry):
 > kernelopts(version); (1)
 > assume(x__B <> 0): assume(y__C <> 0):
 > point(A, 0, 0); point(B, x__B, 0); point(C, x__C, y__C);   (2)
 > triangle(T, [A, B, C]) (3)
 > bisector(bA, A, T); bisector(bB, B, T); eA := isolate(Equation(bA, [x, y]), y): eB := isolate(Equation(bB, [x, y]), y): xc := solve(rhs(eA)=rhs(eB), x): yc := eval(rhs(eA), x=xc): point(omega, xc, yc): r := distance(omega, line(lAB, [A, B]))   (4)
 > circumcircle(C, T, 'centername' = Omega); R := radius(C);  (5)
 > Oo := distance(Omega, omega) (6)
 > S := simplify(R^2 - Oo^2 - 2*r*R)  assuming x__B::real, x__C::real, y__C::real (7)
 > simplify(S) assuming x__B > 0, y__C > 0; simplify(S) assuming x__B > 0, y__C < 0; simplify(S) assuming x__B < 0, y__C > 0; simplify(S) assuming x__B < 0, y__C < 0;    (8)
 >

Improvements of the geometry package

It already appears that (some) assumptions are not (always) correctly taken into account. This is a weak point which requires, as in the attached mw, to use an indirect approache to construct the incircle.

As a matter of fact, the procedure incircle, whose first lines are

```showstat(incircle)

geometry:-incircle := proc(inci, T)
local cname, d, A, B, l1, l2, dis, x, y, tmp, msg;
1   if nargs < 2 or 3 < nargs then
2     error "wrong number of arguments"
end if;
3   if geometry:-form(T) <> ('triangle2d') then
4     error "wrong type of arguments"
end if;
5   if nargs = 3 and op(1,args) = ('centername') and type(op(2,args),'name') then
6     cname := op(2,args)
else
7     cname := cat('center_',inci)
end if;
8   if geometry:-method(T) = (':-points') then
9     d := geometry:-DefinedAs(T);
10     A := op(1,d);
11     B := op(2,d);
12     msg := sprintf("find the bisector of %a at vertex %a",T,A);
13     userinfo(2,geometry,msg);
14     geometry:-bisector('l1',A,T);
15     msg := sprintf("find the bisector of %a at vertex %a",T,B);
16     userinfo(2,geometry,msg);
17     geometry:-bisector('l2',B,T);
18     msg := sprintf("find the intersection of the two bisectors");
19     userinfo(2,geometry,msg);
20     geometry:-intersection(cname,l1,l2);
```

requires that the two bissectors are not parallel(call to geometry:-intersection).

Since the non-parallelism of bisectors is trivial for all non-degenerate triangles, why doesn't incircle inherit this property rather than not being able to decide if the bisectors are parallel or not?)

Here is a detail of what happens and the endless loop in which incircle seems to be caught

```kernelopts(version);
Maple 2015.2, APPLE UNIVERSAL OSX, Dec 20 2015, Build ID 1097895

AreParallel(bA, bB, 'condition'):
AreParallel: hint: cannot determine if -y__C*(x__B^2)^(1/2)*(x__C*(x__B^2)^(1/2)-x__B*(x__B^2)^(1/2)-x__B*((x__B-x__C)^2+y__C^2)^(1/2))+y__C*(x__B^2)^(1/2)*(x__B*(x__C^2+y__C^2)^(1/2)+x__C*(x__B^2)^(1/2)) is zero

assume(lhs(condition) <> 0);
AreParallel(bA, bB);
false
intersection(J, bA, bB);
J

infolevel[geometry] := 4:
incircle(inc, T);
incircle: find the bisector of T at vertex A
incircle: find the bisector of T at vertex B
incircle: find the intersection of the two bisectors
intersection: find the intersection between two lines l1 and l2
intersection: one point of intersection
incircle: find the radius of the incircle
line: define the line from two points
circle: define the circle from its center and radius
circle: hint: abs(x__B^2*y__C/(csgn(x__B)*x__B+(x__B^2-2*x__B*x__C+x__C^2+y__C^2)^(1/2)+(x__C^2+y__C^2)^(1/2)))/(x__B^2)^(1/2) > 0
Error, (in geometry:-circle) not enough information: the radius might not be positive
assume(abs(x__B^2*y__C/(csgn(x__B)*x__B+(x__B^2-2*x__B*x__C+x__C^2+y__C^2)^(1/2)+(x__C^2+y__C^2)^(1/2)))/(x__B^2)^(1/2) > 0):

assume(abs(x__B^2*y__C/(csgn(x__B)*x__B+(x__B^2-2*x__B*x__C+x__C^2+y__C^2)^(1/2)+(x__C^2+y__C^2)^(1/2)))/(x__B^2)^(1/2) > 0):
incircle(inc, T);
incircle: find the bisector of T at vertex A
incircle: find the bisector of T at vertex B
incircle: find the intersection of the two bisectors
intersection: find the intersection between two lines l1 and l2
AreParallel: hint: cannot determine if -y__C*(x__B^2)^(1/2)*(x__C*(x__B^2)^(1/2)-x__B*(x__B^2)^(1/2)-x__B*((x__B-x__C)^2+y__C^2)^(1/2))+y__C*(x__B^2)^(1/2)*(x__B*(x__C^2+y__C^2)^(1/2)+x__C*(x__B^2)^(1/2)) is zero
intersection: two given lines intersect each other if -y__C*(x__B^2)^(1/2)*(x__C*(x__B^2)^(1/2)-x__B*(x__B^2)^(1/2)-x__B*((x__B-x__C)^2+y__C^2)^(1/2))+y__C*(x__B^2)^(1/2)*(x__B*(x__C^2+y__C^2)^(1/2)+x__C*(x__B^2)^(1/2)) <> 0
Error, (in geometry:-intersection) not enough information

assume(-y__C*(x__B^2)^(1/2)*(x__C*(x__B^2)^(1/2)-x__B*(x__B^2)^(1/2)-x__B*((x__B-x__C)^2+y__C^2)^(1/2))+y__C*(x__B^2)^(1/2)*(x__B*(x__C^2+y__C^2)^(1/2)+x__C*(x__B^2)^(1/2)) <> 0):
incircle(inc, T);
incircle: find the bisector of T at vertex A
incircle: find the bisector of T at vertex B
incircle: find the intersection of the two bisectors
intersection: find the intersection between two lines l1 and l2
intersection: one point of intersection
incircle: find the radius of the incircle
line: define the line from two points
circle: define the circle from its center and radius
circle: hint: abs(x__B^2*y__C/(csgn(x__B)*x__B+(x__C^2+y__C^2)^(1/2)+(x__B^2-2*x__B*x__C+x__C^2+y__C^2)^(1/2)))/(x__B^2)^(1/2) > 0
Error, (in geometry:-circle) not enough information: the radius might not be positive

```

 1 2 3 4 5 Page 1 of 5
﻿