acer

32348 Reputation

29 Badges

19 years, 329 days
Ontario, Canada

Social Networks and Content at Maplesoft.com

MaplePrimes Activity


These are replies submitted by acer

I look forward to finding out what mark Alec's qsort gets on the assignment. ;)

The essence of the mechanism by which mysort works looks the same as that of numsorter above, with a few differences such as evalhf speed versus evalf flexibility, ~ versus map, and so on.
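
As a toy illustration of that ~ versus map distinction (the list here is arbitrary; both forms apply evalf elementwise and return the same result):

> map(evalf, [1/3, Pi]);
> evalf~([1/3, Pi]);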

acer

This can be made more efficient in several ways. The transposition could be done inplace on Vt and U, or one could transpose only the vector (also inplace, though that saves less). Maple doesn't know that it can take the inverse of S without resorting to the whole LU solve. And the thin option of SingularValues can reduce the size of the output (here just a bit, but more for "taller" problems).

> U,S,Vt := SingularValues(A,output=['U','S','Vt'],'thin'):

> (b^%T.U.DiagonalMatrix(Vector(3,i->evalhf(1/S[i]))).Vt)^%T;
                             [2.63293929474353927]
                             [                   ]
                             [3.66652382290048129]
                             [                   ]
                             [5.20251471911851304]
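
For comparison, the same least-squares solution can be had in a single call (a sketch; LinearAlgebra:-LeastSquares chooses its own method, so it won't necessarily use the SVD internally, but for a full-rank A it should return the same solution):

> LinearAlgebra:-LeastSquares(A,b);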

acer

This is my favourite kind of post -- putting it all together to get the desired functional result. I look forward to "modding it up" in the new Primes.

Use of ssystem that works in the Standard GUI, with various quotes in the argument, is an achievement in itself. That function in Standard is quite different from the one in the command-line interface, and some examples which work as expected in the latter go wrong in the former.
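
For reference, the plain calling sequence is below (the command is arbitrary); ssystem returns a list containing the exit status and the captured output string:

> ssystem("echo hello");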

acer

Yes, I knew that it was builtin, thanks. But I can think of no reason for that last one's returning a float. I will submit an SCR against it.

acer

The calls to combinat:-randperm produce a lot of collectible garbage, which takes time to manage.

In a way, this boils down to discussions of a few years back: the give and take between producing a lot of random values up front (with corresponding large memory allocation) versus repeated Vector (or list) creation of many smaller samples (and the ensuing time cost of memory management of those temp objects). I still say that the ideal solution would be the ability to repopulate a single Vector with a multitude of random samples (inplace, using a compiled routine). That's how other good RNGs work. I got the impression, Joe, that you were expressing the same sort of idea.
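
As a sketch of that ideal (this is not existing library functionality; the LCG constants below are illustrative placeholders, deliberately simple and not a good generator), a compiled routine can repopulate a pre-allocated Vector and hand back its state for the next call:

> FillRandom := proc(V::Vector(datatype=integer[4]),
>                    n::integer, m::integer, seed::integer)
> local i::integer, s::integer;
>   # simple LCG; assumes hardware integers wide enough for 69069*s
>   s := seed;
>   for i from 1 to n do
>     s := irem(69069*s + 1, 2147483647);
>     V[i] := 1 + irem(s, m);   # value in 1..m
>   end do;
>   return s;                   # updated state, to seed the next refill
> end proc:
> cFillRandom := Compiler:-Compile(FillRandom):
> V := Vector(10^5, 'datatype'=integer[4]):
> state := cFillRandom(V, 10^5, 3, 12345):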

My solution below is to generate a huge sample up front, based on how many iterations I plan to do and how many values I expect to need (at most, with a bit of leeway).

First, if I may make a few minor corrections, Joe. Your ShuffleMatrix3 had a typo in the typecheck of parameter T. And your example M and T would need subtype=Matrix to get past the parameter typing of ShuffleMatrix3. But I'll get rid of ShuffleMatrix3 altogether below and call PermuteMatrixRows directly in the loop, to save all those extra function calls.

I also commented out the part of PermuteMatrixRows that updated M, to try to cut its work and time in half. I think that it's fine to have a routine which re-uses a container like T, and that with such in hand it's not necessary to "fake" inplace operation on input M.

> restart:
>
> PermuteMatrixRows := proc(M :: Array(datatype=integer[4])
>                           , T :: Array(datatype=integer[4])
>                           , P :: Vector(datatype=integer[4])
>                           , m :: nonnegint
>                           , n :: nonnegint
>                          )
> local i,j;
>     for i to m do
>         for j to n do
>             T[i,j] := M[P[i],j];
>         end do;
>     end do;
>     ## Letting it off the hook for updating M.
>     #for i to m do
>     #    for j to n do
>     #        M[i,j] := T[i,j];
>     #    end do;
>     #end do;
>     return NULL;
> end proc:
>
> kernelopts(printbytes=false):
>
> (m,n):=3,5:
> M := rtable(1..m,1..n,random(-99..99),'subtype'=Matrix,'datatype'=integer[4]):
> T := rtable(1..m,1..n,'subtype'=Matrix,'datatype'=integer[4]):
>
> maxnum:=100000:
>
> st:=time():
> cPermuteMatrixRows:=Compiler:-Compile(PermuteMatrixRows):
> for k from 1 to maxnum do
>   P := Vector(combinat:-randperm(m), 'datatype'=integer[4]):
>   cPermuteMatrixRows(M, T, P, m, n);
> end do:
> time()-st;
                                    20.465

Now, using a method which explicitly creates its own permutation of 1..m. (I hope that it is a valid implementation of Fisher-Yates; please point out any mistake.) It's a bit wasteful in this manner: whenever it needs a random value from 1..j with 1<=j<=m it draws from the same big sample, rejecting values >j, which wastes quite a lot of the precomputed values. So the memory use could be improved by utilizing several reservoirs instead.

> restart:
>
> Mshuffle:=proc(M1::Matrix(datatype=integer[4]),
>                M2::Matrix(datatype=integer[4]),
>                m::integer, n::integer,
>                V::Vector(datatype=integer[4]),
>                ransam::Vector(datatype=integer[4]))
> # Copies row i of the m x n M1 into row V[i] of the m x n M2,
> # for i=1..m, where V is an inplace-generated random
> # shuffle of 1..m.
> # ransam is a source of precomputed random values in 1..m,
> # with ransam[1] storing the position of the previous run's
> # last accessed entry.
> local i::integer, j::integer, count::integer,
>       need::integer, cand::integer;
>   count:=ransam[1];
>   for i from 1 to m do
>     V[i]:=i;
>   end do;
>   for i from m to 2 by -1 do
>     need:=1;
>     while need=1 do
>       count:=count+1;
>       cand:=ransam[1+count];
>       if cand<=i then
>         need:=0;
>       end if;
>     end do;
>     V[i],V[cand]:=V[cand],V[i];
>   end do;
>   for i from 1 to m do
>     for j from 1 to n do
>        M2[V[i],j]:=M1[i,j];
>     end do;
>   end do;
>   ransam[1]:=count;
>   NULL;
> end proc:
>
> kernelopts(printbytes=false):
>
> (m,n) := 3,5:
> M := rtable(1..m,1..n,random(-99..99),'subtype'=Matrix,'datatype'=integer[4]):
> T := rtable(1..m,1..n,'subtype'=Matrix,'datatype'=integer[4]):
>
> maxnum:=100000:
>
> st:=time():
> cMshuffle:=Compiler:-Compile(Mshuffle):
> B:=Vector[row](m,datatype=integer[4]):
> rsrc:=LinearAlgebra:-RandomVector(maxnum*m*m,generator=1..m,
>                            outputoptions=[datatype=integer[4]]):
> for k from 1 to maxnum do
>   cMshuffle(M,T,m,n,B,rsrc);
> end do:
> time()-st;
                                     0.837

For 1000000 repetitions, cMshuffle took 7.5 sec and 110MB for the session, while cPermuteMatrixRows with randperm took 201.5 sec and 35.8MB for the session.

I'd like to get a little philosophical. I don't think that implementing simple algorithms (for sorting, or standard deviation, or shuffling, etc) is too much of a pain. The Compiler gives the current best performance possible in Maple for many purely numerical problems (except compared to calling out externally to some other program...). I like to think that examples which illustrate it (along with inplace and memory-management refinements) are going to help some people with other similar tasks.

BTW, I think that your ideas, Joe, about efficiently permuting the entries of an rtable of any datatype are feasible. The OpenMaple API allows for it, no problem. But it might require writing a (reusable) compiled external routine.

acer

Yes, numapprox, mentioned also in an earlier reply above, is a good approach for this.

But be careful: the achieved accuracy might not be as good as that last argument to minimax suggests.

For 8 digits (was that actually requested by the OP!?) one could try something like this, after evaluating at the values for Km, S0 and v:

Digits:=15:

numapprox:-minimax(s,t=0..135,[5,6],.5e-9);

And so on.

That gives a result that is easy to code in double precision, with 15-digit float coefficients.

And the numerator and denominator are even returned already in Horner form.

A tougher question might be: how to get or demonstrate that a numapprox result is adequate for all the parameter ranges mentioned in the OP's followup replies.
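
One way to probe that, for a given set of parameter values, is to ask minimax for its error estimate and then check it independently (a sketch; the weight 1 and the name err are just one possible calling sequence):

Digits := 15:
approx := numapprox:-minimax(s, t=0..135, [5,6], 1, 'err'):
err;                                       # error estimate reported by minimax
numapprox:-infnorm(s - approx, t=0..135);  # independent check of the max error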

acer

Hi Alec,

The main difference between d01amc and d01smc is that the latter is supposed to be thread-safe (in its own right). So, if the Maple integrand were also known to be thread-safe, then the define_external call to the wrapper around d01smc could reasonably get the 'THREAD_SAFE' option. Without that option to define_external (which is presently the case under `evalf/int` in M13 and M14), access to the library containing the NAG function entry points is only allowed for one thread at a time. See ?define_external for a paragraph on this.
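
For illustration, the option belongs in the declaration itself. A minimal and entirely hypothetical example of the shape of such a call, with a made-up entry point and library name:

> extf := define_external('my_c_func',     # hypothetical entry point
>                         'THREAD_SAFE',   # permit concurrent access to the library
>                         'x'::float[8],
>                         'RETURN'::float[8],
>                         'LIB'="mylib.so"):   # hypothetical library name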

But it's a tricky thing to automate correctly in general: the Maple integrand might be thread-unsafe. It's easier for everyone when the evalf/Int call is explicit in the user's code, but what about code that calls a Maple Library routine which in turn calls evalf/Int internally? In that case there is no convenient and automatic way to pass an evalf/Int option which correctly toggles the relevant define_external call. The user's code might be using Maple Threads, but the computation may instead depend on recognizing that some integrand is actually thread-unsafe.

The d01smc routine may also have an extra (new, relative to d01amc) argument of type (struct) Nag_Comm, for communicating with the function. But exposing that via evalf/Int might not be worthwhile for most people.

What I'm trying to suggest is that, in the absence of the relevant define_external call having its 'THREAD_SAFE' option, the choice between calling out from Maple to d01amc versus d01smc may be of little or no consequence. And the question of when and how to allow that define_external option might be problematic.

I don't know of a fast, efficient, and robust piece of code to ascertain thread-safety of a Maple routine. What with lexical scoping and other things, there are many devious ways for a Maple procedure to write to a higher scope. A user-defined integrand in operator form might have 'option threadsafe' or a similar flag. But in expression form, an integrand could contain a quoted function call to some other thread-unsafe procedure. And so on...

So perhaps it would be better for evalf/Int only to allow the user to select explicitly the thread-safe NAG function, either by explicit name like _d01smc, _d01skc, etc or by a general option such as, say, 'allowthreads', or by passing the integrand as a proc with 'option threadsafe' (which is a made-up term).

acer

Hi Axel,

Yes, it looks as if `evalf/int/control` might use `evalf/int/CreateProc` or something to make a proc from an expression-form integrand.

But before that happens, `evalf/int/control` seems to poke around quite expensively at an expression-form integrand when a NAG method is not specified. It may be looking for singularities. I can understand why it might do that: it wants to compute tough problems correctly by default. I wouldn't mind seeing another way to prevent that cost.
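
For example, forcing one of the NAG methods bypasses much of that up-front analysis (the integrand here is arbitrary; _d01ajc is the adaptive method for a finite interval):

> evalf(Int(exp(-x^2), x=0..1, method=_d01ajc));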

acer

Hi Axel,

Yes, it looks as if `evalf/int/control` might use `evalf/int/CreateProc` or something to make a proc from an expression-form integrand.

But before that happens, `evalf/int/control` seems to really poke expensively at an expression-form integrand when a NAG method is not specified. It may be looking for singularities. I can understand why it might do that: it wants to compute tough problems correctly by default. I wouldn't mind seeing another way to prevent that cost.

acer

Thanks, Alec. The minor mystery is now solved (see your Edit.)

But it does puzzle me that evalf/Int can do so much more (symbolic?) work, digging away at investigating the integrand when it's in expression form. I suppose that passing the integrand as a procedure/operator merely tricks it into treating it as a black box that cannot be poked. It's a little worrisome that one has to use a cheap trick, or know which forced method is appropriate, to disable it. It might be nicer to have an optional parameter that controls it, e.g. discont=false (assuming that I'm right and that it is discontinuity checking that makes the difference) or whatever.
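
A toy comparison of the two calling forms (the integrand is arbitrary; the first call gets the full analysis of the expression, while the second passes an operator and a plain range, so the integrand is treated as a black box):

> evalf(Int(exp(-x^2)*ln(x+1), x=0..1));
> evalf(Int(x -> exp(-x^2)*ln(x+1), 0..1));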

acer
