acer

Ontario, Canada

Social Networks and Content at Maplesoft.com

MaplePrimes Activity


These are replies submitted by acer

@Joe Riel 

When I first saw this topic, the first thing that occurred to me was to redefine `assuming` in the equality case. But I discounted it, since I was convinced that it would break some existing functionality. As we know, `assuming` currently places temporary assumptions by substituting local names (with assumptions attached) for the original names. So `is` works, but there is no evaluated substitution (no `evalb` checks ensue, all kinds of other stuff doesn't come about, etc.).

I suspect that there is some class of problem which would behave significantly differently if actual temporary assignment were done instead of merely temporary assumptions, when using such a new `assuming` in the equality case. But alas, I have no significant example of "breakage" at hand -- just a gut feeling. ;) So one question is: would such examples (of now-differing behaviour) always be a good enhancement to `assuming`? Or could there be some undesirable breakage in some (rare?) current usage? Sure, this sounds like an all-round improvement, but what about code that relies on the current behaviour? This is mostly why I considered that `using` may be a better choice of infix operator for such new functionality, instead of hijacking `assuming`.

And, of course, hijacking may be quite the wrong term. Maybe it's a great fit.

It's a neat piece of syntactic sugar.

But perhaps `assigning` is not the best choice of word, as it doesn't hang together as nicely as it might as an English statement. How about `using` instead? E.g.,

plot(a*sin(x),x=0..b) &using a=-2,b=7;

I note that what you are proposing seems to function like the following, which already works and relies on the keyword `use`.

use a=-2,b=7 in plot(a*sin(x),x=0..b); end use;

It would be nicer with parser support (in both 1D and 2D input), so that the infix operator could be accepted without the `&` prefix and, as you mentioned, with a precedence high enough to work even without round-bracket grouping of the substitution terms. I.e.,

plot(a*sin(x),x=0..b) using a=-2,b=7;

which seems quite natural.

acer

It seems to work as expected in the Classic GUI of Maple 14.

Thanks for the comments. But I should mention that I knew full well what caused it; the cause is not as interesting as the fact that the slowdown is so severe. It's a huge amount of overhead to impose (unnecessarily) on the inexperienced user in the default environment. Most users will not realize that their code has this performance penalty.

Do not for a moment forget that what is happening is not an initial parsing slowdown, but an ongoing computational slowdown. And that is far more remarkable as an effect of typesetting merely the code, not the output. And it is non-negligible -- each application of that dot-lookalike takes as much time as a 2x2 Matrix multiplication!

Working around the problem by using prefix operators is not the answer for the majority of users. The experienced user will likely have no trouble finding that or other deft workarounds. But the nonexpert user, who is most affected by whether the default environment is effective and naturally efficient, will likely not even realize that there is an avoidable performance issue. For the majority of users, the solution to this issue is to not use 2D Math input mode for writing procedures.

acer

It was really unclear (to me) whether the OP just wanted to easily create a 3x2 Matrix from a list and have the scanning be done by column, or whether the goal was to produce something specifically with C order.

There seemed to be the possibility that she only really wanted the scanning of the flat list data by columns. (Doug seems to have inferred that, and ignored what would then be the C_order red herring.) In that case, the underlying "order" that Maple uses to store the data internally is irrelevant, and the OP should most certainly not be trying to get cheap transposition just by flipping the order. It seemed as if the OP might not realize that order=C_order refers to the internal storage in memory and not to how the data appears and gets printed.

The other possibility is that the OP wanted not only the data in A to be injected (scanned) into a Matrix by columns, but that she also very specifically wanted C_order storage underneath. (Who knows, maybe the OP wanted to use ImageTools instead of LinearAlgebra and failed to clarify it.)

It is confusing that the Matrix constructor reads a flat list as a row-scan, and never as a column-scan unless the list is split. I mean it seems almost a bug that this does not work,

A:=[$1..6]:
Matrix(3,2,A,scan=columns);

while this does work,

A:=[$1..6]:
Matrix(3,2,[ListTools:-LengthSplit(A,3)],scan=columns);
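For comparison (a cross-language analogy, not Maple itself), the column-scan behaviour that the split-list workaround produces is exactly what NumPy's `reshape` does with `order='F'`:

```python
import numpy as np

flat = np.arange(1, 7)                     # the analogue of A := [$1..6]
by_cols = flat.reshape((3, 2), order='F')  # fill column-by-column
# by_cols is [[1, 4], [2, 5], [3, 6]]
```

Note that the `order='F'` here controls only the scan pattern of the reshape; the analogy is with `scan=columns`, not with the resulting Matrix's internal storage order.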

Fortunately, the answer that I gave (and I see since submission that Preben's is essentially the same) does not get garbled if one also slaps on an order=C_order option. So hopefully it works whatever the OP's actual goal was.

In case anyone cares, here's a tip. Do not try to transpose a Matrix by just flipping the order, even for square Matrices. It may print like what you want, but internally it is not, and it won't always get treated as you expect by LinearAlgebra.

M:=Matrix([[1.0,2.0],[3.0,4.0]],datatype=float[8]);
rtable_options(M,order=C_order); # acts inplace on M
M;

And don't be beguiled into expecting that inplace 'order' flipping will "transpose" as neatly for nonsquare Matrices, just because it seems to work for square Matrices.
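The same storage-versus-shape pitfall can be demonstrated outside Maple. Here is a NumPy sketch (an analogy, not Maple code) of why reinterpreting the identical bytes with the opposite storage order looks like a transpose for a square Matrix but cannot work for a nonsquare one:

```python
import numpy as np

m = np.array([[1.0, 2.0], [3.0, 4.0]])  # stored row-major (C order)
# reinterpret the same bytes as column-major, keeping the 2x2 shape
flipped = np.ndarray(m.shape, dtype=m.dtype, buffer=m.tobytes(), order='F')
assert (flipped == m.T).all()            # square case: prints like a transpose

r = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])  # 2x3
flipped_r = np.ndarray(r.shape, dtype=r.dtype, buffer=r.tobytes(), order='F')
# still 2x3, while the true transpose r.T is 3x2 -- the entries
# are merely rearranged, not transposed
assert flipped_r.shape != r.T.shape
```

The square case works only by coincidence of the shape; the nonsquare case makes plain that flipping the order flag is a reinterpretation of storage, not a transposition.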

BTW, ArrayTools has lots of nifty exports for manipulation. I was able to cook up another three ways to get the effect that Erik showed, using mixes of DataTranspose and Reshape. But I advise using some of those only if one understands the storage-vs-display distinction.

acer


If you look carefully enough at the results returned by `solve` in both my and John's earlier answers, you should be able to see that they do in fact contain such formulae. That is, they contain answers of the form a=..., b=... in terms of X.

But the right-hand sides of those answers like {a=..., b=...} in terms of X are implicit RootOfs, not explicit arithmetic formulae in terms of radicals.

You may not like that implicit RootOf form. But as I showed, it is useful. By substituting a float value for X and then applying evalf (or fsolve), all the roots for any given numeric X can be obtained. Hence, you can much more easily and reliably obtain all the roots of those RootOfs involving simple polynomials in X (for any given numeric X) than you can obtain roots of the original system of equations using fsolve (for any given numeric X).
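That workflow -- fix a numeric X, then numerically find every root of the defining polynomial -- can be sketched with a hypothetical cubic (the polynomial below is illustrative, not the one from the actual worksheet); here in NumPy as a stand-in for evalf on a RootOf:

```python
import numpy as np

# Hypothetical: suppose solve returned a = RootOf(_Z^3 - X*_Z + 1).
# For any numeric X, ALL roots of the defining polynomial are recoverable:
X = 2.0
coeffs = [1.0, 0.0, -X, 1.0]     # _Z^3 - X*_Z + 1
all_roots = np.roots(coeffs)     # every root, not just one branch
residuals = np.polyval(coeffs, all_roots)
assert np.allclose(residuals, 0.0)
```

The point mirrors the one above: once X is numeric, the univariate polynomial hands over all of its roots at once, which is far more reliable than coaxing every solution of the original multivariate system out of a numeric solver.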

The extra RootFinding:-Parametric bit I added in my worksheet, before the `solve` stuff, was to show how many roots (i.e. distinct sets of values for a, b, c, d) there are for separate ranges of the real-valued parameter X.

acer

Something has gone wrong with the last two paragraphs of the original post.

At some earlier date, those two paragraphs likely mentioned setting values for kernelopts(prettyprint). But the text's meaning is now slightly garbled, as if instances of that term had disappeared from the text.

For those wondering, labelling can be enabled in the Standard GUI by setting the `prettyprint` interface setting to either 1 or 2. For example, by issuing,

interface(prettyprint=1):

Now, labeling enabled this way doesn't usually produce results as nice in Standard. Oh, sure, it allows results to be neatly displayed which would otherwise be practically unviewable. But enabling it also seems to wreck proper line-breaking of other, non-labeled terms in long lines. And so huge, long lines can appear, sometimes in the displayed output of nonscalar objects. And, as John mentioned, the mechanism tends to choose common subexpressions for labeling which are awkwardly short. Presumably that is why labeling is not enabled by default in Standard: it works too poorly.

But labeling should be an important aspect of functionality in any good CAS's user interface, and it really ought to be properly ported from commandline/Classic to Standard. After all, the Standard GUI was introduced seven years and six major releases ago.

Unfortunately, most Standard users don't realize what great stuff they are missing out on... because they've never seen the alternative. There are lots of simple symbolic solving examples where the unlabeled output either exceeds the permitted display length at the default settings or is too long to be sensibly legible.

acer

Thanks, John. I had also considered using `eliminate`.

But I figured that the OP might be interested in knowing how many solutions lay in the various regions for X along the real line.

As a wise man once told me: people asking (vague) questions about parametric systems are almost always going to be disappointed by answers based upon guesses as to what they really want.

Hey! You and your fancy commandline/Classic subexpression labeling. You made my Windows Standard GUI "go away" with your semi-colon. ;)

> sol:=eliminate({a*b^2-b+c-d-3, a^3*b^2-b+c-d^3+4, 
> a^4+b^3-b^2+c-d+2, 2*a*b^2-b+c-d-1-X}, {a,b,c,d}):

> sol2:= SolveTools:-PolynomialSystem( {a*b^2-b+c-d-3, a^3*b^2-b+c-d^3+4,
> a^4+b^3-b^2+c-d+2, 2*a*b^2-b+c-d-1-X}, {a,b,c,d}, domain=parametric):

> length(sol), length(sol2);
5183500, 1398546

I suppose that I'd have to read the SolveTools help-page to figure out how to pick apart the essentially unviewable answer in the Standard GUI, substitute in values for X, etc. It's ridiculous that the result sol2 can be viewed OK -- without the Standard GUI turning to sludge and finally printing something nearly unreadable -- but only after issuing something like,

> interface(prettyprint=1);

acer


@Christopher2222 It's really not clear what you are referring to when you write that Maple scores last.

I'm pretty sure that you are referring to the review Duncan mentioned, and not to the review mentioned in the parent post. But you've failed to mention which it was that you were discussing, and with the lack of comment-threading-indentation it's ambiguous.

@DuncanA It's true that the ncrunch review is more in-depth. It has a slant toward computational statistics (which is fine, as that is pretty much declared explicitly). But it is also quite wrong in places. And the benchmark code is not all implemented in a fair or near-optimal way.

See here for a Mapleprimes post about it, from a few years ago.

The functionality charts are also wrong in places, and misleading in others. For example, some claims about Maple's debugger are just plain wrong. It says that Maple doesn't support FFT in more than one dimension, while of course one can just apply the 1-D version along each dimension to get a 2-D calculation. And so on. It takes off points for not having dedicated commands for some things which are easily done as one-liners (e.g. I can think of at least three simple ways to make a "Pascal Matrix" generator in one line of code, so why would anyone want a dedicated routine for that!?). That's just a sample; there are more errors both large and small.
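Both claims are easy to substantiate in any language. A 2-D FFT is just the 1-D FFT applied along each axis in turn, and a symmetric "Pascal Matrix" (entries C(i+j, i)) is a one-liner from binomial coefficients. A quick illustration in Python (not the review's code, and not Maple, just a neutral check of the two claims):

```python
import math
import numpy as np

# 2-D FFT built from two passes of the 1-D FFT
a = np.arange(16.0).reshape(4, 4)
via_1d = np.fft.fft(np.fft.fft(a, axis=0), axis=1)
assert np.allclose(via_1d, np.fft.fft2(a))

# a Pascal Matrix generator in one line: P[i][j] = C(i+j, i)
pascal = lambda n: [[math.comb(i + j, i) for j in range(n)] for i in range(n)]
# pascal(4) is [[1,1,1,1], [1,2,3,4], [1,3,6,10], [1,4,10,20]]
```

The same two constructions are equally short in Maple, which is the point: neither warrants a dedicated command, let alone a scoring penalty.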

And the weighting scheme for the various sections is quite arbitrary and subjective.

Please don't misunderstand. It's obviously the result of a lot of hard work and effort. And the author has tried to keep it regularly updated. But it needs better review and correction by experts before publication, or at least some sort of working feedback and correction mechanism.

There are more advisers/reviewers listed on the ncrunch site for Matlab and Mathematica than for Maple. (No offense is intended to anyone; more heads always do better.) Even so, I suspect that there might be errors related to each of the programs in the review, and significant improvements possible in the posted code for the others, not just for Maple.

Maybe comprehensive comparisons are just too much work to do really well. An interesting alternative might be tight, narrowly focused product comparisons of individual aspects of functionality, a bit like this earlier suggestion.

acer

Could you post the code here, by uploading it in a file?

If you aren't at liberty to show the code in public then maybe you could send it to Maplesoft's technical support.

acer

@marvin Presumably you are talking about hardware precision float Matrices and Vectors (float[8] or complex[8] datatypes) since you mention alternatively doing it on CUDA.

Are you sure that the external MKL (Intel Math Kernel Library) is not already making use of all or most of the multiple cores when doing each of the individual Matrix-Vector multiplications? If that were the case, then not much more speedup might even be possible. (You might check it most simply by using the Task Manager, to see the total load.)

Investigation with compiler utilities reveals that dgemm and dgemv are not always called directly from Maple. Instead, an external library linalg.dll is called from Maple. (Just how seems to depend on whether one uses `.` or MatrixMatrixMultiply.) And linalg.dll is linked against nag.dll. And the Library-side define_external may not have the option enabled that allows multiple Threads to access that external library concurrently. There may be thread-safety issues here, related to MKL.

But if the MKL is already using multiple cores, then perhaps things are already near optimal for your example. As a rough general rule, the lower-level the threading, the better -- especially for array calculations where cache might be shared. Accelerated BLAS functions gain as much by being highly cache-tuned as they do by being threaded for 4-8 cores. (This touches a bit on another excellent post by Darin.)

I realize that you mention that example because it seems like a good test: easily parallelized across the list of Vectors. But in principle it would be suboptimal compared to storing all the Vectors in a single Matrix, so as to perform a Matrix-Matrix multiplication. If nothing else, that would avoid the overhead cost of `n` Maple function calls, which are relatively expensive compared to C function calls. But it also allows a single call to dgemm to be more cleverly cache-tuned than the overall task of `n` dgemv calls. Apologies if you considered that -- as mentioned, yours is likely intended as a test example of the Threading functionality. But it may not be a clean example, on practical considerations.
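The batching point can be sketched numerically (a NumPy analogy for the dgemv-versus-dgemm trade-off; sizes here are arbitrary): stacking the Vectors as columns of one Matrix turns n Matrix-Vector products into a single Matrix-Matrix product with identical results, letting one tuned call do the whole job:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((64, 64))
vecs = [rng.standard_normal(64) for _ in range(100)]

separate = [M @ v for v in vecs]  # n dgemv-style calls
V = np.column_stack(vecs)         # pack the Vectors into one Matrix
batched = M @ V                   # one dgemm-style call
assert all(np.allclose(batched[:, i], separate[i]) for i in range(100))
```

Column i of the batched result is exactly the i-th Matrix-Vector product, so nothing is lost by the packing, while the single call gets the cache-tuning and any superior algorithm that the matrix-matrix kernel provides.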

In summary, I see at least these considerations:

  • MKL may already be using multiple cores for each Matrix-Vector multiplication
  • multiple calls to linalg.dll may not be allowed concurrently (MKL higher-access thread-safety?)
  • `n` Matrix-Vector calls add Maple function-call overhead
  • `n` dgemv calls subvert cache-tuning specific to dgemm
  • `n` dgemv calls may subvert any Strassen-like or other superior dgemm algorithm

Changing topic slightly: things are really going to get interesting (I think) when machines commonly have hundreds of cores. Consider the case of solving a (non-modular) integer linear system using modular techniques. Sure, for each prime modulus a particular external call might use lots of cores, but perhaps not with linear speedup. Or maybe there are a huge number of cores available. At some point, it could be desirable to perform the modular linear-algebra operations in parallel, even though each one uses many cores. There would be enough cores that one would actually want to apportion them out to many modular subtasks run concurrently. In such a state of affairs, it would be desirable to "fix" or work around the second bullet point above.
