acer

32348 Reputation

29 Badges

19 years, 330 days
Ontario, Canada

Social Networks and Content at Maplesoft.com

MaplePrimes Activity


These are replies submitted by acer

It worked fine for me in 12.02, Windows 32bit  (XP).

acer

I had written myself a similar procedure this evening. But then I noticed that it had an unfortunate behaviour, shared with your submission. Namely, what happens to `z` below:

> restart;
> RemoveAllAssumptions:=proc() local A,Ls,L,Lc,indx;
>    Ls:=map(convert,[anames('user')],string);
>    L:=ListTools:-Enumerate([anames('user')]);
>    Lc:=select(hastype,eval(L),`local`);
>    if Lc=[] then NULL
>    else
>       indx:=map2(op,1,Lc);
>       A:=[seq(Ls[i],i=indx)];
>       unassign(op(map(parse,A)));
>    end if
> end proc:

> z:=y;
                               y

> assume(x>0);

> y:=x;
                               x

> z;
                               x

> RemoveAllAssumptions();

> z;
                               z

> y;
                               y

I could see that one might reasonably argue that, after RemoveAllAssumptions is called above, y is unassigned. (How would the procedure know, otherwise, whether to make y take as some new value the global x, or some procedure's escaped local x, etc?) But it doesn't look right to me that z gets unassigned in the above example.

I'm sure that a robust version, which somehow preserves the desired previous assignments, can be written. But offhand I could only see kludgy ways to handle some stranger cases.

Of course, your routine may suffice for the Original Poster. (Although maybe his assumed names are all simply indexed, and some loop would also do.)

acer

One useful technique is to re-use float[8] (or other hardware datatype) Matrix/Vector/Array workspaces.

This can save on the total memory allocation needed.

There is a useful command, ArrayTools:-Fill, which can be utilized to zero-out an existing Matrix/Vector/Array.

Things can get more complicated when the individual tasks require varying sizes of workspace. In such cases it can still be useful and efficient to re-use a single instance of the workspace. Of course, in such a scenario the code may have to handle the fact that the workspace is larger than necessary (see Historical Note below).

When using separate workspace objects, the memory performance and usage can depend on whether or not one creates & unassigns one's workspaces in order of descending size. The following comparison illustrates that the worst scenario is to create them in order of ascending size, because doing so can involve much more total memory allocation (due to the problem of "memory fragmentation").

Case of generating separate workspaces, in order of increasing size,

> restart:
> for i from 1 to 10 do
>   M:=Matrix(2000+i*100,datatype=float[8]):
>   # some task involving a clean M
>   M:='M': dummy1: dummy2: dummy3:
>   gc():
> end do:
> kernelopts(bytesalloc);
                           527992392

Case of generating separate workspaces, in order of decreasing size,

> restart:
> for i from 1 to 10 do
>   M:=Matrix(3000-i*100,datatype=float[8]):
>   # some task involving a clean M
>   M:='M': dummy1: dummy2: dummy3:
>   gc():
> end do:
> kernelopts(bytesalloc);
                            68144960

Case of re-using single (largest required) workspace,

> restart:
> M:=Matrix(2900,datatype=float[8]): # largest needed
> for i from 1 to 10 do
>   # some task involving a clean M
>   ArrayTools:-Fill(2900*2900,0.0,M);
>   # no `gc` needed to clear out M!
> end do:
> kernelopts(bytesalloc);
                            68144960
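The same re-use idea can be sketched outside Maple. Here is a minimal Python/NumPy analogue (sizes are hypothetical, chosen to mirror the Maple loops above): allocate one largest-needed float64 workspace up front, then zero it in place between tasks, rather than allocating a fresh block per iteration:

```python
import numpy as np

SIZES = [2000 + i * 100 for i in range(1, 10)]   # task sizes, up to 2900
LARGEST = max(SIZES)

# Allocate one float64 workspace of the largest size required, up front.
work = np.zeros((LARGEST, LARGEST), dtype=np.float64)

for n in SIZES:
    view = work[:n, :n]   # each task uses only the leading n-by-n block
    view[0, 0] = 1.0      # stand-in for "some task involving a clean view"
    view.fill(0.0)        # zero it out in place, analogous to ArrayTools:-Fill

# No per-iteration allocation: `work` is the only large block ever created.
```

Since `work[:n, :n]` is a view (not a copy), the `fill` writes into the single shared buffer, which is the whole point of the technique.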

Historical Note: Old-timer Fortran programmers would often write code in which a single larger-than-always-necessary block of memory was allocated up front. The size would depend on the largest size that would ever be required. That large full block would be used and re-used within the individual tasks. Indeed, a lot of classic code like LAPACK is implemented in this style. Assuming Fortran-order storage, many LAPACK functions have associated with each array argument a triple of scalars to denote the size. For example, a 2D array A would have `lda`, `m`, and `n`. The `lda` meant the leading dimension of A, or in other words the height of columns, since "Fortran order" means "stored by column". The `n` meant the number of columns and could be used naturally. But the `m` meant the height of the data for the given task, and its value could be much less than `lda`. In this scheme the data for each individual task could be stored and accessed in the upper portion of the (much larger than necessary) full array block. The functions were written to know how to use `lda` and `m` as appropriate, according to the stride needed to jump from one column to another of the data subblock actually in use. One would use `m` as the number of entries of any particular column, but the code would use `lda` as the number of places to jump in order to walk along a particular row.

  | A[1,1]  ..  A[1,j]  ..   A[1,n] |
  |   .           .            .    |
  |   .           .            .    |
  | A[i,1]  ..  A[i,j]  ..   A[i,n] |
  |   .           .            .    |
  |   .           .            .    |
  | A[m,1]  ..  A[m,j]  ..   A[m,n] |
  |   .                             |
  |   .                             |
  |   .                             |
  | A[lda,1]  ..        .. A[lda,n] |

In the above example, A would be declared as size (lda,n).
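The `lda` bookkeeping above amounts to simple index arithmetic. A small Python sketch (hypothetical sizes) of addressing an m-by-n subblock stored inside an lda-by-n column-major buffer:

```python
# Column-major ("Fortran order") storage: A[i,j] lives at flat index i + j*lda.
lda, m, n = 8, 3, 4      # full column height, active rows, columns (hypothetical)
buf = [0.0] * (lda * n)  # the big pre-allocated block

def idx(i, j):
    """Flat offset of A[i,j] (0-based) in the column-major buffer."""
    return i + j * lda

# Fill only the active m-by-n subblock, leaving the tail of each column unused.
for j in range(n):
    for i in range(m):
        buf[idx(i, j)] = 10 * i + j

# Walking along row i means jumping lda places per step, not m.
row0 = [buf[idx(0, j)] for j in range(n)]
```

Walking down a column steps by 1; walking along a row steps by `lda`, exactly the distinction between `m` and `lda` described above.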

acer

Now I'm all keen to know whether the answer is not just something obvious like -315.19 kJ/mol.

Please let us know, when you find out.

acer

@PatrickT Your large test is no longer appropriate, in the sense that the inputs are floats. LinearAlgebra converts that to float[8] and, bang, it's done. Anyone can take the pure loop version, Compile it after adding type declarations, and get similar performance. Or they could write a precompiled external routine for this (as LinearAlgebra has). There's nothing wrong with floats, but it's an apples-and-oranges situation.

It shouldn't be surprising that the result object's size may be very different according to whether it is datatype=anything or not. It makes a difference!

The versions users posted are all (obviously, I say) designed for the exact case (like your symbolic example).

Also, did you try the new submission KPKP above on your example, the one for which you wrote that nothing else succeeded? On my i7 it took 7 seconds, and the memory fit in about 250MB, I think. Of course, it too should be altered for float/float[8].

An appropriate large test/comparison of the originally posted codes would be exact or symbolic. Also, you could reasonably use routine `KP` from a response above, since that is like LinearAlgebra's exact code in KroneckerProduct.

Alternatively, large float tests could be done on new submissions tailored for that type. Apples to apples. Oranges to oranges.

You could also use kernelopts(bytesalloc) to compute a memory difference, i.e. the allocation increase incurred by each operation.

@hirnyk Yes, it is important for several reasons.

  • Just because you cannot imagine a situation where highly repeated use of a low-level function is necessary does not in any way mean that no situation would ever require it. There are many more programming situations than you or I or anyone else here can imagine. That is why programming is so important to Maple (and, incidentally, why it should be treated as much, much more important than program-free "Clickable Math").
  • The procedure in question is simple and basic. So the issue of how it may be optimized is very general, and the answers to that in turn are very likely broadly applicable. Observe how the very fastest code solutions to this question rely on kernel built-ins and inplace semantics. Observe how slower code solutions mostly involve a lot of temporary garbage production. There are important, re-usable lessons to be learnt from all this.
  • As Patrick expressed, such comparisons are also useful for learning Maple.

@Alejandro Jakubi Not if it helps provide an immediate workaround to resolve the issue at hand for the given individual.

@Alejandro Jakubi Not if it helps provide an immediate workaround to resolve the issue at hand for the given individual.

@Alec Mihailovs This is very interesting. I did a quick check on 32bit Linux, with comparable results. I did all tests on local hardisk, under /tmp.

I also tested fread'ing into a 2^26 length Array in the case that the file only had 5 bytes previously written to it in total. The `fread1` command did the right thing, and did not introduce garbage into the entries past the 5th position.

At length 2^26 the timing difference between fread1 and readbytes was about a factor of one hundred. Very impressive. The true ratio may be even better than that, since the fread1 timing varies a bit and is so small that it may be mostly noise.

It was mostly in the range of sizes 2^22 through 2^24 that readbytes started to slow down. It performed fine at, say, 2^18.

One thing that I noticed (Linux 32bit) was that at size 2^26 the readbytes command took this amount of time in various Maple releases:

Maple 14:  5.240 sec

Maple 13.01: 5.644 sec

Maple 12.02: 5.112 sec

Maple 11.02: 1.832 sec

Maple 10.06: 1.552 sec

But 0.05 sec is still much better than 1.552 sec.

I used this code below, in a file. The file "aaa" must first be written out to currentdir(), and contain the data, naturally. Eg, writebytes("aaa",[3,4,5,6,7]) for the "short test".

### cut here ### snip 8< ### check.mpl ###

restart:
kernelopts(printbytes=false):

str,st,ba,bu:=
time[real](),time(),kernelopts(bytesalloc),kernelopts(bytesused):
B:=Array(1..2^26,98,datatype=integer[1]):
time[real]()-str,time()-st,
kernelopts(bytesalloc)-ba,kernelopts(bytesused)-bu;

str,st,ba,bu:=
time[real](),time(),kernelopts(bytesalloc),kernelopts(bytesused):
readbytes("aaa",B):
time[real]()-str,time()-st,
kernelopts(bytesalloc)-ba,kernelopts(bytesused)-bu;

Vector[row](B[1..10]),Vector[row](B[-10..-1]);

restart:
kernelopts(printbytes=false):

fopen1:=define_external('fopen',
'filename'::string,
'mode'::string,
'RETURN'::integer[4],
'LIB'="libc.so.6"):
fread1:=define_external('fread',
'ptr'::REF(ARRAY(datatype=integer[1])),
'size'::integer[4],
'count'::integer[4],
'stream'::integer[4],
'RETURN'::integer[4],
'LIB'="libc.so.6"):
fclose1:=define_external('fclose',
'stream'::integer[4],
'RETURN'::integer[4],
'LIB'="libc.so.6"):

B:=Array(1..2^26,98,datatype=integer[1]):

str,st,ba,bu:=
time[real](),time(),kernelopts(bytesalloc),kernelopts(bytesused):
p:=fopen1("aaa","rb"):
fread1(B,1,2^26,p):
fclose1(p):
time[real]()-str,time()-st,
kernelopts(bytesalloc)-ba,kernelopts(bytesused)-bu;

Vector[row](B[1..10]),Vector[row](B[-10..-1]);

### cut here ### snip 8< ### end of file ###

P.S. It can be handy to use integer[kernelopts('wordsize')/8], although 64-bit Windows sometimes makes that problematic.

It's probably worth mentioning that calling readbytes("aaa",B,N) for N the size didn't seem to speed it up.

So, now I wonder how fast endianness can be accommodated (in a C wrapper, obviously) for wider formats. And I wonder about ImageTools:-Read and ImportMatrix:-ReadBinaryFile. It's hard to tell, since as I am sure you know iolib(4,..) is a kernel built-in.
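For what the endianness handling involves, here is a small Python sketch (hypothetical 32-bit integer data, mirroring the [3,4,5,...] bytes written above): reading wider-format binary data either by declaring the byte order up front, or by swapping each format-width group of bytes after a native read:

```python
import struct

# Suppose a file holds 32-bit integers written big-endian, but the host is
# little-endian (or vice versa): the fix is a format-width byte swap.
raw = struct.pack(">3i", 3, 4, 5)          # stand-in for bytes read from disk

# Interpret with the correct (big-endian) format directly ...
be = struct.unpack(">3i", raw)

# ... or swap each 4-byte group by hand and read as little-endian.
swapped = b"".join(raw[k:k+4][::-1] for k in range(0, len(raw), 4))
le = struct.unpack("<3i", swapped)
```

Both routes recover (3, 4, 5); a C wrapper would do the same group-reversal with a tight loop (or `bswap`-style intrinsics) over the buffer.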

acer

reality := realities[reality];

@herclau You don't have to explain to me the use of SVD for Least Squares solving.

Look:

> y.U.Ss.V;

[0.00520590831446885, 1.00159598094724, 0.999421478032667]

> V^%T.Ss.U^%T.y;

[0.00520590831446883]
[ ]
[ 1.00159598094724]
[ ]
[ 0.999421478032667]

Those two above are just transpositions of each other. Earlier, I suggested you use y.U.Ss.V. But of course you can use V^%T.Ss.U^%T.y as well, because they are just transpositions of each other! I simply chose the version which required no additional cost of actually doing a transposition.

You constructed Ss as the inverse of DiagonalMatrix(S[1 .. 3]) so of course Ss is symmetric. And so when multiplying y by V^%T.Ss.U^%T you are (least squares) solving M.x=y.

In your P2 you actually used V.U^%T.y.Ss which is no good. I don't know where you get that formula from.

Remember, SingularValues returns Vt which is the transpose of V. Forgetting that is not enough to generate your strange formula used in your P2, I think.

For fun (but perhaps not optimally using the 'thin' option, I didn't check), since Maple uses SVD to compute the non-square float pseudo-inverse,

> MatrixInverse(M).y;

[0.00520590831446895]
[ ]
[ 1.00159598094724]
[ ]
[ 0.999421478032667]
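The algebra being argued over can be checked in a few lines. A NumPy sketch (with hypothetical 4x3 random data standing in for M and y) of least-squares solving via the SVD pseudo-inverse, i.e. x = V . Ss . U^T . y with Ss the inverted diagonal of singular values:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 3))   # hypothetical overdetermined system
y = rng.standard_normal(4)

# Thin SVD: M = U @ diag(s) @ Vt  (note: svd returns Vt, the transpose of V)
U, s, Vt = np.linalg.svd(M, full_matrices=False)
Ss = np.diag(1.0 / s)             # inverse of the diagonal singular-value matrix

# Least-squares solution of M.x = y via the pseudo-inverse:
x = Vt.T @ Ss @ U.T @ y

# Same answer as the library's dedicated least-squares routine.
x_ref = np.linalg.lstsq(M, y, rcond=None)[0]
```

Since `svd` hands back Vt rather than V, the factor written `V^%T` above corresponds to `Vt` here, which is exactly the slip-up being pointed out in the P2 formula.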
