Ronan

East Grinstead, United Kingdom

MaplePrimes Activity


These are replies submitted by Ronan

Read this from 2015. @ecterrab would know if further developments have been implemented along this line.

https://www.mapleprimes.com/questions/205246-How-To-Use-Matrix-Symbol-Directly-In

 

@vv Thank you. I was not aware of msolve.

@Carl Love Originally I didn't think there would be so many solutions, so that idea went in the bin. Your line of code will work really well.

 

@Kitonum Thank you. I didn't realise there would be 4000 solutions.

Something similar happened around the start of the year. I suggested then that once a member with some level of status has replied, the OP should not be allowed to delete the question without permission.

https://www.mapleprimes.com/questions/226210-Disappearing-Questions#comment255592

 

@acer This is just an example for the question. I picked an arbitrary way of making large numbers and then extracting the data I want. I did try a billion elements: no hope, it ran all night on part 1 of your code. I have been wondering what one would do with larger arrays, but that is not the immediate concern. It would be interesting to see how this runs on a processor with, say, 16+ physical cores, especially a Ryzen.

@acer I am using an i7-7700 at 4.8 GHz with 64 GB RAM on Win10, Maple 2019.0. For some reason I can't upload the Maple files, so here are PDFs.

I am aware of the timing variations you mentioned.

threadthingm_10million.pdf

 

threadthingm_100million.pdf

I like Threads:-Seq because it is simple for me to understand.
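For readers who have not used it, here is a minimal sketch of how Threads:-Seq can stand in for seq. The value n := 700 and the range 1 .. 10 are illustrative assumptions, not the settings from the attached worksheets.

```maple
n := 700:
# Sequential version: last three digits of i*n
s1 := [seq(irem(i*n, 1000), i = 1 .. 10)]:
# Same expression, evaluated by Threads:-Seq, which may split the range across threads
s2 := [Threads:-Seq(irem(i*n, 1000), i = 1 .. 10)]:
# Both calls should build the same list, in the same order
evalb(s1 = s2);
```

For a range this small the threaded call is unlikely to be faster; the point is only that the calling sequence is the same as for seq.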

 

@tomleslie Sorry for the delay in replying to you.

Attached is a PDF of the worksheet run for 100 million: 430 secs parallel!

Lately I can't seem to upload Maple documents. I just get a failed message.

 

 

speedUP4-1.pdf

@TechnicalSupport 

A couple of potentially relevant points: I am using Microsoft Edge, and I have File History switched on.

@acer Well, the computer is nearly 2 years old: an i7-7700, 4 cores, 4.2 GHz overclocked to 4.8 GHz.

kernelopts(numcpus)   8

ssystem("processor full");              [-1, ""]

How did the example run for you? The point here is that with a simpler calculation using smaller numbers (that don't require much, if any, of the arbitrary precision routines), oddly, the parallel processing was slower. With much larger numbers, parallel is significantly faster.


I have attached the worksheet with the simplified calculation. Look for the # comments at the end of lines to see the changes.

#
# Using the fastest sequential code I came up with
# with nelems=10000000. Takes about 55 secs on my
# machine
#
  restart;
  nelems := 10000000: #also test 100,000,000 i.e 10x bigger
  n := 700;#374894756873546859847556:
  op(n):
  A := Array(1 .. 4, 1 .. nelems):
  length(n):
  st := time[real]():
  A[1,1..nelems]:= Array([seq( i*n, i=1..nelems)]): #was i^10*n   now i*n
  A[2,1..nelems]:= Array([seq( length(A[1, i])-1, i=1..nelems)]):
  A[3,1..nelems]:= Array([seq(iquo(A[1,i], 10^(A[2, i]-2)),i=1..nelems)]):
  A[4,1..nelems]:= Array([seq(irem(A[1,i],1000),i=1..nelems)]):
  time[real]() - st;
  A:
  A[1, -2];
  A[2, -2];
  A[3, -2];
  A[4, -2];

700

 

15.767

 

6999999300

 

9

 

699

 

300

(1)

#
# Parallelize the above across four tasks, Takes about
# 23 secs on my machine, so a speed up of 2.5X on the
# above sequential code
#
  restart;
  nelems := 10000000:
  n := 700;#374894756873546859847556:
  op(n):
  length(n):
  st := time():
  doTask:=proc(sv, nv)
               local rng:=nv-sv+1,
                     B:=Array(1..4,1..nv-sv+1),
                     i :
               B[1,1..rng]:= Array([seq( (sv-1+i)*n, i=1..rng)]):# was (sv-1+i)^10*n  now  (sv-1+i)*n
               B[2,1..rng]:= Array([seq( length(B[1, i])-1, i=1..rng)]):
               B[3,1..rng]:= Array([seq(iquo(B[1,i], 10^(B[2, i]-2)),i=1..rng)]):
               B[4,1..rng]:= Array([seq(irem(B[1,i],1000),i=1..nv-sv+1)]):
               return B;
         end proc:
  time() - st;
  contTask:=proc( V,W,X,Y )
                  return copy(ArrayTools:-Concatenate(2, V, W, X, Y)):
           end proc:
                  
  setupTask:=proc(n)
                  Threads:-Task:-Continue
                  ( contTask,
                    Task=[doTask, 1, n/4 ],
                    Task=[doTask, n/4+1, n/2],
                    Task=[doTask, n/2+1, 3*n/4],
                    Task=[doTask, 3*n/4+1, n]
                  );
            end proc:

  st := time[real]():
  A:=Threads:-Task:-Start( setupTask, nelems):
  time[real]() - st;
  A[1, -2];
  A[2, -2];
  A[3, -2];
  A[4, -2];

 

700

 

0.

 

23.518

 

6999999300

 

9

 

699

 

300

(2)

interface(rtablesize=10);

[10, 10]

(3)

 


 

Download speedUP3_but.mw

@tomleslie Thank you. Very impressive. I have run two sets of tests.

1st set

nelems = 100,000,000

n = 374894756873546859847556

timings

non-parallel 1570 secs

parallel 520 secs

2nd set

nelems = 100,000,000

n = 700

and I simplified the major calculation to i*n instead of i^10*n

non-parallel 392 secs. That shows how CPU-intensive operating on large numbers is.

parallel 705 secs. That was a surprise. I tested it twice.

Now, I have never dealt with parallel code before. I see you break up n into 4 sections, so I presume one has to tell the system what to parallelize and pick that carefully. I will look in the programming guide to get some basic understanding.
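As a smaller illustration of the same idea (the programmer chooses how the work is split), here is a sketch of the Task model on a toy sum. The proc name sumTask and the cut-off of 1000 are illustrative assumptions, not taken from the worksheet.

```maple
# Toy example: sum the integers i..j by splitting the range in half
sumTask := proc(i, j)
    local k;
    if j - i < 1000 then
        # Small enough: do the work directly in this task
        return add(k, k = i .. j);
    else
        # Otherwise split into two child tasks and combine their results with `+`
        k := floor((j - i)/2) + i;
        Threads:-Task:-Continue(`+`, Task = [sumTask, i, k], Task = [sumTask, k + 1, j]);
    end if;
end proc:

total := Threads:-Task:-Start(sumTask, 1, 10^7);   # 50000005000000
```

The cut-off decides how many tasks get created, so it is one of the things that has to be picked carefully.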

 

 

@tomleslie That is much faster. Glad you could get rid of the convert to string. I checked times using time[real](): 1,000,000 took < 3 secs.

100,000,000 took just over 1500 secs. The CPU is maxing out, but most of that seems to be multiplying very large numbers. I reduced the size of the numbers by setting n=7 and using just i*n. CPU usage dropped to between 30-50% on the 4-core i7. There is some sort of other bug where it suddenly produces an iquo 1/100 error. Will check that again tonight.
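One possible source of the 1/100, assuming the divisor is built as 10^(length(x)-1-2) as in the posted worksheet code: when x has fewer than three digits the exponent goes negative, and iquo expects integer arguments. A small sketch follows; the names x, d and lead3, and the max guard, are illustrative.

```maple
n := 7:
x := 1*n:                # 7, a one-digit number (the case i = 1)
d := length(x) - 1:      # 0
10^(d - 2);              # 1/100, not an integer, so iquo(x, 10^(d - 2)) would be rejected

# Guarded variant: never let the exponent drop below zero
lead3 := iquo(x, 10^max(0, d - 2));   # 7
```

With n=700 the smallest product already has three digits, so the exponent is never negative in that run.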
