mmcdara

3815 Reputation

17 Badges

6 years, 120 days

MaplePrimes Activity


These are replies submitted by mmcdara

@vv 

Thank you for this intelligent and necessary (it successfully handle my first test case) reappropriation
(I hope this last term is the good one for its meanings in French and English seem to diverge slightly)

 I vote up

@nmacsai 

Another comment
If you have to open/close your csv file multiple times, it's worth saving it as a binary file and using that afterwards .
Note that @Carl Love's answer essentially raises the main question. 

@Carl Love 

"That depends on who's doing the recommending"

People who are used to handling  big data files generally agree on the point that CSV files are not optimal from the point of view of efficiency (red/write time and memory size).
For archiving (even if I do not understand what you mean), there exist more efficient file format than csv, for instance JSON or NetCDF to say a few (which is a binary self domumented format which can handle complex data structures).

See here big-data-file-formats-explained-dfaabe9e8b33 for instance

From my own experience (shared with many of my coworkers) I (most of us) think that a lot of people keep using CSV files or alike just because they can "see" what these files contain and that this reassures them.

What is funny is that you critize my comment about prefering binary format and that at the same time your answer to the @nmacsai  is based upon m files (which are binary files), thus implicitely considering binary format would be a better solution than csv.

 

Just a comment...
As a general rule, regardless of the language used, it is not recommended to work with csv files this size.
It is much more efficient to use binary files instead.

@kiraMou 

Sorry, I didn't read carefully enough.
To make up for it, here are some further details that might interest you:
solutions.mw

@kiraMou 

I thought you wanted solutions whose imaginary parts were in the range 0.5..1 for x[1] and x[2] and in the range -1..-0.5 for x[3] and x[4]?

The solution you give obviously does not meet these requirements.

Do I missed something?

@acer 

Happy to hear that it works for recent versions. So I just have to wait until I'm back in the office. In the meantime, I'll do something else.
Thank you

The first version is the faster than the two (not necessaily the tastest in absolute)
perf.mw
 

restart:

interface(version)

`Standard Worksheet Interface, Maple 2015.2, Mac OS X, December 21 2015 Build ID 1097895`

(1)

N   := 1000:
r   := rand(1..N):
L   := [seq(r(), k=1..N)]:
ind := [seq(r(), k=1..N)]:

f := proc(L, N)
  local ind := [seq(r(), k=1..N)]:
  return L[ind]
end proc:

CodeTools:-Usage(f(L, N), iterations=100):

memory used=15.84KiB, alloc change=0 bytes, cpu time=320.00us, real time=320.00us, gc time=0ns

 

with(Threads):
f := proc(L, N)
  local ind := [seq(r(), k=1..N)]:
  return Map(w->L[w],ind)
end proc:

CodeTools:-Usage(f(L, N), iterations=100):

memory used=76.12KiB, alloc change=6.56MiB, cpu time=1.40ms, real time=690.00us, gc time=0ns

 

f := proc(L)
  local res:=NULL:
  local n:
  local ind := [seq(r(), k=1..N)]:
  for n from 1 to nops(N) do
    res := res, L[ind[n]]
  end do:
  return res
end proc:

CodeTools:-Usage(f(L,ind), iterations=100):

memory used=8.12KiB, alloc change=512.00KiB, cpu time=360.00us, real time=370.00us, gc time=0ns

 

f := proc(L, N)
  local ind := [seq(r(), k=1..N)]:
  return L[ind]
end proc:

CodeTools:-Usage(f(L, N), iterations=100):

memory used=15.84KiB, alloc change=0 bytes, cpu time=320.00us, real time=320.00us, gc time=0ns

 

with(Threads):
f := proc(L, N)
  local ind := [seq(r(), k=1..N)]:
  return Map(w->L[w],ind)
end proc:

CodeTools:-Usage(f(L, N), iterations=100):

memory used=76.12KiB, alloc change=6.56MiB, cpu time=1.40ms, real time=690.00us, gc time=0ns

 

# with no rebuild of ind

ind := [seq(r(), k=1..N)]:

f := proc(L, ind)
  return L[ind]
end proc:
CodeTools:-Usage(f(L,ind), iterations=100):

f := proc(L, ind)
  return Map(w->L[w],ind)
end proc:
CodeTools:-Usage(f(L,ind), iterations=100):

memory used=7.90KiB, alloc change=0.78MiB, cpu time=20.00us, real time=20.00us, gc time=0ns
memory used=68.05KiB, alloc change=360.00KiB, cpu time=1.04ms, real time=350.00us, gc time=35.61us

 

 


 

Download perf.mw

 

@Carl Love 

Thanks

@acer @Carl Love

Thanks acer for correcting me on the number of digits that some _d01ajc  can handle.
If I have thought it there was such a limitation, that is because, while I was working on Mehdi's example, I got an error message that is easy to reproduce (with Maple 2015):

Digits:=20:
evalf(Int(x, x=0..1, method=_d01ajc))
Error, (in evalf/int) expect Digits<=15 for method _d01ajc but received 23


Two more points:

  1. Can you explain me why evalf is of no use use in your plot command?
    Example
    Digits:=10:
    
    f := Omega -> Int(subs(omega=Omega, omega*x), x=0..1, method=_d01ajc,epsilon=1e-4):
    
    f(1.0);   # non evaluated
    evalf(%); # evaluated
    
    # Below: Why is the definite integral evaluated without invoking evalf?
    # Would the plot command automatically apply evalf?
    
    plot(Omega -> Int(subs(omega=Omega, omega*x), x=0..1, method=_d01ajc,epsilon=1e-4),0..50)
    
      Int(Omega x, x = 0 .. 1, method = _d01ajc, epsilon = 0.0001)
         Int(x, x = 0 .. 1, method = _d01ajc, epsilon = 0.0001)
    
  2. About   int(eval(A1+A2,omega=20.099752),theta=0..Pi, numeric, digits=5)
    I agree with both of you that explicitely specifying the tolerance or using epsilon option would be a better way to get the desired accuracy.
    In my repy I just focused on what happens with a particular value of omega when using the same command that @mehdi jafari uses.

 

@MapleEnthusiast 

Counterexample
 

M := Matrix(2, 2, [a, 1-a, b, 1-b]):
LinearAlgebra:-Eigenvalues(M):
printf("%a",%);
Vector(2, [1,a-b])

# nothing proves that   a-b < 1

 

@acer 

For info: your own code plus symmetry and periodicity considerations

Maple 2015.2, APPLE UNIVERSAL OSX, Dec 20 2015, Build ID 1097895

Digits:=15:
CodeTools:-Usage(plot(Omega->Int(subs(omega=Omega,unapply(A1+A2,theta)),
                                 0..2*Pi,method=_d01ajc,epsilon=1e-4),
                      0..50)):

CodeTools:-Usage(plot(Omega->4*Int(subs(omega=Omega,unapply(A1+A2,theta)),
                                 0..Pi/2,method=_d01ajc,epsilon=1e-4),
                      0..50));

memory used=8.68MiB, alloc change=0 bytes, cpu time=1.64s, real time=1.64s, gc time=0ns
memory used=5.53MiB, alloc change=0 bytes, cpu time=445.00ms, real time=447.00ms, gc time=0ns

 

@mehdi jafari 

Very far from a  super computer:
Maple 2015.2
MacOS Mojave, 10.14.3, proc 2.9 GHz Intel Core i5  (7 years old I think)


"do you think solving it with more digits would change the results or not" : 
I don't think so, furthermore I believe that evalf(Int(...)) returns an error message when used with more than 15 digits (see @acer reply)

 

 

@acer 

Thank you acer.
Furthermore, I was completely unaware of the LargeExpression package, I just had a quick look at it and it looks extremely interesting

 

@mehdi jafari 

Please excuse me @Preben Alsholm  if I take the liberty of intervening in this discussion.
To @mehdi jafari :

The cpu time seems (to me) reasonable given the complexity of A1 and A2

CodeTools:-Usage( plot(w->  int(eval(A1+A2,omega=w),theta=0..2*Pi, numeric, digits=5), 0..50, color=red) );

memory used=314.85MiB, alloc change=0 bytes, cpu time=4.79s, real time=4.64s, gc time=224.68ms


Given the symmetries of A1+A2 one can reduce it this way

CodeTools:-Usage(plot(w->  int(eval(A1+A2,omega=w),theta=0..2*Pi, numeric, digits=5), 0..50)):
CodeTools:-Usage(plot(w->4*int(eval(A1+A2,omega=w),theta=0..Pi/2., numeric, digits=5), 0..50)):
memory used=328.38MiB, alloc change=36.00MiB, cpu time=5.13s, real time=5.08s, gc time=179.32ms
memory used=215.62MiB, alloc change=40.00MiB, cpu time=2.90s, real time=2.84s, gc time=131.88ms


As A1+A2 contains a "boundary layer of length of order 1e-4" located around omega=16 (the integral is numerically null for omega < 16 - 1e-4)...

taylor(op(2, A1)+op(2, A2), omega=16, 3);
plot(w->int(eval(A1+A2,omega=w),theta=0..2*Pi,numeric,digits=5),16-2e-5..16+2e-5);

... one may think that computing the integral for omega >= 16 (maybe this value can be adjusted) is enough. What I propose is to plot this.

plot(w->4*int(eval(A1+A2,omega=w),theta=0..Pi/2., numeric, digits=5), 16-4e-5..50)

To realize an objective (I think) comparison, one must evaluate each plot with the same number of points.
Doing

pp := plot(w->  int(eval(A1+A2,omega=w),theta=0..2*Pi, numeric, digits=5), 0..50):
plottools:-getdata(pp)[3]

shows 231 values of omega are used in the range [0..50].
Let's fix the omega step to the same value 50/230 and evaluate the integration time alone (plotting time discarded). One obtains:

J := proc() seq(int(eval(A1+A2,omega=w),theta=0..2*Pi, numeric, digits=5), w=0..50, 50/230): end proc:
CodeTools:-Usage(J(), output='cputime', iterations=10):
memory used=180.73MiB, alloc change=0 bytes, cpu time=2.24s, real time=2.13s, gc time=168.40ms

K := proc() seq(4*int(eval(A1+A2,omega=w),theta=0..Pi/2, numeric, digits=5), w=0..50, 50/230): end proc:
CodeTools:-Usage(K(), output='cputime', iterations=10):
memory used=162.65MiB, alloc change=0 bytes, cpu time=1.96s, real time=1.85s, gc time=152.76ms

Ka := proc() seq(4*int(eval(A1+A2,omega=w),theta=0..Pi/2, numeric, digits=5), w=16..50, 50/231): end proc:
CodeTools:-Usage(Ka(), output='cputime', iterations=10):
memory used=57.89MiB, alloc change=0 bytes, cpu time=718.70ms, real time=675.30ms, gc time=61.45ms

Comparing the cpu times between J and Ka shows the latter is three times faster than the former.

Next refinement (at least for me): precalculate the list of points [x, int(...)] and plot this list.

Kb := proc() 
  local u:=[ seq([w, 4*int(eval(A1+A2,omega=w),theta=0..Pi/2, numeric, digits=5)], w=16..50, 50/231) ]:
  plot(u):
end proc:

CodeTools:-Usage(Kb());
memory used=57.99MiB, alloc change=0 bytes, cpu time=879.00ms, real time=764.00ms, gc time=174.41ms

We are then 5 times faster than the original version given at the top ogfthis reply.
Probably better performances can be obtained by:

  • choosing carefuly the integration method and its options
  • adjusting the "omega step" (beyond, let's say, omega=30, the definite integral slighly evolves)

 

To end this: believe it would be more important to verify that the the results are not dependent of the integration method, of its options, and of the omega step.
For instance A1+A2 has discontinuities for all omega (at least for values right to the "boundary layer")r

discont(eval(A1+A2, omega=16.0001), theta);
{-0.5004339588e-1+3.141592654*_Z29, 0.5004339588e-1+3.141592654*_Z27, Pi*_Z67+(1/2)*Pi}

Are these discontinuities correctly handled with the integration procedure? Does this matter?

printf("%1.3e", evalf(eval(eval(A1+A2, omega=16.0001), theta=0.05004339586)))
1.600e+05

Interestingly the integral cannot be computed for omega=20.099742 .. 20.099751 ; the peak obtained in the original plot is hugely underestimated:

int(eval(A1+A2,omega=20.099752),theta=0..Pi, numeric, digits=5)
                            2.3543

 

First 17 18 19 20 21 22 23 Last Page 19 of 87