MaplePrimes Posts

MaplePrimes Posts are for sharing your experiences, techniques and opinions about Maple, MapleSim and related products, as well as general interests in math and computing.

Latest Post
  • Yesterday I wrote a post that began,

    "I realized recently that, while 64bit Maple 15 on Windows (XP64, 7) is now using accelerated BLAS from Intel's MKL, the Operating System environment variable OMP_NUM_THREADS is not being set automatically."

    But that first sentence is about where the post stopped being correct, at least as far as my interpretation of the performance of 64bit Maple on Windows went. So I've rewritten the whole post, and this is the revision.

    I concluded that, by setting the Windows operating system environment variable OMP_NUM_THREADS to 4, performance would double on a quad-core i7, and I even showed timings to help establish that. Since I know that memory management and dynamic linking can cause extra overhead, I re-ran all my examples in freshly launched GUI sessions, with the user interface completely closed between examples. But I got caught out in a mistake nonetheless: there is an extra real-time cost when the Windows operating system dynamically loads the MKL DLLs for the very first time after bootup.

    So my examples done first after bootup were at a disadvantage. I knew that I could not look only at measured CPU time, since for such threaded applications that is reported as a kind of sum of the cycles used by all threads. But I failed to notice that the real-time measurements were being distorted by the cost of loading the DLLs for the first time. That penalty is not necessarily paid by each freshly launched, completely new Maple session, so my measurements were not fair.

    Here is an illustration of the extra real-time cost which I had not been taking into account. I'll do Matrix-Matrix multiplication on a 1x1 example, to show just how much of this extra cost is unrelated to the actual computation. In the examples below, I've done a full reboot of Windows 7 where so annotated. The extra time cost for the very first load of the dynamic MKL libraries can range from 1 to over 3 seconds. That's about the same as the CPU time this i7 takes to do the full 3000x3000 Matrix multiplication! Hence the confusion.

    Roman brought up hyperthreading in his comment on the original post. So part of redoing all these examples, with full restarts between them, is testing each case both with and without hyperthreading enabled (in the BIOS).

    Quad core Intel i7. (four physical cores)
    
    Hyperthreading disabled in BIOS
    -------------------------------
    
    > restart: # actual OS reboot
    > getenv(OMP_NUM_THREADS);   # NULL, unset in OS
    
    > CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
    memory used=217.18KiB, alloc change=127.98KiB, cpu time=219.00ms, real time=3.10s
    
    > CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ):
    memory used=9.46KiB, alloc change=0 bytes, cpu time=0ns, real time=0ns
    
    
    > restart: # actual OS reboot
    > getenv(OMP_NUM_THREADS);
                                  "4"
    
    > CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
    memory used=216.91KiB, alloc change=127.98KiB, cpu time=140.00ms, real time=2.81s
    
    > CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ):
    memory used=9.46KiB, alloc change=0 bytes, cpu time=0ns, real time=0ns
    
    
    Hyperthreading enabled in BIOS
    ------------------------------
    
    > restart: # actual OS reboot
    > getenv(OMP_NUM_THREADS);    # NULL, unset in OS
    
    > CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
    memory used=217.00KiB, alloc change=127.98KiB, cpu time=202.00ms, real time=2.84s
    
    > CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ):
    memory used=9.46KiB, alloc change=0 bytes, cpu time=0ns, real time=0ns
    
    
    > restart: # actual OS reboot
    > getenv(OMP_NUM_THREADS);
                                  "4"
    
    > CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
    memory used=215.56KiB, alloc change=127.98KiB, cpu time=187.00ms, real time=1.12s
    
    > CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ):
    memory used=9.46KiB, alloc change=0 bytes, cpu time=0ns, real time=0ns
    
    
    

    Having established that the first use after reboot incurs a real-time penalty of a few seconds, I redid the timings in order to gauge the benefit of having OMP_NUM_THREADS set appropriately. These too were done both with and without hyperthreading enabled. The timings below appear to indicate that slightly better performance can be had for this example when hyperthreading is disabled. They also appear to indicate that leaving OMP_NUM_THREADS unset gives performance competitive with having it set to the number of physical cores.

    Hyperthreading disabled in BIOS
    -------------------------------
    
    > restart:
    > CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
    memory used=217.84KiB, alloc change=127.98KiB, cpu time=141.00ms, real time=142.00ms
    
    > getenv(OMP_NUM_THREADS);  # NULL, unset in OS
    
    > M:=LinearAlgebra:-RandomMatrix(3000,datatype=float[8]):
    > CodeTools:-Usage( M . M ):
    memory used=68.67MiB, alloc change=68.74MiB, cpu time=7.50s, real time=1.92s
    
    
    > restart:
    > CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
    memory used=217.84KiB, alloc change=127.98KiB, cpu time=141.00ms, real time=141.00ms
    
    > getenv(OMP_NUM_THREADS);
                                  "1"
    
    > M:=LinearAlgebra:-RandomMatrix(3000,datatype=float[8]):
    > CodeTools:-Usage( M . M ):
    memory used=68.67MiB, alloc change=68.74MiB, cpu time=7.38s, real time=7.38s
    
    
    > restart:
    > CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
    memory used=217.11KiB, alloc change=127.98KiB, cpu time=125.00ms, real time=125.00ms
    
    > getenv(OMP_NUM_THREADS);
                                  "4"
    
    > M:=LinearAlgebra:-RandomMatrix(3000,datatype=float[8]):
    > CodeTools:-Usage( M . M ):
    memory used=68.67MiB, alloc change=68.74MiB, cpu time=7.57s, real time=1.94s
    
    
    
    Hyperthreading enabled in BIOS
    ------------------------------
    
    > restart:
    > CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
    memory used=216.57KiB, alloc change=127.98KiB, cpu time=125.00ms, real time=125.00ms
    
    > getenv(OMP_NUM_THREADS);  # NULL, unset in OS
    
    > M:=LinearAlgebra:-RandomMatrix(3000,datatype=float[8]):
    > CodeTools:-Usage( M . M ):
    memory used=68.67MiB, alloc change=68.74MiB, cpu time=8.46s, real time=2.15s
    
    
    > restart:
    > CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
    memory used=216.80KiB, alloc change=127.98KiB, cpu time=125.00ms, real time=125.00ms
    
    > getenv(OMP_NUM_THREADS);
                                  "1"
    
    > M:=LinearAlgebra:-RandomMatrix(3000,datatype=float[8]):
    > CodeTools:-Usage( M . M ):
    memory used=68.67MiB, alloc change=68.74MiB, cpu time=7.35s, real time=7.35s
    
    
    > restart:
    > CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
    memory used=216.80KiB, alloc change=127.98KiB, cpu time=125.00ms, real time=125.00ms
    
    > getenv(OMP_NUM_THREADS);
                                  "4"
    
    > M:=LinearAlgebra:-RandomMatrix(3000,datatype=float[8]):
    > CodeTools:-Usage( M . M ):
    memory used=68.67MiB, alloc change=68.74MiB, cpu time=8.56s, real time=2.15s
    
    
    > restart:
    > CodeTools:-Usage( Matrix([[3.]]) . Matrix([[3.]]) ): # initialize external libs
    memory used=216.80KiB, alloc change=127.98KiB, cpu time=125.00ms, real time=125.00ms
    
    > getenv(OMP_NUM_THREADS);
                                  "8"
    
    > M:=LinearAlgebra:-RandomMatrix(3000,datatype=float[8]):
    > CodeTools:-Usage( M . M ):
    memory used=68.67MiB, alloc change=68.74MiB, cpu time=8.69s, real time=2.23s
    

    With all those new timing measurements, it appears that setting the global environment variable OMP_NUM_THREADS to the number of physical cores may not be necessary: performance is comparable when that variable is left unset. So, while this post is now a non-story, it's still interesting to know.

    And the lesson about comparative timings is also useful: sometimes even a complete GUI/kernel relaunch is not enough to get a level and fair playing field for comparison.

    As we look ahead to our next Maple release, I wanted to let you know of some changes that we are making  to the platforms and operating systems that will be supported by Maple 16.

    With Maple 16 we will be adding support for Linux Ubuntu 11.10 and Macintosh OS X 10.7, while dropping support for Linux Ubuntu 10.10 and Macintosh OS X 10.5. As a result, we will no longer support Maple on the PPC platform (OS X 10.5 was Apple's last release to support PPC).

    If...

    Recently posted on Wolfram's Blog is a set of 10 tips for how to write fast Mathematica code.  It is a very amusing read -- go read it now, because below I am going to make some comments on it, assuming that you have read it.

     

    1. Use floating-point numbers if you can, and use them early.
      Basically: if you're using Mathematica as a...
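
    The same advice carries over directly to Maple. As a quick illustration of my own (not from the blog post, and with an arbitrary problem size), compare summing exact rationals against introducing floats from the start:

    > N := 10^5:
    > CodeTools:-Usage( add(1/k, k = 1 .. N) ):    # exact rationals: slow, the result has thousands of digits
    > CodeTools:-Usage( add(1.0/k, k = 1 .. N) ):  # floats from the start: much faster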

    On November 22, Joe Riel posted an implicit differentiation problem that caught my attention. It took the manipulations typically learned in an Advanced Calculus course one step further, but the devices learned in such a course could readily be applied. Joe's solution was expressed in terms of exterior...

    I recently had a journal article accepted but was told that several of the graphs had to be redone because the axes were not thick enough to reproduce. Unfortunately, Maple has no way to edit the axes for thickness, so I had to export the differential equations to Matlab and integrate them there to get publication quality graphs. I have been having more trouble every year with the quality of Maple's graphics output, and this really puts a cherry on it. I would be pleased as punch if Maple could include a graph editor that would let us customize a graph to make it presentable to publishers. Maple does so many things so well it is a shame to leave this on the back burner.

    I've submitted an application to the Application Center: An Epidemic Model (for Influenza or Zombies).  This is an interactive Maple document, suitable for instructional use in an undergraduate course in mathematical biology or differential equations, or a calculus course that includes differential equations. ...
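
    To give a flavour of the kind of model involved, here is a minimal SIR-type sketch (my own illustration, not the submitted document). Note that I and gamma are protected names in Maple, so Inf and g are used for the infected class and the recovery rate, and the parameter values are arbitrary:

    > sys := { diff(S(t),t) = -beta*S(t)*Inf(t),
               diff(Inf(t),t) = beta*S(t)*Inf(t) - g*Inf(t),
               diff(R(t),t) = g*Inf(t) }:
    > ics := { S(0) = 990, Inf(0) = 10, R(0) = 0 }:
    > sol := dsolve(eval(sys, [beta = 0.0005, g = 0.1]) union ics, numeric):
    > plots:-odeplot(sol, [[t, S(t)], [t, Inf(t)], [t, R(t)]], 0 .. 60);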


    This is the Affine Scaling Algorithm outlined by:

    Linear and nonlinear programming with Maple: an interactive, applications ...
    Paul E. Fishback
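
    For context, here is a rough sketch of the standard primal affine-scaling iteration for a linear program of the form: minimize c^T x subject to A x = b, x >= 0, starting from a strictly feasible point. It is written from the general textbook description of the method, not taken from Fishback's code, and uses a simplified stopping rule:

    > AffineScaling := proc(A::Matrix, b::Vector, c::Vector, x0::Vector,
                            {steplen::numeric := 0.9, tol::numeric := 1e-8,
                             maxiter::posint := 100})
          local x, D2, w, r, d, alpha, i, k, n;
          uses LinearAlgebra;
          n := Dimension(c);
          x := Vector(x0, datatype = float[8]);  # strictly feasible start: A.x0 = b, x0 > 0
          for k to maxiter do
              D2 := DiagonalMatrix([seq(x[i]^2, i = 1 .. n)]);
              w := LinearSolve(A . D2 . Transpose(A), A . D2 . c);  # dual estimate
              r := c - Transpose(A) . w;                            # reduced costs
              d := -D2 . r;                                         # search direction
              if Norm(d, 2) < tol then break end if;                # simplified stop test
              alpha := min(seq(`if`(d[i] < 0, -x[i]/d[i], infinity), i = 1 .. n));
              if alpha = infinity then error "problem appears unbounded" end if;
              x := x + steplen * alpha * d;                         # stay strictly interior
          end do;
          x;
      end proc:

    For example, with A := Matrix([[1, 1, 1]]), b := <1>, c := <1, 2, 3> and the strictly feasible start x0 := <1/3, 1/3, 1/3>, the iterates head toward the optimal vertex <1, 0, 0>.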

    A wide range of expressions can be converted to compiled functions, and the expressions can even include definite integrals.
    I hope this helps others who want to speed up calculations in Maple as much as is currently possible.

    ex.mw

     

    Checked under version 15.01.
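
    As a minimal hand-rolled illustration of the underlying mechanism (my own example, not taken from ex.mw), a typed procedure built from an expression can be passed to Compiler:-Compile:

    > p := proc(x::float) x^2 + sin(x) end proc:   # procedure form of the expression x^2 + sin(x)
    > cp := Compiler:-Compile(p):                  # compile to native code
    > CodeTools:-Usage( cp(1.5) );                 # fast hardware-float evaluation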

    My daughter the psychiatrist recently shared a link with me that mentioned a factoid about Facebook: "84 per cent of people think their friends have more friends than they do".  Actually they don't just think this: for 84 percent of Facebook users, the median friend count of their friends is higher than their own friend count, according to

    The "." notation for the dot product of Vectors is very convenient and intuitive.  For example:

    > <1,2,3> . <1,1,1>;

    6

    One sometimes annoying feature of it, however, is that by default Maple is using a dot product (suitable for Vectors with complex scalars) that is conjugate-linear in the first argument.  But let's say you will only be working with real scalars.  There's no problem if your Vectors have numeric entries, but...
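
    To make the issue concrete, here is a small symbolic example of my own, together with the documented conjugate option of LinearAlgebra:-DotProduct as one way to get a purely real-scalar product:

    > u := <a, b>:  v := <c, d>:
    > u . v;                                                # conjugate() appears on the entries of the first argument, as noted above
    > LinearAlgebra:-DotProduct(u, v, conjugate = false);   # plain a*c + b*d, suitable for real scalars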

    Please reconsider the order of messages in posts with respect to the reply option. I don't understand why my answer

    http://www.mapleprimes.com/questions/127969-Encapsulated-Transformation-Of-Expression

    (posted moments ago in reply to Axel Vogt) appears at the end of the list instead of in the branch where it should be.

    I also know that the branch menu is meant entirely for other purposes, not for this.

    Over the weekend I was attempting to estimate the tension change in a bicycle spoke due to an applied load.  After various simplifications and approximations, the problem was reduced to the following.

    Given a constraint, F(x,y) = 0, and functions G(x,y) and H(x,y), find dG/dH at a particular point, here (0,0). 

    The constraint, F, was sufficiently complicated that solving for either variable was not feasible, so implicit differentiation seemed the best...
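
    The idea can be sketched directly: along F(x,y) = 0 we have dy/dx = -F_x/F_y, and then dG/dH = (G_x + G_y*dy/dx)/(H_x + H_y*dy/dx). Here is a small Maple version with hypothetical F, G and H (not the actual spoke model):

    > F := (x,y) -> x^2 + y^2 + x*y - y:     # hypothetical constraint, with F(0,0) = 0
    > G := (x,y) -> x + y^2:
    > H := (x,y) -> x - y:
    > dydx := -D[1](F)(x,y) / D[2](F)(x,y):  # implicit differentiation along F = 0
    > dGdH := (D[1](G)(x,y) + D[2](G)(x,y)*dydx)
            / (D[1](H)(x,y) + D[2](H)(x,y)*dydx):
    > eval(dGdH, [x = 0, y = 0]);             # dG/dH at the point (0,0)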

     

    I have data organized in a Matrix from which I want to select a sample based on a certain (simple) criterion, e.g. that no entry in the first column is negative.

    I was quickly able to find a way to do it, inspired by a method Preben Alsholm used in a recent mapleprimes post.

    But, I wondered, is that the most natural approach? So I quickly found another approach and compared them.

    Any other suggestions welcome. Particularly methods that could...
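
    For concreteness, here is one plain approach of my own (neither Preben's method nor the alternative I compared it against): build the list of admissible row indices and then index the Matrix with it.

    > M := LinearAlgebra:-RandomMatrix(10, 3):
    > rows := [seq(`if`(M[i,1] >= 0, i, NULL),
                   i = 1 .. LinearAlgebra:-RowDimension(M))]:
    > S := M[rows, 1 .. -1];    # the sample: rows with a non-negative first entry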

    We are looking for someone who can develop some Maple scripts for us. Must have some chemical engineering knowledge.

    contact us: smne33@hotmail.com

     

    regards.

    The following concerns the so-called central limit theorems. We have a sum of random variables S := ksi[1] + ksi[2] + ... + ksi[n]. We know only that the number n is large, that the variables are independent or weakly dependent, and that each ksi[j] is small with respect to S in a certain sense.

    The central limit theorems then imply that S is close to the normal distribution.

    Here is the procedure which illustrates the Lindeberg-Lévy theorem ( see
