Carl Love

Lexical variables in evalhf'able and com...

Answered: Carl Love 28075

November 04 2018

0 0

[This Reply is not related to binomial-mod-prime computations, and is intended only for expert-level module writers who write some numerical or semi-numerical code. It relates to a technique that I used in the module above.]

I just developed a coding technique---shown in the module above---that may be novel. It allows lexical variables (in this case module locals) to be used in evalhf'able and compilable procedures. The evalf'able and compilable procedures are defined containing a dummy variable _M. The values of _M are potentially machine dependent (at least theoretically), so they are computed in the ModuleLoad and subs into the procedures. The same thing could be done for any values that are not known until runtime.

It's the only exact contingency table te...

Commented: Carl Love 28075

November 03 2018

I am saddened by the lack of enthusiasm for this Post. I believe that the code above is the only implementation in Maple of any exact^[*1] statistical test of independence in a contingency table (the same property that is tested by the inexact Statistics:-ChiSquareIndependenceTest). Some modern statisticians recommend that given that we have the necessary mathematical software, that exact tests always be used for small n (the sum of the entries in the table), say n < 1000. Furthermore the code above works for tables of any number of cells (i.e., the number of entries in the matrix) whereas some other programs that do Fisher's exact test are limited to two rows, two columns, or both.

^[*1]"Exact" in this context refers to the fact that the computations are done with the relevant discrete distributions, the multinomial (generalized hypergeometric), rather than with a continuous distribution such as Chi-squared, which is only asymptotically equivalent to the discrete distributions as n approaches infinity. "Exact" does not refer to the fact that the computations are done in exact rational-number arithmetic; however, the code above does that also.

Please try the new code...

Answered: Carl Love 28075

November 03 2018

0 0

@vv I understand now why there was that factor-of-4 time difference between our codes in the last test that you showed: I was trying to do too much with just one variety of Mul, which handled arbitrary expressions akin to mul. I improved it by adding a separate Mul for ranges of integers, such as factorials, and I made it compiled for the appropriate cases. Please try it out.

I may have used the word "recursive" incorrectly earlier, leading to some confusion on your part. I simply meant that the procedure Binomial makes a single call back to itself to handle the smaller binomials. I didn't mean that the Lucas code was being called again.

Faster factorials implemented...

Answered: Carl Love 28075

November 03 2018

0 0

Because of its abstractness and generality, my previous modular Mul command wasn't efficient for its most-common use (or most-common use relevant to this thread): the product of a range of integers (such as a factorial). So, I've now implemented Mul(m::integer..n::integer) mod p::posint as syntax additional to the Mul(e::algebraic, k::name= m::integer..n::integer) mod p::posint that I'd already implemented. This brings my code into closer alignment with VV's Fac procedure.

You don't need to use or understand any Mul syntax to use this code for its primary purpose: computing binomial(n,k) modulo a prime. It's all "under the hood". You simply use the syntax

Binomial(n,k) mod p.

The code is now (code below) in the form of a module. Simply entering the code into a Maple session or reading it in from a file will install the `mod/...` procedures. If the module is stored in a library, then simply invoking Binomial_mod_p() will install them.

Here is the code. For convenience, it also exists in the Code Edit Region of the attached worksheet.

Binomial_mod_p:= module()
description "Some tools to speed up the computation of binomial(n,k) mod p, p::prime";
option
      date= "9-Nov-2018",
       authors= (
          "VV from MaplePrimes  <...>",
          "Carl Love <carl.j.love@gmail.com> (Corresponding)"
    ),     
    reference= "https://en.wikipedia.org/wiki/Lucas%27s_theorem" 
;
################################################################################################################
# This is a module of tools whose primary purpose is to speed up the computation of binomial(n,k) mod p, for   #
# prime p. This code interacts with `mod` through the `mod/...` global hooks (an ancient pre-module            #
# interface). So, one simply invokes Binomial(n,k) mod p, where the Binomial is an inert function (see ?mod).  #  
#                                                                                                              #
# The speed of the computation, and the choice of algorithm, are based on the relative sizes of n, k, and p.   #
# all cases where p <= binomial(n,k)*k!, this code makes some significant attempt to use modular arithmetic to #
# improve upon the speed of the naive command binomial(n,k) mod p.                                             #
#                                                                                                              #
# This module has no exports. It may "installed" any of these ways:                                            #
#     1. Simply copy-and-paste this code into an execution group and execute;                                   #
#    2. Use the `read` command to import it from a file;                                                       #
#    3. If it's stored in a library, execute the module's name: Binomial_mod_p().                              #
# Any of these techniques will assign the necessary global `mod/...` procedures.                               #
#                                                                                                              #
# The module has a few userinfo statements, all keyed to infolevel[Binomial]:= 1.                              #
################################################################################################################
local
    #The methods implemented as `mod/...` procedures by this module are Mul and Binomial. If methods are added,
    #then their names should local procedures in this module named X such that the corresponding `mod/...` 
    #procedure will be named `mod/X` (and thus the usage will be X(...) mod p). The names should be added to
    #the list immediately below. Everything else will then happen automatically. 
    ModMethods:= [Mul, Binomial],    

    #The ModuleApply only exists so that invoking Binomial_mod_p() will force the ModuleLoad to
    #run if the module is stored in a library.
    ModuleApply:= proc($)
        userinfo(1, Binomial, "The `mod/...` procedures have been protected and installed.");
        return
    end proc,

    ModuleUnload:= proc()
        userinfo(1, Binomial, "The module Binomial_mod_p is being garbage collected.");
        return
    end proc,
        
    #Some constants specifying the largest "convenient" integer for various modes of computation:
    Mhf, Mgmp, Mcomp,
    #...and a proc to initialize them:
    Init_Ms:= proc()
        #In hfloat mode, we try to keep the mantissa half full so that products don't overflow it: 
            Mhf:= isqrt(Scale2(1, trunc(evalhf(DBL_MANT_DIG))-1));  
        #In GMP mode, we try to keep products to 1-word integers.
         Mgmp:= kernelopts(maximmediate);
         #Likewise for compiled mode, but the wordsize is slightly different.
         Mcomp:= Scale2(1, kernelopts(wordsize)-1) - 1;
        return
    end proc,

     ###########################################################
    # The forms of Mul currently implemented are
    #    - Mul(m..n): 
    #        a range of integers;
    #    - Mul(e, k= m..n):
    #         an expression e (likely depending on name k), with k iterated over a range of integers. 
       Mul:= proc()
    option remember;
    description "like mul, but in modular arithmetic";
           if nargs=2 then 
               if args[1]::range then `Mul/range`(args)
            else `Mul/container`(args)
             fi
           elif nargs=3 then `Mul/2arg`(args)
           else error "Expecting 1 or 2 arguments, but received", nargs-1
           fi
    end proc,

    `Mul/range/compiled`:=  
    proc(m::integer, n::integer, p::posint)::integer;
    option autocompile;
    local k::integer, r::nonnegint:= 1;
        for k from m to n do
            r:= r*k mod p;
            if r = 0 then break fi
        od;
        r
    end proc,

    `Mul/range`:= proc(R::range(integer), p::posint)
    description
          "Computes the modular product of a range of integers (just like 1-argument mul, but modular)."
           " The modulus needn't be prime. Attempts compiled mode for word-sized p."
    ;
    local r, k, m:= lhs(R), n:= rhs(R), M, MR:= max(1, CopySign(m,1), CopySign(n,1)); 
          if p < Mcomp then #Use compiled code.
              try return `Mul/range/compiled`(m, n, p) mod p
              catch: userinfo(1, Binomial, "Compiled code failed. Will use endogenous arithmetic instead.")
              end try
        fi;
        r:= 1; M:= max(p, iquo(Mgmp, max(1, MR)));
        for k from `if`(m>=0, m, -n) to `if`(m>=0, n, -m) while r <> 0 do
            r:= r*k; 
            if r >= M then r:= irem(r, p) fi
        od;
        `if`(m>0 or (k-m)::even, r, -r) mod p
    end proc,

    `Mul/2arg/evalhf`:= proc(e, m, n, p)
     local k, r:= 1, s:= 1, M:= _M, M2:= M^2;
         for k from m to n while r <> 0 do
            r:= r*e(k);
               if r > M2 then error fi; #too large for exact representation
               if r < 0 then r:= -r; s:= -s fi;
               if r >= M then r:= irem(r,p) fi
          od;
          s*r
     end proc,
            
    `Mul/2arg`:= proc(E, J::(name=range), p::posint)
    description 
          "Just like two-argument `mul`, but in modular arithmetic. "
          "The modulus needn't be prime. Attempts hfloat mode when p < 2^26."
    ;
    local 
           r, k, s, m:= lhs(rhs(J)), n:= rhs(rhs(J)), M,
           #This is an iterated subs--not an ordinary multiple subs--used like `unapply`: 
           e:= subs(_E= E, lhs(J)= _K, _K-> _E)
    ;
          if p < Mhf then 
              #It's worth trying to compute exactly in hfloats because it's much faster.
             try return trunc(evalhf(`Mul/2arg/evalhf`(e, m, n, p, M))) mod p
              catch: userinfo(1, Mul, "Evalhf failed. Will use endogenous arithmetic instead.")
             end try  
           fi;
           r:= 1;  s:= true;  M:= max(p, iquo(Mgmp, p)); 
           for k from m to n while r <> 0 do
              r:= r*e(k);
              if r < 0 then r:= -r; s:= not s fi;
              if r >= M then r:= irem(r,p) fi
        od;
           `if`(s, r, -r) mod p
    end proc,

    Binomial:= proc(N::nonnegint, K::nonnegint, p::prime)
    option remember;
    local r, n, k:= min(K, N-K), i, num, M:= max(p, iquo(Mgmp, p));
        if p > k and k >= 0 then #Straight-forward method for large p 
             num:= Mul(N-k+1..N, p);
             return `if`(num=0, 0, num/':-Mul'(1..k) mod p)
        fi;
           #Lucas's method for small p, with callbacks to method above for the
        #sub-binomials:
        r:= `if`(K > N, 0, 1);  n:= N;
        while k <> 0 and r <> 0 do 
             r:= r*thisproc(irem(n,p,'n'), irem(k,p,'k'), p);
              if r >= M then r:= irem(r,p) fi 
           od;
           r mod p
    end proc,    
    
    ModuleLoad:= proc()
    local P, N;
        #Set upper bounds of computation modes:
        Init_Ms();
        #Insert them into relevant procs:
        `Mul/2arg/evalhf`:= subs(_M= Mhf, eval(`Mul/2arg/evalhf`));
        #Compile compilable procedure:
        Compiler:-Compile(`Mul/range/compiled`, 'warnings'); 
        #Protect global `mod` names and assign `mod/...` procs:
        for P in ModMethods do
            N:= cat(`mod/`, P);
            if not N::protected then assign(N= P) fi;
             protect~([N, cat(``, P)])
        od;
        return
    end proc,  
`;`;
    ModuleLoad()
end module:

Binomial_mod_p.mw

I need to add some compiled code to this, akin to what VV did. When I do, I'll just edit this Reply.

[Edit: I've now added the compiled code.]

[Edit 9-Nov-2018: The code and worksheet have been updated.]

The hypergeom works better...

Answered: Carl Love 28075

November 02 2018

0 0

@acer No doubt your hypergeometric expression is equivalent to my regularized incomplete beta function in integral form. I lost my kernel connection about a dozen times (mostly on smaller p) while producing that post, so I guess that the fault is in the numeric integration rather than NextZero. Did you have any such trouble?

p <= k case...

Answered: Carl Love 28075

November 02 2018

0 0

@vv I am extremely skeptical that your direct modular method for p <= k (posted immediately above) could possibly be faster than using Lucas's combined with the direct modular method for the smaller binomials. Please post an example that shows it.

Your matrices...

Answered: Carl Love 28075

November 02 2018

0 0

@jga You wrote:

To be honest I can't follow the discussion that's going on in here.

There are two algorithms being discussed here: Lucas's and what VV may have inadvertently dubbed "the direct method". Both of these are implemented in modular arithmetic. Both are much much faster than the naive direct method binomial(n,k) mod p. The only issue is when is the optimal time to automatically switch from one method to the other. You don't need to understand the inner workings of this in order to use the code; it's done automatically.

I can tell you that in my algorithm there's no relation between N and K with p.

Even if they're completely independent, you must have some idea of their individual ranges or distributions.

I'll try the integrated version and see how does it compare to the direct method in my case.

You'd just be wasting your time if by "direct method" you mean binomial(n,k) mod p. Use the integrated version that I just posted. It's definitely faster.

After this I still need to find the kernel (or equivalently, the rank) of them in prime characteristic.

The command LinearAlgebra:-Modular:-Rank is super fast for this, especially for p < 2^25.

Optimized and recursively combined proce...

Answered: Carl Love 28075

November 02 2018

0 0

@vv I agree that they should be recursively combined, and I've done it below. I've also implemented an automatic diversion to evalhf for the "direct method" when the computations can thus be done exactly, because it's much faster. (They can be done exactly when the magnitude of the result is less than 2^53.) Any deviation from exactness causes the code to revert to native GMP.

`mod/Mul`:= proc(E, J::(name=range), p::posint)
description 
   "Just like two-argument `mul`, but in modular arithmetic. "
   "The modulus needn't be prime. Attempts hfloat mode when p < 2^26."
;
option author= "Carl Love <carl.j.love@gmail.com> 2-Nov-2018";
local 
   r, k, s, m:= lhs(rhs(J)), n:= rhs(rhs(J)),
   #This a is an iterated subs, not an ordinary multiple subs: 
   e:= subs(_E= E, lhs(J)= _K, _K-> _E),
   #In hfloat mode, we try to keep the mantissa half full
   #so that products don't overflow it: 
   M:= Scale2(1, iquo(trunc(evalhf(DBL_MANT_DIG)), 2)) #i.e., M = 2^26 
;
   if p < M then 
      #It's worth trying to compute exactly in hfloats because it's much faster.
      try
         return 
            trunc(evalhf(proc(e, m, n, p, M)
            local 
               k, r:= 1, s:= 1, M2:= M^2;
               for k from m to n while r > 0 do
                  r:= r*e(k);
                  if r > M2 then error fi; #too large for exact representation
                  if r < 0 then r:= -r; s:= -s fi;
                  if r >= M then r:= irem(r,p) fi
               od;
               s*r
            end proc
            (e, m, n, p, M)
            )) mod p
      catch:
         userinfo(1, Mul, "Evalhf failed. Will use GMP instead.")
      end try  
   fi;
   r:= 1;  s:= true;  M:= max(p, iquo(kernelopts(maximmediate), p)); 
   for k from m to n while r > 0 do
      r:= r*e(k);
      if r < 0 then r:= -r; s:= not s fi;
      if r >= M then r:= irem(r,p) fi
   od;
   `if`(s, r, -r) mod p
end proc:

`mod/Binomial`:= proc(N::nonnegint, K::nonnegint, p::prime)
description "Computes binomial(N,K) mod p, p::prime, by Lucas's theorem";
option
   author= "Carl Love <carl.j.love@gmail.com> 2-Nov-2018",
   reference= "https://en.wikipedia.org/wiki/Lucas%27s_theorem"
; 
local r, n, k:= min(K, N-K), i, num, M:= max(p, iquo(kernelopts(maximmediate), p));
   if p > k then #Straight-forward method for large p 
      num:= Mul(i, i= N-k+1..N) mod p;
      return `if`(num=0, 0, num/Mul(i, i= 1..k) mod p)
   fi;
   #Lucas's method for small p, with recursive callbacks to method above for the
   #sub-binomials:
   r:= `if`(K > N, 0, 1);  n:= N;
   while k > 0 and r > 0 do 
      r:= r*irem(thisproc(irem(n,p,'n'), irem(k,p,'k'), p), p);
      if r >= M then r:= irem(r,p) fi 
   od;
   r mod p
end proc:

Here's your most-recent example:

p:=nextprime(8*10^5): n:=p*2-1:  k:=floor(p/2):
CodeTools:-Usage(Binomial(n,k) mod p);
memory used=4.62KiB, alloc change=0 bytes, cpu time=312.00ms, real time=317.00ms, gc time=0ns
                             800010

Negative binomial with non-integer para...

Answered: Carl Love 28075

November 01 2018

0 0

@mmcdara According to the Wikipedia article "Negative binomial distribution", it is common (perhaps even standard) to use non-integer values for the "number of failures" parameter r. There it is said about the associated nomenclature:

The Pascal distribution (after Blaise Pascal) and Polya distribution (for George Pólya) are special cases of the negative binomial distribution. A convention among engineers, climatologists, and others is to use "negative binomial" or "Pascal" for the case of an integer-valued stopping-time parameter r, and use "Polya" for the real-valued case.

So it seems that the use of non-integer r reflects how the distribution is used in actual practice and has nothing to do with internal representations of factorials as Gamma functions.

Eliminating a variable...

Answered: Carl Love 28075

November 01 2018

0 0

@Rouben Rostamian Ah, I didn't "see" that it was possible to eliminate varphi from the constraint. Indeed the constraint is linear wrt varphi. As I have always been suspicious of the implicitdiff command, someone should check whether the results are really equivalent. If they aren't, then I have far more confidence that your results are the correct ones.

static is efficient by what measure?...

Answered: Carl Love 28075

October 31 2018

0 0

@Joe Riel Are static procedures more time efficient on a per-use basis? Or are they just more efficient (timewise and memorywise) when the object is created?

@acer Can you give an example of what you mean? (It is frustrating, yet understandable, that thismodule and static don't mix.)

Combined procedure...

Answered: Carl Love 28075

October 31 2018

0 0

@vv The two methods (VV's and mine) should be combined into a single procedure, so here it is. I think that p > k is sufficient for your method, right?

`mod/Mul`:= proc(E, J::(name=range), p::posint)
description "Just like `mul`, but in modular arithmetic. The modulus needn't be prime.";
local r:= 1, k, e:= subs(lhs(J)= k, E), s:= true;
   for k from lhs(rhs(J)) to rhs(rhs(J)) while r > 0 do
      r:= r*eval(e,2);
      if r < 0 then r:= -r; s:= not s fi;
      if r >= p then r:= irem(r,p) fi
   od;
   `if`(s, r, -r) mod p
end proc:

`mod/Binomial`:= proc(N::nonnegint, K::nonnegint, p::prime)
description "Computes binomial(N,K) mod p, p::prime, by Lucas's theorem";
option
   author= "Carl Love <carl.j.love@gmail.com> 31-Oct-2018",
   reference= "https://en.wikipedia.org/wiki/Lucas%27s_theorem"
; 
local r:= `if`(K > N, 0, 1), n:= N, k:= min(K, N-K), i;
   if r = 0 then return 0 fi;
   #Straight-forward method for large p: 
   if p > k then return Mul(n-i, i= 0..k-1)/Mul(i, i= 1..k) mod p fi; 
   while k > 0 and r > 0 do #Lucas's method for small p:
      r:= r*irem(binomial(irem(n,p,'n'), irem(k,p,'k')), p) 
   od;
   r mod p
end proc:

Your example:

(n,k,p):= (8*10^5, 5*10^5, nextprime(8*10^5)):
CodeTools:-Usage(Binomial(n,k) mod p);
memory used=84.86MiB, alloc change=6.01MiB, cpu time=781.00ms, 
real time=780.00ms, gc time=31.25ms
                             494383

It's a valid reason to edit...

Answered: Carl Love 28075

October 31 2018

0 0

@acer I agree with what you did: It's a good reason to edit a Question. And thank you for being a conscientious Moderator. There should still be a way to do it without changing the position, or to reset the position once it's done.

Likely possible, but not with solve...

Answered: Carl Love 28075

October 31 2018

0 0

It's likely possible, but you need to clean it up. I mean clarify. There's no p in your equations. What is G? Will you be able to specify p and q numerically at the time that you solve it?

This is definitely not handled by the command solve, which operates strictly over the complex numbers. (Even trying to restrict solve to real numbers is often problematic.) But there are several integer solvers, such as chrem, msolve, and isolve. Look up their help by typing

?chrem

etc., at a command prompt.

solve(..., ..., parametric)...

Answered: Carl Love 28075

October 31 2018

0 0

@Rohith Since you only deal with polynomials, there may be a symbolic solution available for you. See ?solve,parametric. Using your posted example:

f:=2*a*b-3*a-2*b-c: # 15 <= f <= 30

Pick any two variables to get piecewise solutions broken down by the third. For example, do

solve({f > 30}, {a,b}, parametric)

This'll tell you (among a great many other things) that c=33 is a critical value upon which the branching of solutions happens. Apply this recursively:

solve({f > 30, c < 33}, {a}, parametric)

We thus learn that b=3/2 is critical. We are building a tree of all valid combination of a, b, c, f. This is hard symbolic work, but it's a start. In particular, deconstructing any piecewise programmatically is annoying; deconstructing those spit out by solve(..., ..., parametric) is brutal^[*1].

The parametric option to solve is only available for sets of polynomial inequalities and/or equations.

^[*1]It would help a little if there were an option akin to the ifactor / ifactors dichotomy: one version being prefrerred for visual display and one for programmatic uses.

E-Mail Address:
Password:
Remember Me:	Automatically sign in on future visits

E-Mail Address:
Password:
Remember Me:	Automatically sign in on future visits

Ask a Question

Create a Post

28075 Reputation

25 Badges

MaplePrimes Activity

These are replies submitted by Carl Love

Lexical variables in evalhf'able and com...

It's the only exact contingency table te...

Please try the new code...

Faster factorials implemented...

The hypergeom works better...

p <= k case...

Your matrices...

Optimized and recursively combined proce...

Negative binomial with non-integer para...

Eliminating a variable...

static is efficient by what measure?...

Combined procedure...

It's a valid reason to edit...

Likely possible, but not with solve...

solve(..., ..., parametric)...

Save this setting as your default sorting preference?

Ask a Question

Create a Post

Generating PDF…

Save this setting as your default sorting preference?
Note: You can change your preference any time in your account settings
Don't show this again

From:
To:

Custom Message (optional):