sursumCorda

1279 Reputation

15 Badges

2 years, 353 days

MaplePrimes Activity


These are questions asked by sursumCorda

There are two opposing commands remove and select in Maple. According to the main help page, StringTools:-RegSplit effectively implements the removal (i.e., capturing substrings that does not match the given pattern), but as regards extracting the matching parts of the input string (e.g., this example from MatLab), where is the command to carry out the selection?
At present I can do something like 

use StringTools in RegFilter := (p::string, s::string) -> select[2](RegMatch, sprintf("^%s$", p), (op@NGrams)~(s, [`$`](Length(s)))) end:
RegFilter("a.++b", "aabbbaaabb");
 = 
 ["aab", "abb", "aab", "abb", "aabb", "abbb", "aaab", "aabb", 

   "aabbb", "aaabb", "abbbaaab", "aabbbaaab", "abbbaaabb", 

   "aabbbaaabb"]


Nevertheless, there exist at least two disadvantages to it.

This is essentially equivalent to letting the matcher keeps starting at the same position until no more new matches are found, while sometimes one may just need the matcher to continue the shortest-match testing at the character following the last matched substring after finding a match:

For instance, there should be three flags in “RegFilter("a.+?b", "aabbbaaabb", 'overlapped'=⁇);”: 
⒈“["aab", "aaab"]” (selection with no overlap), 
⒉“["aab", "abb", "aaab", "aab", "abb"]” ( with partial overlaps), and 
⒊“["aab", "aabb", "aabbb", "aabbbaaab", "aabbbaaabb", "abb", "abbb", "abbbaaab", "abbbaaabb", "aaab", "aaabb", "aab", "aabb", "abb"]” ( with full overlaps). 

Unfortunately, the  above can only handle the last mode.

Another disadvantage is its inefficiency. Considering the following regular expression which matchs email-like strings

sample := Import("https://github.com/mariomka/regex-benchmark/raw/optimized/input-text.txt"): # lengthy 
re := "[a-zA-Z_0-9\\.+-]+@[a-zA-Z_0-9\\.-]+\\.[a-zA-Z_0-9\\.-]+":
time[real]((remnants := StringTools:-RegSplit(re, sample)));
 = 
                             3.157

So Maple is capable of completing this removal within 4 seconds, yet if I execute the analogous

timelimit(60, time[real](RegFilter(re, sample))); 

Maple will end up running out of memory.

In view of these, where is the generic StringTools:-RegFilter functionality? Or can we construct those matched cases from  and the original text?

In short, I'd like to obtain n largest/smallest elements in a huge list of (probably non-numeric) data. Of cource I can sort it and then extract the desired part, yet isn't there a dedicated procedure that do a partial sort of the input data in Maple?

Edit. In a MatLab weblog, the blogger gave: 

So I believe that a dedicated one is not useless. But what is the Maple equivalent to MatLab's maxk, mink, and topkrows?

The sequence A161786 consists of “primes with at least one digit appearing exactly four times in the decimal expansion”. Below is the Maple program given in that OEIS page:

The code above picks out primes having exactly four identical digits (determined by ) from the first 10,000 prime numbers. However, it's easy to check that this program is rather slow (It takes about 2.6s to execute it!). 
Actually, I would like to select such primes from pn, pn+1, pn+2, …, pn+m-1 (typically, n=1,000,000 and m=1,000,000), where pk denotes the k-th prime number, yet the original program failes to do so in twenty minutes. Part of the reason is that for long sequences, the efficiency can be critical. Therefore I make a slight modification to the original code: 

A161786__0 := proc(m::nonnegint, n::posint := 1, ` $`)::'Vector'(prime):
	#(*
	    kernelopts(opaquemodules = false):
	# *)
	local p := ifelse(n = 1, 1, ithprime(n - 1)), vec := Vector('datatype' = prime);
	to m do
		if ormap(`=`, MultiSet(`convert/base`:-MakeSplit(length((p := nextprime(p))), 1, 10)(p)):-hash, 4) then
			vec ,= p
		fi
	od;
	vec
end:

Nevertheless, this version is still inefficient: 

time[real](A161786__0(10**6, 10**6));
 = 
                            182.414

Another choice is converting each of integers into a string: 

A161786__1 := proc(m::nonnegint, n::posint := 1, ` $`)::'Vector'(prime):
	options no_options:
	local p := ifelse(n = 1, 1, ithprime(n - 1)), vec := DEQueue();
	to m do
		if member(4, rhs~({StringTools['CharacterFrequencies'](nprintf("%d", (p := nextprime(p))), 'digit')})) then
			vec ,= p
		fi
	od;
	Vector([vec[]], 'datatype' = prime)
end:

This time the elapsed time is reduced to nearly two minutes: 

time[real](A161786__1(10**6, 10**6));
 = 
                            118.409

But can this task be accomplished within (a quarter of) a minute in modern Maple? In other words, is there a way to make further improvement on the performance? (Note that the reference time is mesured using a adjusted version (i.e., ) of the Mma code provided in that OEIS page.) 

Since strings are not mutable objects in Maple, the package provides two procedures, StringTools:-OldStringBuffer and StringTools:-StringBuffer, which appear heavily correlated with Java's  and . 

The help page of StringBuffer claims that use of a is much more efficient than the naive approach: 

(*
`G` and `F` are taken from the link above.
*)
G := proc()
   description "extremely inefficient string concatenator";
   local   r;
   r := proc()
       if nargs = 0 then
           ""
       elif nargs = 1 then
           args[ 1 ]
       else
           cat( args[ 1 ], procname( args[ 2 .. -1 ] ) )
       end if
   end proc;
   r( args )
end proc:
# # This can be transformed into an O(1) algorithm by passing a string buffer to the recursive calls.
F := proc()
   description "efficient version of G";
   local    b, r;
   b := StringTools:-StringBuffer();
   r := proc()
       if nargs = 1 then
           b:-append( args[ 1 ] )
       else
           b:-append( args[ 1 ] );
           procname( args[ 2 .. -1 ] )
       end if
   end proc;
   r( args ):-value()
end proc:
s := 'StringTools:-Random(10, print)' $ 1e4:
NULL;
time(G(s));
                             5.375

time(F(s));
                             1.125

But why not use the built-in cat directly? 

time(cat(s));
                               0.

time(StringTools:-Join([s], ""));
                               0.

Clearly, this is even more efficient

Here is the last example in that link. 

FilterFile := proc( fname::string, filter )
   local   b, line;
   b := StringTools:-StringBuffer();
   do
       line := readline( fname );
       if line = 0 then break end if;
       b:-append( filter( line ) )
   end do;
   b:-value()
end proc: # verbatim 
filename__0 := FileTools:-JoinPath(["example", "odyssey.txt"], 'base' = 'datadir'):
filename__1 := URL:-Download("https://gutenberg.org/ebooks\
/2600.txt.utf-8", "War-and-Peace.txt"):

fclose(filename__0):
    time[real]((rawRes0 := FilterFile(filename__0, StringTools:-Unique)));
                             0.223

fclose(filename__1):
    time[real]((rawRes1 := FilterFile(filename__1, StringTools:-Unique)));
                             1.097

Nevertheless, 

close(filename__0):
use StringTools, FileTools:-Text in
	time[real]((newRes0 := String(Support~(fscanf(filename__0, Repeat("%[^\n]%*c", CountLines(filename__0))))[])))
end;
                             0.118

close(filename__1):
use StringTools, FileTools:-Text in
	time[real]((newRes1 := String(Support~(fscanf(filename__1, Repeat("%[^\n]%*c", CountLines(filename__1))))[])))
end;
                             0.580

evalb(newRes0 = rawRes0 and newRes1 = rawRes1);
                              true

As you can see, these experiments just tell an opposite story. Isn't the so-called "StringBuffer" obsolete today

Delay differential equations in Chebfun lists 15 examples "taken from the literature". Many of them can be (numerically) solved in Maple without difficulty, yet when I attempt to solve the  in the above link, Maple's internal solver `dsolve/numeric` just halts with an error. 

plots:-odeplot(dsolve({D(u)(t) + u(t)**2 + 2*u(1/2*t) = 1/2*exp(t), u(0) = u(1/3)}, type = numeric, range = 0 .. 1/3), size = ["default", "golden"]);
Error, (in dsolve/numeric) delay equations are not supported for bvp solvers

Even if I guess an initial (or final) value artificially, the solution is still less reliable (For instance, what is the approximate endpoint value? 0.26344 or 0.2668?): 

restart;
dde := D(u)(t) + u(t)**2 + u(t/2)*2 = exp(t)/2:
x__0 := 2668/10000:
sol0 := dsolve([dde, u(0) = x__0], type = numeric, 'delaymax' = 1/6, range = 0 .. 1/3):
plots['odeplot'](sol0, [[t, u(t)], [t, x__0]], 'size' = ["default", "golden"]);

x__1 := 26344/100000:
sol1 := dsolve([dde, u(1/3) = x__1], type = numeric, 'delaymax' = 1/6, range = 0 .. 1/3):
plots['odeplot'](sol1, [[t, u(t)], [t, x__1]], size = ["default", "golden"]);

Compare:  (Note that the reference numerical solution implies that its minimum should be no less than 0.258 (Is this incorrect?).).

And actually, the only known constraint is simply u(0)=u(⅓) (so neither value is known beforehand). Can Maple process this boundary condition automatically (that is, without the need for manual preprocessing and in absence of any other prior information)?
I have read the help page How to | Numeric Delay Differential Equations and Numerical Solution of Difficult ODE Boundary Value Problems, but it appears that those techniques are more or less ineffective here. So, how do I solve such a "first order nonlinear 'BVP' with pantograph delay" in Maple?

First 8 9 10 11 12 13 14 Last Page 10 of 23