## Bug in Mode

Maple 2016

Let us consider

```Statistics:-Mode(Binomial(n, p));
floor((1 + n) p)
```

Up to Wiki, the output is not correct. Simply no words.

## Bug in Statistics,PDF

Maple 2016

 (1)

 (2)

 (3)

There were recently submitted a dozen Maple bugs by me and others. Maplesoft have brought no responses. They keep strategic silence. True merit is not afraid of criticism.

## Bug in Probability

Maple 2016
```restart; with(Statistics):
X := RandomVariable(Normal(0, 1)): Y := RandomVariable(Uniform(-2, 2)):
Probability(X*Y < 0);```

crashes my comp in approximately 600 s. Mma produces 1/2 on my comp in 0.078125 s.

## Bug in Statistics:-CDF

Maple 2016

Let us consider

```with(Statistics):
X1 := RandomVariable(Normal(0, 1)):
X2 := RandomVariable(Normal(0, 1)):
X3 := RandomVariable(Uniform(0, 1)):
X4 := RandomVariable(Uniform(0, 1)):
Z := max(X1, X2, X3, X4); CDF(Z, t);

int((1/2)*(_t0*Heaviside(_t0-1)-_t0*Heaviside(_t0)-Heaviside(1-_t0)*Heaviside(-_t0)+Heaviside(-_t0)+Heaviside(1-_t0)-1)*(1+erf((1/2)*_t0*2^(1/2)))*(2^(1/2)*Heaviside(_t0-1)*exp(-(1/2)*_t0^2)*_t0-2^(1/2)*Heaviside(_t0)*exp(-(1/2)*_t0^2)*_t0-2^(1/2)*Heaviside(-_t0)*Heaviside(1-_t0)*exp(-(1/2)*_t0^2)-Pi^(1/2)*undefined*erf((1/2)*_t0*2^(1/2))*Dirac(_t0)-Pi^(1/2)*undefined*erf((1/2)*_t0*2^(1/2))*Dirac(_t0-1)+2^(1/2)*Heaviside(-_t0)*exp(-(1/2)*_t0^2)+2^(1/2)*Heaviside(1-_t0)*exp(-(1/2)*_t0^2)-Pi^(1/2)*undefined*Dirac(_t0)-Pi^(1/2)*undefined*Dirac(_t0-1)+Pi^(1/2)*Heaviside(_t0-1)*erf((1/2)*_t0*2^(1/2))-Pi^(1/2)*Heaviside(_t0)*erf((1/2)*_t0*2^(1/2))-exp(-(1/2)*_t0^2)*2^(1/2)+Pi^(1/2)*Heaviside(_t0-1)-Pi^(1/2)*Heaviside(_t0))/Pi^(1/2), _t0 = -infinity .. t)```

whereas Mma 11 produces the correct piecewise expression (see that here screen15.11.16.docx).

`Edit. Mma output.`

## Statistics:-Mode does not work properly...

reference :

Question:Quantile function
Posted:
Mikhail Drugov 88

In the reference above, Mikhail has raised a problem concerning the function Statistics:-Quantile.
A problem of the same kind exists for the function Mode.

In fact  Mode returns the value of the mode only for unimodal distributions ; but for "bimodal" distributions it does not work properly.
Theoritically the mode is the value where the PDF reaches its maximum maximorum. Except in very particular cases this maximum is unique, even if common language speaks of "bimodal distributions" instead of "two bumped distributions".

Here is an example of a two bumped distribution (Z) obtained by mixing two gaussians distributions.
It has two bumps (z=-1, z=2) but only one mode (z=-2).
It could be hopefully acceptable that Mode returns the {-2, 2} (even if only -2 is the true mode), but Mode returns also the value of z that minimizes PDF(Z, z), which is not correct at all.

 > restart: with(Statistics):
 > X := RandomVariable(Normal(-2,1)): Y := RandomVariable(Normal(2,1)):
 > r    := 0.4: f__Z := unapply((1-r)*PDF(X,t)+r*PDF(Y,t), t); Z    := Distribution(PDF=f__Z):
 (1)
 > plot(PDF(Z,t), t=-4..4);
 > Mode(Z);
 (2)
 >

## Quantile function...

Hello,

I need a bimodal distribution. Since I could not find any among the ones provided by Maple, I created a simple one:

with(Statistics):
U := Distribution(PDF = (proc (t) options operator, arrow; piecewise(t < -5, 0, t < 5, -(1/2000)*t^4+(9/1000)*t^2+7/80, 0) end proc)):
X := RandomVariable(U):

#Plotting PDF and CDF works fine:
plot(PDF(X, t), t = -infinity .. infinity);

plot(CDF(X, t), t = -infinity .. infinity)

However, plotting the quantile function does not work:

plot(Quantile(X, z), z = 0 .. 1);

it has a decreasing part for z<1/2 and a discontinuity at z=1/2.
I can plot it correctly as
plot('Quantile'(X, z), z = 0 .. 1);

but I wonder why the first option does not work for such a simple distribution.

## Nonlinear fit of difficult function...

I have a data point set:

```x_val:=<250,300,350,397,451,497,547,593,647,691,745,788,840,897>:
y_val:=<0,0.5,2,6.3,23.2,48.7,71.2,83.4,90.1,92.8,94.7,95.7,96.9,97.8>:```

I want to make a least square fit using this difficult function:

`function:=x->1-exp(-(k*exp(-(E/(8.314*873.15))*((873.15/x)-1)))*(0.026/350))`

but both Statistics[Fit]:

```with(Statistics):fit_nelog:=Fit(1-exp(-(k*exp(-(E/(8.314*873.15))*((873.15/x)-1)))*(0.026/350)),<x_val|y_val>,x,parameternames=[k,E],output=[parametervector,residualsumofsquares]);
```

and DirectSearch[DataFit]:

`with(DirectSearch):fit_nelog2:=DataFit(1-exp(-(k*exp(-(E/(8.314*873.15))*((873.15/x)-1)))*(0.026/350)),x_val,y_val,x,method=cdos);`

give wrong k,E parameters. The correct parameter values were obtained with Excel Solver:

```k=27843.3551042397

E=68.4```

The approximately correct parameters were fitted when using logarithm form of the function.
How can I obtain correct parameter values in Maple using given form of the function?

## Sampling from multivariate probability distributio...

Hi All,

I have a fucntion f(x,y,z) = exp(-x^2 -y^2 - z^4) and would like to plot the probabity density in real space. One method would be to randomly sample points in a grid based on f(x,y,z). The function f(x,y,z) is clearly peaked around x=y=z=0, so you would expect many points to lie around there. So the plot would look like a clump near (0,0,0) which gets less dense away from (0,0,0).

In the worksheet below, I sampled points from the Uniform distribution to file in the 3d-plot. I would like these points to be sampled from f instead, but am not sure how to do this.

Any help is appreciated,

 > restart;
 > with(Statistics):
 > R := 10; # x-axis size N := 100; # Number f points to sample
 (1)
 > # Unnormalized Probability distrubution f := (x,y,z) -> exp(-x^2 -y^2 - z^2);
 (2)
 > # Clearly f is peaked at (0,0,0) and decays. Therefore I want a plot a lot of points near (0,0,0), and fewer points away from (0,0,0) plot3d(f(x,y,0), x = -1..1, y = -1..1);
 > X := Sample(Uniform(-R, R), N):
 > Y := Sample(Uniform(-R, R), N): Z := Sample(Uniform(-R, R), N): XYZ := Matrix([[X], [Y], [Z]])^%T;
 (3)
 > ScatterPlot3D(XYZ, color = blue, symbolsize = 20);
 >
 >
 >
 >

## How to sample correlated Random Variables?...

Hi everyone,

I've not been able to figure this one out. Say I have an expression dependant on two random variables, like this:

C:=A+B

where A and B are randomvariables, each following a specific distribution.

If I ask for a Sample of C

Sample(C,1)

Maple will sample A, sample B and compute C. But say that A and B are correlated (with a cc of 0.8). How do I define this?

## Aggregate Statistics on DataFrames

by: Maple 2016

Aggregate statistics are calculated by splitting the rows of a DataFrame by each factor in a given column into subsets and computing summary statistics for each of these subsets.

The following is a short example of how the Aggregate command is used to compute aggregate statistics for a DataFrame with housing data:

To begin, we construct a DataFrame with housing data: The first column has number of bedrooms, the second has the area in square feet, the third has price.

`bedrooms := <3, 4, 2, 4, 3, 2, 2, 3, 4, 4, 2, 4, 4, 3, 3>:area := <1130, 1123, 1049, 1527, 907, 580, 878, 1075,          1040, 1295, 1100, 995, 908, 853, 856>:price := <114700, 125200, 81600, 127400, 88500, 59500, 96500, 113300,           104400, 136600, 80100, 128000, 115700, 94700, 89400>:HouseSalesData := DataFrame([bedrooms, area, price], columns = [Bedrooms, Area, Price]);`

Note that the Bedrooms column has three distinct levels: 2, 3, and 4.

`convert(HouseSalesData[Bedrooms], set);`

The following returns the mean of all other columns for each distinct level in the column, Bedrooms:

`Aggregate(HouseSalesData, Bedrooms);`

Adding the columns option controls which columns are returned.

`Aggregate(HouseSalesData, Bedrooms, columns = [Price])`

Additionally, the tally option returns a tally for each of the levels.

`Aggregate(HouseSalesData, Bedrooms, tally)`

The function option allows for the specification of any command that can be applied to a DataSeries. For example, the Statistics:-Median command computes the median for each of the levels of Bedrooms.

`Aggregate(HouseSalesData, Bedrooms, function = Statistics:-Median);`

By default, Aggregate uses the SplitByColumn command to creates a separate sub-DataFrame for every discrete level in the column given by bycolumn.

`with(Statistics);`
`ByRooms := SplitByColumn(HouseSalesData, Bedrooms);`

We can create box plots of the price for subgroups of sales defined by number of bedrooms.

`BoxPlot( map( (m)->m[Price], ByRooms),              deciles=false,              datasetlabels=["2 bdrms", "3 bdrms", "4 bdrms"],              color=["Red", "Purple", "Blue"]);`

I have recorded a short video that walks through this example here: https://youtu.be/e0pqCMyO3ks

## Results of TwoSampleTTest...

Hi,

I did some hypothesis testing exercises and I cross checked the result with Maple. I just used following vectors for an unpaired test

a := [88, 89, 92, 90, 90];
b := [92, 90, 91, 89, 91];

I ended up with the following solution:

HFloat(1.5225682336585966)
HFloat(-3.122568233658591)
for a 0.95 confidence interval.

Using

TwoSampleTTest(a, b, 0, confidence = .95, summarize = embed)

and

TwoSampleTTest(a, b, 0, confidence = .975, summarize = embed)

I get following results:

-2.75177 .. 1.15177

-3.13633 .. 1.53633

respectively. I can not explain the discrepancy.

Best regards,

Oliver

PS:

Maple Code in case files won´t be attached.

Unpaired t Test
restart;
Unpaired test-test dataset
a := [88, 89, 92, 90, 90];
b := [92, 90, 91, 89, 91];
The seÂ² estimate is given by:
seÂ²=var(a)+var(b)+2*cov(a*b)=var(a)+var(b)
seÂ²=
sigma[a]^2/Na+sigma[b]^2/Nb;
with Na, Nb being the length of vector a and b respectively.
2                              2
sigma[[88, 89, 92, 90, 90]]    sigma[[92, 90, 91, 89, 91]]
---------------------------- + ----------------------------
Na                             Nb
sigma[a]^2;
and
sigma[b]^2;
are approximated by
S[a]^2;
and
S[b]^2;
2
sigma[[88, 89, 92, 90, 90]]
2
sigma[[92, 90, 91, 89, 91]]
2
S[[88, 89, 92, 90, 90]]
2
S[[92, 90, 91, 89, 91]]
with
S[X]^2;
defined as
S[X]*`Â²` = (sum(X[i]-(sum(X[j], j = 1 .. N))/N, i = 1 .. N))^2/N;
2
S[X]
2
/      /         N       \\
|      |       -----     ||
|  N   |        \        ||
|----- |         )       ||
| \    |        /    X[j]||
|  )   |       -----     ||
| /    |       j = 1     ||
|----- |X[i] - ----------||
\i = 1 \           N     //
S[X] ï¿ï¾² = ----------------------------
N
with(Statistics);
Sa := Variance(a);
HFloat(2.1999999999999993)
Sb := Variance(b);
HFloat(1.3000000000000003)
Now we are ready to do hypothesis testing (0.95).
We have (with k=min(Na,Nb)=5):
C = mean(a)-mean(b); Deviation := t_(alpha/a, k-1)*se(Sa/k-Sb/k);
c := Mean(a)-Mean(b); deviation := 2.776*sqrt((1/5)*Variance(a)+(1/5)*Variance(b));
HFloat(-0.7999999999999972)
HFloat(2.3225682336585938)
upperlimit := c+deviation; lowerlimit := c-deviation;
HFloat(1.5225682336585966)
HFloat(-3.122568233658591)

Execution of built in student test
TwoSampleTTest(a, b, 0, confidence = .95, summarize = embed);

## An Interactive Application for Exploring Country...

by: Maple 2015

This is the second of three blog posts about working with data sets in Maple.

In my previous post, I discussed how to use Maple to access a large number of data sets from Quandl, an online data aggregator. In this post, I’ll focus on exploring built-in data sets in Maple.

Data is being generated at an ever increasing rate. New data is generated every minute, adding to an expanding network of online information. Navigating through this information can be daunting. Simply preparing a tabular data set that collects information from several sources is often a difficult and time consuming effort. For example, even though the example in my previous post only required a couple of lines of Maple code to merge 540 different data sets from various sources, the effort to manually search for and select sources for data took significantly more time.

In an attempt to make the process of finding data easier, Maple’s built-in country data set collects information on country-specific variables including financial and economic data, as well as information on country codes, population, area, and more.

The built-in database for Country data can be accessed programmatically by creating a new DataSets Reference:

`CountryData := DataSets:-Reference( "builtin", "country" );`

This returns a Reference object, which can be further interrogated. There are several commands that are applicable to a DataSets Reference, including the following exports for the Reference object:

`exports( CountryData, static );`

The list of available countries in this data set is given using the following:

`GetElementNames( CountryData );`

The available data for each of these countries can be found using:

`GetHeaders( CountryData );`

There are many different data sets available for country data, 126 different variables to be exact. Similar to Maple’s DataFrame, the columns of information in the built-in data set can be accessed used the labelled name.

For example, the three-letter country codes for each country can be returned using:

`CountryData[.., "3 Letter Country Code"];`

The three-letter country code for Denmark is:

`CountryData["Denmark", "3 Letter Country Code"];`

Built-in data can also be queried in a similar manner to DataFrames. For example, to return the countries with a population density less than 3%:

`pop_density := CountryData[ .., "Population Density" ]:`
`pop_density[ `Population Density` < 3 ];`

At this time, Maple’s built-in country data collection contains 126 data sets for 185 countries. When I built the example from my first post, I knew exactly the data sets that I wanted to use and I built a script to collect these into a larger data container. Attempting a similar task using Maple’s built-in data left me with the difficult decision of choosing which data sets to use in my next example.

So rather than choose between these available options, I built a user interface that lets you quickly browse through all of Maple’s collection of built-in data.

Using a couple of tricks that I found in the pages for Programmatic Content Generation, I built the interface pictured above. (I’ll give more details on the method that I used to construct the interface in my next post.)

This interface allows you to select from a list of countries, and visualize up to three variables of the country data with a BubblePlot. Using the preassigned defaults, you can select several countries and then visualize how their overall number of internet users has changed along with their gross domestic product. The BubblePlot visualization also adds a third dimension of information by adjusting the bubble size according to the relative population compared with the other selected countries.

Now you may notice that the list of available data sets is longer than the list of available options in each of the selection boxes. In order to be able to generate BubblePlot animations, I made an arbitrary choice to filter out any of the built-in data sets that were not of type TimeSeries. This is something that could easily be changed in the code. The choice of a BubblePlot could also be updated to be any other type of Statistical visualization with some additional modifications.

You can also interact with it via the MapleCloud: http://maplecloud.maplesoft.com/application.jsp?appId=5743882790764544

I’ll be following up this post with an in-depth post on how I authored the country selector interface using programmatic content generation.

## Why does Mean fail with it?...

Below is a custom distribution created based on a function that takes a parameter.

It is possible to create the custom distribution e.g. as D1 and then use it afterwards to find e.g. Mean, but it is not possible to call Mean directly with the creation of the distribution in the call.

Why is that ?

## How to get reverse probability?...

When defining a plain standard distributed stochastic variable X, and can find the probability of X <= 0.6 using the Probability function, but how can I get the value for a certain probability, as is done with the fsolve function for example below.

However, the fsolve used to defined Prev above appears to be a bad way to do it, since the Prev function can't for example plot.

Is there some build in way of doing reverse of Probability for a stocastical variable ?

## Maximum likelihood estimation...

Hi All

Assume that we have a stochastic model with following density function

and our goal is to estimate unknown parameters namely, alpha, beta, landa, mu and sigma by any available method especially maximum likelihood estimation method.
How can we do it with maple software?

Does the "MaximumLikelihoodEstimate" command can help?

or should i define Maximum Likelihood function first and then differentiate it according to unknown parameters?