Items tagged with statistics

I am finding that the PDF command does not behave as expected for Student's t-distribution in the Statistics package. Here is what I have tried so far:

>restart;
>with(Statistics):
>X := RandomVariable(StudentT(nu));
    
>PDF(X,0.5);
                         Dirac(X - 0.5)

Note that PDF(X,0.5) evaluates to Dirac(X - 0.5) instead of the density of Student's t-distribution at 0.5.

Any help in identifying the issue is greatly appreciated. I am running Maple 2015 on Linux Mint 17.

Thanks!

I want to solve the system of equations (1), (2), (3) under the assumption that x and y have the CDFs given in (4) and (5).
 

diff(L(lambda[1], lambda[2], alpha), lambda[1]) = n/lambda[1]+sum(x[i], i = 1 .. n)-(sum(2*x[i]*exp(lambda[1]*x[i])/(exp(lambda[1]*x[i])-1+alpha), i = 1 .. n))

(1)

diff(L(lambda[1], lambda[2], alpha), lambda[2]) = m/lambda[2]+sum(y[j], j = 1 .. m)-(sum(2*y[j]*exp(lambda[2]*y[j])/(exp(lambda[2]*y[j])-1+alpha), j = 1 .. m))

(2)

diff(L(lambda[1], lambda[2], alpha), alpha) = (n+m)/alpha-(sum(2/(exp(lambda[1]*x[i])-1+alpha), i = 1 .. n))-(sum(2/(exp(lambda[2]*y[j])-1+alpha), j = 1 .. m))

(3)

G(x, lambda[1], alpha) = 1-alpha/(exp(lambda[1]*x)-1+alpha)


(4)

G(y, lambda[2], alpha) = 1-alpha/(exp(lambda[2]*y)-1+alpha)

(5)



 

Download internet.mw
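Before handing such a system to a solver, it can help to check the score equations numerically. The following is only a sketch in Python (outside Maple), assuming the likelihood is built from the density obtained by differentiating the CDF in (4) and (5); under that assumption the numerator of the third term in (1) and (2) carries exp(lambda*t) for each observation t. The data and parameter values are hypothetical.

```python
import math

def log_lik(lam1, lam2, alpha, xs, ys):
    """Log-likelihood from the density g = dG/dt of the CDF in (4)/(5):
    g(t; lam, alpha) = alpha*lam*exp(lam*t) / (exp(lam*t) - 1 + alpha)**2."""
    def part(lam, data):
        return sum(math.log(alpha) + math.log(lam) + lam * t
                   - 2.0 * math.log(math.exp(lam * t) - 1.0 + alpha)
                   for t in data)
    return part(lam1, xs) + part(lam2, ys)

def score_lam(lam, alpha, data):
    """Hand-derived d(log L)/d(lambda) for one of the two samples."""
    return (len(data) / lam + sum(data)
            - sum(2.0 * t * math.exp(lam * t) / (math.exp(lam * t) - 1.0 + alpha)
                  for t in data))

def score_alpha(lam1, lam2, alpha, xs, ys):
    """Hand-derived d(log L)/d(alpha)."""
    return ((len(xs) + len(ys)) / alpha
            - sum(2.0 / (math.exp(lam1 * x) - 1.0 + alpha) for x in xs)
            - sum(2.0 / (math.exp(lam2 * y) - 1.0 + alpha) for y in ys))

# Hypothetical data and parameter values, for illustration only.
xs, ys = [0.5, 1.2, 0.8], [0.3, 0.9]
lam1, lam2, alpha, h = 1.1, 0.7, 2.0, 1e-6

# Central finite differences of the log-likelihood.
fd_lam1 = (log_lik(lam1 + h, lam2, alpha, xs, ys)
           - log_lik(lam1 - h, lam2, alpha, xs, ys)) / (2 * h)
fd_alpha = (log_lik(lam1, lam2, alpha + h, xs, ys)
            - log_lik(lam1, lam2, alpha - h, xs, ys)) / (2 * h)

print("d/d lambda1:", score_lam(lam1, alpha, xs), "vs finite difference", fd_lam1)
print("d/d alpha:  ", score_alpha(lam1, lam2, alpha, xs, ys), "vs finite difference", fd_alpha)
```

If the analytic and finite-difference values agree, the three equations can be set to zero and solved numerically (in Maple, for instance with fsolve and starting values).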

This application measures the reliability of a data set arranged in rows and columns using Cronbach's alpha, and at the same time measures the correlation between items using the Pearson correlation of even and odd items. It runs on Maple 18 through Maple 2017. It should be useful for the statistical part of a thesis.

In Spanish

StatisticsSocialCronbachPearson.zip

Lenin Araujo Castillo

Ambassador of Maple
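For readers who want to see the quantity the application above reports, here is a minimal Python sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of row totals). It is not taken from the attached worksheet; the function name and the score table are hypothetical.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (all rows the same length)."""
    k = len(rows[0])
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the per-respondent total score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical scores: 4 respondents, 3 items.
scores = [[3, 4, 3], [5, 5, 4], [2, 3, 2], [4, 4, 5]]
print(cronbach_alpha(scores))
```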

 

 

There is a horse-and-buggy ride around a small village which takes roughly 30 minutes. Here are example timings (in minutes) for 12 consecutive rides: [34, 29, 32, 32, 28, 28, 27, 28, 39, 24, 27, 27].

How can I create a Monte Carlo simulation graph that estimates future times based on the given data? Do I randomly pick numbers from the given list for the simulation, or generate random numbers based on the mean and standard deviation computed from the data?

What would be the best time to come back after 4 rides?
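Both options mentioned in the question are legitimate, and they can be compared side by side. The sketch below is in Python rather than Maple and is only an illustration: it resamples the observed times (a bootstrap) and, alternatively, draws from a normal distribution with the sample mean and standard deviation. The normality assumption, the number of simulations and the chosen percentiles are my own choices, not part of the question.

```python
import random
from statistics import mean, stdev

times = [34, 29, 32, 32, 28, 28, 27, 28, 39, 24, 27, 27]

def simulate_totals(n_rides, n_sims, rng, parametric=False):
    """Simulate the total duration of n_rides consecutive rides, n_sims times."""
    mu, sd = mean(times), stdev(times)
    totals = []
    for _ in range(n_sims):
        if parametric:
            # Assumption: ride times are roughly normal.
            rides = [rng.gauss(mu, sd) for _ in range(n_rides)]
        else:
            # Bootstrap: draw ride times from the observed list, with replacement.
            rides = [rng.choice(times) for _ in range(n_rides)]
        totals.append(sum(rides))
    return totals

rng = random.Random(0)
boot = sorted(simulate_totals(4, 10000, rng))
norm = sorted(simulate_totals(4, 10000, rng, parametric=True))
for name, t in (("bootstrap", boot), ("normal", norm)):
    print(name, "median:", t[len(t) // 2], "90th percentile:", t[int(0.9 * len(t))])
```

A histogram of either list gives the simulation graph; a high percentile of the totals is one reasonable reading of "the best time to come back" if the goal is not to miss the buggy.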

Let us consider 

Statistics:-Mode(Binomial(n, p));
                        floor((1 + n) p)

According to Wikipedia, this output is not correct. I simply have no words.
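To make the complaint concrete, the closed form can be compared with a brute-force search over the probability mass function. This is a small Python sketch with exact rational arithmetic; the helper name and the test cases are mine. The closed form floor((n+1)p) agrees with brute force in the generic case, but it misses the second mode when (n+1)p is an integer and it returns n+1 when p = 1.

```python
from fractions import Fraction
from math import comb, floor

def binomial_modes(n, p):
    """All k in 0..n that maximise the Binomial(n, p) probability mass function."""
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    top = max(pmf)
    return [k for k, v in enumerate(pmf) if v == top]

# Hypothetical test cases: generic, (n+1)p integer, and p = 1.
for n, p in [(10, Fraction(3, 10)), (3, Fraction(1, 2)), (5, Fraction(1))]:
    print(n, p, "closed form:", floor((n + 1) * p), "brute force:", binomial_modes(n, p))
```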


 

with(Statistics):

X := Statistics:-RandomVariable(Normal(0, 1)):

PDF(sin(X), t)

piecewise(t <= -1, 0, t < 1, 2^(1/2)*exp(-(1/2)*arcsin(t)^2)/(Pi^(1/2)*(-t^2+1)^(1/2)), 1 <= t, 0)

(1)

int(%, t = -1 .. 1)

2*erf((1/4)*Pi*2^(1/2))

(2)

evalf(%)

1.767540069

(3)



A dozen Maple bugs were recently submitted by me and others. Maplesoft has not responded and keeps a strategic silence. True merit is not afraid of criticism.

Download Bug_in_Statistics_PDF.mw
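A probability density must integrate to 1, so the value in (3) is enough to show that the returned expression is not a density. A quick Python check of the arithmetic (only the standard library is used; the variable names are mine) also shows that the reported mass is exactly twice the probability that X lies in the principal branch [-Pi/2, Pi/2] of arcsin:

```python
import math

# Total mass of the expression returned by PDF(sin(X), t), as computed in (2).
reported_mass = 2 * math.erf(math.pi * math.sqrt(2) / 4)

# P(-Pi/2 <= X <= Pi/2) for X ~ Normal(0, 1), using P(|X| <= a) = erf(a/sqrt(2)).
principal_branch_mass = math.erf((math.pi / 2) / math.sqrt(2))

print(reported_mass, principal_branch_mass, reported_mass / principal_branch_mass)
```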

restart; with(Statistics):
X := RandomVariable(Normal(0, 1)): Y := RandomVariable(Uniform(-2, 2)):
Probability(X*Y < 0);

crashes my computer after approximately 600 s. Mathematica produces 1/2 on the same computer in 0.078125 s.
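The value 1/2 follows from symmetry, assuming X and Y are independent: both are symmetric about 0, so the product is negative exactly half of the time. A Monte Carlo sanity check, sketched in Python with an arbitrary seed and sample size:

```python
import random

def estimate_prob_negative_product(n, rng):
    """Fraction of draws with X*Y < 0, X ~ Normal(0, 1), Y ~ Uniform(-2, 2), independent."""
    hits = sum(1 for _ in range(n) if rng.gauss(0, 1) * rng.uniform(-2, 2) < 0)
    return hits / n

print(estimate_prob_negative_product(100_000, random.Random(1)))
```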

Let us consider

with(Statistics):
X1 := RandomVariable(Normal(0, 1)):
X2 := RandomVariable(Normal(0, 1)):
X3 := RandomVariable(Uniform(0, 1)): 
X4 := RandomVariable(Uniform(0, 1)):
Z := max(X1, X2, X3, X4); CDF(Z, t);

int((1/2)*(_t0*Heaviside(_t0-1)-_t0*Heaviside(_t0)-Heaviside(1-_t0)*Heaviside(-_t0)+Heaviside(-_t0)+Heaviside(1-_t0)-1)*(1+erf((1/2)*_t0*2^(1/2)))*(2^(1/2)*Heaviside(_t0-1)*exp(-(1/2)*_t0^2)*_t0-2^(1/2)*Heaviside(_t0)*exp(-(1/2)*_t0^2)*_t0-2^(1/2)*Heaviside(-_t0)*Heaviside(1-_t0)*exp(-(1/2)*_t0^2)-Pi^(1/2)*undefined*erf((1/2)*_t0*2^(1/2))*Dirac(_t0)-Pi^(1/2)*undefined*erf((1/2)*_t0*2^(1/2))*Dirac(_t0-1)+2^(1/2)*Heaviside(-_t0)*exp(-(1/2)*_t0^2)+2^(1/2)*Heaviside(1-_t0)*exp(-(1/2)*_t0^2)-Pi^(1/2)*undefined*Dirac(_t0)-Pi^(1/2)*undefined*Dirac(_t0-1)+Pi^(1/2)*Heaviside(_t0-1)*erf((1/2)*_t0*2^(1/2))-Pi^(1/2)*Heaviside(_t0)*erf((1/2)*_t0*2^(1/2))-exp(-(1/2)*_t0^2)*2^(1/2)+Pi^(1/2)*Heaviside(_t0-1)-Pi^(1/2)*Heaviside(_t0))/Pi^(1/2), _t0 = -infinity .. t)

whereas Mathematica 11 produces the correct piecewise expression (see screen15.11.16.docx).

Edit: Mathematica output added.
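For reference, the expected answer has a short closed form. Assuming the four random variables are independent, the CDF of their maximum is the product of the individual CDFs, so CDF(Z, t) = Phi(t)^2 * U(t)^2, where Phi is the standard normal CDF and U is the Uniform(0, 1) CDF. A Python sketch (function names are mine):

```python
import math

def norm_cdf(t):
    """CDF of Normal(0, 1)."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def unif_cdf(t):
    """CDF of Uniform(0, 1)."""
    return min(max(t, 0.0), 1.0)

def max_cdf(t):
    """CDF of max(X1, X2, X3, X4) for independent X1, X2 ~ Normal(0, 1), X3, X4 ~ Uniform(0, 1)."""
    return norm_cdf(t) ** 2 * unif_cdf(t) ** 2

for t in (-1.0, 0.0, 0.5, 1.0, 2.0):
    print(t, max_cdf(t))
```

The result is 0 for t <= 0, t^2*Phi(t)^2 on [0, 1], and Phi(t)^2 for t >= 1, which is the piecewise shape one would hope to get back.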

Reference: the question "Quantile function", posted by Mikhail Drugov.

 

In the reference above, Mikhail raised a problem concerning the function Statistics:-Quantile.
A problem of the same kind exists for the function Mode.

In fact, Mode returns the value of the mode only for unimodal distributions; for "bimodal" distributions it does not work properly.
Theoretically, the mode is the value where the PDF reaches its global maximum (maximum maximorum). Except in very particular cases this maximum is unique, even if common language speaks of "bimodal distributions" instead of "two-bumped distributions".

Here is an example of a two-bumped distribution (Z) obtained by mixing two Gaussian distributions.
It has two bumps (near z=-2 and z=2) but only one mode (near z=-2).
It might be acceptable for Mode to return {-2, 2} (even though only -2 is the true mode), but Mode also returns the value of z at the local minimum of PDF(Z, z) between the two bumps, which is not correct at all.


 

restart:
with(Statistics):

X := RandomVariable(Normal(-2,1)):
Y := RandomVariable(Normal(2,1)):

r    := 0.4:
f__Z := unapply((1-r)*PDF(X,t)+r*PDF(Y,t), t);
Z    := Distribution(PDF=f__Z):

proc (t) options operator, arrow; .1692568750*2^(1/2)*exp(-(1/2)*(t+2)^2)+.1128379167*2^(1/2)*exp(-(1/2)*(t-2)^2) end proc

(1)

plot(PDF(Z,t), t=-4..4);

 

Mode(Z);

{-1.999102417, .1352239093, 1.997971857}

(2)

 


 

Download ProblemWithMode.mw
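The three values returned by Mode can be classified independently with a simple grid scan of the mixture density. This is a Python sketch, not a Maple fix; the grid range and step are arbitrary choices of mine.

```python
import math

def pdf(t, r=0.4):
    """Mixture density (1-r)*Normal(-2, 1) + r*Normal(2, 1), as in f__Z above."""
    phi = lambda u: math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    return (1.0 - r) * phi(t + 2.0) + r * phi(t - 2.0)

h = 0.001
grid = [-4.0 + i * h for i in range(8001)]   # covers [-4, 4]
vals = [pdf(t) for t in grid]

# Interior grid points that are strictly above (below) both neighbours.
maxima = [grid[i] for i in range(1, 8000) if vals[i - 1] < vals[i] > vals[i + 1]]
minima = [grid[i] for i in range(1, 8000) if vals[i - 1] > vals[i] < vals[i + 1]]
mode = max(grid, key=pdf)                    # location of the global maximum

print("local maxima:", maxima)
print("local minima:", minima)
print("mode:", mode)
```

On this grid the scan finds two local maxima and one local minimum, matching the three numbers in (2) to about three decimals, which suggests that Mode returns every stationary point of the PDF rather than the maximiser.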

 

Hello,

I need a bimodal distribution. Since I could not find any among the ones provided by Maple, I created a simple one:

with(Statistics):
U := Distribution(PDF = (proc (t) options operator, arrow; piecewise(t < -5, 0, t < 5, -(1/2000)*t^4+(9/1000)*t^2+7/80, 0) end proc)):
X := RandomVariable(U):

#Plotting PDF and CDF works fine:
plot(PDF(X, t), t = -infinity .. infinity);

plot(CDF(X, t), t = -infinity .. infinity)

However, plotting the quantile function does not work:

plot(Quantile(X, z), z = 0 .. 1);

The resulting plot has a decreasing part for z<1/2 and a discontinuity at z=1/2.
I can plot the quantile function correctly with
plot('Quantile'(X, z), z = 0 .. 1);

but I wonder why the first form does not work for such a simple distribution.
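One way to see what the correct curve looks like is to invert the CDF numerically. For this PDF the CDF on [-5, 5] is 1/2 - t^5/10000 + 3*t^3/1000 + 7*t/80, and since it is increasing, bisection is enough. A Python sketch (the function names and tolerance are mine):

```python
def cdf(t):
    """CDF of the distribution U above, obtained by integrating its PDF from -5 to t."""
    if t <= -5.0:
        return 0.0
    if t >= 5.0:
        return 1.0
    return 0.5 - t**5 / 10000.0 + 3.0 * t**3 / 1000.0 + 7.0 * t / 80.0

def quantile(z, tol=1e-12):
    """Numerical inverse of cdf on [-5, 5] by bisection, for 0 < z < 1."""
    lo, hi = -5.0, 5.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < z:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for z in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(z, quantile(z))
```

The quoted form 'Quantile'(X, z) presumably works for a similar reason: evaluation is delayed until plot supplies a numeric z, so no symbolic inverse of a degree-5 polynomial is attempted.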

 

 

I have a data point set:

x_val:=<250,300,350,397,451,497,547,593,647,691,745,788,840,897>:
y_val:=<0,0.5,2,6.3,23.2,48.7,71.2,83.4,90.1,92.8,94.7,95.7,96.9,97.8>:

I want to make a least-squares fit using this difficult function:
 

function:=x->1-exp(-(k*exp(-(E/(8.314*873.15))*((873.15/x)-1)))*(0.026/350))

but both Statistics[Fit]:
 

with(Statistics):fit_nelog:=Fit(1-exp(-(k*exp(-(E/(8.314*873.15))*((873.15/x)-1)))*(0.026/350)),<x_val|y_val>,x,parameternames=[k,E],output=[parametervector,residualsumofsquares]);

and DirectSearch[DataFit]:

with(DirectSearch):fit_nelog2:=DataFit(1-exp(-(k*exp(-(E/(8.314*873.15))*((873.15/x)-1)))*(0.026/350)),x_val,y_val,x,method=cdos);


give wrong values for the parameters k and E. The correct parameter values were obtained with Excel Solver:

k=27843.3551042397

E=68.4

Approximately correct parameters were obtained when fitting the logarithmic form of the function.
How can I obtain the correct parameter values in Maple using the given form of the function?

Hi All,

I have a function f(x,y,z) = exp(-x^2 -y^2 - z^4) and would like to plot the probability density in real space. One method would be to randomly sample points in a grid based on f(x,y,z). The function f(x,y,z) is clearly peaked around x=y=z=0, so you would expect many points to lie around there. The plot would then look like a clump near (0,0,0) which gets less dense away from (0,0,0).

In the worksheet below, I sampled points from the Uniform distribution to fill in the 3D plot. I would like these points to be sampled from f instead, but am not sure how to do this.

Any help is appreciated,

restart;

with(Statistics):

R := 10; # x-axis size
N := 100; # Number of points to sample

10

 

100

(1)

# Unnormalized probability distribution

f := (x,y,z) -> exp(-x^2 -y^2 - z^2);

proc (x, y, z) options operator, arrow; exp(-x^2-y^2-z^2) end proc

(2)

# Clearly f is peaked at (0,0,0) and decays. Therefore I want to plot many points near (0,0,0), and fewer points away from (0,0,0)

plot3d(f(x,y,0), x = -1..1, y = -1..1);

 

X := Sample(Uniform(-R, R), N):

Y := Sample(Uniform(-R, R), N):
Z := Sample(Uniform(-R, R), N):
XYZ := Matrix([[X], [Y], [Z]])^%T;

XYZ := Matrix(100, 3, {...}, datatype = float[8])   (100 x 3 Matrix of uniformly sampled coordinates in [-10, 10]; entries omitted)

(3)

ScatterPlot3D(XYZ, color = blue, symbolsize = 20);

 

 

 

 

 


 

Download Sample_Test.m
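One standard way to draw points from an unnormalized density such as f is rejection sampling: propose a point uniformly in a box, then keep it with probability f(x,y,z)/max(f). The sketch below is in Python, not Maple, and uses the f from the worksheet (with z^2); swapping in z^4 only changes the body of f. The box half-width of 3 is my own choice, since with R = 10 almost every proposal would be rejected while f is already about exp(-9) at distance 3.

```python
import math
import random

def f(x, y, z):
    """Unnormalized density from the worksheet; its maximum is f(0, 0, 0) = 1."""
    return math.exp(-x * x - y * y - z * z)

def rejection_sample(n, half_width, rng):
    """Draw n points with density proportional to f inside the cube [-half_width, half_width]^3."""
    points = []
    while len(points) < n:
        x, y, z = (rng.uniform(-half_width, half_width) for _ in range(3))
        # Accept with probability f/max(f); here max(f) = 1.
        if rng.random() < f(x, y, z):
            points.append((x, y, z))
    return points

rng = random.Random(42)
points = rejection_sample(100, 3.0, rng)
print(points[:3])
```

The accepted points can then be passed to a 3D scatter plot in place of the uniform XYZ matrix; they cluster near the origin and thin out away from it.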

 

Hi everyone,

I've not been able to figure this one out. Say I have an expression dependent on two random variables, like this:

C:=A+B

where A and B are random variables, each following a specific distribution.

If I ask for a Sample of C

Sample(C,1)

Maple will sample A, sample B and compute C. But say that A and B are correlated (with a correlation coefficient of 0.8). How do I define this?

Thanks in advance
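If A and B are both normal, one common construction builds B from A and an independent normal draw: with Z1, Z2 independent standard normals, A = Z1 and B = rho*Z1 + sqrt(1 - rho^2)*Z2 have correlation rho. The Python sketch below illustrates this; the normality of A and B is my assumption (the question leaves the distributions open), and other marginals would need a different approach such as a copula.

```python
import math
import random

def correlated_normals(n, rho, rng):
    """n pairs (a, b) of standard normals with correlation rho (Cholesky construction)."""
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        a = z1
        b = rho * z1 + math.sqrt(1.0 - rho * rho) * z2
        pairs.append((a, b))
    return pairs

def pearson(pairs):
    """Sample Pearson correlation coefficient of a list of (a, b) pairs."""
    n = len(pairs)
    ma = sum(a for a, _ in pairs) / n
    mb = sum(b for _, b in pairs) / n
    cov = sum((a - ma) * (b - mb) for a, b in pairs)
    va = sum((a - ma) ** 2 for a, _ in pairs)
    vb = sum((b - mb) ** 2 for _, b in pairs)
    return cov / math.sqrt(va * vb)

pairs = correlated_normals(20000, 0.8, random.Random(7))
c_samples = [a + b for a, b in pairs]   # samples of C = A + B
print(pearson(pairs))
```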

 

Aggregate statistics are calculated by splitting the rows of a DataFrame by each factor in a given column into subsets and computing summary statistics for each of these subsets.

The following is a short example of how the Aggregate command is used to compute aggregate statistics for a DataFrame with housing data:

To begin, we construct a DataFrame with housing data: The first column has number of bedrooms, the second has the area in square feet, the third has price.

bedrooms := <3, 4, 2, 4, 3, 2, 2, 3, 4, 4, 2, 4, 4, 3, 3>:
area := <1130, 1123, 1049, 1527, 907, 580, 878, 1075,
1040, 1295, 1100, 995, 908, 853, 856>:
price := <114700, 125200, 81600, 127400, 88500, 59500, 96500, 113300,
104400, 136600, 80100, 128000, 115700, 94700, 89400>:
HouseSalesData := DataFrame([bedrooms, area, price], columns = [Bedrooms, Area, Price]);

Note that the Bedrooms column has three distinct levels: 2, 3, and 4.

convert(HouseSalesData[Bedrooms], set);

The following returns the mean of all other columns for each distinct level in the column, Bedrooms:

Aggregate(HouseSalesData, Bedrooms);

Adding the columns option controls which columns are returned.

Aggregate(HouseSalesData, Bedrooms, columns = [Price])

Additionally, the tally option returns a tally for each of the levels.

Aggregate(HouseSalesData, Bedrooms, tally)

The function option allows for the specification of any command that can be applied to a DataSeries. For example, the Statistics:-Median command computes the median for each of the levels of Bedrooms.

Aggregate(HouseSalesData, Bedrooms, function = Statistics:-Median);

By default, Aggregate uses the SplitByColumn command to create a separate sub-DataFrame for every discrete level in the column given by bycolumn.

with(Statistics);
ByRooms := SplitByColumn(HouseSalesData, Bedrooms);

We can create box plots of the price for subgroups of sales defined by number of bedrooms.

BoxPlot( map( (m)->m[Price], ByRooms), 
deciles=false,
datasetlabels=["2 bdrms", "3 bdrms", "4 bdrms"],
color=["Red", "Purple", "Blue"]);

 

I have recorded a short video that walks through this example here: https://youtu.be/e0pqCMyO3ks

The worksheet for this example can be downloaded here: Aggregate.mw
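For readers who want to see the split-then-summarize idea spelled out, here is a rough Python equivalent of the housing example using only the standard library. It is an illustration of the concept, not of how Aggregate is implemented; the dictionary layout and key names are my own.

```python
from collections import defaultdict
from statistics import mean, median

bedrooms = [3, 4, 2, 4, 3, 2, 2, 3, 4, 4, 2, 4, 4, 3, 3]
area = [1130, 1123, 1049, 1527, 907, 580, 878, 1075,
        1040, 1295, 1100, 995, 908, 853, 856]
price = [114700, 125200, 81600, 127400, 88500, 59500, 96500, 113300,
         104400, 136600, 80100, 128000, 115700, 94700, 89400]

# Split the rows by the level found in the Bedrooms column.
groups = defaultdict(list)
for b, a, p in zip(bedrooms, area, price):
    groups[b].append((a, p))

# Compute summary statistics for each subset.
summary = {}
for level in sorted(groups):
    areas = [a for a, _ in groups[level]]
    prices = [p for _, p in groups[level]]
    summary[level] = {
        "tally": len(prices),
        "mean_area": mean(areas),
        "mean_price": mean(prices),
        "median_price": median(prices),
    }

for level, stats in summary.items():
    print(level, stats)
```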

Hi,

I did some hypothesis testing exercises and cross-checked the results with Maple. I used the following vectors for an unpaired test:

a := [88, 89, 92, 90, 90];
b := [92, 90, 91, 89, 91];

I ended up with the following solution:

HFloat(1.5225682336585966)
HFloat(-3.122568233658591)
for a 0.95 confidence interval.

 

Using

TwoSampleTTest(a, b, 0, confidence = .95, summarize = embed)

and

TwoSampleTTest(a, b, 0, confidence = .975, summarize = embed)

I get the following results:

-2.75177 .. 1.15177

-3.13633 .. 1.53633

respectively. I cannot explain the discrepancy.

 

Best regards,

Oliver

 

PS:

Maple code, in case the files won't be attached.

 

 

Unpaired t Test
restart;
Unpaired t-test dataset
a := [88, 89, 92, 90, 90];
b := [92, 90, 91, 89, 91];
The se² estimate (squared standard error of the difference of the means) is given by:
se² = var(a)/Na + var(b)/Nb (the covariance term vanishes because the two samples are independent), that is
se² = sigma[a]^2/Na + sigma[b]^2/Nb
with Na, Nb being the lengths of vectors a and b respectively.
sigma[a]^2 and sigma[b]^2 are approximated by the sample variances S[a]^2 and S[b]^2, with S[X]^2 defined as
S[X]^2 = sum((X[i] - sum(X[j], j = 1 .. N)/N)^2, i = 1 .. N)/(N - 1)
with(Statistics);
Sa := Variance(a);
                   HFloat(2.1999999999999993)
Sb := Variance(b);
                   HFloat(1.3000000000000003)
Now we are ready to do hypothesis testing (0.95).
We have (with k=min(Na,Nb)=5):
C = mean(a)-mean(b); Deviation = t_(alpha/2, k-1)*sqrt(Sa/k+Sb/k);
c := Mean(a)-Mean(b); deviation := 2.776*sqrt((1/5)*Variance(a)+(1/5)*Variance(b));
                  HFloat(-0.7999999999999972)
                   HFloat(2.3225682336585938)
upperlimit := c+deviation; lowerlimit := c-deviation;
                   HFloat(1.5225682336585966)
                   HFloat(-3.122568233658591)

Running the built-in Student's t-test
TwoSampleTTest(a, b, 0, confidence = .95, summarize = embed);
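The numbers above are consistent with the two computations using the same standard error but different degrees of freedom. The hand calculation uses the critical value 2.776, which is the 0.975 quantile of Student's t with 4 degrees of freedom. The Python sketch below computes the Welch-Satterthwaite degrees of freedom for these data and the critical value implied by Maple's 0.95 interval. That Maple uses Welch's approximation here is an inference from the numbers, not something confirmed from the documentation.

```python
import math
from statistics import mean, variance

a = [88, 89, 92, 90, 90]
b = [92, 90, 91, 89, 91]

va = variance(a) / len(a)            # squared standard error of mean(a)
vb = variance(b) / len(b)            # squared standard error of mean(b)
se = math.sqrt(va + vb)              # standard error of mean(a) - mean(b)

# Welch-Satterthwaite approximation to the degrees of freedom.
welch_df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))

# Critical value implied by Maple's reported 0.95 interval -2.75177 .. 1.15177.
maple_half_width = (1.15177 - (-2.75177)) / 2
implied_crit = maple_half_width / se

print("difference of means:", mean(a) - mean(b))
print("standard error:", se)
print("Welch degrees of freedom:", welch_df)
print("critical value implied by Maple's interval:", implied_crit)
print("critical value used by hand (4 degrees of freedom):", 2.776)
```

The implied critical value lies between the tabulated 0.975 quantiles for 8 and 7 degrees of freedom (2.306 and 2.365), which fits roughly 7.5 degrees of freedom rather than 4. Note also that confidence = .975 requests a 97.5% interval, so its closeness to the hand-computed limits appears to be a coincidence.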

 

 
