# doc-cache created by Octave 10.3.0
# name: cache
# type: cell
# rows: 3
# columns: 137
# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
adtest


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 4649
 -- statistics: H = adtest (X)
 -- statistics: H = adtest (X, NAME, VALUE)
 -- statistics: [H, PVAL] = adtest (...)
 -- statistics: [H, PVAL, ADSTAT, CV] = adtest (...)

     Anderson-Darling goodness-of-fit hypothesis test.

     ‘H = adtest (X)’ returns a test decision for the null hypothesis
     that the data in vector X is from a population with a normal
     distribution, using the Anderson-Darling test.  The alternative
     hypothesis is that X is not from a population with a normal
     distribution.  The result H is 1 if the test rejects the null
     hypothesis at the 5% significance level, or 0 otherwise.

     ‘H = adtest (X, NAME, VALUE)’ returns a test decision for the
     Anderson-Darling test with additional options specified by one or
     more Name-Value pair arguments.  For example, you can specify a
     null distribution other than normal, or select an alternative
     method for calculating the p-value, such as a Monte Carlo
     simulation.

     The following parameters can be passed as Name-Value pair
     arguments.

     Name               Description
     --------------------------------------------------------------------------
     "Distribution"     The distribution being tested for.  It tests whether
                        X could have come from the specified distribution.
                        There are two choices available for passing
                        distribution parameters:

        • One of the following character strings: "norm", "exp",
          "ev", "logn", "weibull", for defining the 'normal',
          'exponential', 'extreme value', 'lognormal', or 'Weibull'
          distribution family, respectively.  In this case, X is tested
          against a composite hypothesis for the specified distribution
          family and the required distribution parameters are estimated
          from the data in X.  The default is "norm".

        • A cell array defining a distribution in which the first cell
          contains a char string with the distribution name, as
          mentioned above, and the consecutive cells containing all
          specified parameters of the null distribution.  In this case,
          X is tested against a simple hypothesis.

     "Alpha"            Significance level alpha for the test.  Any scalar
                        numeric value between 0 and 1.  The default is 0.05
                        corresponding to the 5% significance level.
                        
     "MCTol"            Monte-Carlo standard error for the p-value, PVAL,
                        which must be a positive scalar numeric value.  In
                        this case, an approximation for the p-value is
                        computed directly, using Monte-Carlo simulations.
                        
     "Asymptotic"       Method for calculating the p-value of the
                        Anderson-Darling test, which can be either true or
                        false logical value.  If you specify 'true', adtest
                        estimates the p-value using the limiting
                        distribution of the Anderson-Darling test statistic.
                        If you specify 'false', adtest calculates the
                        p-value based on an analytical formula.  For sample
                        sizes greater than 120, the limiting distribution
                        estimate is likely to be more accurate than the
                        small sample size approximation method.

        • If you specify a distribution family with unknown parameters
          for the distribution Name-Value pair (i.e.  composite
          distribution hypothesis test), the "Asymptotic" option must be
          false.

        • If you use "MCTol" for calculating the p-value using a Monte
          Carlo simulation, the "Asymptotic" option must also be false.

     ‘[H, PVAL] = adtest (...)’ also returns the p-value, PVAL, of the
     Anderson-Darling test, using any of the input arguments from the
     previous syntaxes.

     ‘[H, PVAL, ADSTAT, CV] = adtest (...)’ also returns the test
     statistic, ADSTAT, and the critical value, CV, for the
     Anderson-Darling test.

     The Anderson-Darling test statistic belongs to the family of
     Quadratic Empirical Distribution Function statistics, which are
     based on the weighted sum of the difference [Fn(x)-F(x)]^2 over the
     ordered sample values X1 < X2 < ... < Xn, where F is the
     hypothesized continuous distribution and Fn is the empirical CDF
     based on the data sample with n sample points.

     See also: kstest.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 49
Anderson-Darling goodness-of-fit hypothesis test.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
anova1


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 3036
 -- statistics: P = anova1 (X)
 -- statistics: P = anova1 (X, GROUP)
 -- statistics: P = anova1 (X, GROUP, DISPLAYOPT)
 -- statistics: P = anova1 (X, GROUP, DISPLAYOPT, VARTYPE)
 -- statistics: [P, ATAB] = anova1 (X, ...)
 -- statistics: [P, ATAB, STATS] = anova1 (X, ...)

     Perform a one-way analysis of variance (ANOVA) for comparing the
     means of two or more groups of data under the null hypothesis that
     the groups are drawn from distributions with the same mean.  For
     planned contrasts and/or diagnostic plots, use anovan instead.

     anova1 can take up to four input parameters:

        • X contains the data and it can either be a vector or matrix.
          If X is a matrix, then each column is treated as a separate
          group.  If X is a vector, then the GROUP argument is
          mandatory.

        • GROUP contains the names for each group.  If X is a matrix,
          then GROUP can either be a cell array of strings or a
          character array, with one row per column of X.  If you want to
          omit this argument, enter an empty array ([]).  If X is a
          vector, then GROUP must be a vector of the same length, or a
          string array or cell array of strings with one row for each
          element of X.  X values corresponding to the same value of
          GROUP are placed in the same group.

        • DISPLAYOPT is an optional parameter for displaying the groups
          contained in the data in a boxplot.  If omitted, it is 'on' by
          default.  If group names are defined in GROUP, these are used
          to identify the groups in the boxplot.  Use 'off' to omit
          displaying this figure.

        • VARTYPE is an optional parameter used for indicating whether
          the groups can be assumed to come from populations with equal
          variance.  When VARTYPE is "equal" the variances are assumed
          to be equal (this is the default).  When VARTYPE is "unequal"
          the population variances are not assumed to be equal and
          Welch's ANOVA test is used instead.

     anova1 can return up to three output arguments:

        • P is the p-value of the null hypothesis that all group means
          are equal.

        • ATAB is a cell array containing the results in a standard
          ANOVA table.

        • STATS is a structure containing statistics useful for
          performing a multiple comparison of means with the MULTCOMPARE
          function.

     If anova1 is called without any output arguments, then it prints
     the results in a one-way ANOVA table to the standard output.  It is
     also printed when DISPLAYOPT is 'on'.

     Examples:

          x = meshgrid (1:6);
          x = x + normrnd (0, 1, 6, 6);
          anova1 (x, [], 'off');
          [p, atab] = anova1(x);

          x = ones (50, 4) .* [-2, 0, 1, 5];
          x = x + normrnd (0, 2, 50, 4);
          groups = {"A", "B", "C", "D"};
          anova1 (x, groups);

     See also: anova2, anovan, multcompare.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Perform a one-way analysis of variance (ANOVA) for comparing the means
of two...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
anova2


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2519
 -- statistics: P = anova2 (X, REPS)
 -- statistics: P = anova2 (X, REPS, DISPLAYOPT)
 -- statistics: P = anova2 (X, REPS, DISPLAYOPT, MODEL)
 -- statistics: [P, ATAB] = anova2 (...)
 -- statistics: [P, ATAB, STATS] = anova2 (...)

     Performs two-way factorial (crossed) or a nested analysis of
     variance (ANOVA) for balanced designs.  For unbalanced factorial
     designs, diagnostic plots and/or planned contrasts, use anovan
     instead.

     anova2 requires two input arguments with an optional third and
     fourth:

        • X contains the data and it must be a matrix of at least two
          columns and two rows.

        • REPS is the number of replicates for each combination of
          factor groups.

        • DISPLAYOPT is an optional parameter for displaying the ANOVA
          table, when it is 'on' (default) and suppressing the display
          when it is 'off'.

        • MODEL is an optional parameter to specify the model type as
          either:

             • "interaction" or "full" (default): compute both main
               effects and their interaction

             • "linear": compute both main effects without an
               interaction.  When REPS > 1 the test is suitable for a
               balanced randomized block design.  When REPS == 1, the
               test becomes a One-way Repeated Measures (RM)-ANOVA with
               Greenhouse-Geisser correction to the column factor
               degrees of freedom to make the test robust to violations
               of sphericity

             • "nested": treat the row factor as nested within columns.
               Note that the row factor is considered a random factor in
               the calculation of the statistics.

     anova2 returns up to three output arguments:

        • P is the p-value of the null hypothesis that all group means
          are equal.

        • ATAB is a cell array containing the results in a standard
          ANOVA table.

        • STATS is a structure containing statistics useful for
          performing a multiple comparison of means with the MULTCOMPARE
          function.

     If anova2 is called without any output arguments, then it prints
     the results in a two-way ANOVA table to the standard output as if
     DISPLAYOPT is 'on'.

     Examples:

          load popcorn;
          anova2 (popcorn, 3);

          [p, anovatab, stats] = anova2 (popcorn, 3, "off");
          disp (p);

     See also: anova1, anovan, multcompare.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Performs two-way factorial (crossed) or a nested analysis of variance
(ANOVA)...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
anovan


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8723
 -- statistics: P = anovan (Y, GROUP)
 -- statistics: P = anovan (Y, GROUP, NAME, VALUE)
 -- statistics: [P, ATAB] = anovan (...)
 -- statistics: [P, ATAB, STATS] = anovan (...)
 -- statistics: [P, ATAB, STATS, TERMS] = anovan (...)

     Perform a multi (N)-way analysis of (co)variance (ANOVA or ANCOVA)
     to evaluate the effect of one or more categorical or continuous
     predictors (i.e.  independent variables) on a continuous outcome
     (i.e.  dependent variable).  The algorithms used make ‘anovan’
     suitable for balanced or unbalanced factorial (crossed) designs.
     By default, ‘anovan’ treats all factors as fixed.  Examples of
     function usage can be found by entering the command ‘demo anovan’.
     A bootstrap resampling variant of this function, ‘bootlm’, is
     available in the statistics-resampling package and has similar
     usage.

     Data is a single vector Y with groups specified by a corresponding
     matrix or cell array of group labels GROUP, where each column of
     GROUP has the same number of rows as Y.  For example, if ‘Y = [23;
     27; 31; 29; 30; 32]; GROUP = [1, 2; 1, 3; 1, 2; 2, 3; 2, 3; 3, 2];’
     then observation 23 was measured under conditions 1,2; observation
     27 was measured under conditions 1,3; and so on.  If the GROUP
     provided is empty, then the linear model is fit with just the
     intercept (no predictors).

     ‘anovan’ can take a number of optional parameters as name-value
     pairs.

     ‘[...] = anovan (Y, GROUP, "continuous", CONTINUOUS)’

        • CONTINUOUS is a vector of indices indicating which of the
          columns (i.e.  factors) in GROUP should be treated as
          continuous predictors rather than as categorical predictors.
          The relationship between continuous predictors and the outcome
          should be linear.

     ‘[...] = anovan (Y, GROUP, "random", RANDOM)’

        • RANDOM is a vector of indices indicating which of the columns
          (i.e.  factors) in GROUP should be treated as random effects
          rather than fixed effects.  Octave ‘anovan’ provides only
          basic support for random effects.  Specifically, since all
          F-statistics in ‘anovan’ are calculated using the mean-squared
          error (MSE), any interaction terms containing a random effect
          are dropped from the model term definitions and their
          associated variance is pooled with the residual, unexplained
          variance making up the MSE. In effect, the model then fitted
          equates to a linear mixed model with random intercept(s).
          Variable names for random factors are appended with a '
          symbol.

     ‘[...] = anovan (Y, GROUP, "model", MODELTYPE)’

        • MODELTYPE can be specified as one of the following:

             • "linear" (default): compute N main effects with no
               interactions.

             • "interaction": compute N effects and N*(N-1)/2 two-factor
               interactions

             • "full": compute the N main effects and interactions at
               all levels

             • a scalar integer: representing the maximum interaction
               order

             • a matrix of term definitions: each row is a term and
               each column is a factor

               -- Example:
               A two-way ANOVA with interaction would be: [1 0; 0 1; 1 1]

     ‘[...] = anovan (Y, GROUP, "sstype", SSTYPE)’

        • SSTYPE can be specified as one of the following:

             • 1: Type I sequential sums-of-squares.

             • 2 or "h": Type II partially sequential (or hierarchical)
               sums-of-squares

             • 3 (default): Type III partial, constrained or marginal
               sums-of-squares

     ‘[...] = anovan (Y, GROUP, "varnames", VARNAMES)’

        • VARNAMES must be a cell array of strings with each element
          containing a factor name for each column of GROUP.  By default
          (if not passed as optional argument), VARNAMES are
          "X1","X2","X3", etc.

     ‘[...] = anovan (Y, GROUP, "alpha", ALPHA)’

        • ALPHA must be a scalar value between 0 and 1 requesting
          100*(1-ALPHA)% confidence bounds for the regression
          coefficients returned in STATS.coeffs (default 0.05 for 95%
          confidence).

     ‘[...] = anovan (Y, GROUP, "display", DISPOPT)’

        • DISPOPT can be either "on" (default) or "off" and controls the
          display of the model formula, table of model parameters, the
          ANOVA table and the diagnostic plots.  The F-statistic and
          p-values are formatted in APA-style.  To avoid p-hacking, the
          table of model parameters is only displayed if we set planned
          contrasts (see below).

     ‘[...] = anovan (Y, GROUP, "contrasts", CONTRASTS)’

        • CONTRASTS can be specified as one of the following:

             • A string corresponding to one of the built-in contrasts
               listed below:

                  • "simple" or "anova" (default): Simple (ANOVA)
                    contrast coding.  (The first level appearing in the
                    GROUP column is the reference level)

                  • "poly": Polynomial contrast coding for trend
                    analysis.

                  • "helmert": Helmert contrast coding: the difference
                    between each level with the mean of the subsequent
                    levels.

                  • "effect": Deviation effect coding.  (The first level
                    appearing in the GROUP column is omitted).

                  • "sdif" or "sdiff": Successive differences contrast
                    coding: the difference between each level with the
                    previous level.

                  • "treatment": Treatment contrast (or dummy) coding.
                    (The first level appearing in the GROUP column is
                    the reference level).  These contrasts are not
                    compatible with SSTYPE = 3.

             • A matrix containing a custom contrast coding scheme (i.e.
               the generalized inverse of contrast weights).  Rows in
               the contrast matrices correspond to factor levels in the
               order that they first appear in the GROUP column.  The
               matrix must contain the same number of columns as there
               are the number of factor levels minus one.

          If the anovan model contains more than one factor and a
          built-in contrast coding scheme was specified, then those
          contrasts are applied to all factors.  To specify different
          contrasts for different factors in the model, CONTRASTS should
          be a cell array with the same number of cells as there are
          columns in GROUP.  Each cell should define contrasts for the
          respective column in GROUP by one of the methods described
          above.  If cells are left empty, then the default contrasts
          are applied.  Contrasts for cells corresponding to continuous
          factors are ignored.

     ‘[...] = anovan (Y, GROUP, "weights", WEIGHTS)’

        • WEIGHTS is an optional vector of weights to be used when
          fitting the linear model.  Weighted least squares (WLS) is
          used with weights (that is, minimizing ‘sum (WEIGHTS *
          RESIDUALS .^ 2)’); otherwise ordinary least squares (OLS) is
          used (default is empty for OLS).

     ‘anovan’ can return up to four output arguments:

     ‘P = anovan (...)’ returns a vector of p-values, one for each term.

     ‘[P, ATAB] = anovan (...)’ returns a cell array containing the
     ANOVA table.

     ‘[P, ATAB, STATS] = anovan (...)’ returns a structure containing
     additional statistics, including degrees of freedom and effect
     sizes for each term in the linear model, the design matrix, the
     variance-covariance matrix, (weighted) model residuals, and the
     mean squared error.  The columns of STATS.coeffs (from
     left-to-right) report the model coefficients, standard errors,
     lower and upper 100*(1-alpha)% confidence interval bounds,
     t-statistics, and p-values relating to the contrasts.  The number
     appended to each term name in STATS.coeffnames corresponds to the
     column number in the relevant contrast matrix for that factor.  The
     STATS structure can be used as input for ‘multcompare’.

     ‘[P, ATAB, STATS, TERMS] = anovan (...)’ returns the model term
     definitions.

     See also: anova1, anova2, multcompare, fitlm.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Perform a multi (N)-way analysis of (co)variance (ANOVA or ANCOVA) to
evaluat...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 4
bar3


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 4329
 -- statistics: bar3 (Z)
 -- statistics: bar3 (Y, Z)
 -- statistics: bar3 (..., WIDTH)
 -- statistics: bar3 (..., STYLE)
 -- statistics: bar3 (..., COLOR)
 -- statistics: bar3 (..., NAME, VALUE)
 -- statistics: bar3 (AX, ...)
 -- statistics: P = bar3 (...)

     Plot a 3D bar graph.

     ‘bar3 (Z)’ plots 3D bar graph for the elements of Z.  Each bar
     corresponds to an element in Z, which can be a scalar, vector, or
     2D matrix.  By default, each column in Z is considered as a series
     and it is handled as a distinct series of bars.  When Z is a
     vector, unlike MATLAB, which plots it as a single series of bars,
     Octave discriminates between a row and column vector of Z.  Hence,
     when Z is column vector, it is plotted as a single series of bars
     (same color), whereas when Z is row vector, each bar is plotted as
     a different group (different colors).  For an MxN matrix, the
     function plots the bars corresponding to each row on the y-axis
     ranging from 1 to M and each column on the x-axis ranging from 1 to
     N.

     ‘bar3 (Y, Z)’ plots a 3D bar graph of the elements in Z at the
     y-values specified in Y.  Note, though, that Y only affects the
     tick names along the y-axis rather than the actual values.  If you
     want to specify non-numerical values for Y, you can specify it with
     the paired NAME/VALUE syntax shown below.

     ‘bar3 (..., WIDTH)’ sets the width of the bars along the x- and
     y-axes and controls the separation of bars among each other.  WIDTH
     can take any value in the range (0,1].  By default, WIDTH is 0.8
     and the bars have a small separation.  If WIDTH is 1, the bars
     touch one another.  Alternatively, you can define WIDTH as a two-
     element vector using the paired NAME/VALUE syntax shown below, in
     which case you can control the bar separation along each axis
     independently.

     ‘bar3 (..., STYLE)’ specifies the style of the bars, where STYLE
     can be 'detached', 'grouped', or 'stacked'.  The default style is
     'detached'.

     ‘bar3 (..., COLOR)’ displays all bars using the color specified by
     COLOR.  For example, use 'red' or 'r' to specify all red bars.
     When you want to specify colors for several groups, COLOR can be a
     cellstr vector with each element specifying the color of each
     group.  COLOR can also be specified as a numerical Mx3 matrix,
     where each row corresponds to a RGB value with its elements in the
     range [0,1].  If only one color is specified, then it applies to
     all bars.  If the number of colors equals the number of groups,
     then each color is applied to each group.  If the number of colors
     equals the number of elements in Z, then each individual bar is
     assigned the particular color.  You can also define COLOR using the
     paired NAME/VALUE syntax shown below.

     ‘bar3 (..., NAME, VALUE)’ specifies one or more of the following
     name/value pairs:

          Name           Value
     ---------------------------------------------------------------------------
          "width"        A two-element vector specifying the width of the
                         bars along the x- and y-axes, respectively.  Each
                         element must be in the range (0,1].
                         
          "color"        A character or a cellstr vector, or a numerical Mx3
                         matrix following the same conventions as the COLOR
                         input argument.
                         
          "xlabel"       A cellstr vector specifying the group names along
                         the x-axis.
                         
          "ylabel"       A cellstr vector specifying the names of the bars in
                         the same series along the y-axis.

     ‘bar3 (AX, ...)’ can also take an axes handle AX as a first
     argument in which case it plots into the axes specified by AX
     instead of into the current axes specified by ‘gca ()’.  The
     optional argument AX can precede any of the input argument
     combinations in the previous syntaxes.

     ‘P = bar3 (...)’ returns a patch handle P, which can be used to set
     properties of the bars after displaying the 3D bar graph.

     See also: boxplot, hist3.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 20
Plot a 3D bar graph.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 5
bar3h


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 4356
 -- statistics: bar3h (Y)
 -- statistics: bar3h (Z, Y)
 -- statistics: bar3h (..., WIDTH)
 -- statistics: bar3h (..., STYLE)
 -- statistics: bar3h (..., COLOR)
 -- statistics: bar3h (..., NAME, VALUE)
 -- statistics: bar3h (AX, ...)
 -- statistics: P = bar3h (...)

     Plot a horizontal 3D bar graph.

     ‘bar3h (Y)’ plots 3D bar graph for the elements of Y.  Each bar
     corresponds to an element in Y, which can be a scalar, vector, or
     2D matrix.  By default, each column in Y is considered as a series
     and it is handled as a distinct series of bars.  When Y is a
     vector, unlike MATLAB, which plots it as a single series of bars,
     Octave distinguishes between a row and column vector of Y.  Hence,
     when Y is column vector, it is plotted as a single series of bars
     (same color), whereas when Y is row vector, each bar is plotted as
     a different group (different colors).  For an MxN matrix, the
     function plots the bars corresponding to each row on the z-axis
     ranging from 1 to M and each column on the x-axis ranging from 1 to
     N.

     ‘bar3h (Z, Y)’ plots a 3D bar graph of the elements in Y at the
     z-values specified in Z.  Note, though, that Z only affects the
     tick names along the z-axis rather than the actual values.  If you
     want to specify non-numerical values for Z, you can specify it with
     the paired NAME/VALUE syntax shown below.

     ‘bar3h (..., WIDTH)’ sets the width of the bars along the x- and
     z-axes and controls the separation of bars among each other.  WIDTH
     can take any value in the range (0,1].  By default, WIDTH is 0.8
     and the bars have a small separation.  If WIDTH is 1, the bars
     touch one another.  Alternatively, you can define WIDTH as a two-
     element vector using the paired NAME/VALUE syntax shown below, in
     which case you can control the bar separation along each axis
     independently.

     ‘bar3h (..., STYLE)’ specifies the style of the bars, where STYLE
     can be 'detached', 'grouped', or 'stacked'.  The default style is
     'detached'.

     ‘bar3h (..., COLOR)’ displays all bars using the color specified by
     COLOR.  For example, use 'red' or 'r' to specify all red bars.
     When you want to specify colors for several groups, COLOR can be a
     cellstr vector with each element specifying the color of each
     group.  COLOR can also be specified as a numerical Mx3 matrix,
     where each row corresponds to a RGB value with its elements in the
     range [0,1].  If only one color is specified, then it applies to
     all bars.  If the number of colors equals the number of groups,
     then each color is applied to each group.  If the number of colors
     equals the number of elements in Y, then each individual bar is
     assigned the particular color.  You can also define COLOR using the
     paired NAME/VALUE syntax shown below.

     ‘bar3h (..., NAME, VALUE)’ specifies one or more of the following
     name/value pairs:

          Name           Value
     ---------------------------------------------------------------------------
          "width"        A two-element vector specifying the width of the
                         bars along the x- and z-axes, respectively.  Each
                         element must be in the range (0,1].
                         
          "color"        A character or a cellstr vector, or a numerical Mx3
                         matrix following the same conventions as the COLOR
                         input argument.
                         
          "xlabel"       A cellstr vector specifying the group names along
                         the x-axis.
                         
          "zlabel"       A cellstr vector specifying the names of the bars in
                         the same series along the z-axis.

     ‘bar3h (AX, ...)’ can also take an axes handle AX as a first
     argument in which case it plots into the axes specified by AX
     instead of into the current axes specified by ‘gca ()’.  The
     optional argument AX can precede any of the input argument
     combinations in the previous syntaxes.

     ‘P = bar3h (...)’ returns a patch handle P, which can be used to
     set properties of the bars after displaying the 3D bar graph.

     See also: boxplot, hist3.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 31
Plot a horizontal 3D bar graph.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 13
bartlett_test


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2055
 -- statistics: H = bartlett_test (X)
 -- statistics: H = bartlett_test (X, GROUP)
 -- statistics: H = bartlett_test (X, ALPHA)
 -- statistics: H = bartlett_test (X, GROUP, ALPHA)
 -- statistics: [H, PVAL] = bartlett_test (...)
 -- statistics: [H, PVAL, CHISQ] = bartlett_test (...)
 -- statistics: [H, PVAL, CHISQ, DF] = bartlett_test (...)

     Perform a Bartlett test for the homogeneity of variances.

     Under the null hypothesis of equal variances, the test statistic
     CHISQ approximately follows a chi-square distribution with DF
     degrees of freedom.

     The p-value (1 minus the CDF of this distribution at CHISQ) is
     returned in PVAL.  H = 1 if the null hypothesis is rejected at the
     significance level of ALPHA.  Otherwise H = 0.

     Input Arguments:

        • X contains the data and it can either be a vector or matrix.
          If X is a matrix, then each column is treated as a separate
          group.  If X is a vector, then the GROUP argument is
          mandatory.  NaN values are omitted.

        • GROUP contains the names for each group.  If X is a vector,
          then GROUP must be a vector of the same length, or a string
          array or cell array of strings with one row for each element
          of X.  X values corresponding to the same value of GROUP are
          placed in the same group.  If X is a matrix, then GROUP can
          either be a cell array of strings or a character array, with
          one row per column of X in the same way it is used in ‘anova1’
          function.  If X is a matrix, then GROUP can be omitted either
          by entering an empty array ([]) or by passing only ALPHA as a
          second argument (if required to change its default value).

        • ALPHA is the statistical significance level at which the null
          hypothesis is rejected.  Its default value is 0.05 and it can
          be passed either as a second argument (when GROUP is omitted)
          or as a third argument.

     See also: levene_test, vartest2, vartestn.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 57
Perform a Bartlett test for the homogeneity of variances.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
barttest


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1093
 -- statistics: NDIM = barttest (X)
 -- statistics: NDIM = barttest (X, ALPHA)
 -- statistics: [NDIM, PVAL] = barttest (X, ALPHA)
 -- statistics: [NDIM, PVAL, CHISQ] = barttest (X, ALPHA)

     Bartlett's test of sphericity for correlation.

     It compares an observed correlation matrix to the identity matrix
     in order to check if there is a certain redundancy among the
     variables that we can summarize with a small number of factors.  A
     statistically significant test shows that the variables (columns)
     in X are correlated, thus it makes sense to perform some
     dimensionality reduction of the data in X.

     ‘NDIM = barttest (X, ALPHA)’ returns the number of dimensions
     necessary to explain the nonrandom variation in the data matrix X
     at the ALPHA significance level.  ALPHA is an optional input
     argument and, when not provided, it is 0.05 by default.

     ‘[NDIM, PVAL, CHISQ] = barttest (...)’ also returns the
     significance values PVAL for the hypothesis test for each dimension
     as well as the associated chi^2 values in CHISQ


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 46
Bartlett's test of sphericity for correlation.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
binotest


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1182
 -- statistics: [H, PVAL, CI] = binotest (POS, N, P0)
 -- statistics: [H, PVAL, CI] = binotest (POS, N, P0, NAME, VALUE)

     Test for probability P of a binomial sample

     Perform a test of the null hypothesis P == P0 for a sample of size
     N with POS positive results.

     Name-Value pair arguments can be used to set various options.
     "alpha" can be used to specify the significance level of the test
     (the default value is 0.05).  The option "tail" can be used to
     select the desired alternative hypothesis.  If the value is "both"
     (default) the null is tested against the two-sided alternative ‘P
     != P0’.  The value of PVAL is determined by adding the
     probabilities of all events equally or less likely than the observed
     number POS of positive events.  If the value of "tail" is "right"
     the one-sided alternative ‘P > P0’ is considered.  Similarly for
     "left", the one-sided alternative ‘P < P0’ is considered.

     If H is 0 the null hypothesis is retained; if it is 1 the null
     hypothesis is rejected.  The p-value of the test is returned in
     PVAL.  A 100(1-alpha)% confidence interval is returned in CI.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 43
Test for probability P of a binomial sample



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
boxplot


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8351
 -- statistics: S = boxplot (DATA)
 -- statistics: S = boxplot (DATA, GROUP)
 -- statistics: S = boxplot (DATA, NOTCHED, SYMBOL, ORIENTATION,
          WHISKER, ...)
 -- statistics: S = boxplot (DATA, GROUP, NOTCHED, SYMBOL, ORIENTATION,
          WHISKER, ...)
 -- statistics: S = boxplot (DATA, OPTIONS)
 -- statistics: S = boxplot (DATA, GROUP, OPTIONS, ...)
 -- statistics: [..., H] = boxplot (DATA, ...)

     Produce a box plot.

     A box plot is a graphical display that simultaneously describes
     several important features of a data set, such as center, spread,
     departure from symmetry, and identification of observations that
     lie unusually far from the bulk of the data.

     Input arguments (case-insensitive) recognized by boxplot are:

        • DATA is a matrix with one column for each data set, or a cell
          vector with one cell for each data set.  Each cell must
          contain a numerical row or column vector (NaN and NA are
          ignored) and not a nested vector of cells.

        • NOTCHED = 1 produces a notched-box plot.  Notches represent a
          robust estimate of the uncertainty about the median.

          NOTCHED = 0 (default) produces a rectangular box plot.

          NOTCHED within the interval (0,1) produces a notch of the
          specified depth.  NOTCHED values outside (0,1) are amusing if
          not exactly impractical.

        • SYMBOL sets the symbol for the outlier values.  The default
          symbol for points that lie outside 3 times the interquartile
          range is 'o'; the default symbol for points between 1.5 and 3
          times the interquartile range is '+'.
          Alternative SYMBOL settings:

          SYMBOL = '.': points between 1.5 and 3 times the IQR are
          marked with '.'  and points outside 3 times IQR with 'o'.

          SYMBOL = ['x','*']: points between 1.5 and 3 times the IQR are
          marked with 'x' and points outside 3 times IQR with '*'.

        • ORIENTATION = 0 plots the boxes horizontally.
          ORIENTATION = 1 plots the boxes vertically (default).
          Alternatively, orientation can be passed as a string, e.g.,
          'vertical' or 'horizontal'.

        • WHISKER defines the length of the whiskers as a function of
          the IQR (default = 1.5).  If WHISKER = 0 then ‘boxplot’
          displays all data values outside the box using the plotting
          symbol for points that lie outside 3 times the IQR.

        • GROUP may be passed as an optional argument only in the second
          position after DATA.  GROUP contains a numerical vector
          defining separate categories, each plotted in a different box,
          for each set of DATA values that share the same GROUP value or
          values.  With the formalism (DATA, GROUP), both must be
          vectors of the same length.

        • OPTIONS are additional paired arguments passed with the
          formalism (Name, Value) that provide extra functionality as
          listed below.  OPTIONS can be passed in any order after the
          initial arguments and are case-insensitive.

          'Notch'        'on'           Notched by 0.25 of the box's width.
                         'off'          Produces a straight box.
                         scalar         Proportional width of the notch.
                                        
          'Symbol'       '.'            Defines only outliers between 1.5 and 3
                                        IQR.
                         ['x','*']      2nd character defines outliers > 3 IQR
                                        
          'Orientation'  'vertical'     Default value, can also be defined with
                                        numerical 1.
                         'horizontal'   Can also be defined with numerical 0.
                                        
          'Whisker'      scalar         Multiplier of IQR (default is 1.5).
                                        
          'OutlierTags'  'on' or 1      Plot the vector index of the outlier
                                        value next to its point.
                         'off' or 0     No tags are plotted (default value).
                                        
          'Sample_IDs'   'cell'         A cell vector with one cell for each data
                                        set containing a nested cell vector with
                                        each sample's ID (should be a string).
                                        If this option is passed, then all
                                        outliers are tagged with their respective
                                        sample's ID string instead of their
                                        vector's index.
                                        
          'BoxWidth'     'proportional' Create boxes with their width
                                        proportional to the number of samples in
                                        their respective dataset (default value).
                         'fixed'        Make all boxes with equal width.
                                        
          'Widths'       scalar         Scaling factor for box widths (default
                                        value is 0.4).
                                        
          'CapWidths'    scalar         Scaling factor for whisker cap widths
                                        (default value is 1, which results in
                                        'Widths'/8 halflength)
                                        
          'BoxStyle'     'outline'      Draw boxes as outlines (default value).
                         'filled'       Fill boxes with a color (outlines are
                                        still plotted).
                                        
          'Positions'    vector         Numerical vector that defines the
                                        position of each data set.  It must have
                                        the same length as the number of groups
                                        in a desired manner.  This vector merely
                                        defines the points along the group axis,
                                        which by default is [1:number of groups].
                                        
          'Labels'       cell           A cell vector of strings containing the
                                        names of each group.  By default each
                                        group is labeled numerically according to
                                        its order in the data set
                                        
          'Colors'       character      If just one character or 1x3 vector of
                         string or      RGB values, specify the fill color of all
                         Nx3            boxes when BoxStyle = 'filled'.  If a
                         numerical      character string or Nx3 matrix is
                         matrix         entered, box #1's fill color corresponds
                                        to the first character or first matrix
                                        row, and the next boxes' fill colors
                                        corresponds to the next characters or
                                        rows.  If the char string or Nx3 array is
                                        exhausted the color selection wraps
                                        around.

     Supplemental arguments not described above (...) are concatenated
     and passed to the plot() function.

     The returned matrix S has one column for each data set as follows:

     1       Minimum
     2       1st quartile
     3       2nd quartile (median)
     4       3rd quartile
     5       Maximum
     6       Lower confidence limit for median
     7       Upper confidence limit for median

     The returned structure H contains handles to the plot elements,
     allowing customization of the visualization using set/get
     functions.

     Example

          title ("Grade 3 heights");
          axis ([0,3]);
          set(gca (), "xtick", [1 2], "xticklabel", {"girls", "boys"});
          boxplot ({randn(10,1)*5+140, randn(13,1)*8+135});


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 19
Produce a box plot.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 9
canoncorr


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 535
 -- statistics: [A, B, R, U, V] = canoncorr (X, Y)

     Canonical correlation analysis.

     Given X (size K*M) and Y (K*N), returns projection matrices of
     canonical coefficients A (size M*D, where D is the smallest of K,
     M, N) and B (size N*D); the canonical correlations R (1*D, arranged
     in decreasing order); the canonical variables U, V (both K*D, with
     orthonormal columns); and STATS, a structure containing results
     from Bartlett's chi-square and Rao's F tests of significance.

     See also: princomp.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 31
Canonical correlation analysis.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
cdfcalc


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 812
 -- statistics: [YCDF, XCDF, N, EMSG, EID] = cdfcalc (X)

     Calculate an empirical cumulative distribution function.

     ‘[YCDF, XCDF] = cdfcalc (X)’ calculates an empirical cumulative
     distribution function (CDF) of the observations in the data sample
     vector X.  X may be a row or column vector, and represents a random
     sample of observations from some underlying distribution.  On
     return XCDF is the set of X values at which the CDF increases.  At
     XCDF(i), the function increases from YCDF(i) to YCDF(i+1).

     ‘[YCDF, XCDF, N] = cdfcalc (X)’ also returns N, the sample size.

     ‘[YCDF, XCDF, N, EMSG, EID] = cdfcalc (X)’ also returns an error
     message and error id if X is not a vector or if it contains no
     values other than NaN.

     See also: cdfplot.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 56
Calculate an empirical cumulative distribution function.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
cdfplot


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1235
 -- statistics: HCDF = cdfplot (X)
 -- statistics: [HCDF, STATS] = cdfplot (X)

     Display an empirical cumulative distribution function.

     ‘HCDF = cdfplot (X)’ plots an empirical cumulative distribution
     function (CDF) of the observations in the data sample vector X.  X
     may be a row or column vector, and represents a random sample of
     observations from some underlying distribution.

     ‘cdfplot’ plots F(x), the empirical (or sample) CDF versus the
     observations in X.  The empirical CDF, F(x), is defined as follows:

     F(x) = (Number of observations <= x) / (Total number of
     observations)

     for all values in the sample vector X.  NaNs are ignored.  HCDF is
     the handle of the empirical CDF curve (a handle graphics 'line'
     object).

     ‘[HCDF, STATS] = cdfplot (X)’ also returns a structure with the
     following fields as a statistical summary.

          STATS.min              minimum value of X
          STATS.max              maximum value of X
          STATS.mean             sample mean of X
          STATS.median           sample median (50th percentile) of X
          STATS.std              sample standard deviation of X

     See also: qqplot, cdfcalc.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 54
Display an empirical cumulative distribution function.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
chi2gof


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 4548
 -- statistics: H = chi2gof (X)
 -- statistics: [H, P] = chi2gof (X)
 -- statistics: [H, P, STATS] = chi2gof (X)
 -- statistics: [...] = chi2gof (X, NAME, VALUE, ...)

     Chi-square goodness-of-fit test.

     ‘chi2gof’ performs a chi-square goodness-of-fit test for discrete
     or continuous distributions.  The test is performed by grouping the
     data into bins, calculating the observed and expected counts for
     those bins, and computing the chi-square test statistic
     SUM((O-E).^2./E), where O is the observed counts and E is the
     expected counts.  This test statistic has an approximate chi-square
     distribution when the counts are sufficiently large.

     Bins in either tail with an expected count less than 5 are pooled
     with neighboring bins until the count in each extreme bin is at
     least 5.  If bins remain in the interior with counts less than 5,
     ‘chi2gof’ displays a warning.  In that case, you should use fewer
     bins, or provide bin centers or bin edges, to increase the expected
     counts in all bins.

     ‘H = chi2gof (X)’ performs a chi-square goodness-of-fit test that
     the data in the vector X are a random sample from a normal
     distribution with mean and variance estimated from X.  The result
     is H = 0 if the null hypothesis (that X is a random sample from a
     normal distribution) cannot be rejected at the 5% significance
     level, or H = 1 if the null hypothesis can be rejected at the 5%
     level.  ‘chi2gof’ uses by default 10 bins ("nbins"), and compares
     the test statistic to a chi-square distribution with NBINS - 3
     degrees of freedom, to take into account that two parameters were
     estimated.

     ‘[H, P] = chi2gof (X)’ also returns the p-value P, which is the
     probability of observing the given result, or one more extreme, by
     chance if the null hypothesis is true.  If there are not enough
     degrees of freedom to carry out the test, P is NaN.

     ‘[H, P, STATS] = chi2gof (X)’ also returns a STATS structure with
     the following fields:

          "chi2stat"             Chi-square statistic
          "df"                   Degrees of freedom
          "binedges"             Vector of bin edges after pooling
          "O"                    Observed count in each bin
          "E"                    Expected count in each bin

     ‘[...] = chi2gof (X, NAME, VALUE, ...)’ specifies optional
     Name/Value pair arguments chosen from the following list.

          Name           Value
     ---------------------------------------------------------------------------
          "nbins"        The number of bins to use.  Default is 10.
          "binctrs"      A vector of bin centers.
          "binedges"     A vector of bin edges.
          "cdf"          A fully specified cumulative distribution function
                         or a function handle provided in a cell array whose
                         first element is a function handle, and all later
                         elements are its parameter values.  The function
                         must take X values as its first argument, and other
                         parameters as later arguments.
          "expected"     A vector with one element per bin specifying the
                         expected counts for each bin.
          "nparams"      The number of estimated parameters; used to adjust
                         the degrees of freedom to be NBINS - 1 - NPARAMS,
                         where NBINS is the number of bins.
          "emin"         The minimum allowed expected value for a bin; any
                         bin in either tail having an expected value less
                         than this amount is pooled with a neighboring bin.
                         Use the value 0 to prevent pooling.  Default is 5.
          "frequency"    A vector of the same length as X containing the
                         frequency of the corresponding X values.
          "alpha"        An ALPHA value such that the hypothesis is rejected
                         if P < ALPHA.  Default is ALPHA = 0.05.

     You should specify either "cdf" or "expected" parameters, but not
     both.  If your "cdf" input contains extra parameters, these are
     accounted for automatically and there is no need to specify
     "nparams".  If your "expected" input depends on estimated
     parameters, you should use the "nparams" parameter to ensure that
     the degrees of freedom for the test is correct.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 32
Chi-square goodness-of-fit test.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
chi2test


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 4463
 -- statistics: PVAL = chi2test (X)
 -- statistics: [PVAL, CHISQ] = chi2test (X)
 -- statistics: [PVAL, CHISQ, DF] = chi2test (X)
 -- statistics: [PVAL, CHISQ, DF, E] = chi2test (X)
 -- statistics: [...] = chi2test (X, NAME, VALUE)

     Perform a chi-squared test (for independence or homogeneity).

     For 2-way contingency tables, ‘chi2test’ performs a chi-squared
     test for independence or homogeneity, according to the sampling
     scheme and related question.  Independence means that the two
     variables forming the 2-way table are not associated, hence you
     cannot predict one from another.  Homogeneity refers to the concept
     of similarity, hence the groups all come from the same distribution.

     Both tests are computationally identical and will produce the same
     result.  Nevertheless, they answer different questions.
     Consider two variables, one for gender and another for smoking.  To
     test independence (whether gender and smoking are associated), we
     would randomly sample from the general population and break them
     down into categories in the table.  To test homogeneity (whether
     men and women share the same smoking habits), we would sample
     individuals from within each gender, and then measure their smoking
     habits (e.g.  smokers vs non-smokers).

     When ‘chi2test’ is called without any output arguments, it will
     print the result in the terminal including p-value, chi^2
     statistic, and degrees of freedom.  Otherwise it can return the
     following output arguments:

          PVAL    the p-value of the relevant test.
          CHISQ   the chi^2 statistic of the relevant test.
          DF      the degrees of freedom of the relevant test.
          E       the EXPECTED values of the original contingency table.

     Unlike MATLAB, in GNU Octave ‘chi2test’ also supports 3-way tables,
     which involve three categorical variables (each in a different
     dimension of X).  In its simplest form, ‘[...] = chi2test (X)’ will
     test for mutual independence among the three variables.
     Alternatively, when called in the form ‘[...] = chi2test (X, NAME,
     VALUE)’, it can perform the following tests:

     NAME           VALUE   Description
     --------------------------------------------------------------------------
     "mutual"       []      Mutual independence.  All variables are
                            independent from each other, (A, B, C). Value
                            must be an empty matrix.
     "joint"        scalar  Joint independence.  Two variables are jointly
                            independent of the third, (AB, C). The scalar
                            value corresponds to the dimension of the
                            independent variable (i.e.  3 for C).
     "marginal"     scalar  Marginal independence.  Two variables are
                            independent if you ignore the third, (A, C). The
                            scalar value corresponds to the dimension of the
                            variable to be ignored (i.e.  2 for B).
     "conditional"  scalar  Conditional independence.  Two variables are
                            independent given the third, (AC, BC). The
                            scalar value corresponds to the dimension of the
                            variable that forms the conditional dependence
                            (i.e.  3 for C).
     "homogeneous"  []      Homogeneous associations.  Conditional (partial)
                            odds-ratios do not depend on the value of the
                            third, (AB, AC, BC). Value must be an empty
                            matrix.

     When testing for homogeneous associations in 3-way tables, the
     iterative proportional fitting procedure is used.  For small
     samples it is better to use the Cochran-Mantel-Haenszel Test.
     K-way tables for k > 3 are supported only for testing mutual
     independence.  Similar to 2-way tables, no optional parameters are
     required for k > 3 multi-way tables.

     ‘chi2test’ produces a warning if any cell of a 2x2 table has an
     expected frequency less than 5 or if more than 20% of the cells in
     larger 2-way tables have expected frequencies less than 5 or if any
     cell has an expected frequency less than 1.  In such cases, use
     ‘fishertest’.

     See also: crosstab, fishertest, mcnemar_test.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 61
Perform a chi-squared test (for independence or homogeneity).



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
cholcov


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1416
 -- statistics: T = cholcov (SIGMA)
 -- statistics: [T, P] = cholcov (SIGMA)
 -- statistics: [...] = cholcov (SIGMA, FLAG)

     Cholesky-like decomposition for covariance matrix.

     ‘T = cholcov (SIGMA)’ computes matrix T such that SIGMA = T' * T.
     SIGMA must be square, symmetric, and positive semi-definite.

     If SIGMA is positive definite, then T is the square, upper
     triangular Cholesky factor.  If SIGMA is not positive definite, T
     is computed with an eigenvalue decomposition of SIGMA, but in this
     case T is not necessarily triangular or square.  Any eigenvectors
     whose corresponding eigenvalue is close to zero (within a
     tolerance) are omitted.  If any remaining eigenvalues are negative,
     T is empty.

     The tolerance is calculated as ‘10 * eps (max (abs (diag
     (sigma))))’.

     ‘[T, P] = cholcov (SIGMA)’ returns in P the number of negative
     eigenvalues of SIGMA.  If P > 0, then T is empty, whereas if P = 0,
     SIGMA is positive semi-definite.

     If SIGMA is not square and symmetric, P is NaN and T is empty.

     ‘[T, P] = cholcov (SIGMA, 0)’ returns P = 0 if SIGMA is positive
     definite, in which case T is the Cholesky factor.  If SIGMA is not
     positive definite, P is a positive integer and T is empty.

     ‘[...] = cholcov (SIGMA, 1)’ is equivalent to ‘[...] = cholcov
     (SIGMA)’.

     See also: chol.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 50
Cholesky-like decomposition for covariance matrix.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 11
cl_multinom


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 3317
 -- statistics: CL = cl_multinom (X, N, B)
 -- statistics: CL = cl_multinom (X, N, B, METHOD)

     Confidence level of multinomial proportions.

     ‘cl_multinom’ returns the confidence level of multinomial parameters
     estimated as p = X / sum(X) with a predefined confidence interval B.
     A finite population is also considered.

     This function calculates the level of confidence at which the
     samples represent the true distribution given that there is a
     predefined tolerance (confidence interval).  This is the reverse
     of the typical exercise, in which we want to get the
     confidence interval given the confidence level (and the estimated
     parameters of the underlying distribution).  But once we accept
     (let's say at elections) that we have a standard predefined maximal
     acceptable error rate (e.g.  B = 0.02) in the estimation and we just
     want to know how sure we can be that the measured proportions
     are the same as in the entire population (i.e.  the expected value
     and mean of the samples are roughly the same), we need to use this
     function.

     Arguments
     ---------

     Variable  Type      Description
     -----------------------------------------------------------------------------
     X         int       sample frequencies bins.
               vector
     N         int       Population size that was sampled by X.  If N < sum
               scalar    (X), infinite number assumed.
     B         real      confidence interval.  If vector, it should be the size
               vector    of X containing confidence interval for each cells.
                         If scalar, each cell will have the same value of b
                         unless it is zero or -1.  If value is 0, B = 0.02 is
                         assumed which is standard choice at elections
                         otherwise it is calculated in a way that one sample in
                         a cell alteration defines the confidence interval.
     METHOD    string    An optional argument for defining the calculation
                         method.  Available choices are "bromaghin" (default),
                         "cochran", and agresti_cull.

     Note!  The agresti_cull method is not exactly the solution at
     reference given below but an adjustment of the solutions above.

     Returns
     -------

     Confidence level.

     Example
     -------

     CL = cl_multinom ([27; 43; 19; 11], 10000, 0.05) returns 0.69
     confidence level.

     References
     ----------

       1. "bromaghin" calculation type (default) is based on the
          article:

          Jeffrey F. Bromaghin, "Sample Size Determination for Interval
          Estimation of Multinomial Probabilities", The American
          Statistician vol 47, 1993, pp 203-206.

       2. "cochran" calculation type is based on article:

          Robert T. Tortora, "A Note on Sample Size Estimation for
          Multinomial Populations", The American Statistician, Vol. 32,
          1978, pp 100-102.

       3. "agresti_cull" calculation type is based on article:

          A. Agresti and B.A. Coull, "Approximate is better than 'exact'
          for interval estimation of binomial proportions", The American
          Statistician vol 52, 1998, pp 119-126


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 44
Confidence level of multinomial proportions.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
cluster


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1224
 -- statistics: T = cluster (Z, "Cutoff", C)
 -- statistics: T = cluster (Z, "Cutoff", C, "Depth", D)
 -- statistics: T = cluster (Z, "Cutoff", C, "Criterion", CRITERION)
 -- statistics: T = cluster (Z, "MaxClust", N)

     Define clusters from an agglomerative hierarchical cluster tree.

     Given a hierarchical cluster tree Z generated by the ‘linkage’
     function, ‘cluster’ defines clusters, using a threshold value C to
     identify new clusters ('Cutoff') or according to a maximum number
     of desired clusters N ('MaxClust').

     CRITERION is used to choose the criterion for defining clusters,
     which can be either "inconsistent" (default) or "distance".  When
     using "inconsistent", ‘cluster’ compares the threshold value C to
     the inconsistency coefficient of each link; when using "distance",
     ‘cluster’ compares the threshold value C to the height of each
     link.  D is the depth used to evaluate the inconsistency
     coefficient; its default value is 2.

     ‘cluster’ uses "distance" as a criterion for defining new clusters
     when it is used with the 'MaxClust' method.

     See also: clusterdata, dendrogram, inconsistent, kmeans, linkage,
     pdist.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 64
Define clusters from an agglomerative hierarchical cluster tree.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 11
clusterdata


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 876
 -- statistics: T = clusterdata (X, CUTOFF)
 -- statistics: T = clusterdata (X, NAME, VALUE)

     Wrapper function for ‘linkage’ and ‘cluster’.

     If CUTOFF is used, then ‘clusterdata’ calls ‘linkage’ and ‘cluster’
     with default values, using CUTOFF as a threshold value for
     ‘cluster’.  If CUTOFF is an integer greater than or equal to 2, then
     CUTOFF is interpreted as the maximum number of clusters desired and
     the "MaxClust" option is used for ‘cluster’.

     If CUTOFF is not used, then ‘clusterdata’ expects a list of pair
     arguments.  Then you must specify either the "Cutoff" or "MaxClust"
     option for ‘cluster’.  The method and metric used by ‘linkage’ are
     defined through the "linkage" and "distance" arguments.

     See also: cluster, dendrogram, inconsistent, kmeans, linkage,
     pdist.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 53
Wrapper function for ‘linkage’ and ‘cluster’.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
cmdscale


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2726
 -- statistics: Y = cmdscale (D)
 -- statistics: [Y, E] = cmdscale (D)

     Classical multidimensional scaling of a matrix.

     Takes an N by N distance (or difference, similarity, or
     dissimilarity) matrix D.  Returns Y, a matrix of N points with
     coordinates in P-dimensional space which approximate those
     distances (or differences, similarities, or dissimilarities).  Also
     returns the eigenvalues E of ‘B = -1/2 * J * (D.^2) * J’, where ‘J
     = eye(N) - ones(N,N)/N’.  P, the number of columns of Y, is equal
     to the number of positive real eigenvalues of B.

     D can be a full or sparse matrix or a vector of length ‘N*(N-1)/2’
     containing the upper triangular elements (like the output of the
     ‘pdist’ function).  It must be symmetric with non-negative entries
     whose values are further restricted by the type of matrix being
     represented:

     * If D is either a distance, dissimilarity, or difference matrix,
     then it must have zero entries along the main diagonal.  In this
     case the points Y equal or approximate the distances given by D.

     * If D is a similarity matrix, the elements must all be less than
     or equal to one, with ones along the main diagonal.  In this case
     the points Y equal or approximate the distances given by ‘D =
     sqrt(ones(N,N)-D)’.

     D is a Euclidean matrix if and only if B is positive semi-definite.
     When this is the case, then Y is an exact representation of the
     distances given in D.  If D is non-Euclidean, Y only approximates
     the distance given in D.  The approximation used by ‘cmdscale’
     minimizes the statistical loss function known as STRAIN.

     The returned Y is an N by P matrix showing possible coordinates of
     the points in P dimensional space (‘P < N’).  The columns
     correspond to the positive eigenvalues of B in descending order.  A
     translation, rotation, or reflection of the coordinates given by Y
     will satisfy the same distance matrix up to the limits of machine
     precision.

     For any ‘K <= P’, if the largest K positive eigenvalues of B are
     significantly greater in absolute magnitude than its other
     eigenvalues, the first K columns of Y provide a K-dimensional
     reduction of Y which approximates the distances given by D.  The
     optional return E can be used to consider various values of K, or
     to evaluate the accuracy of specific dimension reductions (e.g., ‘K
     = 2’).

     Reference: Ingwer Borg and Patrick J.F. Groenen (2005), Modern
     Multidimensional Scaling, Second Edition, Springer, ISBN:
     978-0-387-25150-9 (Print) 978-0-387-28981-6 (Online)

     See also: pdist.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 47
Classical multidimensional scaling of a matrix.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
combnk


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 89
 -- statistics: C = combnk (DATA, K)

     Return all combinations of K elements in DATA.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 46
Return all combinations of K elements in DATA.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 14
confusionchart


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1901
 -- statistics: confusionchart (TRUELABELS, PREDICTEDLABELS)
 -- statistics: confusionchart (M)
 -- statistics: confusionchart (M, CLASSLABELS)
 -- statistics: confusionchart (PARENT, ...)
 -- statistics: confusionchart (..., PROP, VAL, ...)
 -- statistics: CM = confusionchart (...)

     Display a chart of a confusion matrix.

     The two vectors of values TRUELABELS and PREDICTEDLABELS, which are
     used to compute the confusion matrix, must be defined with the same
     format as the inputs of ‘confusionmat’.  Otherwise a confusion
     matrix M as computed by ‘confusionmat’ can be given.

     CLASSLABELS is an array of labels, i.e.  the list of the class
     names.

     If the first argument is a handle to a ‘figure’ or to a ‘uipanel’,
     then the confusion matrix chart is displayed inside that object.

     Optional property/value pairs are passed directly to the underlying
     objects, e.g.  "xlabel", "ylabel", "title", "fontname", "fontsize"
     etc.

     The optional return value CM is a ‘ConfusionMatrixChart’ object.
     Specific properties of a ‘ConfusionMatrixChart’ object are:
        • "DiagonalColor" The color of the patches on the diagonal,
          default is [0.0, 0.4471, 0.7412].

        • "OffDiagonalColor" The color of the patches off the diagonal,
          default is [0.851, 0.3255, 0.098].

        • "GridVisible" Available values: on (default), off.

        • "Normalization" Available values: absolute (default),
          column-normalized, row-normalized, total-normalized.

        • "ColumnSummary" Available values: off (default), absolute,
          column-normalized,total-normalized.

        • "RowSummary" Available values: off (default), absolute,
          row-normalized, total-normalized.

     Run ‘demo confusionchart’ to see some examples.

     See also: confusionmat, sortClasses.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 38
Display a chart of a confusion matrix.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 12
confusionmat


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1441
 -- statistics: C = confusionmat (GROUP, GROUPHAT)
 -- statistics: C = confusionmat (GROUP, GROUPHAT, "Order", GROUPORDER)
 -- statistics: [C, ORDER] = confusionmat (GROUP, GROUPHAT)

     Compute a confusion matrix for classification problems

     ‘confusionmat’ returns the confusion matrix C for the group of
     actual values GROUP and the group of predicted values GROUPHAT.
     The row indices of the confusion matrix represent actual values,
     while the column indices represent predicted values.  The indices
     are the same for both actual and predicted values, so the confusion
     matrix is a square matrix.  Each element of the matrix represents
     the number of matches between a given actual value (row index) and
     a given predicted value (column index), hence correct matches lie
     on the main diagonal of the matrix.  The order of the rows and
     columns is returned in ORDER.

     GROUP and GROUPHAT must have the same number of observations and
     the same data type.  Valid data types are numeric vectors, logical
     vectors, character arrays, string arrays (not implemented yet),
     cell arrays of strings.

     The order of the rows and columns can be specified by setting the
     GROUPORDER variable.  The data type of GROUPORDER must be the same
     as GROUP and GROUPHAT.

     MATLAB compatibility: Octave misses string arrays and categorical
     vectors.

     See also: crosstab.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 54
Compute a confusion matrix for classification problems



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
cophenet


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1114
 -- statistics: [C, D] = cophenet (Z, Y)

     Compute the cophenetic correlation coefficient.

     The cophenetic correlation coefficient C of a hierarchical cluster
     tree Z is the linear correlation coefficient between the cophenetic
     distances D and the Euclidean distances Y.

     It is a measure of the similarity between the distance of the
     leaves, as seen in the tree, and the distance of the original data
     points, which were used to build the tree.  When this similarity is
     greater, that is the coefficient is closer to 1, the tree renders
     an accurate representation of the distances between the original
     data points.

     Z is a hierarchical cluster tree, as the output of ‘linkage’.  Y is
     a vector of Euclidean distances, as the output of ‘pdist’.

     The optional output D is a vector of cophenetic distances, in the
     same lower triangular format as Y.  The cophenetic distance between
     two data points is the height of the lowest common node of the
     tree.

     See also: cluster, dendrogram, inconsistent, linkage, pdist,
     squareform.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 47
Compute the cophenetic correlation coefficient.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 16
correlation_test


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2401
 -- statistics: H = correlation_test (Y, X)
 -- statistics: [H, PVAL] = correlation_test (Y, X)
 -- statistics: [H, PVAL, STATS] = correlation_test (Y, X)
 -- statistics: [...] = correlation_test (Y, X, NAME, VALUE)

     Perform a correlation coefficient test to determine whether two
     samples X and Y come from uncorrelated populations.

     ‘H = correlation_test (Y, X)’ tests the null hypothesis that the
     two samples X and Y come from uncorrelated populations.  The result
     is H = 0 if the null hypothesis cannot be rejected at the 5%
     significance level, or H = 1 if the null hypothesis can be rejected
     at the 5% level.  Y and X must be vectors of equal length with
     finite real numbers.

     The p-value of the test is returned in PVAL.  STATS is a structure
     with the following fields:
          Field               Value
     ----------------------------------------------------------------------------
          method              the type of correlation coefficient used for the
                              test
          df                  the degrees of freedom (where applicable)
          corrcoef            the correlation coefficient
          stat                the test's statistic
          dist                the respective distribution for the test
          alt                 the alternative hypothesis for the test

     ‘[...] = correlation_test (..., NAME, VALUE)’ specifies one or more
     of the following name/value pairs:

          Name           Value
     ---------------------------------------------------------------------------
          "alpha"        the significance level.  Default is 0.05.
                         
          "tail"         a string specifying the alternative hypothesis
             "both"             corrcoef is not 0 (two-tailed, default)
             "left"             corrcoef is less than 0 (left-tailed)
             "right"            corrcoef is greater than 0 (right-tailed)

          "method"       a string specifying the correlation coefficient used
                         for the test
             "pearson"          Pearson's product moment correlation
                                (Default)
             "kendall"          Kendall's rank correlation tau
             "spearman"         Spearman's rank correlation rho

     See also: regression_ftest, regression_ttest.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Perform a correlation coefficient test to determine whether two samples
X and...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
createns


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2725
 -- Function File: OBJ = createns (X)
 -- Function File: OBJ = createns (X, NAME, VALUE, ...)

     Create a nearest neighbor searcher object.

     ‘OBJ = createns (X)’ creates a nearest neighbor searcher object
     using the training data X.  By default, it constructs an
     ‘ExhaustiveSearcher’ object with the Euclidean distance metric.

     ‘OBJ = createns (X, NAME, VALUE, ...)’ allows customization of the
     searcher type and its properties through name-value pairs.  The
     following name-value pair is supported to specify the searcher
     type:

     NAME            VALUE
                     
     ---------------------------------------------------------------------------
     "NSMethod"      Specifies the nearest neighbor search method.  Possible
                     values are:
                        • "exhaustive": Creates an ‘ExhaustiveSearcher’
                          object.
                        • "kdtree": Creates a ‘KDTreeSearcher’ object.
                        • "hnsw": Creates an ‘hnswSearcher’ object.
                     Default is "exhaustive".
                     

     Additional name-value pairs depend on the selected "NSMethod" and
     are passed directly to the constructor of the corresponding class:

        • For "exhaustive", see ‘ExhaustiveSearcher’ documentation for
          parameters like "Distance", "P", "Scale", and "Cov".
        • For "kdtree", see ‘KDTreeSearcher’ documentation for
          parameters like "Distance", "P", and "BucketSize".
        • For "hnsw", see ‘hnswSearcher’ documentation for parameters
          like "Distance", "P", "Scale", "Cov", "MaxNumLinksPerNode",
          and "TrainSetSize".

     *Input Arguments:*
        • X - Training data, specified as an NxP numeric matrix where
          rows represent observations and columns represent features.
          Must be finite and numeric.

     *Output:*
        • OBJ - A nearest neighbor searcher object of type
          ‘ExhaustiveSearcher’, ‘KDTreeSearcher’, or ‘hnswSearcher’,
          depending on the specified "NSMethod".

     *Examples:*

          ## Create an ExhaustiveSearcher with default parameters
          X = [1, 2; 3, 4; 5, 6];
          obj = createns (X);

          ## Create a KDTreeSearcher with Euclidean distance
          obj = createns (X, "NSMethod", "kdtree", "Distance", "euclidean");

          ## Create an hnswSearcher with Minkowski distance and custom parameters
          obj = createns (X, "NSMethod", "hnsw", "Distance", "minkowski", "P", 3, "MaxNumLinksPerNode", 2);

     See also: ExhaustiveSearcher, KDTreeSearcher, hnswSearcher,
     knnsearch, rangesearch.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 42
Create a nearest neighbor searcher object.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
crosstab


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 566
 -- statistics: T = crosstab (X1, X2)
 -- statistics: T = crosstab (X1, ..., XN)
 -- statistics: [T, CHISQ, P, LABELS] = crosstab (...)

     Create a cross-tabulation (contingency table) T from data vectors.

     The inputs X1, X2, ...  XN must be vectors of equal length with a
     data type of numeric, logical, char, or string (cell array).

     As additional return values ‘crosstab’ returns the chi-square
     statistics CHISQ, its p-value P and a cell array LABELS, containing
     the labels of each input argument.

     See also: grp2idx, tabulate.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 66
Create a cross-tabulation (contingency table) T from data vectors.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
crossval


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2179
 -- statistics: RESULTS = crossval (F, X, Y)
 -- statistics: RESULTS = crossval (F, X, Y, NAME, VALUE)

     Perform cross validation on given data.

     F should be a function that takes 4 inputs XTRAIN, YTRAIN, XTEST,
     YTEST, fits a model based on XTRAIN, YTRAIN, applies the fitted
     model to XTEST, and returns a goodness of fit measure based on
     comparing the predicted and actual YTEST.  ‘crossval’ returns an
     array containing the values returned by F for every
     cross-validation fold or resampling applied to the given data.

     X should be an N by M matrix of predictor values

     Y should be an N by 1 vector of predicand values

     Optional arguments may include name-value pairs as follows:

     "KFold"
          Divide set into K equal-size subsets, using each one
          successively for validation.

     "HoldOut"
          Divide set into two subsets, training and validation.  If the
          value K is a fraction, that is the fraction of values put in
          the validation subset (by default K=0.1); if it is a positive
          integer, that is the number of values in the validation
          subset.

     "LeaveOut"
          Leave-one-out partition (each element is placed in its own
          subset).  The value is ignored, but it is required.

     "Partition"
          The value should be a CVPARTITION object.

     "Given"
          The value should be an N by 1 vector specifying in which
          partition to put each element.

     "stratify"
          The value should be an N by 1 vector containing class
          designations for the elements, in which case the "KFold" and
          "HoldOut" partitionings attempt to ensure each partition
          represents the classes proportionately.

     "mcreps"
          The value should be a positive integer specifying the number
          of times to resample based on different partitionings.
          Currently only works with the partition type "HoldOut".

     Only one of "KFold", "HoldOut", "LeaveOut", "Given", "Partition"
     should be specified.  If none is specified, the default is "KFold"
     with K = 10.

     See also: cvpartition.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 39
Perform cross validation on given data.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 10
datasample


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1190
 -- statistics: Y = datasample (DATA, K)
 -- statistics: Y = datasample (DATA, K, DIM)
 -- statistics: Y = datasample (..., NAME, VALUE)
 -- statistics: [Y, IDCS] = datasample (...)

     Randomly sample data.

     Return K observations randomly sampled from DATA.  DATA can be a
     vector or a matrix of any data.  When DATA is a matrix or an
     n-dimensional array, the samples are the subarrays of size n - 1,
     taken along the dimension DIM.  The default value for DIM is 1,
     that is the row vectors when sampling a matrix.

     Output Y is the returned sampled data.  Optional output IDCS is the
     vector of the indices to build Y from DATA.

     Additional options are set through pairs of parameter name and
     value.  Available parameters are:

     ‘Replace’
          a logical value that can be ‘true’ (default) or ‘false’: when
          set to ‘true’, ‘datasample’ returns data sampled with
          replacement.

     ‘Weights’
          a vector of positive numbers that sets the probability of each
          element.  It must have the same size as DATA along dimension
          DIM.

See also: rand, randi, randperm, randsample.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 21
Randomly sample data.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 4
dcov


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 708
 -- statistics: [DCOR, DCOV, DVARX, DVARY] = dcov (X, Y)

     Distance correlation, covariance and correlation statistics.

     It returns the distance correlation (DCOR) and the distance
     covariance (DCOV) between X and Y, the distance variance of X
     (DVARX), and the distance variance of Y (DVARY).

     X and Y must have the same number of observations (rows) but they
     can have different number of dimensions (columns).  Rows with
     missing values (NaN) in either X or Y are omitted.

     The Brownian covariance is the same as the distance covariance:

     cov_W (X, Y) = dCov (X, Y)

     and thus Brownian correlation is the same as distance correlation.

     See also: corr, cov.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 60
Distance correlation, covariance and correlation statistics.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 10
dendrogram


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2313
 -- statistics: dendrogram (TREE)
 -- statistics: dendrogram (TREE, P)
 -- statistics: dendrogram (TREE, PROP, VAL)
 -- statistics: dendrogram (TREE, P, PROP, VAL )
 -- statistics: H = dendrogram (...)
 -- statistics: [H, T, PERM] = dendrogram (...)

     Plot a dendrogram of a hierarchical binary cluster tree.

     Given TREE, a hierarchical binary cluster tree as the output of
     ‘linkage’, plot a dendrogram of the tree.  The number of leaves
     shown by the dendrogram plot is limited to P.  The default value
     for P is 30.  Set P to 0 to plot all leaves.

     The optional outputs are H, T and PERM:
        • H is a handle to the lines of the plot.

        • T is the vector with the numbers assigned to each leaf.  Each
          element of T is a leaf of TREE and its value is the number
          shown in the plot.  When the dendrogram plot is collapsed,
          that is when the number of shown leaves P is inferior to the
          total number of leaves, a single leaf of the plot can
          represent more than one leaf of TREE: in that case multiple
          elements of T share the same value, that is the same leaf of
          the plot.  When the dendrogram plot is not collapsed, each
          leaf of the plot is the leaf of TREE with the same number.

        • PERM is the vector list of the leaves as ordered as in the
          plot.

     Additional input properties can be specified by pairs of properties
     and values.  Known properties are:
        • "Reorder" Reorder the leaves of the dendrogram plot using a
          numerical vector of size N, the number of leaves.  When P is
          smaller than N, the reordering cannot break the P groups of
          leaves.

        • "Orientation" Change the orientation of the plot.  Available
          values: top (default), bottom, left, right.

        • "CheckCrossing" Check if the lines of a reordered dendrogram
          cross each other.  Available values: true (default), false.

        • "ColorThreshold" Not implemented.

        • "Labels" Use a char, string or cellstr array of size N to set
          the label for each leaf; the label is displayed only for nodes
          with just one leaf.

     See also: cluster, clusterdata, cophenet, inconsistent, linkage,
     pdist.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 56
Plot a dendrogram of a hierarchical binary cluster tree.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 4
ecdf


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2355
 -- statistics: [F, X] = ecdf (Y)
 -- statistics: [F, X, FLO, FUP] = ecdf (Y)
 -- statistics: ecdf (...)
 -- statistics: ecdf (AX, ...)
 -- statistics: [...] = ecdf (Y, NAME, VALUE, ...)
 -- statistics: [...] = ecdf (AX, Y, NAME, VALUE, ...)

     Empirical (Kaplan-Meier) cumulative distribution function.

     ‘[F, X] = ecdf (Y)’ calculates the Kaplan-Meier estimate of the
     cumulative distribution function (cdf), also known as the empirical
     cdf.  Y is a vector of data values.  F is a vector of values of the
     empirical cdf evaluated at X.

     ‘[F, X, FLO, FUP] = ecdf (Y)’ also returns lower and upper
     confidence bounds for the cdf.  These bounds are calculated using
     Greenwood's formula, and are not simultaneous confidence bounds.

     ‘ecdf (...)’ without output arguments produces a plot of the
     empirical cdf.

     ‘ecdf (AX, ...)’ plots into existing axes AX.

     ‘[...] = ecdf (Y, NAME, VALUE, ...)’ specifies additional parameter
     name/value pairs chosen from the following:

     NAME           VALUE
     --------------------------------------------------------------------------
     "censoring"    A boolean vector of the same size as Y that is 1 for
                    observations that are right-censored and 0 for
                    observations that are observed exactly.  Default is all
                    observations observed exactly.
                    
     "frequency"    A vector of the same size as Y containing non-negative
                    integer counts.  The jth element of this vector gives
                    the number of times the jth element of Y was observed.
                    Default is 1 observation per Y element.
                    
     "alpha"        A value ALPHA between 0 and 1 specifying the
                    significance level.  Default is 0.05 for 5%
                    significance.
                    
     "function"     The type of function returned as the F output argument,
                    chosen from "cdf" (the default), "survivor", or
                    "cumulative hazard".
                    
     "bounds"       Either "on" to include bounds or "off" (the default) to
                    omit them.  Used only for plotting.

     Type ‘demo ecdf’ to see examples of usage.

     See also: cdfplot, ecdfhist.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 58
Empirical (Kaplan-Meier) cumulative distribution function.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
einstein


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1436
 -- statistics: einstein ()
 -- statistics: TILES = einstein (A, B)
 -- statistics: [TILES, RHAT] = einstein (A, B)
 -- statistics: [TILES, RHAT, THAT] = einstein (A, B)
 -- statistics: [TILES, RHAT, THAT, SHAT] = einstein (A, B)
 -- statistics: [TILES, RHAT, THAT, SHAT, PHAT] = einstein (A, B)
 -- statistics: [TILES, RHAT, THAT, SHAT, PHAT, FHAT] = einstein (A, B)

     Plots the tiling of the basic clusters of einstein tiles.

     Scalars A and B define the shape of the einstein tile.  See Smith
     et al (2023) for details: <https://arxiv.org/abs/2303.10798>

        • TILES is a structure containing the coordinates of the
          einstein tiles that are tiled on the plot.  Each field
          contains the tile coordinates of the corresponding clusters.
             • TILES.rhat contains the reflected einstein tiles
             • TILES.that contains the three-hat shells
             • TILES.shat contains the single-hat clusters
             • TILES.phat contains the paired-hat clusters
             • TILES.fhat contains the fylfot clusters

        • RHAT contains the coordinates of the first reflected tile
        • THAT contains the coordinates of the first three-hat shell
        • SHAT contains the coordinates of the first single-hat cluster
        • PHAT contains the coordinates of the first paired-hat cluster
        • FHAT contains the coordinates of the first fylfot cluster


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 57
Plots the tiling of the basic clusters of einstein tiles.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 12
evalclusters


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 4814
 -- statistics: EVA = evalclusters (X, CLUST, CRITERION)
 -- statistics: EVA = evalclusters (..., Name, Value)

     Create a clustering evaluation object to find the optimal number of
     clusters.

     ‘evalclusters’ creates a clustering evaluation object to evaluate
     the optimal number of clusters for data X, using criterion
     CRITERION.  The input data X is a matrix with ‘n’ observations of
     ‘p’ variables.  The evaluation criterion CRITERION is one of the
     following:
     ‘CalinskiHarabasz’
          to create a ‘CalinskiHarabaszEvaluation’ object.

     ‘DaviesBouldin’
          to create a ‘DaviesBouldinEvaluation’ object.

     ‘gap’
          to create a ‘GapEvaluation’ object.

     ‘silhouette’
          to create a ‘SilhouetteEvaluation’ object.

     The clustering algorithm CLUST is one of the following:
     ‘kmeans’
          to cluster the data using ‘kmeans’ with ‘EmptyAction’ set to
          ‘singleton’ and ‘Replicates’ set to 5.

     ‘linkage’
          to cluster the data using ‘clusterdata’ with ‘linkage’ set to
          ‘Ward’.

     ‘gmdistribution’
          to cluster the data using ‘fitgmdist’ with ‘SharedCov’ set to
          ‘true’ and ‘Replicates’ set to 5.

     If the CRITERION is ‘CalinskiHarabasz’, ‘DaviesBouldin’, or
     ‘silhouette’, CLUST can also be a function handle to a function of
     the form ‘c = clust(x, k)’, where X is the input data, K the number
     of clusters to evaluate and C the clustering result.  The
     clustering result can be either an array of size ‘n’ with ‘k’
     different integer values, or a matrix of size ‘n’ by ‘k’ with a
     likelihood value assigned to each one of the ‘n’ observations for
     each one of the K clusters.  In the latter case, each observation
     is assigned to the cluster with the higher value.  If the CRITERION
     is ‘CalinskiHarabasz’, ‘DaviesBouldin’, or ‘silhouette’, CLUST can
     also be a matrix of size ‘n’ by ‘k’, where ‘k’ is the number of
     proposed clustering solutions, so that each column of CLUST is a
     clustering solution.

     In addition to the obligatory X, CLUST and CRITERION inputs there
     is a number of optional arguments, specified as pairs of ‘Name’ and
     ‘Value’ options.  The known ‘Name’ arguments are:
     ‘KList’
          a vector of positive integer numbers, that is the cluster
          sizes to evaluate.  This option is necessary, unless CLUST is
          a matrix of proposed clustering solutions.

     ‘Distance’
          a distance metric as accepted by the chosen CLUST.  It can be
          the name of the distance metric as a string or a function
          handle.  When CRITERION is ‘silhouette’, it can be a vector as
          created by function ‘pdist’.  Valid distance metric strings
          are: ‘sqEuclidean’ (default), ‘Euclidean’, ‘cityblock’,
          ‘cosine’, ‘correlation’, ‘Hamming’, ‘Jaccard’.  Only used by
          ‘silhouette’ and ‘gap’ evaluation.

     ‘ClusterPriors’
          the prior probabilities of each cluster, which can be either
          ‘empirical’ (default), or ‘equal’.  When ‘empirical’ the
          silhouette value is the average of the silhouette values of
          all points; when ‘equal’ the silhouette value is the average
          of the average silhouette value of each cluster.  Only used by
          ‘silhouette’ evaluation.

     ‘B’
          the number of reference datasets generated from the reference
          distribution.  Only used by ‘gap’ evaluation.

     ‘ReferenceDistribution’
          the reference distribution used to create the reference data.
          It can be ‘PCA’ (default) for a distribution based on the
          principal components of X, or ‘uniform’ for a uniform
          distribution based on the range of the observed data.  ‘PCA’
          is currently not implemented.  Only used by ‘gap’ evaluation.

     ‘SearchMethod’
          the method for selecting the optimal value with a ‘gap’
          evaluation.  It can be either ‘globalMaxSE’ (default) for
          selecting the smallest number of clusters which is inside the
          standard error of the maximum gap value, or ‘firstMaxSE’ for
          selecting the first number of clusters which is inside the
          standard error of the following cluster number.  Only used by
          ‘gap’ evaluation.

     Output EVA is a clustering evaluation object.

See also: CalinskiHarabaszEvaluation, DaviesBouldinEvaluation,
GapEvaluation, SilhouetteEvaluation.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 77
Create a clustering evaluation object to find the optimal number of
clusters.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
factoran


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1452
 -- statistics: LOADINGS = factoran (X, NFAC)
 -- statistics: [LOADINGS, SPECVAR] = factoran (X, NFAC)
 -- statistics: [LOADINGS, SPECVAR, FSCORES] = factoran (X, NFAC)

     Perform principal axis factor analysis on data matrix.

     ‘LOADINGS = factoran (X, NFAC)’ performs principal axis factoring
     to extract NFAC factors from the N x P data matrix X, where rows
     correspond to observations and columns to variables.  The output
     LOADINGS is a P x NFAC matrix whose columns contain the loadings on
     each factor, in decreasing order of importance.

     ‘[LOADINGS, SPECVAR] = factoran (...)’ also returns a P x 1 vector
     SPECVAR containing the specific variances (unique variances) for
     each variable.

     ‘[LOADINGS, SPECVAR, FSCORES] = factoran (...)’ also returns the N
     x NFAC matrix FSCORES of estimated factor scores, computed using
     the regression method.

     The analysis is performed on the correlation matrix of the
     standardized X.  Initial communalities are set to 1.  Iterations
     continue until the maximum change in communality is less than 1e-4
     or 50 iterations are reached.  The sign of each loading vector is
     chosen so that the element with largest absolute value is positive.
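
     As a rough sketch of the calling convention described above, the
     following uses synthetic data (the matrices F and W and the noise
     level are illustrative assumptions, not part of the package) and
     only inspects the sizes of the three outputs:

     ```octave
     pkg load statistics

     ## Synthetic data: 200 observations of 6 variables driven by 2 latent factors
     randn ("state", 1);
     F = randn (200, 2);                          # hypothetical latent factors
     W = [0.9 0; 0.8 0; 0.7 0; 0 0.9; 0 0.8; 0 0.7];
     X = F * W' + 0.3 * randn (200, 6);           # N x P data matrix

     [loadings, specvar, fscores] = factoran (X, 2);

     size (loadings)   # P x NFAC, here 6 x 2
     size (specvar)    # P x 1,    here 6 x 1
     size (fscores)    # N x NFAC, here 200 x 2
     ```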

     References
     ----------

       1. Harman, H. H., Modern Factor Analysis, 3rd Edition, University
          of Chicago Press, 1976.

     See also: barttest, pca, pcacov, pcares.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 54
Perform principal axis factor analysis on data matrix.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 4
ff2n


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 439
 -- statistics: DFF2 = ff2n (N)

     Two-level full factorial design.

     ‘DFF2 = ff2n (N)’ gives factor settings DFF2 for a two-level full
     factorial design with N factors.  DFF2 is M-by-N, where M = 2^N is
     the number of treatments in the full-factorial design.  Each row of
     DFF2 corresponds to a single treatment.  Each column contains the
     settings for a single factor, with values of 0 and 1 for the two
     levels.
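
     A short sketch of what to expect for three factors; it checks only
     the properties stated above and makes no assumption about the order
     in which the treatments are listed:

     ```octave
     pkg load statistics

     d = ff2n (3);
     size (d)                        # 2^3 = 8 treatments by 3 factors
     all (ismember (d(:), [0 1]))    # every setting is 0 or 1
     rows (unique (d, "rows"))       # 8, so each treatment appears once
     ```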


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 32
Two-level full factorial design.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 11
fillmissing


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8287
 -- statistics: B = fillmissing (A, "constant", V)
 -- statistics: B = fillmissing (A, METHOD)
 -- statistics: B = fillmissing (A, MOVE_METHOD, WINDOW_SIZE)
 -- statistics: B = fillmissing (A, FILL_FUNCTION, WINDOW_SIZE)
 -- statistics: B = fillmissing (..., DIM)
 -- statistics: B = fillmissing (..., PROPERTYNAME, PROPERTYVALUE)
 -- statistics: [B, IDX] = fillmissing (...)

     Replace missing entries of array A either with values in V or as
     determined by other specified methods.  'missing' values are
     determined by the data type of A as identified by the function
     ismissing, currently defined as:

        • NaN: ‘single’, ‘double’

        • " " (white space): ‘char’

        • {""} (white space in cell): string cells.

     A can be a numeric scalar or array, a character vector or array, or
     a cell array of character vectors (a.k.a.  string cells).

     V can be a scalar or an array containing values for replacing the
     missing values in A with a compatible data type for insertion into
     A.  The shape of V must be a scalar or an array with number of
     elements in V equal to the number of elements orthogonal to the
     operating dimension.  E.g., if ‘size(A)’ = [3 5 4], operating along
     ‘dim’ = 2 requires V to contain either 1 or 3x4=12 elements.

     If requested, the optional output IDX will contain a logical array
     the same shape as A indicating with 1's which locations in A were
     filled.
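
     A minimal sketch of the "constant" form and the IDX output,
     assuming the default operating dimension (the example values are
     arbitrary):

     ```octave
     pkg load statistics

     A = [1 NaN 3; NaN 5 NaN];

     ## Scalar V: every missing entry is replaced by the same value
     [B, idx] = fillmissing (A, "constant", 0);
     ## B   is [1 0 3; 0 5 0]
     ## idx is logical ([0 1 0; 1 0 1])

     ## Vector V: one fill value per column when operating along dim 1
     C = fillmissing (A, "constant", [10 20 30]);
     ```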

     Alternate Input Arguments and Values:
        • METHOD - replace missing values with:

          ‘next’
          ‘previous’
          ‘nearest’
               next, previous, or nearest non-missing value (nearest
               defaults to next when equidistant as determined by
               ‘SamplePoints’.)

          ‘linear’
               linear interpolation of neighboring, non-missing values

          ‘spline’
               piecewise cubic spline interpolation of neighboring,
               non-missing values

          ‘pchip’
               'shape preserving' piecewise cubic Hermite interpolation
               of neighboring, non-missing values

        • MOVE_METHOD - moving window calculated replacement values:

          ‘movmean’
          ‘movmedian’
               moving average or median using a window determined by
               WINDOW_SIZE.  WINDOW_SIZE must be either a positive
               scalar value or a two element positive vector of sizes
               ‘[NB, NA]’ measured in the same units as ‘SamplePoints’.
               For scalar values, the window is centered on the missing
               element and includes all data points within a distance of
               half of WINDOW_SIZE on either side of the window center
               point.  Note that for compatibility, when using a scalar
               value, the backward window limit is inclusive and the
               forward limit is exclusive.  If a two-element WINDOW_SIZE
               vector is specified, the window includes all points
               within a distance of NB backward and NA forward from the
               current element at the window center (both limits
               inclusive).

        • FILL_FUNCTION - custom method specified as a function handle.
          The supplied fill function must accept three inputs in the
          following order for each missing gap in the data:
          A_VALUES -
               elements of A within the window on either side of the gap
               as determined by WINDOW_SIZE.  (Note these elements can
               include missing values from other nearby gaps.)
          A_LOCS -
               locations of the reference data, A_VALUES, in terms of
               the default or specified ‘SamplePoints’.
          GAP_LOCS -
               location of the gap data points that need to be filled in
               terms of the default or specified ‘SamplePoints’.

          The supplied function must return a scalar or vector with the
          same number of elements as GAP_LOCS.  The required WINDOW_SIZE
          parameter follows similar rules as for the moving average and
          median methods described above, with the two exceptions that
          (1) each gap is processed as a single element, rather than gap
          elements being processed individually, and (2) the window
          extended on either side of the gap has inclusive endpoints
          regardless of how WINDOW_SIZE is specified.

        • DIM - specify a dimension for vector operation (default =
          first non-singleton dimension)

        • PROPERTYNAME-PROPERTYVALUE pairs
          ‘SamplePoints’
               PROPERTYVALUE is a vector of sample point values
               representing the sorted and unique x-axis values of the
               data in A.  If unspecified, the default is assumed to be
               the vector [1 : SIZE (A, DIM)].  The values in
               ‘SamplePoints’ will affect methods and properties that
               rely on the effective distance between data points in A,
               such as interpolants and moving window functions where
               the WINDOW_SIZE specified for moving window functions is
               measured relative to the ‘SamplePoints’.

          ‘EndValues’
               Apply a separate handling method for missing values at
               the front or back of the array.  PROPERTYVALUE can be:
                  • A constant scalar or array with the same shape
                    requirements as V.
                  • ‘none’ - Do not fill end gap values.
                  • ‘extrap’ - Use the same procedure as METHOD to fill
                    the end gap values.
                  • Any valid METHOD listed above except for ‘movmean’,
                    ‘movmedian’, and ‘fill_function’.  Those methods can
                    only be applied to end gap values with ‘extrap’.

          ‘MissingLocations’
               PROPERTYVALUE must be a logical array the same size as A
               indicating locations of known missing data with a value
               of ‘true’.  (cannot be combined with MaxGap)

          ‘MaxGap’
               PROPERTYVALUE is a numeric scalar indicating the maximum
               gap length to fill, and assumes the same distance scale
               as the sample points.  Gap length is calculated by the
               difference in locations of the sample points on either
               side of the gap, and gaps larger than MaxGap are ignored
               by FILLMISSING.  (cannot be combined with
               MissingLocations)

     Compatibility Notes:
        • Numerical and logical inputs for A and V may be specified in
          any combination.  The output will be the same class as A, with
          the V converted to that data type for filling.  Only ‘single’
          and ‘double’ have defined 'missing' values, so except for when
          the ‘missinglocations’ option specifies the missing value
          identification of logical and other numeric data types, the
          output will always be ‘B = A’ with ‘IDX = false(size(A))’.
        • All interpolation methods can be individually applied to
          ‘EndValues’.
        • MATLAB's FILL_FUNCTION method currently has several
          inconsistencies with the other methods (tested against version
          2022a), and Octave's implementation has chosen the following
          consistent behavior over compatibility: (1) a column full of
          missing data is considered part of ‘EndValues’, (2) such
          columns are then excluded from FILL_FUNCTION processing
          because the moving window is always empty, and (3) operations
          in dimensions higher than 2 perform identically to operations
          in dims 1 and 2, most notably on vectors.
        • Method "makima" is not yet implemented in ‘interp1’, which is
          used by ‘fillmissing’.  Attempting to call this method will
          produce an error until the method is implemented in ‘interp1’.
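
     The methods and properties above can be sketched on small row
     vectors as follows.  The data are arbitrary, and outputs are noted
     only where they follow directly from the method definitions:

     ```octave
     pkg load statistics

     x = [1 NaN 3 NaN NaN 6];

     xl = fillmissing (x, "linear");      # [1 2 3 4 5 6]
     xp = fillmissing (x, "previous");    # [1 1 3 3 3 6]
     xn = fillmissing (x, "next");        # [1 3 3 6 6 6]

     ## Moving mean with a scalar WINDOW_SIZE of 3 sample points
     xm = fillmissing (x, "movmean", 3);

     ## Leading and trailing gaps can be treated separately
     y  = [NaN 2 NaN 4 NaN];
     y1 = fillmissing (y, "linear", "EndValues", "none");     # ends stay NaN
     y2 = fillmissing (y, "linear", "EndValues", "nearest");
     ```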

     See also: ismissing, rmmissing, standardizeMissing.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Replace missing entries of array A either with values in V or as
determined b...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 10
fishertest


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2638
 -- statistics: H = fishertest (X)
 -- statistics: H = fishertest (X, PARAM1, VALUE1, ...)
 -- statistics: [H, PVAL] = fishertest (...)
 -- statistics: [H, PVAL, STATS] = fishertest (...)

     Fisher's exact test.

     ‘H = fishertest (X)’ performs Fisher's exact test on a 2x2
     contingency table given in matrix X.  This is a test of the
     hypothesis that there are no non-random associations between the
     two 2-level categorical variables in X.  ‘fishertest’ returns the
     result of the tested hypothesis in H.  H = 0 indicates that the
     null hypothesis (of no association) cannot be rejected at the 5%
     significance level.  H = 1 indicates that the null hypothesis can
     be rejected at the 5% level.  X must contain only non-negative
     integers.  Use the ‘crosstab’ function to generate the contingency
     table from samples of two categorical variables.  Fisher's exact
     test is not suitable when all integers in X are very large; use
     the Chi-square test in that case.

     ‘[H, PVAL] = fishertest (X)’ returns the p-value in PVAL.  That is
     the probability of observing the given result, or one more extreme,
     by chance if the null hypothesis is true.  Small values of PVAL
     cast doubt on the validity of the null hypothesis.

     ‘[H, PVAL, STATS] = fishertest (...)’ returns the structure STATS
     with the following fields:

          OddsRatio              - the odds ratio
          ConfidenceInterval     - the asymptotic confidence interval for the
                                 odds ratio.  If any of the four entries in
                                 the contingency table X is zero, the
                                 confidence interval will not be computed, and
                                 [-Inf Inf] will be displayed.
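
     As an illustrative sketch with an arbitrary 2x2 table, the three
     outputs can be requested together.  For this table the two-sided
     p-value works out to 34/70 by hand, so the null hypothesis should
     not be rejected at the default 5% level:

     ```octave
     pkg load statistics

     x = [3 1; 1 3];                      # 2x2 contingency table of counts
     [h, pval, stats] = fishertest (x);

     h                         # 0: no rejection at the 5% level
     pval                      # about 0.486 (= 34/70)
     stats.OddsRatio           # odds ratio for the table
     stats.ConfidenceInterval  # asymptotic interval for the odds ratio
     ```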

     ‘[...] = fishertest (..., NAME, VALUE, ...)’ specifies one or more
     of the following name/value pairs:

          Name           Value
     ---------------------------------------------------------------------------
          "alpha"        the significance level.  Default is 0.05.
                         
          "tail"         a string specifying the alternative hypothesis
             "both"             odds ratio not equal to 1, indicating
                                association between two variables (two-tailed
                                test, default)
             "left"             odds ratio less than 1 (left-tailed test)
             "right"            odds ratio greater than 1 (right-tailed test)

     See also: crosstab, chi2test, mcnemar_test, ztest2.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 20
Fisher's exact test.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 9
fitcdiscr


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 3803
 -- statistics: MDL = fitcdiscr (X, Y)
 -- statistics: MDL = fitcdiscr (..., NAME, VALUE)

     Fit a Linear Discriminant Analysis classification model.

     ‘MDL = fitcdiscr (X, Y)’ returns a Linear Discriminant Analysis
     (LDA) classification model, MDL, with X being the predictor data,
     and Y the class labels of observations in X.

        • ‘X’ must be an NxP numeric matrix of predictor data where rows
          correspond to observations and columns correspond to features
          or variables.
        • ‘Y’ is an Nx1 matrix or cell matrix containing the class labels
          of corresponding predictor data in X.  Y can be numerical,
          logical, char array or cell array of character vectors.  Y
          must have the same number of rows as X.

     ‘MDL = fitcdiscr (..., NAME, VALUE)’ returns a Linear Discriminant
     Analysis model with additional options specified by Name-Value pair
     arguments listed below.

     Model Parameters
     ----------------

     NAME            VALUE
                     
     ---------------------------------------------------------------------------
     "PredictorNames"A cell array of character vectors specifying the names
                     of the predictors.  The length of this array must match
                     the number of columns in X.
                     
     "ResponseName"  A character vector specifying the name of the response
                     variable.
                     
     "ClassNames"    Names of the classes in the class labels, Y, used for
                     fitting the Discriminant model.  ClassNames are of the
                     same type as the class labels in Y.
                     
     "Prior"         A numeric vector specifying the prior probabilities for
                     each class.  The order of the elements in Prior
                     corresponds to the order of the classes in ClassNames.
                     Alternatively, you can specify "empirical" to use the
                     empirical class probabilities or "uniform" to assume
                     equal class probabilities.
                     
     "Cost"          A NxR numeric matrix containing misclassification cost
                     for the corresponding instances in X where R is the
                     number of unique categories in Y.  If an instance is
                     correctly classified into its category the cost is
                     calculated to be 1, otherwise 0.  The cost matrix can
                     be altered using ‘MDL.COST = somecost’.  The default
                     value is COST = ones (rows (X), numel (unique (Y))).
                     
     "DiscrimType"   A character vector or string scalar specifying the type
                     of discriminant analysis to perform.  The only supported
                     value is "linear".
                     
     "FillCoeffs"    A character vector or string scalar with values "on" or
                     "off" specifying whether to fill the coefficients after
                     fitting.  If set to "on", the coefficients are computed
                     during model fitting, which can be useful for
                     prediction.
                     
     "Gamma"         A numeric scalar specifying the regularization parameter
                     for the covariance matrix.  It adjusts the linear
                     discriminant analysis to make the model more stable in
                     the presence of multicollinearity or small sample sizes.
                     A value of 0 corresponds to no regularization, while a
                     value of 1 corresponds to a completely regularized
                     model.
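
     A minimal usage sketch, assuming the ‘fisheriris’ example dataset
     shipped with the statistics package (variables ‘meas’ and
     ‘species’) and the ‘predict’ method of the returned object.  The
     short predictor names are made up for the example:

     ```octave
     pkg load statistics

     load fisheriris          # meas: 150x4 numeric, species: 150x1 cellstr

     mdl = fitcdiscr (meas, species, ...
                      "PredictorNames", {"SL", "SW", "PL", "PW"}, ...
                      "Prior", "uniform");

     ## Predicted class labels for the first five observations
     labels = predict (mdl, meas(1:5,:))
     ```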
                     

     See also: ClassificationDiscriminant.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 56
Fit a Linear Discriminant Analysis classification model.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
fitcgam


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6384
 -- statistics: MDL = fitcgam (X, Y)
 -- statistics: MDL = fitcgam (..., NAME, VALUE)

     Fit a Generalized Additive Model (GAM) for binary classification.

     ‘MDL = fitcgam (X, Y)’ returns a GAM classification model, MDL,
     with X being the predictor data, and Y the binary class labels of
     observations in X.

        • ‘X’ must be an NxP numeric matrix of predictor data where rows
          correspond to observations and columns correspond to features
          or variables.
        • ‘Y’ is an Nx1 numeric vector containing binary class labels,
          typically 0 or 1.

     ‘MDL = fitcgam (..., NAME, VALUE)’ returns a GAM classification
     model with additional options specified by Name-Value pair
     arguments listed below.

     Model Parameters
     ----------------

     NAME            VALUE
                     
     ---------------------------------------------------------------------------
     "PredictorNames"A cell array of character vectors specifying the names
                     of the predictors.  The length of this array must match
                     the number of columns in X.
                     
     "ResponseName"  A character vector specifying the name of the response
                     variable.
                     
     "ClassNames"    Names of the classes in the class labels, Y, used for
                     fitting the GAM.  ClassNames are of the
                     same type as the class labels in Y.
                     
     "Cost"          A NxR numeric matrix containing misclassification cost
                     for the corresponding instances in X where R is the
                     number of unique categories in Y.  If an instance is
                     correctly classified into its category the cost is
                     calculated to be 1, otherwise 0.  The cost matrix can
                     be altered using ‘MDL.COST = somecost’.  The default
                     value is COST = ones (rows (X), numel (unique (Y))).
                     
     "Formula"       A model specification given as a string in the form "Y ~
                     terms" where Y represents the response variable and
                     terms the predictor variables.  The formula can be used
                     to specify a subset of variables for training the model.
                     For example: "Y ~ x1 + x2 + x3 + x4 + x1:x2 + x2:x3"
                     specifies four linear terms for the first four columns
                     of the predictor data, and x1:x2 and x2:x3 specify the
                     two interaction terms for the 1st-2nd and 2nd-3rd
                     columns respectively.  Only these terms will be used for
                     training the model, but X must have at least as many
                     columns as referenced in the formula.  If predictor
                     variable names have been defined, then the terms in the
                     formula must refer to those.  When "formula" is
                     specified, all terms used for training the model are
                     referenced in the IntMatrix field of the OBJ class
                     object as a matrix containing the column indexes for
                     each term including both the predictors and the
                     interactions used.
                     
     "Interactions"  A logical matrix, a positive integer scalar, or the
                     string "all" for defining the interactions between
                     predictor variables.  When given a logical matrix, it
                     must have the same number of columns as X and each row
                     corresponds to a different interaction term combining
                     the predictors indexed as true.  Each interaction term
                     is appended as a column vector after the available
                     predictor column in X.  When "all" is defined, then all
                     possible combinations of interactions are appended in X
                     before training.  At the moment, passing a positive
                     integer has the same effect as the "all" option.  When
                     "interactions" is specified, only the interaction terms
                     appended to X are referenced in the IntMatrix field of
                     the OBJ class object.
                     
     "Knots"         A scalar or a row vector with the same columns as X.  It
                     defines the knots for fitting a polynomial when training
                     the GAM. As a scalar, it is expanded to a row vector.
                     The default value is 5, hence expanded to ones (1,
                     columns (X)) * 5.  You can pass a row vector with a
                     different number of knots for each predictor variable,
                     although this is not recommended.
                     
     "Order"         A scalar or a row vector with the same columns as X.  It
                     defines the order of the polynomial when training the
                     GAM.  As a scalar, it is expanded to a row vector.  The
                     default value is 3, hence expanded to ones (1, columns
                     (X)) * 3.  You can pass a row vector with a different
                     polynomial order for each predictor variable, although
                     this is not recommended.
                     
     "DoF"           A scalar or a row vector with the same columns as X.  It
                     defines the degrees of freedom for fitting a polynomial
                     when training the GAM. As a scalar, it is expanded to a
                     row vector.  The default value is 8, hence expanded to
                     ones (1, columns (X)) * 8.  You can pass a row vector
                     with different degrees of freedom for each predictor
                     variable, although this is not recommended.
                     
     You can pass either a "Formula" or an "Interactions" optional
     parameter.  Passing both parameters will result in an error.
     Similarly, you can only pass up to two parameters among "Knots",
     "Order", and "DoF" to define the required polynomial for training
     the GAM model.
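
     A rough usage sketch on synthetic two-class data (the data and the
     parameter values are illustrative assumptions), passing two of the
     three polynomial parameters and assuming the ‘predict’ method of
     the returned object:

     ```octave
     pkg load statistics

     randn ("state", 42);
     X = [randn(50, 2); randn(50, 2) + 2];    # two shifted point clouds
     Y = [zeros(50, 1); ones(50, 1)];         # binary class labels

     ## Only two of "Knots", "Order" and "DoF" are given
     mdl = fitcgam (X, Y, "Knots", 5, "Order", 3);

     ## Predicted labels for the first five observations
     labels = predict (mdl, X(1:5,:))
     ```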

     See also: ClassificationGAM.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 65
Fit a Generalized Additive Model (GAM) for binary classification.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
fitcknn


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 15448
 -- statistics: MDL = fitcknn (X, Y)
 -- statistics: MDL = fitcknn (..., NAME, VALUE)

     Fit a k-Nearest Neighbor classification model.

     ‘MDL = fitcknn (X, Y)’ returns a k-Nearest Neighbor classification
     model, MDL, with X being the predictor data, and Y the class labels
     of observations in X.

        • ‘X’ must be an NxP numeric matrix of predictor data where rows
          correspond to observations and columns correspond to features
          or variables.
        • ‘Y’ is an Nx1 matrix or cell matrix containing the class labels
          of corresponding predictor data in X.  Y can be numerical,
          logical, char array or cell array of character vectors.  Y
          must have the same number of rows as X.

     ‘MDL = fitcknn (..., NAME, VALUE)’ returns a k-Nearest Neighbor
     classification model with additional options specified by
     Name-Value pair arguments listed below.

     Model Parameters
     ----------------

     NAME            VALUE
                     
     ---------------------------------------------------------------------------
     "Standardize"   A boolean flag indicating whether the data in X should
                     be standardized prior to training.
                     
     "PredictorNames"A cell array of character vectors specifying the
                     predictor variable names.  The variable names are
                     assumed to be in the same order as they appear in the
                     training data X.
                     
     "ResponseName"  A character vector specifying the name of the response
                     variable.
                     
     "ClassNames"    Names of the classes in the class labels, Y, used for
                     fitting the kNN model.  ClassNames are of the same type
                     as the class labels in Y.
                     
     "Prior"         A numeric vector specifying the prior probabilities for
                     each class.  The order of the elements in Prior
                     corresponds to the order of the classes in ClassNames.
                     
     "Cost"          A NxR numeric matrix containing misclassification cost
                     for the corresponding instances in X where R is the
                     number of unique categories in Y.  If an instance is
                     correctly classified into its category the cost is
                     calculated to be 1, otherwise 0.  The cost matrix can
                     be altered using ‘MDL.COST = somecost’.  The default
                     value is COST = ones (rows (X), numel (unique (Y))).
                     
     "ScoreTransform"A character vector defining one of the following
                     functions or a user defined function handle, which is
                     used for transforming the prediction scores returned by
                     the ‘predict’ and ‘resubPredict’ methods.  Default value
                     is 'none'.

          VALUE          DESCRIPTION
     ---------------------------------------------------------------------------
          "doublelogit"  1 ./ (1 + exp (-2 * x))
          "invlogit"     log (x ./ (1 - x))
          "ismax"        Sets the score for the class with the largest score
                         to 1, and sets the scores for all other classes to 0
          "logit"        1 ./ (1 + exp (-x))
          "none"         x (no transformation)
          "identity"     x (no transformation)
          "sign"         -1 for x < 0, 0 for x = 0, 1 for x > 0
          "symmetric"    2 * x - 1
          "symmetricismax"Sets the score for the class with the largest score
                         to 1, and sets the scores for all other classes to
                         -1
          "symmetriclogit"2 ./ (1 + exp (-x)) - 1

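     The logit-type transformations in the table can be sketched as
     plain anonymous functions, reading exp as the element-wise
     exponential function.  This only illustrates the formulas; it is
     not the package's internal code, and the handle names are chosen
     for the example:

     ```octave
     doublelogit    = @(x) 1 ./ (1 + exp (-2 * x));
     logit          = @(x) 1 ./ (1 + exp (-x));
     invlogit       = @(x) log (x ./ (1 - x));
     symmetriclogit = @(x) 2 ./ (1 + exp (-x)) - 1;

     s = [-2 0 2];
     logit (s)               # 0.1192  0.5000  0.8808
     symmetriclogit (s)      # -0.7616  0  0.7616
     invlogit (logit (s))    # recovers s up to rounding
     ```

     A handle of this kind is also what a user-defined "ScoreTransform"
     value would look like.
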
     NAME            VALUE
                     
     ---------------------------------------------------------------------------
     "BreakTies"     Tie-breaking algorithm used by predict when multiple
                     classes have the same smallest cost.  By default, ties
                     occur when multiple classes have the same number of
                     nearest points among the k nearest neighbors.  The
                     available options are specified by the following
                     character arrays:

          VALUE          DESCRIPTION
                         
     ---------------------------------------------------------------------------
          "smallest"     This is the default and it favors the class with the
                         smallest index among the tied groups, i.e.  the one
                         that appears first in the training labelled data.
          "nearest"      This favors the class with the nearest neighbor
                         among the tied groups, i.e.  the class with the
                         closest member point according to the distance
                         metric used.
          "random"       This randomly picks one class among the tied groups.

     NAME            VALUE
                     
     ---------------------------------------------------------------------------
     "BucketSize"    The maximum number of data points in the leaf node of
                     the Kd-tree and it must be a positive integer.  By
                     default, it is 50.  This argument is meaningful only
                     when the selected search method is "kdtree".
                     
     "NumNeighbors"  A positive integer value specifying the number of
                     nearest neighbors to be found in the kNN search.  By
                     default, it is 1.
                     
     "Exponent"      A positive scalar (usually an integer) specifying the
                     Minkowski distance exponent.  This argument is only
                     valid when the selected distance metric is "minkowski".
                     By default it is 2.
                     
     "Scale"         A nonnegative numeric vector specifying the scale
                     parameters for the standardized Euclidean distance.  The
                     vector length must be equal to the number of columns in
                     X.  This argument is only valid when the selected
                     distance metric is "seuclidean", in which case each
                     coordinate of X is scaled by the corresponding element
                     of "scale", as is each query point in Y.  By default,
                     the scale parameter is the standard deviation of each
                     coordinate in X.  If a variable in X is constant, i.e.
                     zero variance, this value is forced to 1 to avoid
                     division by zero.  This is the equivalent of this
                     variable not being standardized.
                     
     "Cov"           A square matrix with the same number of columns as X
                     specifying the covariance matrix for computing the
                     Mahalanobis distance.  This must be a positive definite
                     matrix.  This argument is only valid when the
                     selected distance metric is "mahalanobis".
                     
     "Distance"      is the distance metric used by ‘knnsearch’ as specified
                     below:

          VALUE          DESCRIPTION
                         
     ---------------------------------------------------------------------------
          "euclidean"    Euclidean distance.
          "seuclidean"   standardized Euclidean distance.  Each coordinate
                         difference between the rows in X and the query
                         matrix Y is scaled by dividing by the corresponding
                         element of the standard deviation computed from X.
                         To specify a different scaling, use the "Scale"
                         name-value argument.
          "cityblock"    City block distance.
          "chebychev"    Chebychev distance (maximum coordinate difference).
          "minkowski"    Minkowski distance.  The default exponent is 2.  To
                         specify a different exponent, use the "Exponent"
                         name-value argument.
          "mahalanobis"  Mahalanobis distance, computed using a positive
                         definite covariance matrix.  To change the value of
                         the covariance matrix, use the "Cov" name-value
                         argument.
          "cosine"       Cosine distance.
          "correlation"  One minus the sample linear correlation between
                         observations (treated as sequences of values).
          "spearman"     One minus the sample Spearman's rank correlation
                         between observations (treated as sequences of
                         values).
          "hamming"      Hamming distance, which is the percentage of
                         coordinates that differ.
          "jaccard"      One minus the Jaccard coefficient, which is the
                         percentage of nonzero coordinates that differ.
          @DISTFUN       Custom distance function handle.  A distance
                         function of the form ‘function D2 = distfun (XI,
                         YI)’, where XI is a 1xP vector containing a single
                         observation in P-dimensional space, YI is an NxP
                         matrix containing an arbitrary number of
                         observations in the same P-dimensional space, and D2
                         is an Nx1 vector of distances, where D2(k) is the
                         distance between observations XI and YI(k,:).

     NAME            VALUE
                     
     ---------------------------------------------------------------------------
     "DistanceWeight"A distance weighting function, specified either as a
                     function handle, which accepts a matrix of nonnegative
                     distances and returns a matrix the same size containing
                     nonnegative distance weights, or one of the following
                     values: "equal", which corresponds to no weighting;
                     "inverse", which corresponds to a weight equal to
                     1/distance; "squaredinverse", which corresponds to a
                     weight equal to 1/distance^2.
                     
     "IncludeTies"   A boolean flag to indicate if the returned values should
                     contain indices that have the same distance as the K^th
                     neighbor.  When false, ‘knnsearch’ chooses the
                     observation with the smallest index among the
                     observations that have the same distance from a query
                     point.  When true, ‘knnsearch’ includes all nearest
                     neighbors whose distances are equal to the K^th smallest
                     distance in the output arguments.  To specify K, use the
                     "K" name-value pair argument.
                     
     "NSMethod"      is the nearest neighbor search method used by
                     ‘knnsearch’ as specified below.

          VALUE          DESCRIPTION
                         
     ---------------------------------------------------------------------------
          "kdtree"       Creates and uses a Kd-tree to find nearest
                         neighbors.  "kdtree" is the default value when the
                         number of columns in X is less than or equal to 10,
                         X is not sparse, and the distance metric is
                         "euclidean", "cityblock", "manhattan", "chebychev",
                         or "minkowski".  Otherwise, the default value is
                         "exhaustive".  This argument is only valid when the
                         distance metric is one of the four aforementioned
                         metrics.
          "exhaustive"   Uses the exhaustive search algorithm by computing
                         the distance values from all the points in X to each
                         point in Y.

     Cross Validation Options
     ------------------------

     NAME            VALUE
                     
     ---------------------------------------------------------------------------
     "Crossval"      Cross-validation flag specified as 'on' or 'off'.  If
                     'on' is specified, a 10-fold cross validation is
                     performed and a ‘ClassificationPartitionedModel’ is
                     returned in MDL.  To override this cross-validation
                     setting, use only one of the following Name-Value pair
                     arguments.
                     
     "CVPartition"   A ‘cvpartition’ object that specifies the type of
                     cross-validation and the indexing for the training and
                     validation sets.  A ‘ClassificationPartitionedModel’ is
                     returned in MDL and the trained model is stored in the
                     ‘Trained’ property.
                     
     "Holdout"       Fraction of the data used for holdout validation,
                     specified as a scalar value in the range [0,1].  When
                     specified, a randomly selected percentage is reserved as
                     validation data and the remaining set is used for
                     training.  The trained model is stored in the ‘Trained’
                     property of the ‘ClassificationPartitionedModel’
                     returned in MDL.  "Holdout" partitioning attempts to
                     ensure that each partition represents the classes
                     proportionately.
                     
     "KFold"         Number of folds to use in the cross-validated model,
                     specified as a positive integer value greater than 1.
                     When specified, the data is randomly partitioned into k
                     sets and each set, in turn, is reserved as      
                     validation data while the remaining k-1 sets are used
                     for training.  The trained models are stored in the
                     ‘Trained’ property of the
                     ‘ClassificationPartitionedModel’ returned in MDL.
                     "KFold" partitioning attempts to ensure that each
                     partition represents the classes proportionately.
                     
     "Leaveout"      Leave-one-out cross-validation flag specified as 'on' or
                     'off'.  If 'on' is specified, then for each of the n
                     observations (where n is the number of observations,
                     excluding missing observations, specified in the
                     ‘NumObservations’ property of the model), one
                     observation is reserved as validation data while the
                     remaining observations are used for training.  The
                     trained models are stored in the ‘Trained’ property of
                     the ‘ClassificationPartitionedModel’ returned in MDL.

     See also: ClassificationKNN, ClassificationPartitionedModel,
     knnsearch, rangesearch, pdist2.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 46
Fit a k-Nearest Neighbor classification model.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
fitcnet


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 5664
 -- statistics: MDL = fitcnet (X, Y)
 -- statistics: MDL = fitcnet (..., NAME, VALUE)

     Fit a Neural Network classification model.

     ‘MDL = fitcnet (X, Y)’ returns a Neural Network classification
     model, MDL, with X being the predictor data, and Y the class labels
     of observations in X.

        • ‘X’ must be a NxP numeric matrix of predictor data where rows
          correspond to observations and columns correspond to features
          or variables.
        • ‘Y’ is Nx1 matrix or cell matrix containing the class labels
          of corresponding predictor data in X.  Y can contain any type
          of categorical data.  Y must have the same row count as X.  

     ‘MDL = fitcnet (..., NAME, VALUE)’ returns a Neural Network
     classification model with additional options specified by
     Name-Value pair arguments listed below.

     Model Parameters
     ----------------

     NAME                      VALUE
                               
     -------------------------------------------------------------------------------------
     "Standardize"             A boolean flag indicating whether the data in X should
                               be standardized prior to training.
                               
     "PredictorNames"          A cell array of character vectors specifying the
                               predictor variable names.  The variable names are
                               assumed to be in the same order as they appear in the
                               training data X.
                               
     "ResponseName"            A character vector specifying the name of the response
                               variable.
                               
     "ClassNames"              Names of the classes in the class labels, Y, used for
                               fitting the Neural Network model.  ClassNames are of the
                               same type as the class labels in Y.
                               
     "Prior"                   A numeric vector specifying the prior probabilities for
                               each class.  The order of the elements in Prior
                               corresponds to the order of the classes in ClassNames.
                               
     "LayerSizes"              A vector of positive integers that defines the sizes of
                               the fully connected layers in the neural network model.
                               Each element in LayerSizes corresponds to the number of
                               outputs for the respective fully connected layer in the
                               neural network model.  The default value is 10.
                               
     "LearningRate"            A positive scalar value that defines the learning rate
                               during the gradient descent.  Default value is 0.01.
                               
     "Activations"             A character vector or a cellstr vector specifying the
                               activation functions for the hidden layers of the neural
                               network (excluding the output layer).  The available
                               activation functions are 'linear', 'sigmoid', 'tanh',
                               and 'none'.  The default value is 'sigmoid'.           
                               
     "OutputLayerActivation"   A character vector specifying the activation function
                               for the output layer of the neural network.  The
                               available activation functions are 'linear', 'sigmoid',
                               'tanh', and 'none'.  The default value is           
                               'sigmoid'.
                               
     "IterationLimit"          A positive integer scalar that specifies the maximum
                               number of training iterations.  The default value is
                               1000.
                               
     "DisplayInfo"             A boolean flag indicating whether to print information
                               during training.  Default is false.
                               
     "ScoreTransform"          A character vector defining one of the following
                               functions or a user defined function handle, which is
                               used for transforming the prediction scores returned by
                               the ‘predict’ and ‘resubPredict’ methods.  Default value
                               is 'none'.

          VALUE                  DESCRIPTION
     -----------------------------------------------------------------------------------
          "doublelogit"          1 ./ (1 + exp (-2 * x))   
          "invlogit"             log (x ./ (1 - x))
          "ismax"                Sets the score for the class with the largest score
                                 to 1, and sets the scores for all other classes to 0
          "logit"                1 ./ (1 + exp (-x))   
          "none"                 x (no transformation)
          "identity"             x (no transformation)
          "sign"                 -1 for x < 0, 0 for x = 0, 1 for x > 0
          "symmetric"            2 * x - 1
          "symmetricismax"       Sets the score for the class with the largest score
                                 to 1, and sets the scores for all other classes to
                                 -1
          "symmetriclogit"       2 ./ (1 + exp (-x)) - 1   

     See also: ClassificationNeuralNetwork.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 42
Fit a Neural Network classification model.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
fitcsvm


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 11801
 -- statistics: MDL = fitcsvm (X, Y)
 -- statistics: MDL = fitcsvm (..., NAME, VALUE)

     Fit a Support Vector Machine classification model.

     ‘MDL = fitcsvm (X, Y)’ returns a Support Vector Machine
     classification model, MDL, with X being the predictor data, and Y
     the class labels of observations in X.

        • ‘X’ must be a NxP numeric matrix of predictor data where rows
          correspond to observations and columns correspond to features
          or variables.
        • ‘Y’ is Nx1 matrix or cell matrix containing the class labels
          of corresponding predictor data in X.  Y can be numerical,
          logical, char array or cell array of character vectors.  Y
          must have the same row count as X. 

     ‘MDL = fitcsvm (..., NAME, VALUE)’ returns a Support Vector Machine
     model with additional options specified by Name-Value pair
     arguments listed below.

     Model Parameters
     ----------------

     NAME            VALUE
                     
     ---------------------------------------------------------------------------
     "Standardize"   A boolean flag indicating whether the data in X should
                     be standardized prior to training.
                     
     "PredictorNames"A cell array of character vectors specifying the
                     predictor variable names.  The variable names are
                     assumed to be in the same order as they appear in the
                     training data X.
                     
     "ResponseName"  A character vector specifying the name of the response
                     variable.
                     
     "ClassNames"    Names of the classes in the class labels, Y, used for
                     fitting the SVM model.  ClassNames are of the same type
                     as the class labels in Y.
                     
     "Prior"         A numeric vector specifying the prior probabilities for
                     each class.  The order of the elements in Prior
                     corresponds to the order of the classes in ClassNames.
                     
     "Cost"          A NxR numeric matrix containing misclassification cost
                     for the corresponding instances in X where R is the
                     number of unique categories in Y.  If an instance is
                     correctly classified into its category the cost is
                     calculated to be 1, otherwise 0.  The cost matrix can be
                     set via ‘MDL.COST = somecost’.  Default value COST =
                     ones(rows(X),numel(unique(Y))).
                     
     "SVMtype"       Specifies the type of SVM used for training the
                     ‘ClassificationSVM’ model.  By default, the type of SVM
                     is defined by setting other parameters and/or by the
                     data itself.  Setting the "SVMtype" parameter overrides
                     the default behavior and it accepts the following
                     options:

          VALUE          DESCRIPTION
     ---------------------------------------------------------------------------
          "C_SVC"        It is the standard SVM formulation for
                         classification tasks.  It aims to find the optimal
                         hyperplane that separates different classes by
                         maximizing the margin between them while allowing
                         some misclassifications.  The parameter "C" controls
                         the trade-off between maximizing the margin and
                         minimizing the classification error.  It is the
                         default type, unless otherwise specified.
          "nu_SVC"       It is a variation of the standard SVM that
                         introduces a parameter ν (nu) as an upper bound on
                         the fraction of margin errors and a lower bound on
                         the fraction of support vectors.  This formulation
                         provides more control over the number of support
                         vectors and the margin errors, making it useful for
                         specific classification scenarios.  It is the
                         default type, when the "OutlierFraction" parameter
                         is set.
          "one_class_SVM"It is used for anomaly detection and novelty
                         detection tasks.  It aims to separate the data
                         points of a single class from the origin in a
                         high-dimensional feature space.  This method is
                         particularly useful for identifying outliers or
                         unusual patterns in the data.  It is the default
                         type, when the "Nu" parameter is set or when there
                         is a single class in Y.  When "one_class_SVM" is set
                         by the "SVMtype" pair argument, Y has no effect and
                         any classes are ignored.

     NAME            VALUE
                     
     ---------------------------------------------------------------------------
     "OutlierFraction"The expected proportion of outliers in the training
                     data, specified as a scalar value in the range [0,1].
                     When specified, the type of SVM model is switched to
                     "nu_SVC" and "OutlierFraction" defines the ν (nu)
                     parameter.
                     
     "KernelFunction"A character vector specifying the method for computing
                     elements of the Gram matrix.  The available kernel
                     functions are 'gaussian' or 'rbf', 'linear',
                     'polynomial', and 'sigmoid'.  For one-class learning,
                     the default Kernel function is 'rbf'.  For two-class
                     learning the default is 'linear'.
                     
     "PolynomialOrder"A positive integer that specifies the order of
                     polynomial in kernel function.  The default value is 3.
                     Unless the "KernelFunction" is set to 'polynomial', this
                     parameter is ignored.
                     
     "KernelScale"   A positive scalar that specifies a scaling factor for
                     the γ (gamma) parameter, which can be seen as the
                     inverse of the radius of influence of samples selected
                     by the model as support vectors.  The γ (gamma)
                     parameter is computed as gamma = KernelScale / (number
                     of features).  The default value for "KernelScale" is 1.
                     
     "KernelOffset"  A nonnegative scalar that specifies the coef0 in kernel
                     function.  For the polynomial kernel, it influences the
                     polynomial's shift, and for the sigmoid kernel, it
                     affects the hyperbolic tangent's shift.  The default
                     value for "KernelOffset" is 0.
                     
     "BoxConstraint" A positive scalar that specifies the upper bound of the
                     Lagrange multipliers, i.e.  the parameter C, which is
                     used for training "C_SVC" and "one_class_SVM" type of
                     models.  It determines the trade-off between maximizing
                     the margin and minimizing the classification error.  The
                     default value for "BoxConstraint" is 1.
                     
     "Nu"            A positive scalar, in the range (0,1] that specifies the
                     parameter ν (nu) for training "nu_SVC" and
                     "one_class_SVM" type of models.  Unless overridden by
                     setting the "SVMtype" parameter, setting the "Nu"
                     parameter always forces the training model type to
                     "one_class_SVM", in which case, the number of classes in
                     Y is ignored.  The default value for "Nu" is 1.
                     
     "CacheSize"     A positive scalar that specifies the memory requirements
                     (in MB) for storing the Gram matrix.  The default is
                     1000.
                     
     "Tolerance"     A nonnegative scalar that specifies the tolerance of
                     termination criterion.  The default value is 1e-6.
                     
     "Shrinking"     Specifies whether to use shrinking heuristics.  It
                     accepts either 0 or 1.  The default value is 1.

     Cross Validation Options
     ------------------------

     NAME            VALUE
                     
     ---------------------------------------------------------------------------
     "Crossval"      Cross-validation flag specified as 'on' or 'off'.  If
                     'on' is specified, a 10-fold cross validation is
                     performed and a ‘ClassificationPartitionedModel’ is
                     returned in MDL.  To override this cross-validation
                     setting, use only one of the following Name-Value pair
                     arguments.
                     
     "CVPartition"   A ‘cvpartition’ object that specifies the type of
                     cross-validation and the indexing for the training and
                     validation sets.  A ‘ClassificationPartitionedModel’ is
                     returned in MDL and the trained model is stored in the
                     ‘Trained’ property.
                     
     "Holdout"       Fraction of the data used for holdout validation,
                     specified as a scalar value in the range [0,1].  When
                     specified, a randomly selected percentage is reserved as
                     validation data and the remaining set is used for
                     training.  The trained model is stored in the ‘Trained’
                     property of the ‘ClassificationPartitionedModel’
                     returned in MDL.  "Holdout" partitioning attempts to
                     ensure that each partition represents the classes
                     proportionately.
                     
     "KFold"         Number of folds to use in the cross-validated model,
                     specified as a positive integer value greater than 1.
                     When specified, the data is randomly partitioned into k
                     sets and each set, in turn, is reserved as      
                     validation data while the remaining k-1 sets are used
                     for training.  The trained models are stored in the
                     ‘Trained’ property of the
                     ‘ClassificationPartitionedModel’ returned in MDL.
                     "KFold" partitioning attempts to ensure that each
                     partition represents the classes proportionately.
                     
     "Leaveout"      Leave-one-out cross-validation flag specified as 'on' or
                     'off'.  If 'on' is specified, then for each of the n
                     observations (where n is the number of observations,
                     excluding missing observations, specified in the
                     ‘NumObservations’ property of the model), one
                     observation is reserved as validation data while the
                     remaining observations are used for training.  The
                     trained models are stored in the ‘Trained’ property of
                     the ‘ClassificationPartitionedModel’ returned in MDL.

     See also: ClassificationSVM, ClassificationPartitionedModel,
     svmtrain, svmpredict.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 50
Fit a Support Vector Machine classification model.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 9
fitgmdist


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 3054
 -- statistics: GMDIST = fitgmdist (DATA, K, PARAM1, VALUE1, ...)

     Fit a Gaussian mixture model with K components to DATA.  Each row
     of DATA is a data sample.  Each column is a variable.

     Optional parameters are:
        • "start": Initialization conditions.  Possible values are:
             • "randSample" (default) Takes means uniformly from rows of
               data.
             • "plus" Use k-means++ to initialize means.
             • "cluster" Performs an initial clustering with 10% of the
               data.
             • VECTOR A vector whose length is the number of rows in
               data, and whose values from 1 to k specify the component
               each row is initially allocated to.  The mean, variance,
               and weight of each component is calculated from that.
             • STRUCTURE A structure with fields mu, Sigma and
               ComponentProportion.
          For "randSample", "plus", and "cluster", the initial variance
          of each component is the variance of the entire data sample.

        • "Replicates": Number of random restarts to perform.

        • "RegularizationValue" or "Regularize": A small number added to
          the diagonal entries of the covariance to prevent singular
          covariances.

        • "SharedCovariance" or "SharedCov" (logical).  True if all
          components share the same covariance, to reduce the number
          of free parameters.  

        • "CovarianceType" or "CovType" (string).  Possible values are:
             • "full" (default) Allow arbitrary covariance matrices.
             • "diagonal" Force covariances to be diagonal, to reduce
               the number of free parameters.

        • "Options": A structure with all of the following fields:
             • MaxIter Maximum number of EM iterations (default 100).
             • TolFun Threshold increase in likelihood to terminate EM
               (default 1e-6).
             • Display Possible values are:
                  • "off" (default): Display nothing.
                  • "final": Display the total number of iterations and
                    likelihood once the execution completes.
                   • "iter": Display the iteration number and   
                    likelihood after each iteration.
        • "Weight": A column vector or Nx2 matrix.  The first column
          consists of non-negative weights given to the samples.  If
          these are all integers, this is equivalent to specifying
          WEIGHT(i) copies of row i of DATA, but potentially faster.  If
          a row of DATA is used to represent samples that are similar
          but not identical, then the second column of WEIGHT indicates
          the variance of those original samples.  Specifically, in the
          EM algorithm, the contribution of row i towards the variance
          is set to at least WEIGHT(i,2), to prevent spurious components
          with zero variance.

     See also: gmdistribution, kmeans.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 55
Fit a Gaussian mixture model with K components to DATA.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 5
fitlm


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 4161
 -- statistics: TAB = fitlm (X, Y)
 -- statistics: TAB = fitlm (X, Y, NAME, VALUE)
 -- statistics: TAB = fitlm (X, Y, MODELSPEC)
 -- statistics: TAB = fitlm (X, Y, MODELSPEC, NAME, VALUE)
 -- statistics: [TAB] = fitlm (...)
 -- statistics: [TAB, STATS] = fitlm (...)
 -- statistics: [TAB, STATS] = fitlm (...)

     Regress the continuous outcome (i.e.  dependent variable) Y on
     continuous or categorical predictors (i.e.  independent variables)
     X by minimizing the sum-of-squared residuals.  Unless requested
     otherwise, fitlm prints the model formula, the regression
     coefficients (i.e.  parameters/contrasts) and an ANOVA table.  Note
     that unlike anovan, fitlm treats all factors as continuous by
     default.  A bootstrap resampling variant of this function,
     ‘bootlm’, is available in the statistics-resampling package and has
     similar usage.

     X must be a column major matrix or cell array consisting of the
     predictors.  A constant term (intercept) should not be included in
     X - it is automatically added to the model.  Y must be a column
     vector corresponding to the outcome variable.  MODELSPEC can be
     passed as one of the following:

        • "constant" : model contains only a constant (intercept) term.

        • "linear" (default) : model contains an intercept and linear
          term for each predictor.

        • "interactions" : model contains an intercept, linear term for
          each predictor and all products of pairs of distinct
          predictors.

        • "full" : model contains an intercept, linear term for each
          predictor and all combinations of the predictors.

        • a matrix of term definitions : a t-by-(N+1) matrix specifying 
          terms in a model, where t is the number of terms, N is the
          number of predictor variables, and +1 accounts for the outcome
          variable.  The outcome variable is the last column in the
          terms matrix and must be a column of zeros.  An intercept must
          be specified in the first row of the terms matrix and must be
          a row of zeros.

     fitlm can take a number of optional parameters as name-value pairs.

     ‘[...] = fitlm (..., "CategoricalVars", CATEGORICAL)’

        • CATEGORICAL is a vector of indices indicating which of the
          columns (i.e.  variables) in X should be treated as
          categorical predictors rather than as continuous predictors.

     fitlm also accepts optional anovan parameters as name-value pairs
     (except for the "model" parameter).  The accepted parameter names
     from anovan and their default values in fitlm are:

        • CONTRASTS : "treatment"

        • SSTYPE: 2

        • ALPHA: 0.05

        • DISPLAY: "on"

        • WEIGHTS: [] (empty)

        • RANDOM: [] (empty)

        • CONTINUOUS: [1:N]

        • VARNAMES: [] (empty)

     Type 'help anovan' to find out more about what these options do.

     fitlm can return up to two output arguments:

     [TAB] = fitlm (...) returns a cell array containing a table of
     model parameters

     [TAB, STATS] = fitlm (...) returns a structure containing
     additional statistics, including degrees of freedom and effect
     sizes for each term in the linear model, the design matrix, the
     variance-covariance matrix, (weighted) model residuals, and the
     mean squared error.  The columns of STATS.coeffs (from
     left-to-right) report the model coefficients, standard errors,
     lower and upper 100*(1-alpha)% confidence interval bounds,
     t-statistics, and p-values relating to the contrasts.  The number
     appended to each term name in STATS.coeffnames corresponds to the
     column number in the relevant contrast matrix for that factor.  The
     STATS structure can be used as input for multcompare.  Note that if
     the model contains a continuous variable and you wish to use the
     STATS output as input to multcompare, then the model has to be
     refit with the "contrasts" parameter set to a sum-to-zero contrast
     coding scheme, e.g. "simple".

     See also: anovan, multcompare.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Regress the continuous outcome (i.e.  dependent variable) Y on
continuous or ...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
fitrgam


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6425
 -- statistics: OBJ = fitrgam (X, Y)
 -- statistics: OBJ = fitrgam (X, Y, NAME, VALUE)

     Fit a Generalized Additive Model (GAM) for regression.

     ‘OBJ = fitrgam (X, Y)’ returns an object of class RegressionGAM,
     with matrix X containing the predictor data and vector Y containing
     the continuous response data.

        • X must be an NxP numeric matrix of input data in which rows
          correspond to observations and columns correspond to features
          or variables.  X will be used to train the GAM model.
        • Y must be an Nx1 numeric vector containing the response data
          corresponding to the predictor data in X.  Y must have as
          many rows as X.

     ‘OBJ = fitrgam (..., NAME, VALUE)’ returns an object of class
     RegressionGAM with additional properties specified by Name-Value
     pair arguments listed below.

          NAME           VALUE
                         
     ---------------------------------------------------------------------------
          "predictors"   Predictor Variable names, specified as a row vector
                         cell of strings with the same length as the columns
                         in X.  If omitted, the program will generate default
                         variable names (x1, x2, ..., xn) for each column in
                         X.
                         
          "responsename" Response Variable Name, specified as a string.  If
                         omitted, the default value is "Y".
                         
          "formula"      a model specification given as a string in the form
                         "Y ~ terms" where Y represents the reponse variable
                         and terms the predictor variables.  The formula can
                         be used to specify a subset of variables to train
                         the model.  For example: "Y ~ x1 + x2 + x3 + x4
                         + x1:x2 + x2:x3" specifies four linear terms for the
                         first four columns of the predictor data, and x1:x2
                         and x2:x3 specify the two interaction terms for
                         1st-2nd and 2nd-3rd columns respectively.  Only
                         these terms will be used for training the model, but
                         X must have at least as many columns as referenced
                         in the formula.  If Predictor Variable names have
                         been defined, then the terms in the formula have
                         to reference those.  When "formula" is specified,
                         all terms used for training the model are referenced
                         in the IntMatrix field of the OBJ class object as a
                         matrix containing the column indexes for each term
                         including both the predictors and the interactions
                         used.
                         
          "interactions" a logical matrix, a positive integer scalar, or the
                         string "all" for defining the interactions between
                         predictor variables.  When given a logical matrix,
                         it must have the same number of columns as X and
                         each row corresponds to a different interaction term
                         combining the predictors indexed as true.  Each
                         interaction term is appended as a column vector
                         after the available predictor column in X.  When
                         "all" is defined, then all possible combinations of
                         interactions are appended in X before training.  At
                         the moment, parsing a positive integer has the same
                         effect as the "all" option.  When "interactions" is
                         specified, only the interaction terms appended to X
                         are referenced in the IntMatrix field of the OBJ
                         class object.
                         
          "knots"        a scalar or a row vector with the same columns as X.
                         It defines the knots for fitting a polynomial when
                         training the GAM. As a scalar, it is expanded to a
                         row vector.  The default value is 5, hence expanded
                         to ones (1, columns (X)) * 5.  You can parse a row
                         vector with different number of knots for each
                         predictor variable to be fitted with, although not
                         recommended.
                         
          "order"        a scalar or a row vector with the same columns as X.
                         It defines the order of the polynomial when training
                         the GAM. As a scalar, it is expanded to a row
                         vector.  The default value is 3, hence expanded to
                         ones (1, columns (X)) * 3.  You can parse a row
                         vector with a different order of polynomial for
                         each predictor variable to be fitted with, although
                         it is not recommended.
                         
          "dof"          a scalar or a row vector with the same columns as X.
                         It defines the degrees of freedom for fitting a
                         polynomial when training the GAM. As a scalar, it is
                         expanded to a row vector.  The default value is 8,
                         hence expanded to ones (1, columns (X)) * 8.  You
                         can parse a row vector with different degrees of
                         freedom for each predictor variable to be fitted
                         with, although not recommended.
                         
          "tol"          a positive scalar to set the tolerance for
                         convergence during training.  By default, it is set
                         to 1e-3.

     You can pass either a "formula" or an "interactions" optional
     parameter.  Passing both parameters will result in an error.
     Similarly, you can only pass up to two parameters among "knots",
     "order", and "dof" to define the required polynomial for training
     the GAM model.

     See also: RegressionGAM, regress, regress_gp.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 54
Fit a Generalized Additive Model (GAM) for regression.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
friedman


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1726
 -- statistics: P = friedman (X)
 -- statistics: P = friedman (X, REPS)
 -- statistics: P = friedman (X, REPS, DISPLAYOPT)
 -- statistics: [P, ATAB] = friedman (...)
 -- statistics: [P, ATAB, STATS] = friedman (...)

     Performs the nonparametric Friedman's test to compare column
     effects in a two-way layout.  friedman tests the null hypothesis
     that the column effects are all the same against the alternative
     that they are not all the same.

     friedman takes one, two, or three input arguments:

        • X contains the data and it must be a matrix of at least two
          columns and two rows.
        • REPS is the number of replicates for each combination of
          factor groups.  If not provided, no replicates are assumed.
        • DISPLAYOPT is an optional parameter for displaying the
          Friedman's ANOVA table, when it is 'on' (default) and
          suppressing the display when it is 'off'.

     friedman returns up to three output arguments:

        • P is the p-value of the null hypothesis that all group means
          are equal.
        • ATAB is a cell array containing the results in a Friedman's
          ANOVA table.
        • STATS is a structure containing statistics useful for
          performing a multiple comparison of medians with the
          MULTCOMPARE function.

     If friedman is called without any output arguments, then it prints
     the results in a Friedman's ANOVA table to standard output, as if
     DISPLAYOPT is 'on'.

     Examples:

          load popcorn;
          friedman (popcorn, 3);

          [p, anovatab, stats] = friedman (popcorn, 3, "off");
          disp (p);

     See also: anova2, kruskalwallis, multcompare.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Performs the nonparametric Friedman's test to compare column effects in
a two...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
fullfact


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 494
 -- statistics: A = fullfact (LEVELS)

     Full factorial design.

     ‘A = fullfact (LEVELS)’ returns a numeric matrix A with the
     treatments of a full factorial design specified by LEVELS, which
     must be a numeric vector of real positive integer values with each
     value specifying the number of levels of each individual factor.

     Each row of A corresponds to a single treatment and each column to
     a single factor.  For binary full factorial design, use ‘ff2n’.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 22
Full factorial design.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
geomean


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1980
 -- statistics: M = geomean (X)
 -- statistics: M = geomean (X, "all")
 -- statistics: M = geomean (X, DIM)
 -- statistics: M = geomean (X, VECDIM)
 -- statistics: M = geomean (..., NANFLAG)

     Compute the geometric mean of X.

        • If X is a vector, then ‘geomean(X)’ returns the geometric mean
          of the elements in X defined as

               geomean (X) = PROD_i X(i) ^ (1/N)

          where N is the length of the X vector.

        • If X is a matrix, then ‘geomean(X)’ returns a row vector with
          the geometric mean of every column in X.

        • If X is a multidimensional array, then ‘geomean(X)’ operates
          along the first nonsingleton dimension of X.

        • X must not contain any negative or complex values.

     ‘geomean(X, "all")’ returns the geometric mean of all the elements
     in X.  If X contains any 0, then the returned value is 0.

     ‘geomean(X, DIM)’ returns the geometric mean along the operating
     dimension DIM of X.  Calculating the geometric mean of any subarray
     containing any 0 will yield 0.

     ‘geomean(X, VECDIM)’ returns the geometric mean over the dimensions
     specified in the vector VECDIM.  For example, if X is a 2-by-3-by-4
     array, then ‘geomean(X, [1 2])’ returns a 1-by-1-by-4 array.  Each
     element of the output array is the geometric mean of the elements
     on the corresponding page of X.  If VECDIM indexes all dimensions
     of X, then it is equivalent to ‘geomean (X, "all")’.  Any dimension
     in VECDIM greater than ‘ndims (X)’ is ignored.

     ‘geomean(..., NANFLAG)’ specifies whether to exclude NaN values
     from the calculation, using any of the input argument combinations
     in previous syntaxes.  By default, geomean includes NaN values in
     the calculation (NANFLAG has the value "includenan").  To exclude
     NaN values, set the value of NANFLAG to "omitnan".

     See also: harmmean, mean.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 32
Compute the geometric mean of X.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
glmfit


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 5591
 -- statistics: B = glmfit (X, Y, DISTRIBUTION)
 -- statistics: B = glmfit (X, Y, DISTRIBUTION, NAME, VALUE)
 -- statistics: [B, DEV] = glmfit (...)
 -- statistics: [B, DEV, STATS] = glmfit (...)

     Perform generalized linear model fitting.

     ‘B = glmfit (X, Y, DISTRIBUTION)’ returns a vector B of coefficient
     estimates for a generalized linear regression model of the
     responses in Y on the predictors in X, using the distribution
     defined in DISTRIBUTION.

        • X is an nxp numeric matrix of predictor variables with n
          observations and p predictors.
        • Y is an nx1 numeric vector of responses for all supported
          distributions, except for the 'binomial' distribution in which
          case Y can be either a numeric or logical nx1 vector or an nx2
          matrix, where the first column contains the number of
          successes and the second column contains the number of trials.
        • DISTRIBUTION is a character vector specifying the distribution
          of the response variable.  Supported distributions are
          "normal", "binomial", "poisson", "gamma", and "inverse
          gaussian".

     ‘B = glmfit (..., NAME, VALUE)’ specifies additional options using
     Name-Value pair arguments.

     NAME            VALUE
                     
     ---------------------------------------------------------------------------
     "B0"            A numeric vector specifying initial values for the
                     coefficient estimates.  By default, the initial values
                     are fitted values fitted from the data.
                     
     "Constant"      A character vector specifying whether to include a
                     constant term in the model.  Valid options are "ON"
                     (default) and "OFF".
                     
     "EstDisp"       A character vector specifying whether to compute
                     the dispersion parameter.  Options are "ON" and "OFF".
                     For "binomial" and "poisson" distributions, the default
                     is "OFF", whereas for the "normal", "gamma", and
                     "inverse gaussian" distributions, the default is "ON".
                     
     "link"          A character vector specifying the name of a canonical
                     link function or a numeric scalar for specifying a
                     "power" link function.  Supported canonical link
                     functions include "identity" (default for "normal"
                     distribution), "log" (default for "poisson"
                     distribution), "logit" (default for "binomial"
                     distribution), "probit", "loglog", "comploglog", and
                     "reciprocal" (default for the "gamma" distribution).
                     The "power" link function is the default for the
                     "inverse gaussian" distribution with p = -2.  For custom
                     link functions, the user can provide a cell array of
                     three function handles: the link function, its
                     derivative, and its inverse, or alternatively a
                     structure S with three fields: S.Link, S.Derivative, and
                     S.Inverse.  Each field can either contain a function
                     handle or a character vector with the name of an
                     existing function.  All custom link functions must
                     accept a vector of inputs and return a vector of the
                     same size.
                     
     "Offset"        A numeric vector of the same length as the response Y
                     specifying an offset variable in the fit.  It is used as
                     an additional predictor with a coefficient value fixed
                     at 1.
                     
     "Options"       A scalar structure containing the fields MaxIter and
                     TolX. MaxIter must be a scalar positive integer
                     specifying the maximum number of iterations allowed in
                     fitting the model, and TolX must be a positive scalar
                     value specifying the termination tolerance.
                     
     "Weights"       An nx1 numeric vector of nonnegative values, where n is
                     the number of observations in X.  By default, it is
                     ‘ones (n, 1)’.

     ‘[B, DEV] = glmfit (...)’ also returns the deviance of the fit as a
     numeric value in DEV.  Deviance is a generalization of the residual
     sum of squares.  It measures the goodness of fit compared to a
     saturated model.

     ‘[B, DEV, STATS] = glmfit (...)’ also returns the structure STATS,
     which contains the model statistics in the following fields:

        • beta - Coefficient estimates B
        • dfe - Degrees of freedom for error
        • sfit - Estimated dispersion parameter
        • s - Theoretical or estimated dispersion parameter
        • estdisp - ‘false’ when "EstDisp" is "off" and ‘true’ when
          "EstDisp" is "on"
        • covb - Estimated covariance matrix for B
        • se - Vector of standard errors of the coefficient estimates B
        • coeffcorr - Correlation matrix for B
        • t - t statistics for B
        • p - p-values for B
        • resid - Vector of residuals
        • residp - Vector of Pearson residuals
        • residd - Vector of deviance residuals
        • resida - Vector of Anscombe residuals

     See also: glmval.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 41
Perform generalized linear model fitting.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
glmval


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2039
 -- statistics: YHAT = glmval (B, X, LINK)
 -- statistics: [YHAT, Y_LO, Y_HI] = glmval (B, X, LINK, STATS)
 -- statistics: [...] = glmval (..., NAME, VALUE)

     Predict values for a generalized linear model.

     ‘YHAT = glmval (B, X, LINK)’ returns the predicted values for the
     generalized linear model with a vector of coefficient estimates B,
     a matrix of predictors X, in which each column corresponds to a
     distinct predictor variable, and a link function LINK, which can be
     any of the character vectors, numeric scalar, or custom-defined
     link functions used as values for the "link" name-value pair
     argument in the ‘glmfit’ function.

     ‘[YHAT, Y_LO, Y_HI] = glmval (B, X, LINK, STATS)’ also returns the
     95% confidence intervals for the predicted values according to the
     model's statistics contained in the STATS structure, which is the
     output of the ‘glmfit’ function.  By default, the confidence
     intervals are nonsimultaneous, and apply to the fitted curve
     instead of new observations.

     ‘[...] = glmval (..., NAME, VALUE)’ specifies additional options
     using Name-Value pair arguments.

     NAME            VALUE
                     
     ---------------------------------------------------------------------------
     "confidence"    A scalar value between 0 and 1 specifying the confidence
                     level for the confidence bounds.
                     
     "Constant"      A character vector specifying whether to include a
                     constant term in the model.  Valid options are "ON"
                     (default) and "OFF".
                     
     "simultaneous"  Specifies whether to include a constant term in the
                     model.  Options are "ON" (default) or "OFF".
                     
     "size"          A numeric scalar or a vector with one value for each row
                     of X specifying the size parameter N for a binomial
                     model.

     See also: glmfit.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 46
Predict values for a generalized linear model.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 14
gmdistribution


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1210
 -- statistics: GMDIST = gmdistribution (MU, SIGMA)
 -- statistics: GMDIST = gmdistribution (MU, SIGMA, P)
 -- statistics: GMDIST = gmdistribution (MU, SIGMA, P, EXTRA)

     Create an object of the gmdistribution class which represents a
     Gaussian mixture model with k components of n-dimensional
     Gaussians.

     Input MU is a k-by-n matrix specifying the n-dimensional mean of
     each of the k components of the distribution.

     Input SIGMA is an array that specifies the variances of the
     distributions, in one of four forms depending on its dimension.
        • n-by-n-by-k: Slice SIGMA(:,:,i) is the variance of the i'th
          component
        • 1-by-n-by-k: Slice diag(SIGMA(1,:,i)) is the variance of the
          i'th component
        • n-by-n: SIGMA is the variance of every component
        • 1-by-n: The matrix diag(SIGMA) is the variance of every
          component

     If P is specified, it is a vector of length k specifying the
     proportion of each component.  If it is omitted or empty, each
     component has an equal proportion.

     Input EXTRA is used by fitgmdist to indicate the parameters of the
     fitting process.

     See also: fitgmdist.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Create an object of the gmdistribution class which represents a Gaussian
mixt...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
grp2idx


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 453
 -- statistics: [G, GN, GL] = grp2idx (S)

     Get index for group variables.

     For variable S, returns the indices G into the variable groups GN
     and GL.  The first has a string representation of the groups while
     the latter has its actual values.  The group indices are allocated
     in order of appearance in S.

     NaNs and empty strings in S appear as NaNs in G and are not present
     in either GN or GL.

     See also: grpstats.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 30
Get index for group variables.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
grpstats


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2407
 -- statistics: MEAN = grpstats (X)
 -- statistics: MEAN = grpstats (X, GROUP)
 -- statistics: [A, B, ...] = grpstats (X, GROUP, WHICHSTATS)
 -- statistics: [A, B, ...] = grpstats (X, GROUP, WHICHSTATS, ALPHA)
 -- statistics: [A, B, ...] = grpstats (X, GROUP, WHICHSTATS, "alpha",
          A)

     Summary statistics by group.  ‘grpstats’ computes groupwise summary
     statistics, for data in a matrix X.  ‘grpstats’ treats NaNs as
     missing values, and removes them.

     ‘MEANS = grpstats (X, GROUP)’, when X is a matrix of observations,
     returns the means of each column of X by GROUP.  GROUP is a
     grouping variable defined as a categorical variable, numeric,
     string array, or cell array of strings.  GROUP can be [] or omitted
     to compute the mean of the entire sample without grouping.

     ‘[A, B, ...] = grpstats (X, GROUP, WHICHSTATS)’, for a numeric
     matrix X, returns the statistics specified by WHICHSTATS, as
     separate arrays A, B, ....  WHICHSTATS can be a single function
     name, or a cell array containing multiple function names.  The
     number of output arguments must match the number of function names
     in WHICHSTATS.  Names in WHICHSTATS can be chosen from among the
     following:

          "mean"         mean
          "median"       median
          "sem"          standard error of the mean
          "std"          standard deviation
          "var"          variance
          "min"          minimum value
          "max"          maximum value
          "range"        maximum - minimum
          "numel"        count, or number of elements
          "meanci"       95% confidence interval for the mean
          "predci"       95% prediction interval for a new observation
          "gname"        group name

     ‘[...] = grpstats (X, GROUP, WHICHSTATS, ALPHA)’ specifies the
     confidence level as 100(1-ALPHA)% for the "meanci" and "predci"
     options.  Default value for ALPHA is 0.05.  The significance can
     also be specified using the Name-Value pair argument syntax.

     Examples:

          load carsmall;
          [m,p,g] = grpstats (Weight, Model_Year, {"mean", "predci", "gname"})
          n = length(m);
          errorbar((1:n)',m,p(:,2)-m)
          set (gca, "xtick", 1:n, "xticklabel", g);
          title ("95% prediction intervals for mean weight by year")

     See also: grp2idx.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Summary statistics by group.  ‘grpstats’ computes groupwise summary
stati...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
gscatter


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1827
 -- statistics: gscatter (X, Y, G)
 -- statistics: gscatter (X, Y, G, CLR, SYM, SIZ)
 -- statistics: gscatter (..., DOLEG, XNAM, YNAM)
 -- statistics: H = gscatter (...)

     Draw a scatter plot with grouped data.

     ‘gscatter’ is a utility function to draw a scatter plot of X and Y,
     according to the groups defined by G.  Input X and Y are numeric
     vectors of the same size, while G is either a vector of the same
     size as X or a character matrix with the same number of rows as the
     size of X.  As a vector G can be numeric, logical, a character
     array, a string array (not implemented), a cell string or cell
     array.

     A number of optional inputs change the appearance of the plot:
        • "CLR" defines the color for each group; if not enough colors
          are defined by "CLR", ‘gscatter’ cycles through the specified
          colors.  Colors can be defined as named colors, as rgb
          triplets or as indices for the current ‘colormap’.  The
          default value is a different color for each group, according
          to the current ‘colormap’.

        • "SYM" is a char array of symbols for each group; if not enough
          symbols are defined by "SYM", ‘gscatter’ cycles through the
          specified symbols.

        • "SIZ" is a numeric array of sizes for each group; if not
          enough sizes are defined by "SIZ", ‘gscatter’ cycles through
          the specified sizes.

        • "DOLEG" is a boolean value to show the legend; it can be
          either on (default) or off.

        • "XNAM" is a character array, the name for the x axis.

        • "YNAM" is a character array, the name for the y axis.

     Output H is an array of graphics handles to the ‘line’ object of
     each group.

See also: scatter.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 38
Draw a scatter plot with grouped data.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
harmmean


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1986
 -- statistics: M = harmmean (X)
 -- statistics: M = harmmean (X, "all")
 -- statistics: M = harmmean (X, DIM)
 -- statistics: M = harmmean (X, VECDIM)
 -- statistics: M = harmmean (..., NANFLAG)

     Compute the harmonic mean of X.

        • If X is a vector, then ‘harmmean(X)’ returns the harmonic mean
          of the elements in X defined as

               harmmean (X) = N / SUM_i X(i)^-1

          where N is the length of the X vector.

        • If X is a matrix, then ‘harmmean(X)’ returns a row vector with
          the harmonic mean of every column in X.

        • If X is a multidimensional array, then ‘harmmean(X)’ operates
          along the first nonsingleton dimension of X.

        • X must not contain any negative or complex values.

     ‘harmmean(X, "all")’ returns the harmonic mean of all the elements
     in X.  If X contains any 0, then the returned value is 0.

     ‘harmmean(X, DIM)’ returns the harmonic mean along the operating
     dimension DIM of X.  Calculating the harmonic mean of any subarray
     containing any 0 will return 0.

     ‘harmmean(X, VECDIM)’ returns the harmonic mean over the dimensions
     specified in the vector VECDIM.  For example, if X is a 2-by-3-by-4
     array, then ‘harmmean(X, [1 2])’ returns a 1-by-1-by-4 array.  Each
     element of the output array is the harmonic mean of the elements on
     the corresponding page of X.  If VECDIM indexes all dimensions of
     X, then it is equivalent to ‘harmmean (X, "all")’.  Any dimension
     in VECDIM greater than ‘ndims (X)’ is ignored.

     ‘harmmean(..., NANFLAG)’ specifies whether to exclude NaN values
     from the calculation, using any of the input argument combinations
     in previous syntaxes.  By default, harmmean includes NaN values in
     the calculation (NANFLAG has the value "includenan").  To exclude
     NaN values, set the value of NANFLAG to "omitnan".

     See also: geomean, mean.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 31
Compute the harmonic mean of X.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 5
hist3


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2141
 -- statistics: hist3 (X)
 -- statistics: hist3 (X, NBINS)
 -- statistics: hist3 (X, "Nbins", NBINS)
 -- statistics: hist3 (X, CENTERS)
 -- statistics: hist3 (X, "Ctrs", CENTERS)
 -- statistics: hist3 (X, "Edges", EDGES)
 -- statistics: [N, C] = hist3 (...)
 -- statistics: hist3 (..., PROP, VAL, ...)
 -- statistics: hist3 (HAX, ...)

     Produce bivariate (2D) histogram counts or plots.

     The elements to produce the histogram are taken from the Nx2 matrix
     X.  Any row with NaN values is ignored.  The actual bins can be
     configured in 3 different ways: number, centers, or edges of bins:

     Number of bins (default)
          Produces equally spaced bins between the minimum and maximum
          values of X.  Defined as a 2 element vector, NBINS, one for
          each dimension.  Defaults to ‘[10 10]’.

     Center of bins
          Defined as a cell array of 2 monotonically increasing vectors,
          CENTERS.  The width of each bin is determined from the
          adjacent values in the vector with the initial and final bins
          extending to infinity.

     Edge of bins
          Defined as a cell array of 2 monotonically increasing vectors,
          EDGES.  ‘N(i,j)’ contains the number of elements in X for
          which:

               EDGES{1}(i) <= X(:,1) < EDGES{1}(i+1)
               EDGES{2}(j) <= X(:,2) < EDGES{2}(j+1)

          The consequence of this definition is that values outside the
          initial and final edge values are ignored, and that the final
          bin only contains the number of elements exactly equal to the
          final edge.

     The return values, N and C, are the bin counts and centers
     respectively.  They are especially useful to produce intensity
     maps:

          [counts, centers] = hist3 (data);
          imagesc (centers{1}, centers{2}, counts)

     If there is no output argument, or if the axes graphics handle HAX
     is defined, the function will plot a 3 dimensional bar graph.  Any
     extra property/value pairs are passed directly to the underlying
     surface object.

     See also: hist, histc, lookup, mesh.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 49
Produce bivariate (2D) histogram counts or plots.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
histfit


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1452
 -- statistics: histfit (X)
 -- statistics: histfit (X, NBINS)
 -- statistics: histfit (X, NBINS, DISTNAME)
 -- statistics: histfit (AX, ...)
 -- statistics: H = histfit (...)

     Plot histogram with superimposed distribution fit.

     ‘histfit (X)’ plots a histogram of the values in the vector X using
     the number of bins equal to the square root of the number of
     non-missing elements in X and superimposes a fitted normal density
     function.

     ‘histfit (X, NBINS)’ plots a histogram of the values in the vector
     X using NBINS number of bins in the histogram and superimposes a
     fitted normal density function.

     ‘histfit (X, NBINS, DISTNAME)’ plots a histogram of the values in
     the vector X using NBINS number of bins in the histogram and
     superimposes a fitted density function from the distribution
     specified by DISTNAME.

     ‘histfit (AX, ...)’ uses the axes handle AX to plot the histogram
     and the fitted density function.  AX may precede any of the input
     argument combinations specified in the previous syntaxes.

     ‘H = histfit (...)’ returns a vector of handles H, where H(1) is
     the handle to the histogram and H(2) is the handle to the density
     curve.

     Note: calling ‘histfit’ without any input arguments will return a
     cell array of character vectors listing all supported
     distributions.

     See also: bar, hist, normplot, fitdist.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 50
Plot histogram with superimposed distribution fit.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 11
hmmestimate


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 4577
 -- statistics: [TRANSPROBEST, OUTPROBEST] = hmmestimate (SEQUENCE,
          STATES)
 -- statistics: [...] = hmmestimate (..., "statenames", STATENAMES)
 -- statistics: [...] = hmmestimate (..., "symbols", SYMBOLS)
 -- statistics: [...] = hmmestimate (..., "pseudotransitions",
          PSEUDOTRANSITIONS)
 -- statistics: [...] = hmmestimate (..., "pseudoemissions",
          PSEUDOEMISSIONS)

     Estimation of a hidden Markov model for a given sequence.

     Estimate the matrix of transition probabilities and the matrix of
     output probabilities of a given sequence of outputs and states
     generated by a hidden Markov model.  The model assumes that the
     generation starts in state ‘1’ at step ‘0’ but does not include
     step ‘0’ in the generated states and sequence.

     Arguments
     ---------

        • SEQUENCE is a vector of a sequence of given outputs.  The
          outputs must be integers ranging from ‘1’ to the number of
          outputs of the hidden Markov model.

        • STATES is a vector of the same length as SEQUENCE of given
          states.  The states must be integers ranging from ‘1’ to the
          number of states of the hidden Markov model.

     Return values
     -------------

        • TRANSPROBEST is the matrix of the estimated transition
          probabilities of the states.  ‘transprobest(i, j)’ is the
          estimated probability of a transition to state ‘j’ given state
          ‘i’.

        • OUTPROBEST is the matrix of the estimated output
          probabilities.  ‘outprobest(i, j)’ is the estimated
          probability of generating output ‘j’ given state ‘i’.

     If ‘'symbols'’ is specified, then SEQUENCE is expected to be a
     sequence of the elements of SYMBOLS instead of integers.  SYMBOLS
     can be a cell array.

     If ‘'statenames'’ is specified, then STATES is expected to be a
     sequence of the elements of STATENAMES instead of integers.
     STATENAMES can be a cell array.

     If ‘'pseudotransitions'’ is specified then the integer matrix
     PSEUDOTRANSITIONS is used as an initial number of counted
     transitions.  ‘pseudotransitions(i, j)’ is the initial number of
     counted transitions from state ‘i’ to state ‘j’.  TRANSPROBEST will
     have the same size as PSEUDOTRANSITIONS.  Use this if you have
     transitions that are very unlikely to occur.

     If ‘'pseudoemissions'’ is specified then the integer matrix
     PSEUDOEMISSIONS is used as an initial number of counted outputs.
     ‘pseudoemissions(i, j)’ is the initial number of counted outputs
     ‘j’ given state ‘i’.  If ‘'pseudotransitions'’ is also specified,
     then the number of rows of PSEUDOEMISSIONS must be equal to the
     number of rows of PSEUDOTRANSITIONS.  OUTPROBEST will have the same
     size as PSEUDOEMISSIONS.  Use this if you have outputs or states
     that are very unlikely to occur.

     Examples
     --------

          transprob = [0.8, 0.2; 0.4, 0.6];
          outprob = [0.2, 0.4, 0.4; 0.7, 0.2, 0.1];
          [sequence, states] = hmmgenerate (25, transprob, outprob);
          [transprobest, outprobest] = hmmestimate (sequence, states)

          symbols = {"A", "B", "C"};
          statenames = {"One", "Two"};
          [sequence, states] = hmmgenerate (25, transprob, outprob, ...
                                            "symbols", symbols, ...
                                            "statenames", statenames);
          [transprobest, outprobest] = hmmestimate (sequence, states, ...
                                            "symbols", symbols, ...
                                            "statenames", statenames)

          pseudotransitions = [8, 2; 4, 6];
          pseudoemissions = [2, 4, 4; 7, 2, 1];
          [sequence, states] = hmmgenerate (25, transprob, outprob);
          [transprobest, outprobest] = hmmestimate (sequence, states, ...
                                       "pseudotransitions", pseudotransitions, ...
                                       "pseudoemissions", pseudoemissions)

     References
     ----------

       1. Wendy L. Martinez and Angel R. Martinez.  ‘Computational
          Statistics Handbook with MATLAB’. Appendix E, pages 547-557,
          Chapman & Hall/CRC, 2001.

       2. Lawrence R. Rabiner.  A Tutorial on Hidden Markov Models and
          Selected Applications in Speech Recognition.  ‘Proceedings of
          the IEEE’, 77(2), pages 257-286, February 1989.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 57
Estimation of a hidden Markov model for a given sequence.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 11
hmmgenerate


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2702
 -- statistics: [SEQUENCE, STATES] = hmmgenerate (LEN, TRANSPROB,
          OUTPROB)
 -- statistics: [...] = hmmgenerate (..., "symbols", SYMBOLS)
 -- statistics: [...] = hmmgenerate (..., "statenames", STATENAMES)

     Output sequence and hidden states of a hidden Markov model.

     Generate an output sequence and hidden states of a hidden Markov
     model.  The model starts in state ‘1’ at step ‘0’ but will not
     include step ‘0’ in the generated states and sequence.

     Arguments
     ---------

        • LEN is the number of steps to generate.  SEQUENCE and STATES
          will have LEN entries each.

        • TRANSPROB is the matrix of transition probabilities of the
          states.  ‘transprob(i, j)’ is the probability of a transition
          to state ‘j’ given state ‘i’.

        • OUTPROB is the matrix of output probabilities.  ‘outprob(i,
          j)’ is the probability of generating output ‘j’ given state
          ‘i’.

     Return values
     -------------

        • SEQUENCE is a vector of length LEN of the generated outputs.
          The outputs are integers ranging from ‘1’ to ‘columns
          (outprob)’.

        • STATES is a vector of length LEN of the generated hidden
          states.  The states are integers ranging from ‘1’ to ‘columns
          (transprob)’.

     If ‘"symbols"’ is specified, then the elements of SYMBOLS are used
     for the output sequence instead of integers ranging from ‘1’ to
     ‘columns (outprob)’.  SYMBOLS can be a cell array.

     If ‘"statenames"’ is specified, then the elements of STATENAMES are
     used for the states instead of integers ranging from ‘1’ to
     ‘columns (transprob)’.  STATENAMES can be a cell array.

     Examples
     --------

          transprob = [0.8, 0.2; 0.4, 0.6];
          outprob = [0.2, 0.4, 0.4; 0.7, 0.2, 0.1];
          [sequence, states] = hmmgenerate (25, transprob, outprob)

          symbols = {"A", "B", "C"};
          statenames = {"One", "Two"};
          [sequence, states] = hmmgenerate (25, transprob, outprob, ...
                                            "symbols", symbols, ...
                                            "statenames", statenames)

     References
     ----------

       1. Wendy L. Martinez and Angel R. Martinez.  ‘Computational
          Statistics Handbook with MATLAB’. Appendix E, pages 547-557,
          Chapman & Hall/CRC, 2001.

       2. Lawrence R. Rabiner.  A Tutorial on Hidden Markov Models and
          Selected Applications in Speech Recognition.  ‘Proceedings of
          the IEEE’, 77(2), pages 257-286, February 1989.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 59
Output sequence and hidden states of a hidden Markov model.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 10
hmmviterbi


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2779
 -- statistics: VPATH = hmmviterbi (SEQUENCE, TRANSPROB, OUTPROB)
 -- statistics: VPATH = hmmviterbi (..., "symbols", SYMBOLS)
 -- statistics: VPATH = hmmviterbi (..., "statenames", STATENAMES)

     Viterbi path of a hidden Markov model.

     Use the Viterbi algorithm to find the Viterbi path of a hidden
     Markov model given a sequence of outputs.  The model assumes that
     the generation starts in state ‘1’ at step ‘0’ but does not include
     step ‘0’ in the generated states and sequence.

     Arguments
     ---------

        • SEQUENCE is the vector of length LEN of given outputs.  The
          outputs must be integers ranging from ‘1’ to ‘columns
          (outprob)’.

        • TRANSPROB is the matrix of transition probabilities of the
          states.  ‘transprob(i, j)’ is the probability of a transition
          to state ‘j’ given state ‘i’.

        • OUTPROB is the matrix of output probabilities.  ‘outprob(i,
          j)’ is the probability of generating output ‘j’ given state
          ‘i’.

     Return values
     -------------

        • VPATH is the vector of the same length as SEQUENCE of the
          estimated hidden states.  The states are integers ranging from
          ‘1’ to ‘columns (transprob)’.

     If ‘"symbols"’ is specified, then SEQUENCE is expected to be a
     sequence of the elements of SYMBOLS instead of integers ranging
     from ‘1’ to ‘columns (outprob)’.  SYMBOLS can be a cell array.

     If ‘"statenames"’ is specified, then the elements of STATENAMES are
     used for the states in VPATH instead of integers ranging from ‘1’
     to ‘columns (transprob)’.  STATENAMES can be a cell array.

     Examples
     --------

          transprob = [0.8, 0.2; 0.4, 0.6];
          outprob = [0.2, 0.4, 0.4; 0.7, 0.2, 0.1];
          [sequence, states] = hmmgenerate (25, transprob, outprob);
          vpath = hmmviterbi (sequence, transprob, outprob);

          symbols = {"A", "B", "C"};
          statenames = {"One", "Two"};
          [sequence, states] = hmmgenerate (25, transprob, outprob, ...
                               "symbols", symbols, "statenames", statenames);
          vpath = hmmviterbi (sequence, transprob, outprob, ...
                  "symbols", symbols, "statenames", statenames);

     References
     ----------

       1. Wendy L. Martinez and Angel R. Martinez.  ‘Computational
          Statistics Handbook with MATLAB’. Appendix E, pages 547-557,
          Chapman & Hall/CRC, 2001.

       2. Lawrence R. Rabiner.  A Tutorial on Hidden Markov Models and
          Selected Applications in Speech Recognition.  ‘Proceedings of
          the IEEE’, 77(2), pages 257-286, February 1989.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 38
Viterbi path of a hidden Markov model.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 16
hotelling_t2test


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1856
 -- statistics: [H, PVAL, STATS] = hotelling_t2test (X)
 -- statistics: [...] = hotelling_t2test (X, M)
 -- statistics: [...] = hotelling_t2test (X, Y)
 -- statistics: [...] = hotelling_t2test (X, M, NAME, VALUE)
 -- statistics: [...] = hotelling_t2test (X, Y, NAME, VALUE)

     Compute Hotelling's T^2 ("T-squared") test for a single sample or
     two dependent samples (paired-samples).

     For a sample X from a multivariate normal distribution with unknown
     mean and covariance matrix, test the null hypothesis that ‘mean (X)
     == M’.

     Given two dependent samples X and Y from multivariate normal
     distributions with unknown means and covariance matrices, test the
     null hypothesis that ‘mean (X - Y) == 0’.

     hotelling_t2test treats NaNs as missing values, and ignores the
     corresponding rows.

     Name-Value pair arguments can be used to set statistical
     significance.  "alpha" can be used to specify the significance
     level of the test (the default value is 0.05).

     If H is 1 the null hypothesis is rejected, meaning that the tested
     sample does not come from a multivariate distribution with mean M,
     or in case of two dependent samples that they do not come from the
     same multivariate distribution.  If H is 0, then the null
     hypothesis cannot be rejected and it can be assumed that it holds
     true.

     The p-value of the test is returned in PVAL.

     STATS is a structure containing the value of the Hotelling's T^2
     test statistic in the field "Tsq", and the degrees of freedom of
     the F distribution in the fields "df1" and "df2".  Under the null
     hypothesis, (n-p) T^2 / (p(n-1)) has an F distribution with p and
     n-p degrees of freedom, where n and p are the numbers of samples
     and variables, respectively.

     See also: hotelling_t2test2.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Compute Hotelling's T^2 ("T-squared") test for a single sample or two
depende...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 17
hotelling_t2test2


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1558
 -- statistics: [H, PVAL, STATS] = hotelling_t2test2 (X, Y)
 -- statistics: [...] = hotelling_t2test2 (X, Y, NAME, VALUE)

     Compute Hotelling's T^2 ("T-squared") test for two independent
     samples.

     For two samples X and Y from multivariate normal distributions with the
     same number of variables (columns), unknown means and unknown equal
     covariance matrices, test the null hypothesis ‘mean (X) == mean
     (Y)’.

     hotelling_t2test2 treats NaNs as missing values, and ignores the
     corresponding rows for each sample independently.

     Name-Value pair arguments can be used to set statistical
     significance.  "alpha" can be used to specify the significance
     level of the test (the default value is 0.05).

     If H is 1 the null hypothesis is rejected, meaning that the tested
     samples do not come from the same multivariate distribution.  If H
     is 0, then the null hypothesis cannot be rejected and it can be
     assumed that both samples come from the same multivariate
     distribution.

     The p-value of the test is returned in PVAL.

     STATS is a structure containing the value of the Hotelling's T^2
     test statistic in the field "Tsq", and the degrees of freedom of
     the F distribution in the fields "df1" and "df2".  Under the null
     hypothesis,

          (n_x+n_y-p-1) T^2 / (p(n_x+n_y-2))

     has an F distribution with p and n_x+n_y-p-1 degrees of freedom,
     where n_x and n_y are the sample sizes and p is the number of
     variables.

     See also: hotelling_t2test.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 71
Compute Hotelling's T^2 ("T-squared") test for two independent samples.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 12
inconsistent


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1054
 -- statistics: Y = inconsistent (Z)
 -- statistics: Y = inconsistent (Z, D)

     Compute the inconsistency coefficient for each link of a
     hierarchical cluster tree.

     Given a hierarchical cluster tree Z generated by the ‘linkage’
     function, ‘inconsistent’ computes the inconsistency coefficient for
     each link of the tree, using all the links down to the D-th level
     below that link.

     The default depth D is 2, which means that only two levels are
     considered: the level of the computed link and the level below
     that.

     Each row of Y corresponds to the same-numbered row of Z.  The
     columns of Y are respectively: the mean of the heights of the links
     used for the calculation, the standard deviation of the heights of
     those links, the number of links used, the inconsistency
     coefficient.

     *Reference* Jain, A., and R. Dubes.  Algorithms for Clustering
     Data.  Upper Saddle River, NJ: Prentice-Hall, 1988.

See also: cluster, clusterdata, dendrogram, linkage, pdist, squareform.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Compute the inconsistency coefficient for each link of a hierarchical
cluster...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 9
ismissing


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1283
 -- statistics: TF = ismissing (A)
 -- statistics: TF = ismissing (A, INDICATOR)

     Find missing data in a numeric or string array.

     Given an input numeric data array, char array, or array of cell
     strings A, ‘ismissing’ returns a logical array TF with the same
     dimensions as A, where ‘true’ values match missing values in the
     input data.

     The optional input INDICATOR is an array of values that represent
     missing values in the input data.  The values which represent
     missing data by default depend on the data type of A:

        • NaN: ‘single’, ‘double’.

        • ' ' (whitespace): ‘char’.

        • {''}: string cells.

     Note: logical and numeric data types may be used in any combination
     for A and INDICATOR.  A and the indicator values will be compared
     as type double, and the output will have the same class as A.  Data
     types other than those specified above have no defined 'missing'
     value.  As such, the TF output for those inputs will always be
     ‘false(size(A))’.  The exception to this is that INDICATOR can be
     specified for logical and numeric inputs to designate values that
     will register as 'missing'.

     See also: fillmissing, rmmissing, standardizeMissing.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 47
Find missing data in a numeric or string array.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 9
isoutlier


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7871
 -- statistics: TF = isoutlier (X)
 -- statistics: TF = isoutlier (X, METHOD)
 -- statistics: TF = isoutlier (X, "percentiles", THRESHOLD)
 -- statistics: TF = isoutlier (X, MOVMETHOD, WINDOW)
 -- statistics: TF = isoutlier (..., DIM)
 -- statistics: TF = isoutlier (..., NAME, VALUE)
 -- statistics: [TF, L, U, C] = isoutlier (...)

     Find outliers in data

     ‘isoutlier (X)’ returns a logical array whose elements are true
     when an outlier is detected in the corresponding element of X.
     ‘isoutlier’ treats NaNs as missing values and removes them.

        • If X is a matrix, then ‘isoutlier’ operates on each column of
          X separately.
        • If X is a multidimensional array, then ‘isoutlier’ operates
          along the first dimension of X whose size does not equal 1.

     By default, an outlier is a value that is more than three scaled
     median absolute deviations (MAD) from the median.  The scaled
     MAD is calculated as ‘c*median(abs(X-median(X)))’, where
     ‘c=-1/(sqrt(2)*erfcinv(3/2))’.

     ‘isoutlier (X, METHOD)’ specifies a method for detecting outliers.
     The following methods are available:

     Method      Description
     -----------------------------------------------------------------------
     "median"    Outliers are defined as elements more than three scaled
                 MAD from the median.
     "mean"      Outliers are defined as elements more than three
                 standard deviations from the mean.
     "quartiles" Outliers are defined as elements more than 1.5
                 interquartile ranges above the upper quartile (75
                 percent) or below the lower quartile (25 percent).  This
                 method is useful when the data in X is not normally
                 distributed.
     "grubbs"    Outliers are detected using Grubbs’ test for outliers,
                 which removes one outlier per iteration based on
                 hypothesis testing.  This method assumes that the data
                 in X is normally distributed.
     "gesd"      Outliers are detected using the generalized extreme
                 Studentized deviate test for outliers.  This iterative
                 method is similar to "grubbs", but can perform better
                 when there are multiple outliers masking each other.

     ‘isoutlier (X, "percentiles", THRESHOLD)’ detects outliers based on
     2 percentile thresholds, specified as a two-element row vector
     whose elements are in the interval [0, 100].  The first element
     indicates the lower percentile threshold, and the second element
     indicates the upper percentile threshold.  The first element of
     THRESHOLD must be less than the second element.

     ‘isoutlier (X, MOVMETHOD, WINDOW)’ specifies a moving method for
     detecting outliers.  The following methods are available:

     Method      Description
     -----------------------------------------------------------------------
     "movmedian" Outliers are defined as elements more than three local
                 scaled MAD from the local median over a window length
                 specified by WINDOW.
     "movmean"   Outliers are defined as elements more than three local
                 standard deviations from the local mean computed over a
                 window length specified by WINDOW.

     WINDOW must be a positive integer scalar or a two-element vector of
     positive integers.  When WINDOW is a scalar, if it is an odd
     number, the window is centered about the current element and
     contains WINDOW - 1 neighboring elements.  If even, then the window
     is centered about the current and previous elements.  When WINDOW
     is a two-element vector of positive integers [nb, na], the window
     contains the current element, nb elements before the current
     element, and na elements after the current element.  When
     "SamplePoints" are also specified, WINDOW can take any real
     positive values (either as a scalar or a two-element vector) and in
     this case, the windows are computed relative to the sample points.

     DIM specifies the operating dimension and it must be a positive
     integer scalar.  If not specified, then, by default, ‘isoutlier’
     operates along the first non-singleton dimension of X.

     The following optional parameters can be specified as NAME/VALUE
     paired arguments.

        • "SamplePoints" can be specified as a vector of sample points
          with equal length as the operating dimension.  The sample
          points represent the x-axis location of the data and must be
          sorted and contain unique elements.  Sample points do not need
          to be uniformly sampled.  By default, the vector is [1, 2, 3,
          ..., N], where N = size (X, DIM).  You can use unequally
          spaced "SamplePoints" to define a variable-length window for
          one of the moving methods available.

        • "ThresholdFactor" can be specified as a nonnegative scalar.
          For methods "median" and "movmedian", the detection threshold
          factor replaces the number of scaled MAD, which is 3 by
          default.  For methods "mean" and "movmean", the detection
          threshold factor replaces the number of standard deviations,
          which is 3 by default.  For methods "grubbs" and "gesd", the
          detection threshold factor ranges from 0 to 1, specifying the
          critical alpha-value of the respective test, and it is 0.05 by
          default.  For the "quartiles" method, the detection threshold
          factor replaces the number of interquartile ranges, which is
          1.5 by default.  "ThresholdFactor" is unsupported for the
          "percentiles" method.

        • "MaxNumOutliers" is only relevant to the "gesd" method and it
          must be a positive integer scalar specifying the maximum
          number of outliers returned by the "gesd" method.  By default,
          it is the integer nearest to 10% of the number of elements
          along the operating dimension in X.  The "gesd" method assumes
          the nonoutlier input data is sampled from an approximately
          normal distribution.  When the data is not sampled in this
          way, the number of returned outliers might exceed the
          "MaxNumOutliers" value.

     ‘[TF, L, U, C] = isoutlier (...)’ returns up to 4 output arguments
     as described below.

        • TF is an outlier indicator with the same size as X.

        • L is the lower threshold used by the outlier detection method.
          If METHOD is used for outlier detection, then L has the same
          size as X in all dimensions except for the operating dimension
          where the length is 1.  If MOVMETHOD is used, then L has the
          same size as X.

        • U is the upper threshold used by the outlier detection method.
          If METHOD is used for outlier detection, then U has the same
          size as X in all dimensions except for the operating dimension
          where the length is 1.  If MOVMETHOD is used, then U has the
          same size as X.

        • C is the center value used by the outlier detection method.
          If METHOD is used for outlier detection, then C has the same
          size as X in all dimensions except for the operating dimension
          where the length is 1.  If MOVMETHOD is used, then C has the
          same size as X.  For "median", "movmedian", "mean", and
          "movmean" methods, C is computed by taking into account the
          outlier values.  For "grubbs" and "gesd" methods, C is
          computed by excluding the outliers.  For the "percentiles"
          method, C is the average of the U and L thresholds.

     See also: filloutliers, rmoutliers, ismissing.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 21
Find outliers in data



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 9
jackknife


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2029
 -- statistics: JACKSTAT = jackknife (E, X)
 -- statistics: JACKSTAT = jackknife (E, X, ...)

     Compute jackknife estimates of a parameter taking one or more given
     samples as parameters.

     In particular, E is the estimator to be jackknifed as a function
     name, handle, or inline function, and X is the sample for which the
     estimate is to be taken.  The I-th entry of JACKSTAT will contain
     the value of the estimator on the sample X with its I-th row
     omitted.

          jackstat (I) = E (X([1 : I - 1, I + 1 : end], :))

     Depending on the number of samples to be used, the estimator must
     have the appropriate form:
        • If only one sample is used, then the estimator need not be
          concerned with cell arrays, for example jackknifing the
          standard deviation of a sample can be performed with ‘JACKSTAT
          = jackknife (@std, rand (100, 1))’.
        • If, however, more than one sample is to be used, the samples
          must all be of equal size, and the estimator must address them
          as elements of a cell-array, in which they are aggregated in
          their order of appearance:

          JACKSTAT = jackknife (@(x) std(x{1})/var(x{2}),
          rand (100, 1), randn (100, 1))

     If all goes well, a theoretical value P for the parameter is
     already known, N is the sample size,

     ‘T = N * E(X) - (N - 1) * mean(JACKSTAT)’

     and

     ‘V = sumsq(N * E(X) - (N - 1) * JACKSTAT - T) / (N * (N - 1))’

     then

     ‘(T-P)/sqrt(V)’ should follow a t-distribution with N-1 degrees of
     freedom.

     Jackknifing is a well known method to reduce bias.  Further details
     can be found in:

     References
     ----------

       1. Rupert G. Miller.  The jackknife - a review.  Biometrika
          (1974), 61(1):1-15.  doi:10.1093/biomet/61.1.1
       2. Rupert G. Miller.  Jackknifing Variances.  Ann.  Math.
          Statist.  (1968), Volume 39, Number 2, 567-582.
          doi:10.1214/aoms/1177698418


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Compute jackknife estimates of a parameter taking one or more given
samples a...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
kmeans


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7089
 -- statistics: IDX = kmeans (DATA, K)
 -- statistics: [IDX, CENTERS] = kmeans (DATA, K)
 -- statistics: [IDX, CENTERS, SUMD] = kmeans (DATA, K)
 -- statistics: [IDX, CENTERS, SUMD, DIST] = kmeans (DATA, K)
 -- statistics: [...] = kmeans (DATA, K, PARAM1, VALUE1, ...)
 -- statistics: [...] = kmeans (DATA, [], "start", START, ...)

     Perform a K-means clustering of the NxD matrix DATA.

     If parameter "start" is specified, then K may be empty in which
     case K is set to the number of rows of START.

     The outputs are:

     IDX              An Nx1 vector whose i-th element is the class to which
                      row i of DATA is assigned.
                      
     CENTERS          A KxD array whose i-th row is the centroid of cluster i.
                      
     SUMD             A kx1 vector whose i-th entry is the sum of the
                      distances from samples in cluster i to centroid i.
                      
     DIST             An Nxk matrix whose ij-th element is the distance from
                      sample i to centroid j.

     The following parameters may be placed in any order.  Each
     parameter must be followed by its value, as in Name-Value pairs.

     Name          Description
     ---------------------------------------------------------------------------
     "Start"       The initialization method for the centroids.

         Value            Description
     ----------------------------------------------------------------------------
         "plus"           The k-means++ algorithm.  (Default)
         "sample"         A subset of k rows from DATA, sampled uniformly
                          without replacement.
         "cluster"        Perform a pilot clustering on 10% of the rows of
                          DATA.
         "uniform"        Each component of each centroid is drawn uniformly
                          from the interval between the maximum and minimum
                          values of that component within DATA.  This performs
                          poorly and is implemented only for Matlab
                          compatibility.
         NUMERIC          A kxD matrix of centroid starting locations.  The
         MATRIX           rows correspond to seeds.
         NUMERIC          A kxDxr array of centroid starting locations.  The
         ARRAY            third dimension invokes replication of the
                          clustering routine.  Page r contains the set of
                          seeds for replicate r.  kmeans infers the number of
                          replicates (specified by the "Replicates" Name-Value
                          pair argument) from the size of the third dimension.

     Name          Description
     ---------------------------------------------------------------------------
     "Distance"    The distance measure used for partitioning and calculating
                   centroids.

         Value            Description
     ----------------------------------------------------------------------------
         "sqeuclidean"    The squared Euclidean distance.  i.e.  the sum of
                          the squares of the differences between corresponding
                          components.  In this case, the centroid is the
                          arithmetic mean of all samples in its cluster.  This
                          is the only distance for which this algorithm is
                          truly "k-means".
         "cityblock"      The sum metric, or L1 distance, i.e.  the sum of the
                          absolute differences between corresponding
                          components.  In this case, the centroid is the
                          median of all samples in its cluster.  This gives
                          the k-medians algorithm.
         "cosine"         One minus the cosine of the included angle between
                          points (treated as vectors).  Each centroid is the
                          mean of the points in that cluster, after
                          normalizing those points to unit Euclidean length.
         "correlation"    One minus the sample correlation between points
                          (treated as sequences of values).  Each centroid is
                          the component-wise mean of the points in that
                          cluster, after centering and normalizing those
                          points to zero mean and unit standard deviation.
         "hamming"        The number of components in which the sample and the
                          centroid differ.  In this case, the centroid is the
                          median of all samples in its cluster.  Unlike
                          Matlab, Octave allows non-logical DATA.

     Name          Description
     ---------------------------------------------------------------------------
     "EmptyAction" What to do when a centroid is not the closest to any data
                   sample.

         Value            Description
     ----------------------------------------------------------------------------
         "error"          Throw an error.
         "singleton"      (Default) Select the row of DATA that has the
                          highest error and use that as the new centroid.
         "drop"           Remove the centroid and continue computation with
                          one fewer centroid.  The dimensions of the outputs
                          CENTERS and DIST are unchanged, with values for
                          omitted centroids replaced by NaN.

     Name          Description
     ---------------------------------------------------------------------------
     "Display"     Display a text summary.

         Value            Description
     ----------------------------------------------------------------------------
         "off"            (Default) Display no summary.
         "final"          Display a summary for each clustering operation.
         "iter"           Display a summary for each iteration of a clustering
                          operation.

     Name          Value
     ---------------------------------------------------------------------------
     "Replicates"  A positive integer specifying the number of independent
                   clusterings to perform.  The output values are the values
                   for the best clustering, i.e., the one with the smallest
                   value of SUMD.  If START is numeric, then REPLICATES
                   defaults to (and must equal) the size of the third
                   dimension of START.  Otherwise it defaults to 1.
     "MaxIter"     The maximum number of iterations to perform for each
                   replicate.  If the maximum change of any centroid is less
                   than 0.001, the replicate terminates even if MAXITER
                   iterations have not yet occurred.  The default is 100.

     Example:

     [~,c] = kmeans (rand(10, 3), 2, "emptyaction", "singleton");

     See also: linkage.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 52
Perform a K-means clustering of the NxD matrix DATA.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 9
knnsearch


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7093
 -- statistics: IDX = knnsearch (X, Y)
 -- statistics: [IDX, D] = knnsearch (X, Y)
 -- statistics: [...] = knnsearch (..., NAME, VALUE)

     Find k-nearest neighbors from input data.

     ‘IDX = knnsearch (X, Y)’ finds K nearest neighbors in X for Y.  It
     returns IDX, which contains indices of K nearest neighbors of each
     row of Y.  If not specified, K = 1.  X must be an NxP numeric matrix
     of input data, where rows correspond to observations and columns
     correspond to features or variables.  Y is an MxP numeric matrix
     of query points, which must have the same number of columns as X.

     ‘[IDX, D] = knnsearch (X, Y)’ also returns the distances, D,
     which correspond to the K nearest neighbors in X for each Y.

     Additional parameters can be specified using Name-Value pair
     arguments.

     NAME            VALUE
                     
     ---------------------------------------------------------------------------
     "K"             is the number of nearest neighbors to be found in the
                     kNN search.  It must be a positive integer value and by
                     default it is 1.
                     
     "P"             is the Minkowski distance exponent and it must be a
                     positive scalar.  This argument is only valid when the
                     selected distance metric is "minkowski".  By default it
                     is 2.
                     
     "Scale"         is the scale parameter for the standardized Euclidean
                     distance and it must be a nonnegative numeric vector of
                     equal length to the number of columns in X.  This
                     argument is only valid when the selected distance metric
                     is "seuclidean", in which case each coordinate of X is
                     scaled by the corresponding element of "scale", as is
                     each query point in Y.  By default, the scale parameter
                     is the standard deviation of each coordinate in X.
                     
     "Cov"           is the covariance matrix for computing the Mahalanobis
                     distance and it must be a positive definite matrix
                     that matches the number of columns in X.  This argument
                     is only valid when the selected distance metric is
                     "mahalanobis".
                     
     "BucketSize"    is the maximum number of data points in the leaf node of
                     the Kd-tree and it must be a positive integer.  This
                     argument is only valid when the selected search method
                     is "kdtree".
                     
     "SortIndices"   is a boolean flag to sort the returned indices in
                     ascending order by distance and it is true by default.
                     When the selected search method is "exhaustive" or the
                     "IncludeTies" flag is true, ‘knnsearch’ always sorts the
                     returned indices.
                     
     "Distance"      is the distance metric used by ‘knnsearch’ as specified
                     below:

          "euclidean"    Euclidean distance.
          "seuclidean"   standardized Euclidean distance.  Each coordinate
                         difference between the rows in X and the query
                         matrix Y is scaled by dividing by the corresponding
                         element of the standard deviation computed from X.
                         To specify a different scaling, use the "Scale"
                         name-value argument.
          "cityblock"    City block distance.
          "chebychev"    Chebychev distance (maximum coordinate difference).
          "minkowski"    Minkowski distance.  The default exponent is 2.  To
                         specify a different exponent, use the "P" name-value
                         argument.
          "mahalanobis"  Mahalanobis distance, computed using a positive
                         definite covariance matrix.  To change the value of
                         the covariance matrix, use the "Cov" name-value
                         argument.
          "cosine"       Cosine distance.
          "correlation"  One minus the sample linear correlation between
                         observations (treated as sequences of values).
          "spearman"     One minus the sample Spearman's rank correlation
                         between observations (treated as sequences of
                         values).
          "hamming"      Hamming distance, which is the percentage of
                         coordinates that differ.
          "jaccard"      One minus the Jaccard coefficient, which is the
                         percentage of nonzero coordinates that differ.
          @DISTFUN       Custom distance function handle.  A distance
                         function of the form ‘function D2 = distfun (XI,
                         YI)’, where XI is a 1xP vector containing a single
                         observation in P-dimensional space, YI is an NxP
                         matrix containing an arbitrary number of
                         observations in the same P-dimensional space, and D2
                         is an Nx1 vector of distances, where (D2k) is the
                         distance between observations XI and (YIk,:).

     "NSMethod"      is the nearest neighbor search method used by
                     ‘knnsearch’ as specified below.

          "kdtree"       Creates and uses a Kd-tree to find nearest
                         neighbors.  "kdtree" is the default value when the
                         number of columns in X is less than or equal to 10,
                         X is not sparse, and the distance metric is
                         "euclidean", "cityblock", "manhattan", "chebychev",
                         or "minkowski".  Otherwise, the default value is
                         "exhaustive".  This argument is only valid when the
                         distance metric is one of the four aforementioned
                         metrics.
          "exhaustive"   Uses the exhaustive search algorithm by computing
                         the distance values from all the points in X to each
                         point in Y.

     "IncludeTies"   is a boolean flag indicating if the returned values
                     should contain the indices having the same distance as
                     the K^th neighbor.  When false, ‘knnsearch’ chooses the
                     observation with the smallest index among the
                     observations that have the same distance from a query
                     point.  When true, ‘knnsearch’ includes all nearest
                     neighbors whose distances are equal to the K^th smallest
                     distance in the output arguments.  To specify K, use the
                     "K" name-value pair argument.

     See also: rangesearch, pdist2, fitcknn.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 41
Find k-nearest neighbors from input data.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 13
kruskalwallis


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2572
 -- statistics: P = kruskalwallis (X)
 -- statistics: P = kruskalwallis (X, GROUP)
 -- statistics: P = kruskalwallis (X, GROUP, DISPLAYOPT)
 -- statistics: [P, TBL] = kruskalwallis (X, ...)
 -- statistics: [P, TBL, STATS] = kruskalwallis (X, ...)

     Perform a Kruskal-Wallis test, the non-parametric alternative of a
     one-way analysis of variance (ANOVA), for comparing the means of
     two or more groups of data under the null hypothesis that the
     groups are drawn from the same population, i.e.  the group means
     are equal.

     kruskalwallis can take up to three input arguments:

        • X contains the data and it can either be a vector or matrix.
          If X is a matrix, then each column is treated as a separate
          group.  If X is a vector, then the GROUP argument is
          mandatory.
        • GROUP contains the names for each group.  If X is a matrix,
          then GROUP can either be a cell array of strings or a
          character array, with one row per column of X.  If you want to
          omit this argument, enter an empty array ([]).  If X is a
          vector, then GROUP must be a vector of the same length, or a
          string array or cell array of strings with one row for each
          element of X.  X values corresponding to the same value of
          GROUP are placed in the same group.
        • DISPLAYOPT is an optional parameter for displaying the groups
          contained in the data in a boxplot.  If omitted, it is 'on' by
          default.  If group names are defined in GROUP, these are used
          to identify the groups in the boxplot.  Use 'off' to omit
          displaying this figure.

     kruskalwallis can return up to three output arguments:

        • P is the p-value of the null hypothesis that all group means
          are equal.
        • TBL is a cell array containing the results in a standard ANOVA
          table.
        • STATS is a structure containing statistics useful for
          performing a multiple comparison of means with the MULTCOMPARE
          function.

     If kruskalwallis is called without any output arguments, then it
     prints the results in a one-way ANOVA table to the standard output.
     It is also printed when DISPLAYOPT is 'on'.

     Examples:

          x = meshgrid (1:6);
          x = x + normrnd (0, 1, 6, 6);
          [p, atab] = kruskalwallis(x);

          x = ones (50, 4) .* [-2, 0, 1, 5];
          x = x + normrnd (0, 2, 50, 4);
          group = {"A", "B", "C", "D"};
          kruskalwallis (x, group);


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Perform a Kruskal-Wallis test, the non-parametric alternative of a
one-way an...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
kstest


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 4387
 -- statistics: H = kstest (X)
 -- statistics: H = kstest (X, NAME, VALUE)
 -- statistics: [H, P] = kstest (...)
 -- statistics: [H, P, KSSTAT, CV] = kstest (...)

     Single sample Kolmogorov-Smirnov (K-S) goodness-of-fit hypothesis
     test.

     ‘H = kstest (X)’ performs a Kolmogorov-Smirnov (K-S) test to
     determine if a random sample X could have come from a standard
     normal distribution.  H indicates the results of the null
     hypothesis test.

        • H = 0 => Do not reject the null hypothesis at the 5%
          significance
        • H = 1 => Reject the null hypothesis at the 5% significance

     X is a vector representing a random sample from some unknown
     distribution with a cumulative distribution function F(X). Missing
     values declared as NaNs in X are ignored.

     ‘H = kstest (X, NAME, VALUE)’ returns a test decision for a
     single-sample K-S test with additional options specified by one or
     more NAME-VALUE pair arguments as shown below.

     Name             Value
     ----------------------------------------------------------------------------
     "alpha"          A numeric scalar between 0 and 1 which defines the
                      significance level.  Default is 0.05 for 5%
                      significance.
                      
     "CDF"            The hypothesized CDF under the null hypothesis.  It can
                      be specified as a function handle of an existing cdf
                      function, a character vector defining a probability
                      distribution with default parameters, a probability
                      distribution object, or a two-column matrix.  If not
                      provided, the default is the standard normal, N(0,1).
                      The one-sample Kolmogorov-Smirnov test is only valid for
                      continuous cumulative distribution functions, and
                      requires the CDF to be predetermined.  The result is not
                      accurate if CDF is estimated from the data.
                      
     "tail"           A string indicating the type of test:
                    "unequal"        "F(X) not equal to CDF(X)"
                                     (two-sided) (Default)
                                     
                    "larger"         "F(X) > CDF(X)" (one-sided)
                                     
                    "smaller"        "F(X) < CDF(X)" (one-sided)

     Let S(X) be the empirical c.d.f.  estimated from the sample vector
     X, F(X) be the corresponding true (but unknown) population c.d.f.,
     and CDF be the known input c.d.f.  specified under the null
     hypothesis.  For ‘tail’ = "unequal", "larger", and "smaller", the
     test statistics are max|S(X) - CDF(X)|, max[S(X) - CDF(X)], and
     max[CDF(X) - S(X)], respectively.

     ‘[H, P] = kstest (...)’ also returns the asymptotic p-value P.

     ‘[H, P, KSSTAT] = kstest (...)’ returns the K-S test statistic
     KSSTAT defined above for the test type selected by the "tail"
     option.

     In the matrix version of CDF, column 1 contains the x-axis data and
     column 2 the corresponding y-axis c.d.f data.  Since the K-S test
     statistic will occur at one of the observations in X, the
     calculation is most efficient when CDF is only specified at the
     observations in X.  When column 1 of CDF represents x-axis points
     independent of X, CDF is linearly interpolated at the observations
     found in the vector X.  In this case, the interval along the x-axis
     (the column 1 spread of CDF) must span the observations in X for
     successful interpolation.

     The decision to reject the null hypothesis is based on comparing
     the p-value P with the "alpha" value, not by comparing the
     statistic KSSTAT with the critical value CV.  CV is computed
     separately using an approximate formula or by interpolation using
     Miller's approximation table.  The formula and table cover the
     range 0.01 <= "alpha" <= 0.2 for two-sided tests and 0.005 <=
     "alpha" <= 0.1 for one-sided tests.  CV is returned as NaN if
     "alpha" is outside this range.  Since CV is approximate, a
     comparison of KSSTAT with CV may occasionally lead to a different
     conclusion than a comparison of P with "alpha".

     See also: kstest2, cdfplot.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 71
Single sample Kolmogorov-Smirnov (K-S) goodness-of-fit hypothesis test.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
kstest2


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2277
 -- statistics: H = kstest2 (X1, X2)
 -- statistics: H = kstest2 (X1, X2, NAME, VALUE)
 -- statistics: [H, P] = kstest2 (...)
 -- statistics: [H, P, KS2STAT] = kstest2 (...)

     Two-sample Kolmogorov-Smirnov goodness-of-fit hypothesis test.

     ‘H = kstest2 (X1, X2)’ returns a test decision for the null
     hypothesis that the data in vectors X1 and X2 are from the same
     continuous distribution, using the two-sample Kolmogorov-Smirnov
     test.  The alternative hypothesis is that X1 and X2 are from
     different continuous distributions.  The result H is 1 if the test
     rejects the null hypothesis at the 5% significance level, and 0
     otherwise.

     ‘H = kstest2 (X1, X2, NAME, VALUE)’ returns a test decision for a
     two-sample Kolmogorov-Smirnov test with additional options
     specified by one or more name-value pair arguments as shown below.

     "alpha"        A value ALPHA between 0 and 1 specifying the
                    significance level.  Default is 0.05 for 5%
                    significance.
                    
     "tail"         A string indicating the type of test:

        "unequal"      "F(X1) not equal to F(X2)" (two-sided) [Default]
                       
        "larger"       "F(X1) > F(X2)" (one-sided)
                       
        "smaller"      "F(X1) < F(X2)" (one-sided)

     The two-sided test uses the maximum absolute difference between the
     cdfs of the distributions of the two data vectors.  The test
     statistic is ‘D* = max(|F1(x) - F2(x)|)’, where F1(x) is the
     proportion of X1 values less than or equal to x and F2(x) is the
     corresponding proportion for the X2 values.  The one-sided
     test uses the actual value of the difference between the cdfs of
     the distributions of the two data vectors rather than the absolute
     value.  The test statistic is ‘D* = max(F1(x) - F2(x))’ or ‘D* =
     max(F2(x) - F1(x))’ for ‘tail’ = "larger" or "smaller",
     respectively.

     ‘[H, P] = kstest2 (...)’ also returns the asymptotic p-value P.

     ‘[H, P, KS2STAT] = kstest2 (...)’ also returns the
     Kolmogorov-Smirnov test statistic KS2STAT defined above for the
     test type indicated by ‘tail’.

     See also: kstest, cdfplot.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 62
Two-sample Kolmogorov-Smirnov goodness-of-fit hypothesis test.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 11
levene_test


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2690
 -- statistics: H = levene_test (X)
 -- statistics: H = levene_test (X, GROUP)
 -- statistics: H = levene_test (X, ALPHA)
 -- statistics: H = levene_test (X, TESTTYPE)
 -- statistics: H = levene_test (X, GROUP, ALPHA)
 -- statistics: H = levene_test (X, GROUP, TESTTYPE)
 -- statistics: H = levene_test (X, GROUP, ALPHA, TESTTYPE)
 -- statistics: [H, PVAL] = levene_test (...)
 -- statistics: [H, PVAL, W] = levene_test (...)
 -- statistics: [H, PVAL, W, DF] = levene_test (...)

     Perform a Levene's test for the homogeneity of variances.

     Under the null hypothesis of equal variances, the test statistic W
     approximately follows an F distribution with DF degrees of freedom
     being a vector ([k-1, N-k]).

     The p-value (1 minus the CDF of this distribution at W) is returned
     in PVAL.  H = 1 if the null hypothesis is rejected at the
     significance level of ALPHA.  Otherwise H = 0.

     Input Arguments:

        • X contains the data and it can either be a vector or matrix.
          If X is a matrix, then each column is treated as a separate
          group.  If X is a vector, then the GROUP argument is
          mandatory.  NaN values are omitted.

        • GROUP contains the names for each group.  If X is a vector,
          then GROUP must be a vector of the same length, or a string
          array or cell array of strings with one row for each element
          of X.  X values corresponding to the same value of GROUP are
          placed in the same group.  If X is a matrix, then GROUP can
          either be a cell array of strings or a character array, with
          one row per column of X in the same way it is used in ‘anova1’
          function.  If X is a matrix, then GROUP can be omitted either
          by entering an empty array ([]) or by parsing only ALPHA as a
          second argument (if required to change its default value).

        • ALPHA is the statistical significance value at which the null
          hypothesis is rejected.  Its default value is 0.05 and it can
          be parsed either as a second argument (when GROUP is omitted)
          or as a third argument.

        • TESTTYPE is a string determining the type of Levene's test.
          By default, it is set to "absolute", but the user can also
          parse "quadratic" in order to perform Levene's Quadratic test
          for equal variances or "median" in order to carry out the
          Brown-Forsythe's test.  These options determine how the Z_ij
          values are computed.  If an invalid name is parsed for
          TESTTYPE, then the Levene's Absolute test is performed.

     See also: bartlett_test, vartest2, vartestn.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 57
Perform a Levene's test for the homogeneity of variances.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
linkage


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 3044
 -- statistics: Y = linkage (D)
 -- statistics: Y = linkage (D, METHOD)
 -- statistics: Y = linkage (X)
 -- statistics: Y = linkage (X, METHOD)
 -- statistics: Y = linkage (X, METHOD, METRIC)
 -- statistics: Y = linkage (X, METHOD, ARGLIST)

     Produce a hierarchical clustering dendrogram.

     D is the dissimilarity matrix relative to n observations, formatted
     as a (n-1)*n/2x1 vector as produced by ‘pdist’.  Alternatively, X
     contains data formatted for input to ‘pdist’, METRIC is a metric
     for ‘pdist’ and ARGLIST is a cell array containing arguments that
     are passed to ‘pdist’.

     ‘linkage’ starts by putting each observation into a singleton
     cluster and numbering those from 1 to n.  Then it merges two
     clusters, chosen according to METHOD, to create a new cluster
     numbered n+1, and so on until all observations are grouped into a
     single cluster numbered 2n-1.  Row k of the (n-1)x3 output matrix Y
     relates to cluster n+k: the first two columns are the numbers of
     the two component clusters and column 3 contains their distance.

     METHOD defines the way the distance between two clusters is
     computed and how they are recomputed when two clusters are merged:

     ‘"single" (default)’
          Distance between two clusters is the minimum distance between
          two elements belonging each to one cluster.  Produces a
          cluster tree known as minimum spanning tree.

     ‘"complete"’
          Furthest distance between two elements belonging each to one
          cluster.

     ‘"average"’
          Unweighted pair group method with averaging (UPGMA). The mean
          distance across all pairs of elements each belonging to one
          cluster.

     ‘"weighted"’
          Weighted pair group method with averaging (WPGMA). When two
          clusters A and B are joined together, the new distance to a
          cluster C is the mean between distances A-C and B-C.

     ‘"centroid"’
          Unweighted Pair-Group Method using Centroids (UPGMC). Assumes
          Euclidean metric.  The distance between cluster centroids,
          each centroid being the center of mass of a cluster.

     ‘"median"’
          Weighted pair-group method using centroids (WPGMC). Assumes
          Euclidean metric.  Distance between cluster centroids.  When
          two clusters are joined together, the new centroid is the
          midpoint between the joined centroids.

     ‘"ward"’
          Ward's sum of squared deviations about the group mean (ESS).
          Also known as minimum variance or inner squared distance.
          Assumes Euclidean metric.  How much the moment of inertia of
          the merged cluster exceeds the sum of those of the individual
          clusters.

     *Reference* Ward, J. H. Hierarchical Grouping to Optimize an
     Objective Function.  J. Am. Statist. Assoc. 1963, 58, 236-244,
     <http://iv.slis.indiana.edu/sw/data/ward.pdf>.

     See also: pdist, squareform.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 45
Produce a hierarchical clustering dendrogram.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 9
loadmodel


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 435
 -- ClassificationSVM: OBJ = loadmodel (FILENAME)

     Load a Classification or Regression model from a file.

     ‘OBJ = loadmodel (FILENAME)’ loads a Classification or Regression
     object, OBJ, from a file defined in FILENAME.

     See also: savemodel, ClassificationDiscriminant, ClassificationGAM,
     ClassificationKNN, ClassificationNeuralNetwork,
     ClassificationPartitionedModel, ClassificationSVM, RegressionGAM.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 54
Load a Classification or Regression model from a file.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 19
logistic_regression


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2217
 -- statistics: [INTERCEPT, SLOPE, DEV, DL, D2L, P, STATS] =
          logistic_regression (Y, X, PRINT, INTERCEPT, SLOPE)

     Perform ordinal logistic regression.

     Suppose Y takes values in k ordered categories, and let ‘P_i (X)’
     be the cumulative probability that Y falls in one of the first i
     categories given the covariate X.  Then

          [INTERCEPT, SLOPE] = logistic_regression (Y, X)

     fits the model

          logit (P_i (X)) = X * SLOPE + INTERCEPT_i,   i = 1 ... k-1

     The number of ordinal categories, k, is taken to be the number of
     distinct values of ‘round (Y)’.  If k equals 2, Y is binary and the
     model is ordinary logistic regression.  The matrix X is assumed to
     have full column rank.

     Given Y only, ‘INTERCEPT = logistic_regression (Y)’ fits the model
     with baseline logit odds only.

     The full form is

          [INTERCEPT, SLOPE, DEV, DL, D2L, P, STATS]
             = logistic_regression (Y, X, PRINT, INTERCEPT, SLOPE)

     in which all output arguments and all input arguments except Y are
     optional.

     Setting PRINT to 1 requests summary information about the fitted
     model to be displayed.  Setting PRINT to 2 requests information
     about convergence at each iteration.  Other values request no
     information to be displayed.  The input arguments INTERCEPT and
     SLOPE give initial estimates for INTERCEPT and SLOPE.

     The returned value DEV holds minus twice the log-likelihood.

     The returned values DL and D2L are the vector of first and the
     matrix of second derivatives of the log-likelihood with respect to
     INTERCEPT and SLOPE.

     P holds estimates for the conditional distribution of Y given X.

     STATS is a structure that contains the following fields:
        • "intercept": intercept coefficients
        • "slope": slope coefficients
        • "coeff": all regression coefficients (intercepts and slopes)
        • "covb": estimated covariance matrix for coefficients (coeff)
        • "coeffcorr": correlation matrix for coeff
        • "se": standard errors of the coeff
        • "z": z statistics for coeff
        • "pval": p-values for coeff


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 36
Perform ordinal logistic regression.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 5
logit


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 175
 -- statistics: X = logit (P)

     Compute the logit for each value of P

     The logit is defined as

          logit (P) = log (P / (1-P))

     See also: probit, logicdf.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 37
Compute the logit for each value of P



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 5
mahal


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 493
 -- statistics: D = mahal (Y, X)

     Mahalanobis' D-square distance.

     Return the Mahalanobis' D-square distance of the points in Y from
     the distribution implied by points X.

     Specifically, it uses a Cholesky decomposition to set

           answer(i) = (Y(i,:) - mean (X)) * inv (A) * (Y(i,:)-mean (X))'

     where A is the covariance of X.

     The data X and Y must have the same number of components (columns),
     but may have a different number of observations (rows).
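
     The formula above can be checked by hand on a small data set.  This
     is a sketch under the assumption that A is the sample covariance as
     returned by ‘cov’; the data and the variable names are made up for
     illustration.

```octave
pkg load statistics

X = [1 2; 3 4; 5 7; 2 1; 6 5];   # reference sample: 5 observations, 2 columns
Y = [3 3; 0 0];                  # points whose distance is measured
d = mahal (Y, X);

# Same quantity from the formula, evaluated for every row of Y at once
A = cov (X);                     # covariance of X
dc = Y - mean (X);               # deviations from the column means of X
d_manual = sum ((dc / A) .* dc, 2);
disp ([d(:), d_manual]);
```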


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 31
Mahalanobis' D-square distance.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
manova1


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 3123
 -- statistics: D = manova1 (X, GROUP)
 -- statistics: D = manova1 (X, GROUP, ALPHA)
 -- statistics: [D, P] = manova1 (...)
 -- statistics: [D, P, STATS] = manova1 (...)

     One-way multivariate analysis of variance (MANOVA).

     ‘D = manova1 (X, GROUP, ALPHA)’ performs a one-way MANOVA for
     comparing the mean vectors of two or more groups of multivariate
     data.

     X is a matrix with each row representing a multivariate
     observation, and each column representing a variable.

     GROUP is a numeric vector, string array, or cell array of strings
     with the same number of rows as X.  X values are in the same group
     if they correspond to the same value of GROUP.

     ALPHA is the scalar significance level and is 0.05 by default.

     D is an estimate of the dimension of the group means.  It is the
     smallest dimension such that a test of the hypothesis that the
     means lie on a space of that dimension is not rejected.  If D = 0
     for example, we cannot reject the hypothesis that the means are the
     same.  If D = 1, we reject the hypothesis that the means are the
     same but we cannot reject the hypothesis that they lie on a line.

     ‘[D, P] = manova1 (...)’ returns P, a vector of p-values for
     testing the null hypothesis that the mean vectors of the groups lie
     on various dimensions.  P(1) is the p-value for a test of dimension
     0, P(2) for dimension 1, etc.

     ‘[D, P, STATS] = manova1 (...)’ returns a STATS structure with the
     following fields:

          "W"            within-group sum of squares and products matrix
          "B"            between-group sum of squares and products matrix
          "T"            total sum of squares and products matrix
          "dfW"          degrees of freedom for WSSP matrix
          "dfB"          degrees of freedom for BSSP matrix
          "dfT"          degrees of freedom for TSSP matrix
          "lambda"       value of Wilks' lambda (the test statistic)
          "chisq"        transformation of lambda to a chi-square
                         distribution
          "chisqdf"      degrees of freedom for chisq
          "eigenval"     eigenvalues of (WSSP^-1) * BSSP
          "eigenvec"     eigenvectors of (WSSP^-1) * BSSP; these are the
                         coefficients for canonical variables, and they are
                         scaled so the within-group variance of C is 1
          "canon"        canonical variables, equal to XC*eigenvec, where XC
                         is X with columns centered by subtracting their
                         means
          "mdist"        Mahalanobis distance from each point to its group
                         mean
          "gmdist"       Mahalanobis distances between each pair of group
                         means
          "gnames"       Group names

     The canonical variables C have the property that C(:,1) is the
     linear combination of the X columns that has the maximum separation
     between groups, C(:,2) has the maximum separation subject to it
     being orthogonal to C(:,1), and so on.
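
     A possible call sequence is sketched below.  It assumes that the
     ‘fisheriris’ example data set (variables ‘meas’ and ‘species’) is
     available on the load path, as it usually is when the statistics
     package is installed.

```octave
pkg load statistics

load fisheriris                  # assumed to provide meas (150x4) and species
[d, p, stats] = manova1 (meas, species);
printf ("estimated dimension d = %d\n", d);
disp (p);                        # p(1): dimension 0, p(2): dimension 1, ...
disp (stats.lambda);             # Wilks' lambda
```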


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 51
One-way multivariate analysis of variance (MANOVA).



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 13
manovacluster


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 974
 -- statistics: manovacluster (STATS)
 -- statistics: manovacluster (STATS, METHOD)
 -- statistics: H = manovacluster (STATS)
 -- statistics: H = manovacluster (STATS, METHOD)

     Cluster group means using manova1 output.

     ‘manovacluster (STATS)’ draws a dendrogram showing the clustering
     of group means, calculated using the output STATS structure from
     ‘manova1’ and applying the single linkage algorithm.  See the
     ‘dendrogram’ function for more information about the figure.

     ‘manovacluster (STATS, METHOD)’ uses the METHOD algorithm in place
     of single linkage.  The available methods are:

          "single"       -- nearest distance
          "complete"     -- furthest distance
          "average"      -- average distance
          "centroid"     -- center of mass distance
          "ward"         -- inner squared distance

     ‘H = manovacluster (...)’ returns a vector of line handles.

     See also: manova1.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 41
Cluster group means using manova1 output.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 12
mcnemar_test


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1808
 -- statistics: [H, PVAL, CHISQ] = mcnemar_test (X)
 -- statistics: [H, PVAL, CHISQ] = mcnemar_test (X, ALPHA)
 -- statistics: [H, PVAL, CHISQ] = mcnemar_test (X, TESTTYPE)
 -- statistics: [H, PVAL, CHISQ] = mcnemar_test (X, ALPHA, TESTTYPE)

     Perform a McNemar's test on paired nominal data.

     McNemar's test is applied to a 2x2 contingency table X of data from
     matched pairs of subjects with a dichotomous trait,
     cross-classified on the row and column variables, to test the null
     hypothesis of symmetry of the classification probabilities.
     More formally, the null hypothesis of marginal homogeneity states
     that the two marginal probabilities for each outcome are the same.

     Under the null, with a sufficiently large number of discordants
     (X(1,2) + X(2,1) >= 25), the test statistic, CHISQ, follows a
     chi-squared distribution with 1 degree of freedom.  When the number
     of discordants is less than 25, then the mid-P exact McNemar test
     is used.

     TESTTYPE will force ‘mcnemar_test’ to apply a particular method for
     testing the null hypothesis independently of the number of
     discordants.  Valid options for TESTTYPE:
        • "asymptotic" Original McNemar test statistic
        • "corrected" Edwards' version with continuity correction
        • "exact" An exact binomial test
        • "mid-p" The mid-P McNemar test (mid-p binomial test)

     The test decision is returned in H, which is 1 when the null
     hypothesis is rejected (PVAL < ALPHA) or 0 otherwise.  ALPHA
     defines the critical value of statistical significance for the
     test.
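
     The sketch below forces the "asymptotic" method and compares CHISQ
     with the classical statistic (b - c)^2 / (b + c) computed from the
     discordant cells.  That this is the exact formula used for the
     "asymptotic" option is an assumption here, and the table is
     invented for illustration.

```octave
pkg load statistics

# Rows: first classification; columns: second classification
x = [101, 121; 59, 33];
[h, pval, chisq] = mcnemar_test (x, 0.05, "asymptotic");

# Uncorrected statistic from the discordant cells only
b = x(1,2);
c = x(2,1);
chisq_manual = (b - c) ^ 2 / (b + c);
printf ("h = %d, chisq = %.4f (manual %.4f), p = %.3g\n", ...
        h, chisq, chisq_manual, pval);
```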

     Further information about McNemar's test can be found at
     <https://en.wikipedia.org/wiki/McNemar%27s_test>

     See also: crosstab, chi2test, fishertest.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 48
Perform a McNemar's test on paired nominal data.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
mhsample


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 3415
 -- statistics: [SMPL, ACCEPT] = mhsample (START, NSAMPLES, PROPERTY,
          VALUE, ...)

     Draws NSAMPLES samples from a target stationary distribution PDF
     using the Metropolis-Hastings algorithm.

     Inputs:

        • START is a NCHAIN by DIM matrix of starting points for each
          Markov chain.  Each row is the starting point of a different
          chain and each column corresponds to a different dimension.

        • NSAMPLES is the number of samples, the length of each Markov
          chain.

     Some property-value pairs can or must be specified.  They are:

     (Required) One of:

        • "pdf" PDF: a function handle of the target stationary
          distribution to be sampled.  The function should accept
          different locations in each row and each column corresponds to
          a different dimension.

          or

        • "logpdf" LOGPDF: a function handle of the log of the target
          stationary distribution to be sampled.  The function should
          accept different locations in each row and each column
          corresponds to a different dimension.

     In case optional argument SYMMETRIC is set to false (the default),
     one of:

        • "proppdf" PROPPDF: a function handle of the proposal
          distribution that is sampled from with PROPRND to give the
          next point in the chain.  The function should accept two
          inputs, the random variable and the current location.  Each
          input should accept different locations in each row, with
          each column corresponding to a different dimension.

          or

        • "logproppdf" LOGPROPPDF: the log of "proppdf".

     The following property-value pairs may be needed depending on
     the desired output:

        • "proprnd" PROPRND: (Required) a function handle which
          generates random numbers from PROPPDF.  The function should
          accept different locations in each row and each column
          corresponds to a different dimension corresponding with the
          current location.

        • "symmetric" SYMMETRIC: true or false based on whether PROPPDF
          is a symmetric distribution.  If true, PROPPDF (or LOGPROPPDF)
          need not be specified.  The default is false.

        • "burnin" BURNIN: the number of points to discard at the
          beginning.  The default is 0.

        • "thin" THIN: omits THIN-1 of every THIN points in the
          generated Markov chain.  The default is 1.

        • "nchain" NCHAIN: the number of Markov chains to generate.  The
          default is 1.

     Outputs:

        • SMPL: a NSAMPLES x DIM x NCHAIN tensor of random values drawn
          from PDF, where the rows are different random values, the
          columns correspond to the dimensions of PDF, and the third
          dimension corresponds to different Markov chains.

        • ACCEPT is a vector of the acceptance rate for each chain.

     Example : Sampling from a normal distribution

          start = 1;
          nsamples = 1e3;
          pdf = @(x) exp (-.5 * x .^ 2) / (pi ^ .5 * 2 ^ .5);
          proppdf = @(x,y) 1 / 6;
          proprnd = @(x) 6 * (rand (size (x)) - .5) + x;
          [smpl, accept] = mhsample (start, nsamples, "pdf", pdf, "proppdf", ...
          proppdf, "proprnd", proprnd, "thin", 4);
          histfit (smpl);
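
     A variant of the example above, sketched for the case where the
     proposal is symmetric: "logpdf" is given instead of "pdf", and
     "proppdf" is omitted because "symmetric" is true.  The step width
     and burn-in length are arbitrary choices for illustration.

```octave
pkg load statistics

nsamples = 1000;
logpdf  = @(x) -0.5 * x .^ 2;                    # log of an unnormalised normal
proprnd = @(x) x + 6 * (rand (size (x)) - 0.5);  # uniform step centred on x
[smpl, accept] = mhsample (0, nsamples, "logpdf", logpdf, ...
                           "proprnd", proprnd, "symmetric", true, ...
                           "burnin", 100);
printf ("acceptance rate: %.2f, sample mean: %.2f\n", accept, mean (smpl));
```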

     See also: rand, slicesample.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Draws NSAMPLES samples from a target stationary distribution PDF using
Metrop...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
mnrfit


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2472
 -- statistics: B = mnrfit (X, Y)
 -- statistics: B = mnrfit (X, Y, NAME, VALUE)
 -- statistics: [B, DEV] = mnrfit (...)
 -- statistics: [B, DEV, STATS] = mnrfit (...)

     Perform logistic regression for binomial responses or multiple
     ordinal responses.

     Note: This function is currently a wrapper for the
     ‘logistic_regression’ function.  It can only be used for fitting an
     ordinal logistic model and a nominal model with 2 categories (which
     is an ordinal case).  Hierarchical models as well as nominal models
     with more than two classes are not currently supported.  This
     function is a work in progress.

     ‘B = mnrfit (X, Y)’ returns a matrix, B, of coefficient estimates
     for a multinomial logistic regression of the nominal responses in Y
     on the predictors in X.  X is an NxP numeric matrix of the
     observations on predictor variables, where N corresponds to the
     number of observations and P corresponds to the number of
     predictor variables.  Y contains the response category labels and
     it can be either an NxP categorical or numerical matrix
     (containing only 1s and 0s), an Nx1 numeric vector with positive
     integer values, a cell array of character vectors, or a logical
     vector.  Y can also be defined as a character matrix with each row
     corresponding to an observation of X.

     ‘B = mnrfit (X, Y, NAME, VALUE)’ returns a matrix, B, of
     coefficient estimates for a multinomial model fit with additional
     parameters specified by Name-Value pair arguments.

     NAME            VALUE
                     
     ---------------------------------------------------------------------------
     "model"         Specifies the type of model to fit.  Currently, only
                     "ordinal" is fully supported.  "nominal" is only
                     supported for 2 classes in Y.
                     
     "display"       A flag to enable/disable displaying information about
                     the fitted model.  Default is "off".

     ‘[B, DEV, STATS] = mnrfit (...)’ also returns the deviance of the
     fit, DEV, and the structure STATS for any of the previous input
     arguments.  STATS currently only returns values for the fields
     "beta", same as B, "coeffcorr", the estimated correlation matrix
     for B, "covd", the estimated covariance matrix for B, and "se", the
     standard errors of the coefficient estimates B.

     See also: logistic_regression.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Perform logistic regression for binomial responses or multiple ordinal
respon...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 15
monotone_smooth


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1810
 -- statistics: YY = monotone_smooth (X, Y, H)

     Produce a smooth monotone increasing approximation to a sampled
     functional dependence.

     A kernel method is used (an Epanechnikov smoothing kernel is
     applied to y(x); this is integrated to yield the monotone
     increasing form.  See Reference 1 for details.)

     Arguments
     ---------

        • X is a vector of values of the independent variable.

        • Y is a vector of values of the dependent variable, of the same
          size as X.  For best performance, it is recommended that Y
          already be fairly smooth, e.g.  by applying a kernel smoothing
          to the original values if they are noisy.

        • H is the kernel bandwidth to use.  If H is not given, a
          "reasonable" value is computed.

     Return values
     -------------

        • YY is the vector of smooth monotone increasing function values
          at X.

     Examples
     --------

          x = 0:0.1:10;
          # typically non-monotonic from the added noise
          y = (x .^ 2) + 3 * randn(size(x));
          # crudely smoothed via moving average, but still typically
          # non-monotonic
          ys = ([y(1) y(1:(end-1))] + y + [y(2:end) y(end)])/3;
          yy = monotone_smooth(x, ys); # yy is monotone increasing in x
          plot(x, y, '+', x, ys, x, yy)

     References
     ----------

       1. Holger Dette, Natalie Neumeyer and Kay F. Pilz (2006), A
          simple nonparametric estimator of a strictly monotone
          regression function, ‘Bernoulli’, 12:469-490
       2. Regine Scheder (2007), R Package 'monoProc', Version 1.0-6,
          <http://cran.r-project.org/web/packages/monoProc/monoProc.pdf>
          (The implementation here is based on the monoProc function
          mono.1d)


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Produce a smooth monotone increasing approximation to a sampled
functional de...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 11
multcompare


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7770
 -- statistics: C = multcompare (STATS)
 -- statistics: C = multcompare (STATS, "name", VALUE)
 -- statistics: [C, M] = multcompare (...)
 -- statistics: [C, M, H] = multcompare (...)
 -- statistics: [C, M, H, GNAMES] = multcompare (...)
 -- statistics: PADJ = multcompare (P)
 -- statistics: PADJ = multcompare (P, "ctype", CTYPE)

     Perform posthoc multiple comparison tests or p-value adjustments to
     control the family-wise error rate (FWER) or false discovery rate
     (FDR).

     ‘C = multcompare (STATS)’ performs a multiple comparison using a
     STATS structure that is obtained as output from any of the
     following functions: anova1, anova2, anovan, kruskalwallis, and
     friedman.  The return value C is a matrix with one row per
     comparison and eight columns.  Columns 1-2 are the indices of the
     two samples being compared.  Columns 3-5 are a lower bound,
     estimate, and upper bound for their difference, where the bounds
     are for 95% confidence intervals.  Columns 6-8 are the
     multiplicity adjusted p-values for each individual comparison, the
     test statistic and the degrees of freedom.  All tests by
     multcompare are two-tailed.

     multcompare can take a number of optional parameters as name-value
     pairs.

     ‘[...] = multcompare (STATS, "alpha", ALPHA)’

        • ALPHA sets the significance level of null hypothesis
          significance tests to ALPHA, and the central coverage of
          two-sided confidence intervals to 100*(1-ALPHA)%.  (Default
          ALPHA is 0.05).

     ‘[...] = multcompare (STATS, "ControlGroup", REF)’

        • REF is the index of the control group to limit comparisons to.
          The index must be a positive integer scalar value.  For each
          dimension (d) listed in DIM, multcompare uses
          STATS.grpnames{d}(idx) as the control group.  (Default is
          empty, i.e.  [], for full pairwise comparisons)

     ‘[...] = multcompare (STATS, "ctype", CTYPE)’

        • CTYPE is the type of comparison test to use.  In order of
          increasing power, the choices are: "bonferroni", "scheffe",
          "mvt", "holm" (default), "hochberg", "fdr", or "lsd".  The
          first five methods control the family-wise error rate.  The
          "fdr" method controls false discovery rate (by the original
          Benjamini-Hochberg step-up procedure).  The final method,
          "lsd" (or "none"), makes no attempt to control the Type 1
          error rate of multiple comparisons.  The coverage of
          confidence intervals is only corrected for multiple
          comparisons in the cases where CTYPE is "bonferroni",
          "scheffe" or "mvt", which control the Type 1 error rate for
          simultaneous inference.

          The "mvt" method uses the multivariate t distribution to
          assess the probability or critical value of the maximum
          statistic across the tests, thereby accounting for
          correlations among comparisons in the control of the
          family-wise error rate with simultaneous inference.  In the
          case of pairwise comparisons, it simulates Tukey's (or the
          Games-Howell) test; in the case of comparisons with a single
          control group, it simulates Dunnett's test.  CTYPE values
          "tukey-kramer" and "hsd" are recognised but set the value of
          CTYPE and REF to "mvt" and empty respectively.  A CTYPE value
          "dunnett" is recognised but sets the value of CTYPE to "mvt",
          and if REF is empty, sets REF to 1.  Since the algorithm uses
          a Monte Carlo method (of 1e+06 random samples), you can expect
          the results to fluctuate slightly with each call to
          multcompare and the calculations may be slow to complete for a
          large number of comparisons.  If the parallel package is
          installed and loaded, multcompare will automatically
          accelerate computations by parallel processing.  Note that
          p-values calculated by the "mvt" method are truncated at 1e-06.

     ‘[...] = multcompare (STATS, "df", DF)’

        • DF is an optional scalar value to set the number of degrees of
          freedom in the calculation of p-values for the multiple
          comparison tests.  By default, this value is extracted from
          the STATS structure of the ANOVA test, but setting DF may be
          necessary to approximate Satterthwaite correction if anovan
          was performed using weights.

     ‘[...] = multcompare (STATS, "dim", DIM)’

        • DIM is a vector specifying the dimension or dimensions over
          which the estimated marginal means are to be calculated.  Used
          only if STATS comes from anovan.  The value [1 3], for
          example, computes the estimated marginal mean for each
          combination of the first and third predictor values.  The
          default is to compute over the first dimension (i.e.  1).  If
          the specified dimension is, or includes, a continuous factor
          then multcompare will return an error.

     ‘[...] = multcompare (STATS, "estimate", ESTIMATE)’

        • ESTIMATE is a string specifying the estimates to be compared
          when computing multiple comparisons after anova2; this
          argument is ignored by anovan and anova1.  Accepted values for
          ESTIMATE are either "column" (default) to compare column
          means, or "row" to compare row means.  If the model type in
          anova2 was "linear" or "nested" then only "column" is accepted
          for ESTIMATE since the row factor is assumed to be a random
          effect.

     ‘[...] = multcompare (STATS, "display", DISPLAY)’

        • DISPLAY is either "on" (the default): to display a table and
          graph of the comparisons (e.g.  difference between means),
          their 100*(1-ALPHA)% intervals and multiplicity adjusted
          p-values in APA style; or "off": to omit the table and graph.
          On the graph, markers and error bars colored red have
          multiplicity adjusted p-values < ALPHA, otherwise the markers
          and error bars are blue.

     ‘[...] = multcompare (STATS, "seed", SEED)’

        • SEED is a scalar value used to initialize the random number
          generator so that CTYPE "mvt" produces reproducible results.

     ‘[C, M, H, GNAMES] = multcompare (...)’ returns additional outputs.
     M is a matrix where columns 1-2 are the estimated marginal means
     and their standard errors, and columns 3-4 are lower and upper
     bounds of the confidence intervals for the means; the critical
     value of the test statistic is scaled by a factor of 2^(-0.5)
     before multiplying by the standard errors of the group means so
     that the intervals overlap when the difference in means becomes
     significant at approximately the level ALPHA.  When ALPHA is 0.05,
     this corresponds to confidence intervals with 83.4% central
     coverage.  H is a handle to the figure containing the graph.
     GNAMES is a cell array with one row for each group, containing the
     names of the groups.

     ‘PADJ = multcompare (P)’ calculates and returns adjusted p-values
     (PADJ) using the Holm step-down Bonferroni procedure to control the
     family-wise error rate.

     ‘PADJ = multcompare (P, "ctype", CTYPE)’ calculates and returns
     adjusted p-values (PADJ) computed using the method CTYPE.  In order
     of increasing power, CTYPE for p-value adjustment can be either
     "bonferroni", "holm" (default), "hochberg", or "fdr".  See above
     for further information about the CTYPE methods.
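
     The p-value adjustment form can be sketched as follows.  The
     Bonferroni result is compared with the textbook rule (multiply by
     the number of tests and cap at 1), which is assumed here to be what
     the "bonferroni" option computes.  The p-values are made up.

```octave
pkg load statistics

p = [0.01, 0.04, 0.03];
padj_bonf = multcompare (p, "ctype", "bonferroni");
padj_holm = multcompare (p, "ctype", "holm");

# Bonferroni by hand: multiply by the number of tests and cap at 1
bonf_manual = min (p * numel (p), 1);
disp ([padj_bonf(:), padj_holm(:), bonf_manual(:)]);
```

     Holm's step-down adjustment is never larger than the Bonferroni
     adjustment, which is why it is listed as the more powerful of the
     two.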

     See also: anova1, anova2, anovan, kruskalwallis, friedman, fitlm.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Perform posthoc multiple comparison tests or p-value adjustments to
control t...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
multiway


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2124
 -- statistics: GROUPINDEX = multiway (NUMBERS, NUM_PARTS)
 -- statistics: GROUPINDEX = multiway (NUMBERS, NUM_PARTS, METHOD)
 -- statistics: [GROUPINDEX, PARTITION] = multiway (...)
 -- statistics: [GROUPINDEX, PARTITION, GROUPSIZES] = multiway (...)

     Solve the multiway number partitioning problem.

     ‘GROUPINDEX = multiway (NUMBERS, NUM_PARTS)’ splits a set of
     numbers in NUMBERS into a number of subsets specified in NUM_PARTS
     such that the sums of the subsets are nearly as equal as possible
     and returns a vector of group indices in GROUPINDEX with each index
     corresponding to the set of numbers provided as input.

        • NUMBERS is a vector of positive real numbers to be
          partitioned.
        • NUM_PARTS is a positive integer scalar specifying the number
          of partitions (subsets) to split the numbers into.

     ‘GROUPINDEX = multiway (NUMBERS, NUM_PARTS, METHOD)’ also specifies
     the algorithm used for partitioning the set of numbers.  By
     default, ‘multiway’ uses the complete Karmarkar-Karp algorithm
     when the set of numbers contains up to 10 elements and the
     requested number of subsets does not exceed 5; otherwise it
     defaults to the greedy algorithm, which is optimized for speed, but
     may not return the optimal partitioning.  The following methods are
     supported:

        • 'greedy' (Greedy algorithm)
        • 'completeKK' (Complete Karmarkar-Karp algorithm)

     The ‘multiway’ function may return up to three output arguments
     described below:

        • GROUPINDEX: A vector of the same length as NUMBERS containing
          the group index (from 1 to NUM_PARTS) for each number.
        • PARTITION: A cell array of length NUM_PARTS with each cell
          containing the numbers assigned to that partition.
        • GROUPSIZES: A vector of the sums of the numbers in each
          partition.

     Example:
          numbers = [4, 5, 6, 7, 8];
          num_parts = 2;
          [groupindex, partition, groupsizes] = multiway (numbers, num_parts);
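
     To inspect the result, the subset sums can be recomputed from the
     returned indices.  This is only a sketch; the loop and the printed
     layout are illustrative and not part of the function.

```octave
pkg load statistics

numbers = [4, 5, 6, 7, 8];
[groupindex, partition, groupsizes] = multiway (numbers, 2);

# Recompute each subset and its sum from the returned group indices
for k = 1:2
  members = numbers(groupindex == k);
  printf ("subset %d: %s (sum %g)\n", k, mat2str (members), sum (members));
endfor
```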

     See also: cvpartition.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 47
Solve the multiway number partitioning problem.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
nanmax


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1946
 -- statistics: V = nanmax (X)
 -- statistics: V = nanmax (X, [], DIM)
 -- statistics: [V, IDX] = nanmax (...)
 -- statistics: V = nanmax (X, [], 'all')
 -- statistics: V = nanmax (X, [], VECDIM)
 -- statistics: V = nanmax (X, Y)

     Find the maximum while ignoring NaN values.

     ‘V = nanmax (X)’ returns the maximum of X, after removing NaN
     values.  If X is a vector, a scalar maximum value is returned.  If
     X is a matrix, a row vector of column maxima is returned.  If X is
     a multidimensional array, ‘nanmax’ operates along the first
     nonsingleton dimension.  If all values in a column are NaN, the
     maximum is returned as NaN rather than [].

     ‘V = nanmax (X, [], DIM)’ operates along the dimension DIM of X.

     ‘[V, IDX] = nanmax (...)’ also returns the row indices of the
     maximum values for each column in the vector IDX.  When X is a
     vector, IDX is a scalar, as is V.

     ‘V = nanmax (X, [], 'all')’ returns the maximum of all elements of
     X, after removing NaN values.  It is the equivalent of ‘nanmax
     (X(:))’.  The optional flag 'all' cannot be used together with DIM
     or VECDIM input arguments.

     ‘V = nanmax (X, [], VECDIM)’ returns the maximum over the
     dimensions specified in the vector VECDIM.  Each element of VECDIM
     represents a dimension of the input array X and the output V has
     length 1 in the specified operating dimensions.  The lengths of the
     other dimensions are the same for X and V.  For example, if X is a
     2-by-3-by-4 array, then ‘nanmax (X, [], [1 2])’ returns a
     1-by-1-by-4 array.  Each element of the output array is the maximum
     of the elements on the corresponding page of X.  If VECDIM indexes
     all dimensions of X, then it is equivalent to
     ‘nanmax (X, [], 'all')’.  Any
     dimension in VECDIM greater than ‘ndims (X)’ is ignored.
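
     A short sketch of the calling forms described above, using a small
     matrix chosen so that one column contains only NaN values:

```octave
pkg load statistics

X = [1 NaN 3; NaN NaN 6];
v_col = nanmax (X);              # column maxima; the all-NaN column gives NaN
v_row = nanmax (X, [], 2);       # row maxima
v_all = nanmax (X, [], 'all');   # maximum over every element
disp (v_col); disp (v_row); disp (v_all);
```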

     See also: max, nanmin, nansum.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 43
Find the maximum while ignoring NaN values.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
nanmean


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1942
 -- statistics: S = nanmean (X)
 -- statistics: S = nanmean (X, 'all')
 -- statistics: S = nanmean (X, DIM)
 -- statistics: S = nanmean (X, VECDIM)

     Compute the mean while ignoring NaN values.

     ‘S = nanmean (X)’ returns the mean of X after removing NaN values.
     If X is a vector, a scalar value is returned.  If X is a matrix, a
     row vector of column means is returned.  If X is a multidimensional
     array, ‘nanmean’ operates along the first nonsingleton dimension.
     If all values along a dimension are NaN, the mean is returned as
     NaN.

     ‘S = nanmean (X, 'all')’ returns the mean of all elements of X,
     after removing NaN values.  It is the equivalent of ‘nanmean
     (X(:))’.

     ‘S = nanmean (X, DIM)’ operates along the dimension DIM of X.

     ‘S = nanmean (X, VECDIM)’ returns the mean over the dimensions
     specified in the vector VECDIM.  Each element of VECDIM represents
     a dimension of the input array X and the output S has length 1 in
     the specified operating dimensions.  The lengths of the other
     dimensions are the same for X and S.  For example, if X is a
     2-by-3-by-4 array, then ‘nanmean (X, [1 2])’ returns a 1-by-1-by-4
     array.  Each element of the output array is the mean of the
     elements on the corresponding page of X.  If VECDIM indexes all
     dimensions of X, then it is equivalent to ‘nanmean (X, 'all')’.
     Any dimension in VECDIM greater than ‘ndims (X)’ is ignored.
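
     The calling forms above can be sketched on a small matrix with
     missing values.  Each mean divides by the number of non-NaN
     elements only.

```octave
pkg load statistics

X = [1 NaN 3; 4 5 NaN];
m_col = nanmean (X);         # column means: [2.5 5 3]
m_row = nanmean (X, 2);      # row means: [2; 4.5]
m_all = nanmean (X, 'all');  # mean of the four non-NaN values: 3.25
disp (m_col); disp (m_row); disp (m_all);
```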

     ‘nanmean’ primarily operates on single and double numeric types,
     since they support NaN values, while preserving the data type.
     Nevertheless, it can also operate on integer types by treating them
     as double types.  To avoid overflow on very large int64 and uint64
     values, use the ‘mean’ function, which applies special handling for
     such cases.

     See also: mean, nansum, nanmin, nanmax.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 43
Compute the mean while ignoring NaN values.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
nanmin


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1946
 -- statistics: V = nanmin (X)
 -- statistics: V = nanmin (X, [], DIM)
 -- statistics: [V, IDX] = nanmin (...)
 -- statistics: V = nanmin (X, [], 'all')
 -- statistics: V = nanmin (X, [], VECDIM)
 -- statistics: V = nanmin (X, Y)

     Find the minimum while ignoring NaN values.

     ‘V = nanmin (X)’ returns the minimum of X, after removing NaN
     values.  If X is a vector, a scalar minimum value is returned.  If
     X is a matrix, a row vector of column minima is returned.  If X is
     a multidimensional array, ‘nanmin’ operates along the first
     nonsingleton dimension.  If all values in a column are NaN, the
     minimum is returned as NaN rather than [].

     ‘V = nanmin (X, [], DIM)’ operates along the dimension DIM of X.

     ‘[V, IDX] = nanmin (...)’ also returns the row indices of the
     minimum values for each column in the vector IDX.  When X is a
     vector, IDX is a scalar, as is V.

     ‘V = nanmin (X, [], 'all')’ returns the minimum of all elements of
     X, after removing NaN values.  It is the equivalent of ‘nanmin
     (X(:))’.  The optional flag 'all' cannot be used together with DIM
     or VECDIM input arguments.

     ‘V = nanmin (X, [], VECDIM)’ returns the minimum over the
     dimensions specified in the vector VECDIM.  Each element of VECDIM
     represents a dimension of the input array X and the output V has
     length 1 in the specified operating dimensions.  The lengths of the
     other dimensions are the same for X and V.  For example, if X is a
     2-by-3-by-4 array, then ‘nanmin (X, [], [1 2])’ returns a
     1-by-1-by-4 array.  Each element of the output array is the minimum
     of the elements on the corresponding page of X.  If VECDIM indexes
     all dimensions of X, then it is equivalent to ‘nanmin (X, [],
     'all')’.  Any dimension in VECDIM greater than ‘ndims (X)’ is
     ignored.
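
     As a brief illustration of the calling forms described above, the
     following sketch uses a small made-up matrix.  The variable names
     are illustrative only, and the statistics package is assumed to be
     installed:

     ```octave
     pkg load statistics

     x = [1 NaN 3; 4 5 NaN];

     v_cols = nanmin (x);              # column minima, NaN ignored: [1 5 3]
     v_rows = nanmin (x, [], 2);       # row minima: [1; 4]
     [v, idx] = nanmin (x);            # idx = [1 2 1], row of each minimum
     v_all = nanmin (x, [], 'all');    # minimum of all elements: 1

     x3 = reshape (1:24, 2, 3, 4);     # 2-by-3-by-4 array without NaN values
     v_pages = nanmin (x3, [], [1 2]); # 1-by-1-by-4, one minimum per page
     ```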

     See also: min, nanmax, nansum.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 43
Find the minimum while ignoring NaN values.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
nansum


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1536
 -- statistics: S = nansum (X)
 -- statistics: S = nansum (X, 'all')
 -- statistics: S = nansum (X, DIM)
 -- statistics: S = nansum (X, VECDIM)

     Compute the sum while ignoring NaN values.

     ‘S = nansum (X)’ returns the sum of X, after removing NaN values.
     If X is a vector, a scalar value is returned.  If X is a matrix, a
     row vector of column sums is returned.  If X is a multidimensional
     array, ‘nansum’ operates along the first nonsingleton dimension.
     If all values along a dimension are NaN, the sum is returned as 0.

     ‘S = nansum (X, 'all')’ returns the sum of all elements of X, after
     removing NaN values.  It is the equivalent of ‘nansum (X(:))’.

     ‘S = nansum (X, DIM)’ operates along the dimension DIM of X.

     ‘S = nansum (X, VECDIM)’ returns the sum over the dimensions
     specified in the vector VECDIM.  Each element of VECDIM represents
     a dimension of the input array X and the output S has length 1 in
     the specified operating dimensions.  The lengths of the other
     dimensions are the same for X and S.  For example, if X is a
     2-by-3-by-4 array, then ‘nansum (X, [1 2])’ returns a 1-by-1-by-4
     array.  Each element of the output array is the sum of the
     elements on the corresponding page of X.  If VECDIM indexes all
     dimensions of X, then it is equivalent to ‘nansum (X, 'all')’.  Any
     dimension in VECDIM greater than ‘ndims (X)’ is ignored.
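
     The calling forms above can be sketched as follows.  The data and
     variable names are made up for illustration, and the statistics
     package is assumed to be installed:

     ```octave
     pkg load statistics

     x = [1 NaN 3; 4 5 NaN];

     s_cols = nansum (x);            # column sums, NaN ignored: [5 5 3]
     s_rows = nansum (x, 2);         # row sums: [4; 9]
     s_all = nansum (x, 'all');      # sum of all non-NaN elements: 13
     s_nan = nansum ([NaN NaN]);     # all values are NaN, so the sum is 0

     x3 = reshape (1:24, 2, 3, 4);   # 2-by-3-by-4 array without NaN values
     s_pages = nansum (x3, [1 2]);   # 1-by-1-by-4, one sum per page
     ```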

     See also: sum, nanmin, nanmax.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 42
Compute the sum while ignoring NaN values.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 22
normalise_distribution


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2075
 -- statistics: NORMALISED = normalise_distribution (DATA)
 -- statistics: NORMALISED = normalise_distribution (DATA, DISTRIBUTION)
 -- statistics: NORMALISED = normalise_distribution (DATA, DISTRIBUTION,
          DIMENSION)

     Transform a set of data so as to be N(0,1) distributed according to
     an idea by van Albada and Robinson.

     This is achieved by first passing it through its own cumulative
     distribution function (CDF) in order to get a uniform distribution,
     and then mapping the uniform to a normal distribution.

     The data must be passed as a vector or matrix in DATA.  If the CDF
     is unknown, then [] can be passed in DISTRIBUTION, and in this case
     the empirical CDF will be used.  Otherwise, if the CDFs for all
     data are known, they can be passed in DISTRIBUTION, either in the
     form of a single function name as a string, or a single function
     handle, or a cell array consisting of either all function names as
     strings, or all function handles.  In the latter case, the number
     of CDFs passed must match the number of rows, or columns
     respectively, to normalise.  If the data are passed as a matrix,
     then the transformation will operate either along the first
     non-singleton dimension, or along DIMENSION if present.

     Notes: The empirical CDF will map any two sets of data having the
     same size and their ties in the same places after sorting to some
     permutation of the same normalised data:
          normalise_distribution([1 2 2 3 4])
          ⇒ -1.28  0.00  0.00  0.52  1.28

          normalise_distribution([1 10 100 10 1000])
          ⇒ -1.28  0.00  0.52  0.00  1.28

     Original source: S.J. van Albada, P.A. Robinson "Transformation of
     arbitrary distributions to the normal distribution with application
     to EEG test-retest reliability" Journal of Neuroscience Methods,
     Volume 161, Issue 2, 15 April 2007, Pages 205-211 ISSN 0165-0270,
     10.1016/j.jneumeth.2006.11.004.
     (http://www.sciencedirect.com/science/article/pii/S0165027006005668)


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Transform a set of data so as to be N(0,1) distributed according to an
idea b...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
normplot


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 780
 -- Function File: normplot (X)
 -- Function File: normplot (AX, X)
 -- Function File: H = normplot (...)

     Produce normal probability plot of the data in X.  If X is a
     matrix, ‘normplot’ plots the data for each column.  NaN values are
     ignored.

     ‘H = normplot (AX, X)’ takes an axes handle AX in addition to the
     data in X and plots into those axes.  You can obtain the handle of
     an existing plot with ‘gca’.

     The line joining the 1st and 3rd quartiles is drawn solid whereas
     its extensions to both ends are dotted.  If the underlying
     distribution is normal, the points will cluster around the solid
     part of the line.  Other distribution types will introduce
     curvature in the plot.

     See also: cdfplot, wblplot.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 49
Produce normal probability plot of the data in X.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 16
optimalleaforder


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1313
 -- statistics: LEAFORDER = optimalleaforder (TREE, D)
 -- statistics: LEAFORDER = optimalleaforder (..., NAME, VALUE)

     Compute the optimal leaf ordering of a hierarchical binary cluster
     tree.

     The optimal leaf ordering of a tree is the ordering which minimizes
     the sum of the distances between each leaf and its adjacent leaves,
     without altering the structure of the tree, that is without
     redefining the clusters of the tree.

     Required inputs:
        • TREE: a hierarchical cluster tree TREE generated by the
          ‘linkage’ function.

        • D: a matrix of distances as computed by ‘pdist’.

     Optional inputs can be the following property/value pairs:
        • property 'Criteria' at the moment can only have the value
          'adjacent', for minimizing the distances between leaves.

        • property 'Transformation' can have one of the values 'linear',
          'inverse' or a handle to a custom function which computes the
          similarity matrix S.

     optimalleaforder's output LEAFORDER is the optimal leaf ordering.
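
     A minimal usage sketch, assuming the statistics package is
     installed.  The points in X are made up for illustration, and the
     tree is built with ‘linkage’ from the same distances passed as D:

     ```octave
     pkg load statistics

     X = [0 0; 1 0; 5 5; 6.5 5; 10 0];   # five made-up points in the plane
     D = pdist (X);                      # pairwise distances between rows
     tree = linkage (D);                 # hierarchical binary cluster tree

     leaforder = optimalleaforder (tree, D);
     # leaforder is expected to be a permutation of the leaf indices 1:5
     ```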

     *Reference* Bar-Joseph, Z., Gifford, D.K., and Jaakkola, T.S. Fast
     optimal leaf ordering for hierarchical clustering.  Bioinformatics
     vol.  17 suppl.  1, 2001.

     See also: dendrogram, linkage, pdist.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 72
Compute the optimal leaf ordering of a hierarchical binary cluster tree.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 3
pca


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 3468
 -- statistics: COEFF = pca (X)
 -- statistics: COEFF = pca (X, NAME, VALUE)
 -- statistics: [COEFF, SCORE, LATENT] = pca (...)
 -- statistics: [COEFF, SCORE, LATENT, TSQUARED] = pca (...)
 -- statistics: [COEFF, SCORE, LATENT, TSQUARED, EXPLAINED, MU] = pca
          (...)

     Performs a principal component analysis on a data matrix.

     A principal component analysis of a data matrix of N observations
     in a D dimensional space returns a DxD transformation matrix, to
     perform a change of basis on the data.  The first component of the
     new basis is the direction that maximizes the variance of the
     projected data.

     Input argument:
        • X : a NxD data matrix

     The following NAME, VALUE pair arguments can be used:
        • "Algorithm" defines the algorithm to use:
             • "svd" (default), for singular value decomposition
             • "eig" for eigenvalue decomposition

        • "Centered" is a boolean indicator for centering the
          observation data.  It is ‘true’ by default.
        • "Economy" is a boolean indicator for the economy size output.
          It is ‘true’ by default.  Hence, ‘pca’ returns only the
          elements of LATENT that are not necessarily zero, and the
          corresponding columns of COEFF and SCORE, that is, when N <=
          D, only the first N - 1.

        • "NumComponents" defines the number of components k to return.
          If k < D, then only the first k columns of COEFF and SCORE are
          returned.

        • "Rows" defines how to handle missing values:
             • "complete" (default), missing values are removed before
               computation.
             • "pairwise" (only valid when "Algorithm" is "eig"), the
               covariance of rows with missing data is computed using
               the available data, but the covariance matrix could be
               not positive definite, which triggers the termination of
               ‘pca’.
             • "all", missing values are not allowed, ‘pca’
               terminates with an error if there are any.

        • "Weights" defines observation weights as a vector of positive
          values of length N.

        • "VariableWeights" defines variable weights:
             • a VECTOR of positive values of length D.
             • the string "variance" to use the sample variance as
               weights.

     Return values:
        • COEFF : the principal component coefficients, a DxD
          transformation matrix
        • SCORE : the principal component scores, the representation of
          X in the principal component space
        • LATENT : the principal component variances, i.e., the
          eigenvalues of the covariance matrix of X
        • TSQUARED : Hotelling's T-squared Statistic for each
          observation in X
        • EXPLAINED : the percentage of the variance explained by each
          principal component
        • MU : the estimated mean of each variable of X, it is zero if
          the data are not centered
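
     The relation between the return values can be checked with a small
     sketch.  The data matrix is made up for illustration, default
     options are used, and the statistics package is assumed to be
     installed:

     ```octave
     pkg load statistics

     X = [1 2; 2 4; 3 6; 4 8.5; 5 10];   # N = 5 observations, D = 2
     [coeff, score, latent, tsquared, explained, mu] = pca (X);

     # Undo the change of basis and the centering; with all components
     # retained this should reproduce X up to rounding error.
     X_back = score * coeff' + mu(:)';

     total = sum (explained);   # the percentages should add up to 100
     ```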

     Matlab compatibility note: the alternating least square method
     'als' and associated options 'Coeff0', 'Score0', and 'Options' are
     not yet implemented.

     References
     ----------

       1. Jolliffe, I. T., Principal Component Analysis, 2nd Edition,
          Springer, 2002

     See also: barttest, factoran, pcacov, pcares.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 57
Performs a principal component analysis on a data matrix.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
pcacov


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1483
 -- statistics: COEFF = pcacov (K)
 -- statistics: [COEFF, LATENT] = pcacov (K)
 -- statistics: [COEFF, LATENT, EXPLAINED] = pcacov (K)

     Perform principal component analysis on covariance matrix

     ‘COEFF = pcacov (K)’ performs principal component analysis on the
     square covariance matrix K and returns the principal component
     coefficients, also known as loadings.  The columns are in order of
     decreasing component variance.

     ‘[COEFF, LATENT] = pcacov (K)’ also returns a vector with the
     principal component variances, i.e.  the eigenvalues of K.  LATENT
     has a length of size (COEFF, 1).

     ‘[COEFF, LATENT, EXPLAINED] = pcacov (K)’ also returns a vector
     with the percentage of the total variance explained by each
     principal component.  EXPLAINED has the same size as LATENT.  The
     entries in EXPLAINED range from 0 (none of the variance is
     explained) to 100 (all of the variance is explained).

     ‘pcacov’ does not standardize K to have unit variances.  In order
     to perform principal component analysis on standardized variables,
     use the correlation matrix R = K ./ (SD * SD'), where SD = sqrt
     (diag (K)), in place of K.  To perform principal component analysis
     directly on the data matrix, use ‘pca’.
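
     The standardization step described above can be sketched as
     follows.  The covariance matrix K is made up for illustration, and
     the statistics package is assumed to be installed:

     ```octave
     pkg load statistics

     K = [4 2; 2 9];            # made-up 2-by-2 covariance matrix
     sd = sqrt (diag (K));      # standard deviations: [2; 3]
     R = K ./ (sd * sd');       # correlation matrix: [1 1/3; 1/3 1]

     [coeff, latent, explained] = pcacov (R);
     # latent holds the eigenvalues of R in decreasing order, here 4/3
     # and 2/3, so explained is about [66.67; 33.33]
     ```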

     References
     ----------

       1. Jolliffe, I. T., Principal Component Analysis, 2nd Edition,
          Springer, 2002

     See also: barttest, factoran, pcares, pca.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 57
Perform principal component analysis on covariance matrix



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
pcares


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1308
 -- statistics: RESIDUALS = pcares (X, NDIM)
 -- statistics: [RESIDUALS, RECONSTRUCTED] = pcares (X, NDIM)

     Calculate residuals from principal component analysis.

     ‘RESIDUALS = pcares (X, NDIM)’ returns the residuals obtained by
     retaining NDIM principal components of the NxD matrix X.  Rows of X
     correspond to observations, columns of X correspond to variables.
     NDIM is a scalar and must be less than or equal to D. RESIDUALS is
     a matrix of the same size as X.  Use the data matrix, not the
     covariance matrix, with this function.

     ‘[RESIDUALS, RECONSTRUCTED] = pcares (X, NDIM)’ also returns the
     reconstructed observations, i.e.  the approximation to X obtained
     by retaining its first NDIM principal components.

     ‘pcares’ does not normalize the columns of X.  Use pcares (zscore
     (X), NDIM) in order to perform the principal components analysis
     based on standardized variables, i.e.  based on correlations.  Use
     ‘pcacov’ in order to perform principal components analysis directly
     on a covariance or correlation matrix without constructing
     residuals.
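
     A minimal usage sketch with made-up data, assuming the statistics
     package is installed.  It relies on the assumption that the
     residuals are the difference between X and the reconstruction, so
     the two outputs should add up to X:

     ```octave
     pkg load statistics

     X = [1 2; 2 4; 3 6; 4 8.5; 5 10];   # N = 5 observations, D = 2
     [residuals, reconstructed] = pcares (X, 1);   # keep one component

     # Largest deviation of residuals + reconstructed from X; this is
     # expected to be at rounding-error level.
     err = max (abs (residuals + reconstructed - X)(:));
     ```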

     References
     ----------

       1. Jolliffe, I. T., Principal Component Analysis, 2nd Edition,
          Springer, 2002

     See also: factoran, pcacov, pca.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 54
Calculate residuals from principal component analysis.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 5
pdist


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 3791
 -- statistics: D = pdist (X)
 -- statistics: D = pdist (X, DISTANCE)
 -- statistics: D = pdist (X, DISTANCE, DISTPARAMETER)

     Return the distance between any two rows in X.

     ‘D = pdist (X)’ calculates the Euclidean distance between pairs of
     observations in X.  X must be an MxP numeric matrix representing M
     points in P-dimensional space.  This function computes the pairwise
     distances returned in D as a row vector of length M*(M-1)/2.  Use
     ‘Z = squareform (D)’ to convert the row vector D into an MxM
     symmetric matrix Z, where Z(i,j) corresponds to the pairwise
     distance between points i and j.
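
     For illustration, a short sketch with three made-up points, assuming
     the statistics package is installed:

     ```octave
     pkg load statistics

     X = [0 0; 3 4; 6 8];       # M = 3 points in 2-dimensional space

     D = pdist (X);             # pairs (1,2), (1,3), (2,3): [5 10 5]
     Z = squareform (D);        # 3-by-3 symmetric matrix, zero diagonal

     D_city = pdist (X, "cityblock");   # city block metric: [7 14 7]
     ```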

     ‘D = pdist (X, DISTANCE)’ returns the distance between pairs of
     observations in X using the metric specified by DISTANCE, which can
     be any of the following options.

     "euclidean"         Euclidean distance.
     "squaredeuclidean"  Squared Euclidean distance.
     "seuclidean"        standardized Euclidean distance.  Each
                         coordinate difference between rows in X is
                         scaled by dividing by the corresponding
                         element of the standard deviation computed
                         from X.  A different scaling vector can be
                         specified with the subsequent DISTPARAMETER
                         input argument.
     "mahalanobis"       Mahalanobis distance, computed using a
                         positive definite covariance matrix.  A
                         different covariance matrix can be specified
                         with the subsequent DISTPARAMETER input
                         argument.
     "cityblock"         City block distance.
     "minkowski"         Minkowski distance.  The default exponent is
                         2.  A different exponent can be specified
                         with the subsequent DISTPARAMETER input
                         argument.
     "chebychev"         Chebychev distance (maximum coordinate
                         difference).
     "cosine"            One minus the cosine of the included angle
                         between points (treated as vectors).
     "correlation"       One minus the sample linear correlation
                         between observations (treated as sequences of
                         values).
     "hamming"           Hamming distance, which is the percentage of
                         coordinates that differ.
     "jaccard"           One minus the Jaccard coefficient, which is
                         the percentage of nonzero coordinates that
                         differ.
     "spearman"          One minus the sample Spearman's rank
                         correlation between observations (treated as
                         sequences of values).
     @DISTFUN            Custom distance function handle.  A distance
                         function of the form ‘function D2 = distfun
                         (XI, YI)’, where XI is a 1xP vector
                         containing a single observation in
                         P-dimensional space, YI is an NxP matrix
                         containing an arbitrary number of
                         observations in the same P-dimensional space,
                         and D2 is an Nx1 vector of distances, where
                         D2(k) is the distance between observations XI
                         and YI(k,:).

     ‘D = pdist (X, DISTANCE, DISTPARAMETER)’ returns the distance using
     the metric specified by DISTANCE and DISTPARAMETER.  The latter can
     only be specified when the selected DISTANCE is "seuclidean",
     "minkowski", or "mahalanobis".

     See also: pdist2, squareform, linkage.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 46
Return the distance between any two rows in X.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
pdist2


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 4811
 -- statistics: D = pdist2 (X, Y)
 -- statistics: D = pdist2 (X, Y, DISTANCE)
 -- statistics: D = pdist2 (X, Y, DISTANCE, DISTPARAMETER)
 -- statistics: D = pdist2 (..., NAME, VALUE)
 -- statistics: [D, I] = pdist2 (..., NAME, VALUE)

     Compute pairwise distance between two sets of vectors.

     ‘D = pdist2 (X, Y)’ calculates the euclidean distance between each
     pair of observations in X and Y.  Let X be an MxP matrix
     representing M points in P-dimensional space and Y be an NxP matrix
     representing another set of points in the same space.  This
     function computes the MxN distance matrix D, where D(i,j) is the
     distance between X(i,:) and Y(j,:).
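
     For illustration, a short sketch with made-up points, assuming the
     statistics package is installed:

     ```octave
     pkg load statistics

     X = [0 0; 3 4; 6 8];       # M = 3 points
     Y = [0 0; 6 8];            # N = 2 points in the same space

     D = pdist2 (X, Y);         # 3-by-2 matrix: [0 10; 5 5; 10 0]
     D_city = pdist2 (X, Y, "cityblock");   # [0 14; 7 7; 14 0]
     ```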

     ‘D = pdist2 (X, Y, DISTANCE)’ returns the distance between each
     pair of observations in X and Y using the metric specified by
     DISTANCE, which can be any of the following options.

     "euclidean"         Euclidean distance.
     "squaredeuclidean"  Squared Euclidean distance.
     "seuclidean"        standardized Euclidean distance.  Each
                         coordinate difference between the rows in X
                         and the query matrix Y is scaled by dividing
                         by the corresponding element of the standard
                         deviation computed from X.  A different
                         scaling vector can be specified with the
                         subsequent DISTPARAMETER input argument.
     "mahalanobis"       Mahalanobis distance, computed using a
                         positive definite covariance matrix.  A
                         different covariance matrix can be specified
                         with the subsequent DISTPARAMETER input
                         argument.
     "cityblock"         City block distance.
     "minkowski"         Minkowski distance.  The default exponent is
                         2.  A different exponent can be specified
                         with the subsequent DISTPARAMETER input
                         argument.
     "chebychev"         Chebychev distance (maximum coordinate
                         difference).
     "cosine"            One minus the cosine of the included angle
                         between points (treated as vectors).
     "correlation"       One minus the sample linear correlation
                         between observations (treated as sequences of
                         values).
     "hamming"           Hamming distance, which is the percentage of
                         coordinates that differ.
     "jaccard"           One minus the Jaccard coefficient, which is
                         the percentage of nonzero coordinates that
                         differ.
     "spearman"          One minus the sample Spearman's rank
                         correlation between observations (treated as
                         sequences of values).
     @DISTFUN            Custom distance function handle.  A distance
                         function of the form ‘function D2 = distfun
                         (XI, YI)’, where XI is a 1xP vector
                         containing a single observation in
                         P-dimensional space, YI is an NxP matrix
                         containing an arbitrary number of
                         observations in the same P-dimensional space,
                         and D2 is an Nx1 vector of distances, where
                         D2(k) is the distance between observations XI
                         and YI(k,:).

     ‘D = pdist2 (X, Y, DISTANCE, DISTPARAMETER)’ returns the distance
     using the metric specified by DISTANCE and DISTPARAMETER.  The
     latter can only be specified when the selected DISTANCE is
     "seuclidean", "minkowski", or "mahalanobis".

     ‘D = pdist2 (..., NAME, VALUE)’ for any previous arguments,
     modifies the computation using NAME-VALUE parameters.
        • ‘D = pdist2 (X, Y, DISTANCE, "Smallest", K)’ computes the
          distance using the metric specified by DISTANCE and returns
          the K smallest pairwise distances to observations in X for
          each observation in Y in ascending order.
        • ‘D = pdist2 (X, Y, DISTANCE, DISTPARAMETER, "Largest", K)’
          computes the distance using the metric specified by DISTANCE
          and DISTPARAMETER and returns the K largest pairwise distances
          in descending order.

     ‘[D, I] = pdist2 (..., NAME, VALUE)’ also returns the matrix I,
     which contains the indices of the observations in X corresponding
     to the distances in D.  You must specify either "Smallest" or
     "Largest" as an optional NAME-VALUE pair argument to compute
     the second output argument.

     See also: pdist, knnsearch, rangesearch.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 54
Compute pairwise distance between two sets of vectors.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 10
plsregress


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 5477
 -- statistics: [XLOAD, YLOAD] = plsregress (X, Y)
 -- statistics: [XLOAD, YLOAD] = plsregress (X, Y, NCOMP)
 -- statistics: [XLOAD, YLOAD, XSCORE, YSCORE, COEF, PCTVAR, MSE, STATS]
          = plsregress (X, Y, NCOMP)
 -- statistics: [XLOAD, YLOAD, XSCORE, YSCORE, COEF, PCTVAR, MSE, STATS]
          = plsregress (..., NAME, VALUE)

     Calculate partial least squares regression using SIMPLS algorithm.

     ‘plsregress’ uses the SIMPLS algorithm, and first centers X and Y
     by subtracting off column means to get centered variables.
     However, it does not rescale the columns.  To perform partial least
     squares regression with standardized variables, use ‘zscore’ to
     normalize X and Y.

     ‘[XLOAD, YLOAD] = plsregress (X, Y)’ computes a partial least
     squares regression of Y on X, using NCOMP PLS components, which by
     default are calculated as min (size (X, 1) - 1, size(X, 2)), and
     returns the predictor and response loadings in XLOAD and YLOAD,
     respectively.
        • X is an NxP matrix of predictor variables, with rows
          corresponding to observations, and columns corresponding to
          variables.
        • Y is an NxM response matrix.
        • XLOAD is a PxNCOMP matrix of predictor loadings, where each
          row of XLOAD contains coefficients that define a linear
          combination of PLS components that approximate the original
          predictor variables.
        • YLOAD is an MxNCOMP matrix of response loadings, where each
          row of YLOAD contains coefficients that define a linear
          combination of PLS components that approximate the original
          response variables.

     ‘[XLOAD, YLOAD] = plsregress (X, Y, NCOMP)’ defines the desired
     number of PLS components to use in the regression.  NCOMP, a scalar
     positive integer, must not exceed the default calculated value.

     ‘[XLOAD, YLOAD, XSCORE, YSCORE, COEF, PCTVAR, MSE, STATS] =
     plsregress (X, Y, NCOMP)’ also returns the following arguments:
        • XSCORE is an NxNCOMP orthonormal matrix with the predictor
          scores, i.e., the PLS components that are linear combinations
          of the variables in X, with rows corresponding to observations
          and columns corresponding to components.
        • YSCORE is an NxNCOMP orthonormal matrix with the response
          scores, i.e., the linear combinations of the responses with
          which the PLS components XSCORE have maximum covariance, with
          rows corresponding to observations and columns corresponding
          to components.
        • COEF is a (P+1)xM matrix with the PLS regression coefficients,
          containing the intercepts in the first row.
        • PCTVAR is a 2xNCOMP matrix containing the percentage of the
          variance explained by the model with the first row containing
          the percentage of explained variance in X by each PLS
          component and the second row containing the percentage of
          explained variance in Y.
        • MSE is a 2x(NCOMP+1) matrix containing the estimated mean
          squared errors for PLS models with 0:NCOMP components with the
          first row containing the mean squared errors for the predictor
          variables in X and the second row containing the mean squared
          errors for the response variable(s) in Y.
        • STATS is a structure with the following fields:
             • STATS.W is a PxNCOMP matrix of PLS weights.
             • STATS.T2 is the T^2 statistic for each point in XSCORE.
             • STATS.Xresiduals is an NxP matrix with the predictor
               residuals.
             • STATS.Yresiduals is an NxM matrix with the response
               residuals.
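
     The output sizes listed above can be checked with a small sketch.
     The data are made up for illustration, the fitted values are formed
     by prepending a column of ones for the intercept row of COEF, and
     the statistics package is assumed to be installed:

     ```octave
     pkg load statistics

     X = [1 2; 2 1; 3 5; 4 3; 5 6];      # N = 5 observations, P = 2
     Y = [1.1; 1.9; 3.2; 3.8; 5.1];      # N-by-1 response, M = 1
     ncomp = 1;                          # must not exceed min (N - 1, P)

     [xload, yload, xscore, yscore, coef] = plsregress (X, Y, ncomp);

     # The first row of coef holds the intercept, so the fitted responses
     # are obtained from the design matrix [1, X].
     y_fit = [ones(rows (X), 1), X] * coef;
     ```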

     ‘[...] = plsregress (..., NAME, VALUE, ...)’ specifies one or more
     of the following NAME/VALUE pairs:

          NAME           VALUE
     ---------------------------------------------------------------------------
          "CV"           The method used to compute MSE.  When VALUE is a
                         positive integer K, ‘plsregress’ uses K-fold
                         cross-validation.  Set VALUE to a cross-validation
                         partition, created using ‘cvpartition’, to use other
                         forms of cross-validation.  Set VALUE to
                         "resubstitution" to use both X and Y to fit the
                         model and to estimate the mean squared errors,
                         without cross-validation.  By default, VALUE =
                         "resubstitution".
          "MCReps"       A positive integer indicating the number of
                         Monte-Carlo repetitions for cross-validation.  By
                         default, VALUE = 1.  A different "MCReps" value is
                         only meaningful when using the "HoldOut" method for
                         cross-validation, previously set by a ‘cvpartition’
                         object.  If no cross-validation method is used, then
                         "MCReps" must be 1.

     Further information about the PLS regression can be found at
     <https://en.wikipedia.org/wiki/Partial_least_squares_regression>

     References
     ----------

       1. SIMPLS: An alternative approach to partial least squares
          regression.  Chemometrics and Intelligent Laboratory Systems
          (1993)


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 66
Calculate partial least squares regression using SIMPLS algorithm.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
ppplot


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 955
 -- statistics: ppplot (X, DIST)
 -- statistics: ppplot (X, DIST, PARAMS)
 -- statistics: [P, Y] = ppplot (X, DIST, PARAMS)

     Perform a PP-plot (probability plot).

     If F is the CDF of the distribution DIST with parameters PARAMS and
     X a sample vector of length N, the PP-plot graphs ordinate Y(I) = F
     (I-th smallest element of X) versus abscissa P(I) = (I - 0.5)/N.  If
     the sample comes from F, the pairs will approximately follow a
     straight line.
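
     The plotted coordinates can be reproduced by hand.  The sketch below
     uses a made-up sample, assumes the sample is sorted in ascending
     order, and writes the standard normal CDF in terms of ‘erfc’ so that
     the manual part needs no package function:

     ```octave
     pkg load statistics

     x = [0.3 -1.2 0.8 2.1];           # made-up sample, N = 4
     [p, y] = ppplot (x, "norm");      # request the coordinates

     n = numel (x);
     p_manual = ((1:n) - 0.5) / n;     # abscissa P(I) = (I - 0.5)/N
     y_manual = 0.5 * erfc (-sort (x) / sqrt (2));   # normal CDF of sorted X
     ```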

     The default for DIST is the standard normal distribution.

     The optional argument PARAMS contains a list of parameters of DIST.

     For example, for a probability plot of the uniform distribution on
     [2,4] and X, use

          ppplot (x, "unif", 2, 4)

     DIST can be any string for which a function DISTCDF that calculates
     the CDF of distribution DIST exists.

     If no output is requested then the data are plotted immediately.

     See also: qqplot.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 37
Perform a PP-plot (probability plot).



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
princomp


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1247
 -- statistics: COEFF = princomp (X)
 -- statistics: [COEFF, SCORE] = princomp (X)
 -- statistics: [COEFF, SCORE, LATENT] = princomp (X)
 -- statistics: [COEFF, SCORE, LATENT, TSQUARE] = princomp (X)
 -- statistics: [...] = princomp (X, "econ")

     Performs a principal component analysis on a NxP data matrix X.

        • COEFF : returns the principal component coefficients
        • SCORE : returns the principal component scores, the
          representation of X in the principal component space
        • LATENT : returns the principal component variances, i.e., the
          eigenvalues of the covariance matrix of X.
        • TSQUARE : returns Hotelling's T-squared Statistic for each
          observation in X
        • [...]  = princomp(X,'econ') returns only the elements of
          latent that are not necessarily zero, and the corresponding
          columns of COEFF and SCORE, that is, when n <= p, only the
          first n-1.  This can be significantly faster when p is much
          larger than n.  In this case the svd will be applied on the
          transpose of the data matrix X.

     References
     ----------

       1. Jolliffe, I. T., Principal Component Analysis, 2nd Edition,
          Springer, 2002


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 63
Performs a principal component analysis on a NxP data matrix X.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
probit


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 184
 -- statistics: X = probit (P)

     Probit transformation

     Return the probit (the quantile of the standard normal
     distribution) for each element of P.

     See also: logit.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 21
Probit transformation



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 10
procrustes


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2477
 -- statistics: D = procrustes (X, Y)
 -- statistics: D = procrustes (X, Y, PARAM1, VALUE1, ...)
 -- statistics: [D, Z] = procrustes (...)
 -- statistics: [D, Z, TRANSFORM] = procrustes (...)

     Procrustes Analysis.

     ‘D = procrustes (X, Y)’ computes a linear transformation of the
     points in the matrix Y to best conform them to the points in the
     matrix X by minimizing the sum of squared errors, as the goodness
     of fit criterion, which is returned in D as a dissimilarity
     measure.  D is standardized by a measure of the scale of X, given
     by
        • sum (sum ((X - repmat (mean (X, 1), size (X, 1), 1)) .^ 2, 1))
     i.e., the sum of squared elements of a centered version of X.
     However, if X comprises repetitions of the same point, the sum of
     squared errors is not standardized.

     X and Y must have the same number of points (rows) and procrustes
     matches the i-th point in Y to the i-th point in X.  Points in Y
     can have smaller dimensions (columns) than those in X, but not the
     opposite.  Missing dimensions in Y are added with padding columns
     of zeros as necessary to match the dimensionality of X.

     ‘[D, Z] = procrustes (X, Y)’ also returns the transformed Y values,
     Z.

     ‘[D, Z, TRANSFORM] = procrustes (X, Y)’ also returns the
     transformation that maps Y to Z.

     TRANSFORM is a structure with fields:

          c            the translation component
          T            the orthogonal rotation and reflection component
          b            the scale component

     So that ‘Z = TRANSFORM.b * Y * TRANSFORM.T + TRANSFORM.c’

     procrustes can take two optional parameters as Name-Value pairs.

     ‘[...] = procrustes (..., "Scaling", false)’ computes a
     transformation that does not include scaling, that is TRANSFORM.b =
     1.  Setting "Scaling" to true includes a scaling component, which
     is the default.

     ‘[...] = procrustes (..., "Reflection", false)’ computes a
     transformation that does not include a reflection component, so
     det(TRANSFORM.T) = 1.  Setting "Reflection" to true forces the
     solution to include a reflection component in the computed
     transformation, so det(TRANSFORM.T) = -1.

     ‘[...] = procrustes (..., "Reflection", "best")’ computes the best
     fit procrustes solution, which may or may not include a reflection
     component, which is the default.

     See also: cmdscale.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 20
Procrustes Analysis.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
qqplot


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1200
 -- statistics: [Q, S] = qqplot (X)
 -- statistics: [Q, S] = qqplot (X, Y)
 -- statistics: [Q, S] = qqplot (X, DIST)
 -- statistics: [Q, S] = qqplot (X, Y, PARAMS)
 -- statistics: qqplot (...)

     Perform a QQ-plot (quantile plot).

     If F is the CDF of the distribution DIST with parameters PARAMS and
     G its inverse, and X a sample vector of length N, the QQ-plot
     graphs ordinate S(I) = I-th smallest element of X versus abscissa
     Q(I) = G((I - 0.5)/N).

     If the sample comes from F, except for a transformation of location
     and scale, the pairs will approximately follow a straight line.

     If the second argument is a vector Y the empirical CDF of Y is used
     as DIST.

     The default for DIST is the standard normal distribution.  The
     optional argument PARAMS contains a list of parameters of DIST.
     For example, for a quantile plot of the uniform distribution on
     [2,4] and X, use

          qqplot (x, "unif", 2, 4)

     DIST can be any string for which a function DISTINV or DIST_INV
     exists that calculates the inverse CDF of distribution DIST.

     If no output arguments are given, the data are plotted directly.

     See also: ppplot.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 34
Perform a QQ-plot (quantile plot).



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
qrandn


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 503
 -- statistics: Z = qrandn (Q, R, C)
 -- statistics: Z = qrandn (Q, [R, C])

     Returns random deviates drawn from a q-Gaussian distribution.

     Parameter Q characterizes the q-Gaussian distribution.  The result
     has the size set by R and C.

     Reference: W. Thistleton, J. A. Marsh, K. Nelson, C. Tsallis (2006)
     "Generalized Box-Muller method for generating q-Gaussian random
     deviates" arXiv:cond-mat/0605570
     http://arxiv.org/abs/cond-mat/0605570

     See also: rand, randn.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 61
Returns random deviates drawn from a q-Gaussian distribution.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 10
randsample


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 678
 -- statistics: Y = randsample (V, K)
 -- statistics: Y = randsample (V, K, REPLACEMENT=false)
 -- statistics: Y = randsample (V, K, REPLACEMENT=false, [W=[]])

     Sample elements from a vector.

     Returns K random elements from a vector V with N elements, sampled
     without or with REPLACEMENT.

     If V is a scalar, samples from 1:V.

     If a weight vector W of the same size as V is specified, the
     probability of each element being sampled is proportional to W.
     Unlike Matlab's function of the same name, this can be done for
     sampling with or without replacement.

     Randomization is performed using rand().

     See also: datasample, randperm.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 30
Sample elements from a vector.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 11
rangesearch


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6666
 -- statistics: IDX = rangesearch (X, Y, R)
 -- statistics: [IDX, D] = rangesearch (X, Y, R)
 -- statistics: [...] = rangesearch (..., NAME, VALUE)

     Find all neighbors within specified distance from input data.

     ‘IDX = rangesearch (X, Y, R)’ returns all the points in X that are
     within distance R from the points in Y.  X must be an NxP numeric
     matrix of input data, where rows correspond to observations and
     columns correspond to features or variables.  Y is an MxP numeric
     matrix with query points, which must have the same number of
     columns as X.  R must be a nonnegative scalar value.  IDX is an Mx1
     cell array, where M is the number of observations in Y.  The vector
     IDX{j} contains the indices of observations (rows) in X whose
     distances to Y(j,:) are not greater than R.

     ‘[IDX, D] = rangesearch (X, Y, R)’ also returns the distances, D,
     which correspond to the points in X that are within distance R from
     the points in Y.  D is an Mx1 cell array, where M is the number of
     observations in Y.  The vector D{j} contains the distances of
     observations (rows) in X whose distances to Y(j,:) are not greater
     than R.

     Additional parameters can be specified by Name-Value pair
     arguments.

     NAME            VALUE
                     
     ---------------------------------------------------------------------------
     "P"             is the Minkowski distance exponent and it must be a
                     positive scalar.  This argument is only valid when the
                     selected distance metric is "minkowski".  By default it
                     is 2.
                     
     "Scale"         is the scale parameter for the standardized Euclidean
                     distance and it must be a nonnegative numeric vector of
                     equal length to the number of columns in X.  This
                     argument is only valid when the selected distance metric
                     is "seuclidean", in which case each coordinate of X is
                     scaled by the corresponding element of "Scale", as is
                     each query point in Y.  By default, the scale parameter
                     is the standard deviation of each coordinate in X.
                     
     "Cov"           is the covariance matrix for computing the Mahalanobis
                     distance and it must be a positive definite matrix
                     that matches the number of columns in X.  This argument
                     is only valid when the selected distance metric is
                     "mahalanobis".
                     
     "BucketSize"    is the maximum number of data points in the leaf node of
                     the Kd-tree and it must be a positive integer.  This
                     argument is only valid when the selected search method
                     is "kdtree".
                     
     "SortIndices"   is a boolean flag to sort the returned indices in
                     ascending order by distance and it is true by default.
                     When the selected search method is "exhaustive" or the
                     "IncludeTies" flag is true, ‘rangesearch’ always sorts
                     the returned indices.
                     
     "Distance"      is the distance metric used by ‘rangesearch’ as
                     specified below:

          "euclidean"    Euclidean distance.
          "seuclidean"   standardized Euclidean distance.  Each coordinate
                         difference between the rows in X and the query
                         matrix Y is scaled by dividing by the corresponding
                         element of the standard deviation computed from X.
                         To specify a different scaling, use the "Scale"
                         name-value argument.
          "cityblock"    City block distance.
          "chebychev"    Chebychev distance (maximum coordinate difference).
          "minkowski"    Minkowski distance.  The default exponent is 2.  To
                         specify a different exponent, use the "P" name-value
                         argument.
          "mahalanobis"  Mahalanobis distance, computed using a positive
                         definite covariance matrix.  To change the value of
                         the covariance matrix, use the "Cov" name-value
                         argument.
          "cosine"       Cosine distance.
          "correlation"  One minus the sample linear correlation between
                         observations (treated as sequences of values).
          "spearman"     One minus the sample Spearman's rank correlation
                         between observations (treated as sequences of
                         values).
          "hamming"      Hamming distance, which is the percentage of
                         coordinates that differ.
          "jaccard"      One minus the Jaccard coefficient, which is the
                         percentage of nonzero coordinates that differ.
          @DISTFUN       Custom distance function handle.  A distance
                         function of the form ‘function D2 = distfun (XI,
                         YI)’, where XI is a 1xP vector containing a single
                         observation in P-dimensional space, YI is an NxP
                         matrix containing an arbitrary number of
                         observations in the same P-dimensional space, and D2
                         is an Nx1 vector of distances, where D2(k) is the
                         distance between observations XI and YI(k,:).

     "NSMethod"      is the nearest neighbor search method used by
                     ‘rangesearch’ as specified below.

          "kdtree"       Creates and uses a Kd-tree to find nearest
                         neighbors.  "kdtree" is the default value when the
                         number of columns in X is less than or equal to 10,
                         X is not sparse, and the distance metric is
                         "euclidean", "cityblock", "manhattan", "chebychev",
                         or "minkowski".  Otherwise, the default value is
                         "exhaustive".  This argument is only valid when the
                         distance metric is one of the four aforementioned
                         metrics.
          "exhaustive"   Uses the exhaustive search algorithm by computing
                         the distance values from all the points in X to each
                         point in Y.

     See also: knnsearch, pdist2.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 61
Find all neighbors within specified distance from input data.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
ranksum


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 3121
 -- statistics: P = ranksum (X, Y)
 -- statistics: P = ranksum (X, Y, ALPHA)
 -- statistics: P = ranksum (X, Y, ALPHA, NAME, VALUE)
 -- statistics: P = ranksum (X, Y, NAME, VALUE)
 -- statistics: [P, H] = ranksum (X, Y, ...)
 -- statistics: [P, H, STATS] = ranksum (X, Y, ...)

     Wilcoxon rank sum test for equal medians.  This test is equivalent
     to a Mann-Whitney U-test.

     ‘P = ranksum (X, Y)’ returns the p-value of a two-sided Wilcoxon
     rank sum test.  It tests the null hypothesis that two independent
     samples, in the vectors X and Y, come from continuous distributions
     with equal medians, against the alternative hypothesis that they
     are not.  X and Y can have different lengths and the test assumes
     that they are independent.

     ‘ranksum’ treats NaN in X, Y as missing values.  The two-sided
     p-value is computed by doubling the most significant one-sided
     value.

     ‘[P, H] = ranksum (X, Y)’ also returns the result of the hypothesis
     test with ‘H = 1’ indicating a rejection of the null hypothesis at
     the default alpha = 0.05 significance level, and ‘H = 0’ indicating
     a failure to reject the null hypothesis at the same significance
     level.

     ‘[P, H, STATS] = ranksum (X, Y)’ also returns the structure STATS
     with information about the test statistic.  It contains the field
     ‘ranksum’ with the value of the rank sum test statistic and if
     computed with the "approximate" method it also contains the value
     of the z-statistic in the field ‘zval’.

     ‘[...] = ranksum (X, Y, ALPHA)’ or alternatively ‘[...] = ranksum
     (X, Y, "alpha", ALPHA)’ returns the result of the hypothesis test
     performed at the significance level ALPHA.

     ‘[...] = ranksum (X, Y, "method", M)’ defines the computation
     method of the p-value specified in M, which can be "exact",
     "approximate", or "oldexact".  M must be a single string.  When
     "method" is unspecified, the default is: "exact" when ‘min (length
     (X), length (Y)) < 10’ and ‘length (X) + length (Y) < 10’,
     otherwise the "approximate" method is used.

        • "exact" method uses full enumeration for small total sample
          size (< 10), otherwise the network algorithm is used for
          larger samples.
        • "approximate" uses normal approximation method for computing
          the p-value.
        • "oldexact" uses full enumeration for any sample size.  Note,
          that this option can lead to out of memory error for large
          samples.  Use with caution!

     ‘[...] = ranksum (X, Y, "tail", TAIL)’ defines the type of test,
     which can be "both", "right", or "left".  TAIL must be a single
     string.

        • "both" - "medians are not equal" (two-tailed test, default)
        • "right" - "median of X is greater than median of Y"
          (right-tailed test)
        • "left" - "median of X is less than median of Y" (left-tailed
          test)

     Note: the rank sum statistic is based on the smaller sample of
     vectors X and Y.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 41
Wilcoxon rank sum test for equal medians.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
regress


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1449
 -- statistics: [B, BINT, R, RINT, STATS] = regress (Y, X, [ALPHA])

     Multiple Linear Regression using Least Squares Fit of Y on X with
     the model ‘y = X * beta + e’.

     Here,

        • ‘y’ is a column vector of observed values
        • ‘X’ is a matrix of regressors, with the first column filled
          with the constant value 1
        • ‘beta’ is a column vector of regression parameters
        • ‘e’ is a column vector of random errors

     Arguments are

        • Y is the ‘y’ in the model
        • X is the ‘X’ in the model
        • ALPHA is the significance level used to calculate the
          confidence intervals BINT and RINT (see 'Return values'
          below).  If not specified, ALPHA defaults to 0.05

     Return values are

        • B is the ‘beta’ in the model
        • BINT is the confidence interval for B
        • R is a column vector of residuals
        • RINT is the confidence interval for R
        • STATS is a row vector containing:

             • The R^2 statistic
             • The F statistic
             • The p value for the full model
             • The estimated error variance

     R and RINT can be passed to ‘rcoplot’ to visualize the residual
     intervals and identify outliers.

     NaN values in Y and X are removed before calculation begins.

     See also: regress_gp, regression_ftest, regression_ttest.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Multiple Linear Regression using Least Squares Fit of Y on X with the
model ...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 10
regress_gp


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2409
 -- statistics: [YFIT, YINT, M, K] = regress_gp (X, Y, XFIT)
 -- statistics: [YFIT, YINT, M, K] = regress_gp (X, Y, XFIT, "linear")
 -- statistics: [YFIT, YINT, YSD] = regress_gp (X, Y, XFIT, "rbf")
 -- statistics: [...] = regress_gp (X, Y, XFIT, "linear", SP)
 -- statistics: [...] = regress_gp (X, Y, XFIT, SP)
 -- statistics: [...] = regress_gp (X, Y, XFIT, "rbf", THETA)
 -- statistics: [...] = regress_gp (X, Y, XFIT, "rbf", THETA, G)
 -- statistics: [...] = regress_gp (X, Y, XFIT, "rbf", THETA, G, ALPHA)
 -- statistics: [...] = regress_gp (X, Y, XFIT, THETA)
 -- statistics: [...] = regress_gp (X, Y, XFIT, THETA, G)
 -- statistics: [...] = regress_gp (X, Y, XFIT, THETA, G, ALPHA)

     Regression using Gaussian Processes.

     ‘[YFIT, YINT, M, K] = regress_gp (X, Y, XFIT)’ will estimate a
     linear Gaussian Process model M in the form Y = X' * M, where X is
     an NxP matrix with N observations in P dimensional space and Y is
     an Nx1 column vector as the dependent variable.  The information
     about errors of the predictions (interpolation/extrapolation) is
     given by the covariance matrix K.  By default, the linear model
     defines the prior covariance of M as ‘SP = 100 * eye (size (X, 2) +
     1)’.  A custom prior covariance matrix can be passed as SP, which
     must be a P+1xP+1 positive definite matrix.  The model is evaluated
     for input XFIT, which must have the same columns as X, and the
     estimates are returned in YFIT along with the estimated variation
     in YINT.  YINT(:,1) contains the upper boundary estimate and
     YINT(:,2) contains the lower boundary estimate with respect to
     YFIT.

     ‘[YFIT, YINT, YSD, K] = regress_gp (X, Y, XFIT, "rbf")’ will
     estimate a Gaussian Process model with a Radial Basis Function
     (RBF) kernel with default parameters THETA = 5, which corresponds
     to the characteristic lengthscale, and G = 0.01, which corresponds
     to the nugget effect, and ALPHA = 0.05 which defines the confidence
     level for the estimated intervals returned in YINT.  The function
     also returns the predictive covariance matrix in YSD.  For
     multidimensional predictors X the function will automatically
     normalize each column to a zero mean and a standard deviation of
     one.

     Run ‘demo regress_gp’ to see examples.

     See also: regress, regression_ftest, regression_ttest.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 36
Regression using Gaussian Processes.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 16
regression_ftest


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2407
 -- statistics: [H, PVAL, STATS] = regression_ftest (Y, X, FM)
 -- statistics: [...] = regression_ftest (Y, X, FM, RM)
 -- statistics: [...] = regression_ftest (Y, X, FM, RM, NAME, VALUE)
 -- statistics: [...] = regression_ftest (Y, X, FM, [], NAME, VALUE)

     F-test for General Linear Regression Analysis

     Perform a general linear regression F test for the null hypothesis
     that the full model of the form y = b_0 + b_1 * x_1 + b_2 * x_2 +
     ... + b_n * x_n + e, where n is the number of variables in X, does
     not perform better than a reduced model, such as y = b'_0 + b'_1 *
     x_1 + b'_2 * x_2 + ... + b'_k * x_k + e, where k < n and it
     corresponds to the first k variables in X.  Response (dependent)
     variable Y and explanatory (independent) variables X must not contain
     any missing values (NaNs).

     The full model, FM, must be a vector of length equal to the columns
     of X, in which case the constant term b_0 is assumed 0, or equal to
     the columns of X plus one, in which case the first element is the
     constant b_0.

     The reduced model, RM, must include the constant term and a subset
     of the variables (columns) in X.  If RM is not given, then a
     constant term b'_0 is assumed equal to the constant term, b_0, of
     the full model or 0, if the full model, FM, does not have a
     constant term.  RM must be a vector or a scalar if only a constant
     term is passed into the function.

     Name-Value pair arguments can be used to set statistical
     significance.  "alpha" can be used to specify the significance
     level of the test (the default value is 0.05).  If you want to pass
     optional Name-Value pair without a reduced model, make sure that
     the latter is passed as an empty variable.

     If H is 1 the null hypothesis is rejected, meaning that the full
     model explains the variance better than the restricted model.  If H
     is 0, it can be assumed that the full model does NOT explain the
     variance any better than the restricted model.

     The p-value (1 minus the CDF of this distribution at F) is returned
     in PVAL.

     Under the null, the test statistic F follows an F distribution with
     'df1' and 'df2' degrees of freedom, which are returned as fields in
     the STATS structure along with the test's F-statistic, 'fstat'

     See also: regression_ttest, regress, regress_gp.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 45
F-test for General Linear Regression Analysis



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 16
regression_ttest


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1762
 -- statistics: H = regression_ttest (Y, X)
 -- statistics: [H, PVAL] = regression_ttest (Y, X)
 -- statistics: [H, PVAL, CI] = regression_ttest (Y, X)
 -- statistics: [H, PVAL, CI, STATS] = regression_ttest (Y, X)
 -- statistics: [...] = regression_ttest (Y, X, NAME, VALUE)

     Perform a linear regression t-test.

     ‘H = regression_ttest (Y, X)’ tests the null hypothesis that the
     slope beta1 of a simple linear regression equals 0.  The result is
     H = 0 if the null hypothesis cannot be rejected at the 5%
     significance level, or H = 1 if the null hypothesis can be rejected
     at the 5% level.  Y and X must be vectors of equal length with
     finite real numbers.

     The p-value of the test is returned in PVAL.  A 100(1-alpha)%
     confidence interval for beta1 is returned in CI.  STATS is a
     structure containing the value of the test statistic (tstat), the
     degrees of freedom (df), the slope coefficient (beta1), and the
     intercept (beta0).  Under the null, the test statistic STATS.tstat
     follows a T-distribution with STATS.df degrees of freedom.

     ‘[...] = regression_ttest (..., NAME, VALUE)’ specifies one or more
     of the following name/value pairs:

          Name           Value
     ---------------------------------------------------------------------------
          "alpha"        the significance level.  Default is 0.05.
                         
          "tail"         a string specifying the alternative hypothesis
             "both"             beta1 is not 0 (two-tailed, default)
             "left"             beta1 is less than 0 (left-tailed)
             "right"            beta1 is greater than 0 (right-tailed)

     See also: regression_ftest, regress, regress_gp.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 35
Perform a linear regression t-test.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 5
ridge


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1426
 -- statistics: B = ridge (Y, X, K)
 -- statistics: B = ridge (Y, X, K, SCALED)

     Ridge regression.

     ‘B = ridge (Y, X, K)’ returns the vector of coefficient estimates
     by applying ridge regression from the predictor matrix X to the
     response vector Y.  Each value of B is the coefficient for the
     respective ridge parameter given K.  By default, B is calculated
     after centering and scaling the predictors to have a zero mean and
     standard deviation 1.

     ‘B = ridge (Y, X, K, SCALED)’ performs the regression with the
     specified scaling of the coefficient estimates B.  When SCALED = 0,
     the function restores the coefficients to the scale of the original
     data thus is more useful for making predictions.  When SCALED = 1,
     the coefficient estimates correspond to the scaled centered data.

        • ‘y’ must be an Nx1 numeric vector with the response data.
        • ‘X’ must be an Nxp numeric matrix with the predictor data.
        • ‘k’ must be a numeric vector with the ridge parameters.
        • ‘scaled’ must be a numeric scalar indicating whether the
          coefficient estimates in B are restored to the scale of the
          original data.  By default, SCALED = 1.

     Further information about Ridge regression can be found at
     <https://en.wikipedia.org/wiki/Ridge_regression>

     See also: lasso, stepwisefit, regress.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 17
Ridge regression.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 9
rmmissing


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1519
 -- statistics: R = rmmissing (A)
 -- statistics: R = rmmissing (A, DIM)
 -- statistics: R = rmmissing (..., NAME, VALUE)
 -- statistics: [R TF] = rmmissing (...)

     Remove missing or incomplete data from an array.

     Given an input vector or matrix (2-D array) A, remove missing data
     from a vector or missing rows or columns from a matrix.  A can be a
     numeric array, char array, or an array of cell strings.  R returns
     the array after removal of missing data.

     The values which represent missing data depend on the data type of
     A:

        • NaN: ‘single’, ‘double’.

        • ' ' (white space): ‘char’.

        • {''}: cell string.

     Choose to remove rows (default) or columns by setting optional
     input DIM:

        • 1: rows.

        • 2: columns.

     Note: data types with no default 'missing' value will always result
     in ‘R == A’ and a TF output of ‘false(size(A))’.

     Additional optional parameters are set by NAME-VALUE pairs.  These
     are:

        • MinNumMissing: minimum number of missing values to remove an
          entry, row or column, defined as a positive integer number.
          E.g.: if MinNumMissing is set to ‘2’, remove the row of a
          numeric matrix only if it includes 2 or more NaN.

     Optional return value TF is a logical array where ‘true’ values
     represent removed entries, rows or columns from the original data
     A.

See also: fillmissing, ismissing, standardizeMissing.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 48
Remove missing or incomplete data from an array.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
runstest


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2342
 -- statistics: H = runstest (X)
 -- statistics: H = runstest (X, V)
 -- statistics: H = runstest (X, "ud")
 -- statistics: H = runstest (..., NAME, VALUE)
 -- statistics: [H, PVAL, STATS] = runstest (...)

     Run test for randomness in the vector X.

     ‘H = runstest (X)’ calculates the number of runs of consecutive
     values above or below the mean of X and tests the null hypothesis
     that the values in the data vector X come in random order.  H is 1
     if the test rejects the null hypothesis at the 5% significance
     level, or 0 otherwise.

     ‘H = runstest (X, V)’ tests the null hypothesis based on the number
     of runs of consecutive values above or below the specified
     reference value V.  Values exactly equal to V are omitted.

     ‘H = runstest (X, "ud")’ calculates the number of runs up or down
     and tests the null hypothesis that the values in the vector X are
     in random order.  Too few runs indicate a trend, while too many runs
     indicate an oscillation.  Values exactly equal to the preceding
     value are omitted.

     ‘H = runstest (..., NAME, VALUE)’ specifies additional options to
     the above tests by one or more NAME-VALUE pair arguments.

     Name             Value
     ----------------------------------------------------------------------------
     "alpha"          the significance level.  Default is 0.05.
                      
     "method"         a string specifying the method used to compute the
                      p-value of the test.  It can be either "exact" to use an
                      exact algorithm, or "approximate" to use a normal
                      approximation.  The default is "exact" for runs
                      above/below, and for runs up/down when the length of X
                      is less than or equal to 50.  When testing for runs
                      up/down and the length of X is greater than 50, then the
                      default is "approximate", and the "exact" method is not
                      available.
                      
     "tail"           a string specifying the alternative hypothesis
                    "both"           two-tailed (default)
                    "left"           left-tailed
                    "right"          right-tailed

     See also: signrank, signtest.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 40
Run test for randomness in the vector X.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 11
sampsizepwr


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6301
 -- statistics: N = sampsizepwr (TESTTYPE, PARAMS, P1)
 -- statistics: N = sampsizepwr (TESTTYPE, PARAMS, P1, POWER)
 -- statistics: POWER = sampsizepwr (TESTTYPE, PARAMS, P1, [], N)
 -- statistics: P1 = sampsizepwr (TESTTYPE, PARAMS, [], POWER, N)
 -- statistics: [N1, N2] = sampsizepwr ("t2", PARAMS, P1, POWER)
 -- statistics: [...] = sampsizepwr (TESTTYPE, PARAMS, P1, POWER, N,
          NAME, VALUE)

     Sample size and power calculation for hypothesis test.

     ‘sampsizepwr’ computes the sample size, power, or alternative
     parameter value for a hypothesis test, given the other two values.
     For example, you can compute the sample size required to obtain a
     particular power for a hypothesis test, given the parameter value
     of the alternative hypothesis.

     ‘N = sampsizepwr (TESTTYPE, PARAMS, P1)’ returns the sample size N
     required for a two-sided test of the specified type to have a power
     (probability of rejecting the null hypothesis when the alternative
     is true) of 0.90 when the significance level (probability of
     rejecting the null hypothesis when the null hypothesis is true) is
     0.05.  PARAMS specifies the parameter values under the null
     hypothesis.  P1 specifies the value of the single parameter being
     tested under the alternative hypothesis.  For the two-sample
     t-test, N is the value of the equal sample size for both samples,
     PARAMS specifies the parameter values of the first sample under the
     null and alternative hypotheses, and P1 specifies the value of the
     single parameter from the other sample under the alternative
     hypothesis.

     The following TESTTYPE values are available:

          "z"     one-sample z-test for normally distributed data with known
                  standard deviation.  PARAMS is a two-element vector [MU0
                  SIGMA0] of the mean and standard deviation, respectively,
                  under the null hypothesis.  P1 is the value of the mean
                  under the alternative hypothesis.
          "t"     one-sample t-test or paired t-test for normally distributed
                  data with unknown standard deviation.  PARAMS is a
                  two-element vector [MU0 SIGMA0] of the mean and standard
                  deviation, respectively, under the null hypothesis.  P1 is
                  the value of the mean under the alternative hypothesis.
          "t2"    two-sample pooled t-test (test for equal means) for
                  normally distributed data with equal unknown standard
                  deviations.  PARAMS is a two-element vector [MU0 SIGMA0] of
                  the mean and standard deviation of the first sample under
                  the null and alternative hypotheses.  P1 is the mean of the
                  second sample under the alternative hypothesis.
          "var"   chi-square test of variance for normally distributed data.
                  PARAMS is the variance under the null hypothesis.  P1 is
                  the variance under the alternative hypothesis.
          "p"     test of the P parameter (success probability) for a
                  binomial distribution.  PARAMS is the value of P under the
                  null hypothesis.  P1 is the value of P under the
                  alternative hypothesis.
          "r"     test of the correlation coefficient parameter for
                  significance.  PARAMS is the value of r under the null
                  hypothesis.  P1 is the value of r under the alternative
                  hypothesis.

     The "p" test for the binomial distribution is a discrete test for
     which increasing the sample size does not always increase the
     power.  For N values larger than 200, there may be values smaller
     than the returned N value that also produce the desired power.

     ‘N = sampsizepwr (TESTTYPE, PARAMS, P1, POWER)’ returns the sample
     size N such that the power is POWER for the parameter value P1.
     For the two-sample t-test, N is the equal sample size of both
     samples.

     ‘[N1, N2] = sampsizepwr ("t2", PARAMS, P1, POWER)’ returns the
     sample sizes N1 and N2 for the two samples.  These values are the
     same unless the "ratio" parameter, ‘RATIO = N2 / N1’, is set to a
     value other than the default (see the name/value pair definition of
     ratio below).

     ‘POWER = sampsizepwr (TESTTYPE, PARAMS, P1, [], N)’ returns the
     power achieved for a sample size of N when the true parameter value
     is P1.  For the two-sample t-test, N is the smaller one of the two
     sample sizes.

     ‘P1 = sampsizepwr (TESTTYPE, PARAMS, [], POWER, N)’ returns the
     parameter value detectable with the specified sample size N and
     power POWER.  For the two-sample t-test, N is the smaller one of
     the two sample sizes.  When computing P1 for the "p" test, if no
     alternative can be rejected for a given PARAMS, N and POWER value,
     the function displays a warning message and returns NaN.

     ‘[...] = sampsizepwr (..., N, NAME, VALUE)’ specifies one or more
     of the following NAME / VALUE pairs:

          "alpha"     significance level of the test (default is 0.05)
          "tail"      the type of test which can be:

             "both"         two-sided test for an alternative P1 not equal
                            to PARAMS
                            
             "right"        one-sided test for an alternative P1 larger than
                            PARAMS
                            
             "left"         one-sided test for an alternative P1 smaller
                            than PARAMS

          "ratio"     desired ratio N2 / N1 of the larger sample size N2 to
                      the smaller sample size N1.  Used only for the
                      two-sample t-test.  The value of ‘RATIO’ is greater than
                      or equal to 1 (default is 1).

     ‘sampsizepwr’ computes the sample size, power, or alternative
     hypothesis value given values for the other two.  Specify one of
     these as [] to compute it.  The remaining parameters (and ALPHA,
     RATIO) can be scalars or arrays of the same size.

     See also: vartest, ttest, ttest2, ztest, binocdf.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 54
Sample size and power calculation for hypothesis test.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 9
sigma_pts


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1110
 -- statistics: PTS = sigma_pts (N)
 -- statistics: PTS = sigma_pts (N, M)
 -- statistics: PTS = sigma_pts (N, M, K)
 -- statistics: PTS = sigma_pts (N, M, K, L)

     Calculates 2*N+1 sigma points in N dimensions.

     Sigma points are used in the unscented transform to estimate the
     result of applying a given nonlinear transformation to a
     probability distribution that is characterized only in terms of a
     finite set of statistics.

     If only the dimension N is given the resulting points have zero
     mean and identity covariance matrix.  If the mean M or the
     covariance matrix K are given, then the resulting points will have
     those statistics.  The factor L scales the points away from the
     mean.  It is useful to tune the accuracy of the unscented
     transform.

     There is no unique way of computing sigma points; this function
     implements the algorithm described in section 2.6 "The New Filter"
     pages 40-41 of

     Uhlmann, Jeffrey (1995).  "Dynamic Map Building and Localization:
     New Theoretical Foundations".  Ph.D. thesis.  University of Oxford.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 46
Calculates 2*N+1 sigma points in N dimensions.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
signrank


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 4334
 -- statistics: PVAL = signrank (X)
 -- statistics: PVAL = signrank (X, MY)
 -- statistics: PVAL = signrank (X, MY, NAME, VALUE)
 -- statistics: [PVAL, H] = signrank (...)
 -- statistics: [PVAL, H, STATS] = signrank (...)

     Wilcoxon signed rank test for median.

     ‘PVAL = signrank (X)’ returns the p-value of a two-sided Wilcoxon
     signed rank test.  It tests the null hypothesis that data in X come
     from a distribution with zero median at the 5% significance level
     under the assumption that the distribution is symmetric about its
     median.  X must be a vector.

     If the second argument MY is a scalar, the null hypothesis is that
     X has median MY, whereas if MY is a vector, the null hypothesis is
     that the distribution of ‘X - MY’ has zero median.

     ‘PVAL = signrank (..., NAME, VALUE)’ performs the Wilcoxon signed
     rank test with additional options specified by one or more of the
     following NAME, VALUE pair arguments:

     NAME            VALUE
                     
     ---------------------------------------------------------------------------
     "alpha"         A scalar value for the significance level of the test.
                     Default is 0.05.
                     
     "tail"          A character vector specifying the alternative
                     hypothesis.  It can take one of the following values:

          VALUE          DESCRIPTION
                         
     ---------------------------------------------------------------------------
          "both"         For one-sample test (MY is empty or a scalar), the
                         data in X come from a continuous distribution with
                         median different than zero or MY.  For two-sample
                         test (MY is a vector), the data in X - MY come from
                         a continuous distribution with median different than
                         zero.
                         
          "left"         For one-sample test (MY is empty or a scalar), the
                         data in X come from a continuous distribution with
                         median less than zero or MY.  For two-sample test
                         (MY is a vector), the data in X - MY come from a
                         continuous distribution with median less than zero.
                         
          "right"        For one-sample test (MY is empty or a scalar), the
                         data in X come from a continuous distribution with
                         median greater than zero or MY.  For two-sample test
                         (MY is a vector), the data in X - MY come from a
                         continuous distribution with median greater than
                         zero.

     NAME            VALUE
                     
     ---------------------------------------------------------------------------
     "method"        A character vector specifying the method for computing
                     the p-value.  It can take one of the following values:

          VALUE          DESCRIPTION
                         
     ---------------------------------------------------------------------------
          "exact"        Exact computation of the p-value.  It is the default
                         value for 15 or fewer observations when "method" is
                         not specified.
                         
          "approximate"  Using normal approximation for computing the
                         p-value.  It is the default value for more than 15
                         observations when "method" is not specified.

     ‘[PVAL, H] = signrank (...)’ also returns a logical value
     indicating the test decision.  If H is 0, the null hypothesis is
     retained, whereas if H is 1, the null hypothesis is rejected.

     ‘[PVAL, H, STATS] = signrank (...)’ also returns the structure
     STATS containing the following fields:

     FIELD           VALUE
     ---------------------------------------------------------------------------
     signedrank      Value of the sign rank test statistic.
                     
     zval            Value of the z-statistic (only computed when the
                     "method" is "approximate").

     See also: tiedrank, signtest, runstest.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 37
Wilcoxon signed rank test for median.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
signtest


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 4222
 -- statistics: PVAL = signtest (X)
 -- statistics: PVAL = signtest (X, MY)
 -- statistics: PVAL = signtest (X, MY, NAME, VALUE)
 -- statistics: [PVAL, H] = signtest (...)
 -- statistics: [PVAL, H, STATS] = signtest (...)

     Sign test for a median.

     ‘PVAL = signtest (X)’ returns the p-value of a two-sided sign test.
     It tests the null hypothesis that data in X come from a
     distribution with zero median at the 5% significance level.  X must
     be a vector.

     If the second argument MY is a scalar, the null hypothesis is that
     X has median MY, whereas if MY is a vector, the null hypothesis is
     that the distribution of ‘X - MY’ has zero median.

     ‘PVAL = signtest (..., NAME, VALUE)’ performs the sign test for the
     median, with additional options specified by one or more of the
     following NAME, VALUE pair arguments:

     NAME            VALUE
                     
     ---------------------------------------------------------------------------
     "alpha"         A scalar value for the significance level of the test.
                     Default is 0.05.
                     
     "tail"          A character vector specifying the alternative
                     hypothesis.  It can take one of the following values:

          VALUE          DESCRIPTION
                         
     ---------------------------------------------------------------------------
          "both"         For one-sample test (MY is empty or a scalar), the
                         data in X come from a continuous distribution with
                         median different than zero or MY.  For two-sample
                         test (MY is a vector), the data in X - MY come from
                         a continuous distribution with median different than
                         zero.
                         
          "left"         For one-sample test (MY is empty or a scalar), the
                         data in X come from a continuous distribution with
                         median less than zero or MY.  For two-sample test
                         (MY is a vector), the data in X - MY come from a
                         continuous distribution with median less than zero.
                         
          "right"        For one-sample test (MY is empty or a scalar), the
                         data in X come from a continuous distribution with
                         median greater than zero or MY.  For two-sample test
                         (MY is a vector), the data in X - MY come from a
                         continuous distribution with median greater than
                         zero.

     NAME            VALUE
                     
     ---------------------------------------------------------------------------
     "method"        A character vector specifying the method for computing
                     the p-value.  It can take one of the following values:

          VALUE          DESCRIPTION
                         
     ---------------------------------------------------------------------------
          "exact"        Exact computation of the p-value.  It is the default
                         value for fewer than 100 observations when "method"
                         is not specified.
                         
          "approximate"  Using normal approximation for computing the
                         p-value.  It is the default value for 100 or more
                         observations when "method" is not specified.

     ‘[PVAL, H] = signtest (...)’ also returns a logical value
     indicating the test decision.  If H is 0, the null hypothesis is
     retained, whereas if H is 1, the null hypothesis is rejected.

     ‘[PVAL, H, STATS] = signtest (...)’ also returns the structure
     STATS containing the following fields:

     FIELD           VALUE
     ---------------------------------------------------------------------------
     sign            Value of the sign test statistic.
                     
     zval            Value of the z-statistic (only computed when the
                     "method" is "approximate").

     See also: signrank, tiedrank, runstest.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 23
Sign test for a median.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 10
silhouette


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2016
 -- statistics: silhouette (X, CLUST)
 -- statistics: [SI, H] = silhouette (X, CLUST)
 -- statistics: [SI, H] = silhouette (..., METRIC, METRICARG)

     Compute the silhouette values of clustered data and show them on a
     plot.

     X is an n-by-p matrix of n data points in p-dimensional space.
     Each data point is assigned to a cluster using CLUST, a vector of n
     elements, one cluster assignment for each data point.

     Each silhouette value of SI, a vector of size n, is a measure of
     the likelihood that a data point is accurately classified to the
     right cluster.  Defining "a" as the mean distance between a point
     and the other points from its cluster, and "b" as the mean distance
     between that point and the points from other clusters, the
     silhouette value of the i-th point is:

              bi - ai
     Si =  ------------
            max(ai,bi)

     Each element of SI ranges from -1, minimum likelihood of a correct
     classification, to 1, maximum likelihood.

     Optional input value METRIC is the metric used to compute the
     distances between data points.  Since ‘silhouette’ uses ‘pdist’ to
     compute these distances, METRIC is similar to the DISTANCE input
     argument of ‘pdist’ and it can be:
        • A known distance metric defined as a string: euclidean,
          squaredeuclidean (default), seuclidean, mahalanobis,
          cityblock, minkowski, chebychev, cosine, correlation, hamming,
          jaccard, or spearman.

        • A vector as those created by ‘pdist’.  In this case X does
          nothing.

        • A function handle that is passed to ‘pdist’ with METRICARG as
          optional inputs.

     Optional return value H is a handle to the silhouette plot.

     *Reference* Peter J. Rousseeuw, Silhouettes: a Graphical Aid to the
     Interpretation and Validation of Cluster Analysis.  1987.
     doi:10.1016/0377-0427(87)90125-7

See also: dendrogram, evalclusters, kmeans, linkage, pdist.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 72
Compute the silhouette values of clustered data and show them on a plot.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 11
slicesample


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2111
 -- statistics: [SMPL, NEVAL] = slicesample (START, NSAMPLES, PROPERTY,
          VALUE, ...)

     Draws NSAMPLES samples from a target stationary distribution PDF
     using slice sampling of Radford M. Neal.

     Input:
        • START is a 1 by DIM vector of the starting point of the Markov
          chain.  Each column corresponds to a different dimension.

        • NSAMPLES is the number of samples, the length of the Markov
          chain.

     Next, several property-value pairs can or must be specified; they
     are:

     (Required properties) One of:

        • "PDF": the value is a function handle of the target stationary
          distribution to be sampled.  The function should accept
          different locations in each row and each column corresponds to
          a different dimension.

          or

        • LOGPDF: the value is a function handle of the log of the
          target stationary distribution to be sampled.  The function
          should accept different locations in each row and each column
          corresponds to a different dimension.

     The following input property/pair values may be needed depending on
     the desired output:

        • "burnin" BURNIN the number of points to discard at the
          beginning, the default is 0.

        • "thin" THIN omits M-1 of every M points in the generated
          Markov chain.  The default is 1.

        • "width" WIDTH the maximum Manhattan distance between two
          samples.  The default is 10.

     Outputs:

        • SMPL is a NSAMPLES by DIM matrix of random values drawn from
          PDF where the rows are different random values, the columns
          correspond to the dimensions of PDF.

        • NEVAL is the number of function evaluations per sample.

     Example: Sampling from a normal distribution

          start = 1;
          nsamples = 1e3;
          pdf = @(x) exp (-0.5 * x .^ 2) / (pi ^ .5 * 2 ^ .5);
          [smpl, neval] = slicesample (start, nsamples, "pdf", pdf, "thin", 4);
          histfit (smpl);

     See also: rand, mhsample, randsample.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Draws NSAMPLES samples from a target stationary distribution PDF using
slice ...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 10
squareform


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1139
 -- statistics: Z = squareform (Y)
 -- statistics: Y = squareform (Z)
 -- statistics: Y = squareform (Z, "tovector")
 -- statistics: Z = squareform (Y, "tomatrix")

     Interchange between distance matrix and distance vector formats.

     Converts between a hollow (diagonal filled with zeros), square, and
     symmetric matrix and a vector of the lower triangular part.

     Its target application is the conversion of the vector returned by
     ‘pdist’ into a distance matrix.  It performs the opposite operation
     if input is a matrix.

     If X is a vector, its number of elements must fit into the
     triangular part of a matrix (main diagonal excluded).  In other
     words, ‘numel (X) = N * (N - 1) / 2’ for some integer N.  The
     resulting matrix will be N by N.

     If X is a distance matrix, it must be square and the diagonal
     entries of X must all be zeros.  ‘squareform’ will generate a
     warning if X is not symmetric.

     The second argument is used to specify the output type in case
     there is a single element.  It will default to "tomatrix"
     otherwise.

     See also: pdist.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 64
Interchange between distance matrix and distance vector formats.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 18
standardizeMissing


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1229
 -- statistics: B = standardizeMissing (A, INDICATOR)

     Replace data values specified by INDICATOR in A by the standard
     'missing' data value for that data type.

     A can be a numeric scalar or array, a character vector or array, or
     a cell array of character vectors (a.k.a.  string cells).

     INDICATOR can be a scalar or an array containing values to be
     replaced by the 'missing' value for the class of A, and should have
     a data type matching A.

     'missing' values are defined as :

        • NaN: ‘single’, ‘double’

        • " " (white space): ‘char’

        • {""} (empty string in cell): string cells.

     Compatibility Notes:
        • Octave's implementation of ‘standardizeMissing’ does not
          restrict INDICATOR of type char to row vectors.

        • All numerical and logical inputs for A and INDICATOR may be
          specified in any combination.  The output will be the same
          class as A, with the INDICATOR converted to that data type for
          comparison.  Only ‘single’ and ‘double’ have defined 'missing'
          values, so A of other data types will always output B = A.

See also: fillmissing, ismissing, rmmissing.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Replace data values specified by INDICATOR in A by the standard
'missing' dat...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 11
stepwisefit


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1550
 -- statistics: [X_USE, B, BINT, R, RINT, STATS] = stepwisefit (Y, X,
          PENTER = 0.05, PREMOVE = 0.1, METHOD = "corr")

     Linear regression with stepwise variable selection.

     Arguments
     ---------

        • Y is an N by 1 vector of data to fit.
        • X is an N by K matrix containing the values of K potential
          predictors.  No constant term should be included (one will
          always be added to the regression automatically).
        • PENTER is the maximum p-value to enter a new variable into the
          regression (default: 0.05).
        • PREMOVE is the minimum p-value to remove a variable from the
          regression (default: 0.1).
        • METHOD sets how predictors are selected at each step, either
          based on their correlation with the residuals ("corr",
          default) or on the p values of their regression coefficients
          when they are successively added ("p").

     Return values
     -------------

        • X_USE contains the indices of the predictors included in the
          final regression model.  The predictors are listed in the
          order they were added, so typically the first ones listed are
          the most significant.
        • B, BINT, R, RINT, STATS are the results of ‘[b, bint, r, rint,
          stats] = regress(y, [ones(size(y)) X(:, X_use)], penter);’

     References
     ----------

       1. N. R. Draper and H. Smith (1966).  ‘Applied Regression
          Analysis’.  Wiley.  Chapter 6.

     See also: regress.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 51
Linear regression with stepwise variable selection.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
tabulate


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 833
 -- statistics: tabulate (X)
 -- statistics: TABLE = tabulate (X)

     Calculate a frequency table.

     ‘tabulate (X)’ displays a frequency table of the data in the vector
     X.  For each unique value in X, the tabulate function shows the
     number of instances and percentage of that value in X.

     ‘TABLE = tabulate (X)’ returns the frequency table, TABLE, as a
     numeric matrix when X is numeric and as a cell array otherwise.
     When an output argument is requested, ‘tabulate’ does not print the
     frequency table in the command window.

     If X is numeric, any missing values (NaNs) are ignored.

     If all the elements of X are positive integers, then the frequency
     table includes 0 counts for the integers between 1 and max (X) that
     do not appear in X.

     See also: bar, pareto.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 28
Calculate a frequency table.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
tiedrank


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1051
 -- statistics: [R, TIEADJ] = tiedrank (X)
 -- statistics: [R, TIEADJ] = tiedrank (X, TIEFLAG)
 -- statistics: [R, TIEADJ] = tiedrank (X, TIEFLAG, BIDIR)

     Compute rank adjusted for ties.

     ‘[R, TIEADJ] = tiedrank (X)’ computes the ranks of the values in
     vector X.  If any values in X are tied, ‘tiedrank’ computes their
     average rank.  The return value TIEADJ is an adjustment for ties
     required by the nonparametric tests ‘signrank’ and ‘ranksum’, and
     for the computation of Spearman's rank correlation.

     ‘[R, TIEADJ] = tiedrank (X, 1)’ computes the ranks of the values in
     the vector X.  TIEADJ is a vector of three adjustments for ties
     required in the computation of Kendall's tau.  ‘tiedrank (X, 0)’ is
     the same as ‘tiedrank (X)’.

     ‘[R, TIEADJ] = tiedrank (X, 0, 1)’ computes the ranks from each
     end, so that the smallest and largest values get rank 1, the next
     smallest and largest get rank 2, etc.  These ranks are used in the
     Ansari-Bradley test.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 31
Compute rank adjusted for ties.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
trimmean


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2977
 -- statistics: M = trimmean (X, P)
 -- statistics: M = trimmean (X, P, FLAG)
 -- statistics: M = trimmean (..., "all")
 -- statistics: M = trimmean (..., DIM)
 -- statistics: M = trimmean (..., VECDIM)

     Compute the trimmed mean.

     The trimmed mean of X is defined as the mean of X excluding the k
     highest and k lowest data values of X, where k is given by
     n * (P / 100) / 2 and n is the sample size.

     ‘M = trimmean (X, P)’ returns the mean of X after removing the
     outliers in X defined by P percent.
        • If X is a vector, then ‘trimmean (X, P)’ is the mean of all
          the values of X, computed after removing the outliers.
        • If X is a matrix, then ‘trimmean (X, P)’ is a row vector of
          column means, computed after removing the outliers.
        • If X is a multidimensional array, then ‘trimmean’ operates
          along the first nonsingleton dimension of X.

     To specify the operating dimension(s) when X is a matrix or a
     multidimensional array, use the DIM or VECDIM input argument.

     ‘trimmean’ treats NaN values in X as missing values and removes
     them.

     ‘M = trimmean (X, P, FLAG)’ specifies how to trim when k, i.e.
     half the number of outliers, is not an integer.  FLAG can be
     specified as one of the following values:
     Value               Description
     ---------------------------------------------------------------------------
     "round"             Round k to the nearest integer.  This is the
                         default.
     "floor"             Round k down to the next smaller integer.
     "weighted"          If k = i + f, where i is an integer and f is a
                         fraction, compute a weighted mean with weight (1 -
                         f) for the (i + 1)-th and (n - i)-th values, and
                         full weight for the values between them.

     ‘M = trimmean (..., "all")’ returns the trimmed mean of all the
     values in X using any of the input argument combinations in the
     previous syntaxes.

     ‘M = trimmean (..., DIM)’ returns the trimmed mean along the
     operating dimension DIM specified as a positive integer scalar.  If
     not specified, then the default value is the first nonsingleton
     dimension of X, i.e.  whose size does not equal 1.  If DIM is
     greater than ndims (X) or if size (X, DIM) is 1, then ‘trimmean’
     returns X.

     ‘M = trimmean (..., VECDIM)’ returns the trimmed mean over the
     dimensions specified in vector VECDIM.  For example, if X is a
     2-by-3-by-4 array, then ‘trimmean (X, P, [1 2])’ returns a
     1-by-1-by-4 array.  Each output element is the trimmed mean of the
     elements on the corresponding page of X.  If VECDIM indexes all
     dimensions of X, it is equivalent to ‘trimmean (X, P, "all")’.  Any
     dimension in VECDIM greater than ‘ndims (X)’ is ignored.

     See also: mean.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 25
Compute the trimmed mean.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 5
ttest


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2498
 -- statistics: [H, PVAL, CI, STATS] = ttest (X)
 -- statistics: [H, PVAL, CI, STATS] = ttest (X, M)
 -- statistics: [H, PVAL, CI, STATS] = ttest (X, Y)
 -- statistics: [H, PVAL, CI, STATS] = ttest (X, M, NAME, VALUE)
 -- statistics: [H, PVAL, CI, STATS] = ttest (X, Y, NAME, VALUE)

     Test for mean of a normal sample with unknown variance.

     Perform a t-test of the null hypothesis ‘mean (X) == M’ for a
     sample X from a normal distribution with unknown mean and unknown
     standard deviation.  Under the null, the test statistic T has a
     Student's t distribution.  The default value of M is 0.

     If the second argument Y is a vector, a paired t-test of the
     hypothesis ‘mean (X) == mean (Y)’ is performed.  If X and Y are
     vectors, they must have the same size and dimensions.

     X (and Y) can also be matrices.  For matrices, ttest performs
     separate t-tests along each column, and returns a vector of
     results.  X and Y must have the same number of columns.  The Type I
     error rate of the resulting vector of PVAL can be controlled by
     entering PVAL as input to the function multcompare.

     ttest treats NaNs as missing values, and ignores them.

     Name-Value pair arguments can be used to set various options.
     "alpha" can be used to specify the significance level of the test
     (the default value is 0.05).  "tail" can be used to select the
     desired alternative hypothesis.  If the value is "both" (default)
     the null is tested against the two-sided alternative ‘mean (X) !=
     M’.  If it is "right" the one-sided alternative ‘mean (X) > M’ is
     considered.  Similarly for "left", the one-sided alternative ‘mean
     (X) < M’ is considered.  When argument X is a matrix, "dim" can be
     used to select the dimension over which to perform the test.  (The
     default is the first non-singleton dimension.)

     If H is 1, the null hypothesis is rejected at the specified
     significance level in favor of the selected alternative.  If H is
     0, the null hypothesis cannot be rejected at that level; this does
     not mean that the null hypothesis is true.  The p-value of the test
     is returned in PVAL.  A 100(1-alpha)% confidence interval is
     returned in CI.

     STATS is a structure containing the value of the test statistic
     (TSTAT), the degrees of freedom (DF) and the sample's standard
     deviation (SD).

     See also: hotelling_t2test, ttest2, hotelling_t2test2.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 55
Test for mean of a normal sample with unknown variance.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
ttest2


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2041
 -- statistics: [H, PVAL, CI, STATS] = ttest2 (X, Y)
 -- statistics: [H, PVAL, CI, STATS] = ttest2 (X, Y, NAME, VALUE)

     Perform a t-test to compare the means of two groups of data under
     the null hypothesis that the groups are drawn from distributions
     with the same mean.

     X and Y can be vectors or matrices.  For matrices, ttest2 performs
     separate t-tests along each column, and returns a vector of
     results.  X and Y must have the same number of columns.  The Type I
     error rate of the resulting vector of PVAL can be controlled by
     entering PVAL as input to the function multcompare.

     ttest2 treats NaNs as missing values, and ignores them.

     For a nested t-test, use anova2.

     The argument "alpha" can be used to specify the significance level
     of the test (the default value is 0.05).  The string argument
     "tail" can be used to select the desired alternative hypothesis.
     If "tail" is "both" (default) the null is tested against the
     two-sided alternative ‘mean (X) != mean (Y)’.  If "tail" is "right"
     the one-sided alternative ‘mean (X) > mean (Y)’ is considered.
     Similarly for "left", the one-sided alternative ‘mean (X) < mean
     (Y)’ is considered.

     When "vartype" is "equal" the variances are assumed to be equal
     (this is the default).  When "vartype" is "unequal" the variances
     are not assumed equal.

     When arguments X and Y are matrices the "dim" argument can be used
     to select the dimension over which to perform the test.  (The
     default is the first non-singleton dimension.)

     If H is 0 the null hypothesis is not rejected; if it is 1 the null
     hypothesis is rejected.  The p-value of the test is returned in
     PVAL.  A 100(1-alpha)% confidence interval is returned in CI.
     STATS is a structure containing the value of the test statistic
     (TSTAT), the degrees of freedom (DF) and the sample standard
     deviation (SD).

     See also: hotelling_t2test, anova1, hotelling_t2test2, ttest.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 80
Perform a t-test to compare the means of two groups of data under the
null hy...



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
vartest


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2306
 -- statistics: H = vartest (X, V)
 -- statistics: H = vartest (X, V, NAME, VALUE)
 -- statistics: [H, PVAL] = vartest (...)
 -- statistics: [H, PVAL, CI] = vartest (...)
 -- statistics: [H, PVAL, CI, STATS] = vartest (...)

     One-sample test of variance.

     ‘H = vartest (X, V)’ performs a chi-square test of the hypothesis
     that the data in the vector X come from a normal distribution with
     variance V, against the alternative that X comes from a normal
     distribution with a different variance.  The result is H = 0 if the
     null hypothesis ("variance is V") cannot be rejected at the 5%
     significance level, or H = 1 if the null hypothesis can be rejected
     at the 5% level.

     X may also be a matrix or an N-D array.  For matrices, ‘vartest’
     performs separate tests along each column of X, and returns a
     vector of results.  For N-D arrays, ‘vartest’ works along the first
     non-singleton dimension of X.  V must be a scalar.

     ‘vartest’ treats NaNs as missing values, and ignores them.

     ‘[H, PVAL] = vartest (...)’ returns the p-value.  That is the
     probability of observing the given result, or one more extreme, by
     chance if the null hypothesis is true.

     ‘[H, PVAL, CI] = vartest (...)’ returns a 100 * (1 - ALPHA)%
     confidence interval for the true variance.

     ‘[H, PVAL, CI, STATS] = vartest (...)’ returns a structure with the
     following fields:

          chisqstat      the value of the test statistic
          df             the degrees of freedom of the test

     ‘[...] = vartest (..., NAME, VALUE, ...)’ specifies one or more of
     the following name/value pairs:

          Name           Value
     ---------------------------------------------------------------------------
          "alpha"        the significance level.  Default is 0.05.
                         
          "dim"          dimension to work along a matrix or an N-D array.
                         
          "tail"         a string specifying the alternative hypothesis
             "both"      variance is not V (two-tailed, default)
             "left"      variance is less than V (left-tailed)
             "right"     variance is greater than V (right-tailed)

     See also: ttest, ztest, kstest.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 28
One-sample test of variance.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
vartest2


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2521
 -- statistics: H = vartest2 (X, Y)
 -- statistics: H = vartest2 (X, Y, NAME, VALUE)
 -- statistics: [H, PVAL] = vartest2 (...)
 -- statistics: [H, PVAL, CI] = vartest2 (...)
 -- statistics: [H, PVAL, CI, STATS] = vartest2 (...)

     Two-sample F test for equal variances.

     ‘H = vartest2 (X, Y)’ performs an F test of the hypothesis that the
     independent data in vectors X and Y come from normal distributions
     with equal variance, against the alternative that they come from
     normal distributions with different variances.  The result is H = 0
     if the null hypothesis ("variances are equal") cannot be rejected
     at the 5% significance level, or H = 1 if the null hypothesis can
     be rejected at the 5% level.

     X and Y may also be matrices or N-D arrays.  For matrices,
     ‘vartest2’ performs separate tests along each column and returns a
     vector of results.  For N-D arrays, ‘vartest2’ works along the
     first non-singleton dimension and X and Y must have the same size
     along all the remaining dimensions.

     ‘vartest2’ treats NaNs as missing values, and ignores them.

     ‘[H, PVAL] = vartest2 (...)’ returns the p-value.  That is the
     probability of observing the given result, or one more extreme, by
     chance if the null hypothesis is true.

     ‘[H, PVAL, CI] = vartest2 (...)’ returns a 100 * (1 - ALPHA)%
     confidence interval for the true ratio var(X)/var(Y).

     ‘[H, PVAL, CI, STATS] = vartest2 (...)’ returns a structure with
     the following fields:

          fstat          the value of the test statistic
          df1            the numerator degrees of freedom of the test
          df2            the denominator degrees of freedom of the test

     ‘[...] = vartest2 (..., NAME, VALUE, ...)’ specifies one or more of
     the following name/value pairs:

          Name           Value
     ---------------------------------------------------------------------------
          "alpha"        the significance level.  Default is 0.05.
                         
          "dim"          dimension to work along a matrix or an N-D array.
                         
          "tail"         a string specifying the alternative hypothesis
             "both"      variances are not equal (two-tailed, default)
             "left"      var(X) is less than var(Y) (left-tailed)
             "right"     var(X) is greater than var(Y) (right-tailed)

     See also: ttest2, kstest2, bartlett_test, levene_test.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 38
Two-sample F test for equal variances.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 8
vartestn


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 3291
 -- statistics: vartestn (X)
 -- statistics: vartestn (X, GROUP)
 -- statistics: vartestn (..., NAME, VALUE)
 -- statistics: P = vartestn (...)
 -- statistics: [P, STATS] = vartestn (...)
 -- statistics: [P, STATS] = vartestn (..., NAME, VALUE)

     Test for equal variances across multiple groups.

     ‘vartestn (X)’ performs Bartlett's test for equal variances for
     the columns of the matrix X.  This is a test of the null hypothesis
     that the columns of X come from normal distributions with the same
     variance, against the alternative that they come from normal
     distributions with different variances.  The result is displayed in
     a summary table of statistics as well as a box plot of the groups.

     ‘vartestn (X, GROUP)’ requires a vector X, and a GROUP argument
     that is a categorical variable, vector, string array, or cell array
     of strings with one row for each element of X.  Values of X
     corresponding to the same value of GROUP are placed in the same
     group.

     ‘vartestn’ treats NaNs as missing values, and ignores them.

     ‘P = vartestn (...)’ returns the probability of observing the given
     result, or one more extreme, by chance under the null hypothesis
     that all groups have equal variances.  Small values of P cast doubt
     on the validity of the null hypothesis.

     ‘[P, STATS] = vartestn (...)’ returns a structure with the
     following fields:

          chistat        - the value of the test statistic
          df             - the degrees of freedom of the test

     ‘[P, STATS] = vartestn (..., NAME, VALUE)’ specifies one or more of
     the following NAME/VALUE pairs:

     "display"      "on" to display a boxplot and table, or "off" to omit
                    these displays.  Default "on".
                    
     "testtype"     One of the following strings to control the type of test
                    to perform

        "Bartlett"         Bartlett's test (default).
                           
        "LeveneQuadratic"  Levene's test computed by performing anova on the
                           squared deviations of the data values from their
                           group means.
                           
        "LeveneAbsolute"   Levene's test computed by performing anova on the
                           absolute deviations of the data values from their
                           group means.
                           
        "BrownForsythe"    Brown-Forsythe test computed by performing anova
                           on the absolute deviations of the data values from
                           the group medians.
                           
        "OBrien"           O'Brien's modification of Levene's test with
                           W=0.5.

     The classical Bartlett's test is sensitive to the assumption that
     the distribution in each group is normal.  The other test types are
     more robust to non-normal distributions, especially ones prone to
     outliers.  For these tests, the STATS output structure has a field
     named fstat containing the test statistic, and df1 and df2
     containing its numerator and denominator degrees of freedom.

     See also: vartest, vartest2, anova1, bartlett_test, levene_test.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 48
Test for equal variances across multiple groups.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
violin


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2681
 -- statistics: violin (X)
 -- statistics: H = violin (X)
 -- statistics: H = violin (..., PROPERTY, VALUE, ...)
 -- statistics: H = violin (HAX, ...)
 -- statistics: H = violin (..., "horizontal")

     Produce a Violin plot of the data X.

     The input data X can be an N-by-m array containing N observations
     of m variables.  It can also be a cell with m elements, for the
     case in which the variables are not uniformly sampled.

     The following PROPERTY can be set using PROPERTY/VALUE pairs
     (default values in parentheses).  The value of the property can be
     a scalar indicating that it applies to all the variables in the
     data.  It can also be a cell/array, indicating the property for
     each variable.  In this case it should have m columns (as many as
     variables).

     Color
          ("y") Indicates the filling color of the violins.

     Nbins
          (50) Internally, the function calls ‘hist’ to compute the
          histogram of the data.  This property indicates how many bins
          to use.  See ‘help hist’ for more details.

     SmoothFactor
          (4) The function performs simple kernel density estimation and
          automatically finds the bandwidth of the kernel function that
          best approximates the histogram using optimization (‘sqp’).
          The result is in general very noisy.  To smooth the result the
          bandwidth is multiplied by the value of this property.  The
          higher the value the smoother the violins, but values too high
          might remove features from the data distribution.

     Bandwidth
          (NA) If this property is given a value other than NA, it sets
          the bandwidth of the kernel function.  No optimization is
          performed and the property SmoothFactor is ignored.

     Width
          (0.5) Sets the maximum width of the violins.  Violins are
          centered at integer axis values.  The distance between two
          violin middle axes is 1.  Setting a value higher than 1 in
          this property will cause the violins to overlap.

     If the string "Horizontal" is among the input arguments, the violin
     plot is rendered along the x axis with the variables in the y axis.

     The returned structure H has handles to the plot elements, allowing
     customization of the visualization using set/get functions.

     Example:

          title ("Grade 3 heights");
          axis ([0,3]);
          set (gca, "xtick", 1:2, "xticklabel", {"girls"; "boys"});
          h = violin ({randn(100,1)*5+140, randn(130,1)*8+135}, "Nbins", 10);
          set (h.violin, "linewidth", 2)

     See also: boxplot, hist.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 36
Produce a Violin plot of the data X.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 7
wblplot


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1753
 -- statistics: wblplot (DATA, ...)
 -- statistics: HANDLE = wblplot (DATA, ...)
 -- statistics: [HANDLE, PARAM] = wblplot (DATA)
 -- statistics: [HANDLE, PARAM] = wblplot (DATA, CENSOR)
 -- statistics: [HANDLE, PARAM] = wblplot (DATA, CENSOR, FREQ)
 -- statistics: [HANDLE, PARAM] = wblplot (DATA, CENSOR, FREQ, CONFINT)
 -- statistics: [HANDLE, PARAM] = wblplot (DATA, CENSOR, FREQ, CONFINT,
          FANCYGRID)
 -- statistics: [HANDLE, PARAM] = wblplot (DATA, CENSOR, FREQ, CONFINT,
          FANCYGRID, SHOWLEGEND)

     Plot a column vector DATA on a Weibull probability plot using rank
     regression.

     CENSOR: optional parameter is a column vector of same size as DATA
     with 1 for right censored data and 0 for exact observation.  Pass
     [] when no censor data are available.

     FREQ: optional vector same size as DATA with the number of
     occurrences for corresponding data.  Pass [] when no frequency data
     are available.

     CONFINT: optional confidence limits for plotting upper and lower
     confidence bands using beta binomial confidence bounds.  If a
     single value is given this will be used such that LOW = a and HIGH
     = 1 - a.  Pass [] if confidence bounds are not requested.

     FANCYGRID: optional parameter which if set to anything but 1 will
     turn off the fancy gridlines.

     SHOWLEGEND: optional parameter that when set to zero(0) turns off
     the legend.

     If one output argument is given, a HANDLE for the data marker and
     plotlines is returned, which can be used for further modification
     of line and marker style.

     If a second output argument is specified, a PARAM vector with
     scale, shape and correlation factor is returned.

     See also: normplot, wblpdf.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 78
Plot a column vector DATA on a Weibull probability plot using rank
regression.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 4
x2fx


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2722
 -- statistics: [D, MODEL, TERMSTART, TERMEND] = x2fx (X)
 -- statistics: [D, MODEL, TERMSTART, TERMEND] = x2fx (X, MODEL)
 -- statistics: [D, MODEL, TERMSTART, TERMEND] = x2fx (X, MODEL, CATEG)
 -- statistics: [D, MODEL, TERMSTART, TERMEND] = x2fx (X, MODEL, CATEG,
          CATLEVELS)

     Convert predictors to design matrix.

     ‘D = x2fx (X, MODEL)’ converts a matrix of predictors X to a design
     matrix D for regression analysis.  Distinct predictor variables
     should appear in different columns of X.

     The optional input MODEL controls the regression model.  By
     default, ‘x2fx’ returns the design matrix for a linear additive
     model with a constant term.  MODEL can be any one of the following
     strings:

          "linear"       Constant and linear terms (the default)
          "interaction"  Constant, linear, and interaction terms
          "quadratic"    Constant, linear, interaction, and squared terms
          "purequadratic"Constant, linear, and squared terms

     If X has n columns, the order of the columns of D for a full
     quadratic model is:

        • The constant term.
        • The linear terms (the columns of X, in order 1,2,...,n).
        • The interaction terms (pairwise products of columns of X, in
          order (1,2), (1,3), ..., (1,n), (2,3), ..., (n-1,n).
        • The squared terms (in the order 1,2,...,n).

     Other models use a subset of these terms, in the same order.

     Alternatively, MODEL can be a matrix specifying polynomial terms of
     arbitrary order.  In this case, MODEL should have one column for
     each column in X and one r for each term in the model.  The entries
     in any r of MODEL are powers for the corresponding columns of X.
     For example, if X has columns X1, X2, and X3, then a row [0 1 2] in
     MODEL would specify the term (X1.^0).*(X2.^1).*(X3.^2).  A row of
     all zeros in MODEL specifies a constant term, which you can omit.

     ‘D = x2fx (X, MODEL, CATEG)’ treats columns with numbers listed in
     the vector CATEG as categorical variables.  Terms involving
     categorical variables produce dummy variable columns in D.  Dummy
     variables are computed under the assumption that possible
     categorical levels are completely enumerated by the unique values
     that appear in the corresponding column of X.

     ‘D = x2fx (X, MODEL, CATEG, CATLEVELS)’ accepts a vector CATLEVELS
     the same length as CATEG, specifying the number of levels in each
     categorical variable.  In this case, values in the corresponding
     column of X must be integers in the range from 1 to the specified
     number of levels.  Not all of the levels need to appear in X.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 36
Convert predictors to design matrix.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 5
ztest


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 2152
 -- statistics: H = ztest (X, M, SIGMA)
 -- statistics: H = ztest (X, M, SIGMA, NAME, VALUE)
 -- statistics: [H, PVAL] = ztest (...)
 -- statistics: [H, PVAL, CI] = ztest (...)
 -- statistics: [H, PVAL, CI, ZVALUE] = ztest (...)

     One-sample Z-test.

     ‘H = ztest (X, V)’ performs a Z-test of the hypothesis that the
     data in the vector X come from a normal distribution with mean M,
     against the alternative that X comes from a normal distribution
     with a different mean M.  The result is H = 0 if the null
     hypothesis ("mean is M") cannot be rejected at the 5% significance
     level, or H = 1 if the null hypothesis can be rejected at the 5%
     level.

     X may also be a matrix or an N-D array.  For matrices, ‘ztest’
     performs separate tests along each column of X, and returns a
     vector of results.  For N-D arrays, ‘ztest’ works along the first
     non-singleton dimension of X.  M and SIGMA must be scalars.

     ‘ztest’ treats NaNs as missing values, and ignores them.

     ‘[H, PVAL] = ztest (...)’ returns the p-value.  That is the
     probability of observing the given result, or one more extreme, by
     chance if the null hypothesis true.

     ‘[H, PVAL, CI] = ztest (...)’ returns a 100 * (1 - ALPHA)%
     confidence interval for the true mean.

     ‘[H, PVAL, CI, ZVALUE] = ztest (...)’ returns the value of the test
     statistic.

     ‘[...] = ztest (..., NAME, VALUE, ...)’ specifies one or more of
     the following NAME/VALUE pairs:

          NAME           VALUE
     ---------------------------------------------------------------------------
          "alpha"        the significance level.  Default is 0.05.
                         
          "dim"          dimension to work along a matrix or an N-D array.
                         
          "tail"         a string specifying the alternative hypothesis:
             "both"      "mean is not M" (two-tailed, default)
             "left"      "mean is less than M" (left-tailed)
             "right"     "mean is greater than M" (right-tailed)

     See also: ttest, vartest, signtest, kstest.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 18
One-sample Z-test.



# name: <cell-element>
# type: sq_string
# elements: 1
# length: 6
ztest2


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 1826
 -- statistics: H = ztest2 (X1, N1, X2, N2)
 -- statistics: H = ztest2 (X1, N1, X2, N2, NAME, VALUE)
 -- statistics: [H, PVAL] = ztest2 (...)
 -- statistics: [H, PVAL, ZVALUE] = ztest2 (...)

     Two proportions Z-test.

     If X1 and N1 are the counts of successes and trials in one sample,
     and X2 and N2 those in a second one, test the null hypothesis that
     the success probabilities p1 and p2 are the same.  The result is H
     = 0 if the null hypothesis cannot be rejected at the 5%
     significance level, or H = 1 if the null hypothesis can be rejected
     at the 5% level.

     Under the null, the test statistic ZVALUE approximately follows a
     standard normal distribution.

     The size of H, PVAL, and ZVALUE is the common size of X1, N1, X2,
     and N2, which must be scalars or of common size.  A scalar input
     functions as a constant matrix of the same size as the other
     inputs.

     ‘[H, PVAL] = ztest2 (...)’ returns the p-value.  That is the
     probability of observing the given result, or one more extreme, by
     chance if the null hypothesis true.

     ‘[H, PVAL, ZVALUE] = ztest2 (...)’ returns the value of the test
     statistic.

     ‘[...] = ztest2 (..., NAME, VALUE, ...)’ specifies one or more of
     the following NAME/VALUE pairs:

          NAME           VALUE
     ---------------------------------------------------------------------------
          "alpha"        the significance level.  Default is 0.05.
                         
          "tail"         a string specifying the alternative hypothesis
             "both"             p1 is not p2 (two-tailed, default)
             "left"             p1 is less than p2 (left-tailed)
             "right"            p1 is greater than p2 (right-tailed)

     See also: chi2test, fishertest.


# name: <cell-element>
# type: sq_string
# elements: 1
# length: 23
Two proportions Z-test.





