scikit_posthocs.outliers_gesd
- scikit_posthocs.outliers_gesd(x: List | ndarray, outliers: int = 5, hypo: bool = False, report: bool = False, alpha: float = 0.05) → ndarray
The generalized Extreme Studentized Deviate (ESD) test detects one or more outliers in a univariate data set that follows an approximately normal distribution [1].
- Parameters:
x (Union[List, np.ndarray]) – An array, or any object exposing the array interface, containing the data to test for outliers.
outliers (int = 5) – Upper bound on the number of potential outliers to test for. The test is two-tailed, i.e. both the maximum and the minimum values are checked as potential outliers.
hypo (bool = False) – Specifies whether to return the hypothesis test result as a bool instead of the filtered data. The result is True when the null hypothesis can be rejected and False otherwise. Available options are: 1) True - return the hypothesis test result. 2) False - return the array with the detected outliers removed (default).
report (bool = False) – Specifies whether to print a summary table of the test.
alpha (float = 0.05) – Significance level for a hypothesis test.
- Returns:
Returns the filtered array (the data with the detected outliers removed) if hypo is False, which is the default. If hypo is True, the hypothesis test result is returned instead.
- Return type:
np.ndarray
Notes
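For orientation, the generalized ESD procedure is usually stated as follows. This is the standard textbook formulation rather than text taken from this page, and the symbols n, r, s and p are introduced here for illustration: n is the sample size, r is the upper bound on the number of outliers (the outliers argument), and the mean and sample standard deviation s are recomputed at each step from the observations that remain after the previous removals.

```latex
R_i = \frac{\max_j \lvert x_j - \bar{x} \rvert}{s}, \qquad i = 1, \dots, r

\lambda_i = \frac{(n - i)\, t_{p,\, n-i-1}}
                 {\sqrt{\left(n - i - 1 + t_{p,\, n-i-1}^{2}\right)(n - i + 1)}},
\qquad p = 1 - \frac{\alpha}{2\,(n - i + 1)}
```

Here t_{p, ν} is the 100p-th percentile of the t distribution with ν degrees of freedom. After computing R_i, the observation that maximizes |x_j − x̄| is removed before the next step. The number of outliers is taken to be the largest i for which R_i > λ_i.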
Examples
>>> import numpy as np
>>> from scikit_posthocs import outliers_gesd
>>> data = np.array([-0.25, 0.68, 0.94, 1.15, 1.2, 1.26, 1.26, 1.34, 1.38,
...                  1.43, 1.49, 1.49, 1.55, 1.56, 1.58, 1.65, 1.69, 1.7,
...                  1.76, 1.77, 1.81, 1.91, 1.94, 1.96, 1.99, 2.06, 2.09,
...                  2.1, 2.14, 2.15, 2.23, 2.24, 2.26, 2.35, 2.37, 2.4,
...                  2.47, 2.54, 2.62, 2.64, 2.9, 2.92, 2.92, 2.93, 3.21,
...                  3.26, 3.3, 3.59, 3.68, 4.3, 4.64, 5.34, 5.42, 6.01])
>>> outliers_gesd(data, 5)
array([-0.25,  0.68,  0.94,  1.15,  1.2 ,  1.26,  1.26,  1.34,  1.38,
        1.43,  1.49,  1.49,  1.55,  1.56,  1.58,  1.65,  1.69,  1.7 ,
        1.76,  1.77,  1.81,  1.91,  1.94,  1.96,  1.99,  2.06,  2.09,
        2.1 ,  2.14,  2.15,  2.23,  2.24,  2.26,  2.35,  2.37,  2.4 ,
        2.47,  2.54,  2.62,  2.64,  2.9 ,  2.92,  2.92,  2.93,  3.21,
        3.26,  3.3 ,  3.59,  3.68,  4.3 ,  4.64])
>>> outliers_gesd(data, outliers = 5, report = True)
H0: no outliers in the data
Ha: up to 5 outliers in the data
Significance level:  α = 0.05
Reject H0 if Ri > Critical Value (λi)
Summary Table for Two-Tailed Test
---------------------------------------
      Exact           Test     Critical
  Number of      Statistic    Value, λi
Outliers, i      Value, Ri          5 %
---------------------------------------
        1          3.119        3.159
        2          2.943        3.151
        3          3.179        3.144 *
        4           2.81        3.136
        5          2.816        3.128
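The report is read by taking the largest i for which Ri exceeds λi, not the first one. That is why three values are removed here even though R1 and R2 fall below their critical values. The sketch below re-derives the test statistics with the standard library only and applies that rule. It is an illustration under stated assumptions, not the library's implementation: the helper name esd_statistics is hypothetical, and the critical values are copied from the 5 % column of the report rather than computed from the t distribution.

```python
import statistics


def esd_statistics(values, max_outliers):
    """Return [(R_i, removed_value), ...] for i = 1..max_outliers.

    Hypothetical helper: at each step, compute the largest absolute
    deviation from the mean in units of the sample standard deviation,
    then drop that observation before the next step.
    """
    remaining = list(values)
    steps = []
    for _ in range(max_outliers):
        mean = statistics.fmean(remaining)
        sd = statistics.stdev(remaining)  # sample standard deviation (n - 1)
        extreme = max(remaining, key=lambda v: abs(v - mean))
        steps.append((abs(extreme - mean) / sd, extreme))
        remaining.remove(extreme)
    return steps


data = [-0.25, 0.68, 0.94, 1.15, 1.2, 1.26, 1.26, 1.34, 1.38,
        1.43, 1.49, 1.49, 1.55, 1.56, 1.58, 1.65, 1.69, 1.7,
        1.76, 1.77, 1.81, 1.91, 1.94, 1.96, 1.99, 2.06, 2.09,
        2.1, 2.14, 2.15, 2.23, 2.24, 2.26, 2.35, 2.37, 2.4,
        2.47, 2.54, 2.62, 2.64, 2.9, 2.92, 2.92, 2.93, 3.21,
        3.26, 3.3, 3.59, 3.68, 4.3, 4.64, 5.34, 5.42, 6.01]

# Assumption: critical values taken from the report table (alpha = 0.05).
critical = [3.159, 3.151, 3.144, 3.136, 3.128]

steps = esd_statistics(data, len(critical))

# Number of outliers = largest i with R_i > lambda_i (0 if none exceed).
n_outliers = max(
    (i for i, ((r, _), lam) in enumerate(zip(steps, critical), start=1)
     if r > lam),
    default=0,
)
flagged = [value for _, value in steps[:n_outliers]]

print(n_outliers, flagged)
```

With the data above, only the third statistic exceeds its critical value, so the three most extreme observations (6.01, 5.42 and 5.34) are flagged, which matches the 51-element array returned by outliers_gesd(data, 5).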