Runs & Patterns Charts Analysis
- Runs and patterns charts are nonparametric visualization tools that encode sequence structure to detect serial dependence and shifts in binary or numeric data.
- They leverage methodologies such as gap analysis, finite Markov chain embedding, and scan statistics to achieve exact distribution-free error control.
- Practical applications span quality control, genomics, and industrial processes, enabling reliable detection of nonrandom patterns even in small sample settings.
A runs and patterns chart is a statistical process monitoring tool and inferential visualization that encodes the sequence structure of binary or numeric data to detect deviations from randomness, serial dependence, or distributional shifts. Runs-based methods, both classical and modern, are used in nonparametric inference, quality control, and the mathematical analysis of permutations and stochastic processes. The following article surveys the main theoretical, methodological, and applied frameworks for runs and patterns charts, with technical detail drawn from recent advances in nonparametric statistics, distribution-free control charting, and permutation pattern theory.
1. Definitions and Mathematical Foundations
A run in a binary sequence is a maximal consecutive block of identical symbols. For a sequence $Y_1, \dots, Y_n$ with $Y_i \in \{0, 1\}$, a run of 1’s is a maximal subblock of contiguous 1’s bounded by either 0’s or sequence ends; the number of runs is the total count of such blocks (for both 1’s and 0’s). In the context of permutations $\pi \in S_n$, a run is a maximal consecutive increasing subsequence, and the number of runs is related to the number of descents via $\mathrm{runs}(\pi) = \mathrm{des}(\pi) + 1$, where $\mathrm{des}(\pi) = |\{i : \pi_i > \pi_{i+1}\}|$.
Pattern statistics generalize this concept: a pattern chart tracks specified arrangements, such as the length of the longest run, the number of occurrences of an alternating or monotonic subsequence, or the maximum number of 1’s in a sliding window (scan statistic).
Modern runs and patterns charts exploit the mathematical properties of these objects—often tracked through indicator functions, gap vectors, or combinatorial recursions—both for exact enumeration and for the construction of distribution-free statistical tests (De, 2023, Wu, 2018, Wu, 17 Nov 2025).
2. Nonparametric and Distribution-Free Approaches
Several recent methodologies harness runs- and pattern-based statistics for distribution-free inference (that is, inference whose null distribution does not depend on the underlying process distribution $F$). The main pillars are:
- Binary transformation of raw data: Observations $X_i$ are mapped to indicators $Y_i = \mathbb{1}\{X_i > c\}$, for a threshold $c$ chosen to set a desired baseline success probability (Wu, 17 Nov 2025, Wu, 2018).
- Conditioning on the number of successes: Given $\sum_{i=1}^{n} Y_i = n_1$, the binary sequence is treated as a uniform random permutation of $n_1$ 1’s and $n - n_1$ 0’s. This erases the dependency on $F$ in the null distribution of any run or pattern statistic (Wu, 2018).
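The conditioning argument can be checked numerically on a small case. The sketch below is an illustration, not code from the cited papers: it enumerates all binary sequences of length 8 under i.i.d. Bernoulli sampling and confirms that, given three successes, the distribution of the number of success runs is the same for two different success probabilities. The function names and the choice of $n = 8$ are assumptions made for the example.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def success_runs(y):
    """Count maximal blocks of consecutive 1's."""
    return sum(1 for i, v in enumerate(y) if v == 1 and (i == 0 or y[i - 1] == 0))


def conditional_runs_dist(n, n1, p):
    """P(R = r | sum(Y) = n1) for i.i.d. Bernoulli(p) Y, by full enumeration."""
    weights = defaultdict(Fraction)
    for y in product((0, 1), repeat=n):
        if sum(y) == n1:
            # Every sequence with n1 ones has the same probability,
            # so p cancels once we normalise.
            weights[success_runs(y)] += p**n1 * (1 - p) ** (n - n1)
    total = sum(weights.values())
    return {r: w / total for r, w in sorted(weights.items())}


d_low = conditional_runs_dist(8, 3, Fraction(3, 10))
d_high = conditional_runs_dist(8, 3, Fraction(7, 10))
print(d_low == d_high)  # True: the conditional law does not depend on p
print(d_low)
```

Exact rational arithmetic is used so that the equality check is not affected by floating-point rounding.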
Finite Markov chain imbedding (FMCI) techniques are used to derive exact null distributions of statistics including the number of runs, the scan statistic, and the longest run, enabling the determination of control limits or p-values with nominal false-alarm rates exactly maintained (Wu, 17 Nov 2025, Wu, 2018).
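As a rough illustration of how a Markov-chain-style recursion yields an exact null distribution, the sketch below counts sequences by propagating a small state through the sequence one symbol at a time. It is a simplified dynamic program written for this example, not the transition-matrix construction of the cited papers; the state layout (ones used, last symbol, runs so far) is an assumption.

```python
from math import comb


def success_runs_null(n, n1):
    """Exact P(R = r | n1 ones among n), R = number of runs of 1's.

    State after each position: (ones used, last symbol, runs so far).
    """
    # Start as if a virtual 0 preceded the sequence, so a leading 1 opens a run.
    states = {(0, 0, 0): 1}
    for _ in range(n):
        nxt = {}
        for (ones, last, runs), cnt in states.items():
            # Append a 0: the run count is unchanged.
            key = (ones, 0, runs)
            nxt[key] = nxt.get(key, 0) + cnt
            # Append a 1: a new run starts only if the previous symbol was 0.
            if ones < n1:
                key = (ones + 1, 1, runs + (last == 0))
                nxt[key] = nxt.get(key, 0) + cnt
        states = nxt
    total = comb(n, n1)
    dist = {}
    for (ones, _, runs), cnt in states.items():
        if ones == n1:
            dist[runs] = dist.get(runs, 0) + cnt / total
    return dict(sorted(dist.items()))


print(success_runs_null(8, 3))
```

The number of states grows polynomially in $n$, whereas brute-force enumeration grows exponentially, which is the practical reason for embedding the statistic in a chain.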
3. Key Families of Runs and Patterns Statistics
3.1 Gap-Based Runs Charts
In gap-based analysis, the positions of all the 1’s in a binary sequence are enumerated, and the gaps between successive positions form the primary data vector (De, 2023). This construction enables detection of patterns undetectable by conventional runs counts:
- Exact Binomial test: Tests whether the number of "small" gaps deviates significantly from the $1/2$ probability predicted under randomness.
- Kendall's Tau trend test: Assesses monotonic trends in the gap sequence, indicating increasing or decreasing clustering of 1’s.
- Siegel–Tukey test: Tests for changes in gap-sequence dispersion (scale), using the Vegelius ties correction for accurate ranking under ties.
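The first two tests in this list can be sketched with the standard library alone. The code below is a minimal illustration under stated assumptions: a gap is taken to be the difference between successive positions of 1’s, the `cutoff` that separates "small" from "large" gaps is a hypothetical user-supplied value, and Kendall's statistic is the uncorrected tau-a rather than a tie-corrected version. The Siegel–Tukey test is omitted.

```python
from math import comb


def gap_vector(y):
    """Differences between successive positions of 1's (assumed gap definition)."""
    pos = [i for i, v in enumerate(y) if v == 1]
    return [b - a for a, b in zip(pos, pos[1:])]


def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total mass of outcomes no likelier than k."""
    pmf = [comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(n + 1)]
    return min(1.0, sum(q for q in pmf if q <= pmf[k] * (1 + 1e-9)))


def kendall_tau_a(x):
    """Kendall's tau-a between position in the list and value (no tie correction)."""
    n = len(x)
    s = sum((x[j] > x[i]) - (x[j] < x[i]) for i in range(n) for j in range(i + 1, n))
    return s / (n * (n - 1) / 2)


y = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1]
gaps = gap_vector(y)
cutoff = 2  # hypothetical threshold for a "small" gap
n_small = sum(g <= cutoff for g in gaps)

print(gaps)
print(binom_two_sided_p(n_small, len(gaps)))
print(kendall_tau_a(gaps))
```

In this toy sequence the 1’s thin out over time, so the gaps increase and the tau statistic is strongly positive even though the count of small gaps is unremarkable.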
3.2 Traditional Runs and Scan Statistics
Classical statistics include:
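- Number of runs of successes: $R_n = \sum_{i=1}^{n} Y_i (1 - Y_{i-1})$ with $Y_0 = 0$ (Wu, 17 Nov 2025).
- Longest run: $L_n = \max\{k : Y_i = Y_{i+1} = \cdots = Y_{i+k-1} = 1 \text{ for some } i\}$.
- Scan statistic: $S_n(w) = \max_{1 \le i \le n-w+1} \sum_{j=i}^{i+w-1} Y_j$ for a window of width $w$, tracking maximal local concentration.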
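The three statistics above can be computed in a single pass each. The following sketch uses illustrative function names and a made-up example sequence; it shows the number of success runs, the longest success run, and the scan statistic for a window width `w`.

```python
def success_runs(y):
    """Number of maximal blocks of consecutive 1's."""
    return sum(1 for i, v in enumerate(y) if v == 1 and (i == 0 or y[i - 1] == 0))


def longest_run(y):
    """Length of the longest block of consecutive 1's."""
    best = cur = 0
    for v in y:
        cur = cur + 1 if v == 1 else 0
        best = max(best, cur)
    return best


def scan_statistic(y, w):
    """Maximum number of 1's in any window of w consecutive positions."""
    if not 1 <= w <= len(y):
        raise ValueError("window width must satisfy 1 <= w <= len(y)")
    s = best = sum(y[:w])
    for i in range(w, len(y)):
        s += y[i] - y[i - w]  # slide the window one step to the right
        best = max(best, s)
    return best


y = [0, 1, 1, 0, 1, 1, 1, 0, 0, 1]
print(success_runs(y))       # 3
print(longest_run(y))        # 3
print(scan_statistic(y, 5))  # 4
```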
For both the success runs count and scan statistic, exact distributional formulas under the null can be derived via FMCI; transition matrices are constructed based on state spaces encoding number and pattern of runs (Wu, 17 Nov 2025, Wu, 2018).
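To make the state-space idea concrete for the scan statistic, the sketch below tracks the last $w - 1$ symbols and the number of 1’s used so far, and treats any window that reaches the threshold as an absorbing alarm state. This is a simplified counting recursion written for illustration, assuming $w \le n$; it is not the specific transition matrix of the cited work.

```python
from math import comb


def scan_tail_prob(n, n1, w, s):
    """P(S_w >= s | n1 ones among n), by counting sequences that never reach s."""
    # State: (last w-1 symbols, ones used) -> number of alarm-free prefixes.
    states = {((), 0): 1}
    for _ in range(n):
        nxt = {}
        for (tail, ones), cnt in states.items():
            for b in (0, 1):
                if ones + b > n1:
                    continue
                window = tail + (b,)
                if len(window) == w and sum(window) >= s:
                    continue  # alarm state: absorbed, not counted as alarm-free
                # Drop the oldest symbol once a full window has been seen.
                key = (window[1:] if len(window) == w else window, ones + b)
                nxt[key] = nxt.get(key, 0) + cnt
        states = nxt
    safe = sum(cnt for (_, ones), cnt in states.items() if ones == n1)
    return 1 - safe / comb(n, n1)


print(scan_tail_prob(6, 3, 3, 3))
```

The state space has at most $2^{w-1}(n_1 + 1)$ elements, so the recursion stays cheap for the short windows typically used in practice.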
3.3 Runs and Patterns in Permutations
The study of consecutive patterns in permutations includes enumeration of runs, descents, and more general statistics:
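- The exponential generating function for the number of permutations with $k$ runs, marked by $t$, is $\sum_{n \ge 1} \sum_{\pi \in S_n} t^{\mathrm{runs}(\pi)} \frac{z^n}{n!} = \frac{t\,(1 - e^{(t-1)z})}{e^{(t-1)z} - t}$.
- The distribution of runs is asymptotically normal with mean $(n+1)/2$ and variance $(n+1)/12$ for $n \ge 2$.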
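For small $n$ the distribution of runs can be verified by brute force. The sketch below counts ascending runs (descents plus one) over all permutations of $\{1, \dots, 5\}$ and computes the exact mean and variance, which can be compared with the standard values $(n+1)/2$ and $(n+1)/12$. The function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations


def ascending_runs(perm):
    """Number of maximal increasing blocks, i.e. descents + 1."""
    return 1 + sum(1 for a, b in zip(perm, perm[1:]) if a > b)


def runs_distribution(n):
    """Map r -> number of permutations of 1..n with exactly r ascending runs."""
    counts = {}
    for perm in permutations(range(1, n + 1)):
        r = ascending_runs(perm)
        counts[r] = counts.get(r, 0) + 1
    return dict(sorted(counts.items()))


n = 5
counts = runs_distribution(n)
total = sum(counts.values())
mean = Fraction(sum(r * c for r, c in counts.items()), total)
var = Fraction(sum(r * r * c for r, c in counts.items()), total) - mean**2

print(counts)     # {1: 1, 2: 26, 3: 66, 4: 26, 5: 1}
print(mean, var)  # 3 1/2
```

The counts are the Eulerian numbers for $n = 5$, and the mean and variance agree with $(5+1)/2 = 3$ and $(5+1)/12 = 1/2$.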
4. Control Chart Implementations and Practical Algorithms
Runs and patterns charts have central roles in modern process control, especially when normality or parametric modeling is infeasible.
4.1 Distribution-Free Phase I Charts
For Phase I monitoring, control limits for the runs or scan statistics are set so that, conditioning on the observed number of 1’s, the type I error probability is exactly $\alpha$ (Wu, 17 Nov 2025):
| Statistic | Control Limit Criterion |
|---|---|
| Number of runs | |
| Scan | |
Transition matrices are constructed for the runs and scan statistics, respectively, based on FMCI, and products of these matrices yield the joint distribution.
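For the number of success runs, the conditional null distribution also has a well-known closed form, $P(R = r \mid n_1) = \binom{n_1 - 1}{r - 1}\binom{n_0 + 1}{r} / \binom{n}{n_1}$ with $n_0 = n - n_1$, which makes a control limit easy to sketch. The code below is an illustration under two assumptions that do not come from the source: the limit is one-sided (few runs signal clustering), and it is chosen conservatively so that the false-alarm probability is at most a target level `alpha`, since a discrete statistic cannot generally hit a target exactly.

```python
from math import comb


def success_runs_pmf(n, n1):
    """Closed-form P(R = r | n1 ones among n), for n1 >= 1."""
    n0 = n - n1
    total = comb(n, n1)
    return {
        r: comb(n1 - 1, r - 1) * comb(n0 + 1, r) / total
        for r in range(1, min(n1, n0 + 1) + 1)
    }


def lower_control_limit(pmf, alpha):
    """Largest c with P(R <= c) <= alpha, or None if no such c exists."""
    cum, limit = 0.0, None
    for r in sorted(pmf):
        cum += pmf[r]
        if cum > alpha:
            break
        limit = r
    return limit


pmf = success_runs_pmf(30, 15)
limit = lower_control_limit(pmf, alpha=0.05)
attained = sum(p for r, p in pmf.items() if r <= limit)
print(limit, attained)  # signal when the observed number of runs is <= limit
```

Printing the attained level alongside the limit makes the gap between nominal and actual false-alarm probability visible.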
4.2 Gap and Pattern Charts for Binary Sequences
A workflow for analyzing a binary sequence using runs and patterns charts includes:
- Extract indices of 1’s, construct the gap vector from successive indices.
- Apply the binomial test, then—if indicated—Kendall’s Tau test, then Siegel–Tukey scale test.
- Plot gap vs. index (run chart), color-coding small/large gaps; annotate with test p-values.
- Add regression or trend overlays (pattern chart) as warranted (De, 2023).
The standard runs test (counting alternations in the sequence) is included for side-by-side comparison; gap-based tests can detect local nonrandomness missed by traditional methods.
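For the side-by-side comparison, the standard runs test can be computed exactly from the classical conditional distribution of the total number of runs given $n_1$ ones and $n_0$ zeros. The sketch below uses the textbook combinatorial formulas for even and odd run counts and reports a lower-tail p-value (few runs indicate clustering); the one-sided choice and the function names are assumptions for the example.

```python
from math import comb


def total_runs_pmf(n1, n0):
    """Exact P(R = r | n1, n0) for the total number of runs, n1, n0 >= 1."""
    total = comb(n1 + n0, n1)
    pmf = {}
    for r in range(2, n1 + n0 + 1):
        k = r // 2
        if r % 2 == 0:
            # k runs of each symbol; either symbol may come first.
            ways = 2 * comb(n1 - 1, k - 1) * comb(n0 - 1, k - 1)
        else:
            # k + 1 runs of one symbol and k of the other.
            ways = (comb(n1 - 1, k) * comb(n0 - 1, k - 1)
                    + comb(n1 - 1, k - 1) * comb(n0 - 1, k))
        if ways:
            pmf[r] = ways / total
    return pmf


def count_runs(y):
    """Total number of runs of either symbol."""
    return 1 + sum(1 for a, b in zip(y, y[1:]) if a != b)


def runs_test_lower_p(y):
    """Lower-tail exact p-value of the runs test, conditional on the symbol counts."""
    n1 = sum(y)
    n0 = len(y) - n1
    r = count_runs(y)
    return sum(p for k, p in total_runs_pmf(n1, n0).items() if k <= r)


print(runs_test_lower_p([1, 1, 1, 1, 0, 0, 0, 0]))  # 2/70, about 0.0286
```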
4.3 Sensitizing Rules for Shewhart Charts
In Shewhart-type control charts with subgroup means $\bar{X}_t$, modified rules fire if (for example) $r$ of the last $m$ points fall beyond a control limit and the remainder do not cross the center line, with thresholds set to calibrate the in-control average run length (ARL) (Antzoulakos et al., 2010).
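A rule of this kind can be sketched as a sliding-window check. The code below is one possible reading, not the authors' exact specification: it signals when at least `r` of the last `m` subgroup means lie beyond one control limit while every point in the window stays on that side of the center line. The limits in the example are arbitrary placeholders and are not calibrated to any ARL.

```python
def modified_r_of_m_signals(xbar, center, lcl, ucl, r, m):
    """Indices t where the window ending at t triggers the modified r-of-m rule."""
    signals = []
    for t in range(m - 1, len(xbar)):
        window = xbar[t - m + 1 : t + 1]
        # Upper-side signal: nothing crosses below the center line,
        # and at least r points exceed the upper limit.
        upper = all(x > center for x in window) and sum(x > ucl for x in window) >= r
        # Lower-side signal: the mirror image.
        lower = all(x < center for x in window) and sum(x < lcl for x in window) >= r
        if upper or lower:
            signals.append(t)
    return signals


xbar = [0.1, 2.2, 0.5, 2.4, -0.3, 2.5]  # made-up subgroup means
print(modified_r_of_m_signals(xbar, center=0.0, lcl=-2.0, ucl=2.0, r=2, m=3))  # [3]
```

The window ending at index 5 contains two points beyond the upper limit but does not signal, because the point at index 4 crosses the center line.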
5. Statistical Properties, Performance, and Use Cases
Runs and patterns charts possess several key features:
- Distribution-free performance: Control limits are fully nonparametric under the conditioning argument (Wu, 2018, Wu, 17 Nov 2025).
- Exact error control: In-control ARL is exactly $1/\alpha$ for a user-specified false-alarm probability $\alpha$.
- Small sample suitability: Methods remain valid for small to moderate sample sizes $n$, with exact or small-sample-corrected tests, and no need for normal approximations (De, 2023).
- Robustness to distributional shape: Performance is maintained under normal, skewed, and heavy-tailed innovations (Wu, 17 Nov 2025).
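The link between a per-sample false-alarm probability and the in-control ARL follows from a short calculation, sketched here under the assumption that signals at successive monitoring points are independent events with common probability $\alpha$. The run length $N$ (the index of the first signal) is then geometric:

$$
P(N = k) = \alpha (1 - \alpha)^{k-1}, \qquad
\mathrm{ARL} = E[N] = \sum_{k \ge 1} k\,\alpha (1 - \alpha)^{k-1} = \frac{\alpha}{\bigl(1 - (1 - \alpha)\bigr)^{2}} = \frac{1}{\alpha}.
$$

For example, $\alpha = 0.0027$ gives an in-control ARL of about $370.4$.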
Runs and patterns charts are used for detecting mean shifts in unknown distributions, identification of clustering or dispersion, and in genomics, reliability, and industrial process monitoring. The flexibility in statistic selection allows practitioners to target specific pattern types of concern.
6. Comparison and Integration with Classical Methods
Traditional runs tests measure the alternation frequency of 0’s and 1’s under the null hypothesis of independent, identically distributed elements with arbitrary mixing. However, within-sequence patterns such as clustering or trend in the occurrences of a specific symbol may evade this test but are detected by gap-based sequences or local scan statistics (De, 2023).
Modern designs recommend plotting both classical and advanced runs-based charts side-by-side; in numerous cases, the new methods (gap sequences, scan statistics) provide a lower out-of-control ARL or statistically significant evidence of nonrandomness even when conventional p-values are large.
7. Theoretical Enumeration and Pattern Analysis
Permutation-based theory links runs to broader pattern enumeration problems. The number of permutations in $S_n$ with exactly $k$ runs is given by Eulerian numbers, and associated generating functions permit both exact and asymptotic analysis (Elizalde, 2015). In this combinatorial setting, runs, descents, peaks, and general consecutive patterns are interrelated, with Wilf equivalence classes and cluster methods governing their enumeration and pattern avoidance properties.
The monotone consecutive pattern $12\cdots m$, corresponding to runs of length $m$, forms a unique c-Wilf equivalence class, with associated linear differential equations characterizing its generating functions.
In summary, runs and patterns charts provide a highly structured, distribution-free, and theoretically rigorous framework for the detection of serial dependence, nonrandom patterns, and process shifts. They integrate combinatorial enumeration, Markov chain methods, nonparametric statistics, and applied control chart design, forming the foundation of modern statistical process surveillance and permutation pattern analysis (De, 2023, Wu, 2018, Wu, 17 Nov 2025, Elizalde, 2015, Antzoulakos et al., 2010).