Forecastability

Predictability of time series data

Forecastability is a property of a time series describing the degree to which its future values can be predicted from past observations. The term has been defined in the forecasting literature in both structural and operational ways: structurally, as a property of the underlying data-generating process reflecting the degree of order or regularity in the series;[1] and operationally, as the range of forecast errors achievable in the long run, bounded below by the lowest error attainable in principle and above by the error of a naïve benchmark.[2] Kolassa (2009) argued that these perspectives may be viewed as complementary rather than mutually exclusive, with entropy-based measures serving as useful diagnostics of stability and lowest achievable forecast error as the operational criterion.[3]

Forecastability is treated in the literature as distinct from forecast accuracy. A highly forecastable series may still be forecast poorly by an inadequate model; conversely, no model can substantially reduce error for a series whose past and future are statistically independent at the relevant horizon.[4][2] Measurement approaches discussed in the literature include the coefficient of variation, entropy-based regularity measures such as approximate entropy and spectral entropy, Lyapunov exponents, model benchmark comparisons, and auto-mutual information.[5]

Background and motivation

Results from the M1, M3, and M4 forecasting competitions have been widely discussed in the forecasting literature in relation to differences in time-series characteristics and model performance across heterogeneous datasets.[6][7][8] Gilliland (2010) argued that where forecastability differs substantially across series in a portfolio, blanket application of sophisticated methods is an inefficient use of resources.[9] Petropoulos et al. (2022) noted that series-level structural characteristics are systematically related to method performance across large benchmark datasets.[10]

Catt (2009) proposed that forecastability should be understood as a structural property of the data-generating process, situated on a continuum from deterministic to random, and argued it can be assessed using information-theoretic measures of regularity such as approximate entropy.[1] Boylan (2009) proposed a different formulation: forecastability should be defined in terms of the range of forecast errors achievable in the long run, with a lower bound given by the lowest error attainable in principle and an upper bound defined by a naïve benchmark method.[2] Kolassa (2009) accepted that entropy-based measures can usefully diagnose the stability and structure of a series, but argued that practitioners ultimately require forecastability to be judged in terms of the lowest achievable forecast error, since that is what connects directly to forecasting decisions.[3]

Relationship to forecast accuracy

In the forecasting literature, forecastability is often treated as placing bounds on the accuracy achievable by any given method. Boylan (2009) formalised this as a range: the lower bound is the lowest forecast error attainable in principle, and the upper bound is the error produced by a naïve benchmark.[2] Kolassa (2009) noted that entropy-based diagnostics measure the stability of a series, whereas the lowest achievable forecast error measures its forecastability in operational terms.[3]

Green, Armstrong and Soon (2009) defined forecastability operationally as the ability to improve upon a naïve benchmark model, connecting the concept directly to forecast evaluation practice.[11] Makridakis and Hibon (2000) and Petropoulos et al. (2022) have discussed competition results in terms of whether series-level structural differences account for variation in the relative performance of simple versus complex methods.[7][10]

Theoretical foundations

The deterministic–random continuum

Forecastability has been discussed in relation to a broader classification of data-generating processes (DGPs) along a continuum from fully deterministic to purely random.[12] Chaos theory, following Edward Lorenz's work on atmospheric convection in 1963, demonstrated that some deterministic nonlinear systems exhibit sensitive dependence on initial conditions, making long-horizon prediction practically impossible even when the DGP is known. Eckmann and Ruelle (1985) argued that this implies a fundamental bound on the predictability horizon in chaotic systems, independent of any forecasting method applied.[12]

Data-generating processes are commonly classified along the following continuum:[13][12]

  • Deterministic: The DGP and its parameters are fully known; the series can in principle be forecast without error at any horizon.
  • Chaotic: The DGP is known but sensitive to initial conditions. Theoretically deterministic but practically bounded in forecastability, as characterised by positive Lyapunov exponents.[12]
  • Complex: The DGP cannot be fully characterised, as in many economic and social systems where emergent behaviour arises from interactions among many agents.
  • Random: The DGP has no exploitable pattern; future values are statistically independent of past values.

Catt (2009) situated typical business and economic time series within this continuum, arguing that their position between the deterministic and random extremes determines an upper bound on forecast accuracy achievable by any method.[14]

Information-theoretic foundations

Information theory, originating with Claude Shannon's foundational 1948 paper, provides a mathematical framework for quantifying the reduction in uncertainty achievable by observing the past.[15] The relevant quantity for time series forecastability is the mutual information between past observations and future values:

    I(X_{\le t}; X_{t+h}) = H(X_{t+h}) - H(X_{t+h} \mid X_{\le t}),

where H denotes Shannon entropy, X_{\le t} the observed past, and X_{t+h} the value at horizon h. High mutual information indicates that the past substantially reduces uncertainty about the future at horizon h; a value near zero indicates weak or absent past–future dependence.
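
A minimal Python sketch of this quantity, using a histogram (plug-in) estimate of mutual information between each observation and the observation h steps ahead, is given below. The function names, the binning choice, and the use of a single lagged value to stand in for the full past are all illustrative assumptions; plug-in estimators are biased on short series, and more refined estimators exist.

    import numpy as np

    def mutual_information(x, y, bins=16):
        # Plug-in (histogram) estimate of I(X; Y) in nats. Biased on small
        # samples; k-nearest-neighbour estimators are often preferred.
        joint, _, _ = np.histogram2d(x, y, bins=bins)
        p_xy = joint / joint.sum()                # joint probabilities
        p_x = p_xy.sum(axis=1, keepdims=True)     # marginal of X (column)
        p_y = p_xy.sum(axis=0, keepdims=True)     # marginal of Y (row)
        mask = p_xy > 0
        return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask])))

    def past_future_mi(series, horizon=1, bins=16):
        # I(X_t; X_{t+h}): dependence between each value and the value
        # h steps ahead, with one lagged value standing in for the past.
        x = np.asarray(series, dtype=float)
        return mutual_information(x[:-horizon], x[horizon:], bins=bins)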

Jaynes (1957) extended this framework through the maximum entropy principle, establishing information-theoretic reasoning as a foundation for statistical inference under uncertainty.[16] Bialek, Nemenman and Tishby (2001) formalised the concept of predictive information as the mutual information between the past and future of a stochastic process, arguing that this quantity characterises how much of the future is knowable from the past, independently of any particular method.[17] Related constructs in computational mechanics include excess entropy and statistical complexity, developed by Grassberger (1986) and Crutchfield and Feldman (2003).[18][19]

Measurement approaches

Coefficient of variation

The coefficient of variation (CV), defined as the ratio of the standard deviation to the mean of a detrended and deseasonalised series, has been proposed as a practical numerical indicator of forecastability.[9] Lower residual variability after removing trend and seasonality is taken to indicate greater forecastability.
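
As an illustration, the following sketch computes a CV of this kind under simple assumptions that are not prescribed by the cited sources: a linear trend removed by least squares and an additive seasonal pattern removed via per-period means.

    import numpy as np

    def coefficient_of_variation(series, season_length=12):
        # CV after a classical-style decomposition: remove a least-squares
        # linear trend and additive seasonal means, then relate residual
        # spread to the level of the series.
        x = np.asarray(series, dtype=float)
        t = np.arange(len(x))
        slope, intercept = np.polyfit(t, x, 1)
        detrended = x - (slope * t + intercept)
        seasonal = np.array([detrended[i::season_length].mean()
                             for i in range(season_length)])
        residual = detrended - seasonal[t % season_length]
        # Residual standard deviation relative to the mean level; this ratio
        # is unreliable when the mean is near zero, as noted below.
        return residual.std() / abs(x.mean())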

Gilliland (2010) noted that the CV assumes no further exploitable structure remains after detrending and deseasonalisation, an assumption that may not hold for chaotic or complex series, and that the measure becomes unreliable when the series mean is near zero.[9] Tiao and Tsay (1994) demonstrated that nonlinear structure can persist in series that appear low-variance under classical decomposition, motivating approaches capable of detecting non-linear dependence.[20]

Entropy-based measures

Information-theoretic entropy measures assess series regularity without assuming that only trend and seasonality constitute exploitable structure.

Approximate entropy (ApEn), introduced by Pincus (1991), quantifies the likelihood that similar short patterns in a series remain similar at the next comparison step; small values indicate high regularity.[21] Sample entropy (SampEn), developed by Richman and Moorman (2000), reduces the self-matching bias of ApEn in shorter series.[22] Both produce a single scalar measure of overall regularity that does not vary by forecast horizon.
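
A minimal sketch of sample entropy following the published definition is given below: B counts pairs of length-m templates within a Chebyshev tolerance r, A counts the corresponding length-(m+1) matches, self-matches are excluded, and SampEn = −log(A/B). The parameter defaults are conventional choices, not values from the cited sources, and the template handling is simplified relative to careful implementations.

    import numpy as np

    def _count_matches(x, length, r, n_templates):
        # Pairs of templates (self-matches excluded) whose Chebyshev
        # distance is below the tolerance r.
        templates = np.array([x[i:i + length] for i in range(n_templates)])
        count = 0
        for i in range(n_templates):
            d = np.max(np.abs(templates - templates[i]), axis=1)
            count += int(np.sum(d < r)) - 1       # drop the self-match
        return count

    def sample_entropy(series, m=2, r=None):
        # SampEn (Richman & Moorman 2000): -log(A/B). Small values
        # indicate a more regular, self-similar series.
        x = np.asarray(series, dtype=float)
        if r is None:
            r = 0.2 * x.std()                     # a conventional tolerance
        n_templates = len(x) - m                  # same count for both lengths
        b = _count_matches(x, m, r, n_templates)
        a = _count_matches(x, m + 1, r, n_templates)
        return -np.log(a / b) if a > 0 and b > 0 else float("inf")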

Goerg (2013) introduced forecastable component analysis (ForeCA), which identifies the most forecastable linear combination of a multivariate time series by minimising spectral entropy (the entropy of the normalised spectral density), and extends principal component analysis by optimising for forecastability rather than variance explained.[23]
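
The spectral entropy that ForeCA minimises can be illustrated for a single series, as in the sketch below: compute the periodogram, normalise it into a probability distribution over frequencies, and take its normalised Shannon entropy. This univariate sketch omits the optimisation over linear combinations that defines ForeCA itself.

    import numpy as np

    def spectral_entropy(series):
        # Normalised Shannon entropy of the power spectrum: near 0 for a
        # concentrated spectrum (e.g. a pure sinusoid, highly forecastable),
        # near 1 for a flat, white-noise-like spectrum.
        x = np.asarray(series, dtype=float)
        power = np.abs(np.fft.rfft(x - x.mean())) ** 2   # periodogram via FFT
        p = power / power.sum()                          # distribution over frequencies
        p = p[p > 0]
        return float(-np.sum(p * np.log(p)) / np.log(len(power)))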

Lyapunov exponents

In dynamical systems, Lyapunov exponents quantify the rate at which nearby trajectories diverge. Eckmann and Ruelle (1985) established that the largest Lyapunov exponent determines the timescale over which prediction remains feasible in chaotic systems.[12] Wang, Klee and Roos (2025) noted that reliable estimation requires long, high-quality series, and that estimates tend to be unreliable for the short or sparse series common in business and economic forecasting.[5]
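
For illustration only, the following rough sketch estimates the largest Lyapunov exponent in the spirit of the nearest-neighbour method of Rosenstein et al. (1993), which is not among this article's sources: delay-embed the series, track how each point's distance to its nearest neighbour grows over time, and fit the slope of the mean log-divergence curve. All parameter defaults are arbitrary, and serious use requires careful choices of embedding dimension, lag, and Theiler window.

    import numpy as np

    def largest_lyapunov(series, dim=3, lag=1, steps=20, theiler=10):
        # Rough nearest-neighbour estimate of the largest Lyapunov exponent.
        x = np.asarray(series, dtype=float)
        n = len(x) - (dim - 1) * lag
        # Delay embedding: each row is a point in the reconstructed space.
        emb = np.array([x[i:i + (dim - 1) * lag + 1:lag] for i in range(n)])
        usable = n - steps
        log_div = np.zeros(steps)
        counts = np.zeros(steps)
        for i in range(usable):
            d = np.linalg.norm(emb[:usable] - emb[i], axis=1)
            d[max(0, i - theiler):i + theiler + 1] = np.inf  # skip near-in-time points
            j = int(np.argmin(d))                            # nearest neighbour
            for k in range(steps):
                sep = np.linalg.norm(emb[i + k] - emb[j + k])
                if sep > 0:
                    log_div[k] += np.log(sep)
                    counts[k] += 1
        curve = log_div / np.maximum(counts, 1)
        # Slope of the mean log-divergence curve approximates the exponent
        # (in units of inverse time steps).
        return float(np.polyfit(np.arange(steps), curve, 1)[0])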

Model-based and benchmark approaches

Gilliland (2010) described Forecast Value Added (FVA) analysis as a method for evaluating forecasting performance relative to a simple naïve baseline; a method that cannot consistently outperform a naïve seasonal repeat is taken to indicate low forecastability or poor model specification.[9] Stock and Watson (2003) discussed variance decomposition and predictive R2 as approaches to assessing predictability in macroeconomic forecasting contexts.[24]
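
A minimal FVA-style comparison can be sketched as below, measuring a candidate forecast against a naïve seasonal repeat; mean absolute error stands in here for whatever accuracy measure an organisation prefers, and the names are illustrative.

    import numpy as np

    def forecast_value_added(actuals, forecasts, season_length=12):
        # FVA-style comparison: improvement of a candidate forecast over a
        # naive seasonal repeat, measured by mean absolute error (MAE).
        # A positive result means the candidate adds value over the baseline.
        y = np.asarray(actuals, dtype=float)
        f = np.asarray(forecasts, dtype=float)
        naive = y[:-season_length]        # seasonal repeat: value one season earlier
        mae_model = np.mean(np.abs(y[season_length:] - f[season_length:]))
        mae_naive = np.mean(np.abs(y[season_length:] - naive))
        return mae_naive - mae_model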

Auto-mutual information

Recent research has explored horizon-specific measures of forecastability based on the information-theoretic dependence between past and future observations. Auto-mutual information (AMI) at lag τ is the mutual information between a time series and its own values τ steps later:

    \mathrm{AMI}(\tau) = I(X_t; X_{t+\tau}) = H(X_{t+\tau}) - H(X_{t+\tau} \mid X_t).

Fraser and Swinney (1986) introduced the measurement of mutual information between a time series and its lagged self in the context of chaotic systems analysis.[25] Kantz and Schreiber (2004) noted that, unlike the autocorrelation function, AMI captures both linear and nonlinear temporal dependencies without distributional assumptions, and is explicitly indexed to the lag τ, allowing horizon-specific assessment.[13]
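
Reusing the histogram mutual-information helper from the earlier sketch, an AMI profile over lags can be computed as follows; the result is one value per lag, giving a horizon-indexed picture of past–future dependence.

    import numpy as np

    def auto_mutual_information(series, max_lag=24, bins=16):
        # AMI(tau) = I(X_t; X_{t+tau}) for tau = 1..max_lag, using the
        # mutual_information() helper sketched in the section on
        # information-theoretic foundations above.
        x = np.asarray(series, dtype=float)
        return [mutual_information(x[:-tau], x[tau:], bins=bins)
                for tau in range(1, max_lag + 1)]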

Catt (2026) reported associations between AMI and realised forecast accuracy across benchmark series from the M4 competition.[26] Wang, Klee and Roos (2025) provided a comparative review of time series forecastability measures, including AMI alongside entropy-based and model-based alternatives.[5]

Empirical evidence

The major international forecasting competitions (M1, M3, and M4) have provided large-scale benchmark data used in discussions of forecastability, though not designed to test it directly. Makridakis et al. (1982), Makridakis and Hibon (2000), and Makridakis, Spiliotis and Assimakopoulos (2020) reported that simple methods remained competitive with sophisticated alternatives on average across heterogeneous portfolios in each competition.[6][7][8] Petropoulos et al. (2022), in a comprehensive review of forecasting theory and practice, noted that series-level structural characteristics are systematically related to method performance.[27]

Spiliotis et al. (2020) examined the representativeness of the M4 dataset using instance space analysis, finding broad coverage of time series feature space across trend, seasonality, spectral entropy, and distributional characteristics.[28]

Practical implications

Gilliland (2010) and Kolassa (2009) proposed that forecastability diagnostics can be used to allocate modelling effort across large forecasting portfolios, directing resources toward series where structural dependence justifies investment.[9][4] Kolassa (2009) further noted that for series that cannot be accurately forecast, investigating how to mitigate the consequences of unavoidable error is likely to be a more productive organisational response than further modelling investment.[4] Catt (2009) argued that horizon-specific diagnostics may additionally guide decisions about how far ahead to forecast and at which lead times to invest in model capacity.[29]

Limitations

Wang, Klee and Roos (2025) noted that forecastability diagnostics depend on properties of the observed time series and that their reliability can degrade when series are short or highly sparse; their experiments show that metrics such as spectral predictability and Lyapunov exponents require sufficient data density and length to produce stable estimates.[5] Series that are short, intermittent, or structurally unstable are poorly suited to many forecastability diagnostic approaches.

Catt (2026) distinguished between forecastability as dependence (a property of the data-generating process) and forecastability as exploitability (the degree to which a given model can convert available dependence into lower forecast error), noting that diagnostics function as screening tools rather than guarantees of model performance.[30]

References
