4.0 Our Process: A Disciplined Framework for Alpha Generation
A robust investment signal is the product of a repeatable, disciplined research process. Our three-phase framework is the operational embodiment of our “extracting signal from noise” philosophy: machinery built to filter out spurious correlations and cognitive biases. It is explicitly designed to prevent common analytical pitfalls such as data snooping and survivorship bias, ensuring that only the most robust signals drive investment decisions.
Phase 1: Hypothesis and Ex Ante Justification
Every investment signal we develop begins not with data, but with a strong foundation in financial economic theory. We firmly believe that sustainable alpha cannot be discovered through undisciplined data mining. Therefore, every potential factor must have a sound economic rationale—a reason why it should be expected to predict future returns—before it is considered for empirical testing. This could be a risk-based explanation (e.g., investors demand compensation for bearing a certain type of risk) or a behavioral one (e.g., investors systematically over- or under-react to certain information). This initial step acts as a powerful filter, focusing our research efforts on plausible and persistent sources of return.
Phase 2: Data Integrity and Sample Selection
The quality of any quantitative model is entirely dependent on the quality of the underlying data. We therefore place heavy emphasis on data integrity and on avoiding the common biases that produce unrealistically favorable historical simulations.
- Survivorship Bias: Many commercial datasets include only companies that “survived” the period of study, excluding those that went bankrupt or were acquired. Building a model on such a dataset leads to overly optimistic results. Our strict policy is to use only survivorship-free datasets, which include the histories of all firms that existed at any point during the sample period, including those later delisted, ensuring our backtests are realistic.
- Data Snooping: This is the pitfall of testing a model on the same data used to calibrate it, which produces models that are “overfit” to past noise and fail in live trading. Our methodology prevents this by strictly partitioning our historical data into a “training set” for model calibration and a separate “test set” for out-of-sample validation. A strategy is considered viable only if it demonstrates efficacy on data it has never seen before.
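As a minimal illustration of the survivorship effect described above, the sketch below uses purely synthetic returns (all names and numbers are hypothetical, not drawn from our actual datasets): dropping the worst performers from a universe, as a survivorship-biased dataset implicitly does, mechanically inflates the measured average return.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical universe: annualized returns for 100 firms (synthetic).
all_returns = rng.normal(loc=0.06, scale=0.20, size=100)

# Crude proxy for delisting: assume the biggest losers vanish from the
# commercial dataset (bankruptcy, acquisition), leaving only "survivors".
survived = all_returns > -0.10

biased_mean = all_returns[survived].mean()  # survivorship-biased sample
true_mean = all_returns.mean()              # survivorship-free sample

# Removing the losers can only raise the sample mean, so a backtest built
# on the survivor-only dataset overstates historical performance.
```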
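The training/test separation described above can be sketched as follows. This is a toy example on synthetic data (the 70/30 split, the one-factor linear model, and all variable names are illustrative assumptions, not our production setup): the model is calibrated only on the earlier training window and then scored on data it has never seen.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical daily factor values and next-day returns (synthetic data
# with a deliberately weak embedded signal).
n = 1000
factor = rng.normal(size=n)
returns = 0.05 * factor + rng.normal(scale=1.0, size=n)

# Chronological split: calibrate on the first 70%, validate on the last 30%.
split = int(n * 0.7)
f_train, f_test = factor[:split], factor[split:]
r_train, r_test = returns[:split], returns[split:]

# Fit a one-factor linear model on the training set only.
beta = np.polyfit(f_train, r_train, deg=1)

# Out-of-sample check: correlation between the model's predictions and
# realized returns on the held-out test set.
pred = np.polyval(beta, f_test)
ic = np.corrcoef(pred, r_test)[0, 1]
```

A strategy whose out-of-sample correlation collapses relative to its in-sample fit is the classic signature of data snooping.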
Phase 3: Model Estimation, Validation, and Diagnosis
Once a theoretically sound hypothesis has been established and tested on clean, unbiased data, we proceed to the iterative process of model building. This phase involves rigorous statistical testing to ensure our models are correctly specified and robust. It culminates in a battery of diagnostic checks evaluating the model’s goodness of fit (using metrics such as R²), the joint significance of the model (using the F-test), and the statistical significance of each individual factor (using the t-test). Only models that pass this comprehensive validation are advanced for implementation.
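The three diagnostics named above can be computed directly from an ordinary least squares fit. The sketch below does so with NumPy on synthetic data (the two-factor setup and all values are hypothetical, and this is a textbook OLS illustration rather than our production estimation code): R² measures goodness of fit, the F-statistic tests the joint significance of the slope coefficients, and the t-statistics test each factor individually.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical cross-section: 200 assets, 2 candidate factors (synthetic);
# only the first factor carries a true signal.
n, k = 200, 2
X = rng.normal(size=(n, k))
y = 0.5 * X[:, 0] + rng.normal(size=n)

# OLS with an intercept: solve min ||Xd @ beta - y||.
Xd = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
resid = y - Xd @ beta

# Goodness of fit: R^2 = 1 - SS_res / SS_tot.
ss_res = resid @ resid
ss_tot = ((y - y.mean()) ** 2).sum()
r2 = 1 - ss_res / ss_tot

# Joint significance of the k slope coefficients: F-statistic.
dof = n - k - 1
f_stat = (r2 / k) / ((1 - r2) / dof)

# Individual significance: t-statistic for each coefficient.
sigma2 = ss_res / dof
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(Xd.T @ Xd)))
t_stats = beta / se
```

In this setup the factor with the genuine signal should show a large |t|, while the noise factor's |t| stays near zero, mirroring how the diagnostic battery separates real factors from spurious ones.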
This three-phase framework provides the discipline and rigor necessary to translate economic theory into actionable, alpha-generating investment signals.