In recent years, machine learning has become widely used in scientific contexts, producing strong predictive results. However, an increasingly clear limitation has emerged: many models perform well numerically but remain difficult to interpret.
At ActarusLab, our work focuses on this gap. We explore whether it is possible, starting from complex datasets, to recover simple and interpretable mathematical structures that describe the underlying system.
Many modern machine learning approaches, particularly deep neural networks, are highly effective but difficult to analyze.
In practice, this means:
This becomes a critical issue in domains such as quantitative finance, physics, and computational chemistry, where interpretation is an essential part of validation.
Unlike traditional machine learning methods, symbolic regression does not assume a fixed functional form.
Instead, it searches directly for a mathematical relationship of the form:
The goal is not only to minimize prediction error, but to identify expressions that are:
In many cases, the process begins with the construction of high-fidelity simulations that approximate the behavior of the system under study.
These environments allow controlled data generation, which is essential for testing hypotheses in a reproducible way.
The methodology can be summarized in three stages:
In some cases, the resulting model can take a form such as:
What makes this interesting is not its complexity, but the fact that it is:
To identify non-stationary relationships in time series and detect regime changes.
To reduce overfitting in QSAR models and improve robustness in predictive pipelines.
To uncover emergent structure in high-dimensional datasets.
A model that cannot be explained is difficult to validate rigorously.
For this reason, in our approach interpretability is not an optional feature but a constraint.
We prioritize models that are simpler and verifiable, even at the cost of some predictive performance.
The objective is not to replace scientific reasoning with more complex algorithms, but to use algorithms to recover simpler and more understandable representations of complex systems.
In this sense, symbolic regression provides a bridge between data complexity and mathematical structure.
ActarusLab — Independent Scientific Machine Learning Laboratory