Producing models with better generalization using continued fractions
In data analysis, our goal is often to create models that predict outcomes based on a set of variables. A bank, for instance, may take the current salary, age and location to determine a maximum dollar amount for lending.
The ability to extrapolate is often crucial—that is, the ability to predict well for individuals whose data lies outside the ranges used to build the model. This could mean determining a lending amount under uncommon market circumstances, or for an individual with a unique combination of salary and age. Another real-world problem is noisy data: random fluctuations and errors that can obscure the underlying patterns. A robust model must differentiate signal from noise, allowing it to make strong predictions even when the data contains inconsistencies.
We investigated using continued fractions to model such problems and demonstrated better generalization performance under these circumstances. Our algorithm achieved the highest number of first-place rankings among its state-of-the-art competitors on 21 noise-affected datasets. It never ranked worse than third on any dataset, while requiring at most 4% of the training time of its best-performing competitor.
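To give a sense of the model class, a continued-fraction model nests ratios of simpler functions, so its output is a rational function of the inputs rather than a plain weighted sum. The sketch below is illustrative only, not the published implementation: the choice of linear terms at every level, the fixed depth, and all coefficient values are assumptions made for the example.

```python
def linear(coeffs, bias, x):
    """A simple linear term w·x + b, used here for every level of the fraction."""
    return sum(c * xi for c, xi in zip(coeffs, x)) + bias

def eval_continued_fraction(g_terms, h_terms, x):
    """Evaluate f(x) = g0(x) + h1(x) / (g1(x) + h2(x) / (g2(x) + ...)).

    g_terms: (coeffs, bias) pairs for g_0 .. g_d  (d+1 entries)
    h_terms: (coeffs, bias) pairs for h_1 .. h_d  (d entries)
    Evaluation proceeds bottom-up from the innermost level.
    """
    acc = linear(*g_terms[-1], x)  # innermost term g_d(x)
    for g, h in zip(reversed(g_terms[:-1]), reversed(h_terms)):
        acc = linear(*g, x) + linear(*h, x) / acc
    return acc

# Depth-1 example with made-up coefficients: f(x) = x + 2 / 4 at x = 1
prediction = eval_continued_fraction(
    g_terms=[([1.0], 0.0), ([0.0], 4.0)],  # g0(x) = x, g1(x) = 4
    h_terms=[([0.0], 2.0)],                # h1(x) = 2
    x=[1.0],
)
print(prediction)  # 1.5
```

In practice, the coefficients of each level are fitted to training data by a search procedure; the division structure is what lets the model capture asymptotic and rational behavior that helps when extrapolating beyond the training range.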
This work was published in the Proceedings of the Genetic and Evolutionary Computation Conference (GECCO), the leading conference in genetic and evolutionary computation, and presented to the academic community in Lisbon, Portugal, in 2023: https://dl.acm.org/doi/abs/10.1145/3583131.3590461