The ex-ante identification of outperforming mutual funds, that is, funds with a positive risk-adjusted return (alpha) after fees, is a challenging task. Nevertheless, several empirical studies (Kosowski et al., 2006; Barras et al., 2010; Fama and French, 2010; Kacperczyk et al., 2014) provide evidence that a subset of fund managers possesses the skill to outperform a passive benchmark after fees. For investors to benefit from this skill, these fund managers must be identified ex ante. In this context, academic research has documented that various characteristics at the fund, fund-firm, and fund-manager level predict future fund performance (alpha). The aim of this project is to create a novel database of proprietary and public data (provided by commercial data vendors) for various markets (US, UK, Germany, Switzerland, etc.) that investors (e.g. pension funds, wealth managers, private banks) can use to exploit any predictability in fund performance found in the data. For this purpose, we apply machine learning (ML) techniques, which allow for non-linearities and interaction effects and thereby reduce the risk of model misspecification that can arise in a linear model.
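The misspecification argument can be illustrated with a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the two fund characteristics `x1` and `x2` are hypothetical, the data-generating process (alpha driven purely by their interaction) is invented, and a k-nearest-neighbour regressor stands in for the ML techniques, which the project description does not specify.

```python
import heapq
import random


def simulate(n, rng):
    """Hypothetical funds whose alpha depends only on the interaction x1 * x2."""
    data = []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        data.append((x1, x2, 0.5 * x1 * x2 + rng.gauss(0, 0.1)))
    return data


def fit_linear(train):
    """OLS of alpha on (1, x1, x2), solved via the 2x2 normal equations."""
    n = len(train)
    m1 = sum(r[0] for r in train) / n
    m2 = sum(r[1] for r in train) / n
    my = sum(r[2] for r in train) / n
    s11 = sum((r[0] - m1) ** 2 for r in train)
    s22 = sum((r[1] - m2) ** 2 for r in train)
    s12 = sum((r[0] - m1) * (r[1] - m2) for r in train)
    s1y = sum((r[0] - m1) * (r[2] - my) for r in train)
    s2y = sum((r[1] - m2) * (r[2] - my) for r in train)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return lambda x1, x2: b0 + b1 * x1 + b2 * x2


def fit_knn(train, k=10):
    """Non-parametric stand-in for an ML model: average alpha of the k nearest funds."""
    def predict(x1, x2):
        nearest = heapq.nsmallest(
            k, train, key=lambda r: (r[0] - x1) ** 2 + (r[1] - x2) ** 2
        )
        return sum(r[2] for r in nearest) / k
    return predict


def r_squared(predict, test):
    """Out-of-sample R^2 relative to the test-sample mean."""
    mean_y = sum(r[2] for r in test) / len(test)
    ss_res = sum((r[2] - predict(r[0], r[1])) ** 2 for r in test)
    ss_tot = sum((r[2] - mean_y) ** 2 for r in test)
    return 1 - ss_res / ss_tot


rng = random.Random(0)
train, test = simulate(2000, rng), simulate(500, rng)
r2_linear = r_squared(fit_linear(train), test)
r2_knn = r_squared(fit_knn(train), test)
print(f"out-of-sample R^2  linear: {r2_linear:.2f}  kNN: {r2_knn:.2f}")
```

Under this invented setup the linear model has no term that can represent the interaction, so its out-of-sample fit stays close to zero, whereas the non-parametric learner recovers most of the predictable variation without the interaction being specified in advance. Real fund data are far noisier, so the gap in practice would be much smaller.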