Organic solar cells are an inexpensive, flexible alternative to traditional silicon-based solar cells, but they are disadvantaged by low power conversion efficiency, which stems from empirical design and complex manufacturing processes. The design process can be accelerated by generating a comprehensive set of potential candidates. However, this would require laborious trial-and-error modeling of all possible polymer configurations. A machine learning model has the potential to accelerate the screening of potential donor candidates by associating structural features of each compound, encoded as molecular fingerprints, with its highest occupied molecular orbital (HOMO) energy. In this paper, extremely randomized tree learning models are employed to predict HOMO values for donor compounds, and a web application is developed. On two datasets of varying sizes, the proposed models outperform neural networks trained on molecular fingerprints and on SMILES, as well as other state-of-the-art architectures such as Chemception and Molecular Graph Convolution.
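To make the idea of mapping fingerprint bits to HOMO energies with extremely randomized trees more concrete, the following is a minimal from-scratch sketch, not the implementation used in this work. It relies on several assumptions: the fingerprints are random 32-bit vectors, the "HOMO" values are synthetic numbers generated by a hypothetical `fake_homo` function, and all function names and hyperparameters are illustrative. In practice a library implementation, such as scikit-learn's `ExtraTreesRegressor`, fed with real fingerprints would be a more typical choice.

```python
import random

def mean(values):
    return sum(values) / len(values)

def sq_dev(values):
    """Sum of squared deviations from the mean (impurity of a node)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(X, y, rng, k_features, min_samples=2):
    """Grow one extremely randomized regression tree.

    At each node, k_features candidate features are drawn at random and
    each gets ONE cut point drawn uniformly between its min and max in
    the node; the candidate with the lowest child impurity is kept.
    """
    if len(y) < min_samples or max(y) - min(y) < 1e-12:
        return mean(y)  # leaf: predict the mean target
    best = None
    for f in rng.sample(range(len(X[0])), k_features):
        lo = min(row[f] for row in X)
        hi = max(row[f] for row in X)
        if lo == hi:
            continue  # feature is constant in this node
        t = rng.uniform(lo, hi)  # random cut point, not an optimized one
        left = [i for i, row in enumerate(X) if row[f] <= t]
        right = [i for i, row in enumerate(X) if row[f] > t]
        if not left or not right:
            continue
        score = sq_dev([y[i] for i in left]) + sq_dev([y[i] for i in right])
        if best is None or score < best[0]:
            best = (score, f, t, left, right)
    if best is None:
        return mean(y)
    _, f, t, left, right = best

    def grow(idx):
        return build_tree([X[i] for i in idx], [y[i] for i in idx],
                          rng, k_features, min_samples)

    return (f, t, grow(left), grow(right))

def predict_tree(node, row):
    # Internal nodes are tuples, leaves are floats.
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_extra_trees(X, y, n_trees, k_features, seed=0):
    # Each tree sees the full training set (no bootstrap resampling).
    rng = random.Random(seed)
    return [build_tree(X, y, rng, k_features) for _ in range(n_trees)]

def predict(forest, row):
    return mean([predict_tree(tree, row) for tree in forest])

# --- Synthetic stand-in data (assumption: NOT real molecules) ---
data_rng = random.Random(0)

def fake_homo(bits):
    # Hypothetical rule: three bits shift a -5.0 eV baseline, plus noise.
    return (-5.0 - 0.4 * bits[3] + 0.2 * bits[7] - 0.1 * bits[12]
            + data_rng.gauss(0, 0.05))

X = [[data_rng.randint(0, 1) for _ in range(32)] for _ in range(200)]
y = [fake_homo(bits) for bits in X]
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]

forest = fit_extra_trees(X_train, y_train, n_trees=25, k_features=16, seed=1)
pred = [predict(forest, row) for row in X_test]
mae = mean([abs(p - t) for p, t in zip(pred, y_test)])

# Baseline: always predict the mean training target.
train_mean = mean(y_train)
baseline_mae = mean([abs(train_mean - t) for t in y_test])
print("extra-trees MAE (eV):", round(mae, 3))
print("mean-baseline MAE (eV):", round(baseline_mae, 3))
```

The defining choice here is that cut points are drawn at random rather than optimized, and every tree is trained on the full training set; averaging many such trees is what reduces variance. With binary fingerprint bits the random cut always separates 0 from 1, so the randomization comes mainly from which features are sampled at each node.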
1- Where did the intuition for using multiple representations come from? Is it that each would capture a different aspect of the molecule? 2- Are these specific regression and classification tasks? 3- How interpretable is your model? 4- Why did you use a fingerprint representation and not SMILES? 5- Which one can be used for inorganic materials? 6- Did you use random forests or neural networks for transfer learning?
1- The importance of ML in accelerating materials discovery 2- The power of extremely randomized trees in reducing the number of features 3- The potential of combining feature manipulation with an extensive grid search on a small, experiment-theory calibrated dataset of organic photovoltaic donors