Machine Learning to Predict Entropy and Heat Capacity of Hydrocarbons

Student thesis: Master's Thesis


Chemical substances are part of every aspect of human life, and understanding their properties is essential for applying them effectively. The properties of chemical species are usually determined experimentally or computed with theoretical methods. In this work, machine learning (ML) models for predicting entropy, S, and heat capacity, c_p, were developed for alkanes, alkenes, and alkynes at 298.15 K. The entropy and heat capacity data were collected from various sources. Commercial software (alvaDesc) was then used to generate molecular descriptors for all hydrocarbons in the dataset, and these descriptors served as input to the ML models. Support vector regression (SVR), ν-support vector regression (ν-SVR), and random forest regression (RFR) algorithms were trained with K-fold cross-validation on two levels. The first level assessed the models' performance and the second level generated the final models. After a performance comparison of the three models, the SVR model was chosen. To illustrate the advantage of the ML approach, the SVR model was compared against Benson's group additivity method. Finally, a sensitivity analysis was performed.
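The two-level K-fold procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the thesis code: the descriptor matrix and target values are synthetic stand-ins (the real inputs would be alvaDesc descriptors and collected entropy or heat capacity data), the hyperparameter grids and fold counts are arbitrary assumptions, and reading "two levels" as an outer performance loop followed by a final refit on all data is one plausible interpretation.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR, NuSVR

# Synthetic stand-in data (assumption): 120 "molecules", 5 "descriptors",
# and a noisy linear target playing the role of S or c_p.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 5))
y = 300.0 + 40.0 * X[:, 0] - 15.0 * X[:, 1] + rng.normal(scale=2.0, size=120)

# Candidate models with small, illustrative hyperparameter grids (assumptions).
candidates = {
    "SVR": (make_pipeline(StandardScaler(), SVR()),
            {"svr__C": [10.0, 100.0]}),
    "nu-SVR": (make_pipeline(StandardScaler(), NuSVR()),
               {"nusvr__C": [10.0, 100.0]}),
    "RFR": (RandomForestRegressor(n_estimators=50, random_state=0),
            {"max_depth": [None, 5]}),
}

outer = KFold(n_splits=5, shuffle=True, random_state=0)  # performance folds
inner = KFold(n_splits=3, shuffle=True, random_state=1)  # tuning folds

mae = {}           # mean absolute error per model from the first level
final_models = {}  # models refit on all data in the second level
for name, (estimator, grid) in candidates.items():
    search = GridSearchCV(estimator, grid, cv=inner,
                          scoring="neg_mean_absolute_error")
    # Level 1: estimate generalization error on held-out outer folds.
    fold_scores = cross_val_score(search, X, y, cv=outer,
                                  scoring="neg_mean_absolute_error")
    mae[name] = -fold_scores.mean()
    # Level 2: tune on the full dataset and keep the refit best model.
    final_models[name] = search.fit(X, y).best_estimator_

best_name = min(mae, key=mae.get)
print(mae, best_name)
```

The model with the lowest first-level error would be the one carried forward, which in the thesis was SVR; on this synthetic data the winner carries no meaning.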
Date of Award: Jun 2020
Original language: English (US)
Awarding Institution
  • Physical Sciences and Engineering
Supervisor: Mani Sarathy


  • machine learning
  • SVR
  • RFR
  • entropy
  • heat capacity
  • hydrocarbons
