We consider the problem of traffic density reconstruction using measurements from probe vehicles (PVs) with a low penetration rate, that is, when the number of sensors is small compared to the number of vehicles on the road. We assume noisy measurements and a partially unknown first-order model. Together, these constraints make machine learning the only applicable approach to reconstructing the state. First, we investigate how the identification and reconstruction processes can be merged and how a sparse dataset can still enable good identification. Second, we propose a pre-training procedure that aids hyperparameter tuning and prevents the gradient descent algorithm from getting stuck at saddle points. Examples using numerical simulations and the SUMO traffic simulator show that the reconstructions are close to the real density in all cases.