Electronic health records (EHR) contain interconnected data that can be modelled as a graph. Prior research has shown that models exploiting this graph structure perform better on predictive tasks than models that assume no connectivity between data elements. However, EHR data does not always include structural information, which makes it difficult to construct a graph representation in practice. The authors propose the Graph Convolutional Transformer (GCT), a novel approach that jointly learns the hidden structure while performing prediction tasks when structure information is unavailable. In experiments on both synthetic data and publicly available EHR data, the proposed model consistently outperformed previous approaches on tasks such as graph reconstruction and readmission prediction, which suggests that it can serve as an effective general-purpose representation learning algorithm for EHR data.
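
To make the idea of learning a hidden structure more concrete, the following is a minimal sketch of how attention weights between the medical codes of one encounter could be read as a soft adjacency matrix and then used for one propagation step. It is not the authors' implementation: the code names, the embedding size, the random embeddings, and the functions `soft_adjacency` and `propagate` are all assumptions made for illustration, and the learned query, key, and value projections that a real Transformer layer would apply are omitted.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def soft_adjacency(embeddings):
    """Scaled dot-product attention weights between all pairs of codes.

    Row i is a probability distribution over the codes that code i attends
    to, which can be read as a weighted adjacency matrix when no explicit
    edges are recorded in the data.
    """
    d = len(embeddings[0])
    scores = [
        [sum(a * b for a, b in zip(u, v)) / math.sqrt(d) for v in embeddings]
        for u in embeddings
    ]
    return [softmax(row) for row in scores]


def propagate(adjacency, embeddings):
    """One graph-convolution-like step: each code becomes a weighted
    average of all code embeddings, using the attention weights as edges."""
    d = len(embeddings[0])
    return [
        [sum(w * v[k] for w, v in zip(row, embeddings)) for k in range(d)]
        for row in adjacency
    ]


# Hypothetical encounter: the code names and embeddings are illustrative only.
random.seed(0)
codes = ["dx:fever", "dx:cough", "rx:acetaminophen", "lab:cbc"]
emb = [[random.gauss(0, 1) for _ in range(8)] for _ in codes]

adj = soft_adjacency(emb)      # 4 x 4 soft adjacency, each row sums to 1
updated = propagate(adj, emb)  # 4 x 8 updated code representations
```

In a trained model the embeddings would be learned jointly with the prediction task, so the attention weights would be shaped by the supervision signal instead of by random initialisation as in this sketch.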