Autism Spectrum Disorder (ASD) and Intellectual Disability (ID) are comorbid neurodevelopmental disorders with complex genetic architectures. Despite large-scale sequencing studies, only a fraction of the risk genes have been identified for either disorder. Here, we present a novel network-based gene risk prioritization algorithm named DeepND that performs cross-disorder analysis to improve prediction power by exploiting the comorbidity of ASD and ID via multitask learning. Our model uses graph convolutional neural networks to leverage information from gene coexpression networks that model human brain development, and learns which spatio-temporal neurodevelopmental windows are important for disorder etiologies. We show that our approach substantially improves on state-of-the-art prediction power in both single-disorder and cross-disorder settings. DeepND identifies the mediodorsal thalamus and cerebral cortex brain regions and the infancy-to-childhood period as the highest neurodevelopmental risk window for both ASD and ID. We observe that both disorders are enriched in transcription regulators. Despite tight regulatory links among ASD risk genes, such links are lacking across ASD and ID risk genes and within ID risk genes. Finally, we investigate frequent ASD- and ID-associated copy number variation regions and confident false findings to suggest several novel susceptibility gene candidates. DeepND can be generalized to analyze any combination of comorbid disorders and is released at http://github.com/ciceklab/deepnd.
# Equal contribution. *Correspondence: cicek@cs.bilkent.edu.tr
been used to assess gene risk using excess genetic burden from case-control and family studies [35], and were recently extended to work with multiple traits [68]. Yet, these tools work only with genes with observed disruptive mutations (mainly de novo). It is often of interest to use these estimates as prior risk and obtain a posterior risk adjusted by a gene interaction network, which can also assess risk for genes with no prior signal.
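To make the idea of a network-adjusted risk concrete, the following is a minimal sketch of one generic way to spread prior risk over a gene interaction network, namely a random walk with restart. It is not the DeepND model, which relies on graph convolutional networks, and it is not any of the cited tools. The function name `propagate_risk`, the gene names, the toy edges, and the `restart` value are all hypothetical and chosen for illustration only.

```python
def propagate_risk(edges, prior, restart=0.5, n_iter=100, tol=1e-9):
    """Random walk with restart over an undirected gene network (illustrative only).

    edges: iterable of (gene_u, gene_v) pairs; both genes must be keys of `prior`.
    prior: dict mapping every gene to a non-negative prior risk score.
    Returns a dict mapping every gene to its network-adjusted score.
    """
    genes = sorted(prior)
    neighbors = {g: set() for g in genes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    total = sum(prior.values())
    if total <= 0:
        raise ValueError("at least one gene needs a positive prior score")
    # Normalize the prior so that the scores behave like a probability distribution.
    p0 = {g: prior[g] / total for g in genes}

    p = dict(p0)
    for _ in range(n_iter):
        nxt = {}
        for g in genes:
            # Each neighbor splits its current score evenly among its own neighbors.
            inflow = sum(p[n] / len(neighbors[n]) for n in neighbors[g])
            nxt[g] = (1 - restart) * inflow + restart * p0[g]
        delta = max(abs(nxt[g] - p[g]) for g in genes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: only GENE_A carries prior signal.
toy_edges = [("GENE_A", "GENE_B"), ("GENE_B", "GENE_C"), ("GENE_D", "GENE_E")]
toy_prior = {"GENE_A": 1.0, "GENE_B": 0.0, "GENE_C": 0.0, "GENE_D": 0.0, "GENE_E": 0.0}
scores = propagate_risk(toy_edges, toy_prior)
for gene in sorted(scores, key=scores.get, reverse=True):
    print(gene, round(scores[gene], 4))
```

In this toy example, GENE_B and GENE_C receive nonzero scores despite having no prior signal, because they are connected to GENE_A. GENE_D and GENE_E stay at zero because no path links them to a gene with prior risk.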
Network-based computational gene risk prediction methods come in handy for (i) imputing the insufficient statistical signal and providing a genome-wide risk ranking, and (ii) identifying the affected cellular circuitries such as pathways and networks of genes [35,29,24,37,69,27,26,67,10]. While these methods have helped unravel the underlying mechanisms, they have several limitations. First, by design, they are limited to working with a single disorder. To compare and contrast comorbid disorders such as ASD and ID using these tools, one approach is to pool the mutational burden observed for each disorder, assuming the two are the same. However, disorder-specific features are lost as a consequence [29]. The more common approach is to perform independent analyses per disorder and intersect the results. Unfortunately, this approach ignores a valuable source of information coming from the shared genetic architecture and loses prediction power, as per-disorder analyses have less input (i.e., samples, mutation counts) and less statistical power [81,12,37]. Second, c...