To improve HLA-I neoepitope prediction, Bulik-Sullivan et al. developed a computational model called EDGE and trained it on a large set of HLA-I peptide mass spectrometry, HLA type, and RNA-seq data from human tumor and normal tissue samples. EDGE identified predictive models for 53 alleles, learned whether genes were prone to presentation, and predicted the stability of HLA-peptide complexes. EDGE achieved a positive predictive value up to ninefold higher than that of current state-of-the-art models and reliably identified neoantigens and neoantigen-specific T cells using routine clinical specimens.
Neoantigens, which are expressed on tumor cells, are one of the main targets of an effective antitumor T-cell response. Cancer immunotherapies that target neoantigens are of growing interest and are in early human trials, but current methods to identify neoantigens require invasive or difficult-to-obtain clinical specimens, require the screening of hundreds to thousands of synthetic peptides or tandem minigenes, or are relevant only to specific human leukocyte antigen (HLA) alleles. We apply deep learning to a large (N = 74 patients) HLA peptide and genomic dataset from various human tumors to create a computational model of antigen presentation for neoantigen prediction. We show that our model, named EDGE, increases the positive predictive value of HLA antigen prediction by up to ninefold. We apply EDGE to identify neoantigens and neoantigen-reactive T cells using routine clinical specimens and small numbers of synthetic peptides for most common HLA alleles. EDGE could improve the development of neoantigen-targeted immunotherapies for cancer patients.
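For readers unfamiliar with the metric, the following is a minimal sketch of how a positive predictive value (PPV) and a fold improvement between two predictors might be computed. It is not the EDGE evaluation code: the function names, the peptide labels, and the toy call sets are hypothetical and chosen only to make the arithmetic visible.

```python
def positive_predictive_value(predicted, presented):
    """PPV = true positives / all peptides the model called positive."""
    predicted = set(predicted)
    if not predicted:
        raise ValueError("no peptides were predicted as presented")
    true_positives = len(predicted & set(presented))
    return true_positives / len(predicted)


def fold_improvement(ppv_new, ppv_baseline):
    """Ratio of two PPVs; a value of 3.0 reads as 'threefold higher'."""
    if ppv_baseline <= 0:
        raise ValueError("baseline PPV must be positive")
    return ppv_new / ppv_baseline


# Toy data (hypothetical labels, not real peptides or results).
presented = {"PEP1", "PEP2", "PEP3"}            # observed as presented
baseline_calls = ["PEP1", "PEPX", "PEPY", "PEPZ"]  # 1 of 4 calls correct
model_calls = ["PEP1", "PEP2", "PEP3", "PEPX"]     # 3 of 4 calls correct

baseline_ppv = positive_predictive_value(baseline_calls, presented)
model_ppv = positive_predictive_value(model_calls, presented)
print(baseline_ppv, model_ppv, fold_improvement(model_ppv, baseline_ppv))
```

In this toy case the baseline PPV is 0.25 and the model PPV is 0.75, a threefold improvement; the "up to ninefold" figure in the abstract is a ratio of this same kind.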