Traversing TCGA: Making Sense of the Data Files

This post is part of an ongoing series, Traversing TCGA, in which I analyze data from The Cancer Genome Atlas using Python. Once the data has finished downloading, we end up with a folder full of .xml files containing the clinical data. How do we go from relatively free-form data in hundreds of


Traversing TCGA: Downloading the Data

The Cancer Genome Atlas (TCGA) is an amazing source of data about cancer. It contains clinical observations, gene panel data, and genome sequence data for thousands of cancer patients. I have started a series called Traversing TCGA, which I will update periodically with my insights about the data in TCGA. First, I chose to investigate