.
├── README.md
├── _quarto.yml
├── aacr-genie-surveillance.Rproj
├── analysis_scripts
│ ├── 01_descriptives_aacr_nsclc_cohort.qmd
│ ├── 01_descriptives_aacr_nsclc_cohort_files
│ ├── maps.jpeg
│ └── references.bib
├── epidemiology.csl
├── functions
│ ├── boxplot_stats.R
│ └── get_visit_stats.R
├── index.Rmd
├── index.rmarkdown
├── public
├── references.bib
├── references.qmd
├── renv
│ ├── activate.R
│ ├── library
│ ├── settings.json
│ └── staging
├── renv.lock
└── update_README.R
Surveillance in NSCLC
Surveillance and descriptives in AACR GENIE BPC
Preface
Background
This repository hosts materials for exploring surveillance patterns in the non-small cell lung cancer (NSCLC) cohort of the AACR GENIE BPC database.
Data
This repository leverages publicly accessible data from the American Association for Cancer Research Project Genomics Evidence Neoplasia Information Exchange Biopharma Collaborative (GENIE BPC). This initiative is an effort to aggregate comprehensive clinical data linked to genomic sequencing data to create a pan-cancer, publicly available data repository.
To get access to the data, follow these instructions.
To obtain access to public data releases:
1. Register for a ‘Synapse’ account and accept the Synapse account terms of use.
2. Navigate to the data release and request access (e.g., for the NSCLC 2.0-public data release, navigate to the ‘Synapse’ page for that release). Towards the top of the page, there is information including the ‘Synapse’ ID, DOI, Item count, and Access. Next to Access is a link that reads Request Access.
3. Select Request Access, review the terms of data use, and select Accept.
Note that permissions for Synapse and permissions for each data release are distinct. Both permissions must be accepted to successfully access the data.
After receiving access, the data can be queried using the {genieBPC} package. The {genieBPC} package is a user-friendly data processing pipeline that streamlines the development of analytic cohorts ready for clinico-genomic analyses. Check out this website to learn more about the data.
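As a rough illustration of the query step, the sketch below wraps two {genieBPC} calls in a small helper. The helper name pull_nsclc is hypothetical, and the version string "v2.0-public" is assumed to correspond to the NSCLC 2.0-public release mentioned above. It also assumes the credentials are available as the environment variables SYNAPSE_USERNAME and SYNAPSE_PASSWORD. Depending on the installed {genieBPC} version, a personal access token may be needed instead of a username and password, so check the package documentation before relying on this.

```r
# Hypothetical helper (not part of this repository): pull the NSCLC cohort.
# Assumes {genieBPC} is installed and Synapse access has been granted.
pull_nsclc <- function(version = "v2.0-public") {
  # Read credentials from environment variables, e.g. set in .Renviron
  genieBPC::set_synapse_credentials(
    username = Sys.getenv("SYNAPSE_USERNAME"),
    password = Sys.getenv("SYNAPSE_PASSWORD")
  )
  # Download the requested release; the result is a list of data frames
  genieBPC::pull_data_synapse(cohort = "NSCLC", version = version)
}

# Usage (requires network access and valid credentials):
# nsclc_data <- pull_nsclc()
# str(nsclc_data, max.level = 2)
```

Nothing is downloaded until the helper is called, so the definition can be sourced safely before access has been granted.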
Dependencies
R package dependencies are managed through the renv package. All packages and their versions can be viewed in the lockfile renv.lock. All required packages and the appropriate versions can be installed by running the following command:
renv::restore()
Important: Make sure you have Quarto installed.
Reproducibility
To reproduce the analyses in this repository, make sure your ‘Synapse’ account credentials (SYNAPSE_USERNAME, SYNAPSE_PASSWORD) are stored in an environment file, e.g., .Renviron, and that all dependencies are installed (see above). Then simply execute quarto render in the Terminal or hit the Render button within RStudio.
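The prerequisites above can be checked from R before rendering. The following is a minimal sketch, not something shipped with this repository: the helper names missing_env_vars and render_project are hypothetical, and it assumes the Quarto command-line tool is on the PATH and that the working directory is the project root.

```r
# Hypothetical helper: names of required variables that are unset or empty
missing_env_vars <- function(vars = c("SYNAPSE_USERNAME", "SYNAPSE_PASSWORD")) {
  values <- Sys.getenv(vars, unset = "")
  vars[!nzchar(values)]
}

# Hypothetical helper: fail early if a prerequisite is missing, then render
render_project <- function() {
  missing <- missing_env_vars()
  if (length(missing) > 0) {
    stop("Not set in the environment (e.g. .Renviron): ",
         paste(missing, collapse = ", "))
  }
  if (!nzchar(Sys.which("quarto"))) {
    stop("The quarto command was not found on the PATH")
  }
  # Equivalent to running `quarto render` in the Terminal
  system2("quarto", "render")
}

# Usage, from the project root after renv::restore():
# render_project()
```

Checking the credentials first gives a clearer error than a failed download part-way through rendering.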
Structure
- .Rprofile - defines paths and activates renv, options for Posit R package manager
- protocol - motivating example and statistical analysis protocol
- scripts - main R/RMarkdown analytic scripts
- functions - helper functions called in scripts
- tables - main and supplementary tables (R objects and .docx format)
- figures - main and supplementary figures (R objects and .docx format)
- renv/renv.lock - renv directories to manage R dependencies and versions used in this simulation
- public - output of Quarto scripts