4. CpG_distrb_chrom.py

4.1. Description

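This program calculates the distribution of CpGs across chromosomes: the total CpG count per chromosome, the percentage of all CpGs on each chromosome, and the number of CpGs per Mb.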

4.2. Options

--version

show program’s version number and exit

-h, --help

show this help message and exit

-i INPUT_FILES, --input_files=INPUT_FILES

Input CpG file(s) in BED3+ format. Multiple BED files should be separated by “,” (e.g., “-i file_1.bed,file_2.bed,file_3.bed”). Each BED file can be a regular text file or a compressed file (.gz, .bz2). The bar plots will NOT be generated if you provide more than 12 samples (BED files). [required]

-n FILE_NAMES, --names=FILE_NAMES

Short, meaningful names used to label the samples. Names should be separated by “,” and their number must match the number of CpG BED files. If not provided, the basenames of the CpG BED files are used to label the samples. [optional]

-s CHROM_SIZE, --chrom-size=CHROM_SIZE

Chromosome size file: a tab- or space-separated text file with two columns. The first column is the chromosome name/ID and the second column is the chromosome size. This file determines (1) which chromosomes are included in the final bar plots, so do NOT include ‘unplaced’ or ‘alternative’ contigs in it, and (2) the order of chromosomes in the final bar plots. [required]

-o OUT_FILE, --output=OUT_FILE

The prefix of the output files. [required]
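
As an illustration of how the inputs described above fit together, the following Python sketch reads a chromosome size file, counts CpGs per chromosome from a plain or compressed BED file, and derives sample labels. It is not the program's actual source code: the function names (open_maybe_compressed, read_chrom_sizes, count_cpgs, sample_labels) are hypothetical, and the sketch assumes that each BED line represents one CpG and that chromosomes missing from the size file are simply skipped.

```python
import bz2
import gzip
import os
from collections import OrderedDict


def open_maybe_compressed(path):
    """Open a regular, .gz, or .bz2 file in text mode (hypothetical helper)."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    if path.endswith(".bz2"):
        return bz2.open(path, "rt")
    return open(path, "r")


def read_chrom_sizes(path):
    """Return {chromosome: size}, preserving the order of the size file."""
    sizes = OrderedDict()
    with open(path) as handle:
        for line in handle:
            fields = line.split()  # splits on tabs or spaces
            if len(fields) < 2 or line.startswith("#"):
                continue
            sizes[fields[0]] = int(fields[1])
    return sizes


def count_cpgs(bed_path, chrom_sizes):
    """Count BED lines per chromosome, keeping only listed chromosomes.

    Assumption: one BED line is one CpG.
    """
    counts = OrderedDict((chrom, 0) for chrom in chrom_sizes)
    with open_maybe_compressed(bed_path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 3 or line.startswith(("#", "track", "browser")):
                continue
            if fields[0] in counts:
                counts[fields[0]] += 1
    return counts


def sample_labels(input_files, names=None):
    """Use the given names, or fall back to the basenames of the BED files."""
    paths = input_files.split(",")
    if names:
        labels = names.split(",")
        if len(labels) != len(paths):
            raise ValueError("number of names must match number of BED files")
        return labels
    return [os.path.basename(p) for p in paths]
```

Because the size file is read into an ordered mapping, the chromosome order in the counts follows the order of the file, which mirrors the role the size file plays in ordering the bar plots.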

4.4. Command

$ CpG_distrb_chrom.py -i 450K_probe.hg19.bed3.gz,850K_probe.hg19.bed3.gz -n 450K,850K \
  -s hg19.chrom.sizes -o chromDist

Output files

  • chromDist.txt

  • chromDist.r

  • chromDist.CpG_total.pdf

  • chromDist.CpG_percent.pdf

  • chromDist.CpG_perMb.pdf

Total CpG count per chromosome

../_images/chromDist.CpG_total.png

CpG percent on each chromosome (normalized to total CpGs)

../_images/chromDist.CpG_percent.png

CpG per Mb (normalized to chromosome size)

../_images/chromDist.CpG_perMb.png
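
The two normalizations named in the figure titles above can be sketched in Python as follows. This is an illustration only, not the program's own code: the function name normalize_counts is hypothetical, and the sketch assumes that "total CpGs" means the sum over the chromosomes listed in the chromosome size file and that chromosome sizes are given in bases.

```python
def normalize_counts(counts, chrom_sizes):
    """Return (chrom, total, percent, per_mb) rows in chrom_sizes order.

    counts:      {chromosome: number of CpGs}
    chrom_sizes: {chromosome: size in bases}
    """
    total = sum(counts.get(chrom, 0) for chrom in chrom_sizes)
    rows = []
    for chrom, size in chrom_sizes.items():
        n = counts.get(chrom, 0)
        # Percent: share of all counted CpGs that fall on this chromosome.
        percent = 100.0 * n / total if total else 0.0
        # Per Mb: CpG count divided by chromosome size in megabases.
        per_mb = n / (size / 1e6) if size else 0.0
        rows.append((chrom, n, percent, per_mb))
    return rows


rows = normalize_counts(
    {"chr1": 30, "chr2": 10},
    {"chr1": 2000000, "chr2": 500000},
)
for chrom, n, percent, per_mb in rows:
    print(chrom, n, percent, per_mb)
```

In this toy example chr1 holds more CpGs in total, but chr2 has the higher density per Mb, which is the kind of difference the per-Mb plot is meant to reveal.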