Author: Mark Robinson (pruned by Henrik Bengtsson)
Created on: 2009-01-14
Last updated: 2010-03-16
Below is the set of commands, from the script bpmapCluster2Cdf.R, that was used to create the CDF file and all the other associated files for the Human Promoter tiling array. The starting point is a BPMAP (binary probe mapping) file, which you can get from Affymetrix. However, if you wish to run the MAT normalization, you also need a "matchscore" for each probe, that is, the number of times the probe maps exactly to the human genome. The BPMAP files that you can download from the MAT Download website include the matchscore.
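To make the matchscore definition concrete, here is a minimal base-R sketch that counts exact occurrences of a probe in a toy sequence. The helper countExactMatches() and the toy data are hypothetical and only illustrate the idea; the matchscores in the MAT BPMAP files are precomputed against the whole genome and may follow additional rules (for example, regarding strands) that are not modelled here.

```r
# Toy illustration only; countExactMatches() is a hypothetical helper.
countExactMatches <- function(probe, genome) {
  # Number of possible start positions for the probe within the genome
  n <- nchar(genome) - nchar(probe) + 1L;
  if (n < 1L) return(0L);
  starts <- seq_len(n);
  # Compare every probe-length substring against the probe
  sum(substring(genome, starts, starts + nchar(probe) - 1L) == probe);
}

genome <- "ACGTACGTTTACGT";
print(countExactMatches("ACGT", genome));
print(countExactMatches("GGGG", genome));
```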
You will also need to know the number of rows (of probes physically on the chip) in order to get the indexing right. If you don't have this handy, look at the output of readCelHeader() (from the affxparser package) for a CEL data file that you have.
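As a sketch of the lookup described above, the following reads the header of a CEL file and extracts the chip dimensions. The helper getChipDimensions() and the pathname are hypothetical; the sketch assumes that the list returned by readCelHeader() contains the elements 'rows' and 'cols'.

```r
# getChipDimensions() is a hypothetical helper, not part of any package.
getChipDimensions <- function(celFile) {
  # Return NULL if the file or the affxparser package is not available
  if (!file.exists(celFile) || !requireNamespace("affxparser", quietly=TRUE)) {
    return(NULL);
  }
  hdr <- affxparser::readCelHeader(celFile);
  c(rows=hdr$rows, cols=hdr$cols);
}

# Hypothetical pathname; substitute one of your own CEL files
dims <- getChipDimensions("rawData/MyDataSet/Hs_PromPR_v02/sample1.CEL");
print(dims);
```

The resulting values are what you would pass as the 'rows' and 'cols' arguments in the commands that follow.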
library("aroma.affymetrix");
bpmapFile <- "Hs_PromPR_v01-3_NCBIv36.NR.bpmap";
chipType <- "Hs_PromPR_v02";
bpmapCluster2Cdf(bpmapFile, chipType, rows=2166, cols=2166, verbose=-20);
# You should now have Hs_PromPR_v02.cdf and Hs_PromPR_v02.pps in the current directory
# Move the CDF file into the 'annotationData/chipTypes/Hs_PromPR_v02' directory
pathname <- sprintf("%s.cdf", chipType);
pathname <- Arguments$getReadablePathname(pathname, mustExist=TRUE);
destPath <- file.path("annotationData", "chipTypes", chipType);
destPathname <- Arguments$getWritablePathname(pathname, path=destPath);
file.rename(pathname, destPathname);
# peel out the probe sequences from the BPMAP file
cdf <- AffymetrixCdfFile$byChipType(chipType);
acs <- AromaCellSequenceFile$allocateFromCdf(cdf, tags="", verbose=-80);
importFromBpMap(acs, bpmapFile, rows=2166, verbose=-80);
# peel out the match scores from the BPMAP file
acm <- AromaCellMatchScoreFile$allocateFromCdf(cdf, tags="", verbose=-80);
importFromBpMap(acm, bpmapFile, rows=2166, verbose=-80);
# create a "unique" CDF, which is just a rearrangement of the data so that multimap probes are copied into separate "cells"
cdfU <- getUniqueCdf(cdf);
# peel out the chromosome and position of every probe
# (Note these exact commands may need to be modified for non-human chips)
acp <- AromaCellPositionFile$allocateFromCdf(cdfU, verbose=-80);
# keep the names, because the chromosome is parsed from names(ind) below
ind <- getCellIndices(cdfU, stratifyBy="pm", unlist=TRUE, useNames=TRUE);
ch <- gsub("chr", "", gsub("F", "", substr(names(ind),1,5)));
pathname <- sprintf("%s.pps", chipType);
sp <- unlist(loadObject(pathname), use.names=FALSE);
chInt <- ch;
chInt[ch=="X"] <- 23;
chInt[ch=="Y"] <- 24;
chInt[ch=="M"] <- 25;
chInt <- as.numeric(chInt);
acp[ind,1] <- chInt;
acp[ind,2] <- sp;
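The chromosome parsing above can be tried on its own with a few made-up names. This is only a sketch: the names below are hypothetical and assume the pattern "chr<chromosome>F<...>", whereas the real names come from names(ind).

```r
# Hypothetical unit names, assumed to follow "chr<chromosome>F<...>"
unitNames <- c("chr1F12345", "chr10F6789", "chrXF555", "chrYF42", "chrMF7");

# Keep the first five characters, then drop the "F" and the "chr" prefix
ch <- gsub("chr", "", gsub("F", "", substr(unitNames, 1, 5)));

# Recode the non-numeric chromosomes as integers: X=23, Y=24, M=25
chInt <- ch;
chInt[ch == "X"] <- 23;
chInt[ch == "Y"] <- 24;
chInt[ch == "M"] <- 25;
chInt <- as.numeric(chInt);
print(chInt);
```

For chips from other organisms, both the name pattern and the integer codes would likely need to be adapted.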
# calculate local CpG density around each probe (useful for MeDIP-chip data)
acc <- AromaCellCpgFile$allocateFromCdf(cdfU, verbose=-80);
library("MEDME");
library("BSgenome.Hsapiens.UCSC.hg18");
w <- which(ch != "M");  # exclude mitochondrial probes; 'ch' no longer carries the "chr" prefix
n <- length(ch);
dummy <- matrix(rnorm(n*2),nrow=n);
rownames(dummy) <- paste(ch, sp, sep=".");
mms <- new("MEDMEset", chr=ch[w], pos=sp[w], logR=dummy[w,], organism="hsa");
# choose a reasonable window size here (based on the size of hybridized fragments?)
cg600 <- CGcount(data=mms,wsize=600)@CGcounts;
acc[ind[w],1] <- cg600;
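To illustrate what "local CpG density" means, here is a small base-R sketch that counts CG dinucleotides in a window centered on a position. The helper countCpG() and the toy sequence are hypothetical, and this is not the MEDME implementation: CGcount() works on the genome package and may differ in its details, for example in how CpGs are weighted by distance.

```r
# countCpG() is a hypothetical helper giving a plain, unweighted count.
countCpG <- function(sequence, pos, wsize) {
  half <- wsize %/% 2L;
  # Clip the window to the ends of the sequence
  start <- max(1L, pos - half);
  end <- min(nchar(sequence), pos + half);
  window <- substr(sequence, start, end);
  hits <- gregexpr("CG", window, fixed=TRUE)[[1]];
  if (hits[1] == -1L) 0L else length(hits);
}

sequence <- "TTCGCGAATTCGTTTTTTTTCG";
print(countCpG(sequence, pos=5, wsize=6));
print(countCpG(sequence, pos=16, wsize=4));
```

A wider window smooths the density over more sequence, which is why the window size is tied to the size of the hybridized fragments.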
Alternatively, you can just download these files from 'Hs_PromPR_v02' and place them in the annotationData/chipTypes/Hs_PromPR_v02/ directory.