When following your workflow in https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#standard-workflow, why do you use rowMeans to select the genes shown in the heatmap of the count matrix? Could I use PCA results to select the top 20 genes instead? In my case, I have 6,770 genes. I suspect that the 20 genes with the highest mean normalized counts are not representative of the whole dataset, so I would like to select genes by their PCA loadings. Does that make sense?
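library("DESeq2")
library("pheatmap")
# ntd comes from the vignette: ntd <- normTransform(dds), i.e. log2(n + 1) of normalized counts
# Indices of the 20 genes with the highest mean normalized count
select <- order(rowMeans(counts(dds,normalized=TRUE)),
                decreasing=TRUE)[1:20]
df <- as.data.frame(colData(dds)[,c("condition","type")])
pheatmap(assay(ntd)[select,], cluster_rows=FALSE, show_rownames=FALSE,
         cluster_cols=FALSE, annotation_col=df)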
sessionInfo()
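For reference, selecting genes by PCA loadings could look roughly like the sketch below. This is only an illustration, not the vignette's method: the matrix mat is simulated as a stand-in for assay(ntd) so that the code runs on its own, and ranking by the absolute PC1 loading is just one possible choice (several components could be combined instead).

```r
# Stand-in for assay(ntd): a genes x samples matrix of transformed counts.
# Simulated here only so the sketch is self-contained (assumption).
set.seed(1)
mat <- matrix(rnorm(200 * 6), nrow = 200,
              dimnames = list(paste0("gene", 1:200), paste0("sample", 1:6)))

# Run PCA with samples as observations, so genes are the variables
# and pca$rotation holds one loading per gene per component.
pca <- prcomp(t(mat), center = TRUE, scale. = FALSE)

# Rank genes by the absolute value of their PC1 loading and keep the top 20.
loadings_pc1 <- pca$rotation[, "PC1"]
select <- order(abs(loadings_pc1), decreasing = TRUE)[1:20]
top_genes <- rownames(mat)[select]
```

With real data, mat would be replaced by assay(ntd), and select could be passed to pheatmap(assay(ntd)[select,], ...) exactly as in the snippet above. Note that these genes are the ones driving the main axis of variation between samples, which is a different criterion from having the highest mean count.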
Thanks for clarifying that this is just a starting point for data exploration. I'll definitely dive in and use the best methods to properly explore the data.