2021-03-29
Symbol | Meaning |
---|---|
\(g = 6000\) | number of genes |
\(m = 40\) | genes involved in methionine metabolism |
\(n = 5960\) | genes not involved in methionine metabolism |
\(k = 10\) | number of genes in the cluster |
\(x = 6\) | number of methionine genes in the cluster |
Symbol | Meaning | Formula |
---|---|---|
\(C_1\) | choose 10 distinct genes among 6000 | \(C_1 = C_{m+n}^{k} = \frac{6000!}{10!\,5990!} \approx 1.65 \times 10^{31}\) |
\(C_2\) | choose 6 distinct genes among the 40 involved in methionine metabolism | \(C_2 = C_{m}^{x} = \frac{40!}{6!\,34!} \approx 3.8 \times 10^{6}\) |
\(C_3\) | choose 4 distinct genes among the 5960 not involved in methionine metabolism | \(C_3 = C_{n}^{k-x} = \frac{5960!}{4!\,5956!} \approx 5.25 \times 10^{13}\) |
\(C_4\) | choose 6 methionine and 4 non-methionine genes | \(C_4 = C_2 \cdot C_3 = C_{m}^{x} C_{n}^{k-x} \approx 2.0 \times 10^{20}\) |
Probability of having exactly 6 methionine genes in a selection of 10
\[P(X=6) = \frac{C_4}{C_1} = \frac{C_{m}^{x}C_{n}^{k-x}}{C_{m+n}^{k}} = \frac{C_{40}^{6}C_{5960}^{4}}{C_{6000}^{10}} \approx 1.219 \times 10^{-11}\]
Probability of having at least 6 methionine genes in a selection of 10
\[P(X \ge 6) = \sum_{i=x}^{k}\frac{C_{m}^{i}C_{n}^{k-i}}{C_{m+n}^{k}} \approx 1.223 \times 10^{-11}\]
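The two probabilities above can be checked numerically. The following is a minimal sketch in Python using only the standard library; the helper names `hypergeom_pmf` and `hypergeom_upper_tail` are illustrative choices, not part of the course material. It applies the formulas with the values of \(m\), \(n\), \(k\) and \(x\) from the first table.

```python
from math import comb

# Values taken from the table of symbols.
m = 40      # genes involved in methionine metabolism
n = 5960    # genes not involved in methionine metabolism
k = 10      # number of genes in the cluster
x = 6       # number of methionine genes in the cluster


def hypergeom_pmf(i, m, n, k):
    """P(X = i): exactly i marked genes in a random selection of k among m + n."""
    return comb(m, i) * comb(n, k - i) / comb(m + n, k)


def hypergeom_upper_tail(x, m, n, k):
    """P(X >= x): sum of P(X = i) for i = x, ..., min(k, m)."""
    return sum(hypergeom_pmf(i, m, n, k) for i in range(x, min(k, m) + 1))


p_exact = hypergeom_pmf(x, m, n, k)
p_at_least = hypergeom_upper_tail(x, m, n, k)

print(f"P(X = {x})  = {p_exact:.3e}")
print(f"P(X >= {x}) = {p_at_least:.3e}")
```

Because `math.comb` works on exact integers, the only rounding happens in the final division. The tail probability is only slightly larger than \(P(X=6)\), since the terms for \(i = 7, \dots, 10\) are several orders of magnitude smaller.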
tool: g:GOSt from gProfiler https://biit.cs.ut.ee/gprofiler/gost
documentation: https://biit.cs.ut.ee/gprofiler/page/docs
html: https://du-bii.github.io/module-3-Stat-R/stat-R_2021/tutorials/Rsession6_tuto_gProfiler.html
Goal:
What about a negative control?
Rmd: https://du-bii.github.io/module-3-Stat-R/stat-R_2021/practicals/Rsession6_functional_enrichment.Rmd
html: https://du-bii.github.io/module-3-Stat-R/stat-R_2021/practicals/Rsession6_functional_enrichment.html
All genes are sorted according to some criterion (e.g. differential expression p-value, correlation of expression with other variables, …).
Each graph compares the ranked gene list with one reference class (e.g. one biological process).
Black bars denote genes belonging to the reference class.
The green curve estimates, at each rank \(i\), the degree of over-representation of the reference genes among the \(i\) top-ranking genes.
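To make the idea of such a curve concrete, here is a minimal Python sketch of one common way to compute it: an unweighted running sum that steps up at each gene of the reference class and down at each other gene, in the spirit of a Kolmogorov-Smirnov statistic. This is an assumption for illustration only; the exact statistic behind the green curve is not specified here, and the function name `running_enrichment` and the toy gene identifiers are hypothetical.

```python
def running_enrichment(ranked_genes, reference_class):
    """Unweighted running enrichment score along a ranked gene list.

    At each rank, the score increases by 1/n_hits if the gene belongs to the
    reference class and decreases by 1/n_miss otherwise, so it ends at 0.
    """
    reference = set(reference_class)
    n_hits = sum(1 for gene in ranked_genes if gene in reference)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("the ranked list must contain both hits and misses")

    score = 0.0
    curve = []
    for gene in ranked_genes:
        if gene in reference:
            score += 1.0 / n_hits   # black bar: gene in the reference class
        else:
            score -= 1.0 / n_miss   # gene outside the reference class
        curve.append(score)
    return curve


# Toy example (hypothetical gene names): reference genes sit near the top.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"]
reference = {"g1", "g2", "g4"}

curve = running_enrichment(ranked, reference)
peak_rank = curve.index(max(curve)) + 1  # 1-based rank of the maximum
print("running score:", [round(s, 3) for s in curve])
print("peak at rank", peak_rank)
```

A curve that rises early and peaks near the top of the list indicates that the reference genes are concentrated among the top-ranking genes; a curve that hovers around zero suggests no particular enrichment.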