| calcBaselineLOddsFromSample {mgrp} | R Documentation |
Calculates the baseline log-odds for a genotype in order to achieve the
desired disease prevalence (expected risk). The function takes a disease
prevalence ("p") and vectors for each of odds ratio ("or"), frequency of the risk allele in
the population ("f") and number of genes ("nog"). The genotype will then
consist of nog[i] genes, each with risk allele frequency f[i] and
disease odds ratio or[i] for each instance of the risk allele. (The overall
genotype will be made up of sum(nog) genes.)
To calculate a baseline log odds, three assumptions are made:
(1) Each copy of the risk allele is assumed to give the same increased risk
for disease, such that the OR for a homozygote with the risk allele (compared to
the homozygote non-risk allele) is $or^2$ (where "or" is the odds ratio for
a heterozygote relative to the homozygote non-risk allele).
(2) The alleles are assumed to be in Hardy-Weinberg equilibrium, such that
the overall allele frequency fully specifies the frequencies of each of
the genotypes EE, Ee and ee.
(3) The log odds ratios are additive, and there is no interaction between
genes in the genotype.
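Assumptions (1) and (2) can be illustrated for a single gene with a small sketch. The helpers hweFreqs and genoOR are hypothetical names used only for illustration and are not part of mgrp; the sketch also assumes that "E" denotes the risk allele, which the text above does not state explicitly.

```r
# Hypothetical helper (not part of mgrp): genotype frequencies under
# Hardy-Weinberg equilibrium for a risk allele frequency f.
# Assumption: "E" denotes the risk allele.
hweFreqs <- function(f) {
  c(ee = (1 - f)^2,        # 0 risk alleles
    Ee = 2 * f * (1 - f),  # 1 risk allele
    EE = f^2)              # 2 risk alleles
}

# Hypothetical helper: odds ratio of each genotype relative to ee,
# following assumption (1): one copy gives or, two copies give or^2.
genoOR <- function(or) {
  c(ee = 1, Ee = or, EE = or^2)
}

freqs <- hweFreqs(0.4)
ors <- genoOR(1.5)
print(freqs)
print(ors)
```

The three frequencies sum to 1 for any f between 0 and 1, which is why the allele frequency alone is enough to describe the genotype distribution of a gene.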
Once we fix or, f and p we can solve for a unique baseline log odds which
will yield the desired prevalence of disease in the population. Overall
risk for a given genotype is then
expit(sum(#riskalleles[i] * log(or[i])) + baseline log odds).
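The idea of solving for the baseline can be sketched as a one-dimensional root-finding problem over a sample of genotypes. This is only an illustration under stated assumptions, not the package's implementation: solveBaseline is a hypothetical name, genotypes are assumed to be coded as risk-allele counts (0, 1 or 2), and the example frequencies and odds ratios are arbitrary. Because the mean risk increases monotonically with the baseline, the root is unique.

```r
# Hypothetical sketch, not the mgrp implementation.
# riskCounts: matrix (individuals x genes) of risk-allele counts (0, 1 or 2).
solveBaseline <- function(or, p, riskCounts) {
  # Genetic contribution to the log odds for each sampled individual
  # (log odds ratios are additive, assumption 3).
  geneticLO <- as.vector(riskCounts %*% log(or))
  # Difference between mean risk in the sample and the target prevalence;
  # plogis() is the expit (inverse logit) function.
  gap <- function(b) mean(plogis(b + geneticLO)) - p
  uniroot(gap, interval = c(-30, 30), tol = 1e-10)$root
}

set.seed(1)
f <- c(0.4, 0.2)    # illustrative risk allele frequencies
or <- c(1.5, 1.2)   # illustrative odds ratios
n <- 5000
# Under Hardy-Weinberg equilibrium the risk-allele count is Binomial(2, f).
riskCounts <- sapply(f, function(x) rbinom(n, size = 2, prob = x))

bLO <- solveBaseline(or, p = 0.05, riskCounts)
# Mean risk in the sample at the solved baseline; should be close to 0.05.
achieved <- mean(plogis(bLO + as.vector(riskCounts %*% log(or))))
print(c(bLO = bLO, achieved = achieved))
```

With odds ratios above 1, the solved baseline lies below logit(p), since the risk alleles carried by the sample account for part of the prevalence.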
calcBaselineLOddsFromSample(or,f,nog,p,sampGenotypes)
or |
(A vector of) disease odds ratio(s) for the heterozygote relative to the homozygote non-risk allele. |
f |
(A vector of) frequency of the risk allele in the population |
nog |
(A vector of) number of genes for each f/or combination |
p |
The prevalence of the disease in the population |
sampGenotypes |
A sample of genotypes for a population. |
calcBaselineLOddsFromSample returns a list:
or |
The input disease odds ratio for the heterozygote |
f |
The input frequency of the risk allele |
p |
The input prevalence of the disease in the population |
nog |
The input numbers of genes in the population |
bLO |
The baseline log odds to achieve the desired prevalence. |
Daryl Morris
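# Draw nn genotype codes (1, 2 or 3) per gene, with Hardy-Weinberg
# probabilities x^2, 2*x*(1-x) and (1-x)^2 for risk allele frequency x.
sampleGenes <- function(ff, nn) {
  sapply(ff, function(x) sample(1:3, nn, replace = TRUE,
                                prob = c(x^2, 2 * x * (1 - x), (1 - x)^2)))
}
f <- seq(.4, .1, length.out = 50)    # risk allele frequencies
or <- seq(1.5, 1, length.out = 50)   # odds ratios
nog <- rep(1, 50)                    # one gene per f/or combination
nSamples <- 10000                    # avoids masking base::sample
sampGenotypes <- sampleGenes(f, nSamples)
calcBaselineLOddsFromSample(or = or, f = f, nog = nog, p = .05, sampGenotypes)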