| simulateGenotypes {mgrp} | R Documentation |
These functions (1) simulate a set of genotypes from the specified odds ratio (OR), risk-allele frequency, and disease prevalence, (2) simulate disease states according to the calculated risk, and (3) report performance.
simulateGenotypes(or, f, p, n = 1e+05, nog = 400, varyEffects = FALSE, seed = NULL, silent = FALSE)
summarizeRiskSet(geneSimulation, highRisk = .2, roundDigits = 3)
or |
The disease odds ratio for the heterozygote relative to the non-risk homozygote (may be a vector if varyEffects==TRUE) |
f |
The frequency of the risk allele in the population (may be a vector if varyEffects==TRUE) |
p |
The prevalence of the disease in the population |
n |
Total sample size |
nog |
Number of genes in the model (may be a vector) |
varyEffects |
TRUE or FALSE. Specifies whether to allow or & f to vary over the different genes making up a sample's full genotype. If TRUE, the function checks whether f & or are vectors, and expands (or subsets) what it finds to specify or & f for each gene. For example, "or" might be a vector and f a single number; the function will then use the same f for each value of "or", extending the last value of "or" as necessary to reach the maximum model size specified in nog. |
seed |
(default NULL) A random seed which is set prior to the random number generation. Setting the seed consistently will produce consistent results. |
silent |
(default FALSE) The function will attempt to do as many samples as it can at a time, but may run into memory allocation issues. The function will detect memory allocation issues and break the job into smaller chunks if necessary. "silent" controls whether or not the memory allocation issues are reported to the user. |
geneSimulation |
The structure returned by simulateGenotypes |
highRisk |
The risk threshold used as the cutpoint for "high risk" when computing the percentage at high risk, the true positive rate (TPR), and the false positive rate (FPR) |
roundDigits |
The number of digits to which to round the results table. |
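The expansion rule described under varyEffects can be illustrated with a minimal sketch. The helper name expandToSize is hypothetical and not part of mgrp; it only assumes the behavior stated above (truncate when too long, repeat the last value when too short).

```r
# Hypothetical helper (not part of mgrp): pad or truncate a vector to `size`,
# repeating the last value when the input is too short.
expandToSize <- function(x, size) {
  if (length(x) >= size) {
    x[seq_len(size)]
  } else {
    c(x, rep(x[length(x)], size - length(x)))
  }
}

nog <- c(20, 50, 150)
maxGenes <- max(nog)      # largest model size requested
or <- c(1.5, 1.3, 1.2)    # shorter than maxGenes, so the last value is repeated
f <- 0.05                 # a single number, reused for every gene

orPerGene <- expandToSize(or, maxGenes)
fPerGene  <- expandToSize(f, maxGenes)
```

After this, orPerGene and fPerGene each hold one value per gene for the largest model in nog.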
Given an odds ratio, a frequency of the risk allele (in the population)
and an overall disease prevalence, a disease likelihood ratio can be
calculated (assuming Hardy-Weinberg equilibrium, and that each instance of
the risk allele confers the same multiplicative risk).
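As a rough sketch of this calculation for a single gene, the code below derives genotype frequencies from Hardy-Weinberg equilibrium, applies the multiplicative odds model, and solves for the baseline odds that reproduce the prevalence. The function genotypeLR is a hypothetical illustration and is not claimed to match the package's internal computation.

```r
# Hypothetical helper, for illustration only (not the package's own code).
genotypeLR <- function(or, f, p) {
  # Hardy-Weinberg frequencies for 0, 1, 2 copies of the risk allele
  freq <- c((1 - f)^2, 2 * f * (1 - f), f^2)
  # Multiplicative model: each risk allele multiplies the disease odds by `or`
  relOdds <- or^(0:2)
  # Find the baseline (0-copy) log odds so that the average risk equals p
  meanRiskMinusP <- function(logOdds0) {
    odds <- exp(logOdds0) * relOdds
    sum(freq * odds / (1 + odds)) - p
  }
  logOdds0 <- uniroot(meanRiskMinusP, interval = c(-30, 30), tol = 1e-12)$root
  odds <- exp(logOdds0) * relOdds
  risk <- odds / (1 + odds)
  # Likelihood ratio = posterior odds / prior odds (Bayes' rule on the odds scale)
  lr <- odds / (p / (1 - p))
  data.frame(copies = 0:2, freq = freq, risk = risk, lr = lr)
}

res <- genotypeLR(or = 1.05, f = 0.05, p = 0.1)
```

Each row of res gives the frequency, disease risk, and likelihood ratio for one genotype; the likelihood ratios of successive genotypes differ by a factor of or.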
Genetic profiles can be simulated given the overall frequency of the risk
allele for a panel of genes. Assuming a known correct risk model, disease
states can be simulated.
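The two simulation steps might look roughly like the sketch below. It assumes genes are independent, so per-gene likelihood ratios multiply, and it uses a simplified normalization of the likelihood ratios (relative odds scaled to a population mean of 1). Both are assumptions made for illustration, and the variable names are not taken from the package.

```r
set.seed(834234)           # fixed seed so the sketch is reproducible
n <- 1000                  # subjects
nGenes <- 50               # genes in the model
or <- 1.05; f <- 0.05; p <- 0.1

# Hardy-Weinberg frequencies for 0, 1, 2 copies of the risk allele
freq <- c((1 - f)^2, 2 * f * (1 - f), f^2)
# Simplifying assumption: scale the relative odds to a population mean of 1
lr <- or^(0:2) / sum(freq * or^(0:2))

# Risk-allele counts (0, 1, 2): subjects in rows, genes in columns
genotypes <- matrix(rbinom(n * nGenes, size = 2, prob = f), nrow = n)

# Genes assumed independent: per-gene log likelihood ratios add up
logLR <- rowSums(matrix(log(lr)[genotypes + 1], nrow = n))
postOdds <- p / (1 - p) * exp(logLR)
risk <- postOdds / (1 + postOdds)

# Disease state drawn as a Bernoulli trial with each subject's own risk
disease <- rbinom(n, size = 1, prob = risk)
```

Repeating the last four steps on the first k columns of genotypes, for each k in nog, would give one risk and one disease column per model size.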
simulateGenotypes returns a list(parms,risk,disease):
parms |
A list of the parameters passed to the function, together with the calculated likelihood ratios and risks for each genotype. This includes calcLR, which holds the input parameters as well as the likelihood ratios for each genotype. |
risk |
A matrix of calculated risks, one for each subject for each size of model |
disease |
A matrix of simulated disease states, one for each subject for each size of model |
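To illustrate how a highRisk cutpoint turns one column of risk and disease into the percentage at high risk, TPR, and FPR, here is a small sketch on made-up data. The vectors are invented toy values, and treating the cutpoint as inclusive is an assumption; summarizeRiskSet may differ in these details.

```r
# Toy values standing in for one column of the risk and disease matrices
risk    <- c(0.05, 0.12, 0.25, 0.30, 0.08, 0.22, 0.15, 0.40)
disease <- c(0,    0,    1,    0,    0,    1,    1,    1)
highRisk <- 0.2

flagged <- risk >= highRisk   # assumption: the cutpoint is inclusive

pctHighRisk <- mean(flagged)                                      # share of subjects flagged
tpr <- sum(flagged & disease == 1) / sum(disease == 1)            # flagged among diseased
fpr <- sum(flagged & disease == 0) / sum(disease == 0)            # flagged among non-diseased

summaryRow <- round(c(pctHighRisk = pctHighRisk, TPR = tpr, FPR = fpr), 3)
```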
Daryl Morris
geneSimulation = simulateGenotypes(or=1.05, f=0.05, p=0.1, nog=c(50,150,250,350), seed=834234)
summarizeRiskSet(geneSimulation, highRisk=.2)

######
# this sample shows use with a genotype profile with varying effects (f/or).
f = c(.05+.005*(1:50), .3+.0005*(1:350))
maxOR = 1.5
or = c(seq(maxOR,1.15,length=20), 1.15-.1/380*(1:380))
p = .1
nog = c(20,50,150)
sg <- simulateGenotypes(or=or, f=f, p=p, n=100000, nog=nog, varyEffects=TRUE, seed=834234)
summarizeRiskSet(sg, highRisk=.2, roundDigits=3)