Sample size for matched case-control and cohort studies

The nature of matching

In case-control studies the ‘cases’ are those who have an outcome of interest. The cases are matched to controls who do not have that outcome. Matching can be 1 case to 1 control, 1 case to >1 control, or >1 case to >1 control. After matching, the investigator looks back in time to uncover prior exposure to a risk factor for the cases and the matched controls, and relates this exposure to outcome (case or control).

In matched cohort studies, members of a group of interest, typically those with a therapeutic intervention or an exposure, are matched, usually using propensity scores, to members of a large pool of candidates without the exposure. Matched sets can vary in their composition of exposed subjects and selected controls, depending on the availability of an adequately large selection pool. Larger selection pools allow for controls that match more closely with members of the exposed/intervention group and can allow for matching with a larger control to exposed ratio. After matching, the investigator merges subject outcome data into the subject data that carry the flag indicating an exposed subject or a matched control.

Both kinds of studies can be prospective or retrospective, and all the records to be matched, along with the selection pool that provides controls, will be available to the analyst. When analyzing matched case-control studies, care must be exercised to firewall the exposure data from the analyst while controls matching cases are being selected. Similarly, the analyst must not have access to outcome data when controls matching the exposed are being selected in matched cohort studies (see Levenson and Yue, 2013;23(1):110-21). Only the ancillary information necessary to select similar controls needs to be available initially. This separation of patient selection from the analysis of the relationship between exposure and outcome deflects criticism that inferences about that relationship were gamed through the choice of controls.

Sample size and power for matched case-control studies

The sample size and power for a score test in the context of a conditional logistic regression model, testing for a non-zero log odds ratio against a null hypothesis of a zero log odds ratio, are provided in Lachin (2008 Jun 30;27(14):2509-23). This null is the same as a value of 1.0 for the ratio of the odds of a case (outcome of interest) given exposure to the odds of a case/outcome given no exposure. Lachin derives the expression for the conditional logistic regression model likelihood as a function of observed exposure, notes that its form is identical to that of the Cox discrete proportional hazards model, uses this familiar structure to derive the null and non-null distributions of the score statistic, and then derives expressions for the sample size and power. Some key expressions from Lachin are in the attached document. The sample size and power are provided for continuous and binary exposure variables.

Example of Conditional Logistic regression for matched data

The following example is provided as part of the documentation for SAS v 9.2 at the following link for a case-control matched study. Similar analyses, among other methods, are recommended in Sjolander (Statistical Science, 2012, Vol. 27, No. 3, 395–411) for cohort studies. The data in the SAS example consist of cases of endometrial cancer (Outcome = 1 below) and a prognostic factor of gall bladder disease (Gall below). A 1 to 1 matching was done and each case and control pair was assigned an ID. SAS code for conducting the conditional logistic regression is reproduced below:

proc logistic data=Data1;
 strata ID;
 model outcome(event='1')=Gall;
run;

The output includes the Score statistic and p-value, as well as the Wald statistic and associated p-values, odds ratios and confidence intervals. A confidence interval consistent with the score statistic and p-value can be computed using inversion methods described in Agresti (Statistics in Biopharmaceutical Research, 2011, Vol. 3, No. 2).

We will make the point shortly that the sample size and power assessments for the matched case-control context map easily to the matched cohort context on re-framing the hypothesis to be tested to a form similar to that in the case-control context. We will note the equivalence of predicting outcome using exposure to assessing the degree to which exposure group membership associates with future outcome. You can note this exact equivalence by running the following code switching outcome and exposure and checking computed inferential statistics:

proc logistic data=Data1;
 strata ID;
 model Gall(event='1')=outcome;
run;
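
For 1 to 1 matching with a binary factor, the symmetry can also be seen by hand, because the conditional maximum likelihood estimate of the odds ratio reduces to a ratio of discordant pair counts. The Python sketch below is only an illustration: the records are made up (they are not the SAS example data), and the function name paired_conditional_or is hypothetical.

```python
# Hypothetical 1:1 matched pairs as (pair_id, outcome, gall).
# These values are illustrative only, not the SAS example data.
records = [
    (1, 1, 1), (1, 0, 0),
    (2, 1, 1), (2, 0, 0),
    (3, 1, 0), (3, 0, 1),
    (4, 1, 1), (4, 0, 1),
    (5, 1, 0), (5, 0, 0),
    (6, 1, 1), (6, 0, 0),
]

OUTCOME, GALL = 1, 2  # tuple positions of the two binary variables


def paired_conditional_or(rows, response_idx, covariate_idx):
    """Conditional odds ratio estimate for 1:1 pairs with a binary
    response and a binary covariate: the ratio of discordant pair counts.
    Raises ZeroDivisionError if one kind of discordant pair is absent."""
    pairs = {}
    for row in rows:
        pairs.setdefault(row[0], []).append(row)
    favourable = unfavourable = 0
    for first, second in pairs.values():
        # Pairs concordant on either variable carry no information.
        if first[response_idx] == second[response_idx]:
            continue
        if first[covariate_idx] == second[covariate_idx]:
            continue
        responder = first if first[response_idx] == 1 else second
        if responder[covariate_idx] == 1:
            favourable += 1
        else:
            unfavourable += 1
    return favourable / unfavourable


or_outcome_on_gall = paired_conditional_or(records, OUTCOME, GALL)
or_gall_on_outcome = paired_conditional_or(records, GALL, OUTCOME)
print(or_outcome_on_gall, or_gall_on_outcome)
```

Both calls return the same estimate because the same discordant pairs are counted in either direction.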

Sample size and power for matched cohort studies

Note that the predictive ratio of the odds of response/outcome given therapy to the odds of response for a control in a matched cohort setting is identical to the associative ratio of the odds of prior membership in the therapy group given future response to therapy to the odds of membership in the therapy group given the lack of future response. This follows from relationships between conditional, marginal and joint distributions as illustrated below where we start with a ratio of the odds of X given Y to that of X given the complement of Y, and end with the ratio of the odds of Y given X to that of Y given the complement of X.

{P(X|Y)/P(Xc|Y)}/{P(X|Yc)/P(Xc|Yc)}

= [{P(X,Y)/P(Y)}/{P(Xc,Y)/P(Y)}]/[{P(X,Yc)/P(Yc)}/{P(Xc,Yc)/P(Yc)}]
= [P(X,Y)/P(Xc,Y)]/[P(X,Yc)/P(Xc,Yc)]
= [{P(Y|X)*P(X)}/{P(Y|Xc)*P(Xc)}]/[{P(Yc|X)*P(X)}/{P(Yc|Xc)*P(Xc)}]
= [P(Y|X)/P(Y|Xc)]/[P(Yc|X)/P(Yc|Xc)]

= {P(Y|X)/P(Yc|X)}/{P(Y|Xc)/P(Yc|Xc)}

Hence the conditional logistic regression formulation predicting outcome using exposure, and the one assessing whether exposure group membership associates with future outcome, yield identical odds ratios and inferences. The latter parallels the structure in the case-control developments in Lachin (2008). On the LHS of the conditional logistic model for this formulation we have the exposure group flag, consisting of matchees and matches, very much like the case/outcome flag with cases and matches on the LHS of the model for the matched case-control scenario. The RHS holds outcome in the cohort formulation and exposure in the case-control formulation. Thus all expressions derived for case-control studies apply when testing against the null hypothesis of a unit ratio of the odds of response/outcome given therapy to the odds of response in a matched control.
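
The identity between the two odds ratios can be checked numerically. The Python sketch below uses an arbitrary joint distribution over X and Y (the four probabilities are assumptions chosen only for illustration) and computes the odds ratio both ways from the conditional probabilities.

```python
from math import isclose

# Arbitrary joint probabilities, chosen only for illustration.
p_xy, p_xcy, p_xyc, p_xcyc = 0.30, 0.20, 0.15, 0.35  # P(X,Y), P(Xc,Y), P(X,Yc), P(Xc,Yc)


def odds_ratio_x_given_y(p_xy, p_xcy, p_xyc, p_xcyc):
    """Odds of X given Y divided by odds of X given Yc."""
    p_y = p_xy + p_xcy
    p_yc = p_xyc + p_xcyc
    odds_given_y = (p_xy / p_y) / (p_xcy / p_y)
    odds_given_yc = (p_xyc / p_yc) / (p_xcyc / p_yc)
    return odds_given_y / odds_given_yc


def odds_ratio_y_given_x(p_xy, p_xcy, p_xyc, p_xcyc):
    """Odds of Y given X divided by odds of Y given Xc."""
    p_x = p_xy + p_xyc
    p_xc = p_xcy + p_xcyc
    odds_given_x = (p_xy / p_x) / (p_xyc / p_x)
    odds_given_xc = (p_xcy / p_xc) / (p_xcyc / p_xc)
    return odds_given_x / odds_given_xc


or_xy = odds_ratio_x_given_y(p_xy, p_xcy, p_xyc, p_xcyc)
or_yx = odds_ratio_y_given_x(p_xy, p_xcy, p_xyc, p_xcyc)
print(or_xy, or_yx, isclose(or_xy, or_yx))
```

Both functions reduce to the cross-product ratio P(X,Y)*P(Xc,Yc) / {P(Xc,Y)*P(X,Yc)}, which is why they agree for any joint distribution with non-zero cells.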

Extension to continuous variables

The results extend to a continuous exposure variable in the case-control context and to continuous outcome data in cohort studies. Sample size and power are provided using the ratio of odds due to a change in the variable by one standard deviation (SD). Odds ratios for other increments can be obtained from this odds ratio. Alternatively, the power and sample size can be evaluated using the difference in means of the distribution of the variable in the groups being compared. The calculator we provide at this page, described starting with the discrete version in the next section, computes power and sample size using the odds ratio for the 1 SD change.

The discrete calculator

The calculator has two tabs, one for discrete factors and one for continuous factors. In the discrete tab we consider a cohort study where we have 70 subjects receiving an investigational therapy through a single arm oncology trial. We have a large number of candidates with similar characteristics that we can use to match these 70 subjects on the new therapy. We anticipate a 40% response rate in the controls receiving other therapy and 60% in those given the agent in the single arm study. We have some limited ability to increase the power of this study by choosing a larger number of matched controls. The default scenario 1 in the tab shows data for matches involving 1 treated to 1 control, scenario 2 has 2 treated to 3 controls in each matched set and scenario 3 has 1 treated to 2 controls. All three scenarios use a two sided significance level of 0.05 and the same total of 70 treated. In these three scenarios we compute the power of rejecting the hypothesis of no difference between therapy and control as 66.96%, 89.54% and 78.92% respectively, for total sample sizes of 140, 175 and 210. This is an unusual finding, with an elevated power for scenario 2 despite a lower total number of patients than scenario 3.
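
The total sample sizes quoted above, and the odds ratio implied by the anticipated response rates, follow from simple arithmetic. The Python sketch below reproduces them; the function names are hypothetical, and the sketch does not reproduce the power values, which come from the expressions in Lachin (2008) as implemented in the calculator.

```python
from math import isclose


def total_sample_size(n_treated, treated_per_set, controls_per_set):
    """Total patients when n_treated are split into matched sets of fixed
    composition. Assumes n_treated is divisible by treated_per_set."""
    n_sets = n_treated // treated_per_set
    return n_sets * (treated_per_set + controls_per_set)


def odds_ratio(p_treated, p_control):
    """Ratio of the odds of response on therapy to the odds on control."""
    return (p_treated / (1 - p_treated)) / (p_control / (1 - p_control))


# (treated per set, controls per set) for scenarios 1, 2 and 3.
scenarios = [(1, 1), (2, 3), (1, 2)]
totals = [total_sample_size(70, t, c) for t, c in scenarios]
effect = odds_ratio(0.60, 0.40)
print(totals, effect)
```

With 70 treated, the three scenarios use 70, 35 and 70 matched sets respectively, and the 60% versus 40% response rates correspond to an odds ratio of 2.25.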

Relative power of the conditional logistic regression versus unconditional alternatives

The extra sensitivity of the conditional logistic regression arises in this second scenario because there are multiple treated subjects and multiple controls in each matched set. The score statistic, in this scenario, evaluates the effect in each matched set by comparing the number of responses among the two patients actually exposed to the agent of interest with the average number of responses over all arbitrary selections of 2 ‘treated’ from the 5 patient (2 treated + 3 controls) matched set. Such effects are then aggregated over all matched sets to get the final test statistic. Note that there are 10 ways of selecting 2 ‘treated’ from each 5 patient matched set, while for scenario 1 there are just 2 ways of selecting 1 patient from a 2 patient matched set and for scenario 3 there are 3 ways of picking 1 from a set of 3. The larger number of possible selections, together with a ‘treated’ set of size 2 in scenario 2 rather than 1 in the other two scenarios, leads to a more informative assessment of the effect and its expectation in each matched set in scenario 2. Hence we have a more powerful statistical comparison when we have multiple treated with one or more controls within the same matched set, despite a smaller total number of patients.
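
The counting and averaging described above can be sketched in Python. The matched set below is hypothetical (the response pattern is made up), and the function names are illustrative; the sketch shows only the per-set observed minus expected quantity, not the full score statistic or its variance.

```python
from itertools import combinations
from math import comb


def expected_responses_among_treated(responses, n_treated):
    """Average number of responses over every possible selection of
    n_treated 'treated' patients from the matched set."""
    selections = list(combinations(range(len(responses)), n_treated))
    total = sum(sum(responses[i] for i in chosen) for chosen in selections)
    return total / len(selections)


def set_contribution(responses, treated_positions):
    """Observed responses among the actual treated minus the average
    over all arbitrary selections of the same number of 'treated'."""
    observed = sum(responses[i] for i in treated_positions)
    expected = expected_responses_among_treated(responses, len(treated_positions))
    return observed - expected


# Number of possible 'treated' selections in scenarios 2, 1 and 3.
ways = (comb(5, 2), comb(2, 1), comb(3, 1))

# Hypothetical 2 treated + 3 controls set; 1 = response, 0 = no response.
responses = [1, 1, 0, 1, 0]
treated_positions = (0, 1)  # the first two patients are the actual treated
contribution = set_contribution(responses, treated_positions)
print(ways, contribution)
```

In this made-up set both treated patients respond, while the average over the 10 possible selections is 2 * 3/5 = 1.2 responses, so the set contributes about 0.8 to the aggregated effect.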

In the box of the calculator below that for the conditional logistic regression, we also present the power for an unconditional logistic regression (one that omits match ID as strata in the SAS code provided above). Note that the power of the unconditional test is identical to that of the conditional test for 1 to m matching for all m >= 1. Matching many treated to one or many controls therefore appears to be the setting where the conditional logistic regression dominates unconditional testing. In this setting, however, it is necessary to note that the score statistic relies on a strong exchangeability assumption for patients within a matched set, which drives the appropriateness of the averaging mentioned above over all possible arbitrary selections of ‘treated’ within matched sets. It will usually be difficult to claim exchangeability between elements of a small group such as the 70 patients treated in our example. Usually the pool of patients to select controls from is much larger. In the example here we might use 500 or more candidates to find the 140 controls that we need for the 1 + 2 matched sets. Matching in cohort studies is usually done using propensity scores, and the matching heuristic may end up matching with no more than, say, a 4% difference in propensity probabilities between elements of a matched set. Small matching calipers are more likely between treated and controls and across controls, and less likely across members of the treated group, unless there is also a selection from a larger candidate pool of treated patients.
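
As a rough illustration of caliper-based matching on propensity scores, the Python sketch below greedily assigns each treated patient up to two controls whose propensity differs by no more than 0.04. This is a minimal sketch under stated assumptions, not the matching heuristic of any particular package: the function name, the identifiers and the scores are hypothetical, and the caliper is enforced only between a treated patient and each of its controls, not between the controls themselves.

```python
def greedy_caliper_match(treated, pool, controls_per_treated=2, caliper=0.04):
    """Greedy 1 to m matching without replacement.

    treated, pool: dicts mapping a patient id to a propensity score.
    Returns a dict mapping each treated id to its list of control ids,
    which may be shorter than controls_per_treated if the pool runs dry.
    """
    available = dict(pool)
    matches = {}
    # Process treated patients in order of increasing propensity score.
    for t_id, t_score in sorted(treated.items(), key=lambda item: item[1]):
        candidates = sorted(
            (abs(score - t_score), c_id)
            for c_id, score in available.items()
            if abs(score - t_score) <= caliper
        )
        chosen = [c_id for _, c_id in candidates[:controls_per_treated]]
        for c_id in chosen:
            del available[c_id]  # each control is used at most once
        matches[t_id] = chosen
    return matches


# Hypothetical propensity scores.
treated = {"T1": 0.30, "T2": 0.55}
pool = {"C1": 0.29, "C2": 0.33, "C3": 0.40, "C4": 0.56, "C5": 0.70, "C6": 0.52}
matched_sets = greedy_caliper_match(treated, pool)
print(matched_sets)
```

Candidates C3 and C5 are left unmatched because they fall outside the caliper of both treated patients, which is the behaviour a larger selection pool is meant to make affordable.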

Lachin (2008) provides examples of a number of studies in the case-control context where the composition of each matched set varies, allowing for a number of sets with multiple cases and multiple controls along with sets which are 1 to many or many to 1. Expressions for power for such scenarios are provided in Lachin (2008) as well. The calculator we provide also computes power in this more general context, and we will describe this functionality in the next section about the continuous case calculator. Allowing for a variable composition, as in such matched sets, increases the power of the conditional logistic regression based score test. For our cohort study example, this will likely allow some justifiable many to many matches among mostly 1 to 1 or 1 to 2 matches, leading to increased power compared to fixed 1 to m matching. It is likely that a matching which is not constrained to a fixed matched set composition is more natural, and will lead to lower variation in matching distances between members within matched sets, providing a stronger justification for the use of the conditional test. This will, however, not allow for a simple demonstration of the balance achieved on baseline factors through the matching: one would need to provide baseline summaries of patients weighted by the size of the group in the matched set. Further, fixed 1 to many matching allows for simpler unconditional methods such as a test for unit odds ratios or the chi-squared test (see bottom-most box of calculator), where it is not necessary to argue strong exchangeability within matched sets. Either variable or fixed matching may be used, provided it is specified and completed prior to inferential analysis relating outcome to exposure.

The continuous calculator

The continuous case is considered in the second tab of the calculator. The default example used in the first two columns of the calculator is provided in the Lachin article. This calculator provides the sample size and power associated with a ratio of the odds of being in the outcome group at a higher level of a continuous exposure variable to that at a level one standard deviation (SD) lower (case-control studies), or the ratio of the odds of being in the exposure group at a higher level of a continuous outcome variable to that at a level lower by one standard deviation (cohort studies). In scenario 1 in this tab of the calculator, the odds of having cardiovascular disease (CVD) are considered likely to be 1.39 fold higher when the bio-marker measuring the soluble form of the inter-cellular adhesion molecule-1 (sICAM-1) increases by one standard deviation. To arrive at this ratio of odds one could start with the likely CVD rate at the mean value of the bio-marker less half a SD and the likely CVD rate at the mean + 0.5*SD. If these are Pl and Ph then the odds ratio to use in the calculator is obtained as {Ph/(1-Ph)}/{Pl/(1-Pl)}. The odds ratio for any other increment c is given by (OR)^(c/SD), where OR is the odds ratio per SD change. For an odds ratio OR1 per unit increase, the odds ratio OR for an increase of one SD is given by OR = OR1^SD. In scenario 1 we see that 125 cases matched in 1 to 2 sets, to a total of 250 controls, will provide 85% power in a 0.05 level two sided test. Note the 95.68% power, using the conditional logistic test, with a lower total of cases and controls of 250 in matched 5+5 sets (scenario 2). In contrast there is 74% power for an unconditional test.

Scenario 3 looks at data with matched sets with varied case and control make-ups. To compute power in such a context one needs to enter data in the third tab. The default data in this calculator consists of 13 unique matched configurations in 17 matched sets. This data was entered in 13 lines in the third tab of the spreadsheet, with each line reporting the number of cases, the number of controls and the number of such combinations. In this case-control example in Lachin (2008) the cases were incidences of low birth weight infants. The factor to be related to this outcome was maternal body weight, having a standard deviation of 32. The odds ratio per unit change in maternal weight was expected to be 0.986. Using the expression in the previous paragraph, the odds ratio per SD change to be entered in our calculator is 0.986^32 = 0.637. For this scenario, we have 89.8% power to detect this ratio of the odds of having a low birth weight infant to the odds of having a normal weight infant, with increases in maternal body weight. This is based on the use of a conditional logistic regression derived score test at a two-sided 0.05 level. The unconditional test has 78.1% power.
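
The odds ratio conversions used in this section can be sketched in a few lines of Python. The function names are hypothetical, and the rates passed to odds_ratio_from_rates are made-up values for illustration; only the 0.986 per unit odds ratio, the SD of 32 and the 1.39 per SD odds ratio come from the examples above.

```python
def odds_ratio_from_rates(p_low, p_high):
    """{Ph/(1-Ph)}/{Pl/(1-Pl)}: the odds ratio per SD when p_low and p_high
    are the event rates at mean - 0.5*SD and mean + 0.5*SD."""
    return (p_high / (1 - p_high)) / (p_low / (1 - p_low))


def odds_ratio_for_increment(or_per_sd, increment, sd):
    """Odds ratio for a change of `increment` units: OR^(c/SD)."""
    return or_per_sd ** (increment / sd)


def odds_ratio_per_sd(or_per_unit, sd):
    """Odds ratio for a one SD change from a per unit odds ratio: OR1^SD."""
    return or_per_unit ** sd


# Maternal weight example: 0.986 per unit, SD of 32.
or_sd = odds_ratio_per_sd(0.986, 32)
print(round(or_sd, 3))

# Converting back to a one unit increment recovers the per unit value.
print(odds_ratio_for_increment(or_sd, 1, 32))

# sICAM-1 example: odds ratio for half a SD (SD set to 1 for illustration).
print(odds_ratio_for_increment(1.39, 0.5, 1.0))

# Made-up rates at mean - 0.5*SD and mean + 0.5*SD.
print(odds_ratio_from_rates(0.10, 0.15))
```

The two power-law conversions are inverses of each other, which is a quick way to check a value before entering it in the calculator.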

The expressions for power for the continuous and discrete conditional logistic regressions with fixed and variable matched sets, as well as that for the unconditional test and for the Chi-squared test are provided in this attached document.

Edit the blue cells in the spreadsheet and enter your data and the calculations in the spreadsheet will refresh.
