T-Test for Medical Record Review Validation

The NCQA HEDIS® Compliance Audit™ includes a standardized protocol to validate the integrity of the medical record review processes of audited health plans. The protocol consists of a process review of the credentials, training, and oversight of medical record reviewers, as well as the training materials, abstraction and data entry tools, and the application of inter-rater reliability and/or rater-to-standard tests. An additional component of the process review is the sampling and validation of medical records and abstraction forms that were counted as numerator positives. Prior to the 2000 HEDIS audit cycle, the standardized protocol lacked a statistical method for comparing the results of the numerator positive validation to a consistent standard. NCQA now provides all Licensed Audit Organizations with a T-Test developed to make that comparison.

T-Test Description – Hypothesis Testing

A hypothesis is simply an assumption that one wants to be able to verify. The verification is usually stated in terms of acceptance or rejection. One normally constructs the hypothesis so that there is an alternative to adopt if the hypothesis proves unsupportable. The first hypothesis is called the null hypothesis, and it states that nothing is different from what it is supposed to be, is claimed to be, or has been in the past. The null hypothesis is assumed to be true unless there is strong evidence to reject it. The alternative position, called the alternate hypothesis, is the one accepted if the null hypothesis is rejected.

If one wants to know whether or not s/he can accept the hypothesis, a test must be constructed.

T-Test Description – Test Development and Formula

The logic of this T-test is to statistically test the difference between the plan’s estimate of the positive rate and the audited estimate of the positive rate. The null hypothesis is that there is not more than a 5% difference between the rates, and the alternate hypothesis is that there is more than a 5% difference. If the test reveals that the difference is greater than 5%, then the null hypothesis is rejected and, consequently, so is the plan’s estimate of the positive rate.

To understand the test, it is first necessary to define some terms and explain why they are used. P, the final reported rate, is an estimate of the true overall positive rate of a hybrid measure. The administrative data positive rate is P1. The remaining proportion (1 – P1) is the administrative negatives, which constitute the medical record review component. P2 is the false negative rate among administrative negatives. From these definitions one can construct the formula for the overall final rate: P = P1 + (1 – P1) * P2 (i.e., the administrative positive rate plus the proportion of administrative negatives that are false negatives). Although P2 is unknown, the plan has developed an estimate of it through the medical record review process; call this estimate P3. One more term, P4, is needed to complete the formula: the agreement rate among the sample of records chosen for validation (positive validations / sample size for validation). Based on the medical record review and the validation of 30 records, the audited estimate of P2 is P3 * P4.

As an additional clarification, it should be noted that the medical record review positive rate also could be considered to be an estimate of the false negative rate among administrative negatives. This leads to the following list of variables that will be considered:

Let P = true overall positive rate = P1 + (1 – P1) * P2


P1 = Administrative Positive Rate = (# of Administrative Positives / n)

P2 = False Negative Rate = (# of True Positives among Administrative Negatives) / (# of Administrative Negatives)

n = sample size

Plans will have an estimate of P2, defined above as P3 (determined by medical record review), and auditors will have an estimate of P2 equal to P3 * P4 (which will be used in the validation).


P3 = Plan’s estimate of the False Negative Rate (or)

= (# of MRR Positives among Administrative Negatives in sample) / (# of Administrative Negatives in sample)

P4 = Validated Agreement Rate (or)

= (# of Validated Positives in a sample of MRR Positives) / n4

where n4 is the audit sample size (n4 = 30 in most cases).
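As an illustration of the definitions above, the following minimal Python sketch computes P1, P3, P4 and the two overall rates from raw counts. The function name, argument names, and the example counts are hypothetical and are not taken from the NCQA spreadsheet.

```python
def overall_rates(admin_pos, n, mrr_pos, validated_pos, n4=30):
    """Return (p1, p3, p4, plan_rate, audited_rate) from raw counts.

    admin_pos     -- administrative positives in the hybrid sample
    n             -- hybrid sample size
    mrr_pos       -- MRR positives among the administrative negatives
    validated_pos -- validated positives in the audit sample of MRR positives
    n4            -- audit sample size
    """
    admin_neg = n - admin_pos
    if admin_neg <= 0:
        raise ValueError("no administrative negatives: P3 is undefined")
    p1 = admin_pos / n              # administrative positive rate
    p3 = mrr_pos / admin_neg        # plan's estimate of P2
    p4 = validated_pos / n4         # validated agreement rate
    plan_rate = p1 + (1 - p1) * p3          # plan's estimate of P
    audited_rate = p1 + (1 - p1) * p3 * p4  # audited estimate of P
    return p1, p3, p4, plan_rate, audited_rate


# Hypothetical counts: 411 sampled, 200 administrative positives,
# 100 MRR positives, 27 of 30 audited records validated.
p1, p3, p4, plan_rate, audited_rate = overall_rates(200, 411, 100, 27)
print(plan_rate, audited_rate, plan_rate - audited_rate)
```

With these made-up counts the plan's rate is 300/411 and the audited rate is 290/411, so the two differ by 10/411.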

Next, it is necessary to test whether the plan’s estimate of the overall positive rate is higher than the validated estimate of the overall positive rate by more than .05:

Is {P1 + (1 – P1) * P3} – {P1 + (1 – P1) * P3 * P4} > .05 ?

In terms of hypothesis testing, there is the null hypothesis, H0 and the alternate hypothesis, HA:

H0: {P1 + (1 – P1) * P3} – {P1 + (1 – P1) * P3 * P4} ≤ .05

HA: {P1 + (1 – P1) * P3} – {P1 + (1 – P1) * P3 * P4} > .05

That is, it is assumed that the difference is less than or equal to .05, unless the null hypothesis is rejected.
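The difference being tested can be simplified, which may make the later formula easier to follow. Writing the plan's estimate as P1 + (1 – P1) * P3 and the audited estimate as P1 + (1 – P1) * P3 * P4, the P1 terms cancel:

```latex
\begin{align*}
\bigl[P_1 + (1 - P_1)P_3\bigr] - \bigl[P_1 + (1 - P_1)P_3 P_4\bigr]
  &= (1 - P_1)P_3 - (1 - P_1)P_3 P_4 \\
  &= (1 - P_1)\,P_3\,(1 - P_4)
\end{align*}
```

The hypotheses are therefore statements about whether the product (1 – P1) * P3 * (1 – P4) exceeds .05, and this product is the quantity that the T statistic compares against .05.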

P1 and P3 are treated as fixed quantities (i.e., the estimate of P4 is conditional on the values of P1 and P3). Therefore, it is only necessary to consider the sampling error associated with P4. The sampling error is present because the plan develops only one hybrid sample (usually 411) for the measure, and the validation sample of 30 records is drawn from this hybrid sample. The sampling error is accounted for in the T-test by the inclusion of the standard error calculation in the denominator of the formula. Consequently, there may be instances in which the difference between the plan’s estimated rate and the auditor’s estimated rate is greater than 5% and the null hypothesis is nevertheless not rejected.

Please note that this does not compromise the integrity of the T-test, which was developed so one can be 95% confident that the test is giving the right result.

The T-test formula is calculated as:

T = {(1 – P1) * P3 * (1 – P4) – .05} / {(1 – P1) * P3 * [P4 * (1 – P4) / n4]^(1/2)}

Note: This statistic is undefined under the following circumstances:

If P1 = 1 then there are no administrative negatives (nothing to test).

If P3 = 0 then the plan’s estimate of the false negative rate is zero (nothing to test).

If P4 = 1 then the audited rate is identical to the plan’s rate (don’t reject).

If P4 = 0 then an exact distribution is needed.
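The formula and its undefined cases can be sketched in Python as follows. This is only an illustration of the formula as written above, without the finite population correction used in the auditors' spreadsheet. The function name, the example rates, and the critical value are assumptions: the source states only the 95% confidence level, so the value 1.699 (one-sided, alpha = .05, 29 degrees of freedom for n4 = 30) is a guess at how the comparison might be made.

```python
import math

# Assumption: one-sided critical value for alpha = .05 with n4 - 1 = 29
# degrees of freedom. The source gives only the 95% confidence level.
T_CRITICAL = 1.699


def mrr_t_statistic(p1, p3, p4, n4=30, threshold=0.05):
    """Return the T statistic, or None where the statistic is undefined."""
    if p1 >= 1 or p3 <= 0:
        return None  # no administrative negatives, or no MRR positives
    if p4 <= 0 or p4 >= 1:
        return None  # standard error is zero: formula cannot be evaluated
    scale = (1 - p1) * p3
    difference = scale * (1 - p4)                      # plan rate minus audited rate
    std_error = scale * math.sqrt(p4 * (1 - p4) / n4)  # sampling error of P4 only
    return (difference - threshold) / std_error


# Hypothetical rates: 200 of 411 administrative positives,
# 100 MRR positives among the 211 administrative negatives.
p1, p3 = 200 / 411, 100 / 211

t_low_agreement = mrr_t_statistic(p1, p3, 15 / 30)   # 15 of 30 validated
t_borderline = mrr_t_statistic(p1, p3, 23 / 30)      # 23 of 30 validated

print(t_low_agreement, t_low_agreement > T_CRITICAL)
print(t_borderline, t_borderline > T_CRITICAL)
```

In the second made-up case the difference between the two rates is slightly above .05, yet T stays below the assumed critical value, so the null hypothesis would not be rejected. This is the situation described earlier in which sampling error keeps a difference above 5% from triggering rejection.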

The final formula to derive the test statistic, as presented in a spreadsheet format for use by auditors, was modified by reducing and renaming some of the terms above, and by including the finite population correction factor. However, the basic elements and the relationships remain unchanged.