NCQA Comments on eMeasure Certification RFI

NCQA feedback on the Request for Information on Certification Frequency and Requirements for the Reporting of Quality Measures under CMS Programs.

January 27, 2016

Centers for Medicare & Medicaid Services
Department of Health and Human Services

Attention: CMS-3323-NC

The National Committee for Quality Assurance (NCQA) thanks the Centers for Medicare & Medicaid Services (CMS) for the opportunity to comment on the Request for Information on Certification Frequency and Requirements for the Reporting of Quality Measures under CMS Programs. NCQA applauds the effort to streamline electronic reporting and reduce burden on providers and health information technology (IT) developers.

Refining certification criteria for health IT, as well as electronic clinical quality measures (eCQMs), will enhance provider capabilities to report accurate, timely, and high-quality data used in CMS’ value-based payment programs. This effort will become more important as the Administration begins implementing changes to Medicare payments according to the statutory requirements of the Medicare Access & CHIP Reauthorization Act of 2015 (MACRA). NCQA offers more specific RFI comments below:

Frequency of Certification

It is NCQA’s experience that, due to frequent updates to medical codes and measure specifications, virtually all measures change every year. Although a small number of measures may remain the same, additional testing and recertification of all measures is necessary to ensure accuracy. Certifying annually for new or updated measures may burden some vendors; however, many testing systems (like NCQA’s) are completely automated.

Burden is greatly reduced when the testing process uses an automated approach, and when testing covers all measures every year. Manual data entry does have value for testing usability, but NCQA’s experience with measure validation within software logic is primarily based on automation. Automating electronic measure validation has led us to a standard format that is flexible, consistent and comprehensive, and has greatly reduced the burden involved in the testing and measure validation process.

NCQA also believes that, for the purposes of payment, all stakeholders would benefit from alignment across measure development and testing cycles. There should also be alignment of measures with the same clinical focus. The current landscape of asynchronous cycles and disparate versions of testing tools not only creates unnecessary administrative burden but also produces confusing results that are not comparable. NCQA believes that aligning these cycles and standardizing testing tools will improve the quality of results for measures used in value-based payment and reduce the administrative burden of quality reporting.

CQM Testing and Certification

What changes to testing are recommended (or are not recommended) to increase testing robustness?

With respect to calculating CQMs, NCQA believes that a thorough testing program tests each measure individually, with one test deck for every measure. Each test deck should contain sufficient records to test every possible situation specified by the CQM. For example, there should be several records to account for the various permutations of sex, age, date range, diagnosis, procedure, etc. It’s also important to include records for incorrect cases to ensure that the target system can select the correct records and cases accurately from the complete set of data.
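As an illustration only, and not a description of NCQA’s own tooling, the idea of covering every permutation (including records that should be rejected) can be sketched as follows. The measure, its age range, the diagnosis labels, and all function names are hypothetical.

```python
from itertools import product

# Hypothetical dimensions for an illustrative measure: patients aged 18-75
# with a qualifying diagnosis seen during the measurement period.
# None of these values come from a real CQM specification.
SEXES = ["F", "M"]
AGES = [17, 18, 45, 75, 76]          # boundary values on both sides of 18-75
DIAGNOSES = ["QUALIFYING", "OTHER"]  # placeholder labels, not real codes
IN_PERIOD = [True, False]            # encounter inside the measurement period?


def expected_in_denominator(age, diagnosis, in_period):
    """Answer key for the illustrative measure."""
    return 18 <= age <= 75 and diagnosis == "QUALIFYING" and in_period


def build_test_deck():
    """Return one record per permutation, tagged with its expected result."""
    deck = []
    for sex, age, dx, in_period in product(SEXES, AGES, DIAGNOSES, IN_PERIOD):
        deck.append({
            "sex": sex,
            "age": age,
            "diagnosis": dx,
            "in_period": in_period,
            "expected": expected_in_denominator(age, dx, in_period),
        })
    return deck


deck = build_test_deck()
correct_cases = [r for r in deck if r["expected"]]
incorrect_cases = [r for r in deck if not r["expected"]]
print(len(deck), len(correct_cases), len(incorrect_cases))
```

Even this toy measure with four dimensions yields 40 records, most of which are cases the target system must exclude; real specifications with more dimensions grow accordingly.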

How could CMS and ONC determine how many test cases are needed for adequate test coverage?

As stated above, at a minimum, there should be hundreds or even thousands of records or cases for each measure to account for the various permutations of correct and incorrect results (e.g., wrong diagnoses, ages, dates of service). The size of the test deck should encompass these minimums.

Are there recommendations for the format of test cases that could be entered both manually and electronically?

While manual data entry has value in terms of testing the usability of a system, NCQA’s experience with measure validation within software logic is primarily based on leveraging automation. We believe that automating electronic measure validation has led us to a standard format that is flexible, consistent and comprehensive. Test deck file formats are important only in that they contain all the required data for the tester to load and use. Consistency is important to ensure that all participants can share information, particularly CQM testing output. We believe that the CCDA standard meets the necessary requirements as a format to support automated testing of encoded software logic within any HIT system.

What kind of errors should constitute warnings rather than test failures?

In a good testing program, there are only test case failures and successes. If good data are loaded, calculated, and reported, there should be no warnings. Input errors should be fixed to ensure all test data are used, calculations are exact, and output data are correct and formatted appropriately. NCQA believes that only the “mapping” of real data from the source into the files that are used for calculation is open to individual variations. In other words, the workflow in one EHR may differ from that in another, and each requires specific data mapping to ensure that the files used for calculation include the right diagnosis codes. Only this mapping process (outside of testing) is open to variation and not “testable.” Testing the software does not ensure accurate mapping of data at the implementation site; that would require an additional level of validation. In addition to validating the software logic, NCQA uses an audit process to review data mappings. This process verifies the integrity of data to ensure accurate measure results.
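A minimal sketch of strict pass/fail scoring, assuming an answer key and a vendor response that are both simple dictionaries of counts, might look like this. The structure and names are hypothetical and are not NCQA’s actual file format.

```python
def score_test_case(expected, reported):
    """Return 'pass' only on an exact match; there is no warning state."""
    return "pass" if expected == reported else "fail"


def score_deck(answer_key, response):
    """Score every element of the answer key against the vendor response.

    A value missing from the response is treated as a failure, not a warning.
    """
    return {
        element: score_test_case(expected, response.get(element))
        for element, expected in answer_key.items()
    }


# Hypothetical counts for one measure.
answer_key = {"denominator": 6, "numerator": 4, "exclusions": 1}
response = {"denominator": 6, "numerator": 5}  # wrong numerator, no exclusions

results = score_deck(answer_key, response)
measure_passed = all(outcome == "pass" for outcome in results.values())
print(results, measure_passed)
```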

Are there recommendations for or against single measure testing?

NCQA strongly recommends single measure testing. Most measures are far too complex and specific to be tested with global cases. We also believe that test decks for each measure should be generated on demand without limit; i.e., if a target system fails one deck, a different and unique deck is supplied for the next attempt. NCQA typically creates 5-10 test decks for each measure to account for the number of times a potential target system may need to be tested to pass each measure specification without error.
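One way to supply a different, unique deck for each attempt is to derive each deck from a seed built from the measure and the attempt number. The sketch below assumes a simple synthetic record layout; the measure identifier, field names, and deck size are hypothetical.

```python
import random


def generate_deck(measure_id, attempt, size=200):
    """Build a deck that is reproducible for a given (measure, attempt) pair
    but differs from the decks issued for other attempts."""
    rng = random.Random(f"{measure_id}:{attempt}")  # string seeds are deterministic
    return [
        {
            "record_id": i,
            "age": rng.randint(0, 100),
            "sex": rng.choice("FM"),
        }
        for i in range(size)
    ]


first_attempt = generate_deck("EXAMPLE-MEASURE", attempt=1)
retest = generate_deck("EXAMPLE-MEASURE", attempt=2)
```

Because each deck is reproducible from its seed, the tester can regenerate the exact deck a vendor received when investigating a disputed result.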

How could the test procedures and certification companion guides published by ONC be improved to help you be more successful in preparing for and passing certification testing?

Our experience trying to follow the CMS implementation guides indicates that there is a good deal CMS could do to improve the process. It would benefit CMS to have a new vendor walk through the eCQM implementation process and to observe how often that vendor finds incorrect, conflicting, or confusing information about the same measures. This exercise would help CMS and vendors understand which steps of the implementation process need the most improvement.

How can the CQM certification process be made more efficient and how can the certification tools and resources be augmented or made more usable?

In our 15+ years of validating electronic quality measures, we have found that although there is an initial “lift” (burden) for the vendor’s target system to consume and program a new measure, each successive measure becomes substantially less burdensome. Because our system is completely automated, the process of consuming a data file takes minutes, compared to the current manual measure data input that could take hours. For example, once a data parser and consumer are built, a vendor could test a measure within minutes of loading a test deck in our measure validation system. Feedback to the vendor is real-time when the response file is loaded. We strongly encourage CMS to consider such a system if more robust testing is pursued.

What, if any, adverse implications could the increased certification standards have on providers?

NCQA believes that rigorous testing can only improve the consistency of data collection and reporting, leading to accurate and comparable information that improves the health care rendered. All efforts to ensure correct data and results are worthwhile if they yield valid, comparable results and better care.

What levels of testing will ensure that providers and other product purchasers will have enough information on the usability and effectiveness of the tool without unduly burdening health IT developers?

NCQA believes that only thorough, consistent measure-level testing results in a good product that is effective and usable for quality improvement efforts, gaps-in-care analysis, and external reporting of results.

Would flexibility on the vocabulary codes allowed for test files reduce burden on health IT

Test cases should consistently include, and test for, the vocabulary codes established in the CQM. As stated above, NCQA believes that only the “mapping” of real data from the source into the files that are used for calculation is open to individual variations; e.g., mapping a LOINC code in the EHR system to a SNOMED code in the value set. This mapping process (outside of testing) is open to variation and not “testable,” but should be assessed.
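As a rough sketch of what assessing a mapping could involve, assuming the site’s mapping is available as a simple lookup table, the check below separates local codes that map into the value set from those that are unmapped or map outside it. All codes shown are placeholders, not real LOINC or SNOMED codes, and the function name is hypothetical.

```python
# Hypothetical site mapping from local EHR codes to value set codes.
# Placeholder identifiers only; these are not real LOINC or SNOMED codes.
LOCAL_TO_VALUE_SET = {
    "LOCAL-LAB-001": "VS-CODE-A",
    "LOCAL-LAB-002": "VS-CODE-B",
    "LOCAL-LAB-003": "VS-CODE-Z",  # maps to a code outside the value set
}
VALUE_SET = {"VS-CODE-A", "VS-CODE-B", "VS-CODE-C"}


def assess_mapping(local_codes):
    """Classify each local code as mapped, unmapped, or mapped outside the value set."""
    mapped, unmapped, out_of_value_set = {}, [], []
    for code in local_codes:
        target = LOCAL_TO_VALUE_SET.get(code)
        if target is None:
            unmapped.append(code)
        elif target not in VALUE_SET:
            out_of_value_set.append(code)
        else:
            mapped[code] = target
    return mapped, unmapped, out_of_value_set


mapped, unmapped, out_of_value_set = assess_mapping(
    ["LOCAL-LAB-001", "LOCAL-LAB-003", "LOCAL-LAB-999"]
)
print(mapped, unmapped, out_of_value_set)
```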

What are other ways in which the Cypress testing tool could be improved?

As we stated in our comments on the CMS RFI regarding the testing frequency of measures, Project Cypress is not an adequate system for testing measures implemented within target systems. Our experience with Project Cypress confirmed that it is a sufficient tool to help measure and software developers during the development phases of eCQMs; however, the system is neither robust nor scalable enough to meet the level of rigor necessary to adequately test the logic of complex eCQM specifications.

When 45 CFR 170.315(c)(1) requires users to export quality measure data on demand, how would you want that to be accessed by users and what characteristics are minimally required to make this feature useful to end users?

NCQA believes that certifying a measure means not only that the vendor can calculate the measure as written, but also that the vendor can let users modify the “filters.” For example, if the measure requirement is to report all patients seen in a calendar year who are >65 and had HTN, the vendor should allow the user to change the period, the ages, and the criteria for HTN for QI projects.
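The adjustable filters described above could be modeled as parameters kept separate from the calculation logic. This is a minimal sketch under that assumption; the class, field names, and patient records are hypothetical and do not reflect any certified product.

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)
class MeasureFilter:
    """Hypothetical filter parameters for a measure run."""
    period_start: date
    period_end: date
    min_age_exclusive: int  # patients must be older than this
    condition: str


def select_patients(patients, f):
    """Return IDs of patients seen in the period who meet the age and condition filters."""
    return [
        p["id"]
        for p in patients
        if f.period_start <= p["seen_on"] <= f.period_end
        and p["age"] > f.min_age_exclusive
        and f.condition in p["conditions"]
    ]


# The measure as written: seen in the calendar year, >65, with HTN.
AS_WRITTEN = MeasureFilter(date(2015, 1, 1), date(2015, 12, 31), 65, "HTN")
# A QI variant: same logic, different period and age threshold.
QI_VARIANT = replace(AS_WRITTEN, period_start=date(2015, 7, 1), min_age_exclusive=50)

patients = [
    {"id": 1, "age": 70, "seen_on": date(2015, 3, 1), "conditions": {"HTN"}},
    {"id": 2, "age": 60, "seen_on": date(2015, 9, 1), "conditions": {"HTN"}},
    {"id": 3, "age": 80, "seen_on": date(2015, 10, 1), "conditions": {"DM"}},
]

print(select_patients(patients, AS_WRITTEN))
print(select_patients(patients, QI_VARIANT))
```

Keeping the certified definition as an immutable default and deriving variants from it means QI users can explore other periods and thresholds without altering the measure used for reporting.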

Thank you again for the opportunity to provide feedback on this important issue. If you have any questions, please reach out to Joe Castiglione, Federal Affairs, at or (202) 955-1725.



Rick Moore, PhD
Chief Information Officer
