Suboptimal reliability of liver biopsy evaluation has implications for randomized clinical trials

Beth A Davison; Stephen A Harrison; Gad Cotter; Naim Alkhouri; Arun Sanyal; Christopher Edwards; Jerry R Colca; Julie Iwashita; Gary G Koch; Howard C Dittrich

doi:10.1016/j.jhep.2020.06.025

J Hepatol. 2020 Dec;73(6):1322-1332. doi: 10.1016/j.jhep.2020.06.025. Epub 2020 Jun 28.

Affiliations

¹ Momentum Research, Inc., Durham, NC, USA. Electronic address: bethdavison@momentum-research.com.
² Hepatology, Radcliffe Department of Medicine, University of Oxford, UK.
³ Momentum Research, Inc., Durham, NC, USA.
⁴ Texas Liver Institute, San Antonio, TX, USA.
⁵ Virginia Commonwealth University School of Medicine, Richmond, Virginia, USA.
⁶ Cirius Therapeutics, Inc., San Diego, CA, USA.
⁷ University of North Carolina, Chapel Hill, NC, USA.

PMID: 32610115
DOI: 10.1016/j.jhep.2020.06.025

Abstract

Background & aims: Liver biopsies are a critical component of pivotal studies in non-alcoholic steatohepatitis (NASH), constituting inclusion criteria, risk stratification factors and endpoints. We evaluated the reliability of NASH Clinical Research Network scoring of liver biopsies in a NASH clinical trial.

Methods: Digitized slides of 678 biopsies from 339 patients with paired biopsies randomized into the EMMINENCE study, which examined a novel insulin sensitizer (MSDC-0602K) in NASH, were read independently by 3 hepatopathologists blinded to treatment code and scored using the NASH Clinical Research Network (NASH CRN) histological scoring system. Various endpoints were computed from these scores.

Results: Inter-reader linearly weighted kappas were 0.609, 0.484, 0.328, and 0.517 for steatosis, fibrosis, lobular inflammation, and ballooning, respectively. Inter-reader unweighted kappas were 0.400 for the diagnosis of NASH, 0.396 for NASH resolution without worsening fibrosis, and 0.366 for fibrosis improvement without worsening NASH. Of the patients included in the study on the basis of 1 hepatopathologist's qualifying reading, 46.3% were deemed not to meet the study's histologic inclusion criteria by at least 1 of the 3 hepatopathologists. The MSDC-0602K treatment effect was lowest for those histologic features with lower inter-reader reliability. Simulations show that the lack of reliability of endpoints and inclusion criteria can drastically reduce study power, from >90% in a well-powered study to as low as 40%.
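For readers unfamiliar with the statistic, the following is a minimal sketch of how a linearly weighted (or unweighted) kappa between two readers can be computed from ordinal scores. It assumes the standard two-rater Cohen's kappa; the abstract does not state how agreement across the 3 hepatopathologists was combined, so this is not necessarily the authors' method. The function name `weighted_kappa` and the example scores are hypothetical and are not data from the study.

```python
from collections import Counter


def weighted_kappa(reader_a, reader_b, n_categories, weights="linear"):
    """Cohen's kappa for two readers scoring the same biopsies.

    Scores are integers 0..n_categories-1 (e.g. fibrosis stage 0-4).
    weights="linear" penalizes a disagreement in proportion to its
    distance; weights="none" treats every disagreement equally.
    Illustrative sketch only, not the analysis code used in the paper.
    """
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("readers must score the same non-empty set of biopsies")
    if n_categories < 2:
        raise ValueError("need at least two categories")

    def disagreement(i, j):
        if weights == "linear":
            return abs(i - j) / (n_categories - 1)
        return 0.0 if i == j else 1.0

    n = len(reader_a)
    joint = Counter(zip(reader_a, reader_b))   # observed score pairs
    marg_a = Counter(reader_a)                 # reader A marginal counts
    marg_b = Counter(reader_b)                 # reader B marginal counts

    # Mean observed disagreement across biopsies.
    observed = sum(disagreement(i, j) * c for (i, j), c in joint.items()) / n
    # Disagreement expected by chance if the readers were independent.
    expected = sum(
        disagreement(i, j) * marg_a[i] * marg_b[j]
        for i in range(n_categories)
        for j in range(n_categories)
    ) / (n * n)
    if expected == 0:
        raise ValueError("kappa is undefined when both readers use one category")
    return 1.0 - observed / expected


# Hypothetical scores on a 4-level scale (0-3) for six biopsies.
a = [0, 1, 2, 3, 1, 2]
b = [0, 1, 2, 3, 2, 1]
kappa_linear = weighted_kappa(a, b, 4, weights="linear")
kappa_plain = weighted_kappa(a, b, 4, weights="none")
```

In this toy example the two disagreements are each one category apart, so the linearly weighted kappa is higher than the unweighted one; the weighted form is the natural choice for ordinal features such as steatosis or fibrosis, while binary endpoints such as NASH resolution call for the unweighted form.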

Conclusions: The reliability of hepatopathologists' liver biopsy evaluation using currently accepted criteria is suboptimal. This lack of reliability may affect NASH pivotal studies by introducing patients who do not meet NASH study entry criteria, misclassifying fibrosis subgroups, and attenuating apparent treatment effects.
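The attenuation of apparent treatment effects can be illustrated with a short calculation. The sketch below assumes a binary histologic endpoint that is misread with the same sensitivity and specificity in both arms (nondifferential misclassification) and uses a normal-approximation power formula for comparing two proportions. This is a simplification chosen for illustration; the abstract does not describe the authors' simulation design, and every number and function name here is hypothetical rather than taken from the study.

```python
from math import sqrt
from statistics import NormalDist


def observed_rate(true_rate, sensitivity, specificity):
    """Response rate as recorded by an imperfect reader."""
    return true_rate * sensitivity + (1.0 - true_rate) * (1.0 - specificity)


def power_two_proportions(p_ctrl, p_trt, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    p_bar = (p_ctrl + p_trt) / 2.0
    se_null = sqrt(2.0 * p_bar * (1.0 - p_bar) / n_per_arm)
    se_alt = sqrt(
        p_ctrl * (1.0 - p_ctrl) / n_per_arm + p_trt * (1.0 - p_trt) / n_per_arm
    )
    return z.cdf((abs(p_trt - p_ctrl) - z_alpha * se_null) / se_alt)


# Hypothetical true response rates and reader accuracy (assumptions).
true_ctrl, true_trt = 0.20, 0.40
sens, spec = 0.70, 0.80
n_per_arm = 100

obs_ctrl = observed_rate(true_ctrl, sens, spec)
obs_trt = observed_rate(true_trt, sens, spec)

# The observed difference shrinks by the factor (sens + spec - 1).
true_diff = true_trt - true_ctrl
obs_diff = obs_trt - obs_ctrl

power_true = power_two_proportions(true_ctrl, true_trt, n_per_arm)
power_obs = power_two_proportions(obs_ctrl, obs_trt, n_per_arm)
```

Under these assumptions the observed between-arm difference is the true difference multiplied by (sensitivity + specificity - 1), so any reader error short of perfect accuracy shrinks the apparent effect and, for a fixed sample size, the power to detect it.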

Lay summary: Since liver biopsy analysis plays such an important role in clinical studies of non-alcoholic steatohepatitis, it is important to understand the reliability of hepatopathologist readings. We examined both inter- and intra-reader variability in a large data set of paired liver biopsies from a clinical trial. We found very poor inter-reader and modest intra-reader reliability. This result has important implications for entry criteria, fibrosis stratification, and the ability to measure a treatment effect in clinical trials.

Keywords: Diabetes Mellitus, Type 2; Histology; Insulin Resistance; Non-alcoholic fatty liver disease; Validation studies.

Publication types

Clinical Trial, Phase II
Randomized Controlled Trial
Research Support, Non-U.S. Gov't

MeSH terms

Acetophenones / pharmacology
Biopsy* / methods
Biopsy* / standards
Diabetes Mellitus, Type 2* / diagnosis
Diabetes Mellitus, Type 2* / drug therapy
Disease Progression
Female
Humans
Hypoglycemic Agents / pharmacology
Insulin Resistance
Liver / pathology*
Liver Cirrhosis / etiology
Liver Cirrhosis / pathology*
Male
Middle Aged
Non-alcoholic Fatty Liver Disease / complications
Non-alcoholic Fatty Liver Disease / pathology*
Non-alcoholic Fatty Liver Disease / therapy
Prognosis
Reproducibility of Results
Research Design
Risk Assessment / methods
Thiazolidinediones / pharmacology

Substances

Acetophenones
Hypoglycemic Agents
MSDC-0602
Thiazolidinediones