DIAGNOSTIC AND PROGNOSTIC PERFORMANCE OF BLOOD TESTS
PATIENTS AND METHODS
The study protocol conformed to the ethical guidelines of the current Declaration of Helsinki. All patients included in the cross-sectional and longitudinal cohorts gave written informed consent to participate.
Cross-sectional cohort
The purpose of this cohort was to evaluate and compare the diagnostic accuracy of the non-invasive fibrosis tests, and to develop the fibrosis classifications.
Patients
Patients with biopsy-proven NAFLD were consecutively included from January 2004 to June 2014 at Angers University Hospital and from October 2003 to April 2014 at Bordeaux University Hospital. NAFLD was defined as liver steatosis on liver biopsy after exclusion of concomitant steatosis-inducing drugs, excessive alcohol consumption (>210 g/week in men or >140 g/week in women), chronic hepatitis B or C infection, and histological evidence of other concomitant chronic liver disease. Patients were excluded if they had cirrhosis complications (ascites, variceal bleeding, systemic infection, or hepatocellular carcinoma).
Liver biopsy
In each centre, pathological examination was performed by a senior expert specialized in hepatology and blinded to patient data. Liver fibrosis was evaluated according to the NASH CRN scoring system [3]: F0 = no fibrosis; F1 = perisinusoidal or portal/periportal fibrosis; F2 = perisinusoidal and portal/periportal fibrosis; F3 = bridging fibrosis; and F4 = cirrhosis. Significant fibrosis was defined as F≥2, advanced fibrosis as F≥3, and cirrhosis as F4. Because previous longitudinal studies have demonstrated that liver-related prognosis is impaired when advanced fibrosis occurs [6, 7, 8, 13], and as recommended by the latest EASL guidelines [20], we chose advanced F≥3 fibrosis as our primary diagnostic target.
Blood fibrosis tests
Fasting blood samples were taken the day of or within the week preceding liver biopsy. Eight blood fibrosis tests were calculated according to published or patented formulas: NFS [12], BARD [21], FibroMeterS [22], APRI [23], FIB4 [24], Fibrotest [25], Hepascore [26], and FibroMeterV2G [27]. BARD, NFS and FibroMeterS were specifically developed for liver fibrosis assessment in NAFLD, whereas the 5 other tests were developed in patients with chronic viral hepatitis. FibroMeterV2G and Hepascore include both direct and indirect markers of liver fibrosis, whereas the 6 other blood tests include only indirect markers. NFS was interpreted according to published cut-offs [12]: patients with NFS results >0.676 are considered as having advanced fibrosis, those with NFS <-1.455 as having F0-2 stages, and those between the 2 cut-offs as having an indeterminate diagnosis (grey zone). All blood assays were performed in the laboratories of the Angers or Bordeaux centres. We have previously demonstrated the excellent inter-laboratory reproducibility of blood fibrosis tests [28].
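The three-zone interpretation of NFS described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the returned labels are hypothetical, and only the two cut-offs (-1.455 and 0.676) come from the text. Values exactly equal to a cut-off fall in the grey zone, because the text uses strict inequalities.

```python
def interpret_nfs(nfs: float, low: float = -1.455, high: float = 0.676) -> str:
    """Map an NFS result to one of the three diagnostic zones.

    The default cut-offs are the published ones quoted in the text;
    the function name and labels are illustrative assumptions.
    """
    if nfs > high:
        return "advanced fibrosis"
    if nfs < low:
        return "F0-2"
    return "indeterminate (grey zone)"


# Hypothetical NFS results, for illustration only
for value in (-2.10, 0.00, 1.30):
    print(value, "->", interpret_nfs(value))
```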
Liver stiffness measurement
In each centre, LSM with Fibroscan was performed using the standard M probe by an experienced observer blinded to patient data. LSM was performed on the day of liver biopsy or within 3 months before or after it. Examination conditions were those recommended by the manufacturer [29]. LSM was stopped when 10 valid measurements were recorded, and the result (in kilopascals, kPa) was expressed as the median of these valid measurements.
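As a small illustration of how the LSM result is derived, the sketch below takes the median of 10 valid measurements. The function name and the measurement values are hypothetical; the device computes this internally.

```python
from statistics import median


def lsm_result_kpa(valid_measurements_kpa):
    """Return the LSM result as the median of exactly 10 valid measurements (kPa)."""
    if len(valid_measurements_kpa) != 10:
        raise ValueError("the examination stops once 10 valid measurements are recorded")
    return median(valid_measurements_kpa)


# Hypothetical series of 10 valid measurements, in kPa
example = [5.8, 6.1, 6.4, 6.0, 7.2, 6.6, 5.9, 6.3, 6.8, 6.2]
print(lsm_result_kpa(example))
```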
Longitudinal cohort
The purpose of this prognostic cohort was to validate the clinical significance of the fibrosis classifications previously developed in the cross-sectional cohort. All NAFLD patients seen between January 2005 and December 2009 in the Hepatology Department of the Angers University Hospital for a non-invasive evaluation of liver fibrosis were retrospectively included. Follow-up started on the day of the non-invasive testing and ended on November 15th, 2014. The date and cause of death were obtained from the computerized National Registry of Individuals (CepiDC-Inserm, France). For patients who could not be individually matched with the national registry, mortality data were obtained from the hospital database or from the patient's general practitioner.
Statistical analysis
In the cross-sectional cohort, the diagnostic accuracy of fibrosis tests was evaluated using the classical indexes for binary diagnostic targets: AUROC, rate of well-classified patients at the cut-off with the highest Youden index (which maximizes the sum of sensitivity and specificity), and the rate of patients included in the intervals of ≥90% negative or positive predictive values (for the latter, see Supplementary Material for precise definitions). The diagnostic accuracy of fibrosis tests was also evaluated using the Obuchowski index [30]. The Obuchowski index is a multinomial version of the area under the receiver operating characteristic curve (AUROC) adapted to ordinal references such as pathological fibrosis staging. With N (=5: F0 to F4) categories of the gold standard outcome, AUROCst denotes the AUROC of the diagnostic test for differentiating between categories s and t. The Obuchowski measure is a weighted average of the N(N-1)/2 (=10) different AUROCst corresponding to all the pairwise comparisons between two of the N categories. In addition, the Obuchowski measure was assessed using a penalty function proportional to the difference in fibrosis stages, i.e., a penalty of 1 when the difference between stages was 1, 2 when the difference was 2, 3 when the difference was 3, and 4 when the difference was 4. Finally, the result can be interpreted as the probability that the non-invasive test will correctly rank two randomly chosen patients with different fibrosis stages.
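The pairwise construction of the Obuchowski measure can be sketched as below. This is a simplified illustration under stated assumptions, not the estimator of reference [30]: each AUROCst is weighted only by the penalty |s - t| and the sum is normalized by the total weight, whereas the published method may also weight by stage prevalence and provides a variance estimate. All function names are hypothetical.

```python
from itertools import combinations


def pairwise_auroc(scores_lower_stage, scores_higher_stage):
    """Probability that a patient of the higher stage has the higher test
    result; ties count for one half (empirical AUROC for one pair of stages)."""
    wins = 0.0
    for low in scores_lower_stage:
        for high in scores_higher_stage:
            if high > low:
                wins += 1.0
            elif high == low:
                wins += 0.5
    return wins / (len(scores_lower_stage) * len(scores_higher_stage))


def obuchowski_sketch(stages, scores):
    """Penalty-weighted average of all pairwise AUROCst.

    Assumption: the weight of each (s, t) pair is the penalty |s - t| only.
    """
    by_stage = {}
    for stage, score in zip(stages, scores):
        by_stage.setdefault(stage, []).append(score)

    weighted_sum = 0.0
    total_weight = 0.0
    for s, t in combinations(sorted(by_stage), 2):
        penalty = abs(t - s)  # 1 for adjacent stages, up to 4 for F0 vs F4
        weighted_sum += penalty * pairwise_auroc(by_stage[s], by_stage[t])
        total_weight += penalty
    return weighted_sum / total_weight


# Hypothetical data: fibrosis stages (0 to 4) and test results
stages = [0, 0, 1, 2, 3, 4, 4]
scores = [0.10, 0.30, 0.20, 0.50, 0.45, 0.80, 0.90]
print(round(obuchowski_sketch(stages, scores), 3))
```

With five stages present, the loop visits the 10 pairs mentioned in the text; a test that ranks every pair of stages perfectly yields 1.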
In the longitudinal cohort, the prognostic accuracy of fibrosis tests was evaluated using the C-index of Harrell, as previously described [31]. Briefly, the Harrell C-index is an extension of the AUROC for time-to-event (survival) data and evaluates the concordance between the predicted risk of event and the observed survival time. Its result varies from 0 to 1: 1 indicates perfect concordance (discriminative power of the risk score), 0.5 indicates random prediction, and a value less than 0.5 indicates discrimination in the opposite direction to that expected. Survival curves were determined using the Kaplan-Meier method and compared with the log-rank test.
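The concordance idea behind the Harrell C-index can be illustrated with the minimal sketch below. It is not the SAS/SPSS procedure used in the study: it assumes that a pair of patients is usable only when the patient with the shorter follow-up had the event, it ignores pairs with tied follow-up times, and it counts tied risk scores as one half. The function name and the data are hypothetical.

```python
def harrell_c_sketch(times, events, risk_scores):
    """Share of usable patient pairs in which the patient who had the event
    earlier also had the higher predicted risk."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the "earlier event" of a pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pair of patients")
    return concordant / usable


# Hypothetical follow-up (years), event indicator (1 = death), and test result
times = [2.0, 4.0, 6.0, 8.0]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.5, 0.1]
print(harrell_c_sketch(times, events, risk))
```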
Statistical analyses were performed using SPSS version 18.0 software (IBM, Armonk, NY, USA) and SAS 9.1 (SAS Institute Inc., Cary, NC, USA). This study was reported in accordance with the recently published LiverFibroSTARD statements [32].
RESULTS
Cross-sectional cohort: diagnostic accuracy of blood fibrosis tests and LSM
Patients
The flow chart of the cross-sectional study is depicted in Figure 1. A total of 588 patients were included, 243 in the Angers centre and 345 in the Bordeaux centre. Patient characteristics at inclusion are detailed in Table 1. Mean biopsy length was 26±12 mm. Failure of LSM with no valid measurement occurred in 83 patients (14.1%). The median LSM result in the 505 remaining patients was 9.1 kPa (1st quartile: 6.4 kPa; 3rd quartile: 13.9 kPa). Finally, LSM and all 8 blood tests were available in 452 patients (core group). The prevalence of histological fibrosis stages was not significantly different between the core group and the 136 other patients.
Comparison of the diagnostic accuracy of the non-invasive fibrosis tests
AUROC – Blood tests and LSM were directly compared in the core group where all tests were available. For the diagnosis of advanced F≥3 fibrosis, the primary diagnostic target of the study, FibroMeterV2G had a significantly higher AUROC (0.817±0.020) than the 7 other blood tests (p≤0.025; Table 2, see Supplementary Table s1 for detailed pairwise comparisons). LSM had a significantly higher AUROC for advanced fibrosis (0.831±0.019, p≤0.041) than blood tests, except when compared to FibroMeterV2G (p=0.559).
Binary diagnosis of advanced fibrosis – The best diagnostic cut-off for advanced fibrosis was calculated for each fibrosis test as the cut-off with the highest Youden index, which maximizes the sum of sensitivity and specificity. LSM, FibroMeterV2G and Hepascore provided the highest rate of well-classified patients using this cut-off (around 73%, Table 3). The negative predictive values of the fibrosis tests were fairly high, ranging from 76% to 90%, but the positive predictive values were insufficient, reaching no more than 63% for the best test.
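The selection of a single cut-off by the Youden index (J = sensitivity + specificity - 1) can be sketched as follows. This is an illustrative implementation on hypothetical data: the function name, the convention that a result at or above the cut-off is read as positive, and the tie-breaking rule (lowest cut-off among equal J values) are assumptions, not details given in the study.

```python
def youden_cutoff(scores, advanced):
    """Return the cut-off with the highest Youden index, along with the
    sensitivity, specificity and rate of well-classified patients at that
    cut-off. `advanced` holds True for F>=3 at biopsy."""
    n_pos = sum(1 for a in advanced if a)
    n_neg = len(advanced) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        # Assumption: test result >= cutoff is read as "advanced fibrosis"
        tp = sum(1 for s, a in zip(scores, advanced) if a and s >= cutoff)
        tn = sum(1 for s, a in zip(scores, advanced) if not a and s < cutoff)
        sensitivity = tp / n_pos
        specificity = tn / n_neg
        j = sensitivity + specificity - 1
        if best is None or j > best["youden"]:
            best = {
                "cutoff": cutoff,
                "youden": j,
                "sensitivity": sensitivity,
                "specificity": specificity,
                "well_classified": (tp + tn) / len(scores),
            }
    return best


# Hypothetical test results and biopsy findings
scores = [4.1, 5.0, 6.3, 7.9, 8.4, 9.6, 12.0, 15.5]
advanced = [False, False, False, True, False, True, True, True]
print(youden_cutoff(scores, advanced))
```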
Intervals of reliable diagnosis – Because a single diagnostic cut-off provided insufficient diagnostic accuracy, we evaluated whether the fibrosis tests could give an accurate diagnosis in the largest possible proportion of patients. In this setting, 2 diagnostic cut-offs have been published for NFS, one for the exclusion (-1.455) and the other for the affirmation (0.676) of advanced fibrosis. Overall, 32.8% of patients had NFS <-1.455 and 18.2% had NFS >0.676, leaving the remaining 49.0% in the grey zone between these 2 thresholds. Among patients with NFS <-1.455, 83.8% had F0-2 stages at liver biopsy (negative predictive value), and among those with NFS >0.676, 72.6% had advanced fibrosis (positive predictive value). Finally, 41.4% of patients in the grey zone had advanced fibrosis.
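The quantities reported in this paragraph (share of patients in each zone, negative and positive predictive values, and prevalence of advanced fibrosis in the grey zone) can be computed as in the sketch below. The data are hypothetical and the function name is an assumption; only the two NFS cut-offs come from the text.

```python
def two_cutoff_summary(values, advanced, low=-1.455, high=0.676):
    """Summarize a test read with an exclusion cut-off (`low`) and an
    affirmation cut-off (`high`). `advanced` holds True for F>=3 at biopsy."""
    below = [a for v, a in zip(values, advanced) if v < low]
    above = [a for v, a in zip(values, advanced) if v > high]
    grey = [a for v, a in zip(values, advanced) if low <= v <= high]
    n = len(values)

    def share_true(group):
        return sum(1 for a in group if a) / len(group) if group else None

    npv = None if not below else 1 - share_true(below)
    return {
        "rate_below": len(below) / n,
        "rate_above": len(above) / n,
        "rate_grey": len(grey) / n,
        "npv": npv,                          # F0-2 among results below `low`
        "ppv": share_true(above),            # advanced fibrosis above `high`
        "advanced_in_grey": share_true(grey),
    }


# Hypothetical NFS results and biopsy findings
nfs = [-2.0, -1.8, -1.6, 0.0, 0.2, 0.5, 0.9, 1.2]
adv = [False, False, True, False, True, False, True, True]
print(two_cutoff_summary(nfs, adv))
```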
To optimize both negative and positive predictive values, we calculated the thresholds of ≥90% positive or negative predictive values for each fibrosis test. In the core group, the rates of patients included in the grey zone between the 2 calculated thresholds, i.e., those for whom both negative and positive predictive values were <90%, were: BARD: 87.6%, APRI: 87.2%, Fibrotest: 81.9%, NFS: 78.5%, FibroMeterS: 70.1%, Hepascore: 64.8%, FIB4: 64.2%, FibroMeterV2G: 53.3%, LSM: 43.6% (Figure 2). Thus, LSM provided the lowest rate of patients in the grey zone (p≤0.001 vs blood tests). Among blood tests, the rate of patients included in the grey zone was the lowest using FibroMeterV2G (p<0.001 vs the 7 other blood tests). Detailed results and 90% predictive value thresholds are presented in Table s2.
Obuchowski index – Beyond the binary diagnosis of advanced fibrosis, we used the Obuchowski index to evaluate the ability of fibrosis tests to discriminate individual fibrosis stages. Among blood tests, FibroMeterV2G had the highest Obuchowski index (0.798±0.016) with a significant difference compared to the 7 other tests (p≤0.036, Table 2; see Table s1 for detailed pairwise comparisons). LSM had a significantly higher Obuchowski index than blood tests (0.834±0.014, p≤0.001), except when compared to FibroMeterV2G, for which the difference was of borderline significance (p=0.063).
Fibrosis classifications
Fibrosis classifications that give an estimation of the histological fibrosis stage from the non-invasive fibrosis test results have already been developed in chronic hepatitis C [15]. Such classifications are very useful for the correct interpretation of fibrosis test results in clinical practice. However, all published fibrosis classifications are based on the Metavir fibrosis staging and none has been specifically developed for NAFLD using the NASH-CRN scoring system. We thus developed fibrosis classifications for Fibroscan and FibroMeterV2G in NAFLD. We chose these 2 tests because the previous results showed they were the most accurate; in particular, the Obuchowski index analysis suggested they were the best at discriminating individual fibrosis stages.
Details of the methodology used to develop the fibrosis classifications are presented in Supplementary Material. The LSM fibrosis classification included 7 classes (F0/1, F1±1, F1/2, F2/3, F3±1, F3/4, F4) and the FibroMeterV2G one included 6 classes (F1±1, F1/2, F2/3, F3±1, F3/4, F4; Figure 3). The rate of well-classified patients by the LSM and the FibroMeterV2G fibrosis classifications was, respectively, 80.8% vs 77.4% (p=0.190). Discrepancy between the histological fibrosis stage and the fibrosis classification was ≥2 stages in only 2.8% of patients with LSM and 4.0% with FibroMeterV2G (p=0.362).
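To make the two summary figures concrete (rate of well-classified patients and discrepancy of ≥2 stages), the sketch below shows one possible way to compare a histological stage with a fibrosis class. It rests on assumptions that the text does not spell out: a class such as F1±1 is read as covering F0 to F2, F2/3 as covering F2 and F3, a patient is well classified when the biopsy stage belongs to the class, and the discrepancy is the distance from the biopsy stage to the nearest stage of the class. The precise definitions are in the Supplementary Material; all names here are hypothetical.

```python
# Assumed reading of each class label as a set of NASH CRN stages
CLASS_STAGES = {
    "F0/1": {0, 1},
    "F1±1": {0, 1, 2},
    "F1/2": {1, 2},
    "F2/3": {2, 3},
    "F3±1": {2, 3, 4},
    "F3/4": {3, 4},
    "F4": {4},
}


def stage_discrepancy(histological_stage, fibrosis_class):
    """Distance (in stages) between the biopsy stage and the nearest stage
    covered by the class; 0 means the patient is well classified."""
    covered = CLASS_STAGES[fibrosis_class]
    return min(abs(histological_stage - stage) for stage in covered)


def is_well_classified(histological_stage, fibrosis_class):
    return stage_discrepancy(histological_stage, fibrosis_class) == 0


# Hypothetical patients: (biopsy stage, class given by the non-invasive test)
patients = [(2, "F1±1"), (3, "F2/3"), (4, "F1/2"), (1, "F3/4")]
well = sum(is_well_classified(stage, cls) for stage, cls in patients)
large_gap = sum(stage_discrepancy(stage, cls) >= 2 for stage, cls in patients)
print(well / len(patients), large_gap / len(patients))
```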