by Andrea Peirce, Peter Lipsky, MD, and Benjamin D. Schwartz, MD, PhD

In October 2009, the Lupus Research Institute (LRI) convened a meeting of experts in New York to discuss critical issues in the design of clinical trials for new agents to treat systemic lupus erythematosus (SLE) and maximize the likelihood of successful outcomes. Attendees included representatives from academia and clinical practice as well as pharmaceutical and biotech companies developing products for immune-mediated diseases.

Through presentations and discussions, the meeting addressed a number of fundamental issues affecting trial design, including a drive to establish standard methodology for lupus clinical trials, the need for flexibility in trial design for such a heterogeneous disease, regulatory requirements for approval of new agents, and an interest in addressing the effects of any new product in the various clinical settings that may be encountered in practice. The need to adopt a standard set of outcome measures was a related concern, in view of the differing characteristics of the instruments currently used to assess responses. While these measures have been assessed in clinical practice and research studies, their performance in clinical trials remains largely untested.

While falling short of reaching consensus on these matters, participants generated valuable discussion and insights on current practices and suggested potential directions in research into the design of clinical trials in lupus, which should prove useful for future clinical studies. Here are some highlights from the discussion.

Key Challenges

The heterogeneity of lupus poses a significant challenge for clinical trial design. To achieve statistical significance for new therapies, lupus trials must account for the marked heterogeneity in disease manifestations and severity across patient populations, which in some cases may require stratification. Factors to consider include:

Ethnicity. In seeking trial designs likely to yield statistically significant results, researchers may need to consider ethnicity more carefully as a factor that can affect outcomes.

Disease activity levels. Lupus can manifest with different levels and patterns of disease activity, such as relapsing and remitting versus persistently active disease, flare, or more active versus less active disease. Attendees addressed the question of whether there is any scientific, medical, or business rationale to focus trial design on flare prevention versus decreasing persistent disease activity in a patient on standard therapy.

Trial size. The larger the trial, the more likely that a therapeutic effect in a particular ethnic group will be lost in analyses of the whole population. The recent successful BLISS-52 was a large trial, with 90% power to detect differences between different treatments. Could a smaller trial have achieved similar success? With the current outcome measures and disease heterogeneity, are large trials necessary to achieve statistically significant results? Will the need for large trials hamper clinical development?
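The dilution concern raised under trial size can be made concrete with simple arithmetic. The sketch below is illustrative only: the function names and the numbers are hypothetical, and it relies on the usual rule of thumb that, at a fixed variance, the required sample size grows roughly with the inverse square of the effect being detected.

```python
def diluted_effect(responsive_fraction, effect_in_responders, effect_in_others=0.0):
    """Average treatment effect (difference in response rates) in a mixed population."""
    return (responsive_fraction * effect_in_responders
            + (1.0 - responsive_fraction) * effect_in_others)


def relative_sample_size(subgroup_effect, overall_effect):
    """Rough factor by which sample size grows, assuming n scales with 1 / effect**2."""
    return (subgroup_effect / overall_effect) ** 2


# Hypothetical numbers: 25% of enrolled patients gain 20 percentage points,
# and the remaining 75% gain nothing from the drug.
overall = diluted_effect(0.25, 0.20)
factor = relative_sample_size(0.20, overall)
print(f"overall effect: {overall:.3f}, sample size factor: {factor:.0f}x")
```

Under these made-up inputs, an effect confined to one subgroup shrinks considerably when averaged over everyone enrolled, which is one reason a whole-population analysis may need many more patients than a trial restricted to the responsive group.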

Other issues that pose challenges for achieving statistical significance include duration of disease prior to treatment, the specific organ systems involved, comorbidities, and current and past treatments, given that the standard-of-care treatments for lupus are themselves not approved.

Design Flexibility

Clinical trials need an acceptable outcome measure—a reasonable surrogate for an important clinical outcome—for a drug to obtain U.S. Food and Drug Administration (FDA) approval. Design flexibility, including the use of adaptive design, is considered a crucial factor in achieving successful outcomes.

It was noted that, for business reasons, pharmaceutical companies often seek the broadest possible indication(s) for a new product. Companies may tend to avoid smaller, focused trials and favor trials with larger numbers of relatively heterogeneous patients. Of note, although some recent trials enrolled subjects with relatively uniform patient profiles, these trials had little success.

The most successful trials were those of belimumab, which included large numbers of patients, permitted flexibility in background medication for a substantial portion of the trial, and employed a novel composite outcome that involved both measures of therapeutic success and the absence of deterioration. Whether this trial design or similar designs should become the paradigm for future lupus trials was discussed, but no consensus was reached.

Lessons Learned from Recent Trials

Rituximab: Genentech’s Study to Evaluate the Efficacy and Safety of Rituximab in Subjects With ISN/RPS Class III or IV Lupus Nephritis (LUNAR) and Rituximab in Patients with Severe SLE (EXPLORER) trials tested the anti-CD20 monoclonal antibody in two different patient populations: individuals with lupus nephritis and those with active non-renal lupus. Both trials were unsuccessful in achieving their primary endpoint. Their failure may have stemmed at least partly from clinical trial design issues (including designated co-therapy for each of the treatment arms), inclusion criteria (i.e., the patient population studied), or ineffectiveness of the drug.

Because it has been claimed—either anecdotally or from data for early trials that were small or not randomized—that rituximab is effective in some lupus patients, some conference attendees speculated that rituximab may be effective for a subset of patients, possibly those with a CD20-positive B-cell driven pathogenesis, though this subset has not been identified. It was also noted that SLE patients with periodic flares may differ from those with chronically active lupus, SLE patients from different racial groups may exhibit different pathologies, and conditions within a single patient can vary over time.

Attendees discussed the validity or practicality of sub-classifying SLE patients for highly targeted therapies, such as monoclonal antibodies, in future trials, and agreed that more targeted trials for these subgroups might be indicated if researchers can draw clear distinctions among these groups and if companies would be willing to narrow their potential drug markets, at least initially.

Belimumab: In Human Genome Sciences’ (HGS) Study of Belimumab in Subjects with SLE (BLISS-52) phase 3 trial, Benlysta (belimumab), a human monoclonal antibody that inhibits B-lymphocyte stimulator (BLyS), also known as tumor necrosis factor ligand superfamily member 13B (TNFSF13B), proved more effective than placebo in treating people with serologically active SLE.

The success of BLISS-52 may have resulted from a new design based on a post-hoc analysis of data from a disappointing phase 2 trial. In this analysis, HGS researchers identified a subpopulation of patients who had improved with treatment. The patients in this subpopulation had detectable anti-DNA antibodies and shared other characteristics as well: they were, on average, younger, were more likely to be African American, and had higher disease activity, more detectable serum BLyS levels, and higher serum immunoglobulin G. The post-hoc analysis and identification of this subpopulation became the basis for a promising strategy that allowed the drug to demonstrate its efficacy.

In BLISS-52, HGS focused specifically on this seropositive subpopulation, and the conference group agreed that this focus likely contributed to the success of the trial. It was also noted that, in order to retain patients in the trial, HGS allowed participants wide latitude in using other medications and did not mandate a particular steroid dosing schedule. Several attendees noted that this approach may have handicapped belimumab, because some of belimumab’s efficacy might have been masked by the presence of these other medications in both the belimumab and “placebo” arms of the study. Despite this handicap, the trial met its primary outcome measure.

Importantly, the flexible criteria helped in the recruitment and retention of 865 patients in the study, providing a power of more than 90% for the study to achieve a statistically significant difference in its primary outcome measure between the belimumab and control groups. Because BLISS-52 recruited so many patients, the trial was able to show statistical significance despite relatively modest findings: 57.6% of the belimumab-treated patients met the primary composite outcome compared with 43% of the controls.
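The link between sample size and the ability to detect a modest difference can be sketched with a standard normal-approximation power formula for comparing two proportions. This is an after-the-fact illustration that plugs in the response rates quoted above, not a reconstruction of the trial's own power calculation. The function name, the two-sided alpha of 0.05, and the arm sizes are assumptions; in particular, splitting 865 patients evenly across three arms is assumed here and is not stated in the text.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_treated, p_control, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test for two proportions, equal arm sizes."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1.0 - alpha / 2.0)
    p_bar = (p_treated + p_control) / 2.0
    # Standard error under the null hypothesis (pooled) and under the alternative.
    se_null = sqrt(2.0 * p_bar * (1.0 - p_bar) / n_per_arm)
    se_alt = sqrt(p_treated * (1.0 - p_treated) / n_per_arm
                  + p_control * (1.0 - p_control) / n_per_arm)
    return norm.cdf((abs(p_treated - p_control) - z_crit * se_null) / se_alt)


# Response rates are those quoted in the text; both arm sizes are assumptions.
for n_per_arm in (100, 865 // 3):
    power = power_two_proportions(0.576, 0.43, n_per_arm)
    print(f"n per arm = {n_per_arm}: approximate power = {power:.2f}")
```

Under these assumptions the approximation gives power above 90% at the larger arm size and only a little better than even odds at 100 patients per arm, which is consistent with the point that a difference of this size is hard to demonstrate in a small trial.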

The successful HGS trial demonstrated that application of these particular trial design strategies can result in a statistically significant difference between a new agent and placebo on background standard-of-care treatment. The adoption of similar strategies as a general model for other lupus clinical trials was discussed, but no consensus was reached.

Definition of Improvement

Among outcome instruments used in clinical trials, the British Isles Lupus Assessment Group index (BILAG) is based on an intention-to-treat approach according to an extensive series of criteria to classify a patient’s SLE manifestations arising from different organ systems (see Table 1, p. 36). The BILAG score was used as the primary outcome measure in the unsuccessful trials of Genentech’s rituximab and Bristol-Myers Squibb’s abatacept in lupus, and is currently being used as the primary outcome measure in an ongoing trial of EMD Serono’s atacicept (TACI-Ig).

During the meeting, representatives of all three companies described their efforts to compensate for the BILAG’s limitations, including rigorous physician training programs, the use of adjudication panels to review scores to ensure the authenticity of any changes, and the use of simple questionnaires to allow more subjective measures. In the abatacept trial, the physicians’ subjective reports suggested that the drug was working, even though the BILAG scores and related analyses found no evidence of benefit. Other drug trials presented at the meeting also showed significant differences between physician opinions and BILAG outcomes, which reinforced the prevailing view that the BILAG index is not sufficiently sensitive to detect benefit from the new treatments. Particularly problematic may be the use of BILAG B events as outcome measures of lupus flares.

In contrast, HGS’s phase 3 trials of belimumab incorporated a combination of assays, characterized as a composite or anchored index, called the SLE Responder Index, or SRI. The SRI included the Safety of Estrogens in Lupus Erythematosus: National Assessment version of the SLE Disease Activity Index (SELENA-SLEDAI) as a measure of drug efficacy and the BILAG index and a physician’s global assessment as measures of patient deterioration.
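The logic of a composite index of this kind, in which one component must show improvement and the others must show no deterioration, can be sketched as a single yes-or-no rule per patient. The thresholds below follow the commonly cited SRI definition (a SELENA-SLEDAI reduction of at least 4 points, no new BILAG A score and no more than one new BILAG B score, and a physician's global assessment increase of less than 0.3 points) and are assumptions here, since the article does not give them. The `Visit` class and function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    selena_sledai: int  # SELENA-SLEDAI disease activity score
    pga: float          # physician's global assessment (assumed 0-3 scale)


def is_sri_responder(baseline, followup, new_bilag_a, new_bilag_b,
                     min_sledai_drop=4, max_pga_increase=0.3):
    """Return True only if efficacy is shown AND no deterioration is detected."""
    # Efficacy component: enough improvement in SELENA-SLEDAI.
    improved = baseline.selena_sledai - followup.selena_sledai >= min_sledai_drop
    # Deterioration components: BILAG organ-domain flares and PGA worsening.
    no_bilag_worsening = new_bilag_a == 0 and new_bilag_b <= 1
    no_pga_worsening = followup.pga - baseline.pga < max_pga_increase
    return improved and no_bilag_worsening and no_pga_worsening


# Hypothetical patient: SELENA-SLEDAI falls from 10 to 4 with no worsening elsewhere.
print(is_sri_responder(Visit(10, 1.5), Visit(4, 1.5), new_bilag_a=0, new_bilag_b=0))
```

Because all three conditions must hold at once, a patient whose activity score improves but who develops a new severe organ flare is counted as a non-responder, which is how the composite combines a measure of therapeutic success with the absence of deterioration.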

Although the SRI approach worked in BLISS-52, some critics believe that the amalgamation of several outcome measures into one composite may create an uncertain foundation on which to base results. Some attendees believed that using any one of the component measures would have worked just as well.

Design Considerations

Clinical trial design considerations were extensively debated, with no clear conclusions; however, there was agreement on the key considerations for future trial design:

Heterogeneity versus homogeneity: Heterogeneity is inherent in lupus. One question discussed was whether it is advantageous in lupus clinical trials to study a more homogeneous population. In trials with heterogeneous populations, researchers must be careful not to overextend application of results. On the other hand, while study of a homogeneous population may produce “cleaner” results, these results are restricted to the particular population studied, and extrapolation to other populations may not be warranted. It was noted that large heterogeneous populations can be subdivided into smaller, more homogeneous populations that could be studied in separate trials to detect clinical differences between the effects of study drug and placebo.

One participant noted that rheumatoid arthritis (RA) was considered a heterogeneous systemic disease 30 years ago. Today, trial entry is based on the number of active joints, suggesting that heterogeneity may not be important.

Another consideration was activity of disease among patients studied (e.g., more active versus less active patients). A targeted biologic, for example, which treats some features of lupus, might not show benefit in an active patient who is flaring but might show benefit in patients with chronic disease of lower activity, or vice versa. In the phase 2 trial of belimumab, the higher the baseline disease activity, the better the response was over time.

Drug mechanism: The success of the BLISS-52 trial suggests that drug mechanism may be critical to trial design. However, there is also the risk that some drugs might make patients worse. Hypothetically, an agent that is effective for active disease might make a patient with quiescent disease worse. Because lupus may have different pathogenic mechanisms in different people, it may be advantageous to select the trial population based on the drug’s mechanism. Participants at the meeting acknowledged that such a selection might be difficult to accomplish but that the approach should be considered.

The value of “withdrawal” trial designs: In a withdrawal design, all enrolled patients initially receive the drug for a period of time before one group is switched to placebo. Though common in pediatric trials, withdrawal designs are rare elsewhere, partly because of the difficulty of structuring the trials to meet ethical, business, and FDA requirements in a chronic, slow-developing disease like lupus. Withdrawal trials are also difficult to conduct with a new drug because it is best to have some initial evidence that the drug is of some benefit before initiation of a withdrawal trial. In the absence of an approved “gold standard” therapy, it might be difficult to establish such benefit. Moreover, in the absence of a gold standard therapy, superiority trials, rather than equivalency or non-inferiority trials, are required.

New and different clinical design elements: Participants also discussed several other elements that could be used to inform the design of trials, including biomarkers, randomized delayed treatment design (placebo, then drug versus drug, then placebo in different groups), durability studies, and human observational studies.

Return on investment: Attendees agreed that companies sponsoring clinical trials should consider testing SLE drugs for narrower indications, noting that studies of rigorously defined patient subpopulations may offer lower initial financial returns but also present lower risks of outright failure.

Despite these variables and the remaining hurdles in refining outcome measures and design issues, researchers remain optimistic about the future of clinical trials for lupus. It was noted that the first biologic drugs for RA entered clinical trials in the late 1980s, but the first approval did not come until 1999, after a series of false starts and failures. Moreover, even after an effective clinical trial strategy was identified in RA, many products failed to meet the designated outcome measures. By comparison, biologic drugs for lupus entered clinical trials only in the mid-1990s.

Andrea Peirce is editorial and communications director for the S.L.E. Lupus Foundation and Lupus Research Institute in New York. Dr. Lipsky is editor-in-chief of Nature Reviews Rheumatology. Dr. Schwartz is professor of clinical medicine at Washington University School of Medicine in St. Louis.