Evaluating the Outcomes of HIV Disease: Focus on Health Status Measurement

Richard A. Berzon, DrPH, and William R. Lenderking, PhD, are with Abt Associates Clinical Trials, Cambridge, MA. Dr. Berzon is Adjunct Lecturer in the Department of Epidemiology and Public Health, Yale University School of Medicine. Dr. Lenderking is Assistant Professor of Psychiatry, Harvard Medical School.




Human immunodeficiency virus (HIV) is a retrovirus that gradually impairs the immune system by destroying CD4 lymphocytes, which are essential to the body's ability to eliminate infective organisms. As a result of impaired immune function, patients develop the opportunistic infections and malignancies that characterize acquired immune deficiency syndrome (AIDS)--the final stage of HIV infection (see Table 1). It is from these opportunistic infections and malignancies that people with AIDS eventually die. Disease progression is often accompanied by fatigue, pain, cognitive impairment, and psychological and other difficulties, and thus has a profound effect on quality of life.

Recent advances in the understanding of how HIV infects and replicates in cells have led to changes in clinical care. It is currently thought that initiating treatment with a potent combination of antiretroviral drugs (see Table 2) as soon as possible after infection with the virus may be critical to preserving key elements of the immune system (Stephenson, 1998). Because treatments for early infection that have both beneficial and adverse effects are now offered to subjects who exhibit no signs of illness, healthcare providers need to be able to differentiate these effects from one another and from the natural history of the disease to assess treatment strategy effectiveness.

A full appreciation of the impact of therapeutic regimens on patients with HIV disease requires the assessment of a comprehensive set of outcomes. These include clinical disease, laboratory markers, health-related quality of life (HRQL), and economic outcomes, as well as the interrelationships of these variables. Despite the increased use of highly active antiretroviral therapy, or HAART, and associated dramatic health benefits, health status measurement should remain a staple of outcomes research in HIV/AIDS. This is so because health status and HRQL are of vital importance to patients, because health status may be one of the only measures able to detect subtle differences associated with therapies in the same class, and because health status and HRQL predict adherence to therapy.

The purpose of this state-of-the-field overview is to review and highlight existing and newly developed HIV-specific and generic HRQL measures, and to identify psychometrically sound instruments that can be used to assess HRQL in HIV disease for both clinical monitoring and clinical trial research. HRQL instruments (or questionnaires) characterize and measure what persons experience as a result of receiving medical care, and we have examined these instruments with the understanding that there is general agreement on how the term is conceptualized. HRQL refers to subjects' appraisals of their current level of, at minimum, physical, psychological, and social functioning and satisfaction with that functioning compared to their ideal (Levine, 1987). The concept, therefore, is synonymous with subjective health status assessment and points only to those aspects of a person's experience that are affected by health care interventions and disease-related processes.

Selecting an Instrument for Use in HRQL Studies

There have been few direct empirical comparisons of QOL measures to guide one's decision regarding which instrument or scale to select for a given purpose (Burgess et al., 1993). However, there are a number of practical and theoretical considerations which can inform this decision.

First, what is the purpose of collecting these data? For example, is the information needed to evaluate therapies in a clinical trials context; to provide ongoing information about patient status in a clinical, perhaps office-based, environment; or to guide clinical management and decision-making?

Predominant considerations in the clinical trials context might include the level of reliability, the length of the scale, and the coverage of domains that are likely to be affected by changes in health status resulting from disease manifestations and side effects of treatments under study. For tracking patient status over time in a clinical environment (if data were to be included in a chart, for example), coverage of domains that might not be relevant in the clinical trials context--such as the patient's relationship with the physician--ought to be considered. In addition, it might be desirable to be able to compare results against norms; therefore, choosing an instrument that has been used previously in a similar population might be important.

Within the context of clinical management, there are several competing considerations. One is the balance between ease of interpretation and richness of information. In such a context, the physician or other health care provider must be able to extract information rapidly from the instrument. To move beyond simple examination of the content of critical items (an approach whose value should not be overlooked), scoring and perhaps even automated interpretation should be available. In addition, areas that are less relevant to clinical trials than the "usual" QOL domains--spirituality or changes in financial status, for example--might be particularly important in the clinical management context because these might lead to interventions of considerable importance to the patient.

Another set of considerations in choosing a QOL instrument is more practical. Which instruments have been culturally adapted and translated into other languages? Are scoring algorithms readily available? How much experience do site or study personnel have with particular instruments? Do the instruments exist in formats compatible with the approaches to data collection that have been proposed for the study under consideration? To our knowledge, none of the instruments reviewed has been programmed into A-CASI, a computer-based approach to assessment that is increasingly being recognized as a valuable data-gathering tool.

In sum, the choice of a QOL instrument should be finalized only after all the relevant information has been gathered and assessed. A modular approach is recommended, as this encourages a flexible and thoughtful evaluation of the domains and concepts required for the specific situation and purpose under consideration. No existing instrument is suitable for all purposes, and consultation with a quality-of-life expert during the planning stages of study design and database development is advised.

HIV-Specific HRQL Instruments

The HRQL measurement tools we reviewed are presented in Table 3. Nearly all of the instruments are described in depth in the special issue of the journal Quality of Life Research edited by Berzon and Leplege (QOLR 1997;6(6)). All citations refer to that issue unless otherwise noted.

The following brief narratives highlight key information pertaining to the questionnaires identified by the authors of this state-of-the-field review.


Medical Outcomes Study HIV Health Survey (MOS-HIV)

The MOS-HIV is the only instrument reviewed that is explicitly intended to be part of the MOS family of scales (QOLR 1997;6(6):481-493). Wu's approach to development was to add 10 HIV-relevant items from the MOS item pool to the SF-20 based on conversations with patients, clinical trial participants and providers. These include measures of cognitive impairment, energy/fatigue, health distress and quality of life. Additional items--including four additional health perception items and one additional pain item--were subsequently added to increase reliability. Psychometric work with the MOS-HIV has demonstrated that it has adequate internal consistency, and it has been translated into 14 languages. The instrument has limitations, however: it has had limited use in populations of women and injection drug users (IDUs); ceiling effects in certain scales (such as physical functioning) may make it impossible to detect improvements; and although it was originally developed for HIV-infected persons, it remains a generic instrument and needs to be supplemented with additional scales. While these criticisms are specific to this questionnaire, some or all of them also apply to other instruments reviewed in this article.

General Health Self-Assessment

The General Health Self-Assessment was developed by Testa and Lenderking for use in the AIDS Clinical Trials Group (ACTG), and has been used in five large clinical trials (175, 193A, 229, 241, and 286) in over 4,000 patients who represent the spectrum of HIV disease (QOLR 1997;6(6):515-530). The modular approach to developing this scale was also used as the template for the quality-of-life measure used in the Pediatric Late Outcomes Study (ACTG 219). The GHSA was developed in 1991 to address perceived shortcomings in other available measures at the time, and therefore incorporated a symptom distress scale, a brief health care utilization scale, and a different approach to measuring perceived health. The scale was conceived to measure physical, psychological, and role/social functioning, perceived health, health care utilization, and symptom distress. Psychometric analysis in patients with early disease indicated that although the a priori domains were internally consistent and valid, the psychological functioning module could be further sub-divided into factors for the purposes of some analyses. This scale is psychometrically sound and valid, and has been translated into Spanish, French, and Haitian Creole. However, the ACTG Outcomes Committee has recently developed a new symptom checklist and has been working with another QOL scale with fewer items based on the HIV-PARSE; it is unlikely that the GHSA will be proposed for any new ACTG studies, although its psychometric properties suggest that it is potentially useful in a variety of settings.


ACTG SF-21

The ACTG Outcomes Committee has developed a short, 21-item questionnaire based on the HIV-PARSE (QOLR 1997;6(6):536-537). The data reduction approach was designed to preserve the multi-dimensional structure of the larger scale from which it was derived. The internal consistency of the derived scales was lower than that observed in the larger scale, but still acceptable. The scale has been incorporated into ACTG studies of antiretroviral regimens (276, 290, 298, 320, 328, and 368), and into trials for complications of HIV, including wasting (313 and 329), candidiasis (323), cryptosporidiosis (336), and AIDS dementia complex (301), and for MAC prophylaxis (362). The brevity of the scale results in a tradeoff between measurement precision and convenience. Many of the studies listed are ongoing; it remains to be seen whether the ACTG SF-21 will prove to be responsive to treatment effects of the size usually observed in HIV clinical trials.

Health-Related Quality of Life Questionnaire

This measure was developed as an interview by the investigators of the Boston Health Study (Cleary et al., 1993) and has been used successfully as a self-administered questionnaire (Lenderking et al., 1994). To our knowledge, this instrument and the MOS-HIV are the only two measures that have been directly compared empirically (Burgess et al., 1993). The questionnaire includes physical functioning and disability, perceived health, life satisfaction, preferences for resuscitation and aggressive treatment, well-being and fatigue, depression and depressive symptoms, social support, sexual functioning and satisfaction, disease-related symptoms, and desire to live. More than any other measure herein reviewed, this questionnaire represents a battery of scales. To the extent that this is a meaningful distinction, the "battery" approach reflects the combination of existing scales, whereas the modular approach focuses on using measures of particular concepts. This instrument demonstrates very good reliability, and has been shown to be related to a number of meaningful disease-related concepts. The measure of physical functioning explicitly asks about days not in the hospital, a distinction not always made clear in the other measures reviewed. The broad coverage of domains of relevance to patients suggests that this instrument would be very useful in the clinic setting. It is not clear that the symptom list represents the ideal symptom checklist, particularly since a number of new medications have come on the market since the list was originally developed. Although many of the components of the questionnaire have been translated, other parts of it have not.

Quality of Well-Being (QWB) Scale

The QWB Scale is an interviewer-administered, preference-weighted, decision-theory-based measure that summarizes outcomes as a single aggregated quality-adjusted life year (QALY) score (QOLR 1997;6(6):507-514). Three scales of function are combined with a symptom measure to produce a point-in-time expression of well-being that is valued between 0 (for death) and 1.0 (for complete and asymptomatic function). The measurement unit, or utility weight, is multiplied by the time under observation or projected time under observation to obtain a quality-adjusted life year (QALY). The QALY is defined as the amount of time free from symptoms equivalent to one year at the current health state.
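The arithmetic behind the QALY can be illustrated with a minimal sketch. It assumes only what is stated above: a utility weight between 0 and 1.0 multiplied by the time spent in a health state. The function names and the example patient are hypothetical and are not part of the QWB scoring materials, which also cover how the utility weight itself is derived from the function and symptom scales.

```python
def qalys(utility, years):
    """QALYs accrued in one health state: utility weight times time in years."""
    if not 0.0 <= utility <= 1.0:
        raise ValueError("utility weight must lie between 0 (death) and 1.0")
    if years < 0:
        raise ValueError("time under observation must be non-negative")
    return utility * years


def total_qalys(states):
    """Sum QALYs over a sequence of (utility, years) pairs."""
    return sum(qalys(utility, years) for utility, years in states)


# Hypothetical patient: 2 years at a weight of 0.8, then 1.5 years at 0.6.
example_states = [(0.8, 2.0), (0.6, 1.5)]
print(round(total_qalys(example_states), 2))  # 1.6 + 0.9 = 2.5
```

In this illustration, three and a half calendar years in reduced health are valued as equivalent to two and a half years of complete, asymptomatic function.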

The QWB has been used across numerous illnesses for some 20 years; its use in HIV-infected persons was recently evaluated at UC San Diego within a cohort of 400 HIV-positive and 114 HIV-negative males. Validation data from that study reveal associations between the scale and both CD4+ lymphocyte counts and mortality, and between the scale and ratings of neurological and neuropsychological impairment. However, further evaluative testing within the HIV-infected population seems warranted prior to this measure's widespread use within this population.

The HIV QOL (HIV-QL31) Questionnaire

The HIV-QL31 measures the impact of illness experienced by HIV-infected subjects from their own perspective (QOLR 1997;6(6):585-594). The instrument measures a range of health states (including pain, sexual function, self-esteem, life fulfillment and others) that are consistent with mild to moderate illness.

Through a combination of patient interviews and psychometric analysis (including item response methodology and Rasch modeling), a 31-item questionnaire was developed. A single score is calculated by summing the dichotomous (yes/no) responses. The score was found to be highly reliable (Cronbach's alpha of 0.93) and to discriminate with respect to illness severity (presence/absence of CMV infection). Responsiveness to change has not been assessed. The HIV-QL31 is available in English and French.
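For readers less familiar with the internal consistency statistic reported throughout this review, the following minimal sketch computes a summed score from yes/no responses and Cronbach's alpha using the standard formula, alpha = k/(k-1) x (1 - sum of item variances / variance of total scores). The coding of yes as 1 and no as 0, the function names, and the four-item data set are assumptions made for illustration; they are not taken from the HIV-QL31 scoring instructions.

```python
from statistics import variance


def sum_score(responses):
    """Total score for one respondent: the sum of 0/1 item responses."""
    return sum(responses)


def cronbach_alpha(data):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(data[0])
    if n_items < 2 or len(data) < 2:
        raise ValueError("need at least two items and two respondents")
    # Variance of each item across respondents.
    item_variances = [variance([row[i] for row in data]) for i in range(n_items)]
    # Variance of the summed score across respondents.
    total_variance = variance([sum_score(row) for row in data])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses from five people to a four-item yes/no scale.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print([sum_score(row) for row in responses])  # [4, 3, 2, 1, 0]
print(round(cronbach_alpha(responses), 2))    # 0.8
```

Values of about .80 or higher are the standard cited later in this review for group comparisons.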


EORTC Quality of Life Questionnaire (QLQ-C30)

The EORTC QOL Questionnaire is a 30-item, cancer-specific questionnaire developed by the European Organization for Research and Treatment of Cancer; it was specifically designed for use in cancer clinical trials (deBoer et al., 1994). This self-report questionnaire includes nine subscales (physical, role, emotional, cognitive and social functioning, pain, fatigue, nausea and vomiting, and overall quality of life) and six individual items (shortness of breath, sleeping, appetite, diarrhea, constipation, and financial difficulties). A 20-item AIDS module can be added to the core instrument: these items refer to HIV-specific symptoms and symptoms related to treatment with antiretroviral or interferon therapy.

The QLQ-C30 has been employed in two longitudinal AIDS studies. The first involved 156 patients with either asymptomatic HIV infection or AIDS; the second involved 111 asymptomatic, early symptomatic and AIDS subjects. In both studies, the emotional functioning, pain, fatigue and overall QOL subscales exceeded the .80 reliability standard for group comparisons. With respect to known-groups validity, the physical functioning scale distinguished early symptomatic and AIDS subjects. Symptomatic subjects reported poorer emotional functioning than asymptomatic and AIDS patients, and the same was true of the overall QOL subscale. The physical, role, social functioning, fatigue and overall QOL subscales were able to distinguish between patients with high and low Karnofsky scores.

In the study of asymptomatic and AIDS patients, the QLQ-C30 was able to detect changes over time: over a one-year period, patients experienced a decline in physical functioning and an increase in symptoms such as fatigue, pain, shortness of breath, lack of appetite and constipation. In addition, patients' overall QOL declined.

The EORTC QLQ-C30 is a psychometrically sound and reasonably reliable instrument for assessing the HRQL in subjects with HIV infection. Additional research is needed to examine more thoroughly the questionnaire's responsiveness to clinically important changes in patient health status. The QLQ-C30 is available in multiple languages.

Functional Assessment of HIV (FAHI) QOL Instrument (revised)

The FAHI Instrument (version 3) is part of the health-profile-based Functional Assessment of Chronic Illness Therapy (FACIT) measurement system, which has been developed by Cella and colleagues over the past 10 years (QOLR 1997;6(6):572-584). The questionnaires assess generic and disease-specific HRQL for patients with chronic illnesses, including cancer, multiple sclerosis, Parkinson's disease and HIV infection.

Through a combination of factor analysis and Rasch modeling, five subscales reflecting HRQL dimensions within HIV/AIDS were created. Psychometric validation of the instrument is discussed in a recent paper by Peterman et al. (1997). The content of the subscales includes physical well-being (10 items, alpha=0.91); function and global well-being (13 items, alpha=0.86); emotional well-being/living with HIV (10 items, alpha=0.82); social well-being (8 items, alpha=0.73); and cognitive functioning (3 items, alpha=0.75). From both general illness concerns and HIV/AIDS-specific concerns, a total score can be calculated for the FAHI (44 items, alpha=0.91). Construct validity, known-groups validity and sensitivity to change were demonstrated by significant associations between the FAHI and indicators of functional status, psychological symptoms, stress and illness severity. The FAHI appears to be psychometrically sound and can be used within the HIV-infected population now.


AIDS Health Assessment Questionnaire (AIDS-HAQ)

The AIDS Time-Oriented Health Outcome Study (ATHOS) questionnaire was developed for the study of the same name: a longitudinal, observational study of HIV-positive and at-risk patients in the practices of community-based physicians (QOLR 1997;6(6):494-506). The questionnaire, which the researchers labeled the AIDS-HAQ (another questionnaire, the HAQ, measures QOL in patients with arthritis), consists of eight dimensions largely drawn from the MOS: disability, energy, general health, pain, cognitive functioning, mental health (well-being), social functioning, and health distress. In addition, a list of 68 symptoms is included. The internal consistency of the scales ranged from .79 to .89. The disability scale consists of 23 items, although several of these items might be otherwise classified as physical functioning (walking, activities) or cognitive functioning (memory). As expected, various subscales were sensitive to changes when participants transitioned from asymptomatic to symptomatic status or progressed to AIDS. Among those participants whose CD4 count dropped 20% over a 6-month period, there were differences in health status. In summary, the AIDS-HAQ is reliable and responsive to changes in clinical status over time.

Multidimensional QOL Questionnaire for HIV/AIDS (MQOL-HIV)

The MQOL-HIV Questionnaire was recently developed and is designed to provide a comprehensive assessment of HRQL in people who are HIV-positive (QOLR 1997;6(6):555-560). The 40-item instrument measures 10 domains relevant to HIV infection; these include mental health, physical health, physical functioning, social functioning, social support, cognitive functioning, financial status, partner intimacy, sexual functioning and medical care. An overall HRQL score, the MQoL Index, is a weighted composite of two domain scores.

With respect to known-groups validity, the instrument distinguished symptomatic AIDS and asymptomatic HIV-positive cases in overall HRQL and in seven separate HRQL domains in a sample of 216 HIV-infected men and women. In addition, the index was responsive to perceived HRQL changes over 5.5 months in asymptomatic persons. The MQoL was found to be less susceptible to ceiling effects in asymptomatic cases when compared with the MOS SF-20. This instrument is psychometrically sound and can be used in both asymptomatic and symptomatic HIV-infected persons now.
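Ceiling effects, mentioned for several instruments in this review, are commonly summarized as the proportion of respondents who obtain the best possible score on a scale, which leaves no room to register improvement. The sketch below shows one simple way to compute that proportion, along with the corresponding floor proportion. The 0-100 scoring range, the function name, and the sample scores are assumptions chosen for illustration; they do not come from the MQOL-HIV or MOS SF-20 data.

```python
def ceiling_floor_rates(scores, min_score, max_score):
    """Return (ceiling, floor): proportions scoring at the maximum and minimum."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    ceiling = sum(1 for s in scores if s >= max_score) / n
    floor = sum(1 for s in scores if s <= min_score) / n
    return ceiling, floor


# Hypothetical physical functioning scores (0-100) in asymptomatic patients.
scores = [100, 100, 95, 100, 80, 100, 60, 100]
ceiling, floor = ceiling_floor_rates(scores, min_score=0, max_score=100)
print(ceiling, floor)  # 0.625 0.0
```

In this made-up sample, five of eight respondents already score at the maximum, so the scale could not show improvement for most of the group.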

HIV/AIDS-targeted quality of life (HAT-QOL)

This questionnaire was developed in response to the concern that the available measures of QOL for HIV-infected persons had not sufficiently incorporated the input of HIV-infected persons during the scale development process (QOLR 1997;6(6):561-571). By obtaining qualitative information from HIV-positive individuals regarding item content, the developers hoped to avoid some of the psychometric problems seen in other scales, such as ceiling effects in asymptomatic patients. The developers of the scale had patients review an initial 83-item questionnaire, which was reduced to 76 items. This questionnaire was administered to a sample of 201 HIV-positive individuals; further psychometric analysis eliminated 34 more items, leaving a 42-item questionnaire assessing the following domains: overall function, sexual function, disclosure worries, health worries, financial worries, HIV mastery, life satisfaction, medication concerns, and provider trust. The questionnaire has the advantage of all items being scored using the same rating scale. Six of the nine dimensions showed adequate internal consistency (alpha was below .70 for HIV mastery, sexual function, and medication concerns), and one dimension (provider trust) showed a substantial ceiling effect. Among asymptomatic HIV-positive subjects (n = 106), the subscales detected more differences for demographic variables than for disease-related variables, making it difficult to evaluate the questionnaire's ability to differentiate across levels of severity of HIV disease. Further work is needed with this scale prior to recommending its use in clinical trials; however, the scale does assess areas of novel content that are relevant to persons living with HIV/AIDS.

Summary and Future Directions

The impact of therapeutic regimens on patients with HIV disease can be fully evaluated only through an assessment of a comprehensive set of health outcomes. These include clinical disease, laboratory, HRQL, and economic outcomes. We have focused on health status measures in this state-of-the-field overview because HIV disease affects many aspects of patients' lives and we believe these types of outcome measures best characterize--from the patients' perspective--what persons experience as a result of receiving health care. Although our review has focused on paper-and-pencil approaches to QOL assessment, other approaches such as the Q-TWiST method for HRQL assessment in HIV clinical trials should be considered in some circumstances (Gelber et al., 1992; Lenderking et al., 1994).

The most current HIV-specific measures available have been reviewed, and practical and theoretical considerations that inform instrument and scale selection have been discussed. Our advice is to take a modular approach to instrument selection, which focuses on the desired concepts to be measured, rather than taking a questionnaire off the shelf and assuming that it will be able to measure all the dimensions that need to be measured for a given study. Taking this approach encourages the use of existing instruments or scales where possible, and supplementing these with additional indices, such as measures of symptoms or health care utilization where the study requires them. In some populations (for example, adolescents) and for some unique circumstances (for example, reasons surrounding patient non-adherence to treatment regimens), new scales may need to be developed.

Future work in the field should emphasize improving psychometric responsiveness of existing measures, rather than development of new measures. In particular, a measure's ability to detect small treatment effects should be an area of focus. Other new topics which are likely to affect the field include the development of item banks and the increasing use of psychometric techniques, such as Rasch modeling.

Further research topics should include the clinical relevance of health status scores. For example, what does a decrease of 10 points in physical functioning mean? What interventions should occur as a result of this change? How can these scores be tied to a clinically meaningful benchmark so that physicians can more readily interpret, understand, and accept their meaning? Finally, a brief instrument is not necessarily a quality instrument. While brevity is appropriate in certain circumstances, all situations in which an HRQL measure is being considered for use must be thoughtfully evaluated prior to instrument selection.



