Clustered or longitudinal data are commonly encountered in clinical trials and observational studies. Such data may be collected through a real-time monitoring scheme tied to a specific event, such as disease recurrence, hospitalization, or an emergency room visit. In these contexts, the cluster size can be informative because it may be correlated with disease status: more frequent observations may indicate a worsening health condition. For some clusters/subjects, however, there are no measurements or relevant medical records at all. Such clusters/subjects may have a considerably lower risk of an event or may not be susceptible to such events at all, so a cluster size of zero is nonignorable. A substantial body of literature analyzes only the observations from clusters with a nonzero informative cluster size, but few works address informative, nonignorable zero-sized clusters. To use the information from both event-free and event-occurring participants, we propose a weighted within-cluster resampling (WWCR) method and an asymptotically equivalent method, dual-weighted generalized estimating equations (WWGEE), both based on inverse probability weighting. We rigorously establish the asymptotic properties of both methods. Extensive simulations and an illustrative analysis of the Assessment, Serial Evaluation, and Subsequent Sequelae of Acute Kidney Injury (ASSESS-AKI) study are used to examine the finite-sample behavior of our methods and to show their advantages over existing approaches.
Keywords: clustered data; generalized estimating equation; informative cluster size; inverse probability weighting; within-cluster resampling.