hazardous.data.make_synthetic_competing_weibull

hazardous.data.make_synthetic_competing_weibull(n_events=3, n_samples=3000, return_X_y=False, base_scale=1000, feature_rounding=2, target_rounding=1, shape_ranges=((0.4, 0.9), (1.0, 1.0), (1.2, 3)), scale_ranges=((1, 20), (1, 10), (1.5, 5)), censoring_relative_scale=1.5, random_state=None)

Generate a synthetic dataset with competing Weibull-distributed events.

For each individual, we first sample one pair of shape and scale parameters per event type, uniformly from the ranges given by shape_ranges and scale_ranges. Each event type has its own range of shape and scale parameters.

Then we sample event durations for each event type from the corresponding Weibull distribution parametrized by the sampled shape and scale parameters.

The shape and scale parameters are returned as features. For each individual, the event type with the shortest duration is kept as the target event (competing risks setting) and its event identifier and duration are returned as the target dataframe.
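The sampling steps described above can be sketched with the standard library alone. This is an illustrative approximation, not the library's implementation: the helper name sample_individual is hypothetical, and both the 1-based event identifiers and the use of base_scale as a multiplier on the sampled scale are assumptions, since the text above does not specify them.

```python
import random

# Default ranges copied from the function signature.
SHAPE_RANGES = ((0.4, 0.9), (1.0, 1.0), (1.2, 3))
SCALE_RANGES = ((1, 20), (1, 10), (1.5, 5))


def sample_individual(rng, base_scale=1000):
    """Return (features, event_id, duration) for one simulated individual.

    Hypothetical helper for illustration only.
    """
    features = []
    durations = []
    for (shape_lo, shape_hi), (scale_lo, scale_hi) in zip(SHAPE_RANGES, SCALE_RANGES):
        # One (shape, scale) pair per event type, drawn uniformly.
        shape = rng.uniform(shape_lo, shape_hi)
        # Assumption: base_scale multiplies the sampled scale.
        scale = rng.uniform(scale_lo, scale_hi) * base_scale
        features.extend([shape, scale])
        # random.weibullvariate takes (scale, shape), in that order.
        durations.append(rng.weibullvariate(scale, shape))
    # Competing risks: only the earliest event is kept as the target.
    duration = min(durations)
    event_id = durations.index(duration) + 1  # assumption: 1-based identifiers
    return features, event_id, duration


rng = random.Random(0)
samples = [sample_individual(rng) for _ in range(1000)]
```

Each element of samples holds the six sampled parameters (the features) together with the identifier and duration of the first event to occur.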

A fraction of the individuals are censored: a censoring time is sampled from a Weibull distribution with shape 1 (that is, an exponential distribution) and scale equal to the mean duration of the target event multiplied by censoring_relative_scale.

Setting censoring_relative_scale to 0 or None disables censoring. Setting it to a smaller value (e.g. 0.5 instead of the default 1.5) results in a larger fraction of censored individuals.
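The effect of censoring_relative_scale can be illustrated with a small standard-library sketch. It is not the library's code: the helper names censor and censored_fraction are hypothetical, and it assumes that an individual is censored when the sampled censoring time falls before the event time and that censored rows are marked with event identifier 0.

```python
import random


def censor(durations, event_ids, censoring_relative_scale, rng):
    """Apply Weibull(shape=1) censoring; hypothetical helper for illustration."""
    if not censoring_relative_scale:  # 0 or None disables censoring
        return list(durations), list(event_ids)
    # Scale of the censoring distribution: mean event duration times the factor.
    scale = censoring_relative_scale * sum(durations) / len(durations)
    out_durations, out_events = [], []
    for duration, event_id in zip(durations, event_ids):
        # Shape 1 makes the Weibull distribution an exponential one.
        censoring_time = rng.weibullvariate(scale, 1.0)
        if censoring_time < duration:
            out_durations.append(censoring_time)
            out_events.append(0)  # assumption: 0 marks a censored individual
        else:
            out_durations.append(duration)
            out_events.append(event_id)
    return out_durations, out_events


# Toy event times standing in for the target durations.
rng = random.Random(0)
durations = [rng.weibullvariate(1000.0, 1.5) for _ in range(5000)]
event_ids = [1] * len(durations)


def censored_fraction(relative_scale):
    """Fraction of individuals censored for a given relative scale."""
    _, events = censor(durations, event_ids, relative_scale, random.Random(1))
    return events.count(0) / len(events)


for relative_scale in (None, 1.5, 0.5):
    print(relative_scale, censored_fraction(relative_scale))
```

A smaller factor shortens the typical censoring time, so more individuals are censored before their event occurs.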