Survival Analysis Tool

The Survival Analysis tool implements common methods of survival analysis.* Survival models describe the time until an event occurs (e.g. the lapse of a life insurance policy). They are distinctive in that they account for censoring: a test or trial may end before the event occurs (e.g. a policy-holder may pass away before the policy can lapse).

Gallery tool

This tool is not automatically installed with Alteryx Designer or the R tools. To use this tool, download it from the Alteryx Analytics Gallery.

This tool can be used for two purposes (determined based on configuration settings):

  1. To gain insight into the "survival function" of a dataset (i.e. to estimate a distribution of survival times across a population).
  2. To determine whether particular factors influence the survival function of a population (e.g. to compare survival functions across groups).

Configure the tool

Use the Required Parameters tab to set the controls for the model generation.

  • Model Name: Each model needs to be given a name so it can later be identified. Model names must start with a letter and may contain letters, numbers, and the special characters period (".") and underscore ("_"). No other special characters are allowed. Because R is case sensitive, so are model names.

  • Input Type: Select one of the following (depending on the data in the data stream).
    • Data contains durations: The data includes a field representing durations.
      • Select duration variable: Select the field representing durations.
    • Data contains start and stop times: The data includes a field representing start times and a field representing stop times.
      • Select start time/ left censor variable: Select the field representing start times.
      • Select end time/ right censor variable: Select the field representing end times.
  • Censoring:
    • Data is left-censored: The data includes a field representing 0/1 censoring of the start of the record's life.
      • Select left-censoring variable: Select a 0/1 variable, where 0 represents censoring, and 1 indicates that a record's life began at the start time or at 0 (if "Data contains durations" was specified earlier).
    • Data is right-censored: The data includes a field representing 0/1 censoring of the end of the record's life.
      • Select right-censoring variable: Select a 0/1 variable, where 0 represents censoring and 1 indicates that a record's life ended at the end time or at the duration (if "Data contains durations" was specified earlier).
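
The Required Parameters rules above can be sketched in a few lines of Python. This is only an illustration: the function names and the record layout are hypothetical, the tool itself runs R, and restricting "letters" to ASCII letters is an assumption. The sketch checks a model name against the stated naming rule and turns a start/stop record into a duration with its 0/1 right-censoring value.

```python
import re

# Naming rule from the Model Name description: a leading letter followed by
# letters, digits, periods, or underscores. ASCII-only letters is an assumption.
_MODEL_NAME = re.compile(r"[A-Za-z][A-Za-z0-9._]*")


def is_valid_model_name(name):
    """Return True if `name` follows the stated naming rule."""
    return _MODEL_NAME.fullmatch(name) is not None


def to_duration_and_event(start, stop, observed):
    """Convert one start/stop record to a (duration, event) pair.

    `observed` is the 0/1 right-censoring value: 1 means the record's life
    ended at `stop`, 0 means the record was censored at `stop`.
    """
    if stop < start:
        raise ValueError("stop time must not precede start time")
    if observed not in (0, 1):
        raise ValueError("censoring variable must be 0 or 1")
    return stop - start, observed


print(is_valid_model_name("Lapse_Model.v2"))  # True
print(is_valid_model_name("2nd-model"))       # False
print(to_duration_and_event(3, 10, 0))        # (7, 0)
```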

Use the Analysis Options tab to better define how analysis is calculated.

  • Kaplan-Meier Estimate: This option will find the survival curve of a dataset with an option to group by one variable.

    • Choose field to group by: This option allows for the comparison of survival curves of different groups.
      • Select grouping variable: Select the field corresponding to the grouping variable.
    • Use confidence interval: Displays upper and lower confidence bounds for the KM estimate, both in the plot and in its table.
      • Input Confidence Level: Enter the confidence level at which to compute the upper and lower bounds for the KM estimate.
  • Cox Proportional Hazards: Use to see the impact and significance of covariates affecting the survival curve.
    • Select predictor variables: At least one must be selected.
    • Method for tie handling: The method by which to deal with tied times.**
    • Include case weights: This option allows for the selection of a field containing weights for each record.
      • Select Field Specifying Weights: Select the field containing case weights.
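
To make the Kaplan-Meier option more concrete, here is a minimal Python sketch of the product-limit estimate for right-censored durations. It is not the tool's code (the tool relies on R); the function name and the sample data are hypothetical, and confidence bounds, left censoring, and case weights are deliberately left out. To compare groups, the same function could be run once per value of the grouping variable.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return (time, at_risk, n_events, survival) rows, one per event time.

    events[i] is 1 if the event was observed at durations[i] and 0 if the
    record was right-censored there (the 0/1 convention described above).
    """
    deaths = Counter(t for t, e in zip(durations, events) if e == 1)
    leaving = Counter(durations)  # events and censorings both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    table = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            table.append((t, at_risk, d, survival))
        at_risk -= leaving[t]
    return table


# Hypothetical sample: six records, two of them censored (event flag 0).
durations = [2, 3, 3, 5, 8, 8]
events = [1, 1, 0, 1, 0, 1]
for t, n, d, s in kaplan_meier(durations, events):
    print(t, n, d, round(s, 4))
# The last row has survival 2/9, printed as 0.2222.
```

Censored records never trigger a drop in the curve, but they do shrink the risk set for later event times, which is why the record censored at time 3 lowers the denominator at time 5.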

Use the Graphics Options tab to set the controls for the graphical output.

  • Plot size: Select inches or centimeters for the size of the graph.
  • Graph resolution: Select the resolution of the graph in dots per inch: 1x (96 dpi); 2x (192 dpi); or 3x (288 dpi). Lower resolution creates a smaller file and is best for viewing on a monitor. Higher resolution creates a larger file with better print quality.

View the output

Connect a Browse tool to each output anchor to view results.

O anchor: Consists of a table of the serialized model with the model name and the size of the object. Which model objects are available depends on the choice of "Analysis Type" under "Analysis Options".

  • Summary Analysis - Surv object, Kaplan-Meier estimate object
  • Grouping Analysis - Surv object, Kaplan-Meier estimate object, Cox Proportional Hazards object
  • Factor Analysis - Surv object, Cox Proportional Hazards object

The Cox PH model can be accessed directly from the second element of the O anchor output. If that model is named 'model', the Surv and KMest objects can be accessed as 'model$surv' and 'model$KMest', respectively.

R anchor: Consists of the report snippets generated by the Survival Analysis tool, depending on the choice of "Analysis Type" under "Analysis Options".

  • Summary Analysis - Summary statistics and a graph of the survival function.
  • Grouping Analysis - Summary statistics; observed vs expected results for each group; group comparison test results for similarity of groups for Logrank, Likelihood Ratio, and Wald tests; a graph comparing the survival curves of different groups; and distinct survival curves and cumulative hazard curves for each group.
  • Factor Analysis - Summary statistics; factor analysis test results for impact of predictive variables for Logrank, Likelihood Ratio, and Wald tests; and a summary of the Cox Proportional Hazards Model detailing the impact of the predictors.
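
The observed vs expected figures and the Logrank comparison reported for Grouping Analysis can be illustrated with the standard two-group log-rank calculation. The Python sketch below is an assumption about the general technique, not a reproduction of the tool's R output; the function name and sample data are hypothetical, and only two groups with right-censored durations are handled.

```python
def logrank_two_groups(durations, events, groups):
    """Return (observed, expected, chi_square) for the first group label.

    At each distinct event time, the expected number of events in the first
    group is the total number of events times that group's share of the
    risk set. The statistic has one degree of freedom.
    """
    labels = sorted(set(groups))
    if len(labels) != 2:
        raise ValueError("this sketch handles exactly two groups")
    first = labels[0]
    records = list(zip(durations, events, groups))
    event_times = sorted({t for t, e, _ in records if e == 1})
    observed = expected = variance = 0.0
    for t in event_times:
        # Risk set: every record whose duration is at least t.
        n = sum(1 for x, _, _ in records if x >= t)
        n1 = sum(1 for x, _, g in records if x >= t and g == first)
        d = sum(1 for x, e, _ in records if x == t and e == 1)
        d1 = sum(1 for x, e, g in records if x == t and e == 1 and g == first)
        observed += d1
        expected += d * n1 / n
        if n > 1:
            # Hypergeometric variance of the first group's event count.
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi_square = (observed - expected) ** 2 / variance if variance > 0 else 0.0
    return observed, expected, chi_square


# Hypothetical sample: group A fails early, group B fails late, no censoring.
obs, exp, chi2 = logrank_two_groups(
    [1, 2, 3, 4], [1, 1, 1, 1], ["A", "A", "B", "B"]
)
print(obs, round(exp, 4), round(chi2, 4))
```

A large gap between observed and expected events in a group, relative to the variance, is what drives the test toward concluding that the groups' survival curves differ.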

D anchor: For Summary Analysis and Grouping Analysis, this contains the Kaplan-Meier estimate of the survival curves (for Grouping Analysis, an extra field specifies the group). For Factor Analysis, it is not provided.

*https://en.wikipedia.org/wiki/Survival_analysis

**https://stat.ethz.ch/R-manual/R-devel/library/survival/html/coxph.html