Visualize the spread of average values among all factors for all variables

Visualize variation and logic for a single observation

plot_spread(df, dv, ...)

plot_spread_single_obs(df, dv, ..., labels = FALSE, isolate_id = 1)

plot_spread_interactive(...)

Arguments

df	dataframe to evaluate
dv	dependent variable to use (column name)
...	Arguments passed on to `refactor_columns`:
	`split_on`	variable to split data / group by
	`id_col`	field to use as ID
	`n_cat`	for categorical variables, the max number of unique values to keep. This field feeds the `forcats::fct_lump(n = )` argument.
	`collapse_by`	how `n_cat` collapses levels: `"dv"` collapses by distance to the grand mean, leaving the extremes as is and grouping the levels closer to the grand mean as "other"; `"n"` collapses by size
	`n_quantile`	for numeric/date fields, the number of quantiles used to split the data into a factor. Fields with fewer unique values than this will not be changed.
	`n_digits`	for numeric fields, the number of digits to keep in the breaks, e.g. [1.2345 to 2.3456] will be [1.23 to 2.34] if `n_digits = 2`
	`avg_type`	mean or median
	`ignore_cols`	columns to ignore from analysis. Good candidates are fields that have no duplicate values (primary keys) or fields with a large proportion of null values
labels	when TRUE, shows the labels of the factor levels outlined in the plot
isolate_id	the unique id from the field specified in `id_col`, or the row number when `id_col` is unspecified
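
The refactoring arguments above describe two transformations: numeric fields are split into quantile bins (`n_quantile`, `n_digits`), and categorical fields keep at most `n_cat` levels, with the rest grouped as "other" (`collapse_by`). The following is a rough base-R sketch of those ideas, not the package's implementation. `bin_by_quantile` and `lump_levels` are hypothetical helper names, and truncating the break labels is an assumption taken from the [1.23 to 2.34] example.

```r
# Hypothetical helpers for illustration only; not part of the package.

# n_quantile / n_digits: split a numeric field into quantile bins.
bin_by_quantile <- function(x, n_quantile = 4, n_digits = 2) {
  # Assumption: fields with fewer unique values than n_quantile are left alone
  if (length(unique(x)) < n_quantile) return(x)
  probs  <- seq(0, 1, length.out = n_quantile + 1)
  breaks <- unique(unname(quantile(x, probs = probs, na.rm = TRUE)))
  # Truncate (not round) the values shown in the labels to n_digits
  shown  <- trunc(breaks * 10^n_digits) / 10^n_digits
  labels <- paste0("[", head(shown, -1), " to ", tail(shown, -1), "]")
  cut(x, breaks = breaks, labels = labels, include.lowest = TRUE)
}

# n_cat / collapse_by: keep n_cat levels and group the rest as "other".
lump_levels <- function(f, dv, n_cat = 2, collapse_by = c("dv", "n")) {
  collapse_by <- match.arg(collapse_by)
  f <- as.character(f)
  groups <- split(dv, f)
  score <- if (collapse_by == "n") {
    lengths(groups)                                    # size of each level
  } else {
    abs(vapply(groups, mean, numeric(1)) - mean(dv))   # distance from grand mean
  }
  keep <- names(sort(score, decreasing = TRUE))[seq_len(min(n_cat, length(score)))]
  factor(ifelse(f %in% keep, f, "other"))
}

binned <- bin_by_quantile(c(1.2345, 1.5, 1.75, 2.0, 2.3456), n_quantile = 2)

f  <- c("a", "a", "a", "a", "b", "b", "b", "c", "c", "d")
dv <- c(5, 5, 5, 5, 6, 6, 6, 1, 1, 10)
by_size <- lump_levels(f, dv, n_cat = 2, collapse_by = "n")   # keeps a and b
by_dv   <- lump_levels(f, dv, n_cat = 2, collapse_by = "dv")  # keeps c and d
```

In the toy data the grand mean of `dv` is 5, so collapsing by `"dv"` keeps the two levels whose means sit farthest from it (`c` and `d`), while collapsing by `"n"` keeps the two largest levels (`a` and `b`).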

Functions

plot_spread_single_obs: highlight a single observation
plot_spread_interactive: interactive version of the plot, rendered with `plotly::ggplotly`

Examples

plot_spread(ggplot2::mpg, dv = hwy)
plot_spread_single_obs(df = employee_attrition[,1:5], dv = attrition)
plot_spread_single_obs(df = employee_attrition[,1:5], dv = attrition, labels = TRUE)
plot_spread_interactive(ggplot2::mpg, dv = hwy)
#> Warning: `gather_()` was deprecated in tidyr 1.2.0.
#> Please use `gather()` instead.