Add column of headlines — add_headline_column

This works similarly to headline() but acts on and returns a data frame.

Usage

add_headline_column(
  df,
  x,
  y,
  headline = "{trend} of {delta} ({orig_values})",
  ...,
  .name = "headline",
  if_match = "There was no difference",
  trend_phrases = headliner::trend_terms(),
  plural_phrases = NULL,
  orig_values = "{x} vs. {y}",
  n_decimal = 1,
  round_all = TRUE,
  multiplier = 1,
  return_cols = .name
)

Arguments

df: data frame; each row supplies one pair of values to compare
x: a numeric column in 'df' to compare to the reference column 'y'
y: a numeric column in 'df' to act as a control for the 'x' values
headline: a string to format the final output. Uses glue syntax
...: arguments passed to glue_data
.name: string value for the name of the new column to create
if_match: string to display if numbers match, uses glue syntax
trend_phrases: list of values to use when x is more than y or x is less than y. You can pass it just trend_terms() (the default) and reference the result with "...{trend}...", or pass it a named list (see examples)
plural_phrases: named list of values to use when difference (delta) is singular (delta = 1) or plural (delta != 1)
orig_values: a string using glue syntax. example: ({x} vs {y})
n_decimal: numeric value to limit the number of decimal places in the returned values.
round_all: logical value to indicate if all values should be rounded. When FALSE, the values are returned with no modification. When TRUE (default), all values are rounded to the number of decimal places specified by 'n_decimal'.
multiplier: number indicating the scaling factor. When multiplier = 1 (default), 0.25 will return 0.25. When multiplier = 100, 0.25 will return 25.
return_cols: arguments that can be passed to select, ex: c("a", "b"), starts_with, etc.
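
As a minimal sketch of how several of these arguments combine, the call below scales proportions to percentages with 'multiplier', limits the decimals with 'n_decimal', and renames the new column with '.name'. The data frame 'rates' and its columns are hypothetical and exist only for illustration; the exact wording of the resulting strings is not shown because it depends on the package's rounding and trend defaults.

```r
library(headliner)

# Hypothetical data: conversion rates stored as proportions
rates <- data.frame(
  channel = c("email", "search"),
  this_year = c(0.254, 0.118),
  last_year = c(0.201, 0.163)
)

scaled <- add_headline_column(
  rates,
  x = this_year,
  y = last_year,
  # other columns of 'df' (here 'channel') can be referenced with glue syntax
  headline = "{channel}: {trend} of {delta} points ({orig_values})",
  .name = "summary",               # name of the new column
  orig_values = "{x}% vs. {y}%",   # how the original values are displayed
  n_decimal = 1,                   # keep one decimal place
  multiplier = 100                 # scale proportions to percentages
)

# One headline per row of 'rates'
scaled$summary
```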

Value

Returns the original data frame with columns appended.

Details

A useful feature of this function is that it can return some of the "talking points" used in the headline calculation. For example, to find the most extreme headlines, use add_headline_column(..., return_cols = delta). This brings back a headline column as well as the delta talking point (the absolute difference between x and y). You can then sort the result in descending order and filter for the biggest difference.
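
The sort-and-filter workflow described above might look like the following sketch. The data frame 'sales' and its column names are made up for illustration, and dplyr is assumed to be installed; only 'return_cols = delta' comes from the description above.

```r
library(headliner)

# Hypothetical data: current vs. previous sales by region
sales <- data.frame(
  region = c("A", "B", "C"),
  current = c(10, 25, 7),
  previous = c(8, 5, 9)
)

ranked <- sales |>
  add_headline_column(
    x = current,
    y = previous,
    return_cols = delta   # also return the absolute difference
  ) |>
  dplyr::arrange(dplyr::desc(delta))   # largest difference first

# Keep only the row with the biggest difference
biggest <- dplyr::slice_head(ranked, n = 1)
biggest
```

Sorting on 'delta' ranks rows by the size of the gap regardless of direction; to distinguish increases from decreases, 'raw_delta' could be returned as well, as in the last example below.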

Examples


# You can use 'add_headline_column()' to reference values in an existing data set.
# Here is an example comparing the box office sales of different Pixar films
head(pixar_films) |>
  dplyr::select(film, bo_domestic, bo_intl) |>
  add_headline_column(
    x = bo_domestic,
    y = bo_intl,
    headline = "{film} was ${delta}M higher {trend} (${x}M vs ${y}M)",
    trend_phrases = trend_terms(more = "domestically", less = "internationally")
  ) |>
  knitr::kable("pandoc")
#> 
#> 
#> film               bo_domestic   bo_intl  headline                                                                
#> ----------------  ------------  --------  ------------------------------------------------------------------------
#> Toy Story                191.8     181.8  Toy Story was $10M higher domestically ($191.8M vs $181.8M)             
#> A Bug's Life             162.8     200.5  A Bug's Life was $37.7M higher internationally ($162.8M vs $200.5M)     
#> Toy Story 2              245.9     251.5  Toy Story 2 was $5.6M higher internationally ($245.9M vs $251.5M)       
#> Monsters, Inc.           289.9     342.4  Monsters, Inc. was $52.5M higher internationally ($289.9M vs $342.4M)   
#> Finding Nemo             339.7     531.3  Finding Nemo was $191.6M higher internationally ($339.7M vs $531.3M)    
#> The Incredibles          261.4     370.2  The Incredibles was $108.8M higher internationally ($261.4M vs $370.2M) 

# You can also use 'return_cols' to return any and all "talking points".
# You can use tidyselect helpers like 'starts_with("delta")' or
# 'everything()'. In this example, I returned the 'raw_delta' & 'trend' columns
# and then identified the records at the extremes
pixar_films |>
  dplyr::select(film, bo_domestic, bo_intl) |>
  add_headline_column(
    x = bo_domestic,
    y = bo_intl,
    headline = "${delta}M {trend} (${x}M vs ${y}M)",
    trend_phrases = trend_terms(more = "higher", less = "lower"),
    return_cols = c(raw_delta, trend)
  ) |>
  dplyr::filter(raw_delta %in% range(raw_delta)) |>
  knitr::kable("pandoc")
#> 
#> 
#> film    bo_domestic   bo_intl  headline                              raw_delta  trend  
#> -----  ------------  --------  -----------------------------------  ----------  -------
#> Cars          244.1     217.9  $26.2M higher ($244.1M vs $217.9M)         26.2  higher 
#> Coco          209.7     597.4  $387.7M lower ($209.7M vs $597.4M)       -387.7  lower