
Training and testing data sets of fine food reviews.

Usage

attach_small_fine_foods(envir = parent.frame(), quiet = FALSE, ...)

Arguments

envir

Environment to load data sets into. Defaults to parent.frame().

quiet

Logical. Should the function announce which data sets are loaded? Defaults to FALSE.

...

Arguments passed to pins::pin_read().

Value

tibble

Details

These data are from Amazon, which describes them as follows: "This dataset consists of reviews of fine foods from amazon. The data span a period of more than 10 years, including all ~500,000 reviews up to October 2012. Reviews include product and user information, ratings, and a plaintext review."

A subset of the data is included here, split into a training set and a test set. The training set sampled 10 products and retained all of their individual reviews. Because reviews of the same product are correlated, we recommend resampling the data with a leave-one-product-out approach, so that all reviews of a given product are held out together. The test set sampled 500 products that were not included in the training set and selected a single review at random for each.
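The leave-one-product-out idea can be sketched in base R as follows. This is a minimal illustration, not the package's own resampling code: toy_training is a hypothetical stand-in with the same three columns as training_data, used so the example runs without downloading anything.

```r
# Hypothetical stand-in for training_data (same column layout, made-up rows).
toy_training <- data.frame(
  product = c("P1", "P1", "P2", "P2", "P2", "P3"),
  review  = c("loved it", "too salty", "great value",
              "stale", "would buy again", "fine"),
  score   = factor(c("great", "other", "great", "other", "great", "other"),
                   levels = c("great", "other")),
  stringsAsFactors = FALSE
)

# One fold per product: hold out every review of that product,
# and keep the reviews of all other products for model fitting.
products <- unique(toy_training$product)
folds <- lapply(products, function(p) {
  held_out <- toy_training$product == p
  list(
    analysis   = toy_training[!held_out, ],
    assessment = toy_training[held_out, ]
  )
})
names(folds) <- products

# Sanity check: no product appears on both sides of any fold.
for (p in products) {
  f <- folds[[p]]
  stopifnot(!any(f$analysis$product %in% f$assessment$product))
}

# Number of folds, and number of held-out reviews per fold.
length(folds)
vapply(folds, function(f) nrow(f$assessment), integer(1))
```

With the real data, replace toy_training with training_data. If the rsample package is available, rsample::group_vfold_cv(training_data, group = product) is one way to get the same kind of grouped folds; when v is left unset it should create one fold per product.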

There are three columns: product (the product identifier), review (the text of the review), and score (a factor column holding the class variable). The outcome is whether the reviewer gave the product a 5-star rating or not.
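As an illustration of how a two-class outcome of this kind can be derived from star ratings, here is a small sketch. The stars vector is hypothetical (the raw ratings are not part of this data set), and both the mapping of 5 stars to "great" and the level order are assumptions based on the printed output, not a record of how the data were actually prepared.

```r
# Hypothetical raw star ratings (1 to 5); not part of the shipped data.
stars <- c(5L, 3L, 4L, 5L, 1L)

# Assumed mapping: 5 stars -> "great", anything else -> "other".
# The level order is an assumption as well.
score <- factor(
  ifelse(stars == 5L, "great", "other"),
  levels = c("great", "other")
)

# Class counts, useful for checking class balance before modeling.
table(score)
```

On the real data, table(training_data$score) gives the same kind of class-balance summary.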

tibble print

attach_small_fine_foods()
#> The following data sets have been loaded:
#> `training_data`, `testing_data`
#> Silence this message by setting `quiet = TRUE`.

training_data
#> # A tibble: 4,000 x 3
#>    product    review                                                       score
#>    <chr>      <chr>                                                        <fct>
#>  1 B000J0LSBG "this stuff is  not stuffing  its  not good at all  save yo~ other
#>  2 B000EYLDYE "I absolutely LOVE this dried fruit.  LOVE IT.  Whenever I ~ great
#>  3 B0026LIO9A "GREAT DEAL, CONVENIENT TOO.  Much cheaper than WalMart and~ great
#>  4 B00473P8SK "Great flavor, we go through a ton of this sauce! I discove~ great
#>  5 B001SAWTNM "This is excellent salsa/hot sauce, but you can get it for ~ great
#>  6 B000FAG90U "Again, this is the best dogfood out there.  One suggestion~ great
#>  7 B006BXTCEK "The box I received was filled with teas, hot chocolates, a~ other
#>  8 B002GWH5OY "This is delicious coffee which compares favorably with muc~ great
#>  9 B003R0MFYY "Don't let these little tiny cans fool you.  They pack a lo~ great
#> 10 B001EO5ZXI "One of the nicest, smoothest cup of chai I've made. Nice m~ great
#> # i 3,990 more rows
testing_data
#> # A tibble: 1,000 x 3
#>    product    review                                                       score
#>    <chr>      <chr>                                                        <fct>
#>  1 B005GXFP60 "These are the best tasting gummy fruits I have ever eaten.~ great
#>  2 B000G7V394 "I have been a consumer of Snyders hard sourdough pretzels ~ great
#>  3 B004WJAULO "This tastes so bad, I'm considering throwing it away.  But~ other
#>  4 B003D4MBOS "This product is way too pricey to have so little chocolate~ other
#>  5 B0030Z95B2 "I bought this for my Mom as a gift to accompany her Dolce ~ great
#>  6 B000LRH4WE "This thing is 7 dollars in US?I know its exported from Cyp~ other
#>  7 B000Z91SZW "This tea tastes like hot cocoa.  Very pleasant experience.~ other
#>  8 B00563VNEI "This product is great for a quick cup of coffee. If you us~ great
#>  9 B0085NFX2O "Grilled out brats, chicken, and burgers for the entire fam~ great
#> 10 B000LRH7XK "I ordered 4 cans of this product.  The product is fine, bu~ other
#> # i 990 more rows

glimpse()

tibble::glimpse(training_data)
#> Rows: 4,000
#> Columns: 3
#> $ product <chr> "B000J0LSBG", "B000EYLDYE", "B0026LIO9A", "B00473P8SK", "B001S~
#> $ review  <chr> "this stuff is  not stuffing  its  not good at all  save your ~
#> $ score   <fct> other, great, great, great, great, great, other, great, great,~
tibble::glimpse(testing_data)
#> Rows: 1,000
#> Columns: 3
#> $ product <chr> "B005GXFP60", "B000G7V394", "B004WJAULO", "B003D4MBOS", "B0030~
#> $ review  <chr> "These are the best tasting gummy fruits I have ever eaten. Ca~
#> $ score   <fct> great, great, other, other, great, other, other, great, great,~

Examples

# \donttest{
attach_small_fine_foods()
#> The following data sets have been loaded:
#> `training_data`, `testing_data`
#> Silence this message by setting `quiet = TRUE`.
# }