valid_sets() lists all the available data sets that are data frames.
| | Package | LibPath | Item | Title |
---|---|---|---|---|
5 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | CO2 | Carbon Dioxide Uptake in Grass Plants |
6 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | ChickWeight | Weight versus age of chicks on different diets |
7 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | DNase | Elisa assay of DNase |
13 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | Indometh | Pharmacokinetics of Indomethacin |
17 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | LifeCycleSavings | Intercountry Life-Cycle Savings Data |
18 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | Loblolly | Growth of Loblolly Pine Trees |
20 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | Orange | Growth of Orange Trees |
21 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | OrchardSprays | Potency of Orchard Sprays |
23 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | Puromycin | Reaction Velocity of an Enzymatic Reaction |
25 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | Theoph | Pharmacokinetics of Theophylline |
27 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | ToothGrowth | The Effect of Vitamin C on Tooth Growth in Guinea Pigs |
32 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | USArrests | Violent Crime Rates by US State |
33 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | USJudgeRatings | Lawyers’ Ratings of State Judges in the US Superior Court |
41 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | airquality | New York Air Quality Measurements |
42 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | anscombe | Anscombe’s Quartet of ‘Identical’ Simple Linear Regressions |
43 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | attenu | The Joyner-Boore Attenuation Data |
44 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | attitude | The Chatterjee-Price Attitude Data |
53 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | esoph | Smoking, Alcohol and (O)esophageal Cancer |
59 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | freeny | Freeny’s Revenue Data |
62 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | infert | Infertility after Spontaneous and Induced Abortion |
63 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | iris | Edgar Anderson’s Iris Data |
68 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | longley | Longley’s Economic Regression Data |
71 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | morley | Michelson Speed of Light Data |
72 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | mtcars | Motor Trend Car Road Tests |
75 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | npk | Classical N, P, K Factorial Experiment |
80 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | quakes | Locations of Earthquakes off Fiji |
81 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | randu | Random Numbers from Congruential Generator RANDU |
83 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | rock | Measurements on Petroleum Rock Samples |
84 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | sleep | Student’s Sleep Data |
87 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | stackloss | Brownlee’s Stack Loss Plant Data |
98 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | swiss | Swiss Fertility and Socioeconomic Indicators (1888) Data |
100 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | trees | Diameter, Height and Volume for Black Cherry Trees |
103 | datasets | C:/Users/RPUSH1/AppData/Local/Programs/R/R-4.4.2/library | warpbreaks | The Number of Breaks in Yarn during Weaving |
To load any data set listed in the return value of valid_sets(), use base::get(). This lets you choose which data set to load dynamically from within a program.
dsets$Item <- as.character(dsets$Item)
mtcars <- get(dsets$Item[dsets$Item == "mtcars"])
knitr::kable(head(mtcars))
| | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
---|---|---|---|---|---|---|---|---|---|---|---|
Mazda RX4 | 21.0 | 6 | 160 | 110 | 3.90 | 2.620 | 16.46 | 0 | 1 | 4 | 4 |
Mazda RX4 Wag | 21.0 | 6 | 160 | 110 | 3.90 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
Datsun 710 | 22.8 | 4 | 108 | 93 | 3.85 | 2.320 | 18.61 | 1 | 1 | 4 | 1 |
Hornet 4 Drive | 21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 |
Hornet Sportabout | 18.7 | 8 | 360 | 175 | 3.15 | 3.440 | 17.02 | 0 | 0 | 3 | 2 |
Valiant | 18.1 | 6 | 225 | 105 | 2.76 | 3.460 | 20.22 | 1 | 0 | 3 | 1 |
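The dynamic lookup described above can be sketched in base R alone. This is a minimal illustration, not shinyr code: the helper name load_by_name is hypothetical, and it assumes the datasets package is attached, which is the default in a standard R session.

```r
# Hypothetical helper (not part of shinyr): fetch a data set by its name.
# Assumes the datasets package is attached, as it is by default.
load_by_name <- function(name, env = as.environment("package:datasets")) {
  if (!exists(name, envir = env, inherits = FALSE)) {
    stop("No object named '", name, "' in the given environment")
  }
  get(name, envir = env, inherits = FALSE)
}

chosen <- "mtcars"   # could come from user input, e.g. a Shiny select box
df <- load_by_name(chosen)
dim(df)
```

Checking with exists() first gives a readable error when the name is not a known data set, instead of the generic "object not found" raised by get().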
To find out which columns of a given data frame are numeric, use getnumericcols(). It returns the names of the numeric columns.
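For comparison, the same idea can be written in a couple of lines of base R. The helper name numeric_columns below is hypothetical and is only a sketch of the technique, not the implementation used by the package.

```r
# Hypothetical helper (not shinyr's implementation): names of numeric columns.
numeric_columns <- function(df) {
  names(df)[vapply(df, is.numeric, logical(1))]
}

# iris has four numeric columns and one factor column (Species)
numeric_columns(iris)
```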
To split a paragraph or sentence into individual words, use splitAndGet(). It returns a list of the individual words in the given input, which can later be passed to getFeqTable().
#> [[1]]
#> [1] "**shinyr**" "is" "developed" "to" "build"
#> [6] "dynamic" "shiny" "based" "dashboards" "to"
#> [11] "analyze" "the" "data" "of" "your"
#> [16] "choice." "" "It" "provides" "simple"
#> [21] "yet" "genius" "dashboard" "design" "to"
#> [26] "subset" "the" "data," "perform" "exploratory"
#> [31] "analysis" "and" "predictive" "analysis" "by"
#> [36] "means" "of"
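A rough base-R equivalent of this step is strsplit() on a single space. This is an assumption about the behaviour, inferred from the empty string in the output above (a double space in the input); the package's actual splitting rule may differ.

```r
# Sketch: split a sentence into words on single spaces (assumed rule).
text <- "shinyr is developed to build dynamic shiny based dashboards"
words <- strsplit(text, " ", fixed = TRUE)

# strsplit() returns a list with one character vector per input string
words[[1]]
```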
getFeqTable() is applied to the output of splitAndGet() to get the frequency of each word, which is then used by getWordCloud.
| | word | freq |
---|---|---|
analysis | analysis | 2 |
data | data | 2 |
analyze | analyze | 1 |
based | based | 1 |
build | build | 1 |
choice | choice | 1 |
dashboard | dashboard | 1 |
dashboards | dashboards | 1 |
design | design | 1 |
developed | developed | 1 |
dynamic | dynamic | 1 |
exploratory | exploratory | 1 |
genius | genius | 1 |
means | means | 1 |
perform | perform | 1 |
predictive | predictive | 1 |
provides | provides | 1 |
shiny | shiny | 1 |
shinyr | shinyr | 1 |
simple | simple | 1 |
subset | subset | 1 |
yet | yet | 1 |
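A word-frequency table of this shape can be approximated with base R's table(). The sketch below uses a small hand-made word vector and does not remove stop words or punctuation, which the table above suggests the package does; treat it as an illustration of the counting step only.

```r
# Sketch: count word frequencies and sort by descending frequency.
words <- c("analysis", "data", "analysis", "data", "build")

freq <- as.data.frame(table(word = words), stringsAsFactors = FALSE)
names(freq) <- c("word", "freq")

# Most frequent words first, ties broken alphabetically
freq <- freq[order(-freq$freq, freq$word), ]
freq
```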
getDataInsights() takes a data frame as input and returns basic insights for each column: class, number of missing values, maximum, minimum, mean, median, standard deviation, variance, and unique items.
Column | Class | Missing | Max | Min | Mean | Median | SD | Variance | Unique_items |
---|---|---|---|---|---|---|---|---|---|
mpg | numeric | 0 | 33.9 | 10.4 | 20.09 | 19.2 | 6.03 | 36.32 | 21,22.8,21.4,18.7,18.1,14.3,24.4,19.2,17.8,16.4,17.3,15.2,10.4,14.7,32.4,30.4,33.9,21.5,15.5,13.3,27.3,26,15.8,19.7,15 |
cyl | numeric | 0 | 8 | 4 | 6.19 | 6 | 1.79 | 3.19 | 6,4,8 |
disp | numeric | 0 | 472 | 71.1 | 230.72 | 196.3 | 123.94 | 15360.8 | 160,108,258,360,225,146.7,140.8,167.6,275.8,472,460,440,78.7,75.7,71.1,120.1,318,304,350,400,79,120.3,95.1,351,145,301,121 |
hp | numeric | 0 | 335 | 52 | 146.69 | 123 | 68.56 | 4700.87 | 110,93,175,105,245,62,95,123,180,205,215,230,66,52,65,97,150,91,113,264,335,109 |
drat | numeric | 0 | 4.93 | 2.76 | 3.6 | 3.7 | 0.53 | 0.29 | 3.9,3.85,3.08,3.15,2.76,3.21,3.69,3.92,3.07,2.93,3,3.23,4.08,4.93,4.22,3.7,3.73,4.43,3.77,3.62,3.54,4.11 |
wt | numeric | 0 | 5.424 | 1.513 | 3.22 | 3.33 | 0.98 | 0.96 | 2.62,2.875,2.32,3.215,3.44,3.46,3.57,3.19,3.15,4.07,3.73,3.78,5.25,5.424,5.345,2.2,1.615,1.835,2.465,3.52,3.435,3.84,3.845,1.935,2.14,1.513,3.17,2.77,2.78 |
qsec | numeric | 0 | 22.9 | 14.5 | 17.85 | 17.71 | 1.79 | 3.19 | 16.46,17.02,18.61,19.44,20.22,15.84,20,22.9,18.3,18.9,17.4,17.6,18,17.98,17.82,17.42,19.47,18.52,19.9,20.01,16.87,17.3,15.41,17.05,16.7,16.9,14.5,15.5,14.6,18.6 |
vs | numeric | 0 | 1 | 0 | 0.44 | 0 | 0.5 | 0.25 | 0,1 |
am | numeric | 0 | 1 | 0 | 0.41 | 0 | 0.5 | 0.25 | 1,0 |
gear | numeric | 0 | 5 | 3 | 3.69 | 4 | 0.74 | 0.54 | 4,3,5 |
carb | numeric | 0 | 8 | 1 | 2.81 | 2 | 1.62 | 2.61 | 4,1,2,3,6,8 |
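To make the columns of this table concrete, here is a minimal base-R sketch that computes most of them. The helper name basic_insights is hypothetical, it assumes every column is numeric, and it omits the Unique_items column for brevity.

```r
# Hypothetical helper (not shinyr's implementation); assumes numeric columns.
basic_insights <- function(df) {
  # Apply a summary function to every column, ignoring NAs, rounded to 2 digits
  stat <- function(f) vapply(df, function(x) round(f(x, na.rm = TRUE), 2), numeric(1))
  out <- data.frame(
    Column   = names(df),
    Class    = vapply(df, function(x) class(x)[1], character(1)),
    Missing  = vapply(df, function(x) sum(is.na(x)), integer(1)),
    Max      = vapply(df, max, numeric(1), na.rm = TRUE),
    Min      = vapply(df, min, numeric(1), na.rm = TRUE),
    Mean     = stat(mean),
    Median   = stat(median),
    SD       = stat(sd),
    Variance = stat(var)
  )
  rownames(out) <- NULL
  out
}

insights <- basic_insights(mtcars)
head(insights, 3)
```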
getDataInsight() also calculates the correlation table for the given data frame.
| | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
---|---|---|---|---|---|---|---|---|---|---|---|
mpg | 1.0000000 | -0.8521620 | -0.8475514 | -0.7761684 | 0.6811719 | -0.8676594 | 0.4186840 | 0.6640389 | 0.5998324 | 0.4802848 | -0.5509251 |
cyl | -0.8521620 | 1.0000000 | 0.9020329 | 0.8324475 | -0.6999381 | 0.7824958 | -0.5912421 | -0.8108118 | -0.5226070 | -0.4926866 | 0.5269883 |
disp | -0.8475514 | 0.9020329 | 1.0000000 | 0.7909486 | -0.7102139 | 0.8879799 | -0.4336979 | -0.7104159 | -0.5912270 | -0.5555692 | 0.3949769 |
hp | -0.7761684 | 0.8324475 | 0.7909486 | 1.0000000 | -0.4487591 | 0.6587479 | -0.7082234 | -0.7230967 | -0.2432043 | -0.1257043 | 0.7498125 |
drat | 0.6811719 | -0.6999381 | -0.7102139 | -0.4487591 | 1.0000000 | -0.7124406 | 0.0912048 | 0.4402785 | 0.7127111 | 0.6996101 | -0.0907898 |
wt | -0.8676594 | 0.7824958 | 0.8879799 | 0.6587479 | -0.7124406 | 1.0000000 | -0.1747159 | -0.5549157 | -0.6924953 | -0.5832870 | 0.4276059 |
qsec | 0.4186840 | -0.5912421 | -0.4336979 | -0.7082234 | 0.0912048 | -0.1747159 | 1.0000000 | 0.7445354 | -0.2298609 | -0.2126822 | -0.6562492 |
vs | 0.6640389 | -0.8108118 | -0.7104159 | -0.7230967 | 0.4402785 | -0.5549157 | 0.7445354 | 1.0000000 | 0.1683451 | 0.2060233 | -0.5696071 |
am | 0.5998324 | -0.5226070 | -0.5912270 | -0.2432043 | 0.7127111 | -0.6924953 | -0.2298609 | 0.1683451 | 1.0000000 | 0.7940588 | 0.0575344 |
gear | 0.4802848 | -0.4926866 | -0.5555692 | -0.1257043 | 0.6996101 | -0.5832870 | -0.2126822 | 0.2060233 | 0.7940588 | 1.0000000 | 0.2740728 |
carb | -0.5509251 | 0.5269883 | 0.3949769 | 0.7498125 | -0.0907898 | 0.4276059 | -0.6562492 | -0.5696071 | 0.0575344 | 0.2740728 | 1.0000000 |
You can use corrplot::corrplot() on the correlation table to draw a correlation plot.
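As a small sketch, the correlation table can be reproduced with stats::cor() and then passed to corrplot. The plotting call is guarded because the corrplot package may not be installed; the choice of method = "circle" is only an example.

```r
# Pearson correlation matrix of all mtcars columns
corr <- cor(mtcars)
round(corr["mpg", "wt"], 4)

# Draw the plot only if the corrplot package is available
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(corr, method = "circle")
}
```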
This function removes selected items from a list of items.
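The underlying operation is simple to express in base R. The snippet below is only an illustration of the idea with made-up item names, not the package function itself.

```r
# Sketch: drop a few items from a vector of items.
items <- c("mpg", "cyl", "disp", "hp")
drop  <- c("cyl", "hp")

kept <- items[!items %in% drop]
kept
```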
You can find the most repeated values in a given set of values.
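One way to sketch this in base R is to tabulate the values and keep those with the highest count. The helper name most_repeated is hypothetical; it returns the values as character strings and returns every tied value.

```r
# Hypothetical helper: value(s) that occur most often, as character strings.
most_repeated <- function(x) {
  counts <- table(x)
  names(counts)[as.vector(counts) == max(counts)]
}

# In mtcars, 8-cylinder cars are the most common
most_repeated(mtcars$cyl)
```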
missing count calculates the total number of NA, NULL, "", "NULL", and "NA" values in a given set of values. Let's introduce some missing values into mtcars.
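A base-R sketch of such a count, together with one way of introducing missing values into a copy of mtcars, might look like this. The helper name count_missing and the choice of rows 2 and 5 are assumptions for illustration; the sketch handles atomic vectors only, where a literal NULL cannot occur.

```r
# Hypothetical helper: count NA plus the strings "", "NULL" and "NA".
count_missing <- function(x) {
  sum(is.na(x) | as.character(x) %in% c("", "NULL", "NA"))
}

count_missing(c("a", NA, "", "NULL", "NA", "b"))

# Introduce two missing values into a copy of mtcars (rows chosen arbitrarily)
x <- mtcars
x$mpg[c(2, 5)] <- NA
count_missing(x$mpg)
```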
You can replace the missing values in any column of a given data frame with the mean, median, max, min, sum, or mode of that column by using imputeMyData(). For example, you can impute the missing values in the mpg column with the mean of the column's values, as shown below.
imputeMyData(df = x, col = "mpg", FUN = "mean")
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 20.25 6 160 110 3.90 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag NA 6 160 110 3.90 2.875 17.02 0 1 4 4
#> Datsun 710 22.80 4 108 93 3.85 2.320 18.61 1 1 4 1
#> Hornet 4 Drive 21.40 6 258 110 3.08 3.215 19.44 1 0 3 1
#> Hornet Sportabout 18.70 8 360 175 3.15 3.440 17.02 0 0 3 2
#> Valiant 18.10 6 225 105 2.76 3.460 20.22 1 0 3 1
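For readers who want to see the mechanics, mean imputation can be sketched in base R as follows. The helper impute_column is hypothetical and is not the package's implementation; it takes the summary function itself rather than its name.

```r
# Hypothetical helper: fill NAs in one column with a summary of the
# non-missing values of that column.
impute_column <- function(df, col, fun = mean) {
  missing <- is.na(df[[col]])
  df[[col]][missing] <- fun(df[[col]], na.rm = TRUE)
  df
}

x <- mtcars
x$mpg[c(2, 5)] <- NA            # rows chosen arbitrarily for the example
filled <- impute_column(x, "mpg", mean)
head(filled$mpg)
```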
You can summarize the values of one column, grouped by the values of another column, using groupByandSummarize(). For example, you can calculate the mean of hp by am.
am | mean_of_hp_by_am |
---|---|
1 | 126.8462 |
0 | 126.8462 |
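The same kind of grouped summary can be computed with stats::aggregate(), which is a convenient way to cross-check the per-group values in a table like the one above. This is a base-R sketch, not the package function.

```r
# Mean of hp for each level of am, using the formula interface of aggregate()
by_am <- aggregate(hp ~ am, data = mtcars, FUN = mean)
names(by_am)[2] <- "mean_of_hp_by_am"
by_am
```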
You can split a given data set into a training set and a test set by using datapartition(); a percentage sets the size of the training set. For example, you can split mtcars into 85 percent for training and 15 percent for testing, as shown below.
partition is a list of length 2 that contains the train and test sets.
| | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
---|---|---|---|---|---|---|---|---|---|---|---|
Mazda RX4 | 21.0 | 6 | 160.0 | 110 | 3.90 | 2.62 | 16.46 | 0 | 1 | 4 | 4 |
Merc 280C | 17.8 | 6 | 167.6 | 123 | 3.92 | 3.44 | 18.90 | 1 | 0 | 4 | 4 |
Merc 450SLC | 15.2 | 8 | 275.8 | 180 | 3.07 | 3.78 | 18.00 | 0 | 0 | 3 | 3 |
Volvo 142E | 21.4 | 4 | 121.0 | 109 | 4.11 | 2.78 | 18.60 | 1 | 1 | 4 | 2 |
Porsche 914-2 | 26.0 | 4 | 120.3 | 91 | 4.43 | 2.14 | 16.70 | 0 | 1 | 5 | 2 |
Ferrari Dino | 19.7 | 6 | 145.0 | 175 | 3.62 | 2.77 | 15.50 | 0 | 1 | 5 | 6 |
| | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
---|---|---|---|---|---|---|---|---|---|---|---|
Mazda RX4 Wag | 21.0 | 6 | 160.0 | 110 | 3.90 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
Merc 450SL | 17.3 | 8 | 275.8 | 180 | 3.07 | 3.730 | 17.60 | 0 | 0 | 3 | 3 |
Cadillac Fleetwood | 10.4 | 8 | 472.0 | 205 | 2.93 | 5.250 | 17.98 | 0 | 0 | 3 | 4 |
Toyota Corolla | 33.9 | 4 | 71.1 | 65 | 4.22 | 1.835 | 19.90 | 1 | 1 | 4 | 1 |
Camaro Z28 | 13.3 | 8 | 350.0 | 245 | 3.73 | 3.840 | 15.41 | 0 | 0 | 3 | 4 |
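A random 85/15 split of this kind can be sketched with sample() in base R. The element names Train and Test and the seed are assumptions made for the example; the package may name or order the list elements differently.

```r
# Sketch: random 85/15 split of mtcars into train and test sets.
set.seed(42)                                   # arbitrary seed, for repeatability
n <- nrow(mtcars)
train_idx <- sample(seq_len(n), size = floor(0.85 * n))

partition <- list(
  Train = mtcars[train_idx, ],
  Test  = mtcars[-train_idx, ]
)
vapply(partition, nrow, integer(1))
```

With 32 rows, floor(0.85 * 32) puts 27 rows in the training set and leaves 5 for the test set.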
Get the metrics of a regression model by using regressionModelMetrics().
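# Observed values of the response: column 6 of mtcars is wt
actuals <- mtcars[, 6]
# predictions and mod are assumed to have been created in an earlier step
x <- regressionModelMetrics(actuals = actuals, predictions = predictions, model = mod)
y <- as.data.frame(x)
row.names(y) <- NULL
knitr::kable(y)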
AIC | BIC | MAE | MSE | RMSE | MAPE | Corelation | r.squared | adj.r.squared |
---|---|---|---|---|---|---|---|---|
20.01 | 37.6 | 0.18 | 0.05 | 0.23 | 0.06 | 0.97 | 0.94 | 0.92 |
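Several of these metrics are easy to compute by hand, which helps when interpreting the table. The sketch below assumes a linear model of wt on all other mtcars columns; the model actually used for the output above is not shown in this document, so the numbers need not match.

```r
# Assumed model for illustration: wt regressed on all other columns.
mod <- lm(wt ~ ., data = mtcars)
predictions <- predict(mod, mtcars)
actuals <- mtcars$wt
err <- actuals - predictions

metrics <- c(
  AIC       = AIC(mod),
  BIC       = BIC(mod),
  MAE       = mean(abs(err)),            # mean absolute error
  MSE       = mean(err^2),               # mean squared error
  RMSE      = sqrt(mean(err^2)),         # root mean squared error
  MAPE      = mean(abs(err / actuals)),  # mean absolute percentage error
  r.squared = summary(mod)$r.squared
)
round(metrics, 2)
```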