Basic usage



One of the things that Stata has that RStudio lacks is the variable explorer. This is extremely useful, especially if you’re working with datasets that have a large number of variables with hard-to-remember names but descriptive labels. In Stata, you just search in the variable explorer and then click on a variable to get its name into the console.

Stata’s variable explorer

As an example of a dataset like this, consider the Quality of Government standard dataset. Here’s the 2018 version of the cross-section data:

qog <- rio::import("")

It has 194 observations (different countries) and 1882 variables.

The variables have names like wdi_acelu, bci_bci, eu_eco2gdpeurhab, gle_cgdpc, etc. Not exactly names you want to memorize.

Working with this in Stata is relatively easy because you just search in the variable explorer for things like “sanitation”, “corruption”, “GDP”, etc. and you find the variable names.

Unfortunately, RStudio doesn’t have a variable explorer panel. But you can improvise something like the following:

library(magrittr)  # provides the %>% pipe
data.frame(Description = sjlabelled::get_label(qog)) %>% DT::datatable()

BAM! We just made a variable explorer! If you run this code in the console, it opens the DT::datatable in RStudio’s Viewer pane, which pretty much replicates the Stata experience (except that it is read-only).
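To try the idea without downloading the QoG data, here is a minimal sketch on a toy data frame. The data frame `toy`, its column names, and its labels are invented for illustration, and the helper `get_labels` is hypothetical: it reads the "label" attribute directly in base R instead of going through sjlabelled::get_label. The DT call only runs if DT is installed and the session is interactive.

```r
# Toy data with Stata-style variable labels (hypothetical names and labels)
toy <- data.frame(
  wdi_x = c(1.5, NA, 3.2),
  bci_y = c(10L, 20L, 30L)
)
attr(toy$wdi_x, "label") <- "Access to sanitation (% of population)"
attr(toy$bci_y, "label") <- "Corruption index"

# Hypothetical helper: pull each column's "label" attribute,
# returning NA for columns that have no label
get_labels <- function(df) {
  vapply(df, function(x) {
    lab <- attr(x, "label", exact = TRUE)
    if (is.null(lab)) NA_character_ else as.character(lab)
  }, character(1))
}

# One row per variable; the variable names become the row names
explorer <- data.frame(Description = get_labels(toy))

# Show the searchable table in the Viewer pane when possible
if (interactive() && requireNamespace("DT", quietly = TRUE)) {
  print(DT::datatable(explorer))
}
```

Typing "corruption" into the table's search box would then narrow it down to the bci_y row, which is the lookup the Stata variable explorer offers.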

But we can do better! Why not include additional information, like the number of missing observations, summary statistics, or an overview of the values of each variable?
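As a rough idea of what such an extended table could look like, here is a base R sketch that puts the label, type, number of missing values, and mean of each variable side by side. The function `summarise_vars` and the toy data are hypothetical and only illustrate the idea; this is not how vars_explore is implemented.

```r
# Hypothetical helper: one row per variable with its label, type,
# number of missing values, and mean (numeric columns only)
summarise_vars <- function(df) {
  data.frame(
    Variable = names(df),
    Description = vapply(df, function(x) {
      lab <- attr(x, "label", exact = TRUE)
      if (is.null(lab)) NA_character_ else as.character(lab)
    }, character(1)),
    Type = vapply(df, function(x) class(x)[1], character(1)),
    Missing = vapply(df, function(x) sum(is.na(x)), integer(1)),
    Mean = vapply(df, function(x) {
      if (is.numeric(x)) mean(x, na.rm = TRUE) else NA_real_
    }, numeric(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Toy data (invented for illustration)
toy <- data.frame(
  gdp = c(1.0, NA, 3.0),
  region = c("a", "b", NA),
  stringsAsFactors = FALSE
)
attr(toy$gdp, "label") <- "GDP per capita"

overview <- summarise_vars(toy)
print(overview)
```

The resulting data frame can be passed to DT::datatable in the same way as the label-only version, so the search box covers the labels and the extra columns alike.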

Introducing vars_explore

Full usage


This will create a searchable variable explorer, and calculate summary statistics for each variable: