Efficient Estimation of Bid-Ask Spreads from Open, High, Low, and Close Prices

This vignette illustrates how to estimate bid-ask spreads from open, high, low, and close prices. Let’s start by loading the package:

library(bidask)

The package offers two ways to estimate bid-ask spreads:

  1. edge(): designed for tidy data.
  2. spread(): designed for xts objects.

The function edge() implements the efficient estimator described in Ardia, Guidotti, & Kroencke (2021). Pass the open, high, low, and close prices as four separate vectors.
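To make the "separate vectors" interface concrete, the sketch below builds four synthetic price vectors and passes them to edge(). The simulation itself (a random-walk mid price with trades bouncing between bid and ask) and all parameter names (n_days, n_trades, true_spread) are illustrative assumptions, not part of the package or the paper.

```r
library(bidask)

set.seed(42)
n_days      <- 500   # simulated trading days (assumption)
n_trades    <- 390   # trades per day (assumption)
true_spread <- 0.01  # 1% spread built into the simulated prices

# Mid price: a geometric random walk with roughly 3% daily volatility
mid <- exp(cumsum(rnorm(n_days * n_trades, sd = 0.03 / sqrt(n_trades))))

# Each trade executes at the bid or the ask with equal probability
side  <- sample(c(-1, 1), length(mid), replace = TRUE)
trade <- mid * (1 + side * true_spread / 2)

# One column per day; rows are the trades within that day
m <- matrix(trade, nrow = n_trades)

op <- m[1, ]            # first trade of each day
hi <- apply(m, 2, max)  # highest trade of each day
lo <- apply(m, 2, min)  # lowest trade of each day
cl <- m[n_trades, ]     # last trade of each day

# Four plain numeric vectors of equal length go straight into edge()
estimate <- edge(open = op, high = hi, low = lo, close = cl)
estimate
```

If the simulation behaves as intended, the estimate should land near the 0.01 used to generate the data, although the exact value depends on the seed and the sample size.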

The function spread() requires an xts object containing columns named Open, High, Low, and Close. It offers features beyond edge(), such as further estimators and rolling estimates.

An output value of 0.01 corresponds to a spread estimate of 1%.
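As a quick illustration of this scale, the following base R snippet converts an estimate into percent and basis points. The value 0.0025 is a made-up example, not an output of the package.

```r
# Hypothetical estimate, standing in for a value returned by edge() or spread()
estimate <- 0.0025

percent      <- estimate * 100    # spread in percent
basis_points <- estimate * 10000  # spread in basis points

sprintf("%.2f%% (%.0f bps)", percent, basis_points)
#> [1] "0.25% (25 bps)"
```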

Examples are provided below.

Tidy data

The function edge() can be easily used with tidy data and the dplyr grammar. In the following example, we estimate bid-ask spreads for cryptocurrencies.

Download daily prices for Bitcoin and Ethereum using the crypto2 package:

library(dplyr)
library(crypto2)
df <- crypto_list(only_active=TRUE) %>%
  filter(symbol %in% c("BTC", "ETH")) %>%
  crypto_history(start_date = "20200101", end_date = "20221231")
#> ❯ Scraping historical crypto data
#> ❯ Processing historical crypto data
head(df)
#> # A tibble: 6 × 16
#>   timestamp              id slug    name  symbol ref_cur  open  high   low close
#>   <dttm>              <int> <chr>   <chr> <chr>  <chr>   <dbl> <dbl> <dbl> <dbl>
#> 1 2020-01-01 23:59:59     1 bitcoin Bitc… BTC    USD     7195. 7254. 7175. 7200.
#> 2 2020-01-02 23:59:59     1 bitcoin Bitc… BTC    USD     7203. 7212. 6935. 6985.
#> 3 2020-01-03 23:59:59     1 bitcoin Bitc… BTC    USD     6984. 7414. 6915. 7345.
#> 4 2020-01-04 23:59:59     1 bitcoin Bitc… BTC    USD     7345. 7427. 7310. 7411.
#> 5 2020-01-05 23:59:59     1 bitcoin Bitc… BTC    USD     7410. 7544. 7401. 7411.
#> 6 2020-01-06 23:59:59     1 bitcoin Bitc… BTC    USD     7410. 7782. 7409. 7769.
#> # ℹ 6 more variables: volume <dbl>, market_cap <dbl>, time_open <dttm>,
#> #   time_close <dttm>, time_high <dttm>, time_low <dttm>

Estimate the spread for each coin in each year:

df %>%
  mutate(yyyy = format(timestamp, "%Y")) %>%
  group_by(symbol, yyyy) %>%
  arrange(timestamp) %>%
  summarise(EDGE = edge(open, high, low, close))
#> # A tibble: 6 × 3
#> # Groups:   symbol [2]
#>   symbol yyyy      EDGE
#>   <chr>  <chr>    <dbl>
#> 1 BTC    2020  0.00319 
#> 2 BTC    2021  0.00376 
#> 3 BTC    2022  0.000200
#> 4 ETH    2020  0.00223 
#> 5 ETH    2021  0.00628 
#> 6 ETH    2022  0.00262

xts objects

The function spread() provides additional functionalities for xts objects. In the following example, we estimate bid-ask spreads for equities.

Download daily data for Microsoft (MSFT) using the quantmod package:

library(quantmod)
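# Note: getSymbols() selects a date range via `from` and `to`; with neither
# given, it returns the full default history, which starts in 2007.
x <- getSymbols("MSFT", auto.assign = FALSE)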
head(x)
#>            MSFT.Open MSFT.High MSFT.Low MSFT.Close MSFT.Volume MSFT.Adjusted
#> 2007-01-03     29.91     30.25    29.40      29.86    76935100      21.43699
#> 2007-01-04     29.70     29.97    29.44      29.81    45774500      21.40110
#> 2007-01-05     29.63     29.75    29.45      29.64    44607200      21.27906
#> 2007-01-08     29.65     30.10    29.53      29.93    50220200      21.48725
#> 2007-01-09     30.00     30.18    29.73      29.96    44636600      21.50879
#> 2007-01-10     29.80     29.89    29.43      29.66    55017400      21.29341

This is an xts object:

class(x)
#> [1] "xts" "zoo"

So we can estimate the spread with:

spread(x)
#>                   EDGE
#> 2023-12-14 0.005510711

By default, the call above is equivalent to:

edge(open = x[,1], high = x[,2], low = x[,3], close = x[,4])
#> [1] 0.005510711

But spread() also provides additional functionalities. For instance, estimate the spread for each month and plot the estimates:

sp <- spread(x, width = endpoints(x, on = "months"))
plot(sp)
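Here width receives a vector of breakpoints rather than a single window length. To show what endpoints() returns, this small sketch uses a made-up daily series (the object z and its date range are assumptions for illustration only):

```r
library(xts)

# Toy daily series covering January to March 2020 (91 days, 2020 is a leap year)
dates <- seq(as.Date("2020-01-01"), as.Date("2020-03-31"), by = "day")
z <- xts(seq_along(dates), order.by = dates)

# Row index of the last observation in each month, with a leading 0
ep <- endpoints(z, on = "months")
ep
#> [1]  0 31 60 91
```

Each pair of consecutive breakpoints delimits one month, which is how the call above ends up with one estimate per month.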

Or estimate the spread using a rolling window of 21 observations:

sp <- spread(x, width = 21)
plot(sp)

To illustrate higher-frequency estimates, we are going to download intraday data from Alpha Vantage. You must register with Alpha Vantage in order to download their data, but the one-time registration is fast and free. Register at https://www.alphavantage.co/ to receive your key. You can set the API key globally as follows:

setDefaults(getSymbols.av, api.key = "<API-KEY>")
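To keep the key out of scripts, one option is to read it from an environment variable. This is only a sketch: the variable name AV_API_KEY is a hypothetical choice, not something prescribed by quantmod or Alpha Vantage.

```r
library(quantmod)

# AV_API_KEY is a hypothetical variable name; it could be set in ~/.Renviron
key <- Sys.getenv("AV_API_KEY")

if (nzchar(key)) {
  # Register the key as the default for all later Alpha Vantage downloads
  setDefaults(getSymbols.av, api.key = key)
} else {
  message("AV_API_KEY is not set; pass api.key explicitly to getSymbols()")
}
```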

Download minute data for Microsoft:

x <- getSymbols(
  Symbols = "MSFT", 
  auto.assign = FALSE, 
  src = "av", 
  periodicity = "intraday", 
  interval = "1min", 
  output.size = "full")
head(x)
#>                     MSFT.Open MSFT.High MSFT.Low MSFT.Close MSFT.Volume
#> 2023-08-17 04:00:00    319.20    322.00   319.20     320.39         992
#> 2023-08-17 04:01:00    320.03    320.18   320.00     320.18         914
#> 2023-08-17 04:02:00    320.38    320.38   320.35     320.35         170
#> 2023-08-17 04:03:00    320.35    320.35   320.06     320.34          96
#> 2023-08-17 04:04:00    320.34    320.34   320.34     320.34          17
#> 2023-08-17 04:05:00    320.34    320.34   320.29     320.30          11

Estimate the spread for each day and plot the estimates:

sp <- spread(x, width = endpoints(x, on = "days"))
plot(sp)

GitHub

If you find this package useful, please star the repo!

The repository also contains implementations for Python, C++, MATLAB, and more.

Cite as

Ardia, David and Guidotti, Emanuele and Kroencke, Tim Alexander, “Efficient Estimation of Bid-Ask Spreads from Open, High, Low, and Close Prices”. Available at SSRN: https://www.ssrn.com/abstract=3892335

A BibTeX entry for LaTeX users is:

@unpublished{edge2021,
    author = {Ardia, David and Guidotti, Emanuele and Kroencke, Tim},
    title  = {Efficient Estimation of Bid-Ask Spreads from Open, High, Low, and Close Prices},
    year   = {2021},
    note   = {Available at SSRN},
    url    = {https://www.ssrn.com/abstract=3892335}
}