As-of operations and irregular spacing of data using Flux

There is an existing topic on how to do as-of type operations on irregularly spaced time series where some windows contain no points.

I am trying to do a similar as-of interpolation of irregularly spaced data using the Flux language, i.e. on a regular time grid, fill in the most recent value from the irregular stream of data.

When the irregular data is dense, the aggregate approach works:

  • window the data
  • select the last data point in each window
  • de-window

For example, we can derive a weekly price series from a dense higher-frequency series as follows:

t1 = from(bucket: "quandl")
    |> range(start: -2520d, stop: 0m)
    |> filter(fn: (r) => r["_measurement"] == "trade")
    |> filter(fn: (r) => r["sym"] == "XLA")
    |> window(every: 5d)
    |> last()
    |> window(every: inf)

The problem starts when there is no earlier data point for the first window, or when some windows contain no data point. For example, given daily price data, assume we want to generate an intraday series every 3 hours using the last close price. Most windows would be empty and would not be generated.
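To pin down the intended result independently of Flux, here is a minimal Python sketch of the as-of fill on a regular grid. It is not a Flux solution. The function name `asof_fill` and the sample prices are hypothetical and only illustrate the semantics: each grid time takes the most recent earlier value, and grid times before the first data point stay empty.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def asof_fill(points, start, stop, every):
    """Sample irregular (time, value) points on a regular grid.

    Each grid time gets the value of the most recent point at or before it,
    or None when no earlier point exists. `points` must be sorted by time.
    """
    times = [t for t, _ in points]
    out = []
    t = start
    while t <= stop:
        i = bisect_right(times, t)  # number of points with time <= t
        out.append((t, points[i - 1][1] if i else None))
        t += every
    return out


# Hypothetical daily closes, sampled every 3 hours.
closes = [
    (datetime(2020, 1, 2, 16), 135.42),
    (datetime(2020, 1, 3, 16), 134.34),
]
grid = asof_fill(
    closes,
    start=datetime(2020, 1, 2, 12),
    stop=datetime(2020, 1, 3, 18),
    every=timedelta(hours=3),
)
for t, v in grid:
    print(t.isoformat(), v)
```

The first two grid points (12:00 and 15:00 on the first day) have no earlier close and come out as None; every later grid point carries the latest close forward until the next one arrives.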

The logical way to do this seems to be to:

  • window the data, generating empty windows as well
  • select the last point in each window, with null if missing
  • de-window
  • fill

However, it fails because:

  • de-windowing drops empty tables
  • empty windows generate tables without rows, and even setting non-null dummy values does not work

Here is the code:

t1 = from(bucket: "quandl")
    |> range(start: -2520d, stop: 0m)
    |> filter(fn: (r) => r["_measurement"] == "trade")
    |> filter(fn: (r) => r["_field"] == "close")
    |> filter(fn: (r) => r["sym"] == "IBM")
    |> window(every: 3h, createEmpty: true)
    |> window(every: inf)
    |> fill(column: "_value", usePrevious: true)

I note that empty tables have good default values; maybe it is possible to insert these as a row? It would also be great if a developer would update the post linked above with Flux examples.
Also, showing how to

  • implement an as-of join on irregular time series
  • align one series using an as-of operation on the irregular time grid of a different series

both of which are critical for a strong time series language, would be extremely helpful.

See example:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.merge_asof.html
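For readers unfamiliar with the operation, the following is a rough Python sketch of a backward as-of join, which is the default behavior of `merge_asof` in the pandas reference above. It is not Flux. The helper `asof_join` and the integer timestamps are hypothetical, chosen only to show how one series is aligned onto the irregular time grid of another.

```python
from bisect import bisect_right


def asof_join(left, right):
    """Backward as-of join of two time-sorted lists of (time, value).

    For each row in `left`, attach the latest `right` value whose time is
    <= the left time, or None if `right` has no such row.
    """
    right_times = [t for t, _ in right]
    joined = []
    for t, left_value in left:
        i = bisect_right(right_times, t)  # rows of `right` with time <= t
        right_value = right[i - 1][1] if i else None
        joined.append((t, left_value, right_value))
    return joined


# Hypothetical data: trades are aligned to the most recent quote.
trades = [(1, 100.0), (4, 101.5), (9, 99.0)]
quotes = [(2, 100.2), (3, 100.4), (8, 99.5)]
result = asof_join(trades, quotes)
print(result)
# [(1, 100.0, None), (4, 101.5, 100.4), (9, 99.0, 99.5)]
```

The output keeps the time grid of `trades`; the trade at time 1 has no earlier quote, so its joined value is None rather than a value taken from the future.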

It seems you need to perform a fill with linear interpolation, as used in the blog post. This isn't currently supported in Flux, but it is something that should be supported. We have an issue for looking at this: "Spec fill/Interpolate function" (influxdata/flux issue #436).

For the current behavior, the selectors will unfortunately not generate a row. I think you can use a workaround though. I have created this issue to both document the difference in behavior and also document the workaround: "Selectors do not insert a null row when used on an empty table" (influxdata/flux issue #3064).

Thanks.

You need different as-of schemes. Linear interpolation is actually pretty undesirable in many applications because the interpolated value depends on the next, not yet observed, point; such causality violations are fatal in finance.
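The causality point can be illustrated with a small Python sketch (not Flux; the helper names `asof` and `linear` and the numbers are hypothetical). Two series share the same past and differ only in a future point. The as-of value at the query time is unaffected, while the linearly interpolated value changes, which means it uses information not yet available at that time.

```python
from bisect import bisect_right


def asof(points, t):
    """Most recent value at or before t, or None if there is none."""
    times = [p[0] for p in points]
    i = bisect_right(times, t)
    return points[i - 1][1] if i else None


def linear(points, t):
    """Linear interpolation between the neighbors of t, None outside range."""
    times = [p[0] for p in points]
    i = bisect_right(times, t)
    if i == 0:
        return None  # t is before the first observation
    t0, v0 = points[i - 1]
    if t0 == t:
        return v0  # exact hit, no interpolation needed
    if i == len(points):
        return None  # t is after the last observation
    t1, v1 = points[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


# Same history up to time 5; only the observation at time 10 differs.
flat = [(0, 100.0), (10, 100.0)]
jump = [(0, 100.0), (10, 120.0)]

print(asof(flat, 5), asof(jump, 5))      # 100.0 100.0
print(linear(flat, 5), linear(jump, 5))  # 100.0 110.0
```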