Hi there,
We are currently evaluating InfluxDB as a potential replacement for our current data handling.
The data we want to store is tree-structured (e.g. “System.performance.cpu.usage”) and is always queried by its complete path (never just “System.performance”, always the entire path).
For scope: we have a couple thousand leaf nodes, most of which contain time-series data, and we write as often as every millisecond.
I translated the structure so that the last part is the field-key and everything else is the measurement (in the example: measurement = “System.performance.cpu” field-key=“usage”). Is that a viable way of doing it or is there a better way?
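To illustrate the mapping described above, here is a minimal sketch. `PathMapping` and `Split` are hypothetical names chosen for illustration and are not part of the InfluxDB client; the sketch assumes the path contains no spaces or commas, which would need escaping in line protocol.

```csharp
using System;

public static class PathMapping
{
    // Splits a full tree path at its last dot:
    // everything before it becomes the measurement, the last segment the field key.
    public static (string Measurement, string FieldKey) Split(string path)
    {
        int lastDot = path.LastIndexOf('.');
        if (lastDot <= 0 || lastDot == path.Length - 1)
            throw new ArgumentException("Path must contain a dot with text on both sides.", nameof(path));

        return (path.Substring(0, lastDot), path.Substring(lastDot + 1));
    }
}
```

With the example path, `PathMapping.Split("System.performance.cpu.usage")` yields the measurement `System.performance.cpu` and the field key `usage`.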
For writing I use the following code:
using (InfluxDBClient client = InfluxDBClientFactory.Create(clientOptions))
{
    client.EnableGzip(); // I’m not sure if this helps but it says so in the best-practices
    task = client.GetWriteApiAsync().WriteRecordAsync(WritePrecision.Ms, write);
}
clientOptions contains my URL (“localhost”), my token (all buckets read/write), my org and the bucket.
“write” is a string containing a batch of writes, format: “measurement leaf=data TimestampInMs\n…”
The write code above is executed with 5000 single records in one batch because of this: best-practices
I only wait for the task to complete if I need to query any of the written data.
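For reference, a batch string in the format described above could be assembled roughly as follows. This is only a sketch under assumptions: `LineProtocolBatch` and `Build` are hypothetical names, the values are assumed to be finite doubles, and measurement and field names are assumed to need no escaping.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Text;

public static class LineProtocolBatch
{
    // Produces one "measurement field=value timestampInMs\n" line per point.
    public static string Build(
        IEnumerable<(string Measurement, string FieldKey, double Value, long TimestampMs)> points)
    {
        var sb = new StringBuilder();
        foreach (var p in points)
        {
            sb.Append(p.Measurement).Append(' ')
              .Append(p.FieldKey).Append('=')
              // Invariant culture so the decimal separator is always '.'
              .Append(p.Value.ToString("R", CultureInfo.InvariantCulture)).Append(' ')
              .Append(p.TimestampMs.ToString(CultureInfo.InvariantCulture)).Append('\n');
        }
        return sb.ToString();
    }
}
```

The resulting string is what gets passed as `write` in the code above, with the write precision set to milliseconds.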
For querying I use the following code:
List<FluxTable> table;
using (InfluxDBClient client = InfluxDBClientFactory.Create(clientOptions))
{
    table = client.GetQueryApiSync().QuerySync(query);
}
clientOptions is the same as before and the query looks like this:
from(bucket: "MyBucket") |> range(start: -30d) |> filter(fn: (r) => r._measurement == "path") |> filter(fn: (r) => r["_field"] == "leaf") |> drop(columns: ["_start", "_stop", "_measurement", "_field"]) |> last()
As the data is quite often requested without a time period (mostly ‘last n points’), I am now using my retention period as the start time.
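For the ‘last n points’ case, the query string could be built along these lines. This is a sketch, not the code I actually run: `FluxQueries` and `LastN` are hypothetical names, it uses Flux's `tail(n:)` instead of `last()`, and it assumes the bucket, measurement and field names are trusted, since they are inserted into the query without escaping.

```csharp
using System;

public static class FluxQueries
{
    // Builds a Flux query returning the last n points of one measurement/field pair.
    // "start" defaults to the retention period used above (-30d).
    public static string LastN(string bucket, string measurement, string field, int n, string start = "-30d")
    {
        if (n < 1) throw new ArgumentOutOfRangeException(nameof(n));

        return string.Join(" |> ",
            $"from(bucket: \"{bucket}\")",
            $"range(start: {start})",
            $"filter(fn: (r) => r._measurement == \"{measurement}\" and r._field == \"{field}\")",
            $"tail(n: {n})");
    }
}
```

The returned string can be passed to `QuerySync` in place of `query`.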
Now to my issue:
While the time spent on writing is more than fast enough (0.6ms - 2ms per batch), the time for reading lags far behind our current implementation (0.5ms vs 0.03ms).
Even though read and write times are fairly similar, the batch size makes up for the added time per write, but of course that doesn’t help the read time.
Is there any way to make my queries faster?
As Influx beats our current implementation at everything but querying, a trade-off is more than okay.
I am using InfluxDB 2.1 and the C# API client version 4.0.0.