TL;DR: retrieving all the data associated with a series seems strangely slow, and I wonder whether there is something fishy in the way data is handled internally.
- I am creating a database with one measurement and one series, containing half a million points, each holding a single integer value, using minute precision.
- I am querying all the data with a single HTTP request.
- The query takes about 2 seconds, which seems a lot to me given the small amount of data.
In order to get more information, I used the profiling tool (pprof) as indicated on the InfluxDB contributing page. I discovered that:
- around 20% of the elapsed time was due to HTTP transfer
- nearly all the remaining time was spent in the stream function of one (or more) iterators
- it seemed that one iterator was a MergeSortIterator, and that a significant amount of time was spent building this MergeSortIterator using Go's heap interface.
There is of course the (highly plausible) possibility that I am misreading the code. But if this is not the case, here is why I am puzzled: series are supposed to be stored already sorted with respect to time, right? So why is there a sort operation when I am just retrieving all the data of the series?
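My current guess at what is happening (which I may be getting wrong): even if each shard or file stores its points sorted by time, the data for one series can span several of them, so the query engine still has to heap-merge several sorted streams at read time. A simplified sketch of such a merge, not InfluxDB's actual types, just an illustration of the pattern I think I am seeing in the profile:

```go
package main

import (
	"container/heap"
	"fmt"
)

// cursor wraps one already-sorted slice of timestamps (think: one shard).
type cursor struct {
	ts  []int64
	pos int
}

// mergeHeap orders cursors by their current timestamp, mimicking the
// heap a merge iterator keeps over its sorted inputs.
type mergeHeap []*cursor

func (h mergeHeap) Len() int            { return len(h) }
func (h mergeHeap) Less(i, j int) bool  { return h[i].ts[h[i].pos] < h[j].ts[h[j].pos] }
func (h mergeHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *mergeHeap) Push(x interface{}) { *h = append(*h, x.(*cursor)) }
func (h *mergeHeap) Pop() interface{} {
	old := *h
	n := len(old)
	c := old[n-1]
	*h = old[:n-1]
	return c
}

// merge produces one globally sorted stream from several sorted inputs.
func merge(inputs ...[]int64) []int64 {
	h := &mergeHeap{}
	for _, in := range inputs {
		if len(in) > 0 {
			heap.Push(h, &cursor{ts: in})
		}
	}
	var out []int64
	for h.Len() > 0 {
		// Pop the cursor with the smallest current timestamp,
		// emit it, advance, and re-push if it has more points.
		c := heap.Pop(h).(*cursor)
		out = append(out, c.ts[c.pos])
		c.pos++
		if c.pos < len(c.ts) {
			heap.Push(h, c)
		}
	}
	return out
}

func main() {
	fmt.Println(merge([]int64{1, 4, 7}, []int64{2, 5}, []int64{3, 6}))
	// prints [1 2 3 4 5 6 7]
}
```

If that guess is right, the "sort" I am seeing is really this merge step, which would explain the heap activity but not necessarily why it takes so long for a single series.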
I hope this is clear enough; if not, I can provide exact scripts to reproduce my observations.
I'd be pleased to have some answers, or advice on how to investigate the matter further.