Why is my data returned in only one Series?

theo · June 28, 2017, 2:46pm

I set up a very simple, throwaway test table.

It looks like this:

> SELECT * FROM mytesttable
name: mytesttable
time                           Status async deppTag myTag subldubl value
----                           ------ ----- ------- ----- -------- -----
1970-01-01T00:00:00Z           OK     true  woot                   30
1970-01-01T00:00:00Z           OK                   Blub           0.1
1970-01-01T00:00:00.000000001Z BAD                  Blub           0.7
1970-01-01T00:00:00.001Z       BAD    true  subl                   11.11
1970-01-01T00:00:00.002Z       OK     true                woot     10
2017-06-28T13:15:12.285171247Z OK                   Blub           0.5

All of the columns are tags except for value, which is a field. I can verify this by looking at the series (and by noticing that time 0 obviously exists twice, so those two points must be in different series):

> SHOW SERIES
key
---
mytesttable,Status=BAD,async=true,deppTag=subl
mytesttable,Status=BAD,myTag=Blub
mytesttable,Status=OK,async=true,deppTag=woot
mytesttable,Status=OK,async=true,subldubl=woot
mytesttable,Status=OK,myTag=Blub

If I now query the entire measurement, I would expect the REST API (and the Java client and so on) to split the response into multiple series. The result, however, looks different: for a query like the following, I receive all of my data within a single series:

curl -G 'http://localhost:8086/query?pretty=true' --data-urlencode "db=test" --data-urlencode "q=SELECT * FROM \"mytesttable\""
{
    "results": [
        {
            "statement_id": 0,
            "series": [
                {
                    "name": "mytesttable",
                    "columns": [
                        "time",
                        "Status",
                        "async",
                        "deppTag",
                        "myTag",
                        "subldubl",
                        "value"
                    ],
                    "values": [
                        [
                            "1970-01-01T00:00:00Z",
                            "OK",
                            "true",
                            "woot",
                            null,
                            null,
                            30
                        ],
                        [
                            "1970-01-01T00:00:00Z",
                            "OK",
                            null,
                            null,
                            "Blub",
                            null,
                            0.1
                        ],
                        [
                            "1970-01-01T00:00:00.000000001Z",
                            "BAD",
                            null,
                            null,
                            "Blub",
                            null,
                            0.7
                        ],
                        [
                            "1970-01-01T00:00:00.001Z",
                            "BAD",
                            "true",
                            "subl",
                            null,
                            null,
                            11.11
                        ],
                        [
                            "1970-01-01T00:00:00.002Z",
                            "OK",
                            "true",
                            null,
                            null,
                            "woot",
                            10
                        ],
                        [
                            "2017-06-28T13:15:12.285171247Z",
                            "OK",
                            null,
                            null,
                            "Blub",
                            null,
                            0.5
                        ]
                    ]
                }
            ]
        }
    ]
}

Evidently, no tags are set in the response, and all tags are treated the same way as fields. The result consists of a single series, even though the data is stored in multiple series.

To make my point clear: I really like this behavior. As I mentioned in another question, I would find it quite inconvenient if the data in a single measurement could not be queried sorted “globally” over time. I do want to query my measurement in ascending time order even if that measurement is split across multiple series (and not only in ascending order per series).
The question is: can I rely on getting the data from a measurement sorted by time, even if it is huge and contains a lot of series? And can I assume that this kind of query always returns exactly one result with exactly one series (and that I get multiple series only if I use GROUP BY or similar)?
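
For client code that depends on this global ordering, a cheap safeguard is to check it instead of assuming it. Here is a minimal Python sketch. The helper names `rfc3339_to_ns` and `is_globally_time_sorted` are made up for illustration, and the `response` dict is a hand-trimmed copy of the JSON output above (only the time and value columns are kept). Timestamps are converted to integer nanoseconds first, because comparing the RFC 3339 strings directly would misorder `...00Z` and `...00.000000001Z`.

```python
import calendar
import time


def rfc3339_to_ns(ts: str) -> int:
    """Convert a UTC timestamp like '1970-01-01T00:00:00.001Z' to epoch nanoseconds."""
    whole, _, frac = ts.rstrip("Z").partition(".")
    seconds = calendar.timegm(time.strptime(whole, "%Y-%m-%dT%H:%M:%S"))
    # Pad the fractional part to 9 digits so ".001" becomes 1_000_000 ns.
    nanos = int(frac.ljust(9, "0")) if frac else 0
    return seconds * 1_000_000_000 + nanos


def is_globally_time_sorted(response: dict) -> bool:
    """True if the first result holds exactly one series with rows in ascending time order."""
    series = response["results"][0].get("series", [])
    if len(series) != 1:
        return False
    time_idx = series[0]["columns"].index("time")
    stamps = [rfc3339_to_ns(row[time_idx]) for row in series[0]["values"]]
    return all(a <= b for a, b in zip(stamps, stamps[1:]))


# Hand-trimmed copy of the response shown above (tag columns omitted).
response = {
    "results": [{
        "statement_id": 0,
        "series": [{
            "name": "mytesttable",
            "columns": ["time", "value"],
            "values": [
                ["1970-01-01T00:00:00Z", 30],
                ["1970-01-01T00:00:00Z", 0.1],
                ["1970-01-01T00:00:00.000000001Z", 0.7],
                ["1970-01-01T00:00:00.001Z", 11.11],
                ["1970-01-01T00:00:00.002Z", 10],
                ["2017-06-28T13:15:12.285171247Z", 0.5],
            ],
        }],
    }]
}

print(is_globally_time_sorted(response))  # True
```

The HTTP API also accepts an `epoch=ns` query parameter, which returns timestamps as integers and would make the string parsing unnecessary.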

Best regards

jackzampolin · June 28, 2017, 4:09pm

Yup! This is an advantage of a time series database!

In that query (without GROUP BY) you will receive all of your results in a single series, with one values array. Adding GROUP BY on one or more tags will split the results into one series per tag combination.
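
To illustrate the difference on the client side, here is a small Python sketch that handles both response shapes. With GROUP BY on a tag, each series entry in the JSON carries a "tags" object; without it, that key is absent. The function name `rows_by_series` is hypothetical, and both sample dicts are written by hand from the rows in the first post (roughly what `SELECT "value" FROM "mytesttable"` with and without `GROUP BY "Status"` would look like), not captured server output.

```python
def rows_by_series(response: dict) -> dict:
    """Map each series' tag set (a sorted tuple of pairs) to its rows as dicts."""
    grouped = {}
    for result in response.get("results", []):
        for series in result.get("series", []):
            # Without GROUP BY there is no "tags" key, so the key is an empty tuple.
            key = tuple(sorted(series.get("tags", {}).items()))
            columns = series["columns"]
            rows = [dict(zip(columns, values)) for values in series["values"]]
            grouped.setdefault(key, []).extend(rows)
    return grouped


# Hand-written sample: no GROUP BY, everything arrives in one series.
ungrouped_response = {"results": [{"statement_id": 0, "series": [
    {"name": "mytesttable", "columns": ["time", "value"],
     "values": [["1970-01-01T00:00:00Z", 30],
                ["1970-01-01T00:00:00.000000001Z", 0.7]]},
]}]}

# Hand-written sample: GROUP BY "Status", one series per tag value.
grouped_response = {"results": [{"statement_id": 0, "series": [
    {"name": "mytesttable", "tags": {"Status": "BAD"},
     "columns": ["time", "value"],
     "values": [["1970-01-01T00:00:00.000000001Z", 0.7]]},
    {"name": "mytesttable", "tags": {"Status": "OK"},
     "columns": ["time", "value"],
     "values": [["1970-01-01T00:00:00Z", 30]]},
]}]}

print(list(rows_by_series(ungrouped_response)))  # [()]
print(list(rows_by_series(grouped_response)))    # [(('Status', 'BAD'),), (('Status', 'OK'),)]
```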

Hope this helps!

theo · June 29, 2017, 6:18am

Can you tell me how this works internally?

I thought data is organized into different series, so you would need to merge the series data by timestamp to build a response.
So do you merge the series on demand?
Or do you keep, besides the series files, a global, time-sorted index of the form timestamp -> series for lookups, or something similar?
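
For readers wondering what "merging on demand" would mean in practice, here is a conceptual Python sketch of a k-way merge over per-series point lists that are each already sorted by time. This only illustrates the general technique; nothing in this thread confirms that InfluxDB implements it this way. The names `tagged` and `merge_series` are made up, and the data is the six points from the first post with timestamps written as epoch nanoseconds.

```python
import heapq


def tagged(key, points):
    """Yield (time_ns, series_key, value) for one series, keeping its order."""
    for time_ns, value in points:
        yield (time_ns, key, value)


def merge_series(series_map):
    """Lazily merge per-series sorted point lists into one globally time-sorted stream.

    Points with equal timestamps are ordered by series key in this sketch.
    """
    return heapq.merge(*(tagged(key, pts) for key, pts in series_map.items()))


# The five series from SHOW SERIES, each with its points in time order.
series_map = {
    "mytesttable,Status=BAD,async=true,deppTag=subl": [(1_000_000, 11.11)],
    "mytesttable,Status=BAD,myTag=Blub": [(1, 0.7)],
    "mytesttable,Status=OK,async=true,deppTag=woot": [(0, 30)],
    "mytesttable,Status=OK,async=true,subldubl=woot": [(2_000_000, 10)],
    "mytesttable,Status=OK,myTag=Blub": [(0, 0.1), (1498655712285171247, 0.5)],
}

for time_ns, key, value in merge_series(series_map):
    print(time_ns, value, key)
```

A merge like this only ever looks at the next unread point of each series, so it needs no global timestamp -> series index. Its cost grows with the number of series being merged.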
