I have installed InfluxDB v1.8.2 on CentOS 7.
My machine has 64 GB of memory and the CPU is an Intel(R) Core™ i9-9900K @ 3.6 GHz.
I have one measurement, "stock"."full_sz"."order_sz", which holds about 59,660,943 records per day.
influx
Connected to http://localhost:8086 version 1.8.2
InfluxDB shell version: 1.8.2
> show retention policies
name duration shardGroupDuration replicaN default
---- -------- ------------------ -------- -------
autogen 0s 168h0m0s 1 true
full_sz 0s 720h0m0s 1 false
> show tag keys on "stock" from "stock"."full_sz"."order_sz"
>
// No output, since there are no tags on this measurement
> show field keys on "stock" from "stock"."full_sz"."order_sz"
name: order_sz
fieldKey fieldType
-------- ---------
chan integer
ex string
exchtime integer
md string
price integer
qty integer
seq integer
side string
src string
sym string
type string
> show series on "stock" from "stock"."full_sz"."order_sz"
key
---
order_sz
> show shards
name: stock
id database retention_policy shard_group start_time end_time expiry_time owners
-- -------- ---------------- ----------- ---------- -------- ----------- ------
623 stock full_sz 623 2020-08-04T00:00:00Z 2020-09-03T00:00:00Z 2020-09-03T00:00:00Z
Currently there is only one shard for retention policy "full_sz".
The write requests look like:
"POST /write?db=stock&p=%5BREDACTED%5D&precision=ns&rp=full_sz&u=writer HTTP/1.1"
order_sz sym="000001",ex="Z",exchtime=20200821093000000i,chan=2001i,seq=131i,price=155800i,qty=100i,side="1",type="1",md="011",src="102" 1597973400000000011
order_sz sym="000001",ex="Z",exchtime=20200821093000000i,chan=2001i,seq=131i,price=156000i,qty=100i,side="1",type="1",md="011",src="102" 1597973400000000012
…
The timestamp is incremented by 1 within each second:
1597973400.000000011
1597973400.000000012
…
1597973400.000050032
1597973401.000000000
1597973401.000000001
…
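To make the scheme above concrete, here is a rough sketch (not my actual writer code) of how the timestamps are formed: every point within the same second shares a second-resolution base, and a per-second sequence number is added as extra nanoseconds so that points do not overwrite each other. The helper name `make_point` and the trimmed field set are illustrative only.

```python
def make_point(base_second_ns: int, seq_in_second: int, fields: str) -> str:
    """Build one line-protocol point for the tag-less order_sz measurement.

    `fields` is a pre-built field set, e.g. 'sym="000001",qty=100i'.
    The sequence number within the second is folded into the nanosecond
    part of the timestamp so every point is unique.
    """
    return f"order_sz {fields} {base_second_ns + seq_in_second}"

# Base second matching the timestamps shown above (09:30:00 per the exchtime field)
base = 1597973400 * 1_000_000_000

p1 = make_point(base, 11, 'sym="000001",ex="Z",price=155800i,qty=100i')
p2 = make_point(base, 12, 'sym="000001",ex="Z",price=156000i,qty=100i')
# p1 ends with timestamp 1597973400000000011, p2 with 1597973400000000012
```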
I have loaded several days' worth of data; for one day, 2020-08-25, there are about 59,660,943 records.
Then I run this query:
SELECT sym, ex, exchtime, chan, seq, price, qty, side, type FROM stock.full_sz.order_sz WHERE time >= '2020-08-25' AND time < '2020-08-25' + 1d TZ('Asia/Shanghai')
It takes a long time:
> show queries;
qid query database duration status
--- ----- -------- -------- ------
44 SELECT sym, ex, exchtime, chan, seq, price, qty, side, type FROM stock.full_sz.order_sz WHERE time >= '2020-08-25' AND time < '2020-08-25' + 1d TZ('Asia/Shanghai') stock 16m36s running
59 SHOW QUERIES
Finally, the journalctl log shows:
Aug 29 13:18:28 ds26 influxd[18742]: ts=2020-08-29T05:18:28.373587Z lvl=info msg="Executing query" log_id=0Oti26c0000 service=query query="SELECT sym, ex, exchtime, chan, seq, price, qty, side, type FROM stock.full_sz.order_sz WHERE time >= '2020-08-25' AND time < '2020-08-25' + 1d TZ('Asia/Shanghai')"
Aug 29 13:37:45 ds26 influxd[18742]: [httpd] 192.168.0.23 - muse [29/Aug/2020:13:18:28 +0800] "GET /query?chunk_size=10000&chunked=true&db=stock&epoch=ns&p=%5BREDACTED%5D&q=SELECT+%22sym%22%2C%22ex%22%2C%22exchtime%22%2C%22chan%22%2C%22seq%22%2C%22price%22%2C%22qty%22%2C%22side%22%2C%22type%22+FROM+%22stock%22.%22full_sz%22.%22order_sz%22+WHERE+time%3E%3D%272020-08-25%27+AND+time%3C%272020-08-25%27%2B24h+tz%28%27Asia%2FShanghai%27%29&u=muse HTTP/1.1" 200 5104142042 "-" "-" 124fe3dc-e9b7-11ea-a0d5-9c5c8ebc1688 1156799961
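For reference, the client request seen in the httpd log above can be reconstructed roughly like this (a sketch only; the host is a placeholder and credentials are omitted):

```python
from urllib.parse import urlencode

# Field list and parameters taken from the logged GET /query request.
FIELDS = '"sym","ex","exchtime","chan","seq","price","qty","side","type"'

def build_query_url(host: str, day: str) -> str:
    """Rebuild the chunked /query URL for one day of order_sz data."""
    q = (f'SELECT {FIELDS} FROM "stock"."full_sz"."order_sz" '
         f"WHERE time>='{day}' AND time<'{day}'+24h tz('Asia/Shanghai')")
    params = {
        "db": "stock",
        "epoch": "ns",
        "chunked": "true",      # stream results back in chunks
        "chunk_size": "10000",  # 10,000 points per chunk, as in the log
        "q": q,
    }
    return f"{host}/query?{urlencode(params)}"

url = build_query_url("http://localhost:8086", "2020-08-25")
```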
It takes 19 minutes 17 seconds in total.
This exceeds what I expected; I think it should complete within 5 minutes.
Here are the stats for the "full_sz" shard:
> show stats
name: shard
tags: database=stock, engine=tsm1, id=580, indexType=tsi1, path=/home/influxdb/var/lib/influxdb/data/stock/full_sz/580, retentionPolicy=full_sz, walPath=/home/influxdb/var/lib/influxdb/wal/stock/full_sz/580
diskBytes fieldsCreate seriesCreate writeBytes writePointsDropped writePointsErr writePointsOk writeReq writeReqErr writeReqOk
--------- ------------ ------------ ---------- ------------------ -------------- ------------- -------- ----------- ----------
35278344877 1200651 4 0 0 0 872263918 87242 0 87242
name: tsm1_engine
tags: database=stock, engine=tsm1, id=580, indexType=tsi1, path=/home/influxdb/var/lib/influxdb/data/stock/full_sz/580, retentionPolicy=full_sz, walPath=/home/influxdb/var/lib/influxdb/wal/stock/full_sz/580
cacheCompactionDuration cacheCompactionErr cacheCompactions cacheCompactionsActive tsmFullCompactionDuration tsmFullCompactionErr tsmFullCompactionQueue tsmFullCompactions tsmFullCompactionsActive tsmLevel1CompactionDuration tsmLevel1CompactionErr tsmLevel1CompactionQueue tsmLevel1Compactions tsmLevel1CompactionsActive tsmLevel2CompactionDuration tsmLevel2CompactionErr tsmLevel2CompactionQueue tsmLevel2Compactions tsmLevel2CompactionsActive tsmLevel3CompactionDuration tsmLevel3CompactionErr tsmLevel3CompactionQueue tsmLevel3Compactions tsmLevel3CompactionsActive tsmOptimizeCompactionDuration tsmOptimizeCompactionErr tsmOptimizeCompactionQueue tsmOptimizeCompactions tsmOptimizeCompactionsActive
----------------------- ------------------ ---------------- ---------------------- ------------------------- -------------------- ---------------------- ------------------ ------------------------ --------------------------- ---------------------- ------------------------ -------------------- -------------------------- --------------------------- ---------------------- ------------------------ -------------------- -------------------------- --------------------------- ---------------------- ------------------------ -------------------- -------------------------- ----------------------------- ------------------------ -------------------------- ---------------------- ----------------------------
1411213589462 0 6969 0 2027161615251 0 0 17 0 367887072257 0 0 871 0 595477493339 0 0 217 0 706036649570 0 0 54 0 0 0 0 0 0
name: tsm1_cache
tags: database=stock, engine=tsm1, id=580, indexType=tsi1, path=/home/influxdb/var/lib/influxdb/data/stock/full_sz/580, retentionPolicy=full_sz, walPath=/home/influxdb/var/lib/influxdb/wal/stock/full_sz/580
WALCompactionTimeMs cacheAgeMs cachedBytes diskBytes memBytes snapshotCount writeDropped writeErr writeOk
------------------- ---------- ----------- --------- -------- ------------- ------------ -------- -------
1407453 157492999 231507354732 0 0 0 0 0 87242
name: tsm1_filestore
tags: database=stock, engine=tsm1, id=580, indexType=tsi1, path=/home/influxdb/var/lib/influxdb/data/stock/full_sz/580, retentionPolicy=full_sz, walPath=/home/influxdb/var/lib/influxdb/wal/stock/full_sz/580
diskBytes numFiles
--------- --------
35278344877 23
name: tsm1_wal
tags: database=stock, engine=tsm1, id=580, indexType=tsi1, path=/home/influxdb/var/lib/influxdb/data/stock/full_sz/580, retentionPolicy=full_sz, walPath=/home/influxdb/var/lib/influxdb/wal/stock/full_sz/580
currentSegmentDiskBytes oldSegmentsDiskBytes writeErr writeOk
----------------------- -------------------- -------- -------
0 0 0 87242
During the query, the top command shows that one thread consumes almost all of a single CPU core:
top - 13:19:34 up 33 days, 21:14, 3 users, load average: 0.85, 0.53, 0.27
Threads: 347 total, 2 running, 345 sleeping, 0 stopped, 0 zombie
%Cpu(s): 6.6 us, 0.0 sy, 0.0 ni, 93.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 32614340 total, 262644 free, 9475420 used, 22876276 buff/cache
KiB Swap: 16449532 total, 16352252 free, 97280 used. 21435168 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
18753 influxdb 20 0 16.9t 12.8g 4.6g R 99.7 41.3 13:27.31 influxd
18789 influxdb 20 0 16.9t 12.8g 4.6g S 2.3 41.3 11:06.70 influxd
18767 influxdb 20 0 16.9t 12.8g 4.6g S 1.0 41.3 15:58.36 influxd
18778 influxdb 20 0 16.9t 12.8g 4.6g S 1.0 41.3 13:15.00 influxd
18754 influxdb 20 0 16.9t 12.8g 4.6g S 0.7 41.3 15:52.25 influxd
18761 influxdb 20 0 16.9t 12.8g 4.6g S 0.3 41.3 13:55.76 influxd
18766 influxdb 20 0 16.9t 12.8g 4.6g S 0.3 41.3 13:04.77 influxd
18777 influxdb 20 0 16.9t 12.8g 4.6g S 0.3 41.3 13:59.17 influxd
1 root 20 0 46396 4764 2752 S 0.0 0.0 0:09.38 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.24 kthreadd
4 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
6 root 20 0 0 0 0 S 0.0 0.0 0:12.47 ksoftirqd/0
7 root rt 0 0 0 0 S 0.0 0.0 0:00.59 migration/0
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
9 root 20 0 0 0 0 S 0.0 0.0 0:34.42 rcu_sched
10 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 lru-add-drain
11 root rt 0 0 0 0 S 0.0 0.0 0:02.28 watchdog/0
12 root rt 0 0 0 0 S 0.0 0.0 0:02.16 watchdog/1
13 root rt 0 0 0 0 S 0.0 0.0 0:00.10 migration/1
14 root 20 0 0 0 0 S 0.0 0.0 0:00.20 ksoftirqd/1
16 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/1:0H
I don't know why this one thread uses so much CPU. Maybe it is decompressing the data?
Could the query be executed by multiple threads?
Could someone give some suggestions on how to improve the query performance?
I just want the query to finish within 5 minutes instead of 19 minutes 17 seconds.
Do I need to add a tag, or is there some other suggestion?
Thanks!!