Strange query issues when database reaches ~140G

I opened a GitHub issue on this: "Queries slow or never respond but only on certain days and when database reaches ~150G" (Issue #8221, influxdata/influxdb).

@pauldix
It’s actually 4.5 days’ worth of data. The data is drawn from selected (financial) market data taken from several different sources. I haven’t calculated values/sec, and the rate varies considerably by time of day, but I would estimate that it is high, on the order of 10,000 records per second. Here is a typical query, a full select:

SELECT originPopHost,sameTimeCount,symbolId,anomalyDetected,packetNbr,packetSize,packetStartPtr,channelId,gapDetected,bookLocked,bookCrossed,destIp,seqNo,fromSecondary,exchSeqNo,bidPrx,askPrx,prxExp,numOfTrades,newTradingStatus,overflowFlag,qdoFirstHexPtr,qdoLastHexPtr,extraNanos,errorReason FROM qedpcaps."SEVEN_DAYS_ONE_COPY".QED_PROD_snapshot WHERE originPopHost='xxxx005' AND time >= 1490364000s AND time <= 1490364060s LIMIT 1000 OFFSET 0

The first 4 selected columns are tags; the rest are fields.
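
In case it helps anyone reproduce this, here is a minimal Python sketch of how a query like the one above could be assembled for different hosts and time windows and turned into a request URL for the InfluxDB 1.x HTTP `/query` endpoint. The base URL (`http://localhost:8086`) and the helper names `build_query` and `build_url` are assumptions for illustration, not part of my actual setup, and the sketch only builds the strings; it does not send anything.

```python
from urllib.parse import urlencode

# Assumed endpoint for illustration only; substitute the real host and port.
BASE_URL = "http://localhost:8086/query"

# Columns from the query above: the first 4 are tags, the rest are fields.
TAGS = ["originPopHost", "sameTimeCount", "symbolId", "anomalyDetected"]
FIELDS = [
    "packetNbr", "packetSize", "packetStartPtr", "channelId", "gapDetected",
    "bookLocked", "bookCrossed", "destIp", "seqNo", "fromSecondary",
    "exchSeqNo", "bidPrx", "askPrx", "prxExp", "numOfTrades",
    "newTradingStatus", "overflowFlag", "qdoFirstHexPtr", "qdoLastHexPtr",
    "extraNanos", "errorReason",
]


def build_query(host, start_s, end_s, limit=1000, offset=0):
    """Build the InfluxQL SELECT for one host and an epoch-seconds window.

    Note: `host` is interpolated directly, so only pass trusted values.
    """
    columns = ",".join(TAGS + FIELDS)
    return (
        f"SELECT {columns} "
        'FROM qedpcaps."SEVEN_DAYS_ONE_COPY".QED_PROD_snapshot '
        f"WHERE originPopHost='{host}' "
        f"AND time >= {start_s}s AND time <= {end_s}s "
        f"LIMIT {limit} OFFSET {offset}"
    )


def build_url(query, base_url=BASE_URL):
    """URL-encode the query for a GET request against /query."""
    return base_url + "?" + urlencode({"db": "qedpcaps", "q": query})


if __name__ == "__main__":
    q = build_query("xxxx005", 1490364000, 1490364060)
    print(q)
    print(build_url(q))
```

Varying `start_s` and `end_s` across the 4.5 days makes it easy to check whether the slow responses are tied to particular time ranges.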

Possibly of note: the process of parsing files from different sources and then inserting them into Influx normally involves a lot of records being inserted out of order (backfilled).
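
For a rough sense of scale, the figures above can be turned into a back-of-the-envelope record count. This is only a sketch under loud assumptions: it treats the 10,000 records per second estimate as a constant average (the real rate varies by time of day), and it treats ~140G as 140 decimal gigabytes on disk.

```python
SECONDS_PER_DAY = 86_400

rate_per_sec = 10_000   # rough estimate from the post, assumed constant here
days_of_data = 4.5      # retention actually present in the database
db_bytes = 140e9        # ~140G, assumed to mean decimal gigabytes on disk

# Total records written over the period at the assumed average rate.
total_records = rate_per_sec * SECONDS_PER_DAY * days_of_data

# Implied average on-disk footprint per record (tags + fields + index).
bytes_per_record = db_bytes / total_records

print(f"estimated records: {total_records:,.0f}")
print(f"implied bytes per record: {bytes_per_record:.1f}")
```

Under those assumptions this works out to roughly 3.9 billion records, so each 60-second window like the one in the query covers on the order of 600,000 records across all hosts before the tag filter is applied.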