Does data order matter when bulk inserting?

Hi there,

I’m new to InfluxDB but have a bit of experience with MySQL and SQL.

I have an IoT device that’s been logging data to MySQL for a number of years and I want to migrate that data to InfluxDB.

If I change the settings in the IoT device so that it starts writing to InfluxDB and then at some point in the future migrate the data from MySQL to InfluxDB, will it matter that InfluxDB will already have some data in it with datestamps that are after the data I’m bulk migrating? Or to ask another way, does the data in InfluxDB need to be inserted in date order, or will InfluxDB sort all that out behind the scenes?

A similar question is whether or not the data I’m migrating needs to be sorted in datestamp order or does it not matter?

Thanks!

Hi @Thinqer77,
Welcome to the community! Nice to meet you. Depending on how you have timestamped your data within SQL the data should automatically be stored in the correct historical order. Note if not we can change the data structure a little using Flux when you import it.

Sadly this example doesn’t quite relate to your example but the SQL flux package is an extremely powerful tool to allow you to start checking how your data will be ingested from your SQL server: Query SQL data sources with InfluxDB | InfluxDB OSS 2.0 Documentation

Once happy with the table structure you can then use the to() function to write the data to your bucket.

Let me know how you get on and we can walk through parts together.

Hi Jay,

Thanks for that. So to clarify, InfluxDB expects to data to be inserted in date order?

So what would happen if (per my first example) I started collecting data from my IoT device and let it run for a couple of weeks and then loaded all the old data (that would have datestamps earlier than the ‘new’ data that had been captured from the IoT device)?

Thanks in advance!

Hi @Thinqer77 ,
Sorry, i didn’t get notified by this. As long as your data is time-stamped appropriately the order in which it comes does not matter. If your data does not have a timestamp then the order does matter as this will be generated by InfluxDB upon receiving the data.

Hi Jay,

Thanks for that. My data is timestamped at source so I should be good.

Just out of interest, does InfluxDB care what resolution/format your timestamps are if you’re supplying them rather than them being auto added by the database? For instance, could it be date only (if doing daily logging) or does it need a time element? And if so, does the time element need to contain seconds or milliseconds, or can it just be hours and minutes of the day?

Cheers,

Ben.

Hi @Thinqer77 ,
I did a test this morning and parsed the following data successfully into InfluxDB. Here is the schema I used:

As you can see I used a collection of dates and on the final row a different timestamp. As you can see using the SQL.from function this was successfully imported into InfluxDB:

Just some things to note:
For Influx expects there to be a _time column for storing your time series data. Note if your column is called something different to this you can rename it in flux using rename() before saving to a bucket. Any troubles let me know.