Why is data duplicated when I write the same points and timestamps after changing the field value?

influxdb

#1

I have an application where a Python API reads information from InfluxDB and sends it to a cloud service. To avoid sending redundant information, I decided to use a dirty-bit concept.

Initially, all values are stored in InfluxDB with a field status=0. The information comes from a GPS device.

The information is stored as follows:

   gps,gateway=UMG3001,node=core,type=gps status=0 1527072593639895040

This data is stored continuously.

Query

In my Python API I use a wrapper class with the following method:

    def read_gps_points(self):
        return self.db.query('SELECT "lat","lon","alt" FROM "gps" WHERE "status"=0 LIMIT 100')

This gives me the required set of points.

Once I have adapted the output of the above query and sent it to the cloud, I would like to write the batch back to the DB with status=1, indicating that this information has already been uploaded and should not be queried again.

I change the status to 1 and write this information as a batch back into InfluxDB.

    batch = db.read_gps_points()  # read up to 100 points with status=0
    # Upload to cloud
    # Prepare the update
    updated_batch = []
    for each_measurement in list(batch)[0]:
        # Copy the queried fields, leaving out 'time', which belongs in the
        # point's 'time' key rather than in its field set
        fields = {k: v for k, v in each_measurement.items() if k != 'time'}
        fields['status'] = float(1)  # mark as uploaded (status=1)
        new_measurement = {
            'measurement': 'gps',
            'tags': {
                'gateway': 'UMG3001',
                'node': 'core',
                'type': 'gps'
            },
            'time': each_measurement['time'],
            'fields': fields
        }
        updated_batch.append(new_measurement)  # append to the list
    db.update_set(updated_batch)  # write the batch back

where update_set(updated_batch) is as follows:

    def update_set(self, incoming_set):
        self.db.write_points(incoming_set, batch_size=len(incoming_set), database='test')

If I check using SELECT COUNT(*) FROM "gps" WHERE "status"=0, I expect the count to decrease, because I am writing the points back with the same timestamps. That is not the case.
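My expectation comes from the documented InfluxDB rule that a point is identified by its measurement, tag set and timestamp, so a second write with the same three values should update the fields rather than add a row. The toy model below only illustrates that rule in plain Python; ToySeriesStore and its methods are hypothetical names, not part of InfluxDB or its client.

```python
class ToySeriesStore:
    """In-memory stand-in that keys points the way InfluxDB is documented to."""

    def __init__(self):
        self._points = {}

    def write(self, measurement, tags, timestamp_ns, fields):
        # Identity of a point: measurement + tag set + timestamp
        key = (measurement, tuple(sorted(tags.items())), timestamp_ns)
        # Same key: merge fields, new values win. New key: new point.
        self._points.setdefault(key, {}).update(fields)

    def count_where(self, field, value):
        return sum(1 for f in self._points.values() if f.get(field) == value)

    def __len__(self):
        return len(self._points)


store = ToySeriesStore()
tags = {'gateway': 'UMG3001', 'node': 'core', 'type': 'gps'}
ts = 1527072593639895040

store.write('gps', tags, ts, {'status': 0.0})
store.write('gps', tags, ts, {'status': 1.0})       # identical key: overwrite
points_after_overwrite = len(store)

store.write('gps', tags, ts - 40, {'status': 1.0})  # timestamp differs by 40 ns
points_after_new_timestamp = len(store)
```

In this model the second write replaces status=0, while any difference in the key, even a few nanoseconds in the timestamp or one tag value, produces a second point.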

SELECT * FROM gps LIMIT 100 returns:

time          alt  gateway lat                lon                node status type
1527072386612 -5.5 UMG3001 59.226371666666665 10.913943333333334 core 1      gps
1527072386612 -5.5 UMG3001 59.226371666666665 10.913943333333334 core 0      gps
1527072389684 -5.5 UMG3001 59.226371666666665 10.913943333333334 core 1      gps
1527072389684 -5.5 UMG3001 59.226371666666665 10.913943333333334 core 0      gps
1527072392668 -5.5 UMG3001 59.226371666666665 10.913943333333334 core 1      gps
1527072392668 -5.5 UMG3001 59.226371666666665 10.913943333333334 core 0      gps
1527072395680 -5.5 UMG3001 59.226371666666665 10.913943333333334 core 1      gps
1527072395680 -5.5 UMG3001 59.226371666666665 10.913943333333334 core 0      gps
1527072398618 -5.5 UMG3001 59.226371666666665 10.913943333333334 core 1      gps
1527072398618 -5.5 UMG3001 59.226371666666665 10.913943333333334 core 0      gps
1527072401625 -5.6 UMG3001 59.226371666666665 10.913943333333334 core 1      gps
1527072401625 -5.6 UMG3001 59.226371666666665 10.913943333333334 core 0      gps
1527072404663 -5.5 UMG3001 59.226371666666665 10.913943333333334 core 1      gps
1527072404663 -5.5 UMG3001 59.226371666666665 10.913943333333334 core 0      gps
1527072407619 -5.5 UMG3001 59.226371666666665 10.913943333333334 core 1      gps

As you can see, each timestamp now has two entries: one with status=0 and one with status=1.
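One thing worth checking, as a guess rather than a confirmed diagnosis: the timestamps above are shown at millisecond precision, while the points are stored with nanosecond timestamps such as 1527072593639895040. If the time value passes through a Python datetime on its way back to the database (whether the client library does this for string timestamps is an assumption here), the last three digits are lost, because datetime only resolves microseconds. The sketch below uses only the standard library; roundtrip_via_datetime is a hypothetical helper.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def roundtrip_via_datetime(timestamp_ns):
    """Convert a nanosecond epoch to a datetime and back, using integer math."""
    seconds, remainder_ns = divmod(timestamp_ns, 1_000_000_000)
    # datetime keeps microseconds only, so the sub-microsecond part is dropped
    dt = datetime.fromtimestamp(seconds, tz=timezone.utc).replace(
        microsecond=remainder_ns // 1_000
    )
    delta = dt - EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


original_ns = 1527072593639895040
written_ns = roundtrip_via_datetime(original_ns)

same_point = original_ns == written_ns
same_when_shown_in_ms = original_ns // 1_000_000 == written_ns // 1_000_000
```

Here the two timestamps differ by 40 ns, so they would be separate points, yet they look identical once truncated to milliseconds. If this turns out to be the cause, requesting integer timestamps from the query (the Python client's query method accepts an epoch argument, for example epoch='ns') and writing them back unchanged should avoid the conversion.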

Why is this the case? Please help.

P.S.

I have been struggling with this issue for a couple of days now and you can find this question on StackOverflow with a 50 point bounty up for grabs.


#2

One major bug I found was that status=0 was being written without any of the lat, lon and alt fields in the series. I will create a new data dump and see whether I can overwrite the information with status=1 in the way described above.
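For the new data dump, a sketch of what a single line-protocol entry carrying status together with lat, lon and alt could look like. to_line is a hypothetical helper written for this post; it does no escaping and assumes all field values are floats. The field values are taken from the query output above and the timestamp from the example line, purely for illustration.

```python
def to_line(measurement, tags, fields, timestamp_ns):
    """Build one line-protocol entry (no escaping, float fields only)."""
    tag_part = ','.join(f'{k}={v}' for k, v in sorted(tags.items()))
    field_part = ','.join(f'{k}={float(v)!r}' for k, v in sorted(fields.items()))
    return f'{measurement},{tag_part} {field_part} {timestamp_ns}'


line = to_line(
    'gps',
    {'gateway': 'UMG3001', 'node': 'core', 'type': 'gps'},
    {
        'lat': 59.226371666666665,
        'lon': 10.913943333333334,
        'alt': -5.5,
        'status': 0,  # dirty bit: not yet uploaded
    },
    1527072593639895040,
)
print(line)
```

Writing all four fields in one entry keeps them on the same point, so a later write that only changes status leaves lat, lon and alt in place.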