Change type from string to integer

Unfortunately I have a measurement that has fields with a number as the “_value”, but the type is a string instead of an integer.
I want to convert the “_value” to an integer (overwrite). How can I do this?
Unfortunately I found no easy way to do it.
That’s why I wrote a query to get this measurement, convert it to integers, and write it to a new bucket:

    // Read the measurement and cast every _value to an integer
    datatoconvert = from(bucket: "telegraf")
      |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
      |> filter(fn: (r) => r._measurement == "asset_vm_cmdb")
      |> toInt()

    // Keep only rows where the cast produced a value, then write them out
    datatoconvert
      |> filter(fn: (r) => exists r._value)
      |> to(bucket: "typeconversion", org: "<MY_ORG>")

I can’t run this query, maybe because it hits a timeout.
So I thought I could prevent this by creating a “task” for it. But the task always fails with the following error message: could not execute task run; Err: context canceled: context canceled
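
For reference, a minimal sketch of what such a task could look like (the task name and every interval are placeholders):

    option task = {name: "typeconversion", every: 24h}

    from(bucket: "telegraf")
      |> range(start: -task.every)  // only data since the last run
      |> filter(fn: (r) => r._measurement == "asset_vm_cmdb")
      |> toInt()
      |> filter(fn: (r) => exists r._value)
      |> to(bucket: "typeconversion", org: "<MY_ORG>")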

Can someone help me?


Yep, agreed. +1 on trying to convert your data, though; I ended up just dropping the bucket and creating a new one.

If you switch to the raw data view in the Data Explorer, does datatoconvert contain anything? (I.e. does only the first part of the query work, or are both failing/timing out?)

If you do get data from the first part, what is the smallest time window you can select that still contains, say, 10 records? Can you then run the second query as well, or does it still time out?
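
For example, you could hard-code a short window instead of the dashboard variables and cap the output while testing (the 15m value is arbitrary):

    from(bucket: "telegraf")
      |> range(start: -15m)  // arbitrary short, hard-coded window
      |> filter(fn: (r) => r._measurement == "asset_vm_cmdb")
      |> limit(n: 10)  // cap the result while testing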

First, thanks for your reply.
I switched to the raw data view, but the main issue is that the whole browser gets stuck when I run this query.

Unfortunately I can’t get any data even when I select a very small window. This bucket/measurement contains data from a script that runs every 24h (asset data from over 2700 VMs). Even when I run the query against a single interval I reach a “timeout”.
It only works when I filter the query down to one VM, for example; then everything works.

That sounds strange… :thinking:
Are you sure your data is clean? I.e. are you sure that there is always a “string integer” in the value?

So a workaround could be to adapt the Flux script so that it iterates over all VMs? See the sketch below.
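
Flux has no real foreach, so the iteration would have to happen outside the query, e.g. a shell or PowerShell loop that runs one query per VM. A minimal per-VM sketch (the vm_name tag key and value are assumptions; substitute your actual tag):

    // Run once per VM, with the VM name substituted by the outer loop
    from(bucket: "telegraf")
      |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
      |> filter(fn: (r) => r._measurement == "asset_vm_cmdb")
      |> filter(fn: (r) => r.vm_name == "vm-0001")  // hypothetical tag key/value
      |> toInt()
      |> to(bucket: "typeconversion", org: "<MY_ORG>")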

Another possibility would be to solve the problem right at the time of capture in Telegraf and convert it there?
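
If the data did come in through Telegraf, the converter processor could cast the fields at ingest time. A sketch (the field names are placeholders):

    [[processors.converter]]
      [processors.converter.fields]
        # cast these string fields to integers before writing
        integer = ["field1", "field2", "field3"]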

Not always. Some fields are proper integers and some are “string integers”.

That’s a nice idea. I’ll have a look at how I can do a foreach and come back to this post.
The problem is that the data is not coming from Telegraf. We have a PowerShell script that collects the data and writes directly to Influx, and that’s where the issue with the types came from. I fixed the script just now, but I can’t write new data to Influx because of the data type difference.

Probably only as a last resort, but you can export all the data within a bucket. My understanding is that the export comes out as line protocol. If so, you could fix the historical data there (e.g. using find/replace, or even PowerShell), as sketched below.
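
To illustrate: in line protocol a string field is double-quoted, while a real integer carries an i suffix, so the fix is turning the first form below into the second (the tag and field names are made up):

    asset_vm_cmdb,vm=vm-0001 cpu_count="4" 1609459200000000000
    asset_vm_cmdb,vm=vm-0001 cpu_count=4i 1609459200000000000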

Once the data is corrected you could either drop the old bucket and import from the cleaned export, or import into a new bucket altogether.

That could be the problem. What happens when the toInt() function gets an integer instead of a string?
Idea: if you can do type checking in Flux (is that possible?), you could convert only the string-typed values?

@FixTestRepeat Unfortunately this data is in a bucket with a lot of other data (for example from the Telegraf agent, …) and I can’t export the whole bucket, and I found no way to export only a single measurement from the bucket.

@Franky1 Thanks for the idea, but I found no way to check the types in Flux (unfortunately I think there is no “type checking function” in Flux).

Is there really no way in Flux to overwrite the type of some fields in a measurement? :frowning:

In the meantime, have you made sure that these mixed data types are the real problem?

@scott or @Anaisdg , can either of you confirm if there is a quick and simple way to handle this?

@Franky1 I am not sure how I can verify this, but I think this is the issue. Or maybe there is a bug that times out my queries, and that’s the reason I can’t convert the type.

@fluxator @Franky1 How large of a time range are you querying and what is the resolution of your data? To me, it sounds like you’re potentially pulling back a lot of data points.

There are some operations that Flux can “push down” to the underlying data storage engine and perform there (much faster). In your query, the range and filter operations are pushed down to storage, but toInt cannot be pushed down. Flux has to store all of the data returned from filter in memory and operate on it there. I think this is why your query is timing out. You’re likely running out of memory.

Try querying smaller time ranges or more specific sets of data.
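
For example, something like this, which narrows both the time range and the field before anything has to be processed in memory (the field name is just a placeholder):

    from(bucket: "telegraf")
      |> range(start: -1d)  // one day instead of the full 30 days
      |> filter(fn: (r) => r._measurement == "asset_vm_cmdb")
      |> filter(fn: (r) => r._field == "cpu_count")  // hypothetical field; this filter is also pushed down
      |> toInt()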

When it comes to replacing a field with a new type, the only way to do that is to write the type-casted field to a new bucket. If you try to write the updated field to the same bucket, the request will fail due to a type conflict.

@scott The time range I query is 30 days, because the bucket has a retention period of 30 days and I want to change the type of three fields for the whole bucket. But the resolution in this bucket for this measurement is low (the data comes from a PowerShell script every 24h). Even when I run the query for only one interval, the query “times out” or gets stuck.

The InfluxDB VM has a lot of power: 8 cores & 64 GB of memory. It doesn’t look like the memory is running out; I watch the memory usage with “htop” while I run the query.

I know, but that’s why I want to get all the data, change/convert the type, and write it to a new measurement or bucket. But as I mentioned, this is not working.

  • Do you get any errors in the influxd output or in your browser console?
  • What version of InfluxDB are you running?

@scott

  • If you mean the daemon logs (sudo journalctl -u influxdb.service), then I can’t find any errors or anything interesting there. I also can’t find any errors in the Chrome console, but the InfluxDB GUI hangs after about 80 seconds.

  • I run InfluxDB version 2.0.2 (84496e5).

Hmm, that’s really strange. I don’t see anything about that query that would make it hang. Could you try running:

    from(bucket: "telegraf")
        |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
        |> filter(fn: (r) => r._measurement == "asset_vm_cmdb")
        |> count()

If that works, run:

    from(bucket: "telegraf")
        |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
        |> filter(fn: (r) => r._measurement == "asset_vm_cmdb")
        |> count()
        |> group()  // merge all per-series counts into one table
        |> sum()    // total number of points in the measurement

@scott Unfortunately this query also hangs, after ~3 seconds.

Does it return anything when you query the last hour rather than 30 days?

@scott If I change the time range to the shortest possible one (the last time the script ran), unfortunately there is no difference; the “timeout” still occurs.

I don’t recall any defect in 2.0.2 that would cause this, but have you tried upgrading to 2.0.4?