Regex Processor Issues: Tag Not Appearing in Bucket

JAA · November 18, 2024, 11:22am

Hi everyone!

I’m having trouble using the Telegraf regex processor to extract and write a new tag into a test bucket.

Current Situation:

I successfully use the regex processor to extract a date from a string (OF) and convert it into a timestamp (timestamp_extracted) for our production bucket.
Now, I need to extract another part of the same string to create a new tag (Formula) and write it to a test bucket to avoid impacting the production environment.

Problem:

Despite my configuration, the Formula tag is not appearing in the test bucket.
I suspect there might be an issue with the processor chain or how I’m testing the new configuration.

```toml
##### current timestamp
[[processors.regex]]
  # Apply the regex only to the desired metric
  namepass = ["Production"]

  [[processors.regex.fields]]
    # The specific field to process
    key = "OF"  # Adjust this to the field name in which the value is stored
    pattern = "^(\\d{14})-.*"  # Regular expression to extract the first 14 digits (timestamp part)
    replacement = "${1}"  # Extracted timestamp will be placed here
    result_key = "timestamp_extracted"  # Store the result in a temporary field


[[processors.timestamp]]
  # Specify the measurement name
  namepass = ["Production"]
  # The field containing the extracted timestamp
  field = "timestamp_extracted"
  # The format of the source timestamp (yyyyMMddHHmmss)
  source_timestamp_format = "20060102150405"
  # The target format of the timestamp, convert it to unix time (or adjust as needed)
  destination_timestamp_format = "unix"

  # Timezone options (optional). Set as needed, or it defaults to UTC
  source_timestamp_timezone = "Europe/Madrid"
  destination_timestamp_timezone = "UTC"


[[processors.converter]]
  # Specify the measurement name
  namepass = ["Production"]

  [processors.converter.fields]
    # Use the converted timestamp as the new metric timestamp
    timestamp = ["timestamp_extracted"]
    # Specify the format of the timestamp
    timestamp_format = "unix"  # This should match the format you're converting to (e.g., 'unix', 'unix_ms', etc.)

########### Testing new Tag

[[processors.regex]]
  # Apply the regex only to the desired metric
  namepass = ["Production"]

  [[processors.regex.tags]]
    key = "OF"
    pattern = "^(?:.{15})(.*)$" # Capture the second part after the 14-digit timestamp
    replacement = "${1}"  # Place the second part (production identifier) here
    result_key = "Formula"

[[processors.converter]]
  # Specify the measurement name
  namepass = ["Production2"]

  [processors.converter.tags]
    # Use the converted Formula as the new metric
    string = ["Formula"]

##########################################
##            INPUT PLUGINS             ##
##########################################


############################ Production bucket #########################
# Grupo OF
[[inputs.opcua_listener.group]]
  name = "Production"
  namespace = "2"
  identifier_type = "s"
  default_tags = { Planta = "2", BucketDestino = "planta" }
  [[inputs.opcua_listener.group.nodes]]
    name = "OF"
    identifier = "ECMS1.ProductionNumber"
    default_tags = { Linea = "1", Maquina = "ECMS1" }
    monitoring_params = {sampling_interval="0s", queue_size=10, discard_oldest=true}
  [[inputs.opcua_listener.group.nodes]]
    name = "OF"
    identifier = "ECMS2.ProductionNumber"
    default_tags = { Linea = "1", Maquina = "ECMS2" }
    monitoring_params = {sampling_interval="0s", queue_size=10, discard_oldest=true}
  [[inputs.opcua_listener.group.nodes]]
    name = "OF"
    identifier = "ECMS3.ProductionNumber"
    default_tags = { Linea = "2", Maquina = "ECMS3" }
    monitoring_params = {sampling_interval="0s", queue_size=10, discard_oldest=true}
  [[inputs.opcua_listener.group.nodes]]
    name = "OF"
    identifier = "ECMS4.ProductionNumber"
    default_tags = { Linea = "2", Maquina = "ECMS4" }
    monitoring_params = {sampling_interval="0s", queue_size=10, discard_oldest=true}

######################### Test new tag ###################
# Grupo OF2
[[inputs.opcua_listener.group]]
  name = "Production2"
  namespace = "2"
  identifier_type = "s"
  default_tags = { Planta = "2", BucketDestino = "Test" }
  [[inputs.opcua_listener.group.nodes]]
    name = "OF"
    identifier = "ECMS1.ProductionNumber"
    default_tags = { Linea = "1", Maquina = "ECMS1" }
    monitoring_params = {sampling_interval="0s", queue_size=10, discard_oldest=true}
  [[inputs.opcua_listener.group.nodes]]
    name = "OF"
    identifier = "ECMS2.ProductionNumber"
    default_tags = { Linea = "1", Maquina = "ECMS2" }
    monitoring_params = {sampling_interval="0s", queue_size=10, discard_oldest=true}
  [[inputs.opcua_listener.group.nodes]]
    name = "OF"
    identifier = "ECMS3.ProductionNumber"
    default_tags = { Linea = "2", Maquina = "ECMS3" }
    monitoring_params = {sampling_interval="0s", queue_size=10, discard_oldest=true}
  [[inputs.opcua_listener.group.nodes]]
    name = "OF"
    identifier = "ECMS4.ProductionNumber"
    default_tags = { Linea = "2", Maquina = "ECMS4" }
    monitoring_params = {sampling_interval="0s", queue_size=10, discard_oldest=true}
```

Any help or advice is greatly appreciated!

Best Regards

Alejandro A

npm_engineer · November 21, 2024, 3:20am

I don’t think you will be able to do it this way, because your field always matches the first one, regardless of whether it does anything with it. I believe you would need to use the split or clone processor instead to create two separate metrics.
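
Separately from the split/clone idea, a few things in the posted "Testing new Tag" section may be worth checking. As far as I can tell, the node name `OF` ends up as a field rather than a tag, so a `[[processors.regex.tags]]` block keyed on `OF` would have nothing to match. The second regex processor also uses `namepass = ["Production"]`, while the converter after it uses `namepass = ["Production2"]`. Below is a minimal sketch, assuming `OF` arrives as a string field on the `Production2` measurement. It reuses the pattern from the post, writes the result to a `Formula` field, and then turns that field into a tag. The `order` values and the stdout output are my additions for testing, not part of the original setup.

```toml
# Sketch only: assumes "OF" is a string field on the "Production2" measurement.

[[processors.regex]]
  # Processor order is not guaranteed unless "order" is set explicitly
  order = 1
  # Limit this processor to the test measurement
  namepass = ["Production2"]

  # Operate on fields, since "OF" is assumed to be a field here
  [[processors.regex.fields]]
    key = "OF"
    pattern = "^(?:.{15})(.*)$"  # Same pattern as in the post: skip 15 characters, keep the rest
    replacement = "${1}"
    result_key = "Formula"       # Written as a new field first

[[processors.converter]]
  order = 2
  namepass = ["Production2"]

  [processors.converter.fields]
    # Convert the new "Formula" field into a tag
    tag = ["Formula"]

# Temporary output for checking the result without touching any bucket
[[outputs.file]]
  namepass = ["Production2"]
  files = ["stdout"]
  data_format = "influx"
```

If the stdout lines for `Production2` show `Formula` among the tags, the processor chain is doing its part, and whatever remains is in how metrics get routed to the test bucket, which is not shown in the post.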
