Telegraf jti_openconfig_telemetry generating 1600 Subscribe RPCs per second when auth is failing

This relates in part to issue #11286 and recent pull request #13647.

A connection is made from Telegraf to a Juniper router and TCP is established.
However, authentication is failing.
This results in Telegraf sending Subscribe RPCs in a tight loop.

My question is: is there any setting that can change the timing of the Subscribe requests to once per second?

Thanks

Hi,

Happy to have some feedback on that PR. As we do not have these devices, it is hard to go through the full testing loop.

Right now we have the PR set to sleep based on the setting of retry_delay, which is used for any type of retry during issues or errors. That is set to a 1 second default. You could change that value, but it would also increase the delay for other issues.
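
For reference, a minimal sketch of what raising that delay would look like in the plugin config; the values here are illustrative, not a recommendation:

[[inputs.jti_openconfig_telemetry]]
  servers = ["10.85.248.12:50051"]
  client_id = "telegraf"
  sensors = ["/interfaces/"]
  ## Delay between retry attempts after a collection or connection error.
  ## Defaults to "1000ms"; raising it also slows down retries for other error types.
  retry_delay = "5000ms"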

Do you think that this should be a different delay? Are you not seeing the 1 second delay?

Thanks

Appreciate the quick response…
The retry_delay is not explicitly defined, so it should be using the default of 1000ms, but we see it retry the RPC every 0.7ms or so. 1000ms would be good.
This rate of requests causes the jsd daemon to consume a lot of CPU cycles.
If you need help getting hold of a virtual MX, I can probably assist.

// first req

2   0.000007 10.85.210.151 → 10.85.248.12 TCP 118 53864 → 50051 [PSH, ACK] Seq=1 Ack=1 Win=1020 Len=30 TSval=2091430915 TSecr=2148818709
3   0.000062 10.85.248.12 → 10.85.210.151 TCP 105 50051 → 53864 [PSH, ACK] Seq=1 Ack=31 Win=33304 Len=17 TSval=2148818710 TSecr=2091430915
4   0.000163 10.85.210.151 → 10.85.248.12 TCP 134 53864 → 50051 [PSH, ACK] Seq=31 Ack=1 Win=1020 Len=46 TSval=2091430915 TSecr=2148818709
5   0.000449 10.85.248.12 → 10.85.210.151 TCP 194 50051 → 53864 [PSH, ACK] Seq=18 Ack=77 Win=33304 Len=106 TSval=2148818710 TSecr=2091430915
6   0.000761 10.85.210.151 → 10.85.248.12 TCP 88 53864 → 50051 [ACK] Seq=77 Ack=124 Win=1020 Len=0 TSval=2091430916 TSecr=2148818710

// next req

7   0.000764 10.85.210.151 → 10.85.248.12 TCP 118 53864 → 50051 [PSH, ACK] Seq=77 Ack=124 Win=1020 Len=30 TSval=2091430916 TSecr=2148818710
8   0.000811 10.85.248.12 → 10.85.210.151 TCP 105 50051 → 53864 [PSH, ACK] Seq=124 Ack=107 Win=33304 Len=17 TSval=2148818711 TSecr=2091430916
9   0.000890 10.85.210.151 → 10.85.248.12 TCP 134 53864 → 50051 [PSH, ACK] Seq=107 Ack=124 Win=1020 Len=46 TSval=2091430916 TSecr=2148818710

10 0.001139 10.85.248.12 → 10.85.210.151 TCP 194 50051 → 53864 [PSH, ACK] Seq=141 Ack=153 Win=33304 Len=106 TSval=2148818711 TSecr=2091430916
11 0.001460 10.85.210.151 → 10.85.248.12 TCP 88 53864 → 50051 [ACK] Seq=153 Ack=247 Win=1020 Len=0 TSval=2091430916 TSecr=2148818711

I would really, really appreciate Telegraf logs. If we are going to change the behavior, we need to make sure we understand the flow of what is going on, and viewing the logs with debug enabled will help confirm we are following the right code path.

My understanding is that if we fail to auth, we hit the sleep before trying again. If we did auth, we would still sleep as well.

Added the debug flag. Hope this is what you are looking for.

root@ubuntu:/etc/telegraf# /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
2023-07-21T05:18:08Z I! Starting Telegraf 1.21.4+ds1-0ubuntu2
2023-07-21T05:18:08Z I! Loaded inputs: cpu disk diskio jti_openconfig_telemetry kernel mem processes swap system
2023-07-21T05:18:08Z I! Loaded aggregators:
2023-07-21T05:18:08Z I! Loaded processors:
2023-07-21T05:18:08Z I! Loaded outputs: influxdb prometheus_client
2023-07-21T05:18:08Z I! Tags enabled: host=ubuntu
2023-07-21T05:18:08Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"ubuntu", Flush Interval:10s
2023-07-21T05:18:08Z I! [outputs.prometheus_client] Listening on http://[::]:9273/metrics
2023-07-21T05:18:08Z E! [inputs.jti_openconfig_telemetry] Error in plugin: failed to read from 10.85.248.12: rpc error: code = Unauthenticated desc = JGrpcServer: Session not authenticated/authorized
2023-07-21T05:18:08Z E! [inputs.jti_openconfig_telemetry] Error in plugin: failed to read from 10.85.248.12: rpc error: code = Unauthenticated desc = JGrpcServer: Session not authenticated/authorized
2023-07-21T05:18:08Z E! [inputs.jti_openconfig_telemetry] Error in plugin: failed to read from 10.85.248.12: rpc error: code = Unauthenticated desc = JGrpcServer: Session not authenticated/authorized
2023-07-21T05:18:08Z E! [inputs.jti_openconfig_telemetry] Error in plugin: failed to read from 10.85.248.12: rpc error: code = Unauthenticated desc = JGrpcServer: Session not authenticated/authorized

root@ubuntu:~# cat /etc/telegraf/telegraf.conf | grep -v "#" | grep -E "[a-Z]"
[global_tags]
[agent]
interval = "10s"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "0s"
flush_interval = "10s"
flush_jitter = "0s"
precision = ""
hostname = ""
omit_hostname = false
[[outputs.influxdb]]
[[outputs.prometheus_client]]
listen = ":9273"
metric_version = 2
[[inputs.cpu]]
percpu = true
totalcpu = true
collect_cpu_time = false
report_active = false
[[inputs.disk]]
ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]
[[inputs.diskio]]
[[inputs.kernel]]
[[inputs.mem]]
[[inputs.processes]]
[[inputs.swap]]
[[inputs.system]]
[[inputs.jti_openconfig_telemetry]]
servers = ["10.85.248.12:50051"]
client_id = "telegraf"
sensors = [
"/interfaces/",

I hope I'm correct, but this looks like the code path in jti_openconfig_telemetry.go where it's looping:

			for {
				r, err := stream.Recv()
				if err != nil {
					// If we encounter error in the stream, break so we can retry
					// the connection
					acc.AddError(fmt.Errorf("failed to read from %q: %w", grpcServer, err))
					break
				}

Can you please use the artifacts in the new PR and verify the new functionality?

You mentioned the recent pull request, but you are not using it? This is a very old version of this plugin that has already had improvements. We also know there are issues with authentication, based on the issue referenced above.

[quote]
I hope I'm correct, but this looks like the code path in jti_openconfig_telemetry.go where it's looping:
[/quote]

That is indeed where the error message is coming from; however, you will note the break in that statement, which will break you out of that loop.

Finally, there are debug messages missing from your logs. I would expect to have at least seen some messages about opening a new gRPC session when the plugin first starts, as well as debug statements from the agent. I do not see the debug flag in your CLI (e.g. --debug, which would be the easiest way to add it), nor in your outputted config.
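
As an aside, if adding the CLI flag to your service invocation is awkward, debug logging can also be enabled in the agent section of the config; a minimal sketch, assuming an otherwise default agent block:

[agent]
  interval = "10s"
  ## Equivalent to passing --debug on the command line.
  debug = true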

Found the .deb artifact and installed it. Note the version is now 1.28.0-95ef9a5d.

root@ubuntu:/etc/telegraf# /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d --debug
2023-07-24T23:36:11Z I! Loading config: /etc/telegraf/telegraf.conf
2023-07-24T23:36:11Z I! Starting Telegraf 1.28.0-95ef9a5d
2023-07-24T23:36:11Z I! Available plugins: 237 inputs, 9 aggregators, 28 processors, 24 parsers, 59 outputs, 5 secret-stores
2023-07-24T23:36:11Z I! Loaded inputs: cpu disk diskio jti_openconfig_telemetry kernel mem processes swap system
2023-07-24T23:36:11Z I! Loaded aggregators:
2023-07-24T23:36:11Z I! Loaded processors:
2023-07-24T23:36:11Z I! Loaded secretstores:
2023-07-24T23:36:11Z I! Loaded outputs: influxdb prometheus_client
2023-07-24T23:36:11Z I! Tags enabled: host=ubuntu
2023-07-24T23:36:11Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"ubuntu", Flush Interval:10s
2023-07-24T23:36:11Z D! [agent] Initializing plugins
2023-07-24T23:36:11Z D! [agent] Connecting outputs
2023-07-24T23:36:11Z D! [agent] Attempting connection to [outputs.influxdb]
2023-07-24T23:36:11Z D! [agent] Successfully connected to outputs.influxdb
2023-07-24T23:36:11Z D! [agent] Attempting connection to [outputs.prometheus_client]
2023-07-24T23:36:11Z I! [outputs.prometheus_client] Listening on http://[::]:9273/metrics
2023-07-24T23:36:11Z D! [agent] Successfully connected to outputs.prometheus_client
2023-07-24T23:36:11Z D! [agent] Starting service inputs
2023-07-24T23:36:11Z D! [inputs.jti_openconfig_telemetry] Opened a new gRPC session to 10.85.248.12 on port 50051
2023-07-24T23:36:21Z D! [outputs.prometheus_client] Wrote batch of 25 metrics in 1.694899ms
2023-07-24T23:36:21Z D! [outputs.prometheus_client] Buffer fullness: 0 / 10000 metrics
2023-07-24T23:36:21Z D! [outputs.influxdb] Wrote batch of 25 metrics in 6.70529ms
2023-07-24T23:36:21Z D! [outputs.influxdb] Buffer fullness: 0 / 10000 metrics
2023-07-24T23:36:31Z D! [outputs.prometheus_client] Wrote batch of 34 metrics in 1.156914ms
2023-07-24T23:36:31Z D! [outputs.prometheus_client] Buffer fullness: 0 / 10000 metrics
2023-07-24T23:36:31Z D! [outputs.influxdb] Wrote batch of 34 metrics in 6.736744ms
2023-07-24T23:36:31Z D! [outputs.influxdb] Buffer fullness: 0 / 10000 metrics
2023-07-24T23:36:41Z D! [outputs.prometheus_client] Wrote batch of 34 metrics in 634.728µs
2023-07-24T23:36:41Z D! [outputs.prometheus_client] Buffer fullness: 0 / 10000 metrics
2023-07-24T23:36:41Z D! [outputs.influxdb] Wrote batch of 34 metrics in 9.441477ms
2023-07-24T23:36:41Z D! [outputs.influxdb] Buffer fullness: 0 / 10000 metrics
2023-07-24T23:36:51Z D! [outputs.prometheus_client] Wrote batch of 34 metrics in 305.471µs
2023-07-24T23:36:51Z D! [outputs.prometheus_client] Buffer fullness: 0 / 10000 metrics
2023-07-24T23:36:51Z D! [outputs.influxdb] Wrote batch of 34 metrics in 10.153169ms
2023-07-24T23:36:51Z D! [outputs.influxdb] Buffer fullness: 0 / 10000 metrics
2023-07-24T23:37:01Z D! [outputs.prometheus_client] Wrote batch of 34 metrics in 835.409µs
2023-07-24T23:37:01Z D! [outputs.prometheus_client] Buffer fullness: 0 / 10000 metrics
2023-07-24T23:37:01Z D! [outputs.influxdb] Wrote batch of 34 metrics in 9.56231ms
2023-07-24T23:37:01Z D! [outputs.influxdb] Buffer fullness: 0 / 10000 metrics

// killed the JSD daemon on router (same effect as rebooting the node)

2023-07-24T23:37:10Z E! [inputs.jti_openconfig_telemetry] Error in plugin: failed to read from 10.85.248.12: rpc error: code = Unavailable desc = error reading from server: EOF
2023-07-24T23:37:11Z D! [outputs.prometheus_client] Wrote batch of 34 metrics in 347.841µs
2023-07-24T23:37:11Z D! [outputs.prometheus_client] Buffer fullness: 0 / 10000 metrics
2023-07-24T23:37:11Z D! [outputs.influxdb] Wrote batch of 34 metrics in 6.088517ms
2023-07-24T23:37:11Z D! [outputs.influxdb] Buffer fullness: 0 / 10000 metrics
2023-07-24T23:37:13Z D! [inputs.jti_openconfig_telemetry] Retrying 10.85.248.12 with timeout 1s
2023-07-24T23:37:14Z D! [inputs.jti_openconfig_telemetry] Retrying 10.85.248.12 with timeout 1s
2023-07-24T23:37:15Z D! [inputs.jti_openconfig_telemetry] Received from 10.85.248.12:
2023-07-24T23:37:15Z D! [inputs.jti_openconfig_telemetry] Available collection for 10.85.248.12 is: []
2023-07-24T23:37:15Z E! [inputs.jti_openconfig_telemetry] Error in plugin: failed to read from "10.85.248.12": rpc error: code = Unauthenticated desc = JGrpcServer: Session not authenticated/authorized
2023-07-24T23:37:15Z D! [inputs.jti_openconfig_telemetry] Received from 10.85.248.12:
2023-07-24T23:37:15Z D! [inputs.jti_openconfig_telemetry] Available collection for 10.85.248.12 is: []
2023-07-24T23:37:15Z E! [inputs.jti_openconfig_telemetry] Error in plugin: failed to read from "10.85.248.12": rpc error: code = Unauthenticated desc = JGrpcServer: Session not authenticated/authorized
2023-07-24T23:37:15Z D! [inputs.jti_openconfig_telemetry] Received from 10.85.248.12:
2023-07-24T23:37:15Z D! [inputs.jti_openconfig_telemetry] Available collection for 10.85.248.12 is: []
2023-07-24T23:37:15Z E! [inputs.jti_openconfig_telemetry] Error in plugin: failed to read from "10.85.248.12": rpc error: code = Unauthenticated desc = JGrpcServer: Session not authenticated/authorized

After the "Error in plugin" message we should break out of the receive loop and attempt to subscribe again. If we are not authenticated, that should error as well, and we would try to re-auth.
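
To make the intended flow concrete, here is a rough sketch of the loop structure. This is a simplified sketch in Go, not the literal plugin code, and the names are mine:

package example

import (
	"context"
	"log"
	"time"
)

// stream abstracts the gRPC telemetry stream; Recv blocks until the next
// update or an error (EOF, Unauthenticated, ...).
type stream interface {
	Recv() (interface{}, error)
}

// subscriber abstracts the Subscribe call, which is also where auth happens.
type subscriber interface {
	Subscribe(ctx context.Context) (stream, error)
}

// collect shows the intended behavior: any stream error breaks the inner
// receive loop, then we sleep retry_delay before subscribing (and therefore
// re-authenticating) again, instead of re-issuing Subscribe in a tight loop.
func collect(ctx context.Context, sub subscriber, retryDelay time.Duration) {
	for ctx.Err() == nil {
		s, err := sub.Subscribe(ctx)
		if err != nil {
			log.Printf("subscribe failed: %v; retrying in %s", err, retryDelay)
			time.Sleep(retryDelay)
			continue
		}
		for {
			r, err := s.Recv()
			if err != nil {
				// Break so we fall through to the delay and re-subscribe.
				log.Printf("failed to read: %v", err)
				break
			}
			_ = r // hand the data to the accumulator in the real plugin
		}
		time.Sleep(retryDelay)
	}
}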

What type of switch and version of Junos are you using?

This is an MX480 running Junos 21.4R3-S3.4.
FWIW, I've tested the bleeding edge with exactly the same results.

@ipmonk,

Thanks for the feedback so far.

I want to try the method of passing the credentials via a context as mentioned by others. I have updated the PR with the new method and some additional logging. Could you try the new artifacts please?
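
Roughly, the idea is to attach the credentials to the outgoing context as gRPC metadata so they ride along with the Subscribe call itself. A minimal sketch of the approach; the metadata key names here are illustrative, not necessarily the final ones in the PR:

package example

import (
	"context"

	"google.golang.org/grpc/metadata"
)

// withCredentials returns a context that carries the username and password as
// gRPC metadata, so the server can authenticate the Subscribe call directly.
func withCredentials(ctx context.Context, username, password string) context.Context {
	// Key names are illustrative; additional keys (e.g. a client id) could be
	// appended the same way if the device honors them.
	return metadata.AppendToOutgoingContext(ctx,
		"username", username,
		"password", password,
	)
}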

I did try to get a system in the Juniper vlab set up to test these out, but was unable to get one with open telemetry accessible. The open telemetry packages themselves seem to require a purchase or non-free account :frowning:

Thanks

Thanks very much for persisting with this. This latest version does prevent the high rate of requests, and I can see the channel count is now 1 post-JSD restart.
It is successful in reconnecting; however, I'm perplexed that the LoginCheck RPC is not being initiated upon reconnection.
As a result, the client ID is not displayed as expected:

foo@bar> show extension-service request-response clients
Client ID              Socket Address                     Client Type   Client
Login Time (UTC)   Channel Count
unix::17               unix::17                           gRPC          No Login Time                     1

unix::18               unix::18                           gRPC          No Login Time                     1

// upon initial connection

foo@bar> show extension-service request-response clients
Client ID              Socket Address                     Client Type   Client
Login Time (UTC)   Channel Count
telegraf               ipv6:::ffff:10.85.210.151:58258    gRPC          Wed Jul 26 23:44:10 2023          1

unix::17               unix::17                           gRPC          No Login Time                     1

unix::18               unix::18                           gRPC          No Login Time                     1

// post JSD restart, with the previous version the channel count was 0. Note the client ID

foo@bar> show extension-service request-response clients
Client ID              Socket Address                     Client Type   Client
Login Time (UTC)   Channel Count
ipv6:::ffff:10.85.210.151:48204 ipv6:::ffff:10.85.210.151:48204 gRPC    Wed Jul 26 23:45:43 2023          1

unix::30               unix::30                           gRPC          No Login Time                     1

unix::31               unix::31                           gRPC          No Login Time                     1

// telegraf debugs

root@ubuntu:~# /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d --debug
2023-07-26T23:06:18Z I! Loading config: /etc/telegraf/telegraf.conf
2023-07-26T23:06:18Z I! Starting Telegraf 1.28.0-47dffb41
2023-07-26T23:06:18Z I! Available plugins: 237 inputs, 9 aggregators, 28 processors, 24 parsers, 59 outputs, 5 secret-stores
2023-07-26T23:06:18Z I! Loaded inputs: cpu disk diskio jti_openconfig_telemetry kernel mem processes swap system
2023-07-26T23:06:18Z I! Loaded aggregators:
2023-07-26T23:06:18Z I! Loaded processors:
2023-07-26T23:06:18Z I! Loaded secretstores:
2023-07-26T23:06:18Z I! Loaded outputs: influxdb prometheus_client
2023-07-26T23:06:18Z I! Tags enabled: host=ubuntu
2023-07-26T23:06:18Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"ubuntu", Flush Interval:10s
2023-07-26T23:06:18Z D! [agent] Initializing plugins
2023-07-26T23:06:18Z D! [agent] Connecting outputs
2023-07-26T23:06:18Z D! [agent] Attempting connection to [outputs.influxdb]
2023-07-26T23:06:18Z D! [agent] Successfully connected to outputs.influxdb
2023-07-26T23:06:18Z D! [agent] Attempting connection to [outputs.prometheus_client]
2023-07-26T23:06:18Z I! [outputs.prometheus_client] Listening on http://[::]:9273/metrics
2023-07-26T23:06:18Z D! [agent] Successfully connected to outputs.prometheus_client
2023-07-26T23:06:18Z D! [agent] Starting service inputs
2023-07-26T23:06:18Z D! [inputs.jti_openconfig_telemetry] Opened a new gRPC session to 10.85.248.10 on port 50051
2023-07-26T23:06:18Z D! [inputs.jti_openconfig_telemetry] Sucessfully subscribed to /interfaces/ on 10.85.248.10
2023-07-26T23:06:28Z D! [outputs.prometheus_client] Wrote batch of 25 metrics in 396.693µs
2023-07-26T23:06:28Z D! [outputs.prometheus_client] Buffer fullness: 0 / 10000 metrics
2023-07-26T23:06:28Z D! [outputs.influxdb] Wrote batch of 25 metrics in 8.725532ms
2023-07-26T23:06:28Z D! [outputs.influxdb] Buffer fullness: 0 / 10000 metrics
2023-07-26T23:06:38Z D! [outputs.prometheus_client] Wrote batch of 34 metrics in 345.338µs
2023-07-26T23:06:38Z D! [outputs.prometheus_client] Buffer fullness: 0 / 10000 metrics
2023-07-26T23:06:38Z D! [outputs.influxdb] Wrote batch of 34 metrics in 67.219836ms
2023-07-26T23:06:38Z D! [outputs.influxdb] Buffer fullness: 0 / 10000 metrics
2023-07-26T23:06:48Z D! [outputs.prometheus_client] Wrote batch of 34 metrics in 325.526µs
2023-07-26T23:06:48Z D! [outputs.prometheus_client] Buffer fullness: 0 / 10000 metrics
2023-07-26T23:06:48Z D! [outputs.influxdb] Wrote batch of 34 metrics in 11.000936ms
2023-07-26T23:06:48Z D! [outputs.influxdb] Buffer fullness: 0 / 10000 metrics
2023-07-26T23:06:58Z D! [outputs.prometheus_client] Wrote batch of 34 metrics in 838.051µs
2023-07-26T23:06:58Z D! [outputs.prometheus_client] Buffer fullness: 0 / 10000 metrics
2023-07-26T23:06:58Z D! [outputs.influxdb] Wrote batch of 34 metrics in 10.468913ms
2023-07-26T23:06:58Z D! [outputs.influxdb] Buffer fullness: 0 / 10000 metrics
2023-07-26T23:07:07Z E! [inputs.jti_openconfig_telemetry] Error in plugin: failed to read from /interfaces/ from 10.85.248.10: rpc error: code = Unavailable desc = error reading from server: EOF
2023-07-26T23:07:08Z D! [outputs.prometheus_client] Wrote batch of 34 metrics in 886.398µs
2023-07-26T23:07:08Z D! [outputs.prometheus_client] Buffer fullness: 0 / 10000 metrics
2023-07-26T23:07:08Z D! [outputs.influxdb] Wrote batch of 34 metrics in 8.836802ms
2023-07-26T23:07:08Z D! [outputs.influxdb] Buffer fullness: 0 / 10000 metrics
2023-07-26T23:07:09Z D! [inputs.jti_openconfig_telemetry] Retrying /interfaces/ from 10.85.248.10 with timeout 1s
2023-07-26T23:07:10Z D! [inputs.jti_openconfig_telemetry] Retrying /interfaces/ from 10.85.248.10 with timeout 1s
2023-07-26T23:07:11Z D! [inputs.jti_openconfig_telemetry] Sucessfully subscribed to /interfaces/ on 10.85.248.10
^C2023-07-26T23:07:17Z D! [agent] Stopping service inputs

// jsd traces for initial connection:

Jul 26 16:06:18 TerminateClient: Client ipv6:::ffff:10.85.210.151:42862 not found in client map
Jul 26 16:06:18 TerminateClientThreads: Removing thread info for peer ipv6:::ffff:10.85.210.151:42862
Jul 26 16:06:18 TerminateClientThreads: Number of grpc clients connected: 2
Jul 26 16:06:18 peer = 16: ipv4:10.85.210.151:46526
Jul 26 16:06:18 JGrpcServerStart: Proto Desriptor object found in proto map for RPC /authentication.Login/LoginCheck for peer ipv6:::ffff:10.85.210.151:46526
Jul 26 16:06:18 TerminateClientThreads: Number of grpc clients connected: 2
Jul 26 16:06:18 Default IRI address is 128.0.0.4:51051
Jul 26 16:06:18 GetExternalConnectionCount: external count: 0
Jul 26 16:06:18 TerminateClientThreads: Number of grpc servers connected: 0
Jul 26 16:06:18 Default IRI address is 128.0.0.4:51051
Jul 26 16:06:18 GetExternalConnectionCount: external count: 0
Jul 26 16:06:18 Freeing connection for peer ipv6:::ffff:10.85.210.151:42862 and connection is 0x38e0320
Jul 26 16:06:18 Allocated connection for peer ipv6:::ffff:10.85.210.151:46526 and connection is 0x38e05c0
Jul 26 16:06:18 CreateClientInfo: client info for peer ipv6:::ffff:10.85.210.151:46526 created successfully
Jul 26 16:06:18 CreateClientHandler: Number of grpc clients connected: 3
Jul 26 16:06:18 JGrpcServerStart: Ipv4 address of connected socket is 10.85.248.10
Jul 26 16:06:18 JGrpcServerStart: callInfo for RPC /authentication.Login/LoginCheck for peer ipv6:::ffff:10.85.210.151:46526 pushed to call queue
Jul 26 16:06:18 grpcClientThread: Processing RPC /authentication.Login/LoginCheck
Jul 26 16:06:18 AllocCallMem:GRPC Server Call Completion queue created successfully
Jul 26 16:06:18 AllocCallMem:GRPC Server Call Details initialized successfully
Jul 26 16:06:18 CreateJapiChannel:GRPC Channel created Successfully
Jul 26 16:06:18 BindClientFdToVrf: URI is unix:/var/run/japi_aaasd
Jul 26 16:06:18 BindClientFdToVrf: Fd 24 is of type unix domain socket, returning
Jul 26 16:06:18 CreateJapiChannel:GRPC Channel created Successfully
Jul 26 16:06:18 BindClientFdToVrf: URI is unix:/var/run/japi_aaasd
Jul 26 16:06:18 BindClientFdToVrf: Fd 24 is of type unix domain socket, returning
Jul 26 16:06:18 SendGrpcLoginResponse: AAASD returned success
Jul 26 16:06:18 AddClientId: client id telegraf added to id set
Jul 26 16:06:18 SendGrpcLoginResponse: Authentication response sent to client telegraf
Jul 26 16:06:18 SendGrpcLoginResponse: Call cleanup and call is 0x37ca450
Jul 26 16:06:18 TerminateClientThreads: Number of grpc clients connected: 3
Jul 26 16:06:18 peer = 16: ipv4:10.85.210.151:46526
Jul 26 16:06:18 JGrpcServerStart: Proto Desriptor object found in proto map for RPC /telemetry.OpenConfigTelemetry/telemetrySubscribe for peer ipv6:::ffff:10.85.210.151:46526
Jul 26 16:06:18 JGrpcServerStart: Ipv4 address of connected socket is 10.85.248.10
Jul 26 16:06:18 JGrpcServerStart: callInfo for RPC /telemetry.OpenConfigTelemetry/telemetrySubscribe for peer ipv6:::ffff:10.85.210.151:46526 pushed to call queue
Jul 26 16:06:18 AllocCallMem:GRPC Server Call Completion queue created successfully
Jul 26 16:06:18 AllocCallMem:GRPC Server Call Details initialized successfully
Jul 26 16:06:18 grpcClientThread: Processing RPC /telemetry.OpenConfigTelemetry/telemetrySubscribe
Jul 26 16:06:18 CreateJapiChannel:GRPC Channel created Successfully
Jul 26 16:06:18 BindClientFdToVrf: URI is unix:/var/run/japi_aaasd
Jul 26 16:06:18 BindClientFdToVrf: Fd 24 is of type unix domain socket, returning
Jul 26 16:06:18 CreateJapiChannel:GRPC Channel created Successfully
Jul 26 16:06:18 BindClientFdToVrf: URI is unix:/var/run/japi_aaasd
Jul 26 16:06:18 BindClientFdToVrf: Fd 24 is of type unix domain socket, returning
Jul 26 16:06:18 Package name: telemetry
Jul 26 16:06:18 CreateJapiChannel:GRPC Channel created Successfully
Jul 26 16:06:18 grpcClientThread: Invoking GrpcToJapi, Peer: ipv6:::ffff:10.85.210.151:46526, CQ: 0x27fab00, Call: 0x37cc050
Jul 26 16:06:18 CreateCompletionQueue:GRPC Completion queue created Successfully
Jul 26 16:06:18 grpcClientThread: Initializing call for RPC /telemetry.OpenConfigTelemetry/telemetrySubscribe
Jul 26 16:06:18 GrpcCall:GRPC call for RPC: /telemetry.OpenConfigTelemetry/telemetrySubscribe to host unix:/var/run/japi_na-grpcd created Successfully
Jul 26 16:06:18 IncrRefCount: rpc request is /telemetry.OpenConfigTelemetry/telemetrySubscribe
Jul 26 16:06:18 IncrRefCount: Ref count incremented to 3 for RPC /telemetry.OpenConfigTelemetry/telemetrySubscribe
Jul 26 16:06:18 AddCallTag: Tag id request_1 added to tag map for client telegraf
Jul 26 16:06:18 ########### AddCallTag: Total Requests: 1
Jul 26 16:06:18 CreateGrpcCall: Thread to handle grpc requests from client telegraf initialized and started successfully
Jul 26 16:06:18 GrpcSendInitialMetadata: client-id metadata for RPC: /telemetry.OpenConfigTelemetry/telemetrySubscribe is telegraf
Jul 26 16:06:18 BindClientFdToVrf: URI is unix:/var/run/japi_na-grpcd
Jul 26 16:06:18 BindClientFdToVrf: Fd 26 is of type unix domain socket, returning
Jul 26 16:06:18 Listen: Grpc channel to unix:/var/run/japi_na-grpcd transitioned from GRPC_CHANNEL_IDLE state to GRPC_CHANNEL_CONNECTING for client ipv6:::ffff:10.85.210.151:46526
Jul 26 16:06:18 RegisterCallCancel:Registering grpc op to handle call cancel
Jul 26 16:06:18 AddCallTag: Tag id request_2 added to tag map for client telegraf
Jul 26 16:06:18 ########### AddCallTag: Total Requests: 2
Jul 26 16:06:18 PushCallQueue: tag request_2 pushed to queue
Jul 26 16:06:18 RegisterCallCancel: Registered grpc op to monitor call cancel successfully for RPC: /telemetry.OpenConfigTelemetry/telemetrySubscribe
Jul 26 16:06:18 grpcClientThread: Call init successful for RPC /telemetry.OpenConfigTelemetry/telemetrySubscribe
Jul 26 16:06:18 Listen: Grpc channel to unix:/var/run/japi_na-grpcd transitioned from GRPC_CHANNEL_CONNECTING state to GRPC_CHANNEL_READY for client ipv6:::ffff:10.85.210.151:46526

// jsd traces post jsd restart

Jul 26 16:07:11 TerminateClientThreads: Number of grpc clients connected: 2
Jul 26 16:07:11 peer = 38: ipv4:10.85.210.151:47556
Jul 26 16:07:11 JGrpcServerStart: Proto Desriptor object found in proto map for RPC /telemetry.OpenConfigTelemetry/telemetrySubscribe for peer ipv6:::ffff:10.85.210.151:47556
Jul 26 16:07:11 TerminateClientThreads: Number of grpc clients connected: 2
Jul 26 16:07:11 Default IRI address is 128.0.0.4:51051
Jul 26 16:07:11 GetExternalConnectionCount: external count: 0
Jul 26 16:07:11 TerminateClientThreads: Number of grpc servers connected: 0
Jul 26 16:07:11 Default IRI address is 128.0.0.4:51051
Jul 26 16:07:11 GetExternalConnectionCount: external count: 0
Jul 26 16:07:11 Allocated connection for peer ipv6:::ffff:10.85.210.151:47556 and connection is 0x38e04a0
Jul 26 16:07:11 CreateClientInfo: client info for peer ipv6:::ffff:10.85.210.151:47556 created successfully
Jul 26 16:07:11 CreateClientHandler: Number of grpc clients connected: 3
Jul 26 16:07:11 JGrpcServerStart: Ipv4 address of connected socket is 10.85.248.10
Jul 26 16:07:11 JGrpcServerStart: callInfo for RPC /telemetry.OpenConfigTelemetry/telemetrySubscribe for peer ipv6:::ffff:10.85.210.151:47556 pushed to call queue
Jul 26 16:07:11 AllocCallMem:GRPC Server Call Completion queue created successfully
Jul 26 16:07:11 AllocCallMem:GRPC Server Call Details initialized successfully
Jul 26 16:07:11 grpcClientThread: Processing RPC /telemetry.OpenConfigTelemetry/telemetrySubscribe
Jul 26 16:07:11 CreateJapiChannel:GRPC Channel created Successfully
Jul 26 16:07:11 BindClientFdToVrf: URI is unix:/var/run/japi_aaasd
Jul 26 16:07:11 BindClientFdToVrf: Fd 51 is of type unix domain socket, returning
Jul 26 16:07:11 CreateJapiChannel:GRPC Channel created Successfully
Jul 26 16:07:11 BindClientFdToVrf: URI is unix:/var/run/japi_aaasd
Jul 26 16:07:11 BindClientFdToVrf: Fd 51 is of type unix domain socket, returning
Jul 26 16:07:11 AddClientId: client id ipv6:::ffff:10.85.210.151:47556 added to id set
Jul 26 16:07:11 CreateJapiChannel:GRPC Channel created Successfully
Jul 26 16:07:11 BindClientFdToVrf: URI is unix:/var/run/japi_aaasd
Jul 26 16:07:11 BindClientFdToVrf: Fd 51 is of type unix domain socket, returning
Jul 26 16:07:11 CreateJapiChannel:GRPC Channel created Successfully
Jul 26 16:07:11 BindClientFdToVrf: URI is unix:/var/run/japi_aaasd
Jul 26 16:07:11 BindClientFdToVrf: Fd 51 is of type unix domain socket, returning
Jul 26 16:07:11 Package name: telemetry
Jul 26 16:07:11 CreateJapiChannel:GRPC Channel created Successfully
Jul 26 16:07:11 grpcClientThread: Invoking GrpcToJapi, Peer: ipv6:::ffff:10.85.210.151:47556, CQ: 0x27fa700, Call: 0x37c8850
Jul 26 16:07:11 CreateCompletionQueue:GRPC Completion queue created Successfully
Jul 26 16:07:11 grpcClientThread: Initializing call for RPC /telemetry.OpenConfigTelemetry/telemetrySubscribe
Jul 26 16:07:11 GrpcCall:GRPC call for RPC: /telemetry.OpenConfigTelemetry/telemetrySubscribe to host unix:/var/run/japi_na-grpcd created Successfully
Jul 26 16:07:11 IncrRefCount: rpc request is /telemetry.OpenConfigTelemetry/telemetrySubscribe
Jul 26 16:07:11 IncrRefCount: Ref count incremented to 3 for RPC /telemetry.OpenConfigTelemetry/telemetrySubscribe
Jul 26 16:07:11 AddCallTag: Tag id request_1 added to tag map for client ipv6:::ffff:10.85.210.151:47556
Jul 26 16:07:11 ########### AddCallTag: Total Requests: 1
Jul 26 16:07:11 CreateGrpcCall: Thread to handle grpc requests from client ipv6:::ffff:10.85.210.151:47556 initialized and started successfully
Jul 26 16:07:11 GrpcSendInitialMetadata: client-id metadata for RPC: /telemetry.OpenConfigTelemetry/telemetrySubscribe is ipv6:::ffff:10.85.210.151:47556
Jul 26 16:07:11 BindClientFdToVrf: URI is unix:/var/run/japi_na-grpcd
Jul 26 16:07:11 BindClientFdToVrf: Fd 53 is of type unix domain socket, returning
Jul 26 16:07:11 Listen: Grpc channel to unix:/var/run/japi_na-grpcd transitioned from GRPC_CHANNEL_IDLE state to GRPC_CHANNEL_CONNECTING for client ipv6:::ffff:10.85.210.151:47556
Jul 26 16:07:11 RegisterCallCancel:Registering grpc op to handle call cancel
Jul 26 16:07:11 AddCallTag: Tag id request_2 added to tag map for client ipv6:::ffff:10.85.210.151:47556
Jul 26 16:07:11 ########### AddCallTag: Total Requests: 2
Jul 26 16:07:11 PushCallQueue: tag request_2 pushed to queue
Jul 26 16:07:11 RegisterCallCancel: Registered grpc op to monitor call cancel successfully for RPC: /telemetry.OpenConfigTelemetry/telemetrySubscribe
Jul 26 16:07:11 grpcClientThread: Call init successful for RPC /telemetry.OpenConfigTelemetry/telemetrySubscribe
Jul 26 16:07:11 Listen: Grpc channel to unix:/var/run/japi_na-grpcd transitioned from GRPC_CHANNEL_CONNECTING state to GRPC_CHANNEL_READY for client ipv6:::ffff:10.85.210.151:47556

I understand that the user/pass/ID is being sent as part of the Subscribe RPC metadata.
I'll raise a PR to determine if this info can be used to populate the client ID in the CLI.
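
For context, my reading of the older code path is that the named client entry comes from the LoginCheck RPC. A rough sketch of re-issuing it on reconnect, using the plugin's generated auth bindings; the package path and field names are my assumption and may be off:

package example

import (
	"context"
	"fmt"

	"google.golang.org/grpc"

	// NOTE: import path as it appears in the Telegraf source tree; adjust if
	// the generated bindings live elsewhere.
	authentication "github.com/influxdata/telegraf/plugins/inputs/jti_openconfig_telemetry/auth"
)

// loginCheck re-runs the /authentication.Login/LoginCheck RPC after a
// reconnect so the router can associate the session with the configured
// client ID again before telemetrySubscribe is called.
func loginCheck(ctx context.Context, conn *grpc.ClientConn, username, password, clientID string) error {
	lc := authentication.NewLoginClient(conn)
	reply, err := lc.LoginCheck(ctx, &authentication.LoginRequest{
		UserName: username,
		Password: password,
		ClientId: clientID,
	})
	if err != nil {
		return err
	}
	if !reply.GetResult() {
		return fmt.Errorf("login check failed for client %q", clientID)
	}
	return nil
}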

Thank you very much for the quick test!

[quote]
I'll raise a PR to determine if this info can be used to populate the client ID in the CLI.
[/quote]

This PR is with Juniper?

In the metadata I was only adding the username and password. Do you want me to try adding the client ID there as well?

Yes, that was a Juniper PR. I wrongly assumed the client ID would be passed in metadata. I see you added this and then reverted the change; did it break functionality?

I was going to wait to hear back from you first. I have pushed the PR with the additional client ID added to the metadata. Can you give it a shot?

Thanks!

You'll need to wait 20-30 minutes from this message before new artifacts are posted.

Thanks Joshua. Unfortunately there is no change on the router after the addition of the client ID to the metadata. I'll report the outcome of the PR. Appreciate all your help.

FYI, we merged the PR today and it landed as part of v1.27.3. If you hear back from Juniper regarding the client ID, I would happily make any changes they suggest, or even take a PR :wink:

Thanks for the help and feedback on this!

@ipmonk,

Did you ever hear from Juniper about the reconnect correctly setting the client ID?