Problems with saving metrics in InfluxDB - Continuous Queries

Hello :slight_smile:
I am new here and a newbie. I hope you can help me with the following problem.
Scenario:
Server 1: Docker builds of Prometheus and Grafana.
A bunch of exporters are also installed on this server, as well as nginx. With Grafana I can see their metrics.

Server 2: InfluxDB installed from the .deb package, running as a system service.

My idea was to write the metrics from Server 1 into my InfluxDB database "prometheus" via Prometheus remote write, so that I can keep the metrics for a longer time before they are deleted. In addition, after a certain period (e.g. 1 day), I wanted to add downsampling with a continuous query.
I want to save every metric at 15s intervals and, with the help of the downsampling, keep only one value every 5 min for the older metrics. The newer metrics should still be at 15s.
I really do not want to use additional retention policies or databases, and I especially do not want to rename metrics (if that is even possible). Because of that, I tried to write a command for the continuous query, which could be wrong …
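
For context, the pattern from the InfluxDB downsampling documentation would use a second, longer retention policy as the target of the continuous query. That is roughly what I am trying to avoid; the name "rp_long" and the 90d duration below are only placeholders for illustration, not something I actually run:

-> CREATE RETENTION POLICY "rp_long" ON "prometheus" DURATION 90d REPLICATION 1
-> CREATE CONTINUOUS QUERY "cq_downsample_5m" ON "prometheus" BEGIN SELECT mean(*) INTO "prometheus"."rp_long".:MEASUREMENT FROM /.*/ GROUP BY time(5m),* END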

Here is the prometheus.yml I am using for the idea:

# my global config
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    monitor: 'prometheus-monitor'

rule_files:

scrape_configs:
  # Node_Exporter
  - job_name: 'nodes'
    static_configs:
      - targets: [':9100']
        labels:
          job: node

# Non-Admin
remote_write:
  - url: "<domain_IP_influxDB_Server>:8086/api/v1/prom/write?db=prometheus&u=normal&p=password"

remote_read:
  - url: "<domain_IP_influxDB_Server>:8086/api/v1/prom/read?db=prometheus&u=normal&p=password"

I installed the scenario on two virtual machines.
I created a database with a retention policy with a duration of 3 days (a small test setup before going serious). Here are the commands I used on Server 2, the InfluxDB server, after installing the InfluxDB Debian package:
-> influx
-> CREATE USER "normal" WITH PASSWORD "password"
-> CREATE DATABASE "prometheus" WITH DURATION 3d NAME "rp_1"
-> USE prometheus
-> GRANT ALL ON "prometheus" TO "normal"
-> CREATE CONTINUOUS QUERY "cq_after_1_day" ON "prometheus" BEGIN SELECT mean() INTO "prometheus"."rp_1".:MEASUREMENT FROM /./ WHERE time < now() -1h GROUP BY time(5m),* END
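
(Side note: to check where the data actually ends up, queries like the following can be run in the influx shell; "node_memory_MemFree_bytes" is just an example metric name here, not necessarily one of mine:)

-> SHOW RETENTION POLICIES ON "prometheus"
-> SHOW CONTINUOUS QUERIES
-> SHOW MEASUREMENTS ON "prometheus"
-> SELECT count(*) FROM "prometheus"."rp_1"."node_memory_MemFree_bytes" WHERE time > now() - 1h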

Now my problem: the very first time the metrics were sent by Prometheus, they were saved in the "prometheus" database in InfluxDB. I saw almost only HTTP 200 status codes when I looked at the syslogs on the InfluxDB server.
When I used the API call to delete the locally stored Prometheus metrics, every metric was still in the "prometheus" DB on the InfluxDB server, and Grafana showed every metric as if it had never been deleted…

API call I used: curl -X POST -g '<domain_IP_Prometheus_Server>:9090/api/v1/admin/tsdb/delete_series?match[]={instance="<domain_IP_Prometheus_Server>:9100"}'
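
(For completeness: as far as I know, this endpoint only works when Prometheus is started with the admin API enabled, and after deleting, the tombstones can be cleaned up; roughly like this, assuming the default binary name and port:)

./prometheus --config.file=prometheus.yml --web.enable-admin-api
curl -X POST '<domain_IP_Prometheus_Server>:9090/api/v1/admin/tsdb/clean_tombstones'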

When I stopped the InfluxDB service and then used the API call, Grafana no longer showed the deleted metrics.
Then I started the InfluxDB service again, and Grafana showed every metric again!
That means (or so I understand it) that the metrics are saved in my "prometheus" database on the InfluxDB server, doesn't it?

Now, after a day (the servers were still online), I see only (or almost only) HTTP 204 status codes when I look at the syslogs (example at the bottom of this topic), and if I use the API call now to delete the local Prometheus metrics, Grafana does not show any metrics anymore… (no matter whether the InfluxDB service is running).

Yesterday it worked, now it does not :frowning:
From here: Prometheus endpoints support in InfluxDB | InfluxDB OSS 1.7 Documentation
it says:
"If a batch of values contains values that are subsequently dropped, HTTP status code 204 is returned"
Does that mean that my data will not be saved?

Now my two questions:
Are my commands wrong for the purpose I want to use them for? The continuous query in particular is important for the downsampling.
Can you help me with this problem? Why did InfluxDB save the metrics yesterday, but not anymore today?

I am sorry for my bad English. I hope you can still understand everything… :see_no_evil:

Here is an excerpt of the output from the syslogs:
00 service=continuous_querier trace_id=0Hp~KRml000 op_name=continuous_querier_execute op_event=start
Sep 12 05:35:00 user influxd[751]: ts=2019-09-12T05:35:00.040747Z lvl=info msg="Executing continuous query" log_id=0Hpz2Ra0000 service=continuous_querier trace_id=0Hp~KRml000
op_name=continuous_querier_execute name=cq_after_1_day db_instance=prometheus start=2019-09-12T05:30:00.000000Z end=2019-09-12T05:35:00.000000Z

Sep 12 05:35:00 user influxd[751]: ts=2019-09-12T05:35:00.041430Z lvl=info msg="Executing query" log_id=0Hpz2Ra0000
service=query query="SELECT mean() INTO prometheus.meine_rp.:MEASUREMENT FROM prometheus.rp_1././ WHERE time >= '2019-09-12T05:30:00Z' AND time < '2019-09-12T05:35:00Z' GROUP BY time(5m), *"

Sep 12 05:35:00 user influxd[751]: ts=2019-09-12T05:35:00.819618Z lvl=info msg="Finished continuous query" log_id=0Hpz2Ra0000
service=continuous_querier trace_id=0Hp~KRml000 op_name=continuous_querier_execute name=cq_after_1_day db_instance=prometheus written=927 start=2019-09-12T05:30:00.000000Z end=2019-09-12T05:35:00.000000Z duration=779ms

Sep 12 05:35:00 user influxd[751]: ts=2019-09-12T05:35:00.820242Z lvl=info msg="Continuous query execution (end)" log_id=0Hpz2Ra0000
service=continuous_querier trace_id=0Hp~KRml000 op_name=continuous_querier_execute op_event=end op_elapsed=780.595ms

Sep 12 05:35:04 user influxd[751]: [httpd] <domain_IP_Prometheus_Server> - normal [12/Sep/2019:05:35:04 +0000] "POST /api/v1/prom/write?db=prometheus&p=%5BREDACTED%5D&u=normal
HTTP/1.1" 204 0 "-" "Prometheus/2.12.0" 12590863-d51f-11e9-89fc-0800273df99e 6965