
I'd like to share what I learned while introducing Datadog, a monitoring system, at my company. As an ML engineer I found monitoring systems unfamiliar, so it was a struggle, but in hindsight I realized that the official Datadog documentation is quite good 🔥 Datadog's core function is monitoring through an installed agent, so the details will vary with the kind of system you want to monitor, but I wrote this hoping it helps at least a little. For convenience, the notes below are written in a plain declarative style.

What is Datadog?

  • A SaaS-based integrated monitoring and analytics platform for large-scale applications and infrastructure
  • The basic model is simple: install an agent on a server and monitor it
  • With additional agent configuration, databases, in-memory stores, applications, and so on can be monitored in detail
  • Integration with external services is a major advantage (AWS, Microsoft Azure, and Google Cloud are supported, and more than roughly 600 integrations are available)
  • Data is collected by installing the agent across services and resources; the agent is written mainly in Go and Python
  • Main Datadog features
    • log collection: API-related logs can be viewed in real time
    • APM: Application Performance Monitoring, which monitors application performance
    • infrastructure: inspect hosts, containers, processes, CPU, and so on
    • host: the target we want to monitor, such as a server or an API

datadog agent

  • What is an agent (literally a representative, someone who acts as an intermediary)? A software component installed on a server, computer, or other device that performs specific tasks or collects data. It usually runs in the background and can do its work automatically without direct user intervention
  • The Datadog agent collects system metrics such as a server's CPU usage, memory state, disk usage, and network traffic; the collected data is encrypted and sent to Datadog's servers
  • The way the agent is installed differs per platform

At my company the AI API is managed as a Docker container, and the agent is also run as a container. By default the Datadog agent uses TCP port 8126 and UDP port 8125: TCP 8126 is used mainly for Datadog's APM feature, and UDP 8125 is used to collect StatsD metrics.

  • TCP: connection-oriented and reliable, so it is used to deliver the important trace data safely
  • UDP: a simple, efficient protocol for collecting statistics; no connection has to be set up and data is sent quickly, which suits fast collection of large volumes of simple metrics
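To make the UDP side concrete, here is a minimal sketch of what a StatsD-style client does on port 8125: it formats one text datagram and sends it without waiting for a reply. The function names and the metric name `demo.metric` are hypothetical; in practice the `datadog` library handles this for you.

```python
import socket


def format_gauge(name: str, value: float, tags: list[str]) -> bytes:
    """Build one DogStatsD gauge datagram: <name>:<value>|g|#<tag>,<tag>."""
    line = f"{name}:{value}|g"
    if tags:
        line += "|#" + ",".join(tags)
    return line.encode("utf-8")


def send_gauge(host: str, port: int, name: str, value: float, tags: list[str]) -> None:
    """Send a gauge over UDP; there is no handshake and no acknowledgement."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_gauge(name, value, tags), (host, port))


# Hypothetical usage against an agent listening on the default StatsD port:
# send_gauge("172.17.0.3", 8125, "demo.metric", 1.5, ["env:stg"])
```

Because nothing is acknowledged, a datagram sent to a host where no agent is listening is silently lost, which is the trade-off accepted for cheap, high-volume metrics.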

 

Installing the Docker agent (🔗 reference documentation)

  1. Obtain an API key (this only needs to be done once).
  2. Pull the Datadog agent image.
  3. Start the container with docker run.
  4. Set the environment variables and options on the AI API container as well.
docker pull gcr.io/datadoghq/agent

# Replace <DATADOG_API_KEY> and <DD_SITE> with your own values.
# Every line except the last must end with a single backslash.
docker run -d --name datadog-agent \
           --cgroupns host \
           --pid host \
           -e DD_API_KEY=<DATADOG_API_KEY> \
           -e DD_LOGS_ENABLED=true \
           -e DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL=true \
           -e DD_CONTAINER_EXCLUDE="name:datadog-agent" \
           -e DD_SITE=<DD_SITE> \
           -v /var/run/docker.sock:/var/run/docker.sock:ro \
           -v /var/lib/docker/containers:/var/lib/docker/containers:ro \
           -v /proc/:/host/proc/:ro \
           -v /opt/datadog-agent/run:/opt/datadog-agent/run:rw \
           -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \
           gcr.io/datadoghq/agent:latest

์˜ต์…˜๊ณผ ๊ด€๋ จํ•œ ๋‚ด์šฉ์€ ๋งํฌ ์ฐธ๊ณ . datadog agents๋Š” ์—ฌ๋Ÿฌ๊ฐœ์˜ ์„œ๋ธŒ ์ปดํฌ๋„ŒํŠธ๋กœ ๊ตฌ์„ฑ๋˜๋ฉฐ ๊ฐ ์ปดํฌ๋„ŒํŠธ๋Š” ํŠน์ •์ž‘์—…์„ ์ˆ˜ํ–‰ํ•˜๊ณ  ๊ด€๋ จ ๋กœ๊ทธ๋ฅผ ์ƒ์„ฑํ•จ.

agent.log: ์ฃผ ์ปดํฌ๋„ŒํŠธ์— ๋Œ€ํ•œ ๋กœ๊ทธ. ๋ฉ”ํŠธ๋ฆญ ์ˆ˜์ง‘, agent ์ƒํƒœ, agent์™€ datadog ์„œ๋ฒ„ ๊ฐ„์˜ ํ†ต์‹  ๋“ฑ์— ๊ด€๋ จ๋œ ์ •๋ณด

init.log: agent ์ดˆ๊ธฐํ™” ํ”„๋กœ์„ธ์Šค์— ๋Œ€ํ•œ ์ •๋ณด

 

process-agent.log: ์‹คํ–‰์ค‘์ธ ํ”„๋กœ์„ธ์Šค, ๋ฆฌ์†Œ์Šค ์‚ฌ์šฉ๋Ÿ‰, ํ”„๋กœ์„ธ์Šค ๊ฐ„์˜ ํ†ต์‹  ๋“ฑ์— ๊ด€ํ•œ ์ •๋ณด

security-agent.log: ๋ณด์•ˆ ๊ด€๋ จ ์ด๋ฒคํŠธ

system-probe.log: ๋„คํŠธ์›Œํฌ ํŠธ๋ž˜ํ”ฝ, ์—ฐ๊ฒฐ ์ƒํƒœ, ํฌํŠธ ์‚ฌ์šฉ ๋“ฑ์˜ ๋„คํŠธ์›Œํฌ ๊ด€๋ จ ๋ฉ”ํŠธ๋ฆญ ์ˆ˜์ง‘

trace-agent.log: APM๊ณผ ๊ด€๋ จ๋œ ๋กœ๊ทธ. ์‚ฌ์šฉ์ค‘์ธ python ๋ฒ„์ „, tracing ๋ฐ›์€ ์ˆ˜, ์šฉ๋Ÿ‰, ์ด๋ฒคํŠธ ์ถ”์ถœ ์ •๋ณด

Infrastructure

 

Reference: Infrastructure (docs.datadoghq.com)

Host CPU and memory usage, as well as per-container CPU and memory activity, can be monitored here.

APM

 

Reference: Tracing Docker Applications (docs.datadoghq.com)

  • Monitors the performance of the application
  • Code that sends traces has to be added to the Python code, and the hostname to use is the agent's IP address
  • It depends on the setup, but in my case the two containers are started separately, so the AI container has to be told which IP to use to reach the Datadog container
  • Check the Datadog container's IP with the command below. In my setup it was 172.17.0.3; Docker's default bridge network typically hands out 172.17.0.x addresses in container start order, so verify it rather than assuming it
  • docker inspect [container name]
  • Put the following code in the main Python code
  • from ddtrace import tracer
    tracer.configure(hostname='172.17.0.3', port=8126)
  • To check that traces are being received, look at trace-agent.log
cd /var/log/datadog
tail -f trace-agent.log

2023-09-01 06:41:16 UTC | TRACE | INFO | (run.go:269 in Infof) | [lang:python lang_version:3.8.17 interpreter:CPython tracer_version:1.17.3 endpoint_version:v0.5] -> traces received: 1, traces filtered: 0, traces amount: 490 bytes, events extracted: 0, events sampled: 0
2023-09-01 06:42:26 UTC | TRACE | INFO | (run.go:269 in Infof) | [lang:python lang_version:3.8.17 interpreter:CPython tracer_version:1.17.3 endpoint_version:v0.5] -> traces received: 2, traces filtered: 0, traces amount: 914 bytes, events extracted: 0, events sampled: 0
2023-09-01 06:52:16 UTC | TRACE | INFO | (run.go:269 in Infof) | [lang:python lang_version:3.8.17 interpreter:CPython tracer_version:1.17.3 endpoint_version:v0.5] -> traces received: 1, traces filtered: 0, traces amount: 620 bytes, events extracted: 0, events sampled: 0
2023-09-01 07:01:06 UTC | TRACE | INFO | (run.go:269 in Infof) | [lang:python lang_version:3.8.17 interpreter:CPython tracer_version:1.17.3 endpoint_version:v0.5] -> traces received: 73, traces filtered: 0, traces amount: 28747 bytes, events extracted: 0, events sampled: 0
2023-09-01 07:02:06 UTC | TRACE | INFO | (run.go:269 in Infof) | [lang:python lang_version:3.8.17 interpreter:CPython tracer_version:1.17.3 endpoint_version:v0.5] -> traces received: 54, traces filtered: 0, traces amount: 21259 bytes, events extracted: 0, events sampled: 

→ The log confirms that traces are being received correctly.
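If you would rather total these counters than read them by eye, the lines can be parsed with a short script. This is only a sketch that assumes the line layout shown in the log excerpt above; `summarize` is a hypothetical helper, not part of any Datadog tooling.

```python
import re

# Matches the counters in a trace-agent.log stats line like the ones above.
TRACE_STATS = re.compile(
    r"traces received: (\d+), traces filtered: (\d+), traces amount: (\d+) bytes"
)


def summarize(lines):
    """Sum received traces, filtered traces, and bytes across log lines."""
    totals = {"received": 0, "filtered": 0, "bytes": 0}
    for line in lines:
        match = TRACE_STATS.search(line)
        if match is None:
            continue  # skip lines that are not stats lines
        received, filtered, size = map(int, match.groups())
        totals["received"] += received
        totals["filtered"] += filtered
        totals["bytes"] += size
    return totals


# Hypothetical usage inside the agent container:
# with open("/var/log/datadog/trace-agent.log") as f:
#     print(summarize(f))
```

A "received" total that stays at zero while the API is serving requests usually means the tracer is pointed at the wrong host or port.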

In Datadog as well, you can see that distributed tracing is detected.

After the setup above, the service has to be added.

  1. In APM, click Setup & Config, then click Create New Entry
  2. In DD_SERVICE, enter the service name set in the DD_SERVICE environment variable of the BentoML Dockerfile, then click Save Entry

log collection

 

Reference: Docker Log collection (docs.datadoghq.com)

  • Collects logs, that is, textual information
  • To collect logs, a few environment variables have to be added when starting the Datadog agent container.
docker run -d --name datadog-agent \
           --cgroupns host \
           --pid host \
           -e DD_API_KEY=<DATADOG_API_KEY> \
           -e DD_LOGS_ENABLED=true \
           -e DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL=true \
           -e DD_CONTAINER_EXCLUDE="name:datadog-agent" \
           -e DD_SITE=<DD_SITE> \
           -v /var/run/docker.sock:/var/run/docker.sock:ro \
           -v /var/lib/docker/containers:/var/lib/docker/containers:ro \
           -v /proc/:/host/proc/:ro \
           -v /opt/datadog-agent/run:/opt/datadog-agent/run:rw \
           -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \
           gcr.io/datadoghq/agent:latest

 

The default status is info, so every log is displayed as info. → Parsing has to be configured with regular-expression-style rules in Datadog's log configuration.

Click add a new pipeline and apply a filter. Our logs are split by service, so we filtered on service.

grok parser settings

Clicking the Parse My Logs button automatically pulls in sample logs and suggests parsing rules for them.

We have one more log format, so we clicked the add button and added another rule for it. See the linked documentation for the rule syntax, which takes a key: value form. The matchers we use most often are date and word.

Log level definition: status remapper

Result: where everything used to show up as info, logs are now correctly split into error, debug, and info, matching the Python logger!
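For the parser and status remapper to work, the level has to be present in each log line in the first place. The sketch below assumes a Python logging format of "date time LEVEL message" (the post does not show the real format) and uses a Python regex as a stand-in for a parsing rule built from the date and word matchers; it is not Datadog's grok syntax.

```python
import logging
import re

# Assumed line layout: "<date> <time> <LEVEL> <message>".
LOG_FORMAT = "%(asctime)s %(levelname)s %(message)s"
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"

# Python-regex stand-in for a "date, then word, then the rest" parsing rule.
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>\w+) (?P<message>.*)$"
)


def format_line(level: int, message: str) -> str:
    """Render one log line as a handler using LOG_FORMAT would."""
    record = logging.LogRecord("demo", level, "demo.py", 0, message, None, None)
    return logging.Formatter(LOG_FORMAT, DATE_FORMAT).format(record)


def parse_level(line: str):
    """Return the level word of a formatted line, or None if it does not match."""
    match = LINE_PATTERN.match(line)
    return match.group("level") if match else None
```

The extracted `level` attribute is the kind of field a status remapper can then promote to the official log status.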

 

Custom metric

  • A metric was added to check whether the AI model that users rely on is predicting well, and whether its predictions are skewed toward a particular class
  • host refers to the Datadog container. Because we run it as a container, look up its IP address with docker inspect [container_id] and enter it.
  • port defaults to 8125.
  • Note: the Datadog container's IP was 172.17.0.3 in this setup; verify it in yours. For the initial setup, add the code below to the script that starts the service.
from datadog import initialize, statsd

options = {
    'statsd_host':'172.17.0.3',
    'statsd_port':8125
}

initialize(**options)
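Since the container IP can change between restarts, it may be worth reading it from `docker inspect` instead of hard-coding it. The helper below is a sketch, and `container_ips` is a hypothetical name; it relies only on the JSON layout that `docker inspect` prints (an array with `NetworkSettings.Networks.<network>.IPAddress`).

```python
import json


def container_ips(inspect_output: str) -> dict:
    """Map network name to IP address from `docker inspect <container>` JSON."""
    containers = json.loads(inspect_output)  # docker inspect prints a JSON array
    networks = containers[0]["NetworkSettings"]["Networks"]
    return {name: settings["IPAddress"] for name, settings in networks.items()}


# Hypothetical usage on the Docker host (not inside the AI container):
# import subprocess
# out = subprocess.run(["docker", "inspect", "datadog-agent"],
#                      capture_output=True, text=True, check=True).stdout
# print(container_ips(out))
```

On the default bridge network the result has a single `bridge` entry, and that address is the value to use for `statsd_host`.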

 

Write the metric you want to collect as shown below. Tags let you tell values apart and group them into categories; each tag is a single 'key:value' string.

statsd.gauge('face_recovery.class', value, tags=['key:value'])
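As one possible way to track the skew described above, the share of each predicted class can be computed and reported as a gauge tagged by class. This is a sketch and not the author's implementation: the metric name `face_recovery.class_share` and both helper names are hypothetical, and `gauge` is any callable with the same shape as `statsd.gauge`.

```python
from collections import Counter


def class_shares(predictions):
    """Return the fraction of predictions that fell into each class."""
    counts = Counter(predictions)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


def report(predictions, gauge):
    """Call gauge(metric_name, value, tags=[...]) once per predicted class."""
    for label, share in sorted(class_shares(predictions).items()):
        gauge("face_recovery.class_share", share, tags=[f"class:{label}"])


# Hypothetical usage after initialize(**options):
# report(recent_predictions, statsd.gauge)
```

Passing the gauge function in as a parameter keeps the computation testable without a running agent.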

 

Many kinds of metrics can be added; depending on the metric type there are gauge, set, histogram, distribution, and others. Details are in the documentation below.

 

Reference: Metric Submission: DogStatsD (docs.datadoghq.com)

The results can be checked in the Metrics tab.

AI container (host) settings

  • When running the Docker container, the env and service name have to be set, and the environment variables have to be added inside the API image.
    • Giving the container a name makes it easy to identify when monitoring in Datadog.
    docker container run -t -d -p 8080:3000 \
    --name [container name] \
    -l com.datadoghq.tags.env=[environment: stg, prd, etc.] \
    -l com.datadoghq.tags.service=[service name] \
    [docker image]
    
  • Environment variables (Dockerfile)
ENV DD_AGENT_HOST=datadog-agent
ENV DD_SERVICE=[service name]
ENV DD_ENV=[environment: stg, prd, etc.]
ENV DD_LOGS_INJECTION=true
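Once these variables are set in the image, the application can read them instead of repeating the values in code. The sketch below is one way to do that; `agent_settings` is a hypothetical helper, the fallback values are assumptions, and `DD_DOGSTATSD_PORT` does not appear in the post (it is assumed here as the variable for the StatsD port).

```python
import os


def agent_settings(environ=os.environ):
    """Collect the Datadog-related settings from environment variables."""
    return {
        # Fallbacks are assumptions for local runs, not Datadog defaults.
        "agent_host": environ.get("DD_AGENT_HOST", "localhost"),
        "statsd_port": int(environ.get("DD_DOGSTATSD_PORT", "8125")),
        "service": environ.get("DD_SERVICE", "unknown-service"),
        "env": environ.get("DD_ENV", "dev"),
    }
```

The returned `agent_host` and `statsd_port` could then be passed as `statsd_host` and `statsd_port` to `initialize`, so that the Dockerfile stays the single place where the address is defined.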

 


Closing remarks

Because this setup was for an AI API, there were not many posts to refer to, and I was lost a lot at the beginning. Still, now that it is set up, it is nice to be able to debug errors, see where time is being spent, and see whether our environment is being used well. There are many other useful features, so once I am more comfortable with it I would like to try those too. Thank you for reading. I hope this was helpful! 🫧
