๋ฒ ์ด์ฆˆ ์ •๋ฆฌ๋ž€?

์‚ฌ์ „ ํ™•๋ฅ (prior)์ด๋ž€ ์‚ฌ๊ฑด A, B๊ฐ€ ์žˆ์„ ๋•Œ ์‚ฌ๊ฑด A๋ฅผ ๊ธฐ์ค€์œผ๋กœ ๋ณด๋ฉด ์‚ฌ๊ฑด B๊ฐ€ ๋ฐœ์ƒํ•˜๊ธฐ ์ „์— ๊ฐ€์ง€๊ณ  ์žˆ๋˜ ์‚ฌ๊ฑด A์˜ ํ™•๋ฅ ์ž…๋‹ˆ๋‹ค. ๋งŒ์•ฝ ์‚ฌ๊ฑด B๊ฐ€ ๋ฐœ์ƒํ•˜๋ฉด ์ด ์ •๋ณด๋ฅผ ๋ฐ˜์˜ํ•˜์—ฌ ์‚ฌ๊ฑด A์˜ ํ™•๋ฅ ์€ P(A|B)๋กœ ๋ณ€ํ•˜๊ฒŒ ๋˜๊ณ  ์ด๊ฒŒ ์‚ฌํ›„ํ™•๋ฅ (posterior)์ž…๋‹ˆ๋‹ค. ์˜ˆ๋ฅผ ๋“ค์–ด๋ณด์ž๋ฉด, ์‚ฌ๊ฑด B๋ฅผ ์ฒ ์ˆ˜๊ฐ€ ์˜ํฌ๋ฅผ ์ข‹์•„ํ•œ๋‹ค๋กœ ์ •์˜ํ•˜๊ณ  ์‚ฌ๊ฑด A๋Š” ์ฒ ์ˆ˜๊ฐ€ ์˜ํฌ์—๊ฒŒ ์ดˆ์ฝœ๋ฆฟ์„ ์ค€๋‹ค๋ผ๊ณ  ํ• ๊ฒŒ์š”. ์ฒ ์ˆ˜๊ฐ€ ์˜ํฌ๋ฅผ ์ข‹์•„ํ•  ํ™•๋ฅ ์„ 0.5๋ผ๊ณ  ํ•˜๊ณ  ์ฒ ์ˆ˜๊ฐ€ ์˜ํฌ์—๊ฒŒ ์ดˆ์ฝœ๋ฆฟ์„ ์ค„ ํ™•๋ฅ ์€ 0.4, ์ฒ ์ˆ˜๊ฐ€ ์˜ํฌ๋ฅผ ์ข‹์•„ํ•  ๋•Œ ์ดˆ์ฝœ๋ฆฟ์„ ์ค„ ํ™•๋ฅ ์„ ํ™•๋ฅ ์€ 0.2๋ผ๊ณ  ์ •์˜ํ•ด๋ณผ๊ฒŒ์š”. ๊ทธ๋Ÿฌ๋ฉด ์‚ฌ์ „ ํ™•๋ฅ ์€ P(B) = 0.5๊ฐ€ ๋˜๊ณ  ์‚ฌํ›„ ํ™•๋ฅ  P(A|B) = 0.2๊ฐ€ ๋ฉ๋‹ˆ๋‹ค. ๋˜ ํ•œ๊ฐ€์ง€ ๋” ์‚ฌ์ „ ํ™•๋ฅ  P(A)=0.4์ด ๋˜์ฃ .

  • ์ฒ ์ˆ˜๊ฐ€ ์˜ํฌ๋ฅผ ์ข‹์•„ํ•  ํ™•๋ฅ  P(B) = 0.5
  • ์ฒ ์ˆ˜๊ฐ€ ์˜ํฌ์—๊ฒŒ ์ดˆ์ฝœ๋ฆฟ์„ ์ค„ ํ™•๋ฅ  P(A) = 0.4
  • ์ฒ ์ˆ˜๊ฐ€ ์˜ํฌ๋ฅผ ์ข‹์•„ํ•  ๋•Œ ์ดˆ์ฝœ๋ฆฟ์„ ์ค„ ํ™•๋ฅ  P(A|B) = 0.2

From here we can go one step further and apply Bayes' theorem. Here is how Wikipedia defines it.

๋ฒ ์ด์ฆˆ ์ •๋ฆฌ๋Š” ๋‘ ํ™•๋ฅ  ๋ณ€์ˆ˜์˜ ์‚ฌ์ „ ํ™•๋ฅ ๊ณผ ์‚ฌํ›„ ํ™•๋ฅ  ์‚ฌ์ด์˜ ๊ด€๊ณ„๋ฅผ ๋‚˜ํƒ€๋‚ด๋Š” ์ •๋ฆฌ๋‹ค.

์œ„์˜ ์˜ˆ์‹œ๋กœ ๋ดค์„ ๋•Œ ์—ฌ๊ธฐ์„œ ๋งํ•˜๋Š” ๋‘ ํ™•๋ฅ  ๋ณ€์ˆ˜๋Š” ์ฒ ์ˆ˜๊ฐ€ ์˜ํฌ๋ฅผ ์ข‹์•„ํ•œ๋‹ค๋ผ๋Š” ๋ณ€์ˆ˜์™€ ์ฒ ์ˆ˜๊ฐ€ ์˜ํฌ์—๊ฒŒ ์ดˆ์ฝœ๋ฆฟ์„ ์ค€๋‹ค๋กœ ์–˜๊ธฐํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋จผ์ € ๋ฒ ์ด์ฆˆ ์ •๋ฆฌ ์‹์„ ๋ณผ๊นŒ์š”?

$$ P(A|B) = \frac {P(B|A)P(A)} {P(B)} $$

์ด ์‹์—์„œ ์ €ํฌ๊ฐ€ ๋ชจ๋ฅด๋Š” ๊ฑด ์ดˆ์ฝœ๋ฆฟ์„ ์คฌ์„ ๋•Œ ์ฒ ์ˆ˜๊ฐ€ ์˜ํฌ๋ฅผ ์ข‹์•„ํ•  ํ™•๋ฅ ์„ ์˜๋ฏธํ•˜๋Š” P(B|A)๋ฐ–์— ์—†์–ด์š”. ์ด ํ™•๋ฅ ์„ ๋ฒ ์ด์ฆˆ ์ •๋ฆฌ๋ฅผ ์ด์šฉํ•ด ๊ตฌํ•  ์ˆ˜ ์žˆ๋Š” ๊ฒƒ์ด์ฃ . ์‹์— ๋Œ€์ž…ํ•ด๋ณผ๊นŒ์š”?

$$ 0.2 = \frac {P(B|A) \times 0.4} {0.5} $$

Solving this gives P(B|A) = 0.2 × 0.5 / 0.4 = 0.25. In other words, given that Cheolsu gave Younghee chocolate, the probability that he likes her is 0.25.
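As a quick sanity check, here is a minimal Python snippet that runs the same arithmetic (the variable names are mine, purely for illustration):

# Bayes' theorem: P(B|A) = P(A|B) * P(B) / P(A)
p_b = 0.5          # prior: Cheolsu likes Younghee
p_a = 0.4          # prior: Cheolsu gives chocolate
p_a_given_b = 0.2  # chocolate, given that he likes her

p_b_given_a = p_a_given_b * p_b / p_a
print(p_b_given_a)  # 0.25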

์šฐ๋ฆฌ์˜ ๋ฌธ์ œ์— ์ ์šฉํ•ด๋ณธ๋‹ค๋ฉด?

neural network์—์„œ ํ•™์Šต๋˜๋Š” ํŒŒ๋ผ๋ฏธํ„ฐ w์™€ ์šฐ๋ฆฌ๊ฐ€ input์œผ๋กœ ๋„ฃ๋Š” ๋ฐ์ดํ„ฐ D๊ฐ€ ์žˆ์„ ๋•Œ p(w|D)๋Š” ์šฐ๋ฆฌ๊ฐ€ data๊ฐ€ ์ฃผ์–ด์กŒ์„ ๋•Œ ๊ฐ€์žฅ ์žฌํ˜„ ์ž˜ํ•˜๋Š” w๋ฅผ ๋งŒ๋“ค ํ™•๋ฅ  ๋ถ„ํฌ์ž…๋‹ˆ๋‹ค. ์ข€ ๋” ์ž์„ธํžˆ ์„ค๋ช…ํ•˜์ž๋ฉด ๋ณดํ†ต ๋”ฅ๋Ÿฌ๋‹ ํ•™์Šต์—์„  data๊ฐ€ ์ฃผ์–ด์กŒ์„ ๋•Œ ํ™•์ •๋œ w๊ฐ€ ๋งŒ๋“ค์–ด์ง€์ฃ . ์•„๋ž˜ ๊ทธ๋ฆผ์ฒ˜๋Ÿผ ํ™•์ •๋œ w๊ฐ€ ์•„๋‹Œ w์˜ ํ™•๋ฅ  ๋ถ„ํฌ๊ฐ€ ๋งŒ๋“ค์–ด์ง€๋Š” ๊ฑฐ์˜ˆ์š”.

๋ฐ์ดํ„ฐ๋Š” ์—ฐ์†์ ์ด๊ธฐ ๋•Œ๋ฌธ์— ๋ฒ ์ด์ฆˆ ์ •๋ฆฌ์˜ ์‹์ด ์กฐ๊ธˆ ๋ณ€ํ˜•๋ฉ๋‹ˆ๋‹ค. ์œ„์˜ ๋ฒ ์ด์ฆˆ ์ •๋ฆฌ ์‹๊ณผ๋Š” ๋‹ค๋ฅด๊ฒŒ p(D)๊ฐ€ ์—†์–ด์ง„ ๊ฒƒ์„ ๋ณผ ์ˆ˜ ์žˆ์–ด์š”. ์—ฌ๊ธฐ์„œ ํ•œ๊ฐ€์ง€ ๋ฌธ์ œ๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค. posterior p(D|w)๋Š” ์‰ฝ๊ฒŒ ๊ตฌํ•  ์ˆ˜ ์žˆ์„ ๊ฒƒ ๊ฐ™์ง€๋งŒ ๋ถ„๋ชจ๋ฅผ ๋ณด์‹œ๋ฉด ํŒŒ๋ผ๋ฏธํ„ฐ w์— ๋Œ€ํ•ด ์ ๋ถ„ํ•˜๊ฒŒ ๋˜๋Š”๋ฐ ๋”ฅ๋Ÿฌ๋‹ layer๊ฐ€ ๊นŠ์–ด์งˆ์ˆ˜๋ก ๊ฐฏ์ˆ˜๊ฐ€ ๋งŽ์•„์ง€๊ณ  ๋ชจ๋“  w์— ๋Œ€ํ•ด ์ ๋ถ„์„ ํ•˜๋Š” ๊ฒƒ์€ ๋ถˆ๊ฐ€๋Šฅํ•ฉ๋‹ˆ๋‹ค.

$$ p(w|D) = \frac {p(D|w)p(w)} {\int p(D|w)p(w)dw} $$
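To get a feel for what that denominator demands, here is a toy sketch of my own (not from the original derivation): a Monte Carlo estimate of the evidence p(D) for a single scalar weight. With one w this is easy; with millions of weights, sampling the prior densely enough becomes hopeless.

import torch

# toy model: data ~ N(w, 1), prior w ~ N(0, 1)
def log_likelihood(w, data):
    return (-0.5 * (data - w)**2 - 0.5 * torch.log(torch.tensor(2.0) * torch.pi)).sum()

data = torch.tensor([0.9, 1.1, 1.0])
ws = torch.randn(10_000)  # samples from the prior p(w)

# p(D) = integral of p(D|w) p(w) dw  ~  mean of p(D|w) over prior samples
log_liks = torch.stack([log_likelihood(w, data) for w in ws])
evidence = torch.exp(log_liks).mean()
print(evidence)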

This is where variational inference comes in. The idea is: define a function q that we do know how to work with, and shape it into a variational distribution that closely matches the posterior p(w|D).

Let's look at the word "variational" a bit more closely. The calculus of variations is a method for finding the condition under which some quantity is minimized or maximized, and it uses differentiation and integration to find those extrema. Back to the main thread: we choose the θ that makes q(w|θ) the probability distribution most similar to p(w|D). So how can we measure whether two probability distributions are similar? The tool used here is the KL divergence, a quantity that gets smaller the more similar two distributions are. For more detail, the post KL divergence ์‚ดํŽด๋ณด๊ธฐ 🔗 should make it easier to digest.
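As a concrete illustration (my own sketch, with made-up parameter values), the KL divergence between a diagonal Gaussian q = N(μ, σ²) and a standard normal N(0, I) has a closed form, and it is exactly the comparison the VAE below relies on:

import torch

mu = torch.tensor([0.5, -0.3])     # mean of q
sigma = torch.tensor([0.8, 1.2])   # std of q

# KL( N(mu, sigma^2) || N(0, I) ) for a diagonal Gaussian:
# 0.5 * sum( mu^2 + sigma^2 - log(sigma^2) - 1 )
kl = 0.5 * torch.sum(mu**2 + sigma**2 - torch.log(sigma**2) - 1)
print(kl)  # approaches 0 as mu -> 0 and sigma -> 1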

What is a Variational Autoencoder (VAE)?

The Wikipedia entry defines a VAE as follows.

Variational autoencoders are probabilistic generative models that require neural networks. The neural network components are typically referred to as the encoder and decoder for the first and second component respectively. The first neural network maps the input variable to a latent space that corresponds to the parameters of a variational distribution. In this way, the encoder can produce multiple different samples that all come from the same distribution. The decoder has the opposite function, which is to map from the latent space to the input space, in order to produce or generate data points.

๋ฒˆ์—ญ๊ธฐ์˜ ํž˜์„ ๋นŒ๋ ค๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค. VAE๋Š” ์‹ ๊ฒฝ๋ง์ด ํ•„์š”ํ•œ ํ™•๋ฅ ์  ์ƒ์„ฑ ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค. ์‹ ๊ฒฝ๋ง์˜ ๊ตฌ์กฐ๋Š” ์ธ์ฝ”๋”์™€ ๋””์ฝ”๋”๋กœ ์ด๋ฃจ์–ด์ ธ ์žˆ์œผ๋ฉฐ, ์ธ์ฝ”๋”๋Š” ์ž…๋ ฅ ๋ณ€์ˆ˜๋ฅผ ๋ณ€๋™ ๋ถ„ํฌ์˜ ๋งค๊ฐœ๋ณ€์ˆ˜์— ํ•ด๋‹นํ•˜๋Š” latent space์— ๋งคํ•‘ํ•ฉ๋‹ˆ๋‹ค. ์ด๋Ÿฌํ•œ ๋ฐฉ์‹์œผ๋กœ ์ธ์ฝ”๋”๋Š” ๋ชจ๋‘ ๋™์ผํ•œ ๋ถ„ํฌ์—์„œ ๋‚˜์˜ค๋Š” ์—ฌ๋Ÿฌ ๋‹ค๋ฅธ ์ƒ˜ํ”Œ์„ ์ƒ์„ฑํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋””์ฝ”๋”์—๋Š” ๋ฐ์ดํ„ฐ๋ฅผ ์ƒ์„ฑํ•˜๊ธฐ ์œ„ํ•ด latent space์—์„œ ์ž…๋ ฅ space๋กœ ๋งคํ•‘ํ•˜๋Š” ๊ธฐ๋Šฅ์„ ๊ฐ€์ง€๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.

์ข€ ๋” ์‰ฝ๊ฒŒ ์„ค๋ช…ํ•˜์ž๋ฉด input X๋ฅผ ์ž˜ ์„ค๋ช…ํ•˜๋Š” feature๋ฅผ ์ถ”์ถœํ•˜์—ฌ latent vector z์— ๋‹ด๊ณ , ์ด latent vector z๋ฅผ ํ†ตํ•ด X์™€ ๋น„์Šทํ•˜๋ฉด์„œ ์ƒˆ๋กœ์šด ๋ฐ์ดํ„ฐ๋ฅผ ์ƒ์„ฑํ•˜๋Š” ๊ฒƒ์„ ๋ชฉํ‘œ๋กœ ํ•ฉ๋‹ˆ๋‹ค. ์•„๋ž˜์˜ ๊ตฌ์กฐ๋ฅผ ํ•˜๋‚˜์”ฉ ์‚ดํŽด๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

Encoder

First, x goes through the encoder to produce the mean μ and standard deviation σ that will be used to build z. Note that the distribution here is q(z|x), not p(z|x): it would be nice to know the true distribution of z given x, but that is not easy to get, so we use variational inference and produce μ and σ through a function q that we do know. In the code below, that known function is a linear network two layers deep with a ReLU in between.

import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, input_dim, hidden_dim, latent_dim):
        super(VAE, self).__init__()
        self.input_dim = input_dim
        self.hidden_dim = hidden_dim
        self.latent_dim = latent_dim

        # encode: x -> hidden -> (mu, sigma) of q(z|x)
        self.fc1 = nn.Linear(self.input_dim, self.hidden_dim)
        self.fc2_mu = nn.Linear(self.hidden_dim, self.latent_dim)
        self.fc2_sigma = nn.Linear(self.hidden_dim, self.latent_dim)

    def encode(self, x):
        hidden = F.relu(self.fc1(x))
        mu = self.fc2_mu(hidden)                    # mean of q(z|x)
        sigma = F.softplus(self.fc2_sigma(hidden))  # softplus keeps sigma > 0
        return mu, sigma

Latent vector z

To obtain z we could simply sample from a normal distribution, but a VAE uses three ingredients as input: μ, σ, and ε. Sampling z directly would make backpropagation impossible, so we use the reparameterization trick instead. The trick serves two purposes: it keeps backpropagation possible, and by sampling the noise ε it generates slightly different data every time. As the formula z = μ + σ · ε shows, we multiply σ by ε and add μ to obtain z.

class VAE(nn.Module):
    def __init__(self, input_dim, hidden_dim, latent_dim):
        super(VAE, self).__init__()
        self.input_dim = input_dim
        self.hidden_dim = hidden_dim
        self.latent_dim = latent_dim

    def forward(self, x):
        mu, sigma = self.encode(x)
        # reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I);
        # randn_like draws fresh noise for every sample in the batch
        z = mu + sigma * torch.randn_like(sigma)
        # (continued below: z is passed to the decoder)

Decoder

Now we use the decoder to generate a new x from z.

class VAE(nn.Module):
    def __init__(self, input_dim, hidden_dim, latent_dim):
        super(VAE, self).__init__()
        self.input_dim = input_dim
        self.hidden_dim = hidden_dim
        self.latent_dim = latent_dim

        # decode
        self.fc3 = nn.Linear(self.latent_dim, self.hidden_dim)
        self.fc4 = nn.Linear(self.hidden_dim, self.input_dim)

    def decode(self, z):
        hidden = F.relu(self.fc3(z))
        output = self.fc4(hidden)
        return output

    def forward(self, x):
        mu, sigma = self.encode(x)
        z = mu + sigma * torch.randn_like(sigma)
        reconstructed_x = self.decode(z)
        return reconstructed_x, mu, sigma

Putting all the code together, we get the following.

class VAE(nn.Module):
    def __init__(self, input_dim, hidden_dim, latent_dim):
        super(VAE, self).__init__()
        self.input_dim = input_dim
        self.hidden_dim = hidden_dim
        self.latent_dim = latent_dim

        # encode
        self.fc1 = nn.Linear(self.input_dim, self.hidden_dim)
        self.fc2_mu = nn.Linear(self.hidden_dim, self.latent_dim)
        self.fc2_sigma = nn.Linear(self.hidden_dim, self.latent_dim)

        # decode
        self.fc3 = nn.Linear(self.latent_dim, self.hidden_dim)
        self.fc4 = nn.Linear(self.hidden_dim, self.input_dim)

    def encode(self, x):
        hidden = F.relu(self.fc1(x))
        mu = self.fc2_mu(hidden)                    # mean of q(z|x)
        sigma = F.softplus(self.fc2_sigma(hidden))  # softplus keeps sigma > 0
        return mu, sigma

    def decode(self, z):
        hidden = F.relu(self.fc3(z))
        output = self.fc4(hidden)
        return output

    def forward(self, x):
        mu, sigma = self.encode(x)
        z = mu + sigma * torch.randn_like(sigma)  # reparameterization trick
        reconstructed_x = self.decode(z)
        return reconstructed_x, mu, sigma
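The model definition stops here, so as a rough sketch let me round it out with the training objective. A VAE minimizes a reconstruction term plus the KL term discussed above (the negative ELBO); the choices below (MSE reconstruction, the closed-form Gaussian KL, and the example dimensions) are my assumptions, not something the code above fixes:

def vae_loss(reconstructed_x, x, mu, sigma):
    # how well the decoder reproduces the input
    recon = F.mse_loss(reconstructed_x, x, reduction='sum')
    # KL( q(z|x) || N(0, I) ), closed form for a diagonal Gaussian
    kl = 0.5 * torch.sum(mu**2 + sigma**2 - torch.log(sigma**2) - 1)
    return recon + kl

# illustrative usage with made-up dimensions
model = VAE(input_dim=784, hidden_dim=256, latent_dim=20)
x = torch.randn(16, 784)
reconstructed_x, mu, sigma = model(x)
loss = vae_loss(reconstructed_x, x, mu, sigma)
loss.backward()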

 

์ฐธ๊ณ ํ•œ ์ž๋ฃŒ๋“ค

๋ฒ ์ด์ฆˆ ์ •๋ฆฌ๋ฅผ ์ดํ•ดํ•˜๋Š” ๊ฐ€์žฅ ์‰ฌ์šด ๋ฐฉ๋ฒ• ๐Ÿ”—

[ํ†ต๊ณ„ ์ด๋ก ] ๋ฒ ์ด์ง€์•ˆ ํ†ต๊ณ„ํ•™: ์‚ฌํ›„ ๋ถ„ํฌ ๐Ÿ”—

Variational Inference, ๋ฒ ์ด์ง€์•ˆ ๋”ฅ๋Ÿฌ๋‹ ๐Ÿ”—

VAE ์ถ”์ฒœ ์‹œ์Šคํ…œ ๊ตฌํ˜„ํ•˜๊ธฐ ๐Ÿ”—

VAE ์ง๊ด€์  ์ดํ•ด ๐Ÿ”—

์ตœ๊ทผ์— ์˜ฌ๋ผ์˜จ ๊ธ€
ยซ   2025/06   ยป
์ผ ์›” ํ™” ์ˆ˜ ๋ชฉ ๊ธˆ ํ† 
1 2 3 4 5 6 7
8 9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30
Total
Today
Yesterday