<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>kimyeji2358's Blog</title>
    <link>https://kimyeji2358.tistory.com/</link>
    <description>This is kimyeji2358's blog.</description>
    <language>ko</language>
    <pubDate>Sat, 16 May 2026 01:36:52 +0900</pubDate>
    <generator>TISTORY</generator>
    <ttl>100</ttl>
    <managingEditor>kimyeji2358</managingEditor>
    <item>
      <title>5. Linear Algebra</title>
      <link>https://kimyeji2358.tistory.com/19</link>
      <description>&lt;h2 data-path-to-node=&quot;5&quot; data-ke-size=&quot;size26&quot;&gt;1. 직교 집합과 정규 직교 집합 (Orthogonal and Orthonormal Sets)&lt;/h2&gt;
&lt;p id=&quot;p-rc_e6759779233ad672-122&quot; data-path-to-node=&quot;6&quot; data-ke-size=&quot;size16&quot;&gt;&lt;span data-path-to-node=&quot;6,0&quot;&gt;&lt;/span&gt;&lt;span data-path-to-node=&quot;6,1&quot;&gt;&lt;span&gt;직교 투영을 이해하기 위해 먼저 벡터 집합의 성질을 정의함&lt;/span&gt;&lt;/span&gt;&lt;span data-path-to-node=&quot;6,2&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span data-path-to-node=&quot;6,3&quot;&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-path-to-node=&quot;7&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li id=&quot;p-rc_e6759779233ad672-123&quot; data-path-to-node=&quot;7,0,1&quot;&gt;&lt;span data-path-to-node=&quot;7,0,1,0&quot;&gt;&lt;b data-index-in-node=&quot;0&quot; data-path-to-node=&quot;7,0,1,0&quot;&gt;&lt;span&gt;직교 집합 (Orthogonal Set):&lt;/span&gt;&lt;/b&gt;&lt;span&gt; 벡터 집합 &lt;/span&gt;&lt;span data-index-in-node=&quot;30&quot; data-math=&quot;\{u_{1},...,u_{p}\}&quot;&gt;$\{u_{1},...,u_{p}\}$&lt;/span&gt;&lt;span&gt; 내의 서로 다른 모든 벡터 쌍이 직교하는 집합임&lt;/span&gt;&lt;/span&gt;&lt;span data-path-to-node=&quot;7,0,1,1&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span data-path-to-node=&quot;7,0,1,2&quot;&gt;. &lt;/span&gt;&lt;span data-path-to-node=&quot;7,0,1,3&quot;&gt;&lt;/span&gt;&lt;span data-path-to-node=&quot;7,0,1,4&quot;&gt;&lt;span&gt;즉, &lt;/span&gt;&lt;span data-index-in-node=&quot;3&quot; data-math=&quot;i \neq j&quot;&gt;$i \neq j$&lt;/span&gt;&lt;span&gt;일 때 &lt;/span&gt;&lt;span data-index-in-node=&quot;15&quot; data-math=&quot;u_{i} \cdot u_{j} = 0&quot;&gt;$u_{i} \cdot u_{j} = 0$&lt;/span&gt;&lt;span&gt;을 만족함&lt;/span&gt;&lt;/span&gt;&lt;span data-path-to-node=&quot;7,0,1,5&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span data-path-to-node=&quot;7,0,1,6&quot;&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;li data-path-to-node=&quot;7,0,1&quot;&gt;&lt;b data-index-in-node=&quot;0&quot; data-path-to-node=&quot;7,1,1,0&quot;&gt;정규 직교 집합 (Orthonormal Set):&lt;/b&gt; 직교 집합이면서 각 벡터가 단위 벡터(크기가 1)인 집합임&lt;span style=&quot;letter-spacing: 0px;&quot; data-path-to-node=&quot;7,1,1,1&quot;&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li id=&quot;p-rc_e6759779233ad672-125&quot; data-path-to-node=&quot;7,2,1&quot;&gt;&lt;span data-path-to-node=&quot;7,2,1,0&quot;&gt;&lt;b data-index-in-node=&quot;0&quot; data-path-to-node=&quot;7,2,1,0&quot;&gt;&lt;span&gt;특징:&lt;/span&gt;&lt;/b&gt;&lt;span&gt; 임의의 기저는 그램-슈미트 과정(Gram-Schmidt process)을 통해 직교 또는 정규 직교 기저로 변환 가능하며, 이는 QR 분해(QR factorization)로 이어짐&lt;/span&gt;&lt;/span&gt;&lt;span data-path-to-node=&quot;7,2,1,1&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span data-path-to-node=&quot;7,2,1,2&quot;&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;2. Projection onto a Line&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Consider projecting a vector $y$ onto a one-dimensional subspace (a line) $L$.&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;General formula:&lt;/b&gt; the projection $\hat{y}$ onto the line $L$ spanned by a vector $u$ is given below.&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;$$\hat{y} = \text{proj}_{L} y = \frac{y \cdot u}{u \cdot u} u$$&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Unit-vector case:&lt;/b&gt; if $u$ is a unit vector, then $u \cdot u = 1$ and the formula simplifies.&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;$$\hat{y} = (y \cdot u) u$$&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Geometric meaning:&lt;/b&gt; the vector $y - \hat{y}$ is perpendicular (orthogonal) to the line $L$.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bDv290/dJMcaiDgjvg/rJVqjAe9CLYH41pqYzXFn0/img.png&quot; width=&quot;493&quot; height=&quot;287&quot; /&gt;&lt;/figure&gt;
&lt;/p&gt;
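&lt;p data-ke-size=&quot;size16&quot;&gt;A minimal NumPy sketch of the line-projection formula above; the vectors $y$ and $u$ are toy values for illustration, not taken from the post:&lt;/p&gt;
&lt;pre data-ke-language=&quot;python&quot;&gt;&lt;code&gt;import numpy as np

# Toy vectors in R^2; any y and any nonzero u behave the same way.
y = np.array([7.0, 6.0])
u = np.array([4.0, 2.0])

y_hat = (y @ u) / (u @ u) * u   # proj_L y = (y.u)/(u.u) u
print(y_hat)                    # [8. 4.]
print((y - y_hat) @ u)          # 0.0, so y - y_hat is orthogonal to L
&lt;/code&gt;&lt;/pre&gt;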
&lt;h2 data-ke-size=&quot;size26&quot;&gt;3. Projection onto a Plane or Subspace&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;To project onto a subspace $W$ of dimension two or higher, we use the properties of an orthogonal basis.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;With an orthogonal basis $\{u_1, u_2\}$:&lt;/b&gt; compute the projection onto each basis vector independently and add the results.&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;$$\hat{y} = \text{proj}_{W} y = \frac{y \cdot u_1}{u_1 \cdot u_1} u_1 + \frac{y \cdot u_2}{u_2 \cdot u_2} u_2$$&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;With an orthonormal basis:&lt;/b&gt; each denominator becomes 1, which makes the computation much simpler.&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;$$\hat{y} = (y \cdot u_1) u_1 + (y \cdot u_2) u_2$$&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/PrlV7/dJMcabjSVzZ/5zPhCdP2oEdnbiLkWkqEVK/img.png&quot; width=&quot;270&quot; height=&quot;333&quot; /&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;The case $y \in W$:&lt;/b&gt; if the vector being projected already lies in the plane (subspace), the projection is just $y$ itself.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/vM6lY/dJMcabxmtYK/ybpYSfdxkjBywJUZo1UWLK/img.png&quot; width=&quot;360&quot; height=&quot;337&quot; /&gt;&lt;/figure&gt;
&lt;/p&gt;
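&lt;p data-ke-size=&quot;size16&quot;&gt;A minimal NumPy sketch of the two-term projection formula above; $u_1$, $u_2$, and $y$ are toy values chosen so that $u_1 \cdot u_2 = 0$:&lt;/p&gt;
&lt;pre data-ke-language=&quot;python&quot;&gt;&lt;code&gt;import numpy as np

# A toy orthogonal (not yet normalized) basis of a plane W in R^3.
u1 = np.array([2.0, 5.0, -1.0])
u2 = np.array([-2.0, 1.0, 1.0])
y  = np.array([1.0, 2.0, 3.0])
assert u1 @ u2 == 0              # the formula requires an orthogonal basis

y_hat = (y @ u1) / (u1 @ u1) * u1 + (y @ u2) / (u2 @ u2) * u2
print(y_hat)                                # [-0.4  2.   0.2]
print((y - y_hat) @ u1, (y - y_hat) @ u2)   # both 0: the residual is orthogonal to W
&lt;/code&gt;&lt;/pre&gt;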
&lt;h2 data-ke-size=&quot;size26&quot;&gt;4. Matrix Perspective: Projection as a Linear Transformation&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Orthogonal projection can be formulated as a matrix operation. Let $U$ be the matrix whose columns are the orthonormal basis vectors $\{u_1, u_2\}$.&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;Derivation:&lt;/b&gt;
&lt;div&gt;$$\begin{aligned} \hat{b} &amp;amp;= (u_1^T b) u_1 + (u_2^T b) u_2 \\ &amp;amp;= u_1 (u_1^T b) + u_2 (u_2^T b) \\ &amp;amp;= (u_1 u_1^T) b + (u_2 u_2^T) b \\ &amp;amp;= (u_1 u_1^T + u_2 u_2^T) b \end{aligned}$$&lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Result:&lt;/b&gt; with the matrix $U$, the projection is expressed as the linear transformation $\hat{b} = UU^T b$ (verified numerically below).&lt;/li&gt;
&lt;/ul&gt;
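&lt;p data-ke-size=&quot;size16&quot;&gt;A quick numerical check that $UU^T b$ agrees with the term-by-term projection; the basis reuses the toy vectors from the previous sketch, normalized:&lt;/p&gt;
&lt;pre data-ke-language=&quot;python&quot;&gt;&lt;code&gt;import numpy as np

# Normalize an orthogonal basis to get the orthonormal columns of U.
u1 = np.array([2.0, 5.0, -1.0]); u1 = u1 / np.linalg.norm(u1)
u2 = np.array([-2.0, 1.0, 1.0]); u2 = u2 / np.linalg.norm(u2)
U = np.column_stack([u1, u2])
b = np.array([1.0, 2.0, 3.0])

b_hat = U @ U.T @ b              # the projection as one linear map, UU^T b
print(np.allclose(b_hat, (b @ u1) * u1 + (b @ u2) * u2))   # True
&lt;/code&gt;&lt;/pre&gt;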
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;5. Relationship with $A^T A$ (the Normal Equation)&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;When projecting onto the column space of a matrix $A$, assume that $C = A^T A$ is invertible.&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;General projection formula:&lt;/b&gt; $\hat{b} = A(A^T A)^{-1} A^T b$&lt;/li&gt;
&lt;li&gt;&lt;b&gt;When the columns of $A$ are orthonormal:&lt;/b&gt; $A^T A = I$ (the identity matrix).&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Final derivation:&lt;/b&gt;
&lt;div&gt;$$\hat{b} = A(I)^{-1} A^T b = AA^T b$$&lt;/div&gt;
This is the same result as the $UU^T b$ obtained above.&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Summary:&lt;/b&gt; orthogonal projection is the optimal way to approximate a given vector by the closest vector in a given subspace. In particular, with an orthonormal basis the projection can be computed from $AA^T$ alone, with no costly inverse computation.&lt;/p&gt;
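&lt;p data-ke-size=&quot;size16&quot;&gt;A small NumPy sketch of the general formula with a toy matrix whose columns are independent but not orthogonal (illustrative values only):&lt;/p&gt;
&lt;pre data-ke-language=&quot;python&quot;&gt;&lt;code&gt;import numpy as np

# Columns of A are independent but NOT orthogonal, so the general
# formula A (A^T A)^{-1} A^T b is needed here.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

P = A @ np.linalg.inv(A.T @ A) @ A.T       # projection matrix onto Col A
b_hat = P @ b
print(np.allclose(A.T @ (b - b_hat), 0))   # residual orthogonal to Col A: True
&lt;/code&gt;&lt;/pre&gt;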
&lt;p data-path-to-node=&quot;25&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;6. Gram-Schmidt Orthogonalization&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The Gram-Schmidt process is an algorithm that converts any linearly independent set of vectors $\{x_1, \dots, x_n\}$ into an orthogonal set $\{v_1, \dots, v_n\}$.&lt;/p&gt;
&lt;p data-path-to-node=&quot;7&quot; data-ke-size=&quot;size18&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;Core algorithm steps&lt;/p&gt;
&lt;ol style=&quot;list-style-type: decimal;&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;&lt;b&gt;Step 1:&lt;/b&gt; keep the first vector as it is.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;$v_1 = x_1$&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Step 2:&lt;/b&gt; from the second vector $x_2$, remove the component projected in the direction of $v_1$, obtaining a $v_2$ perpendicular to $v_1$.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;$v_2 = x_2 - \text{proj}_{W_1} x_2 = x_2 - \frac{x_2 \cdot v_1}{v_1 \cdot v_1} v_1$&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Step 3:&lt;/b&gt; from the third vector $x_3$, remove the component projected onto the plane $W_2$ spanned by $v_1$ and $v_2$, obtaining a $v_3$ perpendicular to all of the preceding vectors.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;$v_3 = x_3 - \text{proj}_{W_2} x_3 = x_3 - \left( \frac{x_3 \cdot v_1}{v_1 \cdot v_1} v_1 + \frac{x_3 \cdot v_2}{v_2 \cdot v_2} v_2 \right)$&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Repeating this process yields an orthogonal basis in which all of the vectors are mutually perpendicular.&lt;/p&gt;
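&lt;p data-ke-size=&quot;size16&quot;&gt;A short NumPy sketch of the steps above, generalized to any number of columns; it subtracts each projection from the running vector, which is the numerically friendlier variant but algebraically identical to the formulas above:&lt;/p&gt;
&lt;pre data-ke-language=&quot;python&quot;&gt;&lt;code&gt;import numpy as np

def gram_schmidt(X):
    # Turn the linearly independent columns of X into orthogonal columns,
    # following the steps above: v_k = x_k - proj_{W_{k-1}} x_k.
    V = np.zeros_like(X, dtype=float)
    for k in range(X.shape[1]):
        v = X[:, k].astype(float)
        for j in range(k):       # subtract the projection onto each earlier v_j
            vj = V[:, j]
            v = v - (v @ vj) / (vj @ vj) * vj
        V[:, k] = v
    return V

X = np.random.default_rng(0).normal(size=(5, 3))   # random independent columns
V = gram_schmidt(X)
print(np.round(V.T @ V, 10))     # diagonal matrix: the columns are orthogonal
&lt;/code&gt;&lt;/pre&gt;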
&lt;p data-path-to-node=&quot;9&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;7. Geometric Understanding of the Gram-Schmidt Process&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Geometrically, the Gram-Schmidt process keeps only the new component that is perpendicular to the space already spanned.&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Projecting the vector $x_3$ onto the plane $W_2 = \text{Span}\{v_1, v_2\}$ gives the point $\text{proj}_{W_2} x_3$.&lt;/li&gt;
&lt;li&gt;Subtracting the projection $\text{proj}_{W_2} x_3$ from $x_3$ produces a vector $v_3$ that is exactly perpendicular to the plane $W_2$.&lt;/li&gt;
&lt;li&gt;As a result, $\{v_1, v_2, v_3\}$ is a basis of three mutually orthogonal vectors.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/CwmNF/dJMcabjSVJl/5vKkMngeJvT81HeIdQjrTK/img.png&quot; width=&quot;595&quot; height=&quot;249&quot; /&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;8. QR Factorization&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The QR factorization expresses a matrix $A$ with linearly independent columns as the product of a matrix $Q$ with orthonormal columns and an upper triangular matrix $R$.&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Components of $A = QR$&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;b&gt;The matrix $Q$ ($m \times n$):&lt;/b&gt; its columns form an orthonormal basis of the column space of $A$ (Col A).
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Normalizing the Gram-Schmidt vectors $\{v_1, \dots, v_n\}$ by their lengths gives $\{u_1, \dots, u_n\}$, which become the columns of $Q$.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;b&gt;The matrix $R$ ($n \times n$):&lt;/b&gt; an upper triangular matrix whose diagonal entries are positive.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Its entries are the coefficients obtained when the original columns $x_k$ of $A$ are expressed as linear combinations of the orthonormal basis vectors $u_i$.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;9. A Worked QR Factorization&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;QR factorization of the matrix $A = \begin{bmatrix} 1 &amp;amp; 0 &amp;amp; 0 \\ 1 &amp;amp; 1 &amp;amp; 0 \\ 1 &amp;amp; 1 &amp;amp; 1 \\ 1 &amp;amp; 1 &amp;amp; 1 \end{bmatrix}$.&lt;/p&gt;
&lt;ol style=&quot;list-style-type: decimal;&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;&lt;b&gt;Finding $Q$:&lt;/b&gt; apply the Gram-Schmidt process to the columns of $A$, then normalize.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;$u_1$ is obtained by normalizing $x_1$: $u_1 = \begin{bmatrix} 1/2 \\ 1/2 \\ 1/2 \\ 1/2 \end{bmatrix}$&lt;/li&gt;
&lt;li&gt;Compute $u_2, u_3$ the same way to form $Q = \begin{bmatrix} u_1 &amp;amp; u_2 &amp;amp; u_3 \end{bmatrix}$.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Finding $R$:&lt;/b&gt; obtain the coefficients from the inner products of the $x_k$ with the $u_i$.
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;$x_1 = 2u_1 \rightarrow r_{11} = 2$&lt;/li&gt;
&lt;li&gt;The resulting $R$ is $R = \begin{bmatrix} 2 &amp;amp; 3/2 &amp;amp; 1 \\ 0 &amp;amp; 3/\sqrt{12} &amp;amp; 2/\sqrt{12} \\ 0 &amp;amp; 0 &amp;amp; 2/\sqrt{6} \end{bmatrix}$; since $r_{kk} = \|v_k\|$, the diagonal entries are positive (see the numerical check below).&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
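&lt;p data-ke-size=&quot;size16&quot;&gt;As a numerical check, NumPy's built-in QR can be compared with the hand computation; note that LAPACK uses its own sign convention, so columns of $Q$ and the corresponding rows of $R$ may come out negated:&lt;/p&gt;
&lt;pre data-ke-language=&quot;python&quot;&gt;&lt;code&gt;import numpy as np

A = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [1.0, 1.0, 1.0]])

Q, R = np.linalg.qr(A)                  # reduced QR: Q is 4x3, R is 3x3
print(np.round(R, 4))                   # matches the hand computation up to signs
print(np.allclose(Q @ R, A))            # True
print(np.allclose(Q.T @ Q, np.eye(3)))  # True: orthonormal columns
&lt;/code&gt;&lt;/pre&gt;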
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;10. Practice&lt;/h2&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/GJme2/dJMcaiceKqk/Khr6Ejt2dKK9yVE4AyXaC1/img.png&quot; width=&quot;1125&quot; height=&quot;757&quot; /&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignLeft&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/dwfhe0/dJMcab5dsIQ/Gv7CfXdp1fxDHO9Scxzpk0/img.png&quot; width=&quot;1132&quot; height=&quot;850&quot; /&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignLeft&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bHPHfX/dJMcaipKnX3/2nALbCKBdsdpCkm7fudQp1/img.png&quot; width=&quot;1127&quot; height=&quot;826&quot; /&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignLeft&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/QsUvb/dJMcaiQMk31/4tsiZIz3N227kKZmhj9U8K/img.png&quot; width=&quot;1122&quot; height=&quot;537&quot; /&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <author>kimyeji2358</author>
      <guid isPermaLink="true">https://kimyeji2358.tistory.com/19</guid>
      <comments>https://kimyeji2358.tistory.com/19#entry19comment</comments>
      <pubDate>Mon, 11 May 2026 02:17:18 +0900</pubDate>
    </item>
    <item>
      <title>4. Linear Algebra</title>
      <link>https://kimyeji2358.tistory.com/18</link>
      <description>&lt;h2 data-ke-size=&quot;size26&quot;&gt;1. Over-determined System&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;A system of linear equations in which the number of equations ($m$) exceeds the number of unknowns ($n$):&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;$m &amp;gt; n$&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Problem: with so much data, in most cases there is no solution $x$ that satisfies every equation at once ($Ax=b$ has no solution)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Vector-space interpretation: $Ax$, a linear combination of the column vectors of $A$, always lies in the column space of $A$ (Col A)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;If the target vector $b$ lies outside this column space, no choice of $x$ can satisfy $Ax = b$&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;2. Least Squares&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;When no exact solution exists, we look for the closest approximate solution.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;1) The criterion for the best approximation: the sum of squared errors&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Writing the error vector as $e = b - Ax$, the goal is to find the $x$ that minimizes the size of this error.&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Sum of Squared Errors: the sum of the squares of the individual errors&lt;/li&gt;
&lt;li&gt;Formula for the optimal solution $\hat{x}$:
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;
&lt;div&gt;$\hat{x} = \arg \min_{x} \|b - Ax\|$&lt;/div&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;ex) Life-span prediction&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cJGKHj/dJMcagkHv7M/ek5tF00kTE0nyRgoSkTitK/img.png&quot; width=&quot;449&quot; height=&quot;155&quot; /&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;A model $Ax=b$ that predicts life span from four people's data (weight, height, smoking status)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- $x = [-0.4, 20, -20]^T$&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The error is 0 for the first three people, but the fourth person's error is -12&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The square root of the sum of squared errors is 12.0&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- $x = [-0.12, 16, -9.5]^T$&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;A small error occurs for every person,&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;but the square root of the sum of squared errors is only about 9.55&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Inner Product: $u \cdot v = u^Tv$ (a NumPy sketch of these definitions follows this list)
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Related to the angle $\theta$ between the two vectors&lt;/li&gt;
&lt;li&gt;$u \cdot v = \|u\|\|v\|\cos\theta$&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Norm
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The length of a vector: $\|v\| = \sqrt{v \cdot v}$&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Orthogonal
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;If the inner product of two vectors is 0 ($u \cdot v = 0$), the two vectors are perpendicular&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Unit Vector
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;A vector of length 1&lt;/li&gt;
&lt;li&gt;Made by normalizing: $u = \frac{1}{\|v\|}v$&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
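&lt;p data-ke-size=&quot;size16&quot;&gt;A small NumPy sketch of these definitions, using toy vectors:&lt;/p&gt;
&lt;pre data-ke-language=&quot;python&quot;&gt;&lt;code&gt;import numpy as np

u = np.array([3.0, 4.0])
v = np.array([-4.0, 3.0])

print(u @ v)                  # inner product u.v = u^T v: 0.0, so u and v are orthogonal
print(np.linalg.norm(v))      # norm ||v|| = sqrt(v.v): 5.0
unit = v / np.linalg.norm(v)  # unit vector: normalize v by its length
print(np.linalg.norm(unit))   # 1.0
cos_theta = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
print(cos_theta)              # 0.0, i.e. theta is 90 degrees
&lt;/code&gt;&lt;/pre&gt;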
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;3. Orthogonal Projection&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Principle: $Ax$ is a point on the plane Col $A$&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The point on the plane closest to $b$ is the foot of the perpendicular dropped from $b$ onto the plane: $\hat{b} = A\hat{x}$&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Orthogonality condition: at the optimum, the error vector $b - A\hat{x}$ must be perpendicular to every column vector of $A$ ($a_1, a_2, \dots, a_n$)&lt;/p&gt;
&lt;div&gt;$$A^T(b - A\hat{x}) = 0$$&lt;/div&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;4. Deriving the Normal Equation&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;1. $A^T(b - A\hat{x}) = 0$&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;2. $A^T b - A^T A \hat{x} = 0$&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;3. Normal equation: $A^T A \hat{x} = A^T b$&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Derivation via differentiation&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Differentiate the error function $f(x) = \|b - Ax\|^2 = (b - Ax)^T(b - Ax)$ with respect to $x$ and find the point where the derivative is 0&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;$(b - Ax)^T(b - Ax) = b^Tb - x^TA^Tb - b^TAx + x^TA^TAx$&lt;/li&gt;
&lt;li&gt;Differentiating with respect to $x$: $-2A^Tb + 2A^TAx = 0$&lt;/li&gt;
&lt;li&gt;This yields the same normal equation, $A^TA\hat{x} = A^Tb$&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Final solution (when the inverse exists)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;If $A^TA$ is invertible, the optimal solution is&lt;/p&gt;
&lt;div&gt;$$\hat{x} = (A^T A)^{-1} A^T b$$&lt;/div&gt;
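&lt;p data-ke-size=&quot;size16&quot;&gt;A minimal NumPy sketch that solves a toy over-determined system via the normal equation (the numbers are illustrative, not the life-span data above):&lt;/p&gt;
&lt;pre data-ke-language=&quot;python&quot;&gt;&lt;code&gt;import numpy as np

# Toy over-determined system: 3 equations, 2 unknowns, no exact solution.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

x_hat = np.linalg.solve(A.T @ A, A.T @ b)    # normal equation: A^T A x = A^T b
print(x_hat)                                 # [ 5. -3.]
print(np.linalg.norm(b - A @ x_hat))         # the smallest achievable error
# In practice lstsq is preferred, since it avoids forming A^T A explicitly.
print(np.linalg.lstsq(A, b, rcond=None)[0])  # same answer
&lt;/code&gt;&lt;/pre&gt;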
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;5. Properties of the Normal Equation and Caveats&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Existence of a solution
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;$A^TA\hat{x} = A^Tb$ always has at least one solution&lt;/li&gt;
&lt;li&gt;There is no case in which the perpendicular foot cannot be dropped&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;When the inverse does not exist
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Happens when the columns of $A$ are linearly dependent&lt;/li&gt;
&lt;li&gt;In that case there are infinitely many solutions&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;The typical case
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;If the data are sufficiently independent, $A^TA$ is invertible and the solution is unique&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;</description>
      <author>kimyeji2358</author>
      <guid isPermaLink="true">https://kimyeji2358.tistory.com/18</guid>
      <comments>https://kimyeji2358.tistory.com/18#entry18comment</comments>
      <pubDate>Fri, 3 Apr 2026 13:26:06 +0900</pubDate>
    </item>
    <item>
      <title>4-3. Generative AI, LLM, RAG, and Agents</title>
      <link>https://kimyeji2358.tistory.com/17</link>
      <description>&lt;h2 data-ke-size=&quot;size26&quot;&gt;1. Generative AI&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Generative AI is not a simple reproduction of data: it learns the patterns and structure by which data are generated and produces &lt;u&gt;new data&lt;/u&gt; that did not exist before.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;○ Training data&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Because the input itself serves as the answer, self-supervised learning is possible without separate labeling.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;This makes it possible to train on the vast amount of data on the internet.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;○ Difference in formulation&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Discriminative model: $P(y|x)$, i.e., given data $x$, predict the answer $y$&lt;/li&gt;
&lt;li&gt;Generative model: $P(x)$ or $P(x|\text{condition})$, i.e., learn the distribution of the data itself, or generate a new $x$ that fits a condition&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;○ How generative AI produces data&lt;/h4&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/m1LqN/dJMcajn7Rsl/MXzD8bcDqjkNBCV79wqwZK/img.png&quot; width=&quot;432&quot; height=&quot;193&quot; /&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;1. Training&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The stage where the model grasps the essence of the data.&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Learning structure and patterns: what form the data take when they look natural, and how the elements correlate&lt;/li&gt;
&lt;li&gt;Using multimodal data: text, images, and text plus images can all be learned&lt;/li&gt;
&lt;li&gt;Data cleaning: removing noise from the collected data and filtering it to build a high-quality dataset&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;2. Generation&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The stage where data are actually produced from the learned patterns.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Outputs are generated by probabilistic computation.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Key ingredient: latent variables&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Hidden core features that are not directly observed in the data but underlie how the data are composed&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- They simplify complex data distributions and capture the data's structure, helping the model generate new data with a consistent context and style&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;○ Data generation methods&lt;/h4&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;1. GAN (Generative Adversarial Networks)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;A generator and a discriminator learn by competing.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The generator produces fake data from a latent variable $z$, and the discriminator is trained to distinguish them from real data.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; The result is generated data that closely resembles real data&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;2. VAE (Variational Autoencoders)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;An encoder compresses the data into a low-dimensional latent variable $z$,&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;and a decoder restores it to high-dimensional data, generating new samples.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The encoder predicts the mean and standard deviation of the latent variable &amp;rarr; the latent variable is sampled from a normal distribution &amp;rarr; the model learns a low-dimensional representation of the data&lt;/p&gt;
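&lt;p data-ke-size=&quot;size16&quot;&gt;A tiny NumPy sketch of this sampling step (the reparameterization trick); the mean and standard deviation values are hypothetical stand-ins for real encoder outputs:&lt;/p&gt;
&lt;pre data-ke-language=&quot;python&quot;&gt;&lt;code&gt;import numpy as np

rng = np.random.default_rng(0)

# Hypothetical encoder outputs for one input: mean and std of a 4-dim z.
mu    = np.array([0.2, -1.0, 0.5, 0.0])
sigma = np.array([0.1,  0.3, 0.2, 1.0])

eps = rng.standard_normal(4)   # noise drawn from the standard normal N(0, I)
z = mu + sigma * eps           # z ~ N(mu, sigma^2): the sampled latent variable
print(z)                       # the low-dimensional code a decoder would map back to data
&lt;/code&gt;&lt;/pre&gt;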
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;3. 확산 모델 (Diffusion Model)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;데이터에 단계적으로 노이즈를 추가하는 순방향 확산과&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;이를 다시 복원하는 역방향 확산 과정을 통해 데이터를 생성&lt;/p&gt;
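&lt;p data-ke-size=&quot;size16&quot;&gt;순방향 확산에서 노이즈가 단계적으로 섞이는 과정을 numpy로 단순화한 스케치임. 단계 수와 노이즈 비율(beta)은 예시용 가정 값임.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import numpy as np

def forward_diffusion(x0, betas):
    # 각 단계에서 beta만큼 노이즈를 섞는 순방향 확산의 단순화된 스케치
    x = x0
    for beta in betas:
        noise = np.random.randn(*x.shape)
        x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * noise
    return x

x0 = np.ones(4)                      # 원본 데이터라고 가정한 예시 벡터
betas = np.linspace(0.01, 0.2, 10)   # 단계별 노이즈 비율 (예시 값)
xT = forward_diffusion(x0, betas)
print(xT)   # 단계가 거듭될수록 순수 노이즈에 가까워짐
&lt;/code&gt;&lt;/pre&gt;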
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;2. LLM (Large Language Model)&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;LLM은 대량의 텍스트 데이터를 학습하여 인간과 유사한 언어를 생성하는 모델&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;생성형 AI 중 텍스트 생성에 특화된 모델&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;○ 동작 원리 : Next Token Prediction&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;LLM의 본질은 다음에 올 가장 자연스러운 단어를 확률적으로 예측하는 것&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;ex) &lt;span data-path-to-node=&quot;14,0,0,1&quot;&gt;&lt;span&gt;&quot;I am a&quot; 입력 시, &lt;/span&gt;&lt;span&gt;student(0.6)&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;developer(0.3)&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;teacher(0.1)&lt;/span&gt;&lt;span&gt; 중 확률이 가장 높은 단어를 선택하여 문장을 이어 붙임&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
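&lt;p data-ke-size=&quot;size16&quot;&gt;위 예시를 그대로 코드로 옮기면 다음과 같음 (확률 값은 본문 예시를 그대로 가정).&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import random

# 'I am a' 다음에 올 단어의 확률 분포 (본문 예시 값)
candidates = ['student', 'developer', 'teacher']
probs = [0.6, 0.3, 0.1]

# 가장 확률이 높은 단어 선택 (greedy decoding)
best = candidates[probs.index(max(probs))]
print('greedy:', best)   # student

# 확률에 비례해 샘플링하면 매번 다른 단어가 나올 수 있음
print('sampled:', random.choices(candidates, weights=probs, k=1)[0])
&lt;/code&gt;&lt;/pre&gt;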
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- LLM 성능이 좋은 이유&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;방대한 데이터 규모&lt;/li&gt;
&lt;li&gt;거대한 모델 크기&lt;/li&gt;
&lt;li&gt;트랜스포머 구조&lt;/li&gt;
&lt;li&gt;전이 학습&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;○ 프롬프트 전략 : In-context Learning&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;모델의 파라미터를 수정하지 않고 입력(프롬프트) 내의 문맥 정보를 활용해 성능을 높이는 방식&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Zero-shot : 예시 없이 바로 요청&lt;/li&gt;
&lt;li&gt;One-shot : 1개의 예시 제공&lt;/li&gt;
&lt;li&gt;Few-shot : 여러 개의 예시를 제공하여 패턴을 학습시킴&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;○ 생성 방식 제어&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Temperature : 확률 분포를 얼마나 랜덤하게 사용할지 결정하는 수단
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;낮음 (0~0.3) : 확률이 높은 단어를 선택하여 안정적이고 정확한 결과를 냄&lt;/li&gt;
&lt;li&gt;높음 (0.7~1.0) : 확률이 낮은 단어도 선택될 기회를 주어 다양하고 창의적인 문장 생성&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Top-k : 확률이 높은 상위 k개의 단어만 후보로 사용&lt;/li&gt;
&lt;li&gt;Top-p : 상위 단어들의 확률 합이 p가 될 때까지 후보군을 선택하는 방식 (세 방식 모두 아래 스케치 참고)&lt;/li&gt;
&lt;/ul&gt;
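&lt;p data-ke-size=&quot;size16&quot;&gt;세 가지 제어 방식을 numpy로 단순화한 스케치임. logits 값은 모델이 출력했다고 가정한 예시 값이며, 실제 LLM의 내부 구현을 그대로 옮긴 것은 아님.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import numpy as np

def softmax(logits):
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.5, 0.1])   # 모델이 출력했다고 가정한 점수 (예시 값)

# Temperature: 낮을수록 분포가 뾰족해져 최고 확률 단어에 집중함
for t in [0.3, 1.0]:
    print('T =', t, softmax(logits / t))

def top_k(probs, k):
    # 확률 상위 k개만 남기고 재정규화
    idx = np.argsort(probs)[::-1][:k]
    out = np.zeros_like(probs)
    out[idx] = probs[idx]
    return out / out.sum()

def top_p(probs, p):
    # 누적 확률이 p에 도달할 때까지의 상위 단어만 후보로 사용
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    cut = np.searchsorted(cum, p) + 1   # 누적합이 p에 도달하는 지점까지 포함
    out = np.zeros_like(probs)
    out[order[:cut]] = probs[order[:cut]]
    return out / out.sum()

probs = softmax(logits)
print(top_k(probs, 2))
print(top_p(probs, 0.9))
&lt;/code&gt;&lt;/pre&gt;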
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;3. RAG (Retrieval-Augmented Generation)&lt;/h2&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;792&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/Y38lk/dJMcafMT0vX/Jng5AlEireCnMBKJRKkhSk/img.jpg&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/Y38lk/dJMcafMT0vX/Jng5AlEireCnMBKJRKkhSk/img.jpg&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/Y38lk/dJMcafMT0vX/Jng5AlEireCnMBKJRKkhSk/img.jpg&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FY38lk%2FdJMcafMT0vX%2FJng5AlEireCnMBKJRKkhSk%2Fimg.jpg&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;518&quot; height=&quot;321&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;792&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;RAG는 LLM의 내부 지식 한계를 극복하기 위해 외부 데이터베이스에서 관련 정보를 검색하여&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;이를 바탕으로 답변을 생성하는 기술&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;○ RAG의 3단계 구조&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;1. Retrieval (검색)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;사용자의 질문을 벡터로 변환하여 벡터 DB에서 관련 문서를 찾음&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;2. Augmentation (증강)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;검색된 정보를 질문과 함께 LLM에 입력으로 제공&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;3. Generation (생성)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;LLM이 제공된 근거 데이터를 바탕으로 최종 답변을 생성&lt;/p&gt;
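&lt;p data-ke-size=&quot;size16&quot;&gt;세 단계의 흐름을 보여주는 장난감 스케치임. 실제로는 임베딩 모델과 벡터 DB를 사용하지만, 여기서는 단어 빈도 벡터와 내적으로 검색을 단순화했다는 가정임.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import numpy as np

# 외부 지식이라고 가정한 예시 문서들
docs = ['파리는 프랑스의 수도이다', '서울은 한국의 수도이다', 'GPU는 병렬 연산에 강하다']

# 실제로는 임베딩 모델을 쓰지만, 여기서는 단어 빈도 벡터로 단순화 (가정)
vocab = sorted({w for d in docs for w in d.split()})

def embed(text):
    return np.array([text.split().count(w) for w in vocab], dtype=float)

def retrieve(query, k=1):
    # 1) Retrieval: 질문 벡터와 문서 벡터의 유사도(여기서는 내적)로 관련 문서 검색
    q = embed(query)
    scores = [float(np.dot(q, embed(d))) for d in docs]
    top = np.argsort(scores)[::-1][:k]
    return [docs[i] for i in top]

# 2) Augmentation: 검색된 근거를 질문과 함께 프롬프트로 구성
query = '한국의 수도는?'
context = retrieve(query)
prompt = '근거: ' + ' / '.join(context) + ' | 질문: ' + query
print(prompt)   # 3) Generation: 이 프롬프트를 LLM에 넘겨 근거 기반 답변을 생성
&lt;/code&gt;&lt;/pre&gt;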
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- 장점&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;최신 정보 반영&lt;/li&gt;
&lt;li&gt;정확도 향상&lt;/li&gt;
&lt;li&gt;출처 제공 가능&lt;/li&gt;
&lt;li&gt;환각 현상 감소&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;&amp;nbsp;&lt;/h2&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;4. AI Agent&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;AI Agent는 특정 목표를 달성하기 위해 스스로 판단하고 외부 도구를 사용하여 행동하는 시스템&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;주요 특징&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;자율성: 사람의 개입 없이 스스로 작업 수행&lt;/li&gt;
&lt;li&gt;상태 유지: 이전 대화나 맥락을 기억하며 환경 변화에 맞춰 행동을 바꿈&lt;/li&gt;
&lt;li&gt;실행 능력: 텍스트 생성뿐만 아니라 API호출, 파일 수정, 캘린더 등록 등 실제 작업을 수행함&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;○ 기본 구조&lt;/h4&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;1. Model (Think)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;지능의 핵심으로, 상황을 이해하고 추론함&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;2. Orchestrator (Coordinate)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;요청 목적을 해석하고, 어떤 도구를 어떤 순서로 사용할지 계획을 세움&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;3. Tools (Act)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;외부 인터페이스 (Extensions), 사용자 정의 함수 (Functions), 데이터 저장소 (Data Stores) 등을 통해 실제 행동을 취함&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;○ 주요 사고 전략&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;ReAct : 추론과 행동을 번갈아 수행하며 문제 해결 (아래 스케치 참고)&lt;/li&gt;
&lt;li&gt;CoT (Chain of Thought) : 복잡한 문제를 해결하기 위해 중간 사고 과정을 단계별로 서술&lt;/li&gt;
&lt;li&gt;ToT (Tree of Thoughts) : 여러 아이디어를 병렬로 전개하고 장단점을 비교하여 최적의 경로 선택&lt;/li&gt;
&lt;/ul&gt;
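&lt;p data-ke-size=&quot;size16&quot;&gt;ReAct의 Thought &amp;rarr; Action &amp;rarr; Observation 흐름만 보여주는 장난감 스케치임. 지식 베이스와 판단 순서는 모두 예시용 가정이며 실제 LLM 호출은 없음.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# ReAct 루프의 흐름만 보여주는 장난감 스케치 (실제 LLM 호출 없음)
def search_tool(query):
    # 외부 도구라고 가정한 아주 단순한 지식 베이스 (예시 데이터)
    kb = {'에펠탑 위치': '파리', '파리 인구': '약 210만 명'}
    return kb.get(query, '정보 없음')

def agent(goal):
    facts = {}
    plan = ['에펠탑 위치', '파리 인구']   # Thought: 필요한 정보를 이 순서로 판단했다고 가정
    for step in plan:
        obs = search_tool(step)          # Action: 외부 도구 호출
        facts[step] = obs                # Observation: 결과를 확인하고 다음 단계로
    return facts

print(agent('에펠탑이 있는 도시의 인구 구하기'))
&lt;/code&gt;&lt;/pre&gt;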
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;5. Multi-Agent&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Multi-Agent(Agentic AI)는 여러 개의 전문화된 Agent가 협력하여 하나의 크고 복잡한 목표를 해결하는 시스템&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;역할 분담&amp;nbsp;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;각 에이전트가 자재 관리, 생산 계획, 품질 검사 등 특정 전문 영역을 담당&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;고도의 자율성
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;단일 모델로 해결하기 어려운 복잡한 문제를 에이전트 간의 협업으로 해결&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;동적 계획 수정
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;상황 변화에 따라 에이전트들이 서로 소통하며 계획을 실시간으로 수정&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;○ 협업 방식&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;1. 오케스트레이션 (Orchestration) 방식&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;중앙 컨트롤러가 전체 흐름을 관리하며 각 에이전트에 작업을 분배하는 방식&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;2. 코레오그래피 (Choreography) 방식&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;중앙 통제 없이 에이전트들이 자율적으로 소통하며 협업하는 분산형 구조&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;table style=&quot;border-collapse: collapse; width: 100%;&quot; border=&quot;1&quot; data-ke-align=&quot;alignLeft&quot;&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;구분&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;AI Agent&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;Multi-Agent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;수행 작업&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;단일 작업 중심&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;복잡한 목표 중심&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;의사 결정&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;단일 모델의 추론에 의존&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;여러 전문 에이전트 간의 협의 및 협력&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;시스템 구조&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;모델+오케스트레이터+도구&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;여러 에이전트들의 네트워크 및 통신 체계&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;6. 최신 기술 동향&lt;/h2&gt;
&lt;h3 data-path-to-node=&quot;9&quot; data-ke-size=&quot;size23&quot;&gt;지식 증강 및 추론 기술 (Advanced RAG &amp;amp; Reasoning)&lt;/h3&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1024&quot; data-origin-height=&quot;583&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cAaRxl/dJMcaibMwDZ/PW4G5kLqw4B6qveeDjH8P0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cAaRxl/dJMcaibMwDZ/PW4G5kLqw4B6qveeDjH8P0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cAaRxl/dJMcaibMwDZ/PW4G5kLqw4B6qveeDjH8P0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcAaRxl%2FdJMcaibMwDZ%2FPW4G5kLqw4B6qveeDjH8P0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;476&quot; height=&quot;271&quot; data-origin-width=&quot;1024&quot; data-origin-height=&quot;583&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;h3 data-path-to-node=&quot;2&quot; data-ke-size=&quot;size23&quot;&gt;1. GraphRAG (그래프 기반 검색 증강 생성)&lt;/h3&gt;
&lt;p data-path-to-node=&quot;3&quot; data-ke-size=&quot;size16&quot;&gt;데이터를 낱개로 저장하지 않고 거미줄처럼 서로 연결된 지식의 지도로 만들어 활용하는 기술임&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-path-to-node=&quot;4&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;최신성: 문서 전체의 주제나 인물 간의 복잡한 관계를 요약하는 능력&lt;/li&gt;
&lt;li&gt;&lt;span data-path-to-node=&quot;4,1,1,0&quot;&gt;&lt;span&gt;수식적 포인트&lt;/span&gt;&lt;span&gt;: 지식 그래프&lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span data-math=&quot;G = (E, R)&quot; data-index-in-node=&quot;16&quot;&gt;$G = (E, R)$&lt;/span&gt;&lt;span&gt;&lt;span&gt;&amp;nbsp;&lt;/span&gt;구조를 사용함&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-path-to-node=&quot;4,1,2&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;span data-math=&quot;E&quot;&gt;$E$&lt;/span&gt; (Entity, 노드): 사람, 회사, 도시 같은 핵심 정보 알갱이임&lt;/li&gt;
&lt;li&gt;&lt;span data-math=&quot;R&quot;&gt;$R$&lt;/span&gt; (Relation, 간선): 누가 누구의 CEO인지, 어느 회사가 어느 도시에 있는지 연결하는 선임&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li data-path-to-node=&quot;4,1,1&quot;&gt;&lt;span style=&quot;letter-spacing: 0px;&quot;&gt;작동 원리: 데이터를 조각내어 보관하는 대신 각 정보가 서로 어떻게 연결되어 있는지 위상적 구조를 학습함&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;효과: 파편화된 정보들 사이의 맥락을 읽어내어 &quot;A사의 사장이 사는 도시의 특징&quot; 같은 복잡한 질문에도 정확히 답변함 (아래 스케치 참고)&lt;/li&gt;
&lt;/ul&gt;
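&lt;p data-ke-size=&quot;size16&quot;&gt;지식 그래프 &lt;span data-math=&quot;G = (E, R)&quot;&gt;$G = (E, R)$&lt;/span&gt;를 딕셔너리로 흉내 내어 다중 홉 질문을 푸는 과정을 보여주는 스케치임 (엔티티와 관계는 예시용 가정).&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# 지식 그래프 G = (E, R)을 딕셔너리로 표현한 장난감 예시 (데이터는 가정임)
E = ['김대표', 'A사', '서울']
R = {('김대표', 'CEO_of'): 'A사',
     ('김대표', 'lives_in'): '서울',
     ('서울', 'feature'): '대한민국의 수도'}

def hop(entity, relation):
    return R.get((entity, relation))

# 'A사의 사장이 사는 도시의 특징' 같은 다중 홉 질문을 간선을 따라가며 해결
ceo = [e for e in E if R.get((e, 'CEO_of')) == 'A사'][0]
city = hop(ceo, 'lives_in')
print(hop(city, 'feature'))   # 대한민국의 수도
&lt;/code&gt;&lt;/pre&gt;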
&lt;h3 data-path-to-node=&quot;6&quot; data-ke-size=&quot;size23&quot;&gt;2. Search-Augmented Reasoning (추론형 검색)&lt;/h3&gt;
&lt;p data-path-to-node=&quot;7&quot; data-ke-size=&quot;size16&quot;&gt;질문을 받자마자 답하는 것이 아니라 정답을 찾기 위해 스스로 전략을 짜고 검색을 반복하는 과정임&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-path-to-node=&quot;8&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;최신성: OpenAI의 o1이나 Google의 최신 모델들에 적용된 방식으로 AI가 스스로 무엇을 더 찾아봐야 할지 판단함&lt;/li&gt;
&lt;li&gt;작동 메커니즘:
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Chain of Thought (CoT) : 큰 문제를 해결하기 위해 생각을 단계별로 쪼갬&lt;/li&gt;
&lt;li&gt;ReAct 루프 : 생각하고 행동하고 관찰하는 과정을 정답이 나올 때까지 반복함
&lt;ol style=&quot;list-style-type: decimal;&quot; data-ke-list-type=&quot;decimal&quot;&gt;
&lt;li&gt;Thought (추론): 지금 상황에서 어떤 정보가 더 필요한지 판단함&lt;/li&gt;
&lt;li&gt;&lt;span data-path-to-node=&quot;8,1,1,1,2,1,1,0&quot;&gt;&lt;span&gt;Action (검색 실행)&lt;/span&gt;&lt;span&gt;: 외부 데이터나 지식 베이스에서 실제로 정보를 찾아봄 &lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span data-path-to-node=&quot;8,1,1,1,2,1,1,1&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;Observation (확인): 찾아온 결과가 도움이 되는지 확인하고 다음 할 일을 결정함&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;효과: 한 번의 검색으로 알 수 없는 복잡한 문제도 스스로 논리적 오류를 고쳐가며 정답에 도달함&lt;/li&gt;
&lt;/ul&gt;</description>
      <author>kimyeji2358</author>
      <guid isPermaLink="true">https://kimyeji2358.tistory.com/17</guid>
      <comments>https://kimyeji2358.tistory.com/17#entry17comment</comments>
      <pubDate>Thu, 2 Apr 2026 23:52:23 +0900</pubDate>
    </item>
    <item>
      <title>4-2. Transformer &amp;amp; Attention</title>
      <link>https://kimyeji2358.tistory.com/16</link>
      <description>&lt;h2 data-ke-size=&quot;size26&quot;&gt;1. 시퀀스 모델의 발전과 Transformer의 등장&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ 전통적인 모델의 한계 (RNN, LSTM)&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;시퀀스 데이터&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- 순서가 있는 데이터를 의미&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- 앞의 단어가 뒤의 단어에 영향을 미치는 텍스트, 음성, 주식 데이터 등이 해당&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;전통적인 모델(RNN, LSTM):&amp;nbsp; 이전 입력 정보를 순환 구조로 기억하며 처리하는 방식&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;한계&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;장기 의존성 문제 (Long-term Dependency) : 문장이 길어질수록 앞부분의 정보가 뒤로 전달되지 않고 소실되는 현상&lt;/li&gt;
&lt;li&gt;학습 속도 및 병렬 처리 : 데이터를 순서대로 처리해야 하므로 GPU를 활용한 병렬 연산이 불가능하여 학습 속도가 매우 느려짐&lt;/li&gt;
&lt;li&gt;기울기 소실 및 폭발 : 역전파 과정에서 그래디언트가 너무 작아지거나 커져 학습이 제대로 이루어지지 않음&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ seq2seq 모델의 한계&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;구조 : 인코더가 입력 시퀀스를 하나의 벡터로 압축하고, 디코더가 이를 통해 출력 시퀀스를 만듦&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;한계&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;정보 압축의 병목 현상 : 인코더가 전체 문장을 하나의 고정된 크기(벡터)로 압축해야 하므로 정보 손실이 발생&lt;/li&gt;
&lt;li&gt;해결책 : 문장의 특정 부분에 집중하는 Attention 메커니즘 등장&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;2. Attention 메커니즘의 작동 원리&lt;/h2&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;○ Attention의 기본 개념&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;디코더가 단어를 생성할 때 인코더의 모든 정보를 균일하게 보는 대신,&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;u&gt;필요한 부분만 선택적으로 참고&lt;/u&gt;하는 기법&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;구성요소
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Query(Q) : 현재 찾고자 하는 정보 (디코더의 현재 단어 상태)&lt;/li&gt;
&lt;li&gt;Key (K) : 인코더 각 단어의 특성 (비교 대상)&lt;/li&gt;
&lt;li&gt;Value (V) : 각 단어가 가진 실제 정보 내용&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;○ Attention 작동 단계&lt;/h4&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;250&quot; data-origin-height=&quot;187&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/wSsqB/dJMcacvRUJl/ZuuxkfczE6OOnQaMmHXW8K/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/wSsqB/dJMcacvRUJl/ZuuxkfczE6OOnQaMmHXW8K/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/wSsqB/dJMcacvRUJl/ZuuxkfczE6OOnQaMmHXW8K/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FwSsqB%2FdJMcacvRUJl%2FZuuxkfczE6OOnQaMmHXW8K%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;295&quot; height=&quot;221&quot; data-origin-width=&quot;250&quot; data-origin-height=&quot;187&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;1. 유사도 계산&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span&gt;Query(&lt;/span&gt;&lt;span data-index-in-node=&quot;6&quot; data-math=&quot;s_t&quot;&gt;$s_t$&lt;/span&gt;&lt;span&gt;)와 모든 Key(&lt;/span&gt;&lt;span data-index-in-node=&quot;19&quot; data-math=&quot;h_i&quot;&gt;$h_i$&lt;/span&gt;&lt;span&gt;)를 비교하여 얼마나 관련이 있는지 점수를 매김&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span&gt;방법 : 주로 두 벡터의 내적(Dot Product)을 사용&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span&gt;내적 값이 클수록 두 단어의 연관성이 높다는 의미&lt;/span&gt;&lt;/p&gt;
&lt;div data-math=&quot;e_t = Q \cdot K^T&quot;&gt;$$e_t = Q \cdot K^T$$&lt;/div&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;2. 가중치 계산&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;계산된 점수들을 확률 값으로 바꿈&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;방법 : Softmax 함수를 통과시키면 모든 값의 합이 1이 됨.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;결과 : 0과 1사이의 값이 나오며, 이를 어텐션 가중치라고 부름&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;이 과정을 통해 어떤 단어에 더 집중할지 결정되는 어텐션 분포가 형성됨&lt;/p&gt;
&lt;div data-math=&quot;\alpha_t = \text{softmax}(e_t)&quot;&gt;$$\alpha_t = \text{softmax}(e_t)$$&lt;/div&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;3. 가중합 계산&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;각 단어의 실제 정보(Value)에 구한 가중치를 곱해 모두 더함&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;방법 : 가중치가 높은 단어의 정보는 크게 반영되고, 낮은 단어는 작게 반영됨&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;결과 : 이렇게 얻은 최종 결과물 &amp;rarr; 어텐션 값(Attention Value) or 컨텍스트 벡터(Context Vector)라고 부름&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;입력 문장의 문맥 정보가 Query에 맞춰 압축된 형태&lt;/p&gt;
&lt;div data-math=&quot;a_t = \sum_{i=1}^{N} \alpha_i^t h_i&quot;&gt;$$a_t = \sum_{i=1}^{N} \alpha_i^t h_i$$&lt;/div&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;4. 단어 예측&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;구해진 어텐션 값을 단어 예측에 반영함&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;방법 : 어텐션 값 &lt;span&gt;(&lt;/span&gt;&lt;span data-index-in-node=&quot;10&quot; data-math=&quot;a_t&quot;&gt;$a_t$&lt;/span&gt;&lt;span&gt;)&lt;/span&gt; 과 현재 디코더의 상태&lt;span&gt;(&lt;/span&gt;&lt;span data-index-in-node=&quot;27&quot; data-math=&quot;s_t&quot;&gt;$s_t$&lt;/span&gt;&lt;span&gt;)를 연결(Concatenate)&amp;nbsp;&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span&gt;예측 : 연결된 벡터를 신경망에 통과시켜 최종적으로 다음에 올 가장 적절한 단어를 출력&lt;/span&gt;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;3. Self-Attention&lt;/h2&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;604&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bmMnrV/dJMcacbxEcI/KeVkaeCj1nbvfIZrPOMiqK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bmMnrV/dJMcacbxEcI/KeVkaeCj1nbvfIZrPOMiqK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bmMnrV/dJMcacbxEcI/KeVkaeCj1nbvfIZrPOMiqK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbmMnrV%2FdJMcacbxEcI%2FKeVkaeCj1nbvfIZrPOMiqK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;599&quot; height=&quot;283&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;604&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;정의: 문장 내 단어들끼리 서로 Attention을 수행하여 단어 간의 연관성을 파악&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;이점: &quot;The animal... because it...&quot; 문장에서 &quot;it&quot;이 &quot;animal&quot;을 가리킨다는 문맥을 컴퓨터가 이해하도록 도움&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;계산 특징: 시작 값 (Q,K,V)은 동일 문장에서 나오지만, 학습되는 가중치&lt;span&gt;(&lt;/span&gt;&lt;span data-index-in-node=&quot;45&quot; data-math=&quot;W^Q, W^K, W^V&quot;&gt;$W^Q, W^K, W^V$&lt;/span&gt;&lt;span&gt;)에 의해 최종 값은 달라짐&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;&lt;span&gt;○ Scaled Dot-Product Attention&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;217&quot; data-origin-height=&quot;232&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/y68J2/dJMcaaEOUX6/pGkhAK2XWvutcM5XjdWHI0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/y68J2/dJMcaaEOUX6/pGkhAK2XWvutcM5XjdWHI0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/y68J2/dJMcaaEOUX6/pGkhAK2XWvutcM5XjdWHI0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fy68J2%2FdJMcaaEOUX6%2FpGkhAK2XWvutcM5XjdWHI0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;278&quot; height=&quot;297&quot; data-origin-width=&quot;217&quot; data-origin-height=&quot;232&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span&gt;Transformer에서 사용하는 Self-Attention 공식&lt;/span&gt;&lt;/p&gt;
&lt;div data-math=&quot;\text{Attention}(Q, K, V) = \text{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V&quot;&gt;$$\text{Attention}(Q, K, V) = \text{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V$$&lt;/div&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Scaling 연산: &lt;span data-math=&quot;QK^T&quot;&gt;$QK^T$&lt;/span&gt;를 $\sqrt{d_k}$로 나눔&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; 차원이 커질수록 내적값이 커져 Softmax의 기울기가 소실되는 문제를 방지하기 위함&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- 병렬 처리: 여러 단어를 한꺼번에 처리하여 연산 속도(GPU 활용)가 비약적으로 향상됨&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
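&lt;p data-ke-size=&quot;size16&quot;&gt;위 공식을 numpy로 그대로 옮긴 스케치임 (입력 행렬의 크기와 값은 예시용 가정).&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # 유사도 계산 + 스케일링
    weights = softmax(scores)         # 어텐션 가중치 (각 행의 합이 1)
    return weights @ V                # Value의 가중합 = 컨텍스트 벡터

# 단어 3개, 차원 4의 장난감 예시 (값은 무작위 가정)
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
print(scaled_dot_product_attention(Q, K, V).shape)   # (3, 4)
&lt;/code&gt;&lt;/pre&gt;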
&lt;p data-ke-size=&quot;size16&quot;&gt;Q 왜 &lt;span data-math=&quot;d_k&quot;&gt;$d_k$&lt;/span&gt;의 제곱근으로 나누는가? (Scaling)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span data-path-to-node=&quot;23,0,0&quot;&gt;답: Softmax의 기울기 소실 문제를 방지하기 위함&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span data-index-in-node=&quot;8&quot; data-math=&quot;d_k&quot;&gt;$d_k$&lt;/span&gt;(Key의 차원)가 커질수록 내적 값(&lt;span data-index-in-node=&quot;32&quot; data-math=&quot;QK^T&quot;&gt;$QK^T$&lt;/span&gt;)의 분산이 커짐.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;내적 값이 커지면 Softmax 함수 그래프에서 미분 값이 0에 수렴하는 평평한 영역으로 이동&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; 미분 값이 0이 되면 역전파 시 학습이 이루어지지 않음. 따라서 $\sqrt{d_k}$로 나누어 값의 범위를 조절함으로써 안정적인 학습 환경을 제공&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Q seq2seq 모델과 비교했을 때 Attention이 해결한 가장 큰 문제점은 무엇인가?&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;답: 정보의 병목 현상과 병렬 처리의 한계를 해결함&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;병목 제거: 고정된 크기의 벡터 하나에 모든 정보를 넣지 않고, 필요할 때마다 입력 문장 전체를 다시 들여다 보는 방식&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;병렬성 확보: RNN의 순차 구조를 벗어나 GPU 연산 효율을 극대화함.&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;&lt;span data-path-to-node=&quot;23,0,0&quot;&gt;4. Transformer 아키텍처&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1320&quot; data-origin-height=&quot;1860&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/C7yme/dJMcahKHEAm/wbvOlUn6ieWZUisdjMKgck/img.webp&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/C7yme/dJMcahKHEAm/wbvOlUn6ieWZUisdjMKgck/img.webp&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/C7yme/dJMcahKHEAm/wbvOlUn6ieWZUisdjMKgck/img.webp&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FC7yme%2FdJMcahKHEAm%2FwbvOlUn6ieWZUisdjMKgck%2Fimg.webp&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;344&quot; height=&quot;485&quot; data-origin-width=&quot;1320&quot; data-origin-height=&quot;1860&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;&lt;span data-path-to-node=&quot;23,0,0&quot;&gt;○ 인코더-디코더 구조&lt;/span&gt;&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;N개 층: 기존 seq2seq와 달리 인코더와 디코더 단위를 N개씩 쌓아 고차원 특징을 학습&lt;/li&gt;
&lt;li&gt;Auto-regressive : 디코더는 &amp;lt;sos&amp;gt; 로 시작해 &amp;lt;eos&amp;gt;가 나올 때까지 순차적으로 예측을 진행&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ 내부 구조적 특징&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Multi-Head Attention : Attention을 여러 개(Head)로 나누어 병렬로 수행. 이를 통해 한 문장 안에서도 문법적 관계, 의미적 관계 등 다양한 문맥적 정보를 동시에 포착 가능&lt;/li&gt;
&lt;li&gt;Residual Connection (잔차 연결): 각 층의 입력을 출력에 다시 더해주는 방식으로, 정보가 소실되지 않고 깊은 층까지 전달되도록 도움&lt;/li&gt;
&lt;li&gt;Layer Normalization (층 정규화) : 각 층의 출력을 일정 범위로 정규화하여 학습 속도를 높이고 안정화&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ Positional Encoding (위치 인코딩)&lt;/h3&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;2159&quot; data-origin-height=&quot;1296&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cdbsIM/dJMb99TpJir/S2s4b7xWibVWGMu3zDk320/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cdbsIM/dJMb99TpJir/S2s4b7xWibVWGMu3zDk320/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cdbsIM/dJMb99TpJir/S2s4b7xWibVWGMu3zDk320/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcdbsIM%2FdJMb99TpJir%2FS2s4b7xWibVWGMu3zDk320%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;470&quot; height=&quot;282&quot; data-origin-width=&quot;2159&quot; data-origin-height=&quot;1296&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;필요성: Transformer는 데이터를 한꺼번에 입력받으므로 단어의 위치 정보가 없음&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;방법: 임베딩 벡터에 위치 정보를 담은 함수값을 더함&lt;/p&gt;
&lt;div&gt;
&lt;div data-math=&quot;PE_{(pos, 2i)} = \sin(pos/10000^{2i/d_{model}}), \; PE_{(pos, 2i+1)} = \cos(pos/10000^{2i/d_{model}})&quot;&gt;$$PE_{(pos, 2i)} = \sin(pos/10000^{2i/d_{model}}), \quad PE_{(pos, 2i+1)} = \cos(pos/10000^{2i/d_{model}})$$&lt;/div&gt;
&lt;/div&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;효과: 동일한 단어라도 문장 내 위치에 따라 모델이 다르게 인식하게 됨&lt;/p&gt;
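&lt;p data-ke-size=&quot;size16&quot;&gt;위 수식을 numpy로 구현한 스케치임 (max_len, d_model 값은 예시용 가정).&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import numpy as np

def positional_encoding(max_len, d_model):
    # 짝수 차원에는 sin, 홀수 차원에는 cos를 사용 (d_model은 짝수라고 가정)
    pos = np.arange(max_len)[:, None]
    i = np.arange(0, d_model, 2)[None, :]
    angle = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angle)
    pe[:, 1::2] = np.cos(angle)
    return pe

pe = positional_encoding(max_len=5, d_model=8)
print(pe.shape)   # (5, 8) - 각 행을 해당 위치의 임베딩 벡터에 그대로 더함
&lt;/code&gt;&lt;/pre&gt;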
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ Transformer의 강점 및 응용&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;강점: 장기 의존성 학습이 가능, 병렬 처리를 통해 대규모 데이터 학습에 최적화&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;응용 분야: GPT, BERT의 뼈대가 되었으며 자동번역, 챗봇, 요약, 코드 생성 등에 사용&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- 최신 동향 및 과제&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;동향: 모델 경량화, 다중 모달(이미지+텍스트) 모델, 설명 가능한 AI(XAI)&lt;/li&gt;
&lt;li&gt;과제 : 데이터 편향 문제, 개인 정보 보호, 모델 해석력 향상&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Q 왜 Transformer는 RNN 없이도 문장의 순서를 이해할 수 있는가?&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;답: 포지셔널 인코딩(Positional Encoding) 덕분&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Transformer는 모든 단어를 한꺼번에 입력받아 병렬 처리하므로 위치 정보가 사라짐&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;이를 해결하기 위해 각 단어의 임베딩 벡터에 고유한 위치 정보를 가진 함수값을 더함&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;모델은 단어의 절대적 위치, 단어 간의 상대적인 거리 차이까지 학습 가능. 순차적 처리 없이도 문맥을 파악함&lt;/p&gt;</description>
      <author>kimyeji2358</author>
      <guid isPermaLink="true">https://kimyeji2358.tistory.com/16</guid>
      <comments>https://kimyeji2358.tistory.com/16#entry16comment</comments>
      <pubDate>Thu, 2 Apr 2026 16:28:19 +0900</pubDate>
    </item>
    <item>
      <title>4-1. NLP기초 (Embedding &amp;amp; RNN)</title>
      <link>https://kimyeji2358.tistory.com/15</link>
      <description>&lt;h2 data-ke-size=&quot;size26&quot;&gt;1. 임베딩(Embedding)&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;컴퓨터는 인간의 언어를 직접 이해할 수 없으며 오직 숫자만 처리 가능&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;텍스트 데이터를 컴퓨터가 처리할 수 있도록 수치화하는 과정 필요&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; 벡터화(Vectorization)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;1-1. One-Hot Encoding&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;단어 집합의 크기를 차원으로 함&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;표현하고 싶은 단어의 인덱스에만 1을 부여, 나머지는 모두 0 표시&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; 희소 표현 (Sparse Representation)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
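&lt;p data-ke-size=&quot;size16&quot;&gt;One-Hot Encoding을 파이썬으로 옮기면 다음과 같음 (단어 집합은 예시용 가정).&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# One-Hot Encoding의 간단한 스케치 (단어 집합은 예시용 가정)
vocab = ['you', 'say', 'goodbye', 'and', 'i', 'hello']

def one_hot(word):
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1   # 해당 인덱스만 1, 나머지는 0 (희소 표현)
    return vec

print(one_hot('goodbye'))   # [0, 0, 1, 0, 0, 0]
&lt;/code&gt;&lt;/pre&gt;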
&lt;p data-ke-size=&quot;size16&quot;&gt;문제점&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;의미 결여 : 단어 간의 유사도를 계산할 수 없음&amp;nbsp;&lt;/li&gt;
&lt;li&gt;차원의 저주 : 단어 수가 늘어날수록 벡터의 길이와 0의 개수가 무한히 증가하여 계산 효율&amp;darr;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;1-2. 워드 임베딩 &amp;amp; Word2Vec&lt;/h3&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;○ 임베딩 (Embedding)&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;단어를 고정된 크기의 밀집 벡터(Dense Vector)로 표현하는 방식 &amp;rarr; 분산 표현(Distributed Representation)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;학습을 통해 단어의 의미를 다차원 공간상의 좌표로 나타냄&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;장점
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;의미가 비슷한 단어들은 벡터 공간에서 서로 가까운 거리에 위치&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;○ Word2Vec&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;단어의 의미를 저차원 벡터의 여러 차원에 분산하여 표현하는 기술&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;두 가지 학습 방식: CBOW, Skip-gram&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;875&quot; data-origin-height=&quot;453&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/IrAb3/dJMcadanR0X/kK4iG84L0igzCxFCONFxc0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/IrAb3/dJMcadanR0X/kK4iG84L0igzCxFCONFxc0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/IrAb3/dJMcadanR0X/kK4iG84L0igzCxFCONFxc0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FIrAb3%2FdJMcadanR0X%2FkK4iG84L0igzCxFCONFxc0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;536&quot; height=&quot;277&quot; data-origin-width=&quot;875&quot; data-origin-height=&quot;453&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- CBOW (Continuous Bag of Words)&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;원리 : 주변에 있는 단어들(맥락)을 통해 중앙에 있는 빈칸(타깃 단어)을 예측하는 신경망 구조&lt;/li&gt;
&lt;li&gt;특징 : 여러 단어를 한꺼번에 처리하므로 학습 속도가 빠름&lt;/li&gt;
&lt;li&gt;ex) You __ goodbye 타깃: __에 들어갈 단어&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- Skip-gram 방식&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;원리 : 중앙의 단어로부터 주변의 여러 단어를 예측하는 모델&lt;/li&gt;
&lt;li&gt;특징 : CBOW보다 학습량이 많고 더 어려운 문제를 풂 &amp;rarr; 단어의 분산 표현(임베딩) 결과가 더 뛰어날 가능성 높음 (아래 스케치 참고)&lt;/li&gt;
&lt;/ul&gt;
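&lt;p data-ke-size=&quot;size16&quot;&gt;CBOW와 Skip-gram이 같은 문장에서 학습 쌍을 어떻게 다르게 구성하는지 보여주는 스케치임 (문장과 윈도우 크기는 예시용 가정).&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# CBOW와 Skip-gram이 학습 데이터를 어떻게 구성하는지 보여주는 스케치
words = 'you say goodbye and i say hello'.split()
window = 1

cbow_pairs, skipgram_pairs = [], []
for t in range(window, len(words) - window):
    context = [words[t - window], words[t + window]]   # 주변 단어(맥락)
    target = words[t]                                  # 중앙 단어(타깃)
    cbow_pairs.append((context, target))       # CBOW: 맥락으로 타깃을 예측
    for c in context:
        skipgram_pairs.append((target, c))     # Skip-gram: 타깃으로 맥락을 예측

print(cbow_pairs[0])        # (['you', 'goodbye'], 'say')
print(skipgram_pairs[:2])   # [('say', 'you'), ('say', 'goodbye')]
&lt;/code&gt;&lt;/pre&gt;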
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;2. RNN (Recurrent Neural Network, 순환 신경망)&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ FNN (Feedforward Neural Network)의 한계&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;기존의 순전파 네트워크는 데이터가 한 방향으로만 이동&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;입력 크기가 고정된 데이터에는 적합&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;but 이전 단어를 기억해야 하는 시퀀스(Sequence) 데이터 처리에는 어려움이 있다&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ RNN의 구조와 특징&lt;/h3&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;800&quot; data-origin-height=&quot;267&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/trEH9/dJMcafF6IUa/EkdDj7e4bGwd7w5w5Gt6qk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/trEH9/dJMcafF6IUa/EkdDj7e4bGwd7w5w5Gt6qk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/trEH9/dJMcafF6IUa/EkdDj7e4bGwd7w5w5Gt6qk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FtrEH9%2FdJMcafF6IUa%2FEkdDj7e4bGwd7w5w5Gt6qk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;611&quot; height=&quot;204&quot; data-origin-width=&quot;800&quot; data-origin-height=&quot;267&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;RNN은 시계열 또는 순차 데이터를 예측하는 데 효과적인 구조&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;순환 구조: 이전 시점의 은닉 상태(Hidden State, &lt;span data-math=&quot;h_{t-1}&quot;&gt;$h_{t-1}$&lt;/span&gt;)를 현재 시점의 입력(&lt;span data-math=&quot;x_t&quot;&gt;$x_t$&lt;/span&gt;)과 함께 참조하여 현재의 값(&lt;span data-math=&quot;h_t&quot;&gt;$h_t$&lt;/span&gt;)을 결정&lt;/li&gt;
&lt;li&gt;수식 : &lt;span data-math=&quot;h_t = f(W_h h_{t-1} + W_x x_t + b)&quot;&gt;$h_t = f(W_h h_{t-1} + W_x x_t + b)$&lt;/span&gt; (활성화 함수 &lt;span data-math=&quot;f&quot;&gt;$f$&lt;/span&gt;를 거쳐 결과를 내보냄, 아래 스케치 참고)&lt;/li&gt;
&lt;li&gt;종류
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;One-to-Many (이미지 캡셔닝) : 하나의 입력 데이터를 반복하여 신경망에 입력, 연속된 시퀀스 출력&lt;/li&gt;
&lt;li&gt;Many-to-One (감정 분석) : 연속된 단어로 이루어진 문장 시퀀스에 대한 분석&lt;/li&gt;
&lt;li&gt;Many-to-Many (번역, 주식 예측) : 시계열 데이터 예측&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
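&lt;p data-ke-size=&quot;size16&quot;&gt;위 수식 한 단계를 numpy로 구현한 스케치임. 활성화 함수는 tanh, 가중치는 무작위 값이라고 가정함.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import numpy as np

def rnn_step(h_prev, x_t, W_h, W_x, b):
    # h_t = f(W_h h_{t-1} + W_x x_t + b), 활성화 함수 f는 tanh로 가정
    return np.tanh(W_h @ h_prev + W_x @ x_t + b)

# 은닉 차원 3, 입력 차원 2의 장난감 예시 (가중치는 무작위 가정)
rng = np.random.default_rng(0)
W_h, W_x, b = rng.standard_normal((3, 3)), rng.standard_normal((3, 2)), np.zeros(3)

h = np.zeros(3)                            # 초기 은닉 상태
for x_t in rng.standard_normal((4, 2)):    # 길이 4의 입력 시퀀스
    h = rnn_step(h, x_t, W_h, W_x, b)      # 모든 시점에 동일한 가중치를 재사용
print(h)
&lt;/code&gt;&lt;/pre&gt;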
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;400&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/d9Vj7Y/dJMcah4ZH7o/WJniZsOjibfOZJUiMwNaIK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/d9Vj7Y/dJMcah4ZH7o/WJniZsOjibfOZJUiMwNaIK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/d9Vj7Y/dJMcah4ZH7o/WJniZsOjibfOZJUiMwNaIK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fd9Vj7Y%2FdJMcah4ZH7o%2FWJniZsOjibfOZJUiMwNaIK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;575&quot; height=&quot;400&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;400&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ BPTT (Backpropagation Through Time)&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;RNN의 역전파 방법&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;모든 시점(Sequence)에 대해 동일한 가중치 벡터를 사용&lt;/li&gt;
&lt;li&gt;역전파 시, 각 시점의 가중치 기울기를 전부 더해서 업데이트하여, 현재의 오차가 과거의 오차와 연결&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;637&quot; data-origin-height=&quot;161&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/VGXYx/dJMcaibL6t1/KDzxIWjKphokn7nmKFNsJk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/VGXYx/dJMcaibL6t1/KDzxIWjKphokn7nmKFNsJk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/VGXYx/dJMcaibL6t1/KDzxIWjKphokn7nmKFNsJk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FVGXYx%2FdJMcaibL6t1%2FKDzxIWjKphokn7nmKFNsJk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;637&quot; height=&quot;161&quot; data-origin-width=&quot;637&quot; data-origin-height=&quot;161&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ RNN의 문제점과 발전된 모델(LSTM, GRU)&lt;/h3&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- RNN의 한계&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;1. 장기 의존성 (Long-Term Dependency) 문제&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;시퀀스가 길어질수록 앞쪽의 정보가 뒤쪽까지 충분히 전달되지 않아 기억력이 약해짐&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;2. 기울기 소실(Vanishing Gradient) 및 폭발&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;동일한 가중치를 반복해서 곱하기 때문에&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;가중치가 1보다 작으면 기울기가 0으로 수렴하고 1보다 크면 무한히 커짐&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;● LSTM (Long Short-Term Memory)&lt;/h4&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;362&quot; data-origin-height=&quot;139&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bPzs4W/dJMcabqbERI/79QO0VqTaqh6pot1rG08MK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bPzs4W/dJMcabqbERI/79QO0VqTaqh6pot1rG08MK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bPzs4W/dJMcabqbERI/79QO0VqTaqh6pot1rG08MK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbPzs4W%2FdJMcabqbERI%2F79QO0VqTaqh6pot1rG08MK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;425&quot; height=&quot;163&quot; data-origin-width=&quot;362&quot; data-origin-height=&quot;139&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;장기 의존성 문제를 해결하기 위해 Cell State를 도입한 모델&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;3개의 게이트
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Forget Gate: 기억하지 않아도 될 정보를 삭제&lt;/li&gt;
&lt;li&gt;Input Gate: 새로운 입력 중 어떤 정보를 기억할지 결정&lt;/li&gt;
&lt;li&gt;Output Gate: 어떤 정보를 출력으로 내보낼지 결정&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;이를 통해 장기 기억을 효과적으로 유지&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;● GRU(Gated Recurrent Unit)&lt;/h4&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;332&quot; data-origin-height=&quot;152&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/lR9Dg/dJMcacP8Q4Y/LYhAREEMxBxJKr5PQvN7Ck/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/lR9Dg/dJMcacP8Q4Y/LYhAREEMxBxJKr5PQvN7Ck/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/lR9Dg/dJMcacP8Q4Y/LYhAREEMxBxJKr5PQvN7Ck/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FlR9Dg%2FdJMcacP8Q4Y%2FLYhAREEMxBxJKr5PQvN7Ck%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;463&quot; height=&quot;212&quot; data-origin-width=&quot;332&quot; data-origin-height=&quot;152&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;LSTM의 구조를 단순화하면서도 유사한 성능을 내도록 설계된 모델&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;2개의 게이트
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Update Gate&lt;/li&gt;
&lt;li&gt;Reset Gate&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;특징 : 파라미터 수가 적어 연산 속도가 빠르며, 비교적 적은 데이터에서도 효율적임&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <author>kimyeji2358</author>
      <guid isPermaLink="true">https://kimyeji2358.tistory.com/15</guid>
      <comments>https://kimyeji2358.tistory.com/15#entry15comment</comments>
      <pubDate>Thu, 2 Apr 2026 14:20:15 +0900</pubDate>
    </item>
    <item>
      <title>3. 선형대수</title>
      <link>https://kimyeji2358.tistory.com/14</link>
      <description>&lt;h2 data-ke-size=&quot;size26&quot;&gt;1. 함수의 기본 요소&amp;nbsp;&lt;/h2&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;정의역 (Domain, &lt;span data-math=&quot;\mathbb{R}^n&quot;&gt;$\mathbb{R}^n$&lt;/span&gt;) : 입력 벡터 x가 속한 전체 집합&lt;/li&gt;
&lt;li&gt;공역 (Co-domain, &lt;span data-math=&quot;\mathbb{R}^m&quot;&gt;$\mathbb{R}^m$&lt;/span&gt;) : 출력 벡터 y가 존재할 수 있는 후보지 전체 집합&lt;/li&gt;
&lt;li&gt;상 (Image) : 특정 입력 x에 의해 매핑된 결과값 &lt;span data-math=&quot;T(x)&quot;&gt;$T(x)$&lt;/span&gt;를 의미&lt;/li&gt;
&lt;li&gt;치역 (Range) : 정의역의 모든 원소를 변환했을 때 얻어지는 실제 결과값들의 집합
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;&lt;span data-index-in-node=&quot;13&quot; data-math=&quot;\mathbb{R}^n&quot;&gt;&lt;span data-index-in-node=&quot;14&quot; data-math=&quot;\mathbb{R}^m&quot;&gt;치역은 항상 공역의 부분집합&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;2. 선형 변환 (Linear Transformation)&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;선형변환은 벡터를 다른 벡터로 옮길 때, 공간의 격자모양과 원점을 유지하는 변환&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- 선형성의 조건&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span data-index-in-node=&quot;0&quot; data-math=&quot;T(u + v) = T(u) + T(v)&quot;&gt;$$T(u + v) = T(u) + T(v)$$&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;1. 두 벡터를 더한 뒤 변환하나, 각각 변환한 뒤 더하나 결과가 같아야 함.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span data-index-in-node=&quot;0&quot; data-math=&quot;T(cu) = cT(u)&quot;&gt;$$T(cu) = cT(u)$$&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;2. 벡터의 길이를 늘린 뒤 변환하나, 변환한 뒤 늘리나 결과가 같아야 함.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span data-index-in-node=&quot;0&quot; data-math=&quot;T(cu) = cT(u)&quot;&gt;ex) &lt;span data-index-in-node=&quot;4&quot; data-math=&quot;T(x) = 3x&quot;&gt;$T(x) = 3x$&lt;/span&gt; &lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span data-index-in-node=&quot;0&quot; data-math=&quot;x_1=1, x_2=2&quot;&gt;$x_1=1, x_2=2$&lt;/span&gt;&lt;span&gt;일 때, 선형결합 &lt;/span&gt;&lt;span data-index-in-node=&quot;22&quot; data-math=&quot;4x_1 + 5x_2 = 14&quot;&gt;$4x_1 + 5x_2 = 14$&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b data-index-in-node=&quot;0&quot; data-path-to-node=&quot;10,0,1,1,1,0&quot;&gt;&lt;span&gt;방법 1&lt;/span&gt;&lt;/b&gt;&lt;span&gt;: &lt;/span&gt;&lt;span data-index-in-node=&quot;6&quot; data-math=&quot;T(14) = 3 \times 14 = 42&quot;&gt;$T(14) = 3 \times 14 = 42$&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b data-index-in-node=&quot;0&quot; data-path-to-node=&quot;10,0,1,2,1,0&quot;&gt;&lt;span&gt;방법 2&lt;/span&gt;&lt;/b&gt;&lt;span&gt;: &lt;/span&gt;&lt;span data-index-in-node=&quot;6&quot; data-math=&quot;4T(1) + 5T(2) = 4(3) + 5(6) = 12 + 30 = 42&quot;&gt;$4T(1) + 5T(2) = 4(3) + 5(6) = 12 + 30 = 42$&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; &lt;span&gt;두 결과가 같으므로 이 변환은 &lt;/span&gt;&lt;b data-index-in-node=&quot;17&quot; data-path-to-node=&quot;10,0,1,3,0,1&quot;&gt;&lt;span&gt;선형&lt;/span&gt;&lt;/b&gt;&lt;/p&gt;
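&lt;p data-ke-size=&quot;size16&quot;&gt;위 검증 과정을 코드로 옮기면 다음과 같음 (본문 예시 값 그대로).&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# 본문 예시 T(x) = 3x의 선형성을 수치로 확인하는 스케치
def T(x):
    return 3 * x

x1, x2 = 1, 2
lhs = T(4 * x1 + 5 * x2)        # 방법 1: 선형결합 후 변환, T(14) = 42
rhs = 4 * T(x1) + 5 * T(x2)     # 방법 2: 변환 후 선형결합, 12 + 30 = 42
print(lhs, rhs, lhs == rhs)     # 42 42 True
&lt;/code&gt;&lt;/pre&gt;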
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- 표준 행렬 (Standard Matrix)&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;모든 선형변환은 행렬 A와 x의 곱(Ax)으로 표현 가능&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;행렬 A의 각 열은 단위행렬의 기저 벡터 &lt;span data-math=&quot;e_j&quot;&gt;$e_j$&lt;/span&gt;가 변환된 결과값 &lt;span data-math=&quot;T(e_j)&quot;&gt;$T(e_j)$&lt;/span&gt;와 같다&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;ex) &lt;span data-index-in-node=&quot;7&quot; data-math=&quot;T: \mathbb{R}^2 \rightarrow \mathbb{R}^3&quot;&gt;$T: \mathbb{R}^2 \rightarrow \mathbb{R}^3$&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span data-index-in-node=&quot;0&quot; data-math=&quot;T(\begin{bmatrix} 1 \\ 0 \end{bmatrix}) = \begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix}&quot;&gt;$T(\begin{bmatrix} 1 \\ 0 \end{bmatrix}) = \begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix}$, $T(\begin{bmatrix} 0 \\ 1 \end{bmatrix}) = \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix}$ 이라면, &lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span&gt;이때 임의의 벡터 $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$에 대한 표준 행렬(Standard Matrix) &lt;/span&gt;&lt;span data-math=&quot;A&quot;&gt;$A$&lt;/span&gt;&lt;span&gt;를 구하는 과정&lt;/span&gt;&lt;/p&gt;
&lt;div data-math=&quot;T(x) = x_1 T(e_1) + x_2 T(e_2) = x_1 \begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix} + x_2 \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 2 &amp;amp; 0 \\ -1 &amp;amp; 1 \\ 1 &amp;amp; 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}&quot;&gt;$T(x) = x_1 T(e_1) + x_2 T(e_2) = x_1 \begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix} + x_2 \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 2 &amp;amp; 0 \\ -1 &amp;amp; 1 \\ 1 &amp;amp; 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$&lt;/div&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; &lt;span data-index-in-node=&quot;6&quot; data-math=&quot;A = \begin{bmatrix} 2 &amp;amp; 0 \\ -1 &amp;amp; 1 \\ 1 &amp;amp; 2 \end{bmatrix}&quot;&gt;$A = \begin{bmatrix} 2 &amp;amp; 0 \\ -1 &amp;amp; 1 \\ 1 &amp;amp; 2 \end{bmatrix}$&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;/p&gt;
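&lt;p data-ke-size=&quot;size16&quot;&gt;표준 행렬을 numpy로 구성해 확인하는 스케치임 ($T(e_j)$ 값은 본문 예시, 입력 벡터는 임의 가정).&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import numpy as np

# T(e_1), T(e_2)를 열로 쌓으면 표준 행렬 A가 됨 (본문 예시 값)
T_e1 = np.array([2, -1, 1])
T_e2 = np.array([0, 1, 2])
A = np.column_stack([T_e1, T_e2])

x = np.array([3, 4])          # 임의의 입력 벡터 (예시 값)
print(A @ x)                  # T(x) = x1*T(e1) + x2*T(e2)와 동일
print(3 * T_e1 + 4 * T_e2)    # [6 1 11] 로 일치
&lt;/code&gt;&lt;/pre&gt;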
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;3. 신경망에서의 응용&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;- 선형 레이어(Linear Layer)&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;인공신경망의 Fully-connected layer는 선형변환&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;시각적으로는 직사각형 모눈종이를 평행사변형 형태의 모눈종이로 비틀어 공간을 변형시키는 역할&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1467&quot; data-origin-height=&quot;611&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bjYPLp/dJMcaaY0to1/ZL1NJdxhZVGvYKKwRu21U1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bjYPLp/dJMcaaY0to1/ZL1NJdxhZVGvYKKwRu21U1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bjYPLp/dJMcaaY0to1/ZL1NJdxhZVGvYKKwRu21U1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbjYPLp%2FdJMcaaY0to1%2FZL1NJdxhZVGvYKKwRu21U1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;634&quot; height=&quot;264&quot; data-origin-width=&quot;1467&quot; data-origin-height=&quot;611&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;- Affine Layer&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;A layer in a neural network that applies a linear transformation to the input and then adds a bias&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;367&quot; data-origin-height=&quot;137&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bDv9rf/dJMcacCx0A3/FPWkjDdfPZRfg0cqqRT5j0/img.jpg&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bDv9rf/dJMcacCx0A3/FPWkjDdfPZRfg0cqqRT5j0/img.jpg&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bDv9rf/dJMcacCx0A3/FPWkjDdfPZRfg0cqqRT5j0/img.jpg&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbDv9rf%2FdJMcacCx0A3%2FFPWkjDdfPZRfg0cqqRT5j0%2Fimg.jpg&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;445&quot; height=&quot;166&quot; data-origin-width=&quot;367&quot; data-origin-height=&quot;137&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;A typical neural-network layer includes a bias ($b$, intercept), as in $y = Ax + b$&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;With a bias, as in $y = 3x + 2$, the map does not pass through the origin, so it is not a linear transformation&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;In deep learning this is called an &lt;b&gt;Affine Layer&lt;/b&gt;, and it is sometimes handled like a linear transformation by extending the input by one dimension&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;i&gt;A linear transformation turns the grid into parallelograms&lt;/i&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;i&gt;An affine transformation is a linear transformation plus a translation&lt;/i&gt;&lt;/p&gt;
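&lt;p data-ke-size=&quot;size16&quot;&gt;A minimal sketch of the dimension-extension trick mentioned above (homogeneous coordinates; the particular $A$, $b$, and $x$ values are made up for illustration):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 3.0]])
b = np.array([1.0, -1.0])
x = np.array([5.0, 7.0])

# Affine map computed directly: y = Ax + b
y_affine = A @ x + b

# The same map as a purely linear one, one dimension up:
# append 1 to x and absorb b into the last column of M
M = np.block([[A, b[:, None]],
              [np.zeros((1, 2)), np.ones((1, 1))]])
x_h = np.append(x, 1.0)    # homogeneous coordinates
y_h = M @ x_h              # linear transformation only

print(y_affine)     # [11. 20.]
print(y_h[:2])      # [11. 20.], identical
&lt;/code&gt;&lt;/pre&gt;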
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;table style=&quot;border-collapse: collapse; width: 100%;&quot; border=&quot;1&quot; data-ke-align=&quot;alignLeft&quot;&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;Category&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;Linear Transformation&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;Affine Transformation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;Basic formula&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;$y = Ax$&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;$y = Ax + b$&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;Passes through the origin&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;Always&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;No (bias present)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;Effect on space&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;Rotation, scaling, shear&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;Linear transformation + translation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;In deep learning&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;Linear Layer&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;Affine Layer (handled via dimension extension)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;4. Onto Functions (Surjective)&lt;/h2&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;859&quot; data-origin-height=&quot;470&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/646qx/dJMcagdSwTp/UKfvNnmUIKvR5FK8KLsEJK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/646qx/dJMcagdSwTp/UKfvNnmUIKvR5FK8KLsEJK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/646qx/dJMcagdSwTp/UKfvNnmUIKvR5FK8KLsEJK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2F646qx%2FdJMcagdSwTp%2FUKfvNnmUIKvR5FK8KLsEJK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;519&quot; height=&quot;284&quot; data-origin-width=&quot;859&quot; data-origin-height=&quot;470&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Definition&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Every element of the codomain is connected to at least one element of the domain&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; the codomain and the range coincide&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Condition&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The column vectors of the matrix must span the entire codomain $\mathbb{R}^m$.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Dimension relationship&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Possible only when the domain dimension is at least the codomain dimension ($n \ge m$): with fewer than $m$ columns, the columns cannot span $\mathbb{R}^m$&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- Applications in deep learning&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;GAN (Generative Adversarial Network)
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Used in the decoding step that restores a low-dimensional latent vector back to the original image size&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Manifold
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The sub-space where real data plausibly lives; whether a generative model fills this whole space connects to the notion of onto&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;i&gt;(I do not yet quite see how a generative model filling this space well connects directly to the onto concept)&lt;/i&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;5. One-to-One Functions (Injective)&lt;/h2&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;859&quot; data-origin-height=&quot;470&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/DJy0K/dJMcahKCRsw/ASwtG3exHLLMwksOhFRN0k/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/DJy0K/dJMcahKCRsw/ASwtG3exHLLMwksOhFRN0k/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/DJy0K/dJMcahKCRsw/ASwtG3exHLLMwksOhFRN0k/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FDJy0K%2FdJMcahKCRsw%2FASwtG3exHLLMwksOhFRN0k%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;518&quot; height=&quot;283&quot; data-origin-width=&quot;859&quot; data-origin-height=&quot;470&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Definition&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Each element of the codomain is connected to an element of the domain without duplication&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Condition&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The column vectors of the matrix must be linearly independent.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Dimension relationship&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;If the domain dimension exceeds the codomain dimension ($n &amp;gt; m$), collisions are unavoidable, so the map cannot be one-to-one.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- 딥러닝 응용&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Deliberate information removal: reducing dimension through a fully-connected layer, e.g. $\mathbb{R}^3 \rightarrow \mathbb{R}^2$, removes unneeded differences and extracts only the meaningful features&lt;/li&gt;
&lt;li&gt;Collisions by design: several inputs landing on the same feature value means the map is not one-to-one; this is a compression step for efficient prediction.&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style6&quot; /&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;ex) $T(x) = \begin{bmatrix} 2 &amp;amp; 0 \\ -1 &amp;amp; 1 \\ 1 &amp;amp; 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
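&lt;p data-ke-size=&quot;size16&quot;&gt;A numerical cross-check of the two questions below (a sketch using NumPy; phrasing things in terms of rank is my framing, not the original post&#39;s):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import numpy as np

A = np.array([[ 2.0, 0.0],
              [-1.0, 1.0],
              [ 1.0, 2.0]])    # T maps R^2 to R^3
m, n = A.shape
rank = np.linalg.matrix_rank(A)

# one-to-one iff the columns are independent, i.e. rank == n
print(rank == n)    # True: one-to-one
# onto iff the columns span R^m, i.e. rank == m
print(rank == m)    # False: 2 columns cannot span R^3
&lt;/code&gt;&lt;/pre&gt;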
&lt;p data-ke-size=&quot;size16&quot;&gt;- Is it one-to-one?&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The column vectors $\begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix}$ are not scalar multiples of each other, so they are &lt;b&gt;linearly independent&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;&amp;rarr; &lt;/b&gt;one-to-one&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Is it onto?&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Two vectors cannot span (fill) all of three-dimensional space ($\mathbb{R}^3$)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; not onto&lt;/p&gt;</description>
      <author>kimyeji2358</author>
      <guid isPermaLink="true">https://kimyeji2358.tistory.com/14</guid>
      <comments>https://kimyeji2358.tistory.com/14#entry14comment</comments>
      <pubDate>Fri, 27 Mar 2026 16:24:30 +0900</pubDate>
    </item>
    <item>
      <title>3-3. CNN</title>
      <link>https://kimyeji2358.tistory.com/13</link>
<description>&lt;h2 data-ke-size=&quot;size26&quot;&gt;1. Model Evolution&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;1) Limitations of the MLP (Multi-Layer Perceptron)&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Early on, images were flattened into a 1-D array and fed to an MLP, which ran into problems&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Problem 1) Parameter explosion&lt;/h4&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;441&quot; data-origin-height=&quot;538&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/pGdUT/dJMcafzgJqz/cDwWSLWmrc9nY4aBnf5VFK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/pGdUT/dJMcafzgJqz/cDwWSLWmrc9nY4aBnf5VFK/img.png&quot; data-alt=&quot;Fully-connected&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/pGdUT/dJMcafzgJqz/cDwWSLWmrc9nY4aBnf5VFK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FpGdUT%2FdJMcafzgJqz%2FcDwWSLWmrc9nY4aBnf5VFK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;278&quot; height=&quot;339&quot; data-origin-width=&quot;441&quot; data-origin-height=&quot;538&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Fully-connected&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;Resolution&amp;uarr;&amp;nbsp; parameter count&amp;uarr;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Flattening a 224&amp;times;224 RGB (3-channel) image into 1-D gives 150,528 dimensions&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Fully connecting this to 1,000 neurons creates roughly 150 million weights&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; Explodes the amount of computation&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; Slows down training&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; Causes severe overfitting&lt;/p&gt;
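&lt;p data-ke-size=&quot;size16&quot;&gt;The arithmetic behind those numbers, as a sketch:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Weights of one fully-connected layer on a flattened image
h, w, c = 224, 224, 3
in_dim = h * w * c          # 150,528 input features
out_dim = 1000              # number of neurons

weights = in_dim * out_dim  # one weight per (input, neuron) pair
print(weights)              # 150,528,000: about 150 million
&lt;/code&gt;&lt;/pre&gt;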
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Problem 2) Loss of spatial information&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The moment a 2-D image is flattened into 1-D, the up/down/left/right adjacency between pixels (locality, positional relationships) is completely destroyed&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;flatten &amp;rarr; neighboring-pixel relationships disappear &amp;rarr; converted to a 1-D vector&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;=&amp;gt; Locality, pattern repetition, and positional relationships, the essence of images, are entirely ignored&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;2) The Emergence of CNNs&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;To fix the MLP&#39;s destruction of spatial information and its parameter explosion,&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;a model applying the receptive-field concept from human visual perception was needed&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; Emerged from the idea that vision proceeds from simple patterns to whole objects through hierarchical abstraction&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Receptive Field&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;695&quot; data-origin-height=&quot;545&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/Tjt3X/dJMcadai6qR/wpSQWy4fe1xrunDn5wbUck/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/Tjt3X/dJMcadai6qR/wpSQWy4fe1xrunDn5wbUck/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/Tjt3X/dJMcadai6qR/wpSQWy4fe1xrunDn5wbUck/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FTjt3X%2FdJMcadai6qR%2FwpSQWy4fe1xrunDn5wbUck%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;339&quot; height=&quot;266&quot; data-origin-width=&quot;695&quot; data-origin-height=&quot;545&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;

&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Designed so one neuron sees only a small local patch such as 3&amp;times;3, not the whole image&lt;/li&gt;
&lt;li&gt;Dramatically reduces the parameter count&lt;/li&gt;
&lt;li&gt;Preserves spatial structure&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;As layers deepen, receptive fields overlap and widen
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Lower layers learn simple patterns such as lines and edges&lt;/li&gt;
&lt;li&gt;Higher layers learn complex, object-level concepts&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Neocognitron (the prototype of the CNN)&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;284&quot; data-origin-height=&quot;178&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cT8T20/dJMcaiJthWo/2lhcfFelf7NfPKnHz5m3gk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cT8T20/dJMcaiJthWo/2lhcfFelf7NfPKnHz5m3gk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cT8T20/dJMcaiJthWo/2lhcfFelf7NfPKnHz5m3gk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcT8T20%2FdJMcaiJthWo%2F2lhcfFelf7NfPKnHz5m3gk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;360&quot; height=&quot;226&quot; data-origin-width=&quot;284&quot; data-origin-height=&quot;178&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;

&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The prototype of the early CNN&lt;/li&gt;
&lt;li&gt;Artificially implements the hierarchy of the visual system&lt;/li&gt;
&lt;li&gt;Composition
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Feature extraction: S-cells (modern convolution)&lt;/li&gt;
&lt;li&gt;C-cells that add robustness to positional shifts (modern pooling)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;At the time it could not perform well for lack of data, compute, and training methods&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;3) Limitations of CNNs&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;New problems emerged as networks were stacked deeper for performance&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Degradation Problem&amp;nbsp;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Adding layers beyond a certain point actually degrades performance&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Gradient Vanishing / Exploding
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;During backpropagation, gradients vanish toward 0 or grow abnormally large, collapsing the learning of deep layers
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Gradient&amp;darr;&amp;nbsp;&amp;rarr; earlier layers stop learning&lt;/li&gt;
&lt;li&gt;Gradient&amp;uarr;&amp;nbsp;&amp;rarr; training becomes unstable&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Growing computation and inefficiency
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;As layers and channels grow, computation gets heavy, making mobile or real-time deployment difficult&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;4)&amp;nbsp; The Emergence of ResNet&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Solved the fatal limitation of earlier CNNs, the inability to train deep networks stably, and became the backbone of modern CNNs.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;○ Residual Learning &amp;amp; Skip Connection&lt;/h4&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;508&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/P9APT/dJMcaipdK8t/WQWkunbxTL7Mu4PiZfH1MK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/P9APT/dJMcaipdK8t/WQWkunbxTL7Mu4PiZfH1MK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/P9APT/dJMcaipdK8t/WQWkunbxTL7Mu4PiZfH1MK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FP9APT%2FdJMcaipdK8t%2FWQWkunbxTL7Mu4PiZfH1MK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;454&quot; height=&quot;180&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;508&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Instead of learning the whole mapping directly, the structure is changed to learn only the residual ($F(x)$), the difference between input and output&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; &lt;b&gt;$H(x) = x + F(x)$ &lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Through a skip connection that carries the input x past the layers and adds it back in,&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;information and gradients reach all the way to the input layer without being lost.&lt;/p&gt;
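&lt;p data-ke-size=&quot;size16&quot;&gt;A minimal residual block sketch in PyTorch (the channel count and layer sizes are illustrative, not the actual ResNet configuration):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # F(x): the residual branch
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        f = self.conv2(self.relu(self.conv1(x)))
        return self.relu(x + f)    # H(x) = x + F(x): skip connection

x = torch.randn(1, 64, 32, 32)
print(ResidualBlock(64)(x).shape)    # torch.Size([1, 64, 32, 32])
&lt;/code&gt;&lt;/pre&gt;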
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;+ Bottleneck structure&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;643&quot; data-origin-height=&quot;243&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/dOvepF/dJMcaiCI4Bg/kRFIvRWqFHRRBhLJs3l7k1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/dOvepF/dJMcaiCI4Bg/kRFIvRWqFHRRBhLJs3l7k1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/dOvepF/dJMcaiCI4Bg/kRFIvRWqFHRRBhLJs3l7k1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FdOvepF%2FdJMcaiCI4Bg%2FkRFIvRWqFHRRBhLJs3l7k1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;442&quot; height=&quot;167&quot; data-origin-width=&quot;643&quot; data-origin-height=&quot;243&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;To reduce the computation of deep layers, a bottleneck structure composed of&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;1&amp;times;1 Conv (channel reduction) &amp;rarr; 3&amp;times;3 Conv (feature extraction) &amp;rarr; 1&amp;times;1 Conv (channel restoration)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;was introduced, as sketched below&lt;/p&gt;
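&lt;pre&gt;&lt;code&gt;import torch.nn as nn

# 256 channels squeezed to 64 for the expensive 3x3, then restored
# (the 256/64 channel counts follow the usual ResNet pattern but are
# illustrative here)
bottleneck = nn.Sequential(
    nn.Conv2d(256, 64, kernel_size=1),            # channel reduction
    nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, padding=1),  # feature extraction
    nn.ReLU(),
    nn.Conv2d(64, 256, kernel_size=1),            # channel restoration
)
# The 3x3 conv now needs 64*64*9 weights instead of 256*256*9
&lt;/code&gt;&lt;/pre&gt;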
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;5) The Emergence of EfficientNet&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Went beyond simply stacking layers deeper and offered a structural answer to&lt;i&gt; &quot;when making a model bigger, what should be scaled, and how, to be efficient?&quot;&lt;/i&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; A model that expands depth, width, and resolution in balance to maximize efficiency&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;850&quot; data-origin-height=&quot;552&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/nL2jI/dJMcahw5poP/RkpTRKp17xPtJWZ5RTXDHK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/nL2jI/dJMcahw5poP/RkpTRKp17xPtJWZ5RTXDHK/img.png&quot; data-alt=&quot;EfficientNet 구조&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/nL2jI/dJMcahw5poP/RkpTRKp17xPtJWZ5RTXDHK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FnL2jI%2FdJMcahw5poP%2FRkpTRKp17xPtJWZ5RTXDHK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;492&quot; height=&quot;320&quot; data-origin-width=&quot;850&quot; data-origin-height=&quot;552&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;EfficientNet 구조&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Convolution&amp;nbsp;&amp;rarr;&amp;nbsp;repeated MBConv blocks&amp;nbsp;&amp;rarr; feature map generation&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;&amp;nbsp;&lt;/h4&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;○ Compound Scaling&lt;/h4&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;800&quot; data-origin-height=&quot;378&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/nHxd0/dJMcabwP5yS/FwLaOgOm3KH1zYqTQjmRPk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/nHxd0/dJMcabwP5yS/FwLaOgOm3KH1zYqTQjmRPk/img.png&quot; data-alt=&quot;baseline; wider; deeper; higher resolution; all three scaled together&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/nHxd0/dJMcabwP5yS/FwLaOgOm3KH1zYqTQjmRPk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FnHxd0%2FdJMcabwP5yS%2FFwLaOgOm3KH1zYqTQjmRPk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;550&quot; height=&quot;260&quot; data-origin-width=&quot;800&quot; data-origin-height=&quot;378&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;baseline; wider; deeper; higher resolution; all three scaled together&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The three factors that determine model performance:&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Width (channel count): larger captures finer detail&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Depth (layer count): enables learning more complex patterns&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Resolution (image size): preserves detailed information&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;are expanded together in a mathematically derived fixed ratio, rather than one at a time&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; Achieves SOTA (state-of-the-art) performance with fewer parameters&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;○ MBConv 블록&lt;/h4&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;319&quot; data-origin-height=&quot;710&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/QWaVX/dJMcagx8Qud/ky1KLgBfetumsMxXAhlrh1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/QWaVX/dJMcagx8Qud/ky1KLgBfetumsMxXAhlrh1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/QWaVX/dJMcagx8Qud/ky1KLgBfetumsMxXAhlrh1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FQWaVX%2FdJMcagx8Qud%2Fky1KLgBfetumsMxXAhlrh1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;185&quot; height=&quot;412&quot; data-origin-width=&quot;319&quot; data-origin-height=&quot;710&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Inherits ResNet&#39;s skip connection, but builds&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;1) Expansion&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;2) Depthwise Conv: convolves each channel (e.g. R, G, B) separately, cutting computation sharply&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;3) SE Block: an attention mechanism that picks out which channels matter&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;4) 1&amp;times;1 Projection&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;5) Skip Connection&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;into a detailed pipeline that maximizes efficiency&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;2. CNN Architecture&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;A CNN is a mathematical pipeline that hierarchically extracts local patterns while keeping the shape of multi-dimensional data intact&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1920&quot; data-origin-height=&quot;1080&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bDo08n/dJMcafzgUy1/oxVuKzXdaTsue6eVvJ0cL0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bDo08n/dJMcafzgUy1/oxVuKzXdaTsue6eVvJ0cL0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bDo08n/dJMcafzgUy1/oxVuKzXdaTsue6eVvJ0cL0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbDo08n%2FdJMcafzgUy1%2FoxVuKzXdaTsue6eVvJ0cL0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;671&quot; height=&quot;377&quot; data-origin-width=&quot;1920&quot; data-origin-height=&quot;1080&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;1) Feature Extraction Stage&lt;/h3&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;1. Apply a convolution operation to the input image&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The configured kernel slides across the whole image, multiplying pixel values by weights and summing them (MAC, multiply-accumulate), producing a 2-D feature map&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;660&quot; data-origin-height=&quot;260&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/AWh59/dJMb996SxcG/UWT7trbp3IfJP906FhZysK/img.gif&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/AWh59/dJMb996SxcG/UWT7trbp3IfJP906FhZysK/img.gif&quot; data-alt=&quot;the convolution operation&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/AWh59/dJMb996SxcG/UWT7trbp3IfJP906FhZysK/img.gif&quot; srcset=&quot;https://blog.kakaocdn.net/dn/AWh59/dJMb996SxcG/UWT7trbp3IfJP906FhZysK/img.gif&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;378&quot; height=&quot;260&quot; data-origin-width=&quot;660&quot; data-origin-height=&quot;260&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;the convolution operation&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
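&lt;p data-ke-size=&quot;size16&quot;&gt;That sliding MAC loop written out as a sketch (pure NumPy, stride 1, no padding; the 4&amp;times;4 input and 2&amp;times;2 kernel are toy values):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import numpy as np

img = np.arange(16.0).reshape(4, 4)    # toy 4x4 image
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])       # toy 2x2 kernel

out_h = img.shape[0] - kernel.shape[0] + 1
out_w = img.shape[1] - kernel.shape[1] + 1
fmap = np.zeros((out_h, out_w))
for i in range(out_h):
    for j in range(out_w):
        window = img[i:i+2, j:j+2]             # region under the kernel
        fmap[i, j] = np.sum(window * kernel)   # multiply-accumulate
print(fmap)    # 3x3 feature map (every entry is -5.0 for this toy input)
&lt;/code&gt;&lt;/pre&gt;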
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;2. Apply a pooling operation&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Within the generated feature map, keep only the maximum or average value, reducing the resolution&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; Adds robustness to positional shifts and reduces parameters&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;413&quot; data-origin-height=&quot;122&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/kFYVp/dJMcaaSgX2f/kZtPqFq2tII2QQqF0Urcz0/img.jpg&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/kFYVp/dJMcaaSgX2f/kZtPqFq2tII2QQqF0Urcz0/img.jpg&quot; data-alt=&quot;max pooling vs average pooling&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/kFYVp/dJMcaaSgX2f/kZtPqFq2tII2QQqF0Urcz0/img.jpg&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FkFYVp%2FdJMcaaSgX2f%2FkZtPqFq2tII2QQqF0Urcz0%2Fimg.jpg&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;460&quot; height=&quot;136&quot; data-origin-width=&quot;413&quot; data-origin-height=&quot;122&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;max pooling vs average pooling&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
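&lt;p data-ke-size=&quot;size16&quot;&gt;2&amp;times;2 max pooling on a small feature map, as a sketch:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import numpy as np

fmap = np.array([[1., 3., 2., 0.],
                 [4., 6., 1., 2.],
                 [0., 2., 5., 1.],
                 [1., 0., 3., 4.]])

# 2x2 max pooling, stride 2: keep the maximum of each block
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)    # [[6. 2.]
                 #  [2. 5.]]
&lt;/code&gt;&lt;/pre&gt;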
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;3. Repeat operations 1 and 2&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Lower layers: capture simple pixel-level patterns such as lines and corners&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Upper layers: receptive fields overlap and widen, learning abstract, high-level features such as the overall structure of an object&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;2) Classification Stage&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;1. Flatten the final feature map, which carries the spatial-structure information, into a 1-D vector&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;2. Pass it through fully-connected (FC) layers&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;3. Apply the softmax activation function&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; Produces the final probability that the image belongs to each class&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;640&quot; data-origin-height=&quot;390&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/be45nb/dJMcagEXrrH/iQPoulKIrBcBXOU0dKKrzk/img.webp&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/be45nb/dJMcagEXrrH/iQPoulKIrBcBXOU0dKKrzk/img.webp&quot; data-alt=&quot;softmax: final probabilities (0 to 1)&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/be45nb/dJMcagEXrrH/iQPoulKIrBcBXOU0dKKrzk/img.webp&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fbe45nb%2FdJMcagEXrrH%2FiQPoulKIrBcBXOU0dKKrzk%2Fimg.webp&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;345&quot; height=&quot;210&quot; data-origin-width=&quot;640&quot; data-origin-height=&quot;390&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;softmax: final probabilities (0 to 1)&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
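&lt;p data-ke-size=&quot;size16&quot;&gt;The classification stage as a sketch (PyTorch; the 64&amp;times;8&amp;times;8 feature-map size and 10 classes are illustrative):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import torch
import torch.nn as nn

fmap = torch.randn(1, 64, 8, 8)     # final feature map (N, C, H, W)

head = nn.Sequential(
    nn.Flatten(),                   # 1. flatten to a 4096-dim vector
    nn.Linear(64 * 8 * 8, 10),      # 2. fully-connected layer
    nn.Softmax(dim=1),              # 3. softmax over the 10 classes
)
probs = head(fmap)
print(probs.sum())                  # sums to 1: a probability vector
&lt;/code&gt;&lt;/pre&gt;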
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;3) Translation Equivariance &amp;amp; Spatial Inductive Bias&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The same filter weights are shared across the entire image&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; Even when a feature pattern changes position (Translation Equivariance), the same feature is still extracted&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;i&gt;=&amp;gt; &quot;Nearby pixels carry the most meaning for each other&quot;&lt;/i&gt;: this strong spatial prior (Spatial Inductive Bias) is built directly into the model structure&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;+ Translation Equivariance vs Translation Invariance&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1400&quot; data-origin-height=&quot;800&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/Y8n9Q/dJMcai3K7Ct/tDUjrqEkKxneddWK8xKy00/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/Y8n9Q/dJMcai3K7Ct/tDUjrqEkKxneddWK8xKy00/img.png&quot; data-alt=&quot;Equivariance vs Invariance :&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/Y8n9Q/dJMcai3K7Ct/tDUjrqEkKxneddWK8xKy00/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FY8n9Q%2FdJMcai3K7Ct%2FtDUjrqEkKxneddWK8xKy00%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;597&quot; height=&quot;800&quot; data-origin-width=&quot;1400&quot; data-origin-height=&quot;800&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;Equivariance vs Invariance :&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;Translation Equivariance&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- When the input is transformed, the output is transformed in exactly the same way&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Convolving an image of a cat produces an outline (feature map) at the cat&#39;s position&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;If the cat is shifted (translated) in the input image, the activations in the output feature map shift by the same amount&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Tracks the spatial position of a feature as it moves&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;Translation Invariance&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Even when the input is transformed, the final output stays exactly the same&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Wherever the cat sits in the image, the final classification out of the CNN is still the same cat&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Produces results unaffected by changes in the object&#39;s position&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;3. Convolution Hyperparameters&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The key hyperparameters that determine the structure of a convolution layer&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ Kernel Size&lt;/h3&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;501&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/099BK/dJMcaduBQY2/YB3rYxCQB0USbxhUKdh36k/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/099BK/dJMcaduBQY2/YB3rYxCQB0USbxhUKdh36k/img.png&quot; data-alt=&quot;3&amp;amp;times;3 kernel&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/099BK/dJMcaduBQY2/YB3rYxCQB0USbxhUKdh36k/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2F099BK%2FdJMcaduBQY2%2FYB3rYxCQB0USbxhUKdh36k%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;551&quot; height=&quot;216&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;501&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;3&amp;times;3 kernel&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Meaning&amp;nbsp;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The 2-D spatial size of the weight filter that slides over the image, e.g. 3&amp;times;3, 5&amp;times;5&lt;/li&gt;
&lt;li&gt;Determines the size of the region (receptive field) the layer takes in at once&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Effect of the value
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;A larger kernel sees a wide area at once, but its weight count grows with the square of the kernel size&lt;/li&gt;
&lt;li&gt;A smaller kernel extracts very fine spatial features&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Pros and cons / recommendation
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Applying a small kernel (3&amp;times;3) twice in a row instead of one large kernel (5&amp;times;5), as in the sketch after this list
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Keeps a similar receptive field&lt;/li&gt;
&lt;li&gt;Uses fewer parameters (5&amp;times;5 = 25 weights, (3&amp;times;3)&amp;times;2 = 18)&lt;/li&gt;
&lt;li&gt;Adds an extra nonlinearity (ReLU) in between&lt;/li&gt;
&lt;li&gt;&amp;rarr; Better for the model&#39;s expressiveness and efficiency&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
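&lt;p data-ke-size=&quot;size16&quot;&gt;The parameter comparison from the list above, checked in PyTorch (single-channel convolutions with bias disabled, so only kernel weights are counted):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import torch.nn as nn

def n_weights(module):
    return sum(p.numel() for p in module.parameters())

big = nn.Conv2d(1, 1, kernel_size=5, bias=False)
small_twice = nn.Sequential(
    nn.Conv2d(1, 1, kernel_size=3, bias=False),
    nn.ReLU(),    # the extra nonlinearity comes for free
    nn.Conv2d(1, 1, kernel_size=3, bias=False),
)
print(n_weights(big))            # 25
print(n_weights(small_twice))    # 18, with a similar (5x5) receptive field
&lt;/code&gt;&lt;/pre&gt;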
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ Stride&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Meaning&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The number of pixels the kernel skips at each step as it moves over the input&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Effect of the value&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;A larger value shrinks the spatial dimensions of the output feature map&amp;nbsp;&amp;rarr; a downsampling effect&lt;/li&gt;
&lt;li&gt;A smaller value densely preserves the original size information&lt;/li&gt;
&lt;li&gt;Pros
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Raising the value greatly reduces the computation of subsequent layers&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Cons
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Fine spatial detail can be permanently lost&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Usually kept at 1 early on, and raised to 2 or more only where the resolution must be reduced (see the output-size sketch after the Padding section)&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ Padding&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Meaning
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Decides how the edges of the image are handled&amp;nbsp;&lt;/li&gt;
&lt;li&gt;Zero-Padding&amp;nbsp;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Pads the border of the input image with values such as 0 so the convolution does not shrink the output size &lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;564&quot; data-origin-height=&quot;658&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/KhHgn/dJMcadg8n1l/2wpmcUdm47WKXbtyxxDb3k/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/KhHgn/dJMcadg8n1l/2wpmcUdm47WKXbtyxxDb3k/img.png&quot; data-alt=&quot;zero-padding&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/KhHgn/dJMcadg8n1l/2wpmcUdm47WKXbtyxxDb3k/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FKhHgn%2FdJMcadg8n1l%2F2wpmcUdm47WKXbtyxxDb3k%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;229&quot; height=&quot;267&quot; data-origin-width=&quot;564&quot; data-origin-height=&quot;658&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;zero-padding&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&amp;nbsp;Effect of the value
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Without padding (Valid Padding), the output keeps shrinking at every layer&lt;/li&gt;
&lt;li&gt;Padding so the input and output dimensions match&amp;nbsp;&amp;rarr; Same Padding&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Pros
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;With padding, spatial resolution is maintained even in deep layers&lt;/li&gt;
&lt;li&gt;Border pixels, which the kernel otherwise visits less often, can contribute fully to learning&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
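&lt;p data-ke-size=&quot;size16&quot;&gt;How kernel size, stride, and padding combine into the output size, as a worked sketch of the standard formula $\text{out} = \lfloor (n + 2p - k)/s \rfloor + 1$:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def conv_out(n, k, s=1, p=0):
    # output size of a convolution along one dimension
    return (n + 2 * p - k) // s + 1

print(conv_out(32, k=3))             # 30: valid padding shrinks the map
print(conv_out(32, k=3, p=1))        # 32: same padding keeps the size
print(conv_out(32, k=3, s=2, p=1))   # 16: stride 2 downsamples
&lt;/code&gt;&lt;/pre&gt;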
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ Filter / Channels&amp;nbsp;&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Meaning
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;How many kernels, each with its own weight array, a single convolution layer uses&lt;/li&gt;
&lt;li&gt;This count becomes the total depth (channel count) of the resulting feature map&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Effect of the value
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;As the filter count grows
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;the network extracts more diverse, specific features (diagonals, horizontal lines, particular colors) in parallel&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Pros
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Extracting more features raises the model&#39;s representational capacity, helping with complex problems&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Cons
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Extracting more features makes the weight count and compute memory explode, creating bottlenecks&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;</description>
      <author>kimyeji2358</author>
      <guid isPermaLink="true">https://kimyeji2358.tistory.com/13</guid>
      <comments>https://kimyeji2358.tistory.com/13#entry13comment</comments>
      <pubDate>Fri, 27 Mar 2026 15:16:37 +0900</pubDate>
    </item>
    <item>
      <title>3-2. Principles of Deep Learning Training</title>
      <link>https://kimyeji2358.tistory.com/12</link>
<description>&lt;h2 data-ke-size=&quot;size26&quot;&gt;1. The Core of Neural-Network Training&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The network does not know the answer from the start;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;it looks at the difference between its predicted output and the actual answer&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;and adjusts its internal values (weights) little by little in the direction that reduces it&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Key information: the quantity that says in which direction, and by how much, to adjust the parameters &amp;rarr; the &lt;b&gt;gradient&lt;/b&gt;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;2. Mathematical Foundations (the steering wheel of learning)&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;1) Derivative&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;How much the function value changes when a variable is changed slightly (the rate of change)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;ex) if $y=x^2$, the derivative is $dy/dx=2x$&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Role&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Decides the direction: whether the value should be increased or decreased&lt;/li&gt;
&lt;li&gt;Measures sensitivity to change, the key input to gradient descent&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;2) Partial Derivative&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Since a neural network has many weights,&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;hold the other variables fixed and vary just one to get its rate of change,&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;measuring each weight&#39;s individual influence&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;ex) for $f(x,y) = x^2 + y^2$,&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;the partial derivative with respect to $x$&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; $\partial f/\partial x = 2x$&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;the partial derivative with respect to $y$&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; $\partial f/\partial y = 2y$&lt;/p&gt;
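&lt;p data-ke-size=&quot;size16&quot;&gt;A numerical sanity check of those two partial derivatives (a sketch using finite differences; the step h and the point (3, 4) are arbitrary choices):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;def f(x, y):
    return x**2 + y**2

h = 1e-6
x, y = 3.0, 4.0
# vary one variable while holding the other fixed
df_dx = (f(x + h, y) - f(x, y)) / h    # about 2x = 6
df_dy = (f(x, y + h) - f(x, y)) / h    # about 2y = 8
print(df_dx, df_dy)
&lt;/code&gt;&lt;/pre&gt;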
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;3) Gradient ($\nabla$)&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;A vector collecting the partial derivatives with respect to each parameter (weight)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Meaning&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Points in the direction in which the loss increases fastest, with the rate of increase&lt;/li&gt;
&lt;li&gt;Learning moves in the opposite direction ($-\nabla L$)&lt;/li&gt;
&lt;/ul&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;3. Loss Function&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;- Role of the loss function&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Expresses the gap between prediction and answer as a single number,&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;pointing the model at what it must reduce&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;- Choosing a loss by problem type&lt;/h3&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;1) Mean Squared Error (MSE)&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Used mainly for &lt;b&gt;regression&lt;/b&gt; (continuous-value prediction) problems&lt;/p&gt;
&lt;div data-math=&quot;MSE = \frac{1}{n}\sum(y-\hat{y})^2&quot;&gt;$$MSE = \frac{1}{n}\sum(y-\hat{y})^2$$&lt;/div&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;y : actual value&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;$\hat{y}$: predicted value&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;n: number of data points&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Why square the errors&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;Squaring keeps positive and negative errors from canceling out&lt;/li&gt;
&lt;li&gt;Larger errors receive a larger penalty&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;259&quot; data-origin-height=&quot;194&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/dzTjqE/dJMcagdROmC/FyUUPuh0NtbzJQtELSX5Xk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/dzTjqE/dJMcagdROmC/FyUUPuh0NtbzJQtELSX5Xk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/dzTjqE/dJMcagdROmC/FyUUPuh0NtbzJQtELSX5Xk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FdzTjqE%2FdJMcagdROmC%2FFyUUPuh0NtbzJQtELSX5Xk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;379&quot; height=&quot;284&quot; data-origin-width=&quot;259&quot; data-origin-height=&quot;194&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;The closer MSE is to 0, the closer the estimates are to the originals&amp;nbsp;&amp;rarr; higher accuracy&lt;/li&gt;
&lt;li&gt;Equal to the average area of the squares built on the differences between predicted and actual values&lt;/li&gt;
&lt;/ul&gt;
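&lt;p data-ke-size=&quot;size16&quot;&gt;MSE computed directly from the formula (a sketch; the sample values are made up):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import numpy as np

y = np.array([3.0, -0.5, 2.0])       # actual values
y_hat = np.array([2.5, 0.0, 2.0])    # predicted values

mse = np.mean((y - y_hat) ** 2)
print(mse)    # 0.1666...
&lt;/code&gt;&lt;/pre&gt;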
&lt;h4 data-ke-size=&quot;size20&quot;&gt;2) 교차엔트로피 (Cross-Entropy)&lt;/h4&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;320&quot; data-origin-height=&quot;333&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/EOuSh/dJMcadaiGbn/xWBFkThg83jsUFBR8I4Ys1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/EOuSh/dJMcadaiGbn/xWBFkThg83jsUFBR8I4Ys1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/EOuSh/dJMcadaiGbn/xWBFkThg83jsUFBR8I4Ys1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FEOuSh%2FdJMcadaiGbn%2FxWBFkThg83jsUFBR8I4Ys1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;316&quot; height=&quot;329&quot; data-origin-width=&quot;320&quot; data-origin-height=&quot;333&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Used mainly in &lt;b&gt;classification&lt;/b&gt; (probability prediction) problems to evaluate confidence in the correct answer&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Defined so that the loss shrinks as the probability assigned to the correct class grows&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span data-index-in-node=&quot;4&quot; data-math=&quot;L = -\sum y \cdot \log(\hat{y})&quot;&gt;$$L = -\sum y \cdot \log(\hat{y})$$&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Characteristics&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Takes the log of the probability predicted for the correct class&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;If the predicted probability is near 1, the loss converges to 0&lt;/li&gt;
&lt;li&gt;If the predicted probability is near 0, the loss grows large&lt;/li&gt;
&lt;/ul&gt;
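&lt;p data-ke-size=&quot;size16&quot;&gt;Cross-entropy from the formula above, as a sketch (a one-hot label; the predicted probabilities are made up):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import numpy as np

y = np.array([0.0, 1.0, 0.0])        # one-hot: class 1 is correct
y_hat = np.array([0.1, 0.7, 0.2])    # predicted probabilities

loss = -np.sum(y * np.log(y_hat))
print(loss)    # 0.3567 = -log(0.7); shrinks toward 0 as y_hat[1] nears 1
&lt;/code&gt;&lt;/pre&gt;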
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style6&quot; /&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Regression over continuous values &amp;rarr; MSE&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Classification over probabilities &amp;rarr; cross-entropy&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Once the loss function has quantified what the model must reduce,&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;we need how to reduce it (the execution method):&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;the algorithm that walks down to low ground on the loss landscape &amp;rarr; gradient descent&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;4. Gradient Descent and the Learning Rate (how to reduce it)&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ Gradient Descent&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;An optimization method that reduces the loss&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;by moving the model&#39;s parameters little by little in the direction opposite to the current gradient until it finds a minimum&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Analogy&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Viewing the loss as terrain, it is like walking down the slope toward the lowest point (the minimum)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- Formula&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;current weights - (gradient &amp;times; learning rate)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span data-index-in-node=&quot;4&quot; data-math=&quot;\theta = \theta - \eta \nabla J(\theta)&quot;&gt;$$\theta = \theta - \eta \nabla J(\theta)$$&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;$\theta$ : 파라미터(가중치)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;$\eta$ : 학습률&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;$\nabla J(\theta)$ : 손실함수의 기울기&lt;/p&gt;
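&lt;p data-ke-size=&quot;size16&quot;&gt;아래는 $J(\theta)=\theta^2$ 라는 가상의 단순한 손실에 위 갱신 규칙을 반복 적용해 보는 스케치:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;# 가상의 단순 손실 J(theta) = theta^2 : 최소점은 theta = 0
theta = 5.0            # 초기 파라미터
eta = 0.1              # 학습률
for step in range(50):
    grad = 2 * theta               # 기울기 dJ/dtheta
    theta = theta - eta * grad     # 기울기의 반대 방향으로 이동
print(theta)           # 0에 매우 가까운 값
&lt;/code&gt;&lt;/pre&gt;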
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ 학습률 (Learning Rate, $\eta$)&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;한 번 업데이트할 때 이동하는 보폭의 크기&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;489&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/1P6F7/dJMcahw4R0f/qkHsK8xw4IUJKKqWbGJaZ1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/1P6F7/dJMcahw4R0f/qkHsK8xw4IUJKKqWbGJaZ1/img.png&quot; data-alt=&quot;왼: 학습률이 클 때 / 오: 학습률이 작을 때&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/1P6F7/dJMcahw4R0f/qkHsK8xw4IUJKKqWbGJaZ1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2F1P6F7%2FdJMcahw4R0f%2FqkHsK8xw4IUJKKqWbGJaZ1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;572&quot; height=&quot;219&quot; data-origin-width=&quot;1280&quot; data-origin-height=&quot;489&quot;/&gt;&lt;/span&gt;&lt;figcaption&gt;왼: 학습률이 클 때 / 오: 학습률이 작을 때&lt;/figcaption&gt;
&lt;/figure&gt;
&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;학습률이 너무 클 때
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;최적점을 지나침&lt;/li&gt;
&lt;li&gt;불안정한 수렴: 지그재그로 이동&lt;/li&gt;
&lt;li&gt;손실이 폭증하며 발산(불안정)함&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;학습률이 작을 때
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;수렴 속도가 매우 느려짐&lt;/li&gt;
&lt;li&gt;기울기가 작은 평평한 영역에서 진전 없이 정체될 수 있음&lt;/li&gt;
&lt;li&gt;비효율: 많은 에폭 필요, 자원 소모 증가&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style6&quot; /&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;경사하강법으로 가중치를 업데이트하려면&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;신경망 내의 모든 가중치가 오차에 미친 기울기를 각각 알아야 함&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;신경망이 깊어질수록 이를 앞에서부터 일일이 계산하는 것은 불가능&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; 출력층의 오차를 거꾸로 전달하며 한 번에 기울기를 구하는&lt;b&gt; 역전파&lt;/b&gt; 필요&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;5. 역전파 (Backpropagation)&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;출력층에서 발생한 오차를 앞쪽 은닉층으로 거꾸로 전달하며&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;신경망 전체의 모든 가중치에 대한 기울기를 효율적으로 계산&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;+ 연쇄법칙 이용&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- 입력 - 기울기 계산까지의 전체 과정&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;입력 &amp;rarr; 순전파 &amp;rarr; 손실 계산 &amp;rarr; 역전파&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;- 역전파 단계&lt;/h3&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;1) 순전파(Forward)&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;입력 &amp;rarr; 은닉 &amp;rarr; 출력 으로 진행&lt;/li&gt;
&lt;li&gt;예측값 $\hat{y}$ 계산&lt;/li&gt;
&lt;li&gt;각 층의 중간 출력값과 z값(가중합)을 메모리에 저장해 역전파에 활용함&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;2) 손실 계산&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;정답 y와 예측값 $\hat{y}$를 비교하여 손실 L을 계산&lt;/li&gt;
&lt;li&gt;&lt;span&gt;수식&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span data-index-in-node=&quot;4&quot; data-math=&quot;L=L(y,\hat{y})&quot;&gt;$L=L(y,\hat{y})$&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;학습 목표 : 손실 L 최소화&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;3) 출력층 오차&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;(역전파 출발점) 출력층의 오차 &lt;span&gt;(&lt;/span&gt;&lt;span data-index-in-node=&quot;25&quot; data-math=&quot;\delta_{out}&quot;&gt;$\delta_{out}$&lt;/span&gt;&lt;span&gt;)를 계산&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt; &lt;span&gt;수식 (예시)&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span data-index-in-node=&quot;9&quot; data-math=&quot;\delta_{out}=\partial L/\partial z_{out}&quot;&gt;$\delta_{out}=\partial L/\partial z_{out}$&lt;/span&gt; &lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;4) 오차 전달&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;연쇄법칙 기반&lt;/li&gt;
&lt;li&gt;앞서 구한 출력층의 오차를 이전 층인 은닉층 방향으로 거꾸로 전달&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;5) 편미분 계산&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;거꾸로 전달받은 오차와 1단계에서 저장해둔 입력값(중간값)을 곱함&lt;/li&gt;
&lt;li&gt;각 파라미터가 전체 손실에 미친 영향(편미분)을 구함&lt;/li&gt;
&lt;li&gt;계산 대상 (수식): $\partial L/\partial w$, $\partial L/\partial b$&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;6) 경사하강법 갱신&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;5단계에서 구한 기울기의 반대 방향으로 파라미터(가중치, 편향)을 업데이트&lt;/li&gt;
&lt;li&gt;한 번에 이동할 보폭(학습률)을 곱함&lt;/li&gt;
&lt;li&gt;가중치 갱신: &lt;span data-index-in-node=&quot;8&quot; data-math=&quot;w := w - \eta \cdot \partial L/\partial w&quot;&gt;$w := w - \eta \cdot \partial L/\partial w$&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;편향 갱신: &lt;span data-index-in-node=&quot;7&quot; data-math=&quot;b := b - \eta \cdot \partial L/\partial b&quot;&gt;$b := b - \eta \cdot \partial L/\partial b$&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
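&lt;p data-ke-size=&quot;size16&quot;&gt;아래는 위 1)~6) 단계를 은닉층 1개짜리 작은 신경망에 그대로 적용해 본 NumPy 스케치 (네트워크 크기, 초기값, 손실 함수는 설명용으로 가정한 것):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 1))                  # 입력 (특성 4개, 샘플 1개)
y = np.array([[1.0]])                        # 정답
W1, b1 = 0.1 * rng.normal(size=(3, 4)), np.zeros((3, 1))
W2, b2 = 0.1 * rng.normal(size=(1, 3)), np.zeros((1, 1))
eta = 0.1                                    # 학습률

# 1) 순전파: 역전파에 쓰도록 z1, a1 등 중간값을 저장
z1 = W1 @ x + b1
a1 = np.maximum(0, z1)                       # ReLU
z2 = W2 @ a1 + b2
y_hat = 1 / (1 + np.exp(-z2))                # Sigmoid

# 2) 손실: 이진 교차엔트로피 L = L(y, y_hat)
L = -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

# 3) 출력층 오차: Sigmoid + 교차엔트로피 조합에서는 y_hat - y 로 단순화됨
delta2 = y_hat - y                           # dL/dz2

# 4) 오차 전달: 연쇄법칙으로 은닉층까지 거꾸로 전파 (ReLU 미분은 0 또는 1)
delta1 = (W2.T @ delta2) * (z1 &gt; 0)

# 5) 편미분: 전달받은 오차에 저장해 둔 입력값을 곱함
dW2, db2 = delta2 @ a1.T, delta2
dW1, db1 = delta1 @ x.T, delta1

# 6) 경사하강법 갱신: 기울기의 반대 방향으로 이동
W2 -= eta * dW2; b2 -= eta * db2
W1 -= eta * dW1; b1 -= eta * db1
&lt;/code&gt;&lt;/pre&gt;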
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;6. 학습 방법 (데이터 업데이트 단위)&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ Batch GD(경사하강법)&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;에폭당 1회&lt;/li&gt;
&lt;li&gt;전체 데이터를 모두 계산한 뒤 파라미터 업데이트&lt;/li&gt;
&lt;li&gt;이동 방향은 안정적&lt;/li&gt;
&lt;li&gt;계산이 무겁고 속도가 느림&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ SGD(확률적 경사하강법)&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;샘플 1개마다 파라미터 갱신&lt;/li&gt;
&lt;li&gt;빠른 갱신으로 초기 하강이 빠름&lt;/li&gt;
&lt;li&gt;샘플 1개에만 의존하므로 전체 데이터의 방향과 맞지 않아 노이즈가 크고 심하게 진동&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ Mini-batch (미니배치)&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;전체 데이터를 32, 64 등의 작은 묶음으로 나누어 업데이트하는 방식&lt;/li&gt;
&lt;li&gt;속도와 안정성의 균형을 잡을 수 있음&lt;/li&gt;
&lt;li&gt;GPU의 병렬 처리 구조를 활용해 연산 효율 극대화&lt;/li&gt;
&lt;li&gt;학습 루프: 배치분할&amp;nbsp;&amp;rarr; 순전파 &amp;rarr; 손실 계산 &amp;rarr; 역전파 &amp;rarr; 갱신&amp;nbsp;&amp;rarr; 다음 배치 (아래 스케치 참고)&lt;/li&gt;
&lt;/ul&gt;
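&lt;p data-ke-size=&quot;size16&quot;&gt;아래는 이 학습 루프를 의사코드 수준으로 옮긴 스케치 (grad_fn, update_fn은 모델에 맞게 구현되어 있다고 가정한 가상의 함수):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;import numpy as np

def train(X, y, params, grad_fn, update_fn, batch_size=32, epochs=10):
    &quot;&quot;&quot;배치분할, 순전파/손실, 역전파, 갱신 순서를 반복하는 루프.&quot;&quot;&quot;
    n = len(X)
    for epoch in range(epochs):
        idx = np.random.permutation(n)             # 에폭마다 순서를 섞음
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]  # 작은 묶음 하나
            grads = grad_fn(params, X[batch], y[batch])   # 순전파 + 역전파
            params = update_fn(params, grads)             # 파라미터 갱신
    return params
&lt;/code&gt;&lt;/pre&gt;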
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;7. 경사하강법의 한계&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ 극소점 (Local Minima)&lt;/h3&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;250&quot; data-origin-height=&quot;200&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bLLRjW/dJMcaaEJfDh/OXC3kfsKJKe77RzY5DzFG1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bLLRjW/dJMcaaEJfDh/OXC3kfsKJKe77RzY5DzFG1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bLLRjW/dJMcaaEJfDh/OXC3kfsKJKe77RzY5DzFG1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbLLRjW%2FdJMcaaEJfDh%2FOXC3kfsKJKe77RzY5DzFG1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;359&quot; height=&quot;287&quot; data-origin-width=&quot;250&quot; data-origin-height=&quot;200&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;주변 점들과 비교했을 때는 손실이 가장 작지만, 전체 지형에서 가장 낮은 전역 최소값은 아님&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;▷ 지역 골짜기에 갇혀 학습이 멈추는 현상&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;극소점 : 주변(국소) 범위에서만 최솟값&lt;/li&gt;
&lt;li&gt;전역 최솟값: 전체 영역 중 최솟값&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ 평평한 영역 (Flat Region)&lt;/h3&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1056&quot; data-origin-height=&quot;737&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/RnoYz/dJMcaibGTbN/woQqcH3nMMzysz6iJS7cDK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/RnoYz/dJMcaibGTbN/woQqcH3nMMzysz6iJS7cDK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/RnoYz/dJMcaibGTbN/woQqcH3nMMzysz6iJS7cDK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FRnoYz%2FdJMcaibGTbN%2FwoQqcH3nMMzysz6iJS7cDK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;451&quot; height=&quot;315&quot; data-origin-width=&quot;1056&quot; data-origin-height=&quot;737&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;손실 표면의 기울기가 거의 0에 가까워&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;극소점이 아닌데도 업데이트 크기가 미세해져 학습이 정체되는 넓은 구간&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ 경사 방향 오류&lt;/h3&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1024&quot; data-origin-height=&quot;559&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cn8uwE/dJMcajaxjII/ekY1F4raTUeI5OUA1kM9X1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cn8uwE/dJMcajaxjII/ekY1F4raTUeI5OUA1kM9X1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cn8uwE/dJMcajaxjII/ekY1F4raTUeI5OUA1kM9X1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fcn8uwE%2FdJMcajaxjII%2FekY1F4raTUeI5OUA1kM9X1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;528&quot; height=&quot;288&quot; data-origin-width=&quot;1024&quot; data-origin-height=&quot;559&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;길고 좁은 골짜기 지형에서 전역적인 최적 방향을 찾지 못하고 국소 기울기만 따라가다가&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;좌우 벽을 향해 불필요하게 지그재그로 진동하는 현상&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;8. 최적화 알고리즘&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ 모멘텀(Momentum)&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;이전 이동 방향의 관성을 누적하여 현재 기울기에 더하는 방식&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; 현재 기울기 + 과거 이동 흐름&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;지그재그 진동을 상쇄하여 완화함&lt;/li&gt;
&lt;li&gt;일관되게 내려가는 방향으로 속도를 가속하여 평평한 영역을 통과함&lt;/li&gt;
&lt;/ul&gt;
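&lt;p data-ke-size=&quot;size16&quot;&gt;아래는 모멘텀 갱신을 함수 하나로 옮겨 본 스케치 (감쇠 계수 beta=0.9는 흔히 쓰이는 예시값):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;def momentum_update(w, grad, v, eta=0.01, beta=0.9):
    &quot;&quot;&quot;이전 이동 방향 v를 관성처럼 누적해 현재 기울기에 더하는 갱신.&quot;&quot;&quot;
    v = beta * v - eta * grad     # 과거 흐름 유지 + 현재 기울기 반영
    return w + v, v               # 갱신된 가중치와 속도를 함께 반환
&lt;/code&gt;&lt;/pre&gt;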
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ AdaGrad (적응형 학습률)&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;파라미터 별로 과거의 기울기 제곱합을 누적하여 학습률을 자동 조절함&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;크게 변한 파라미터는 보폭을 줄임&lt;/li&gt;
&lt;li&gt;드물게 변한 파라미터는 보폭을 키움&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- 단점&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;시간이 지날수록 누적값이 커져 학습률이 0으로 수렴함
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;보완: RMSProp 처럼 이동평균으로 최근 기울기 중심 반영&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;&amp;nbsp;&lt;/h3&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ RMSProp&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;AdaGrad의 단점을 보완해 기울기 제곱의 이동평균을 사용함&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;오래된 과거 정보는 잊게 만들어, 학습 후반부에도 적절한 학습률을 유지&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- &lt;span style=&quot;letter-spacing: 0px;&quot;&gt;효과&lt;/span&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;진동 완화&lt;/li&gt;
&lt;li&gt;안정적 수렴&lt;/li&gt;
&lt;li&gt;평평한 영역에서도 보폭을 유지해 학습 지속&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;&amp;nbsp;&lt;/h3&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ Adam (Adaptive Moment Estimation)&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;모멘텀(방향 안정성, 1차 모멘트) + RMSProp(적응적 보폭 크기, 2차 모멘트)&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;빠른 수렴과 진동 억제에 유리&lt;/li&gt;
&lt;li&gt;파라미터별 적응적 학습률로 안정적 업데이트&lt;/li&gt;
&lt;li&gt;초기 설정에 비교적 강건, 다양한 문제에서 무난&lt;/li&gt;
&lt;/ul&gt;
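&lt;p data-ke-size=&quot;size16&quot;&gt;아래는 1차 모멘트(모멘텀)와 2차 모멘트(RMSProp)가 Adam 안에서 결합되는 방식을 보여주는 스케치 (하이퍼파라미터 기본값은 통상적인 예시값):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;import numpy as np

def adam_update(w, grad, m, v, t, eta=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    &quot;&quot;&quot;1차 모멘트 m(방향)과 2차 모멘트 v(보폭)를 함께 쓰는 Adam 갱신. t는 1부터 시작.&quot;&quot;&quot;
    m = b1 * m + (1 - b1) * grad            # 모멘텀: 기울기의 이동평균
    v = b2 * v + (1 - b2) * grad ** 2       # RMSProp: 기울기 제곱의 이동평균
    m_hat = m / (1 - b1 ** t)               # 학습 초반의 편향 보정
    v_hat = v / (1 - b2 ** t)
    w = w - eta * m_hat / (np.sqrt(v_hat) + eps)  # 파라미터별 적응적 보폭
    return w, m, v
&lt;/code&gt;&lt;/pre&gt;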
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;9. 과적합과 데이터 분할&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ 과적합 (Overfitting)&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;모델이 훈련 데이터의 잡음과 우연한 패턴까지 암기해버려&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;훈련 손실은 낮지만&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;처음 보는 새로운 데이터에 대한 성능은 오히려 떨어지는 현상&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- 주요 원인&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;모델 과복잡&lt;/li&gt;
&lt;li&gt;데이터 부족&lt;/li&gt;
&lt;li&gt;과도한 학습&lt;/li&gt;
&lt;li&gt;정규화 부재&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; 훈련에만 최적화된 상태, 일반화를 위해 원인(복잡도, 데이터, 학습시간, 정규화)을 관리해야 함&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ 데이터 분할 원칙&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;이를 방지하고 평가하기 위해 데이터를 세 가지로 나눔&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;1) Train (훈련셋)&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;모델의 가중치와 파라미터를 학습하는데 직접 사용&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;2) Validation (검증셋)&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;목적: 학습 중 모델 선택, 튜닝
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;학습 과정 중 모델의 과적합 여부를 모니터링함&lt;/li&gt;
&lt;li&gt;최적의 하이퍼파라미터나 조기 종료 시점을 선택하는 데 사용됨&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;3) Test (테스트셋)&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;목적: 최종 일반화 성능 측정
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;모든 구조와 튜닝이 확정된 후, 모델의 최종 일반화 성능을 산출하기 위해 단 1회만 사용됨&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;의사결정에 사용 금지&lt;/li&gt;
&lt;/ul&gt;
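&lt;p data-ke-size=&quot;size16&quot;&gt;아래는 데이터 인덱스를 세 집합으로 나누는 과정을 옮겨 본 스케치 (함수명 split_indices와 분할 비율은 설명용으로 가정한 것):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;import numpy as np

def split_indices(n, val_ratio=0.2, test_ratio=0.2, seed=0):
    &quot;&quot;&quot;데이터 인덱스를 훈련/검증/테스트로 나눔 (비율은 예시값).&quot;&quot;&quot;
    idx = np.random.default_rng(seed).permutation(n)   # 무작위로 섞기
    n_test, n_val = int(n * test_ratio), int(n * val_ratio)
    test = idx[:n_test]                     # 최종 평가에만 1회 사용
    val = idx[n_test:n_test + n_val]        # 튜닝, 조기 종료 판단용
    train = idx[n_test + n_val:]            # 가중치 학습용
    return train, val, test
&lt;/code&gt;&lt;/pre&gt;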
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;10. 과적합 방지 기술&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ 드롭아웃 (Dropout)&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;학습 과정 중 지정된 확률&lt;span&gt;(&lt;/span&gt;&lt;span data-index-in-node=&quot;31&quot; data-math=&quot;p&quot;&gt;p&lt;/span&gt;&lt;span&gt;, 보통 0.1~0.5)로 일부 뉴런의 연결을 무작위로 비활성화함&lt;/span&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;특정 뉴런 조합에만 과도하게 의존하는 공적응 현상 방지&lt;/li&gt;
&lt;li&gt;여러 서브 네트워크의 앙상블 효과를 내어 일반화 성능을 높임&lt;/li&gt;
&lt;/ul&gt;
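&lt;p data-ke-size=&quot;size16&quot;&gt;아래는 학습/추론을 구분하는 역(inverted) 드롭아웃의 순전파 스케치 (함수명과 기본 확률은 설명용 가정):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;import numpy as np

def dropout_forward(a, p=0.5, training=True):
    &quot;&quot;&quot;확률 p로 뉴런을 끄고, 남은 뉴런을 1/(1-p)배 해 기대값을 보존.&quot;&quot;&quot;
    if not training:
        return a                                    # 추론 시에는 그대로 통과
    mask = (np.random.rand(*a.shape) &gt; p) / (1 - p) # 확률 1-p로 살아남음
    return a * mask                                 # 꺼진 뉴런의 출력은 0
&lt;/code&gt;&lt;/pre&gt;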
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ 배치정규화 (BatchNorm)&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;층을 지나는 입력 데이터의 분포를 각 미니 배치의 평균과 분산을 이용해 평균 0, 분산 1 수준으로 정규화함&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;내부 공변량 변화를 완화해 기울기의 폭주, 소실을 감소시킴&lt;/li&gt;
&lt;li&gt;더 큰 학습률을 안전하게 사용 가능하게 하여 수렴을 가속함&lt;/li&gt;
&lt;/ul&gt;
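&lt;p data-ke-size=&quot;size16&quot;&gt;아래는 미니배치 축으로 평균, 분산을 구해 정규화하는 순전파의 간단한 스케치 (추론 시 쓰는 이동평균 통계는 생략한 학습용 버전):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;import numpy as np

def batchnorm_forward(x, gamma, beta, eps=1e-5):
    &quot;&quot;&quot;배치 통계로 정규화한 뒤 gamma, beta로 다시 스케일/이동.&quot;&quot;&quot;
    mu = x.mean(axis=0)                    # 배치 평균
    var = x.var(axis=0)                    # 배치 분산
    x_hat = (x - mu) / np.sqrt(var + eps)  # 평균 0, 분산 1로 정규화
    return gamma * x_hat + beta
&lt;/code&gt;&lt;/pre&gt;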
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ 하이퍼파라미터&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;모델 연산이 아닌 사람이 직접 설정하는 변수들&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;이를 조절하여 과적합/과소적합의 균형을 맞춤&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 학습률&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;너무 크면 불안정&lt;/li&gt;
&lt;li&gt;너무 작으면 느림&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 배치 크기&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;32/64/128 등 자원 한도 내에서 설정&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 에폭 수&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 정규화 강도&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;과적합 시 증대&lt;/li&gt;
&lt;li&gt;과소적합 시 완화&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 드롭아웃 비율&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;0.1~0.5 범위 탐색&lt;/li&gt;
&lt;li&gt;과적합이 심할수록 높임&lt;/li&gt;
&lt;li&gt;성능 정체 시 낮춤&lt;/li&gt;
&lt;/ul&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;11. 모델 평가 지표&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;○ 정확도 (Accuracy)의 함정&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;데이터 불균형(예: 정상 95%, 이상 5%) 상태에서는 무조건 정상이라고만 찍어도 정확도가 95%가 나오는 착시 발생&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; 이를 막기 위해 여러 지표 확인&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- Precision(정밀도)&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span data-path-to-node=&quot;35,1,1,0&quot;&gt;&lt;span data-index-in-node=&quot;17&quot; data-math=&quot;\frac{TP}{TP+FP}&quot;&gt;$$\frac{TP}{TP+FP}$$&lt;/span&gt;&lt;/span&gt;&lt;span data-path-to-node=&quot;35,1,1,1&quot;&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;양성이라고 예측한 것 중 진짜 양성 비율&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;▷ 거짓 양성(FP) 비용이 클 때 중요&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- Recall(재현율)&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;$$\frac{TP}{TP+FN}$$&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;실제 양성 중 올바르게 양성으로 찾아낸 비율&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;▷ 거짓 음성(FN) 비용이 클 때 중요, ex) 암 진단&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- F1-score&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span data-index-in-node=&quot;10&quot; data-math=&quot;2 \times \frac{Precision \times Recall}{Precision + Recall}&quot;&gt;$$2 \times \frac{Precision \times Recall}{Precision + Recall}$$&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;정밀도와 재현율의 조화평균&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;두 지표의 균형을 강제하여 한쪽으로만 치우치는 경우에 패널티를 부여하는 종합 평가 지표&lt;/p&gt;
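&lt;p data-ke-size=&quot;size16&quot;&gt;아래는 혼동행렬 성분(TP, FP, FN)으로 세 지표를 직접 계산해 보는 스케치 (TP+FP 등이 0이 되는 경계 상황 처리는 생략):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;import numpy as np

def precision_recall_f1(y_true, y_pred):
    &quot;&quot;&quot;이진 분류 예측에서 정밀도/재현율/F1을 직접 계산.&quot;&quot;&quot;
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == 1) &amp; (y_true == 1))
    fp = np.sum((y_pred == 1) &amp; (y_true == 0))
    fn = np.sum((y_pred == 0) &amp; (y_true == 1))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
print(precision_recall_f1(y_true, y_pred))   # (0.75, 0.75, 0.75)
&lt;/code&gt;&lt;/pre&gt;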
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- ROC-AUC vs PR-AUC&lt;/h4&gt;
&lt;table style=&quot;border-collapse: collapse; width: 100%;&quot; border=&quot;1&quot; data-ke-align=&quot;alignLeft&quot;&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;&amp;nbsp;&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;ROC-AUC&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;PR-AUC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;핵심&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;임계값 전반의 분리 능력 평가&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;양성(희귀 클래스) 탐지 품질에 집중&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;불균형 민감도&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;상대적으로 낮음&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;매우 높음&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;유리한 상황&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;양, 음 클래스 균형, 오류 비용이 유사&lt;/td&gt;
&lt;td style=&quot;width: 33.3333%;&quot;&gt;양성이 희소, 양성 누락 비용이 큼&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;균형 잡힌 데이터에서의 전반적 분류 능력 &amp;rarr; ROC-AUC로 평가&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;양성 클래스가 극도로 희귀한 불균형 데이터 &amp;rarr; PR-AUC로 평가&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;hr contenteditable=&quot;false&quot; data-ke-type=&quot;horizontalRule&quot; data-ke-style=&quot;style5&quot; /&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;전체 흐름 요약&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;1. 목표 설정 : 손실함수가 틀린 정도를 정의&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;2. 방법 도출: 미니배치와 역전파를 통해 효율적으로 기울기 방향 계산&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;3. 최적화: Adam 등의 옵티마이저를 통해 진동 없이 안정적으로 최적점을 찾아감&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;4. 일반화 : 드롭아웃과 배치정규화로 정답을 외우는 과적합을 방지하여 현실 데이터 적응력을 높임&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;5. 평가: Accuracy의 함정을 피해, 목적에 맞는 혼동행렬 기반 세부 지표로 성능을 검증&lt;/p&gt;</description>
      <author>kimyeji2358</author>
      <guid isPermaLink="true">https://kimyeji2358.tistory.com/12</guid>
      <comments>https://kimyeji2358.tistory.com/12#entry12comment</comments>
      <pubDate>Fri, 27 Mar 2026 01:39:42 +0900</pubDate>
    </item>
    <item>
      <title>3-1 딥러닝 기초: 인공신경망과 퍼셉트론</title>
      <link>https://kimyeji2358.tistory.com/11</link>
      <description>&lt;h2 data-ke-size=&quot;size26&quot;&gt;1. 딥러닝 등장 배경&lt;/h2&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;- Rule - based AI &amp;rarr; 머신러닝&lt;/h3&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;Rule - based AI (규칙기반)&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- 사람이 직접 명시적인 규칙을 작성&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- 복잡한 문제나 새로운 상황에 대응하기 어렵다&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;머신러닝 (학습기반)&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;- 모델이 직접 데이터로부터 규칙과 패턴을 자동으로 학습&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;- 초기 머신러닝 모델&lt;/h3&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;선형 회귀
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;연속값 예측&amp;nbsp;&amp;rarr; 선형(직선) 관계만 파악 가능&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;로지스틱 회귀
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;이진 분류&amp;nbsp;&amp;rarr; 복잡한 데이터 분류에 한계&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;퍼셉트론
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;이진 분류(초기 신경망)&amp;nbsp;&amp;rarr; 완벽히 직선으로 나뉠 때만 정답 도출&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;=&amp;gt; 공통점: 데이터로 스스로 학습하지만, 구조가 단순하고 &lt;u&gt;결정경계가 선형이다&lt;/u&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;※선형 결정경계&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; 입력 공간에서 두 클래스를 나누는 경계가 선형 형태로 나타나는 경우&lt;/p&gt;
&lt;ul style=&quot;list-style-type: circle;&quot; data-ke-list-type=&quot;circle&quot;&gt;
&lt;li&gt;결정경계 : 데이터를 나누는 &lt;b&gt;기준선&lt;/b&gt;&lt;/li&gt;
&lt;li&gt;선형 : 기준선이 공간상에서 단 하나의 직선임&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;&amp;nbsp;&lt;/h2&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;2. 퍼셉트론&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;인간의 뇌신경세포(뉴런)가 신호를 처리하는 방식을 수학적으로 단순화한 모델&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- 생물학적 뉴런 vs 인공 뉴런(퍼셉트론)의 대응&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;수상돌기(입력 수신)&amp;nbsp;&amp;rarr; 입력(x)&lt;/li&gt;
&lt;li&gt;세포체 (신호 통합) &amp;rarr; 가중합($z = w^T x + b$)&lt;/li&gt;
&lt;li&gt;축삭 (신호 출력)&amp;nbsp;&amp;rarr; 활성화 함수 (f) 를 거친 출력 (a)&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- 퍼셉트론 동작 원리&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;입력 &amp;rarr; 가중합 &amp;rarr; 활성화 &amp;rarr; 출력&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span data-index-in-node=&quot;44&quot; data-math=&quot;z = w^T x + b&quot;&gt;$$z = w^T x + b$$&lt;/span&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;가중합: 입력 벡터 x에 가중치 w를 곱하고 편향 b를 더함&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;$$a = f(z)$$&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;활성화: 계산된 가중합을 활성화 함수에 통과시켜 최종 출력을 만듦.&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;=&amp;gt; 퍼셉트론은 먼저 데이터를 나누는 기준선 하나를 만들고, 활성화 함수를 통해 데이터 구분&lt;/p&gt;
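&lt;p data-ke-size=&quot;size16&quot;&gt;아래는 가중합 &amp;rarr; 계단 활성화 흐름을 AND 게이트에 적용해 본 스케치 (가중치와 편향은 설명용 예시값):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;import numpy as np

def perceptron(x, w, b):
    &quot;&quot;&quot;가중합 z = w.x + b 를 계단 함수에 통과시키는 단층 퍼셉트론.&quot;&quot;&quot;
    z = np.dot(w, x) + b            # 가중합
    return 1 if z &gt; 0 else 0        # 활성화(계단 함수)

w, b = np.array([0.5, 0.5]), -0.7   # AND 게이트를 만드는 예시 가중치
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron(np.array(x), w, b))   # 0, 0, 0, 1
&lt;/code&gt;&lt;/pre&gt;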
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- 퍼셉트론의 한계&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;퍼셉트론이 2차원 공간에서 데이터를 나누는 기준선(결정경계)은 단 하나의 직선&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;직선 하나로 나눌 수 있는 데이터에는 강하지만, 복잡한 비선형 문제에는 약함.&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;XOR 문제: 두 클래스가 대각선으로 엇갈려 있어 직선 하나로는 분리 불가능&lt;/li&gt;
&lt;li&gt;동심원 데이터: 데이터가 원형으로 둘러싸인 형태, 직선 경계로는 분리 불가능&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;3. 한계 극복: 다층 퍼셉트론(MLP) 등장&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;단층 퍼셉트론의 한계를 극복하기 위해 입력층과 출력층 사이에 은닉층 추가&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;420&quot; data-origin-height=&quot;263&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bgJYds/dJMcabp55Eu/mW0n15hfG68R81k37euLn1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bgJYds/dJMcabp55Eu/mW0n15hfG68R81k37euLn1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bgJYds/dJMcabp55Eu/mW0n15hfG68R81k37euLn1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbgJYds%2FdJMcabp55Eu%2FmW0n15hfG68R81k37euLn1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;391&quot; height=&quot;245&quot; data-origin-width=&quot;420&quot; data-origin-height=&quot;263&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- 은닉층&lt;/h4&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;정의 : 입력과 출력 사이에서 특징을 변환 , 추출 하는 중간 층&lt;/li&gt;
&lt;li&gt;역할 : 판단을 한 번에 끝내지 않고, 입력을 여러 부분 판단으로 분해하여 다음 층에서 조합할 수 있도록 유용한 표현 공간으로 변환함&lt;/li&gt;
&lt;li&gt;은닉층이 깊어지고 뉴런 수가 늘수록 정교한 표현 학습 가능&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- 복잡한 결정경계 형성&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;1) 각 은닉 뉴런은 서로 다른 직선 기준(하프스페이스)을 만듦.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;2) 출력층이 이 여러 개의 직선 조각들을 조합(선형 결합)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;3) 결과적으로 곡선처럼 보이는 복잡한 다각형 영역(결정경계) 형성&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;rarr; 비선형 문제 해결 가능&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;4. 비선형 활성화 함수&lt;/h2&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- 왜 선형 모델만 쌓으면 안 될까?&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;선형 변환을 여러 번 합성하는 것은 결국 또 다른 하나의 선형 변환으로 치환됨.&lt;/p&gt;
&lt;div data-math=&quot;y = W_3(W_2(W_1x)) \Rightarrow y = Wx&quot;&gt;$$y = W_3(W_2(W_1x)) \Rightarrow y = Wx$$&lt;/div&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;비선형 활성화 함수 없이 층만 쌓아 봤자 결정경계는 여전히 직선/초평면 &amp;rarr; 표현력 증가 없음&lt;/li&gt;
&lt;li&gt;복잡한 패턴 표현을 위해서는 비선형 활성화가 필요&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- 비선형 활성화 함수&amp;nbsp;&lt;/h4&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;1) Sigmoid 함수&lt;/h3&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;280&quot; data-origin-height=&quot;180&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/pJkvm/dJMcahcKmZm/IZagAl1cXg2D9BAN6SOPQ0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/pJkvm/dJMcahcKmZm/IZagAl1cXg2D9BAN6SOPQ0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/pJkvm/dJMcahcKmZm/IZagAl1cXg2D9BAN6SOPQ0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FpJkvm%2FdJMcahcKmZm%2FIZagAl1cXg2D9BAN6SOPQ0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;320&quot; height=&quot;206&quot; data-origin-width=&quot;280&quot; data-origin-height=&quot;180&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 수식/ 범위&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;0 &amp;lt; y &amp;lt; 1&lt;/li&gt;
&lt;li&gt;S자 곡선&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 장점&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;출력이 0~1 사이라 확률적 의미로 직관적임&lt;/li&gt;
&lt;li&gt;이진 분류의 최종 출력층에 주로 사용&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 단점&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;입력 절댓값이 크면 미분값이 0에 가까워지는 기울기 소실 발생&lt;/li&gt;
&lt;li&gt;출력이 0 중심이 아님&lt;/li&gt;
&lt;li&gt;지수 연산(exp) 비용이 커서 은닉층에서 드물게 사용&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;2) ReLU 함수&lt;/h3&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;846&quot; data-origin-height=&quot;554&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bLC1XO/dJMcafzf00J/9FAuMaYj2FQ80x2RqZipt1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bLC1XO/dJMcafzf00J/9FAuMaYj2FQ80x2RqZipt1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bLC1XO/dJMcafzf00J/9FAuMaYj2FQ80x2RqZipt1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbLC1XO%2FdJMcafzf00J%2F9FAuMaYj2FQ80x2RqZipt1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;381&quot; height=&quot;249&quot; data-origin-width=&quot;846&quot; data-origin-height=&quot;554&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 수식/범위 &lt;span data-path-to-node=&quot;29,2,0,0&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li data-path-to-node=&quot;29,2,1,1&quot;&gt;$\max(0, x)$&lt;/li&gt;
&lt;li data-path-to-node=&quot;29,2,1,1&quot;&gt;음수 : 0&lt;/li&gt;
&lt;li data-path-to-node=&quot;29,2,1,1&quot;&gt;양수 : x&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 장점&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;계산이 매우 단순하고 학습이 빠름&lt;/li&gt;
&lt;li&gt;Sigmoid 대비 기울기 소실 문제를 완화하여 은닉층의 표준으로 사용&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 단점&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;음수 구간에서는 뉴런이 꺼지는 문제(dying ReLU) 발생&lt;/li&gt;
&lt;li&gt;x=0에서 미분 불가&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;3) tanh 함수&lt;/h3&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;490&quot; data-origin-height=&quot;270&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/Bi8Va/dJMcahjyzfN/5tJypjlG2RcHDGT22QV7wK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/Bi8Va/dJMcahjyzfN/5tJypjlG2RcHDGT22QV7wK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/Bi8Va/dJMcahjyzfN/5tJypjlG2RcHDGT22QV7wK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FBi8Va%2FdJMcahjyzfN%2F5tJypjlG2RcHDGT22QV7wK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;414&quot; height=&quot;228&quot; data-origin-width=&quot;490&quot; data-origin-height=&quot;270&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 수식/범위&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;-1 &amp;lt; y &amp;lt; 1&lt;/li&gt;
&lt;li&gt;S자 곡선&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 장점&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;중앙 정렬(zero-centered)되어 있어 그래디언트 방향 왜곡이 적음&lt;/li&gt;
&lt;li&gt;학습이 더 안정적&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 단점&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;입력이 크면 여전히 기울기 소실이 발생&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 data-ke-size=&quot;size23&quot;&gt;4) Softmax&lt;/h3&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;1063&quot; data-origin-height=&quot;1063&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/b3rIDG/dJMcacJi6jG/uRgZbZk8QPNnPFr9Hu4R80/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/b3rIDG/dJMcacJi6jG/uRgZbZk8QPNnPFr9Hu4R80/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/b3rIDG/dJMcacJi6jG/uRgZbZk8QPNnPFr9Hu4R80/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fb3rIDG%2FdJMcacJi6jG%2FuRgZbZk8QPNnPFr9Hu4R80%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;336&quot; height=&quot;336&quot; data-origin-width=&quot;1063&quot; data-origin-height=&quot;1063&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-path-to-node=&quot;29,4,1,0&quot; data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-path-to-node=&quot;29,4,1,0&quot; data-ke-size=&quot;size18&quot;&gt;- 수식&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li id=&quot;p-rc_ff7eea34b408fe5f-238&quot; data-path-to-node=&quot;29,4,1,1&quot;&gt;$p_{i}=\frac{\exp(z_{i})}{\sum \exp(z_{j})}$&lt;/li&gt;
&lt;li data-path-to-node=&quot;29,4,1,1&quot;&gt;z(각 클래스 점수), p(확률), exp(지수 함수)&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 특징&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;모든 출력값을 0과 1 사이로 만듦.&lt;/li&gt;
&lt;li&gt;전체 합이 1이 되도록 변환&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 용도&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;모델의 출력을 직관적인 확률 분포로 바꿈&amp;nbsp;&amp;rarr; 다중 분류의 출력층에서 사용&lt;/li&gt;
&lt;/ul&gt;
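&lt;p data-ke-size=&quot;size16&quot;&gt;아래는 위 네 활성화 함수를 NumPy로 정의해 비교하는 스케치 (softmax에서 최댓값을 빼는 것은 오버플로 방지용 관례):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))      # 0 ~ 1, 이진 분류 출력층용

def relu(z):
    return np.maximum(0, z)          # 음수는 0, 양수는 그대로 (은닉층 표준)

def tanh(z):
    return np.tanh(z)                # -1 ~ 1, 0 중심(zero-centered)

def softmax(z):
    e = np.exp(z - z.max())          # 최댓값을 빼 오버플로 방지
    return e / e.sum()               # 전체 합이 1인 확률 분포

print(softmax(np.array([2.0, 1.0, 0.1])))   # [0.659 0.242 0.099] (근사값)
&lt;/code&gt;&lt;/pre&gt;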
&lt;h2 data-ke-size=&quot;size26&quot;&gt;5. 계층적 학습&lt;/h2&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;비선형 활성화 함수와 여러 층이 결합 &amp;rarr; 신경망은 데이터를 점진적이고 계층적으로 이해함&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- 이미지 인식의 계층적 학습&lt;/h4&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;1) 저층 (Low-level)&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;선, 모서리(에지), 방향성 같은 단순하고 기본적인 패턴만 감지&lt;/li&gt;
&lt;li&gt;ex) Sobel, Gabor 필터&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;2) 중층 (Mid-level)&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;저층에서 찾은 선과 모서리들을 조합해 눈, 코, 입과 같은 부분적인 부위 구조 인식&lt;/li&gt;
&lt;li&gt;불변성: 위치, 스케일 변화에 점진적 견고성&lt;/li&gt;
&lt;li&gt;효과: 파트 기반 인식으로 의미적 구조 형성&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;3) 고층 (High-level)&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;중층의 부위 정보들을 통합하여 '얼굴' 이라는 전체적인 고차원 개념을 형성&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <author>kimyeji2358</author>
      <guid isPermaLink="true">https://kimyeji2358.tistory.com/11</guid>
      <comments>https://kimyeji2358.tistory.com/11#entry11comment</comments>
      <pubDate>Thu, 26 Mar 2026 22:32:25 +0900</pubDate>
    </item>
    <item>
      <title>2. 선형대수</title>
      <link>https://kimyeji2358.tistory.com/9</link>
      <description>&lt;h3 data-ke-size=&quot;size23&quot;&gt;선형 시스템과 해의 조건&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;선형 대수학에서 선형 시스템의 행렬 방정식은 Ax = b로 표현된다&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;이를 벡터 방정식으로 풀어쓰면&amp;nbsp; 아래처럼 된다&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;$$a_1x_1 + a_2x_2 + a_3x_3 = b$$&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- 해가 존재할 조건&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;벡터 b가 행렬 A의 열벡터들이 만들어내는 Span 영역 안에 있을 때만 해가 존재&lt;/p&gt;
&lt;div data-math=&quot;b \in \text{Span}\{a_1, a_2, a_3\}&quot;&gt;$$b \in \text{Span}\{a_1, a_2, a_3\}$$&lt;/div&gt;
&lt;div data-math=&quot;b \in \text{Span}\{a_1, a_2, a_3\}&quot;&gt;&amp;nbsp;&lt;/div&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- 해의 유일성&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;해가 존재할 경우, 이 해가 1개인지 무수히 많은지는 열벡터들의 관계에 따라 결정&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;열벡터가 선형 독립일 때 해가 유일함&lt;/li&gt;
&lt;li&gt;열벡터들이 선형 종속이라면 해는 무수히 많이 존재&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;1. 선형 독립(Linear independence)&lt;/h2&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;800&quot; data-origin-height=&quot;400&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/AmxBc/dJMcabwLCyA/vLpap9zIeMkX6dmPEuCAck/img.webp&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/AmxBc/dJMcabwLCyA/vLpap9zIeMkX6dmPEuCAck/img.webp&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/AmxBc/dJMcabwLCyA/vLpap9zIeMkX6dmPEuCAck/img.webp&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FAmxBc%2FdJMcabwLCyA%2FvLpap9zIeMkX6dmPEuCAck%2Fimg.webp&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;548&quot; height=&quot;274&quot; data-origin-width=&quot;800&quot; data-origin-height=&quot;400&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;선형 독립은 모든 벡터가 각자 자기만의 방향을 가지고 있어서, 서로가 서로를 대체할 수 없는 상태를 말함&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 기하학적 의미&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;어떤 벡터 집합에 있는 벡터들 중 그 어느 것도 다른 벡터들의 조합(선형 결합)으로 만들 수 없음&lt;/li&gt;
&lt;li&gt;ex) 3차원 공간에서 세 벡터가 선형 독립이라면, 이들은 서로 다른 방향을 가리키며 3차원 입체(Span)을 구성&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 수학적 정의&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&lt;span data-index-in-node=&quot;2&quot; data-math=&quot;x_1v_1 + x_2v_2 + \dots + x_pv_p = 0&quot;&gt;$$x_1v_1 + x_2v_2 + \dots + x_pv_p = 0$$&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span data-index-in-node=&quot;2&quot; data-math=&quot;x_1v_1 + x_2v_2 + \dots + x_pv_p = 0&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;을 만족시키는 해가 오직&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&lt;span data-index-in-node=&quot;53&quot; data-math=&quot;x_1 = 0, x_2 = 0, \dots, x_p = 0&quot;&gt;$$x_1 = 0, x_2 = 0, \dots, x_p = 0$$&lt;/span&gt;&lt;span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span data-index-in-node=&quot;53&quot; data-math=&quot;x_1 = 0, x_2 = 0, \dots, x_p = 0&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;뿐일 때(= trivial solution), 이 벡터들은 선형 독립임&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&lt;span data-index-in-node=&quot;53&quot; data-math=&quot;x_1 = 0, x_2 = 0, \dots, x_p = 0&quot;&gt;- 선형 시스템 (Ax=b)에서의 의미&lt;/span&gt;&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;행렬 A의 열벡터들이 선형 독립이라면, 해가 존재할 경우 그 해는 단 1개만 존재&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;2. 선형 종속(Linear Dependence)&lt;/h2&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;blob&quot; data-origin-width=&quot;163&quot; data-origin-height=&quot;126&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bpv2cq/dJMcabXQ4RT/b3hGM9J5NkPlV5iczotrL0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bpv2cq/dJMcabXQ4RT/b3hGM9J5NkPlV5iczotrL0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bpv2cq/dJMcabXQ4RT/b3hGM9J5NkPlV5iczotrL0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fbpv2cq%2FdJMcabXQ4RT%2Fb3hGM9J5NkPlV5iczotrL0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;302&quot; height=&quot;233&quot; data-filename=&quot;blob&quot; data-origin-width=&quot;163&quot; data-origin-height=&quot;126&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;선형 종속은 반대로 벡터들 중 최소한 하나는 다른 벡터들의 조합으로 만들어낼 수 있는 상태&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 기하학적 의미&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;특정 벡터가 다른 벡터들이 이미 만들어 놓은 공간(Span) 안에 들어가있는 경우&lt;/li&gt;
&lt;li&gt;ex) v₃ = 2v₁ + 3v₂ 처럼 표현될 수 있다면, v₃는 새로운 영역을 개척하지 못하므로 선형 종속. 즉, 종속인 벡터가 추가되어도 전체 생성 공간(Span)은 커지지 않음&lt;/li&gt;
&lt;/ul&gt;
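&lt;p data-ke-size=&quot;size16&quot;&gt;아래는 위 예시(v₃ = 2v₁ + 3v₂)를 NumPy로 확인해 보는 스케치로, 0이 아닌 계수 조합으로 0 벡터가 만들어지는지와 열공간의 차원(아래 5번의 랭크)을 함께 출력함:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;import numpy as np

v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([0.0, 1.0, 0.0])
v3 = 2 * v1 + 3 * v2                  # v3는 v1, v2의 선형 결합이므로 선형 종속

# 2·v1 + 3·v2 - 1·v3 = 0 : 0이 아닌 계수로 0 벡터를 만들 수 있음
print(2 * v1 + 3 * v2 - v3)           # [0. 0. 0.]

# 종속인 열은 Span을 키우지 못하므로 열공간의 차원(랭크)은 2에 그침
A = np.column_stack([v1, v2, v3])
print(np.linalg.matrix_rank(A))       # 2
&lt;/code&gt;&lt;/pre&gt;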
&lt;p data-ke-size=&quot;size18&quot;&gt;- 수학적 정의&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&lt;span data-index-in-node=&quot;2&quot; data-math=&quot;x_1v_1 + x_2v_2 + \dots + x_pv_p = 0&quot;&gt;$$x_1v_1 + x_2v_2 + \dots + x_pv_p = 0$$&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span data-index-in-node=&quot;2&quot; data-math=&quot;x_1v_1 + x_2v_2 + \dots + x_pv_p = 0&quot;&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;을 만족시키는 해 중에서, 0이 아닌 값이 하나라도 존재한다면 선형 종속&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 선형 시스템 (Ax=b)에서의 의미&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;주어진 시스템의 해가 존재한다고 가정할 때, 행렬 A의 열벡터들이 선형 종속이라면 하나의 벡터를 나타내는 조합의 수가 여러 개가 되므로 무수히 많은 해를 가지게 됨.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;table style=&quot;border-collapse: collapse; width: 100%; height: 66px;&quot; border=&quot;1&quot; data-ke-align=&quot;alignLeft&quot;&gt;
&lt;tbody&gt;
&lt;tr style=&quot;height: 24px;&quot;&gt;
&lt;td style=&quot;width: 25%; height: 24px;&quot;&gt;구분&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 24px;&quot;&gt;Span의 확장&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 24px;&quot;&gt;Ax = b의 해&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 24px;&quot;&gt;Ax = 0의 해&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 21px;&quot;&gt;
&lt;td style=&quot;width: 25%; height: 21px;&quot;&gt;선형 독립&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 21px;&quot;&gt;벡터 개수만큼 차원 확장&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 21px;&quot;&gt;유일한 해&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 21px;&quot;&gt;모두 0인 해만 존재&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;height: 21px;&quot;&gt;
&lt;td style=&quot;width: 25%; height: 21px;&quot;&gt;선형 종속&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 21px;&quot;&gt;차원 확장에 기여 못함&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 21px;&quot;&gt;무수히 많음&lt;/td&gt;
&lt;td style=&quot;width: 25%; height: 21px;&quot;&gt;0이 아닌 해가 존재&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;3. 부분공간(Subspace)과 생성(Span)&lt;/h2&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;535&quot; data-origin-height=&quot;486&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/8f9sy/dJMcaiWVYK4/Y1izu4V0benLjILEeW80gk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/8f9sy/dJMcaiWVYK4/Y1izu4V0benLjILEeW80gk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/8f9sy/dJMcaiWVYK4/Y1izu4V0benLjILEeW80gk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2F8f9sy%2FdJMcaiWVYK4%2FY1izu4V0benLjILEeW80gk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;362&quot; height=&quot;329&quot; data-origin-width=&quot;535&quot; data-origin-height=&quot;486&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 부분 공간(Subspace)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;부분공간 H는 Rⁿ의 부분집합으로, 선형 결합에 대해 닫혀있는(closed) 공간을 의미&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 닫혀있다는 것의 의미&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;H에 속한 임의의 두 벡터 u₁, u₂와 임의의 스칼라 c, d에 대하여, cu₁ + du₂의 결과값 역시 반드시 H 안에 존재해야 함.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;특정 벡터들의 생성인 Span{v₁, ..., vp}은 항상 이러한 부분공간의 성질을 만족하며, 실제 모든 부분공간은 특정 벡터들의 Span으로 표현됨.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;4. 기저(Basis)와 차원(Dimension)&lt;/h2&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- 기저(Basis)&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;부분 공간 H의 기저는 2가지 조건을 만족하는 벡터들의 집합&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;해당 부분공간 H를 온전히 생성(fully spans)할 수 있어야 함.&lt;/li&gt;
&lt;li&gt;서로 선형 독립이어야 함.&lt;/li&gt;
&lt;/ul&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 기저의 비유일성&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;동일한 부분공간 H를 구성하는 기저 집합은 한 개가 아니라 여러 개가 존재할 수 있다&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size18&quot;&gt;- 차원 (Dimension)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;기저 자체는 여러 개일 수 있지만, 어떤 기저든 그 기저를 구성하는 벡터의 개수는 항상 고유함.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;이 벡터의 개수를 부분 공간의 차원이라 부르며, &lt;b&gt;dim H&lt;/b&gt; 로 표기함&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 data-ke-size=&quot;size26&quot;&gt;5. 행렬의 열공간(Column Space) 과 랭크(Rank)&lt;/h2&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;571&quot; data-origin-height=&quot;424&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/K8Cby/dJMcadH5w8K/acbegeCsJDGYgkbWH2ND70/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/K8Cby/dJMcadH5w8K/acbegeCsJDGYgkbWH2ND70/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/K8Cby/dJMcadH5w8K/acbegeCsJDGYgkbWH2ND70/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FK8Cby%2FdJMcadH5w8K%2FacbegeCsJDGYgkbWH2ND70%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;383&quot; height=&quot;284&quot; data-origin-width=&quot;571&quot; data-origin-height=&quot;424&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- 열공간(Column Space)&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;행렬 A의 열공간(Col A)은 A를 구성하는 열벡터들이 만들어내는 부분공간을 뜻함&lt;/p&gt;
&lt;ul style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;만약 행렬에 선형 종속인 열이 있다면, 해당 열은 다른 열들의 선형 결합으로 만들어질 수 있으므로 열공간의 기저를 구할 때는 배제됨&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 data-ke-size=&quot;size20&quot;&gt;- 랭크(Rank)&lt;/h4&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;행렬 A의 랭크는 행렬 A가 가지는 열공간의 차원(Dimension)을 의미&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;이를 수식으로 나타내면 &lt;b&gt;rank A = dim Col A &lt;/b&gt;가 됨.&lt;/p&gt;</description>
      <author>kimyeji2358</author>
      <guid isPermaLink="true">https://kimyeji2358.tistory.com/9</guid>
      <comments>https://kimyeji2358.tistory.com/9#entry9comment</comments>
      <pubDate>Sun, 22 Mar 2026 22:03:20 +0900</pubDate>
    </item>
  </channel>
</rss>