TensorFlow Implementation of Cost Minimization for Linear Regression

What does the cost function look like?
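In formula form, with the hypothesis H(x) = Wx used in the cell below, the cost is the mean squared error over the m training points:

\[
\mathrm{cost}(W) = \frac{1}{m}\sum_{i=1}^{m}\bigl(W x_i - y_i\bigr)^2
\]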

In [6]:
import tensorflow as tf
import matplotlib.pyplot as plt

X = [1, 2, 3]
Y = [1, 2, 3]

W = tf.placeholder(tf.float32)
# Our hypothesis for a linear model: X * W
hypothesis = X * W

# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))

# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())

# Variables for plotting the cost function
W_val = []
cost_val = []
for i in range(-30, 50):
    feed_W = i * 0.1
    curr_cost, curr_W = sess.run([cost, W], feed_dict={W: feed_W})
    W_val.append(curr_W)
    cost_val.append(curr_cost)

# Show the cost function
plt.plot(W_val, cost_val)
plt.show()
 
 

Gradient descent

 

In the update W -= learning_rate * derivative, the derivative term is what we call the gradient.
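Differentiating the cost above with respect to W gives the derivative below; the constant factor 2 is usually folded into the learning rate, which is why the code computes reduce_mean((W * X - Y) * X):

\[
\frac{\partial\,\mathrm{cost}}{\partial W} = \frac{2}{m}\sum_{i=1}^{m}\bigl(W x_i - y_i\bigr)\,x_i
\]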

In [ ]:
# Minimize: Gradient Descent using derivative: W -= learning_rate * derivative
learning_rate = 0.1
gradient = tf.reduce_mean((W * X - Y) * X)
descent = W - learning_rate * gradient
update = W.assign(descent)  # assign the value of the right-hand expression to W
In [8]:
import tensorflow as tf
x_data = [1,2,3]
y_data = [1,2,3]

W = tf.Variable(tf.random_normal([1]), name='weight')
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)

# Our hypothesis for linear model X * W
hypothesis = X * W

# cost/loss function
cost = tf.reduce_sum(tf.square(hypothesis - Y))

# Minimize: Gradient Descent using derivative: W -= learning_rate * derivative
learning_rate = 0.1
gradient = tf.reduce_mean((W * X - Y) * X)
descent = W - learning_rate * gradient
update = W.assign(descent)

# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
for step in range(21):
    sess.run(update, feed_dict={X: x_data, Y: y_data})
    print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W))
    # Check that cost and W are being updated properly; printed as step, cost, W.
 
0 0.2356646 [0.87025726]
1 0.067033365 [0.9308039]
2 0.019067265 [0.9630954]
3 0.0054236 [0.98031753]
4 0.0015427156 [0.98950267]
5 0.00043882325 [0.9944014]
6 0.00012481687 [0.9970141]
7 3.5503454e-05 [0.99840754]
8 1.0098807e-05 [0.9991507]
9 2.8719483e-06 [0.99954706]
10 8.171229e-07 [0.9997584]
11 2.325797e-07 [0.99987113]
12 6.614671e-08 [0.9999313]
13 1.8799046e-08 [0.99996334]
14 5.351012e-09 [0.99998045]
15 1.5194992e-09 [0.99998957]
16 4.3216986e-10 [0.99999446]
17 1.2649082e-10 [0.999997]
18 3.474554e-11 [0.99999845]
19 9.166001e-12 [0.99999917]
20 3.1832315e-12 [0.9999995]
 

Because the cost function here was simple, its derivative ((W * X - Y) * X) was easy to write out by hand, but the cost can be far more complicated. In that case you can simply do the following and minimize it without differentiating yourself.

In [ ]:
# Minimize: Gradient Descent Magic
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
train = optimizer.minimize(cost)
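In the TF 1.x API, minimize(cost) is essentially shorthand for computing the gradients and then applying them, roughly:

# Roughly what optimizer.minimize(cost) does internally:
grads_and_vars = optimizer.compute_gradients(cost)  # list of (gradient, variable) pairs
train = optimizer.apply_gradients(grads_and_vars)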
 

Output when W=5

In [10]:
import tensorflow as tf

# tf Graph Input
X = [1,2,3]
Y = [1,2,3]

# Set wrong model weights

W = tf.Variable(5.0)
# Linear model
hypothesis = X * W
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize: Gradient Descent Magic
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
train = optimizer.minimize(cost)

# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())

for step in range(10):
    print(step, sess.run(W))
    sess.run(train)
 
0 5.0
1 1.2666664
2 1.0177778
3 1.0011852
4 1.000079
5 1.0000052
6 1.0000004
7 1.0
8 1.0
9 1.0
 

Output when W=-3

In [8]:
import tensorflow as tf

# tf Graph Input
X = [1,2,3]
Y = [1,2,3]

# Set wrong model weights

W = tf.Variable(-3.0)
# Linear model
hypothesis = X * W
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize: Gradient Descent Magic
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
train = optimizer.minimize(cost)

# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())

for step in range(10):
    print(step, sess.run(W))
    sess.run(train)
 
0 -3.0
1 0.7333336
2 0.98222226
3 0.9988148
4 0.99992096
5 0.9999947
6 0.99999964
7 0.99999994
8 1.0
9 1.0
 

Optional: compute_gradients and apply_gradients

 

Use these when you want to take the gradients TensorFlow computes and modify them with your own operations before applying them.

In [1]:
import tensorflow as tf

X = [1,2,3]
Y = [1,2,3]

# Set wrong model weights
W = tf.Variable(5.)

# Linear model
hypothesis = X * W
# Manual gradient
gradient = tf.reduce_mean((W * X - Y) * X) * 2
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)

# Get gradients
gvs = optimizer.compute_gradients(cost)
# Apply gradients
apply_gradients = optimizer.apply_gradients(gvs)

# Launch the graph in a session.
sess = tf.Session()

sess.run(tf.global_variables_initializer())

for step in range(100):
    print(step, sess.run([gradient, W, gvs]))
    sess.run(apply_gradients)
 
WARNING:tensorflow:From C:\Users\whanh\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
0 [37.333332, 5.0, [(37.333336, 5.0)]]
1 [33.84889, 4.6266665, [(33.84889, 4.6266665)]]
2 [30.689657, 4.2881775, [(30.689657, 4.2881775)]]
3 [27.825287, 3.9812808, [(27.825287, 3.9812808)]]
4 [25.228262, 3.703028, [(25.228262, 3.703028)]]
5 [22.873621, 3.4507453, [(22.873623, 3.4507453)]]
6 [20.738752, 3.2220092, [(20.73875, 3.2220092)]]
7 [18.803137, 3.0146217, [(18.803137, 3.0146217)]]
8 [17.048176, 2.8265903, [(17.048176, 2.8265903)]]
9 [15.457013, 2.6561086, [(15.457014, 2.6561086)]]
10 [14.014359, 2.5015385, [(14.01436, 2.5015385)]]
11 [12.706352, 2.361395, [(12.706352, 2.361395)]]
12 [11.520427, 2.2343314, [(11.520427, 2.2343314)]]
13 [10.445186, 2.119127, [(10.445185, 2.119127)]]
14 [9.470302, 2.0146751, [(9.470302, 2.0146751)]]
15 [8.586407, 1.9199722, [(8.586407, 1.9199722)]]
16 [7.785009, 1.8341081, [(7.785009, 1.8341081)]]
17 [7.0584083, 1.756258, [(7.0584083, 1.756258)]]
18 [6.399624, 1.685674, [(6.399624, 1.685674)]]
19 [5.8023257, 1.6216778, [(5.8023252, 1.6216778)]]
20 [5.260776, 1.5636545, [(5.260776, 1.5636545)]]
21 [4.7697697, 1.5110468, [(4.7697697, 1.5110468)]]
22 [4.324591, 1.4633491, [(4.324591, 1.4633491)]]
23 [3.9209633, 1.4201032, [(3.9209635, 1.4201032)]]
24 [3.5550067, 1.3808936, [(3.5550067, 1.3808936)]]
25 [3.2232056, 1.3453435, [(3.2232056, 1.3453435)]]
26 [2.9223735, 1.3131114, [(2.9223735, 1.3131114)]]
27 [2.6496189, 1.2838877, [(2.6496186, 1.2838877)]]
28 [2.4023216, 1.2573916, [(2.4023216, 1.2573916)]]
29 [2.178105, 1.2333684, [(2.178105, 1.2333684)]]
30 [1.9748148, 1.2115873, [(1.9748147, 1.2115873)]]
31 [1.7904993, 1.1918392, [(1.7904994, 1.1918392)]]
32 [1.623386, 1.1739342, [(1.6233861, 1.1739342)]]
33 [1.4718695, 1.1577003, [(1.4718695, 1.1577003)]]
34 [1.3344955, 1.1429816, [(1.3344957, 1.1429816)]]
35 [1.2099417, 1.1296366, [(1.2099419, 1.1296366)]]
36 [1.0970144, 1.1175373, [(1.0970144, 1.1175373)]]
37 [0.9946267, 1.1065671, [(0.9946267, 1.1065671)]]
38 [0.90179497, 1.0966209, [(0.901795, 1.0966209)]]
39 [0.8176275, 1.087603, [(0.81762755, 1.087603)]]
40 [0.7413151, 1.0794266, [(0.7413151, 1.0794266)]]
41 [0.67212623, 1.0720135, [(0.67212623, 1.0720135)]]
42 [0.609394, 1.0652922, [(0.609394, 1.0652922)]]
43 [0.5525169, 1.0591983, [(0.5525169, 1.0591983)]]
44 [0.50094914, 1.0536731, [(0.50094914, 1.0536731)]]
45 [0.45419374, 1.0486636, [(0.45419377, 1.0486636)]]
46 [0.41180158, 1.0441216, [(0.41180158, 1.0441216)]]
47 [0.37336722, 1.0400037, [(0.37336725, 1.0400037)]]
48 [0.33851996, 1.03627, [(0.33852, 1.03627)]]
49 [0.30692515, 1.0328848, [(0.30692515, 1.0328848)]]
50 [0.27827826, 1.0298156, [(0.2782783, 1.0298156)]]
51 [0.25230527, 1.0270327, [(0.25230527, 1.0270327)]]
52 [0.2287569, 1.0245097, [(0.2287569, 1.0245097)]]
53 [0.20740573, 1.022222, [(0.20740573, 1.022222)]]
54 [0.18804836, 1.020148, [(0.18804836, 1.020148)]]
55 [0.17049654, 1.0182675, [(0.17049655, 1.0182675)]]
56 [0.15458433, 1.0165626, [(0.15458433, 1.0165626)]]
57 [0.14015675, 1.0150168, [(0.14015675, 1.0150168)]]
58 [0.12707591, 1.0136153, [(0.12707591, 1.0136153)]]
59 [0.11521538, 1.0123445, [(0.11521538, 1.0123445)]]
60 [0.10446167, 1.0111923, [(0.10446167, 1.0111923)]]
61 [0.09471202, 1.0101477, [(0.09471202, 1.0101477)]]
62 [0.08587202, 1.0092006, [(0.08587202, 1.0092006)]]
63 [0.07785805, 1.0083419, [(0.07785805, 1.0083419)]]
64 [0.07059129, 1.0075634, [(0.07059129, 1.0075634)]]
65 [0.06400236, 1.0068574, [(0.06400236, 1.0068574)]]
66 [0.05802846, 1.0062174, [(0.05802846, 1.0062174)]]
67 [0.052612226, 1.005637, [(0.052612226, 1.005637)]]
68 [0.047702473, 1.005111, [(0.047702473, 1.005111)]]
69 [0.043249767, 1.0046339, [(0.043249767, 1.0046339)]]
70 [0.03921318, 1.0042014, [(0.03921318, 1.0042014)]]
71 [0.035553534, 1.0038093, [(0.035553537, 1.0038093)]]
72 [0.032236177, 1.0034539, [(0.03223618, 1.0034539)]]
73 [0.029227654, 1.0031315, [(0.029227655, 1.0031315)]]
74 [0.02649951, 1.0028392, [(0.02649951, 1.0028392)]]
75 [0.024025917, 1.0025742, [(0.024025917, 1.0025742)]]
76 [0.021783749, 1.002334, [(0.02178375, 1.002334)]]
77 [0.01975123, 1.0021162, [(0.019751232, 1.0021162)]]
78 [0.017907381, 1.0019187, [(0.017907381, 1.0019187)]]
79 [0.016236702, 1.0017396, [(0.016236704, 1.0017396)]]
80 [0.014720838, 1.0015773, [(0.014720838, 1.0015773)]]
81 [0.01334699, 1.00143, [(0.013346991, 1.00143)]]
82 [0.012100856, 1.0012965, [(0.012100856, 1.0012965)]]
83 [0.010971785, 1.0011755, [(0.010971785, 1.0011755)]]
84 [0.0099481745, 1.0010659, [(0.0099481745, 1.0010659)]]
85 [0.009018898, 1.0009663, [(0.009018898, 1.0009663)]]
86 [0.008176883, 1.0008761, [(0.008176884, 1.0008761)]]
87 [0.007413149, 1.0007943, [(0.007413149, 1.0007943)]]
88 [0.006721576, 1.0007201, [(0.006721576, 1.0007201)]]
89 [0.0060940585, 1.0006529, [(0.0060940585, 1.0006529)]]
90 [0.005525271, 1.000592, [(0.0055252714, 1.000592)]]
91 [0.0050098896, 1.0005368, [(0.0050098896, 1.0005368)]]
92 [0.004542589, 1.0004867, [(0.004542589, 1.0004867)]]
93 [0.0041189194, 1.0004413, [(0.0041189194, 1.0004413)]]
94 [0.0037339528, 1.0004001, [(0.003733953, 1.0004001)]]
95 [0.0033854644, 1.0003628, [(0.0033854644, 1.0003628)]]
96 [0.0030694802, 1.0003289, [(0.0030694804, 1.0003289)]]
97 [0.0027837753, 1.0002983, [(0.0027837753, 1.0002983)]]
98 [0.0025234222, 1.0002704, [(0.0025234222, 1.0002704)]]
99 [0.0022875469, 1.0002451, [(0.0022875469, 1.0002451)]]
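The cell above only inspects the (gradient, variable) pairs; if you actually want to modify the gradients before applying them, a minimal sketch (my own example, not from the original; the clipping threshold is arbitrary) could look like this:

import tensorflow as tf

X = [1, 2, 3]
Y = [1, 2, 3]
W = tf.Variable(5.)

hypothesis = X * W
cost = tf.reduce_mean(tf.square(hypothesis - Y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)

# Get the (gradient, variable) pairs, modify the gradients, then apply them.
gvs = optimizer.compute_gradients(cost)
# Example modification: clip each gradient to [-1, 1] (threshold chosen arbitrarily).
clipped_gvs = [(tf.clip_by_value(grad, -1.0, 1.0), var) for grad, var in gvs]
apply_gradients = optimizer.apply_gradients(clipped_gvs)

# Launch the graph and run the clipped updates.
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for step in range(100):
    sess.run(apply_gradients)
print(sess.run(W))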
