Getting Over 98% on MNIST with Deep Learning

 

This time we will look at some useful tips for working with neural nets.

 

MNIST Softmax!

In [1]:
# Lab 7 Learning rate and Evaluation
import tensorflow as tf
import random
# import matplotlib.pyplot as plt
from tensorflow.examples.tutorials.mnist import input_data
tf.set_random_seed(777)  # reproducibility

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
# Check out https://www.tensorflow.org/get_started/mnist/beginners for
# more information about the mnist dataset

# parameters
learning_rate = 0.001
training_epochs = 15
batch_size = 100

# input place holders
X = tf.placeholder(tf.float32, [None, 784])
Y = tf.placeholder(tf.float32, [None, 10])

# weights & bias for nn layers
W = tf.Variable(tf.random_normal([784, 10]))
b = tf.Variable(tf.random_normal([10]))

hypothesis = tf.matmul(X, W) + b

# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# initialize
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# train my model
for epoch in range(training_epochs):
    avg_cost = 0
    total_batch = int(mnist.train.num_examples / batch_size)

    for i in range(total_batch):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        feed_dict = {X: batch_xs, Y: batch_ys}
        c, _ = sess.run([cost, optimizer], feed_dict=feed_dict)
        avg_cost += c / total_batch

    print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost))

print('Learning Finished!')

# Test model and check accuracy
correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Accuracy:', sess.run(accuracy, feed_dict={
      X: mnist.test.images, Y: mnist.test.labels}))

# Get one and predict
r = random.randint(0, mnist.test.num_examples - 1)
print("Label: ", sess.run(tf.argmax(mnist.test.labels[r:r + 1], 1)))
print("Prediction: ", sess.run(
    tf.argmax(hypothesis, 1), feed_dict={X: mnist.test.images[r:r + 1]}))

# plt.imshow(mnist.test.images[r:r + 1].
#           reshape(28, 28), cmap='Greys', interpolation='nearest')
# plt.show()
 
WARNING:tensorflow:From <ipython-input-1-41fa7b38d9b0>:8: read_data_sets (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.
Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
WARNING:tensorflow:From C:\Users\whanh\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\contrib\learn\python\learn\datasets\mnist.py:260: maybe_download (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Please write your own downloading logic.
WARNING:tensorflow:From C:\Users\whanh\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\contrib\learn\python\learn\datasets\mnist.py:262: extract_images (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting MNIST_data/train-images-idx3-ubyte.gz
WARNING:tensorflow:From C:\Users\whanh\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\contrib\learn\python\learn\datasets\mnist.py:267: extract_labels (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting MNIST_data/train-labels-idx1-ubyte.gz
WARNING:tensorflow:From C:\Users\whanh\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\contrib\learn\python\learn\datasets\mnist.py:110: dense_to_one_hot (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.
Instructions for updating:
Please use tf.one_hot on tensors.
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
WARNING:tensorflow:From C:\Users\whanh\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\contrib\learn\python\learn\datasets\mnist.py:290: DataSet.__init__ (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.
Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
WARNING:tensorflow:From C:\Users\whanh\AppData\Local\Continuum\anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From <ipython-input-1-41fa7b38d9b0>:29: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See `tf.nn.softmax_cross_entropy_with_logits_v2`.

Epoch: 0001 cost = 5.745170995
Epoch: 0002 cost = 1.780056727
Epoch: 0003 cost = 1.122778645
Epoch: 0004 cost = 0.872012251
Epoch: 0005 cost = 0.738203191
Epoch: 0006 cost = 0.654728889
Epoch: 0007 cost = 0.596023612
Epoch: 0008 cost = 0.552216822
Epoch: 0009 cost = 0.518254963
Epoch: 0010 cost = 0.491113201
Epoch: 0011 cost = 0.468347534
Epoch: 0012 cost = 0.449374355
Epoch: 0013 cost = 0.432675662
Epoch: 0014 cost = 0.418828156
Epoch: 0015 cost = 0.406128935
Learning Finished!
Accuracy: 0.9023
Label:  [1]
Prediction:  [1]
 

This is what we did last time in the MNIST data post (https://jfun.tistory.com/169).

hypothesis = tf.matmul(X, W) + b

# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

This code is the model part of the simple network above. It is only three lines, yet the accuracy came out at about 90%.
Let's dig a little deeper into this model part.
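
To make the cost line less of a black box, here is a minimal NumPy sketch of what softmax cross-entropy computes for one-hot labels. It is written in NumPy instead of the TensorFlow 1.x API used in this post so that it runs on its own. The function name softmax_cross_entropy and the example numbers are my own assumptions for illustration, not part of the notebook.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean softmax cross-entropy for one-hot labels (illustrative sketch)."""
    # Subtract the row-wise max for numerical stability; softmax is unchanged.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Keep only the log-probability of the true class, then average over the batch.
    return float(-(labels * log_probs).sum(axis=1).mean())

# Two made-up examples with 3 classes (hypothetical numbers, not MNIST).
logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
labels = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
loss = softmax_cross_entropy(logits, labels)
print("toy loss:", loss)

# All-zero logits mean a uniform guess over 10 classes.
uniform_loss = softmax_cross_entropy(np.zeros((1, 10)), np.eye(10)[:1])
print("uniform guess over 10 classes:", uniform_loss)
```

A handy reference point when reading the epoch logs: a model that guesses uniformly over 10 classes has a cost of ln(10), roughly 2.30, so a cost far above that means the initial logits are confidently wrong.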

 

NN for MNIST

 

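Let's go to 3 layers and add ReLU.
In a NN you have to be careful with the layer sizes: the output size of each layer must match the input size of the next one (784 → 256 → 256 → 10 here).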
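
A quick way to check the layer sizes before building the real graph is to push a few fake inputs through matrices of the same shapes. The sketch below uses NumPy with random weights and zero biases, which is my own simplification; only the sizes 784, 256, 256, 10 come from this post.

```python
import numpy as np

layer_sizes = [784, 256, 256, 10]  # input, two hidden layers, 10 classes
rng = np.random.default_rng(0)
x = rng.random((5, layer_sizes[0]))  # 5 fake flattened 28x28 images

shapes = []
pairs = list(zip(layer_sizes[:-1], layer_sizes[1:]))
for i, (n_in, n_out) in enumerate(pairs):
    W = rng.normal(size=(n_in, n_out))  # rows must equal the incoming width
    b = np.zeros(n_out)
    x = x @ W + b
    if i < len(pairs) - 1:
        x = np.maximum(x, 0.0)  # ReLU on hidden layers only
    shapes.append(x.shape)
    print(f"layer {i + 1}: W {W.shape} -> output {x.shape}")
```

If two neighboring sizes do not line up, the matrix multiplication raises a shape error right away, which is much easier to debug than a failure in the middle of a training loop.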

In [2]:
# input place holders
X = tf.placeholder(tf.float32, [None, 784])
Y = tf.placeholder(tf.float32, [None, 10])

# weights & bias for nn layers
W1 = tf.Variable(tf.random_normal([784, 256]))
b1 = tf.Variable(tf.random_normal([256]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)

W2 = tf.Variable(tf.random_normal([256, 256]))
b2 = tf.Variable(tf.random_normal([256]))
L2 = tf.nn.relu(tf.matmul(L1, W2) + b2)

W3 = tf.Variable(tf.random_normal([256, 10]))
b3 = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L2, W3) + b3

# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
In [3]:
# initialize
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# train my model
for epoch in range(training_epochs):
    avg_cost = 0
    total_batch = int(mnist.train.num_examples / batch_size)

    for i in range(total_batch):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        feed_dict = {X: batch_xs, Y: batch_ys}
        c, _ = sess.run([cost, optimizer], feed_dict=feed_dict)
        avg_cost += c / total_batch

    print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost))

print('Learning Finished!')

# Test model and check accuracy
correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Accuracy:', sess.run(accuracy, feed_dict={
      X: mnist.test.images, Y: mnist.test.labels}))

# Get one and predict
r = random.randint(0, mnist.test.num_examples - 1)
print("Label: ", sess.run(tf.argmax(mnist.test.labels[r:r + 1], 1)))
print("Prediction: ", sess.run(
    tf.argmax(hypothesis, 1), feed_dict={X: mnist.test.images[r:r + 1]}))

# plt.imshow(mnist.test.images[r:r + 1].
#           reshape(28, 28), cmap='Greys', interpolation='nearest')
# plt.show()
 
Epoch: 0001 cost = 166.656716985
Epoch: 0002 cost = 41.038044298
Epoch: 0003 cost = 25.719991985
Epoch: 0004 cost = 17.777964834
Epoch: 0005 cost = 12.983673341
Epoch: 0006 cost = 9.572005866
Epoch: 0007 cost = 7.205640663
Epoch: 0008 cost = 5.499154909
Epoch: 0009 cost = 4.002632276
Epoch: 0010 cost = 3.117470723
Epoch: 0011 cost = 2.328740600
Epoch: 0012 cost = 1.740799948
Epoch: 0013 cost = 1.238861716
Epoch: 0014 cost = 1.043431234
Epoch: 0015 cost = 0.779316331
Learning Finished!
Accuracy: 0.9433
Label:  [3]
Prediction:  [3]
 

A whopping 94%!

 

Last time I mentioned that initialization has to be done well, and one method for that was Xavier.
If you run into something you don't know, go ask Google: 'xavier initialization tensorflow'

 

Xavier for MNIST

In [4]:
# input place holders
X = tf.placeholder(tf.float32, [None, 784])
Y = tf.placeholder(tf.float32, [None, 10])

# weights & bias for nn layers
# http://stackoverflow.com/questions/33640581/how-to-do-xavier-initialization-on-tensorflow
W1 = tf.get_variable("W1", shape=[784, 256],
                     initializer=tf.contrib.layers.xavier_initializer())
b1 = tf.Variable(tf.random_normal([256]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)

W2 = tf.get_variable("W2", shape=[256, 256],
                     initializer=tf.contrib.layers.xavier_initializer())
b2 = tf.Variable(tf.random_normal([256]))
L2 = tf.nn.relu(tf.matmul(L1, W2) + b2)

W3 = tf.get_variable("W3", shape=[256, 10],
                     initializer=tf.contrib.layers.xavier_initializer())
b3 = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L2, W3) + b3

# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
In [5]:
# initialize
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# train my model
for epoch in range(training_epochs):
    avg_cost = 0
    total_batch = int(mnist.train.num_examples / batch_size)

    for i in range(total_batch):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        feed_dict = {X: batch_xs, Y: batch_ys}
        c, _ = sess.run([cost, optimizer], feed_dict=feed_dict)
        avg_cost += c / total_batch

    print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost))

print('Learning Finished!')

# Test model and check accuracy
correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Accuracy:', sess.run(accuracy, feed_dict={
      X: mnist.test.images, Y: mnist.test.labels}))

# Get one and predict
r = random.randint(0, mnist.test.num_examples - 1)
print("Label: ", sess.run(tf.argmax(mnist.test.labels[r:r + 1], 1)))
print("Prediction: ", sess.run(
    tf.argmax(hypothesis, 1), feed_dict={X: mnist.test.images[r:r + 1]}))

# plt.imshow(mnist.test.images[r:r + 1].
#           reshape(28, 28), cmap='Greys', interpolation='nearest')
# plt.show()
 
Epoch: 0001 cost = 0.301935923
Epoch: 0002 cost = 0.116421225
Epoch: 0003 cost = 0.076252542
Epoch: 0004 cost = 0.057026181
Epoch: 0005 cost = 0.039157043
Epoch: 0006 cost = 0.031723986
Epoch: 0007 cost = 0.023718875
Epoch: 0008 cost = 0.020155743
Epoch: 0009 cost = 0.013723808
Epoch: 0010 cost = 0.017393448
Epoch: 0011 cost = 0.015805782
Epoch: 0012 cost = 0.009462772
Epoch: 0013 cost = 0.010965110
Epoch: 0014 cost = 0.007961460
Epoch: 0015 cost = 0.008922998
Learning Finished!
Accuracy: 0.978
Label:  [4]
Prediction:  [4]
 

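Wow, the accuracy came out close to 98%.
The interesting part is that the cost is already very low at epoch 1 (0.30, compared with 166.66 for the random normal version above).
This means the initial values were initialized well. (With the same model, just picking good initial values has a big effect on accuracy.)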
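
To get a feel for why the starting cost differs so much, here is a rough NumPy sketch that compares the size of the initial logits under standard normal weights and under Xavier (Glorot) uniform weights, using the limit sqrt(6 / (fan_in + fan_out)). The helper names, the random stand-in inputs, and the omission of biases are my own assumptions for illustration; this is not the TensorFlow implementation itself.

```python
import numpy as np

def xavier_uniform(rng, fan_in, fan_out):
    # Glorot/Xavier uniform: sample from [-limit, limit].
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def forward_logits(x, weights):
    # ReLU on hidden layers, raw logits at the end (biases left out for brevity).
    for W in weights[:-1]:
        x = np.maximum(x @ W, 0.0)
    return x @ weights[-1]

rng = np.random.default_rng(0)
sizes = [784, 256, 256, 10]
pairs = list(zip(sizes[:-1], sizes[1:]))
x = rng.random((100, 784))  # stand-in for pixel values in [0, 1]

normal_ws = [rng.normal(size=(a, b)) for a, b in pairs]
xavier_ws = [xavier_uniform(rng, a, b) for a, b in pairs]

normal_scale = np.abs(forward_logits(x, normal_ws)).mean()
xavier_scale = np.abs(forward_logits(x, xavier_ws)).mean()
print("mean |logit| with random normal:", normal_scale)
print("mean |logit| with Xavier uniform:", xavier_scale)
```

With standard normal weights the logits blow up layer by layer, so the softmax starts out extremely confident and usually wrong, which shows up as a huge first-epoch cost. Xavier keeps the logits at a moderate scale, so training starts from a cost near that of a uniform guess.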

 

Deep NN for MNIST

 

Let's widen the layers from 256 to 512, and go deeper with 5 layers.

In [6]:
# input place holders
X = tf.placeholder(tf.float32, [None, 784])
Y = tf.placeholder(tf.float32, [None, 10])

# weights & bias for nn layers
# http://stackoverflow.com/questions/33640581/how-to-do-xavier-initialization-on-tensorflow
W1 = tf.get_variable("W1_", shape=[784, 512],
                     initializer=tf.contrib.layers.xavier_initializer())
b1 = tf.Variable(tf.random_normal([512]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)

W2 = tf.get_variable("W2_", shape=[512, 512],
                     initializer=tf.contrib.layers.xavier_initializer())
b2 = tf.Variable(tf.random_normal([512]))
L2 = tf.nn.relu(tf.matmul(L1, W2) + b2)

W3 = tf.get_variable("W3_", shape=[512, 512],
                     initializer=tf.contrib.layers.xavier_initializer())
b3 = tf.Variable(tf.random_normal([512]))
L3 = tf.nn.relu(tf.matmul(L2, W3) + b3)

W4 = tf.get_variable("W4_", shape=[512, 512],
                     initializer=tf.contrib.layers.xavier_initializer())
b4 = tf.Variable(tf.random_normal([512]))
L4 = tf.nn.relu(tf.matmul(L3, W4) + b4)

W5 = tf.get_variable("W5_", shape=[512, 10],
                     initializer=tf.contrib.layers.xavier_initializer())
b5 = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L4, W5) + b5
In [7]:
# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# initialize
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# train my model
for epoch in range(training_epochs):
    avg_cost = 0
    total_batch = int(mnist.train.num_examples / batch_size)

    for i in range(total_batch):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        feed_dict = {X: batch_xs, Y: batch_ys}
        c, _ = sess.run([cost, optimizer], feed_dict=feed_dict)
        avg_cost += c / total_batch

    print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost))

print('Learning Finished!')

# Test model and check accuracy
correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Accuracy:', sess.run(accuracy, feed_dict={
      X: mnist.test.images, Y: mnist.test.labels}))

# Get one and predict
r = random.randint(0, mnist.test.num_examples - 1)
print("Label: ", sess.run(tf.argmax(mnist.test.labels[r:r + 1], 1)))
print("Prediction: ", sess.run(
    tf.argmax(hypothesis, 1), feed_dict={X: mnist.test.images[r:r + 1]}))

# plt.imshow(mnist.test.images[r:r + 1].
#           reshape(28, 28), cmap='Greys', interpolation='nearest')
# plt.show()
 
Epoch: 0001 cost = 0.300793265
Epoch: 0002 cost = 0.103306956
Epoch: 0003 cost = 0.070477889
Epoch: 0004 cost = 0.052671541
Epoch: 0005 cost = 0.039592809
Epoch: 0006 cost = 0.035387814
Epoch: 0007 cost = 0.030010276
Epoch: 0008 cost = 0.025740681
Epoch: 0009 cost = 0.022673877
Epoch: 0010 cost = 0.019972242
Epoch: 0011 cost = 0.018704831
Epoch: 0012 cost = 0.017537554
Epoch: 0013 cost = 0.015988760
Epoch: 0014 cost = 0.015692382
Epoch: 0015 cost = 0.016034859
Learning Finished!
Accuracy: 0.9798
Label:  [6]
Prediction:  [6]
 

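We stacked the network wider and deeper, and the accuracy dropped by 0.004.
That is the story this was supposed to tell, but here it actually came out higher (0.9798 vs. 0.978).
If it had dropped, why would that happen?
It differs from dataset to dataset, but here the answer would be overfitting.
One of the ways to prevent overfitting is dropout.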

 

Dropout for MNIST

 

In TensorFlow you just add one more layer called dropout.
Add a dropout layer after L1, another one after L2, and so on.
keep_prob sets the fraction of units to keep (make sure to set it to 1 at test time!).
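
Here is a small NumPy sketch of what a dropout layer does with keep_prob, assuming the usual "inverted dropout" scheme in which the kept units are scaled by 1 / keep_prob. The function below is my own illustration, not the TensorFlow source.

```python
import numpy as np

def dropout(x, keep_prob, rng):
    """Inverted dropout: zero out units at random and rescale the survivors."""
    if keep_prob >= 1.0:
        return x  # test time: keep everything, no scaling needed
    mask = rng.random(x.shape) < keep_prob  # True with probability keep_prob
    return x * mask / keep_prob

rng = np.random.default_rng(0)
activations = np.ones((1000, 512))

train_out = dropout(activations, 0.7, rng)  # training: keep about 70%
test_out = dropout(activations, 1.0, rng)   # testing: keep_prob = 1

kept_fraction = (train_out != 0).mean()
print("fraction kept:", kept_fraction)
print("mean activation while training:", train_out.mean())
```

Because the survivors are scaled up during training, the expected activation stays the same, and that is why simply feeding keep_prob as 1 at test time works without any other change.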

In [10]:
# dropout (keep_prob) rate  0.7 on training, but should be 1 for testing
keep_prob = tf.placeholder(tf.float32)

# weights & bias for nn layers
# http://stackoverflow.com/questions/33640581/how-to-do-xavier-initialization-on-tensorflow
W1 = tf.get_variable("W1_1", shape=[784, 512],
                     initializer=tf.contrib.layers.xavier_initializer())
b1 = tf.Variable(tf.random_normal([512]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)
L1 = tf.nn.dropout(L1, keep_prob=keep_prob)

W2 = tf.get_variable("W2_1", shape=[512, 512],
                     initializer=tf.contrib.layers.xavier_initializer())
b2 = tf.Variable(tf.random_normal([512]))
L2 = tf.nn.relu(tf.matmul(L1, W2) + b2)
L2 = tf.nn.dropout(L2, keep_prob=keep_prob)

W3 = tf.get_variable("W3_1", shape=[512, 512],
                     initializer=tf.contrib.layers.xavier_initializer())
b3 = tf.Variable(tf.random_normal([512]))
L3 = tf.nn.relu(tf.matmul(L2, W3) + b3)
L3 = tf.nn.dropout(L3, keep_prob=keep_prob)

W4 = tf.get_variable("W4_1", shape=[512, 512],
                     initializer=tf.contrib.layers.xavier_initializer())
b4 = tf.Variable(tf.random_normal([512]))
L4 = tf.nn.relu(tf.matmul(L3, W4) + b4)
L4 = tf.nn.dropout(L4, keep_prob=keep_prob)

W5 = tf.get_variable("W5_1", shape=[512, 10],
                     initializer=tf.contrib.layers.xavier_initializer())
b5 = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L4, W5) + b5
 
WARNING:tensorflow:From <ipython-input-10-4b39660c2b49>:10: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
In [11]:
# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# initialize
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# train my model
for epoch in range(training_epochs):
    avg_cost = 0
    total_batch = int(mnist.train.num_examples / batch_size)

    for i in range(total_batch):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        feed_dict = {X: batch_xs, Y: batch_ys, keep_prob: 0.7}
        c, _ = sess.run([cost, optimizer], feed_dict=feed_dict)
        avg_cost += c / total_batch

    print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost))

print('Learning Finished!')

# Test model and check accuracy
correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Accuracy:', sess.run(accuracy, feed_dict={
      X: mnist.test.images, Y: mnist.test.labels, keep_prob: 1}))

# Get one and predict
r = random.randint(0, mnist.test.num_examples - 1)
print("Label: ", sess.run(tf.argmax(mnist.test.labels[r:r + 1], 1)))
print("Prediction: ", sess.run(
    tf.argmax(hypothesis, 1), feed_dict={X: mnist.test.images[r:r + 1], keep_prob: 1}))

# plt.imshow(mnist.test.images[r:r + 1].
#           reshape(28, 28), cmap='Greys', interpolation='nearest')
# plt.show()
 
Epoch: 0001 cost = 0.479064576
Epoch: 0002 cost = 0.169453053
Epoch: 0003 cost = 0.129123473
Epoch: 0004 cost = 0.105926294
Epoch: 0005 cost = 0.092658146
Epoch: 0006 cost = 0.080533782
Epoch: 0007 cost = 0.074052478
Epoch: 0008 cost = 0.066805487
Epoch: 0009 cost = 0.062837852
Epoch: 0010 cost = 0.057148129
Epoch: 0011 cost = 0.054961414
Epoch: 0012 cost = 0.054889232
Epoch: 0013 cost = 0.048306230
Epoch: 0014 cost = 0.047772515
Epoch: 0015 cost = 0.043271988
Learning Finished!
Accuracy: 0.9809
Label:  [3]
Prediction:  [3]
 

We passed 98%. A great result!

 

Optimizers

 

train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)

  • tf.train.AdadeltaOptimizer
  • tf.train.AdagradOptimizer
  • tf.train.AdagradDAOptimizer
  • tf.train.MomentumOptimizer
  • tf.train.AdamOptimizer
  • tf.train.FtrlOptimizer
  • tf.train.ProximalGradientDescentOptimizer
  • tf.train.ProximalAdagradOptimizer
  • tf.train.RMSPropOptimizer
 

There are many kinds of optimizers, so test which one trains best on your problem.
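
To show what "trying different optimizers" means in practice, here is a toy NumPy sketch that runs plain gradient descent and the Adam update rule on the same small quadratic. The loss function, starting point, step count, and learning rate are all assumptions I picked for illustration. Adam's default constants (beta1 0.9, beta2 0.999, eps 1e-8) follow the values commonly used for it.

```python
import numpy as np

def loss(p):
    # Toy quadratic bowl that is much steeper along the second axis.
    return 0.5 * (p[0] ** 2 + 100.0 * p[1] ** 2)

def grad(p):
    return np.array([p[0], 100.0 * p[1]])

def gradient_descent(p, steps, lr):
    for _ in range(steps):
        p = p - lr * grad(p)
    return p

def adam(p, steps, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    m = np.zeros_like(p)  # running mean of gradients
    v = np.zeros_like(p)  # running mean of squared gradients
    for t in range(1, steps + 1):
        g = grad(p)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat = m / (1 - beta1 ** t)  # bias correction for the early steps
        v_hat = v / (1 - beta2 ** t)
        p = p - lr * m_hat / (np.sqrt(v_hat) + eps)
    return p

start = np.array([1.0, 1.0])
gd_loss = loss(gradient_descent(start, steps=300, lr=0.01))
adam_loss = loss(adam(start, steps=300, lr=0.01))
print("start loss:", loss(start))
print("gradient descent loss:", gd_loss)
print("adam loss:", adam_loss)
```

Which one ends up lower depends on the problem and the learning rate, so the numbers from this toy example say nothing about MNIST. The point is only that the training loop stays the same and just the update rule is swapped.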

 

http://www.denizyuret.com/2015/03/alec-radfords-animations-for.html
There you can compare how the different optimizers behave.

In [12]:
from PIL import Image
Image.open('adam.png')
Out[12]:
[image: adam.png]

The faster the cost goes down, the better the optimizer suits that data.
As you can see, the optimizer called adam goes down quickly, and it is widely used, so it is worth knowing.

 

Use Adam Optimizer

 

You can use it by changing only the name from gradient descent.

 
# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
 

Exercise: Batch Normalization

 

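Batch normalization is a way of keeping the inputs to each layer well normalized, and many people use it. Take a look at it and practice to see whether you can push the performance above 98%.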
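
As a starting point for the exercise, here is a minimal NumPy sketch of the training-time forward pass of batch normalization: each feature is normalized over the batch, then scaled by gamma and shifted by beta. The function name, the eps value, and the fake data are my own assumptions. A real layer also learns gamma and beta and tracks running statistics for test time, which this sketch leaves out.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
# Fake pre-activations: 100 examples, 4 features, far from zero mean / unit variance.
pre_activations = rng.normal(loc=5.0, scale=3.0, size=(100, 4))

out = batch_norm(pre_activations, gamma=np.ones(4), beta=np.zeros(4))
print("mean per feature:", out.mean(axis=0))
print("std per feature:", out.std(axis=0))
```

In the networks above, a layer like this would typically sit between the matrix multiplication and the ReLU of each hidden layer.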
