
Chapter 2_08_SVM

Kernel Support Vector Machine (Kernel SVM)

  • Earlier we learned about the linear SVM. The linear SVM finds a flat hyperplane that makes the margin between the classes as wide as possible. In many cases, however, the classes cannot be separated by a flat hyperplane alone.
  • Kernel methods add features in various ways, or apply polynomials or more complex curved functions, so that a curved surface separates the classes. (A quick sketch follows right after this list.)
  • Below we look at data shaped like concentric circles.
  • The core of the kernel approach is adding new features one after another.
  • Below is the case of adding second-degree polynomial features.
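
Before building this up step by step, here is a minimal sketch of the claim above (this cell is not part of the original walk-through; SVC with the default rbf kernel and random_state=0 are illustrative choices): on concentric-circle data, a linear SVM barely beats chance, while a kernel SVM separates the classes almost perfectly.

from sklearn.datasets import make_circles
from sklearn.svm import LinearSVC, SVC

X_demo, y_demo = make_circles(factor=0.5, noise=0.1, random_state=0)

linear_score = LinearSVC().fit(X_demo, y_demo).score(X_demo, y_demo)        # near 0.5 (chance level)
kernel_score = SVC(kernel='rbf').fit(X_demo, y_demo).score(X_demo, y_demo)  # near 1.0
print(linear_score, kernel_score)
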
In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
In [3]:
from sklearn.datasets import make_circles

X, y = make_circles(factor=0.5, noise=0.1) # factor = R1/R2, noise: std of Gaussian noise
print(X.shape, y)

plt.scatter(X[:,0], X[:,1], c=y, cmap='autumn')
plt.colorbar(shrink=0.7)
(100, 2) [1 0 0 1 0 0 1 1 1 1 0 1 0 1 1 1 0 1 1 1 1 0 1 1 1 0 1 0 1 0 0 0 0 0 0 1 0
 1 0 1 0 1 0 1 1 1 1 0 1 0 0 1 1 0 0 0 0 1 0 0 0 1 0 1 1 0 1 0 0 1 0 0 0 0
 0 1 1 0 1 1 1 0 0 0 0 0 1 0 1 1 0 0 1 1 0 1 0 1 1 1]
Out[3]:
<matplotlib.colorbar.Colorbar at 0x2577090c438>
  • In the figure above, no straight line can separate the two classes.
  • Looking closely, however, the classes can be distinguished by their distance from the center (think polar coordinates: distance from the center and the central angle).
  • Let's add one feature of the form $ x_1^2 + x_2^2 $.
In [4]:
new_col = X[:,0]**2 + X[:,1]**2
X_new = np.c_[X,new_col]

display(X_new.shape, X_new[:5])
(100, 3)
array([[ 0.65460121, -0.17769936,  0.4600798 ],
       [ 0.76123759, -0.47921252,  0.8091273 ],
       [ 0.27375542,  0.90683165,  0.89728568],
       [-0.31411151,  0.32778867,  0.20611146],
       [ 0.724963  ,  0.81727021,  1.19350195]])
In [5]:
plt.scatter(new_col, y, c=y, alpha=0.3)
plt.vlines([new_col[y==1].max(), new_col[y==0].min()], 0, 1, linestyles='dotted')
Out[5]:
<matplotlib.collections.LineCollection at 0x25770963c18>
In [6]:
from sklearn.svm import LinearSVC

model = LinearSVC()
model.fit(X, y)

score = model.score(X,y)
display(score)
0.52
In [7]:
from sklearn.svm import LinearSVC

model = LinearSVC()
model.fit(new_col.reshape(-1,1), y)

score = model.score(new_col.reshape(-1,1),y)
display(score)
1.0
In [8]:
model.coef_, model.intercept_
Out[8]:
(array([[-3.06766185]]), array([1.78113757]))
In [9]:
model.coef_[0,0] / model.intercept_[0]
Out[9]:
-1.722304839152741
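
A brief interpretation sketch (assuming model is the single-feature LinearSVC fitted just above; the exact numbers depend on the random data): the decision boundary on the new feature z = x1² + x2² is where coef·z + intercept = 0, i.e. z = -intercept/coef, which should fall in the gap between the two dotted lines drawn earlier.

threshold = -model.intercept_[0] / model.coef_[0, 0]  # boundary value of x1^2 + x2^2, about 0.58 here
print(threshold)
print(new_col[y == 1].max(), new_col[y == 0].min())   # the threshold lies between these two values
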
In [10]:
from sklearn.svm import LinearSVC

model = LinearSVC()
model.fit(X_new, y)

score = model.score(X_new,y)
display(score)
1.0
In [11]:
model.coef_
Out[11]:
array([[-0.02879781,  0.10546302, -3.07067687]])
In [12]:
model.intercept_
Out[12]:
array([1.78267497])
In [13]:
scale = 300
xmax = X[:,0].max()+0.1
xmin = X[:,0].min()-.1
ymax = X[:,1].max()+.1
ymin = X[:,1].min()-.1

xx = np.linspace(xmin,xmax,scale)
yy = np.linspace(ymin,ymax,scale)
data1, data2 = np.meshgrid(xx,yy)
X_grid = np.c_[data1.ravel(), data2.ravel()]
X_grid = np.c_[X_grid, X_grid[:,0]**2 + X_grid[:,1]**2]

pred_y = model.predict(X_grid)

CS = plt.imshow(pred_y.reshape(scale,scale), interpolation=None, origin='lower',
                extent=[xmin,xmax,ymin,ymax], alpha=0.3, cmap='gray_r')

# draw X_train
plt.scatter(X[:,0], X[:,1], c=y, s=60, cmap='autumn')

plt.colorbar(CS)
Out[13]:
<matplotlib.colorbar.Colorbar at 0x25770a51198>
In [14]:
X, y = make_circles(factor=0.5, noise=0.1) # factor = R1/R2, noise: std of Gaussian noise
print(X.shape, y)

X = X * [1,0.5]
X = X + 1

plt.scatter(X[:,0], X[:,1], c=y, cmap='autumn')
plt.colorbar(shrink=0.7)
plt.vlines([1],-0,2,linestyles='dotted')
plt.hlines([1],-0,2,linestyles='dotted')
(100, 2) [0 0 0 1 1 1 0 0 0 1 1 1 1 0 0 1 1 1 1 1 1 1 0 1 0 0 0 0 0 0 1 1 0 1 0 0 0
 1 1 0 1 1 1 0 1 0 0 0 1 1 0 1 1 1 1 1 0 0 0 0 0 1 0 1 1 1 0 1 0 1 1 0 1 1
 1 1 0 0 0 1 1 1 0 0 0 0 1 0 0 1 0 0 0 0 1 1 0 0 0 1]
Out[14]:
<matplotlib.collections.LineCollection at 0x25770af5978>

Because x² and y² need different weights here, each must be added as a separate feature.

In [15]:
X_new = np.c_[X, X[:,0]**2, X[:,1]**2] # x, y, x^2, y^2

model = LinearSVC(C=10)
model.fit(X_new, y)

score = model.score(X_new,y)
display(score)
1.0
In [16]:
model.coef_, model.intercept_
Out[16]:
(array([[ 4.4937658 , 13.12436104, -2.26571607, -6.64332812]]),
 array([-7.52422003]))
In [17]:
model.coef_/model.coef_[0,2], model.intercept_/model.coef_[0,2]
Out[17]:
(array([[-1.98337552, -5.79258857,  1.        ,  2.93210972]]),
 array([3.32090155]))
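
A rough check of the normalized coefficients above (a sketch assuming model is the LinearSVC fitted on [x, y, x², y²] in the cell above): completing the square on a·x + b·y + x² + c·y² + d = 0 recovers the center of the boundary ellipse, which should be near (1, 1) since the data was shifted by +1.

a, b, _, c = (model.coef_ / model.coef_[0, 2])[0]
d = (model.intercept_ / model.coef_[0, 2])[0]
center = (-a / 2, -b / (2 * c))
size = (a / 2) ** 2 + b ** 2 / (4 * c) - d   # boundary satisfies (x-cx)^2 + c*(y-cy)^2 = size
print(center, size)                          # center close to (1, 1); size about 0.5 with the coefficients above
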
In [18]:
scale = 300
xmax = X_new[:,0].max()+0.1
xmin = X_new[:,0].min()-.1
ymax = X_new[:,1].max()+.1
ymin = X_new[:,1].min()-.1

xx = np.linspace(xmin,xmax,scale)
yy = np.linspace(ymin,ymax,scale)
data1, data2 = np.meshgrid(xx,yy)
X_grid = np.c_[data1.ravel(), data2.ravel()]
X_grid = np.c_[X_grid, X_grid[:,0]**2, X_grid[:,1]**2]

pred_y = model.predict(X_grid)

CS = plt.imshow(pred_y.reshape(scale,scale), interpolation=None, origin='lower',
                extent=[xmin,xmax,ymin,ymax], alpha=0.3, cmap='gray_r')

# draw X_train
plt.scatter(X[:,0], X[:,1], c=y, s=60, cmap='autumn')

plt.colorbar(CS)
Out[18]:
<matplotlib.colorbar.Colorbar at 0x25770b88c18>
In [19]:
X, y = make_circles(factor=0.5, noise=0.1) # factor = R1/R2, noise: std of Gaussian noise
print(X.shape, y)

X = X * [1,0.5]
v = 2**(-0.5)
X = np.dot(X, np.array([[v,v],[-v,v]]))

plt.scatter(X[:,0], X[:,1], c=y, cmap='autumn')
plt.colorbar(shrink=0.7)
plt.vlines([0],-1,1,linestyles='dotted')
plt.hlines([0],-1,1,linestyles='dotted')
(100, 2) [0 0 1 1 0 0 0 1 0 0 1 1 1 0 0 1 0 1 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 1 1 1 1
 0 0 1 0 1 1 1 0 1 0 1 1 0 0 0 1 0 1 1 1 0 0 1 0 0 1 1 1 0 1 0 1 1 1 1 0 1
 0 1 0 1 0 1 0 0 0 0 1 1 0 1 1 0 1 0 0 1 1 0 0 1 1 1]
Out[19]:
<matplotlib.collections.LineCollection at 0x25770c2ea20>

The xy feature is added. Alternatively, you could rotate the data back to align it with the axes, rescale it into a circle, and then solve as before (a sketch of this alternative follows below).
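
A sketch of that alternative (assuming X, y and v from the cell above): the rotation matrix is orthogonal, so multiplying by its transpose undoes the rotation; dividing by [1, 0.5] undoes the scaling; a single r² feature is then enough.

R = np.array([[v, v], [-v, v]])      # the rotation used above
X_back = np.dot(X, R.T)              # undo the rotation (R is orthogonal)
X_back = X_back / [1, 0.5]           # undo the axis scaling -> concentric circles again
r2 = (X_back ** 2).sum(axis=1)       # squared distance from the center

model_alt = LinearSVC()
model_alt.fit(r2.reshape(-1, 1), y)
print(model_alt.score(r2.reshape(-1, 1), y))  # should be close to 1.0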

In [20]:
X_new = np.c_[X, X[:,0]*X[:,1], X[:,0]**2, X[:,1]**2] # x, y, xy, x^2, y^2

model = LinearSVC(C=10)
model.fit(X_new, y)

score = model.score(X_new,y)
display(score)
0.99
In [21]:
model.coef_, model.intercept_
Out[21]:
(array([[ 0.15368078,  0.06687668,  6.85713566, -5.82627634, -7.0758161 ]]),
 array([1.65904614]))
In [22]:
scale = 300
xmax = X_new[:,0].max()+0.1
xmin = X_new[:,0].min()-.1
ymax = X_new[:,1].max()+.1
ymin = X_new[:,1].min()-.1

xx = np.linspace(xmin,xmax,scale)
yy = np.linspace(ymin,ymax,scale)
data1, data2 = np.meshgrid(xx,yy)
X_grid = np.c_[data1.ravel(), data2.ravel()]
X_grid = np.c_[X_grid, X_grid[:,0]*X_grid[:,1], X_grid[:,0]**2, X_grid[:,1]**2]

pred_y = model.predict(X_grid)

CS = plt.imshow(pred_y.reshape(scale,scale), interpolation=None, origin='lower',
                extent=[xmin,xmax,ymin,ymax], alpha=0.3, cmap='Greens')

plt.vlines([0],-1,1,linestyles='dotted')
plt.hlines([0],-1,1,linestyles='dotted')
# draw X_train
plt.scatter(X[:,0], X[:,1], c=y, s=60, cmap='autumn')

plt.colorbar(CS)
Out[22]:
<matplotlib.colorbar.Colorbar at 0x25770cbcf28>
  • Let's take the example we covered with the linear SVM and apply a kernel SVM to it.
In [23]:
from sklearn.datasets import load_iris
from sklearn.svm import SVC

iris = load_iris()

col1 = 0
col2 = 1

X = iris.data[:,[col1,col2]] # select only two features, for visualization
y = iris.target

# split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y)
X_train.shape, X_test.shape, y_train.shape, y_test.shape

# define the model
model = SVC()

# train the model
model.fit(X_train, y_train)

# evaluate
score1 = model.score(X_train, y_train)
score2 = model.score(X_test, y_test)
display(score1, score2)
0.8214285714285714
0.7631578947368421
In [24]:
import mglearn

plt.figure(figsize=[10,8])
mglearn.plots.plot_2d_classification(model, X_train, cm='spring')
mglearn.discrete_scatter(X_train[:,0], X_train[:,1], y_train)
Out[24]:
[<matplotlib.lines.Line2D at 0x2577175b710>,
 <matplotlib.lines.Line2D at 0x2577175b828>,
 <matplotlib.lines.Line2D at 0x2577175bcc0>]

Let's vary the C value.

In [25]:
# define the model
model = SVC(C=0.05) # the plot changes sensitively with the C value

# train the model
model.fit(X_train, y_train)

# evaluate
score = model.score(X_test, y_test)
display(score)

plt.figure(figsize=[10,8])
mglearn.plots.plot_2d_classification(model, X_train, eps=0.5, cm='spring')
mglearn.discrete_scatter(X_test[:,0], X_test[:,1], y_test)
0.5789473684210527
Out[25]:
[<matplotlib.lines.Line2D at 0x2577179e048>,
 <matplotlib.lines.Line2D at 0x2577179e160>,
 <matplotlib.lines.Line2D at 0x2577179e668>]

Let's vary the gamma value.

In [26]:
# define the model
model = SVC(gamma=10)

# train the model
model.fit(X_train, y_train)

# evaluate
score = model.score(X_test, y_test)
display(score)

plt.figure(figsize=[10,8])
mglearn.plots.plot_2d_classification(model, X_train, eps=0.5, cm='spring')
mglearn.discrete_scatter(X_test[:,0], X_test[:,1], y_test)
0.7894736842105263
Out[26]:
[<matplotlib.lines.Line2D at 0x257718ea518>,
 <matplotlib.lines.Line2D at 0x257718ea630>,
 <matplotlib.lines.Line2D at 0x257718eab38>]
In [27]:
# define the model
model = SVC(gamma=100) # as gamma increases the model overfits; the boundary is cut into sharp, jagged pieces

# train the model
model.fit(X_train, y_train)

# evaluate
score = model.score(X_test, y_test)
display(score)

plt.figure(figsize=[10,8])
mglearn.plots.plot_2d_classification(model, X_train, eps=0.5, cm='spring')
mglearn.discrete_scatter(X_test[:,0], X_test[:,1], y_test)
0.6842105263157895
Out[27]:
[<matplotlib.lines.Line2D at 0x25771a589b0>,
 <matplotlib.lines.Line2D at 0x25771a58ac8>,
 <matplotlib.lines.Line2D at 0x25771a58fd0>]
In [5]:
scale = 300
xmax = X_train[:,0].max()+1
xmin = X_train[:,0].min()-1
ymax = X_train[:,1].max()+1
ymin = X_train[:,1].min()-1

xx = np.linspace(xmin,xmax,scale)
yy = np.linspace(ymin,ymax,scale)
data1, data2 = np.meshgrid(xx,yy)
X_grid = np.c_[data1.ravel(), data2.ravel()]
pred_y = model.predict(X_grid)

fig=plt.figure(figsize=[12,10])

CS = plt.imshow(pred_y.reshape(scale,scale), interpolation=None, origin='lower',
                extent=[xmin,xmax,ymin,ymax], alpha=0.3, cmap='gray_r')

# draw X_train
plt.scatter(X_train[:,0], X_train[:,1], c=y_train, s=60)

plt.xlabel(iris.feature_names[col1])
plt.ylabel(iris.feature_names[col2])
plt.colorbar(CS, shrink=0.3)
plt.title('Kernel SVC - Iris',fontsize=20)
Out[5]:
Text(0.5,1,'Kernel SVC - Iris')
  • Let's define a function that draws the resulting decision map, and use it to display the test set.
In [24]:
def draw_result_map(X, y, model, cols=['',''], title=''):
    scale = 300
    xmax = X[:,0].max()+1
    xmin = X[:,0].min()-1
    ymax = X[:,1].max()+1
    ymin = X[:,1].min()-1

    xx = np.linspace(xmin,xmax,scale)
    yy = np.linspace(ymin,ymax,scale)
    data1, data2 = np.meshgrid(xx,yy)
    X_grid = np.c_[data1.ravel(), data2.ravel()]
    
    pred_y = model.predict(X_grid)

    fig=plt.figure(figsize=[12,10])

    CS = plt.imshow(pred_y.reshape(scale,scale), interpolation=None, origin='lower',
                    extent=[xmin,xmax,ymin,ymax], alpha=0.3, cmap='gray_r')

    # draw X
    plt.scatter(X[:,0], X[:,1], c=y, s=60)

    plt.xlabel(cols[0])
    plt.ylabel(cols[1])
    plt.colorbar(CS, shrink=0.3)
    plt.title(title,fontsize=20)
    
draw_result_map(X_test, y_test, model, [iris.feature_names[col1],iris.feature_names[col2]],'Kernel SVC - Iris')
In [20]:
model
Out[20]:
SVC(C=1, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma=10, kernel='rbf',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False)
In [21]:
help(model)
Help on SVC in module sklearn.svm.classes object:

class SVC(sklearn.svm.base.BaseSVC)
 |  SVC(C=1.0, kernel='rbf', degree=3, gamma='auto', coef0=0.0, shrinking=True, probability=False, tol=0.001, cache_size=200, class_weight=None, verbose=False, max_iter=-1, decision_function_shape='ovr', random_state=None)
 |  
 |  C-Support Vector Classification.
 |  
 |  The implementation is based on libsvm. The fit time complexity
 |  is more than quadratic with the number of samples which makes it hard
 |  to scale to dataset with more than a couple of 10000 samples.
 |  
 |  The multiclass support is handled according to a one-vs-one scheme.
 |  
 |  For details on the precise mathematical formulation of the provided
 |  kernel functions and how `gamma`, `coef0` and `degree` affect each
 |  other, see the corresponding section in the narrative documentation:
 |  :ref:`svm_kernels`.
 |  
 |  Read more in the :ref:`User Guide <svm_classification>`.
 |  
 |  Parameters
 |  ----------
 |  C : float, optional (default=1.0)
 |      Penalty parameter C of the error term.
 |  
 |  kernel : string, optional (default='rbf')
 |       Specifies the kernel type to be used in the algorithm.
 |       It must be one of 'linear', 'poly', 'rbf', 'sigmoid', 'precomputed' or
 |       a callable.
 |       If none is given, 'rbf' will be used. If a callable is given it is
 |       used to pre-compute the kernel matrix from data matrices; that matrix
 |       should be an array of shape ``(n_samples, n_samples)``.
 |  
 |  degree : int, optional (default=3)
 |      Degree of the polynomial kernel function ('poly').
 |      Ignored by all other kernels.
 |  
 |  gamma : float, optional (default='auto')
 |      Kernel coefficient for 'rbf', 'poly' and 'sigmoid'.
 |      If gamma is 'auto' then 1/n_features will be used instead.
 |  
 |  coef0 : float, optional (default=0.0)
 |      Independent term in kernel function.
 |      It is only significant in 'poly' and 'sigmoid'.
 |  
 |  probability : boolean, optional (default=False)
 |      Whether to enable probability estimates. This must be enabled prior
 |      to calling `fit`, and will slow down that method.
 |  
 |  shrinking : boolean, optional (default=True)
 |      Whether to use the shrinking heuristic.
 |  
 |  tol : float, optional (default=1e-3)
 |      Tolerance for stopping criterion.
 |  
 |  cache_size : float, optional
 |      Specify the size of the kernel cache (in MB).
 |  
 |  class_weight : {dict, 'balanced'}, optional
 |      Set the parameter C of class i to class_weight[i]*C for
 |      SVC. If not given, all classes are supposed to have
 |      weight one.
 |      The "balanced" mode uses the values of y to automatically adjust
 |      weights inversely proportional to class frequencies in the input data
 |      as ``n_samples / (n_classes * np.bincount(y))``
 |  
 |  verbose : bool, default: False
 |      Enable verbose output. Note that this setting takes advantage of a
 |      per-process runtime setting in libsvm that, if enabled, may not work
 |      properly in a multithreaded context.
 |  
 |  max_iter : int, optional (default=-1)
 |      Hard limit on iterations within solver, or -1 for no limit.
 |  
 |  decision_function_shape : 'ovo', 'ovr', default='ovr'
 |      Whether to return a one-vs-rest ('ovr') decision function of shape
 |      (n_samples, n_classes) as all other classifiers, or the original
 |      one-vs-one ('ovo') decision function of libsvm which has shape
 |      (n_samples, n_classes * (n_classes - 1) / 2).
 |  
 |      .. versionchanged:: 0.19
 |          decision_function_shape is 'ovr' by default.
 |  
 |      .. versionadded:: 0.17
 |         *decision_function_shape='ovr'* is recommended.
 |  
 |      .. versionchanged:: 0.17
 |         Deprecated *decision_function_shape='ovo' and None*.
 |  
 |  random_state : int, RandomState instance or None, optional (default=None)
 |      The seed of the pseudo random number generator to use when shuffling
 |      the data.  If int, random_state is the seed used by the random number
 |      generator; If RandomState instance, random_state is the random number
 |      generator; If None, the random number generator is the RandomState
 |      instance used by `np.random`.
 |  
 |  Attributes
 |  ----------
 |  support_ : array-like, shape = [n_SV]
 |      Indices of support vectors.
 |  
 |  support_vectors_ : array-like, shape = [n_SV, n_features]
 |      Support vectors.
 |  
 |  n_support_ : array-like, dtype=int32, shape = [n_class]
 |      Number of support vectors for each class.
 |  
 |  dual_coef_ : array, shape = [n_class-1, n_SV]
 |      Coefficients of the support vector in the decision function.
 |      For multiclass, coefficient for all 1-vs-1 classifiers.
 |      The layout of the coefficients in the multiclass case is somewhat
 |      non-trivial. See the section about multi-class classification in the
 |      SVM section of the User Guide for details.
 |  
 |  coef_ : array, shape = [n_class-1, n_features]
 |      Weights assigned to the features (coefficients in the primal
 |      problem). This is only available in the case of a linear kernel.
 |  
 |      `coef_` is a readonly property derived from `dual_coef_` and
 |      `support_vectors_`.
 |  
 |  intercept_ : array, shape = [n_class * (n_class-1) / 2]
 |      Constants in decision function.
 |  
 |  Examples
 |  --------
 |  >>> import numpy as np
 |  >>> X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
 |  >>> y = np.array([1, 1, 2, 2])
 |  >>> from sklearn.svm import SVC
 |  >>> clf = SVC()
 |  >>> clf.fit(X, y) #doctest: +NORMALIZE_WHITESPACE
 |  SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
 |      decision_function_shape='ovr', degree=3, gamma='auto', kernel='rbf',
 |      max_iter=-1, probability=False, random_state=None, shrinking=True,
 |      tol=0.001, verbose=False)
 |  >>> print(clf.predict([[-0.8, -1]]))
 |  [1]
 |  
 |  See also
 |  --------
 |  SVR
 |      Support Vector Machine for Regression implemented using libsvm.
 |  
 |  LinearSVC
 |      Scalable Linear Support Vector Machine for classification
 |      implemented using liblinear. Check the See also section of
 |      LinearSVC for more comparison element.
 |  
 |  Method resolution order:
 |      SVC
 |      sklearn.svm.base.BaseSVC
 |      abc.NewBase
 |      sklearn.svm.base.BaseLibSVM
 |      abc.NewBase
 |      sklearn.base.BaseEstimator
 |      sklearn.base.ClassifierMixin
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __init__(self, C=1.0, kernel='rbf', degree=3, gamma='auto', coef0=0.0, shrinking=True, probability=False, tol=0.001, cache_size=200, class_weight=None, verbose=False, max_iter=-1, decision_function_shape='ovr', random_state=None)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  __abstractmethods__ = frozenset()
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from sklearn.svm.base.BaseSVC:
 |  
 |  decision_function(self, X)
 |      Distance of the samples X to the separating hyperplane.
 |      
 |      Parameters
 |      ----------
 |      X : array-like, shape (n_samples, n_features)
 |      
 |      Returns
 |      -------
 |      X : array-like, shape (n_samples, n_classes * (n_classes-1) / 2)
 |          Returns the decision function of the sample for each class
 |          in the model.
 |          If decision_function_shape='ovr', the shape is (n_samples,
 |          n_classes)
 |  
 |  predict(self, X)
 |      Perform classification on samples in X.
 |      
 |      For an one-class model, +1 or -1 is returned.
 |      
 |      Parameters
 |      ----------
 |      X : {array-like, sparse matrix}, shape (n_samples, n_features)
 |          For kernel="precomputed", the expected shape of X is
 |          [n_samples_test, n_samples_train]
 |      
 |      Returns
 |      -------
 |      y_pred : array, shape (n_samples,)
 |          Class labels for samples in X.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from sklearn.svm.base.BaseSVC:
 |  
 |  predict_log_proba
 |      Compute log probabilities of possible outcomes for samples in X.
 |      
 |      The model need to have probability information computed at training
 |      time: fit with attribute `probability` set to True.
 |      
 |      Parameters
 |      ----------
 |      X : array-like, shape (n_samples, n_features)
 |          For kernel="precomputed", the expected shape of X is
 |          [n_samples_test, n_samples_train]
 |      
 |      Returns
 |      -------
 |      T : array-like, shape (n_samples, n_classes)
 |          Returns the log-probabilities of the sample for each class in
 |          the model. The columns correspond to the classes in sorted
 |          order, as they appear in the attribute `classes_`.
 |      
 |      Notes
 |      -----
 |      The probability model is created using cross validation, so
 |      the results can be slightly different than those obtained by
 |      predict. Also, it will produce meaningless results on very small
 |      datasets.
 |  
 |  predict_proba
 |      Compute probabilities of possible outcomes for samples in X.
 |      
 |      The model need to have probability information computed at training
 |      time: fit with attribute `probability` set to True.
 |      
 |      Parameters
 |      ----------
 |      X : array-like, shape (n_samples, n_features)
 |          For kernel="precomputed", the expected shape of X is
 |          [n_samples_test, n_samples_train]
 |      
 |      Returns
 |      -------
 |      T : array-like, shape (n_samples, n_classes)
 |          Returns the probability of the sample for each class in
 |          the model. The columns correspond to the classes in sorted
 |          order, as they appear in the attribute `classes_`.
 |      
 |      Notes
 |      -----
 |      The probability model is created using cross validation, so
 |      the results can be slightly different than those obtained by
 |      predict. Also, it will produce meaningless results on very small
 |      datasets.
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from sklearn.svm.base.BaseLibSVM:
 |  
 |  fit(self, X, y, sample_weight=None)
 |      Fit the SVM model according to the given training data.
 |      
 |      Parameters
 |      ----------
 |      X : {array-like, sparse matrix}, shape (n_samples, n_features)
 |          Training vectors, where n_samples is the number of samples
 |          and n_features is the number of features.
 |          For kernel="precomputed", the expected shape of X is
 |          (n_samples, n_samples).
 |      
 |      y : array-like, shape (n_samples,)
 |          Target values (class labels in classification, real numbers in
 |          regression)
 |      
 |      sample_weight : array-like, shape (n_samples,)
 |          Per-sample weights. Rescale C per sample. Higher weights
 |          force the classifier to put more emphasis on these points.
 |      
 |      Returns
 |      -------
 |      self : object
 |          Returns self.
 |      
 |      Notes
 |      ------
 |      If X and y are not C-ordered and contiguous arrays of np.float64 and
 |      X is not a scipy.sparse.csr_matrix, X and/or y may be copied.
 |      
 |      If X is a dense array, then the other methods will not support sparse
 |      matrices as input.
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from sklearn.svm.base.BaseLibSVM:
 |  
 |  coef_
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from sklearn.base.BaseEstimator:
 |  
 |  __getstate__(self)
 |  
 |  __repr__(self)
 |      Return repr(self).
 |  
 |  __setstate__(self, state)
 |  
 |  get_params(self, deep=True)
 |      Get parameters for this estimator.
 |      
 |      Parameters
 |      ----------
 |      deep : boolean, optional
 |          If True, will return the parameters for this estimator and
 |          contained subobjects that are estimators.
 |      
 |      Returns
 |      -------
 |      params : mapping of string to any
 |          Parameter names mapped to their values.
 |  
 |  set_params(self, **params)
 |      Set the parameters of this estimator.
 |      
 |      The method works on simple estimators as well as on nested objects
 |      (such as pipelines). The latter have parameters of the form
 |      ``<component>__<parameter>`` so that it's possible to update each
 |      component of a nested object.
 |      
 |      Returns
 |      -------
 |      self
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from sklearn.base.BaseEstimator:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from sklearn.base.ClassifierMixin:
 |  
 |  score(self, X, y, sample_weight=None)
 |      Returns the mean accuracy on the given test data and labels.
 |      
 |      In multi-label classification, this is the subset accuracy
 |      which is a harsh metric since you require for each sample that
 |      each label set be correctly predicted.
 |      
 |      Parameters
 |      ----------
 |      X : array-like, shape = (n_samples, n_features)
 |          Test samples.
 |      
 |      y : array-like, shape = (n_samples) or (n_samples, n_outputs)
 |          True labels for X.
 |      
 |      sample_weight : array-like, shape = [n_samples], optional
 |          Sample weights.
 |      
 |      Returns
 |      -------
 |      score : float
 |          Mean accuracy of self.predict(X) wrt. y.

The default for probability is False; if it is set to True, probability contours can be drawn, showing how far each point is from the decision boundary.
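
A minimal sketch of this (assuming X_train, y_train, X_test from the cells above; gamma=10 is simply reused from the earlier cell): with probability=True the fitted model also exposes predict_proba, and decision_function reports the decision values, i.e. how far each sample lies from the boundary.

prob_model = SVC(gamma=10, probability=True)
prob_model.fit(X_train, y_train)
print(prob_model.predict_proba(X_test[:3]))      # per-class probabilities for three samples
print(prob_model.decision_function(X_test[:3]))  # decision values: distance from the boundaries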

  • Among SVC's options, the important ones are kernel, C, and gamma.

    |  C : float, optional (default=1.0)
    |      Penalty parameter C of the error term.
    |  
    |  kernel : string, optional (default='rbf')
    |       Specifies the kernel type to be used in the algorithm.
    |       It must be one of 'linear', 'poly', 'rbf', 'sigmoid', 'precomputed' or
    |       a callable.
    |       If none is given, 'rbf' will be used. If a callable is given it is
    |       used to pre-compute the kernel matrix from data matrices; that matrix
    |       should be an array of shape ``(n_samples, n_samples)``.
    |  
    |  gamma : float, optional (default='auto')
    |      Kernel coefficient for 'rbf', 'poly' and 'sigmoid'.
    |      If gamma is 'auto' then 1/n_features will be used instead.
  • kernel specifies which algorithm to apply. The default, rbf, is the radial basis function (see https://en.wikipedia.org/wiki/Radial_basis_function). linear is the linear SVM, and poly adds polynomial features.

  • The C value assigns a penalty to points misclassified during training; as C grows, the penalty grows, so the model tries as hard as possible to fit the training data. A large C therefore tends toward overfitting.
  • The gamma value sets the width of the bell-shaped surface in rbf. A large gamma gives a sharp, narrow bell, so the training data can be carved up very finely. (A small numerical check follows after the formula below.)

    $ k(x_1, x_2) = \exp(-\gamma \lVert x_1 - x_2 \rVert^2) $
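
A small numerical check of the formula above (the two sample points and the gamma values are made up for illustration; sklearn.metrics.pairwise.rbf_kernel computes the same quantity):

import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

x1 = np.array([[0.0, 0.0]])
x2 = np.array([[0.3, 0.4]])

manual = np.exp(-1.0 * np.sum((x1 - x2) ** 2))   # gamma = 1
print(manual)                                    # about 0.78
print(rbf_kernel(x1, x2, gamma=1.0)[0, 0])       # same value from sklearn
print(rbf_kernel(x1, x2, gamma=100.0)[0, 0])     # a large gamma makes the bell much narrower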
