
Chapter 2_16_Nonlinear Regression

Introduction to Machine Learning with Python

Chapter 2. Supervised Learning


Nonlinear Regression

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
In [3]:
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()
In [4]:
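col1 = 0  # column index of the feature plotted on the x-axis
col2 = 6  # column index of the feature plotted on the y-axis (the regression target below)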

plt.scatter(cancer.data[:,col1], cancer.data[:,col2], c= cancer.target, alpha=0.3)
Out[4]:
<matplotlib.collections.PathCollection at 0x213eec4f9e8>
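
To see which measurements the two column indices refer to, the names can be looked up in the dataset's `feature_names` array. A minimal sketch; the variable names `x_name` and `y_name` are introduced here only for illustration.

```python
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()
col1, col2 = 0, 6

# feature_names is an array of strings aligned with the columns of cancer.data
x_name = cancer.feature_names[col1]
y_name = cancer.feature_names[col2]
print(x_name, '/', y_name)
```

With the copy of the dataset bundled in scikit-learn, these are mean radius (x) and mean concavity (y).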

Linear Regression

In [5]:
from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(cancer.data[:,[col1]], cancer.data[:,col2])

model.coef_, model.intercept_
Out[5]:
(array([0.01530951]), -0.1274825984768875)
In [6]:
xs = np.arange(5,30,0.1)
ys = xs*model.coef_[0] + model.intercept_

plt.scatter(cancer.data[:,col1], cancer.data[:,col2], c= cancer.target, alpha=0.3)
plt.plot(xs,ys,'r-',lw=3)
Out[6]:
[<matplotlib.lines.Line2D at 0x16cccd074a8>]
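
The red line above is computed by hand from `coef_` and `intercept_`. As a sanity check, the same values should come out of `model.predict`. A small sketch under the same column choices; `ys_manual` and `ys_predict` are names made up for this comparison.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LinearRegression

cancer = load_breast_cancer()
col1, col2 = 0, 6

model = LinearRegression()
model.fit(cancer.data[:, [col1]], cancer.data[:, col2])

xs = np.arange(5, 30, 0.1)
ys_manual = xs * model.coef_[0] + model.intercept_

# predict expects a 2-D array of shape (n_samples, n_features)
ys_predict = model.predict(xs.reshape(-1, 1))

print(np.allclose(ys_manual, ys_predict))
```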

Quadratic Regression

In [11]:
X = np.c_[cancer.data[:,col1], (cancer.data[:,col1])**2]
y = cancer.data[:,col2]

model = LinearRegression()
model.fit(X, y)

model.coef_, model.intercept_
Out[11]:
(array([-0.00546943,  0.00065831]), 0.026519913885121926)
In [14]:
xs = np.arange(0,30,0.1)
ys = xs*model.coef_[0] + (xs**2)*model.coef_[1] + model.intercept_

plt.scatter(cancer.data[:,col1], cancer.data[:,col2], c= cancer.target, alpha=0.3)
plt.plot(xs,ys,'r-',lw=3)
Out[14]:
[<matplotlib.lines.Line2D at 0x16ccce7ada0>]
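
The quadratic fit is still an ordinary linear regression, only on the two columns x and x**2. Instead of building those columns with `np.c_`, scikit-learn's `PolynomialFeatures` can generate them. The sketch below is one possible alternative, not the approach used in this post, and checks that both routes give the same coefficients.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

cancer = load_breast_cancer()
col1, col2 = 0, 6
x = cancer.data[:, [col1]]   # shape (n_samples, 1)
y = cancer.data[:, col2]

# Hand-built design matrix, as in the cell above: columns x and x**2
X_manual = np.c_[x[:, 0], x[:, 0] ** 2]

# include_bias=False drops the constant column; LinearRegression fits the intercept itself
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(x)

model_manual = LinearRegression().fit(X_manual, y)
model_poly = LinearRegression().fit(X_poly, y)
print(model_manual.coef_, model_poly.coef_)
```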

Cubic Regression

In [6]:
X = np.c_[cancer.data[:,col1], (cancer.data[:,col1])**2, (cancer.data[:,col1])**3]
y = cancer.data[:,col2]

model = LinearRegression()
model.fit(X, y)

model.coef_, model.intercept_
Out[6]:
(array([-5.53456968e-02,  3.81149612e-03, -6.31326376e-05]),
 0.27649793237737497)
In [8]:
xs = np.arange(-10,50,0.1)
ys = xs*model.coef_[0] + (xs**2)*model.coef_[1] + (xs**3)*model.coef_[2] + model.intercept_

plt.scatter(cancer.data[:,col1], cancer.data[:,col2], c= cancer.target, alpha=0.3)
plt.plot(xs,ys,'r-',lw=3)
Out[8]:
[<matplotlib.lines.Line2D at 0x213f0de64e0>]
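
One way to put the linear, quadratic, and cubic fits side by side is the training-set R² returned by `model.score`. The sketch below is an addition for comparison only; the dictionary `r2` is a made-up name. Because each model contains the previous one as a special case, training R² cannot decrease as the degree grows, so a higher value here does not by itself mean a better model.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LinearRegression

cancer = load_breast_cancer()
col1, col2 = 0, 6
x = cancer.data[:, col1]
y = cancer.data[:, col2]

r2 = {}
for degree in (1, 2, 3):
    # Design matrix with columns x, x**2, ..., x**degree
    X = np.column_stack([x ** p for p in range(1, degree + 1)])
    model = LinearRegression().fit(X, y)
    r2[degree] = model.score(X, y)   # R^2 measured on the training data
    print(degree, round(r2[degree], 4))
```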

Exponential Regression

$$ y = \exp(ax + b) $$ $$ \log(y) = ax + b $$
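
The linearization can be checked on synthetic data: if y is generated exactly as exp(ax + b), a straight-line fit to (x, log y) should return a and b. In the sketch below, `a_true` and `b_true` are hypothetical values chosen only for this illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical parameters, not taken from the cancer data
a_true, b_true = 0.5, -1.0

x = np.linspace(0, 5, 50)
y = np.exp(a_true * x + b_true)   # noise-free exponential data

# Fit a straight line to (x, log y)
model = LinearRegression().fit(x.reshape(-1, 1), np.log(y))
a_hat, b_hat = model.coef_[0], model.intercept_
print(a_hat, b_hat)
```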

In [17]:
X = cancer.data[:,[col1]]
y = np.log(cancer.data[:,col2]+0.01) # some values are 0, so 0.01 is added before taking the log

model = LinearRegression()
model.fit(X, y)

model.coef_, model.intercept_
Out[17]:
(array([0.1579534]), -4.875933569557782)
In [18]:
xs = np.arange(0,30,0.1)
ys = np.exp(xs*model.coef_[0] + model.intercept_) - 0.01

plt.scatter(cancer.data[:,col1], cancer.data[:,col2], c= cancer.target, alpha=0.3)
plt.plot(xs,ys,'r-',lw=3)
Out[18]:
[<matplotlib.lines.Line2D at 0x16cccf7e208>]
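
The 0.01 offset matters because log(0) is not finite. The sketch below counts the zeros in the target column and shows that the shifted values have finite logs. The offset is a free choice, and the fitted coefficients depend on it. Since the model is fit to log(y + 0.01), the plotted curve subtracts 0.01 again.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()
col2 = 6
values = cancer.data[:, col2]

n_zero = int(np.sum(values == 0))
print('zeros in column', col2, ':', n_zero)

# log(0) evaluates to -inf, which a least-squares fit cannot use;
# errstate silences the divide-by-zero warning NumPy would otherwise emit
with np.errstate(divide='ignore'):
    raw_log = np.log(values)
shifted_log = np.log(values + 0.01)

print(int(np.isinf(raw_log).sum()), bool(np.isfinite(shifted_log).all()))
```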

Power Law Regression

$$ y = a \cdot x^n $$ $$ \log(y) = n \cdot \log(x) + \log(a) $$

In [20]:
X = np.log(cancer.data[:,[col1]])
y = np.log(cancer.data[:,col2]+0.01) # some values are 0, so 0.01 is added before taking the log

model = LinearRegression()
model.fit(X, y)

model.coef_, model.intercept_
Out[20]:
(array([2.321263]), -8.724171466884766)
In [21]:
xs = np.arange(0,30,0.1)
ys = np.exp(model.intercept_) * (xs**model.coef_[0]) - 0.01

plt.scatter(cancer.data[:,col1], cancer.data[:,col2], c= cancer.target, alpha=0.3)
plt.plot(xs,ys,'r-',lw=3)
Out[21]:
[<matplotlib.lines.Line2D at 0x16cccfc5908>]
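
Reading the fit back in terms of the power-law formula: the slope is the exponent n and the intercept is log(a). The short calculation below plugs in the numbers printed in Out[20]; the input `x = 15.0` is an arbitrary example value, not something from the original post.

```python
import numpy as np

# Values copied from Out[20] above
n = 2.321263                  # model.coef_[0], the exponent
log_a = -8.724171466884766    # model.intercept_, equal to log(a)

a = np.exp(log_a)
print(a)

# Fitted curve evaluated at a hypothetical input
x = 15.0
y_hat = a * x ** n - 0.01     # subtract the 0.01 that was added before the log
print(y_hat)
```

Here a comes out at roughly 1.6e-4, so the curve is approximately y = 1.6e-4 * x^2.32 - 0.01.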

Power Laws and Complex Systems

  • Let's generate a sample and fit it (the case n = -3/4)
  • Think of it as the number of earthquakes in some region as a function of earthquake magnitude
In [10]:
xs = np.random.uniform(0.1, 2, size=100)
ys = 10 * (xs**(-3/4)) + np.random.normal(0, 0.5, size=len(xs)) # n=-3/4, a = 10
In [11]:
plt.scatter(xs, ys, alpha=0.3)
plt.title('Earthquake report', fontsize=30)
plt.xlabel('Power')
plt.ylabel('Count')
Out[11]:
Text(0,0.5,'Count')
In [13]:
plt.scatter(np.log(xs), np.log(ys))
plt.axis('equal')
Out[13]:
(-2.4304567152063568,
 0.8418633354476694,
 1.5257218394123955,
 4.134815453302616)
In [14]:
X = np.log(xs).reshape(-1,1)
y = np.log(ys)

model = LinearRegression()
model.fit(X, y)

model.coef_, model.intercept_ # n = model.coef_[0], a = exp(model.intercept_)
Out[14]:
(array([-0.75247268]), 2.3058752643130114)
In [68]:
input_x = np.arange(0.1,2.2,0.05)
pred_y = np.exp(model.intercept_) * (input_x**model.coef_[0])

plt.scatter(xs, ys, alpha=0.3)
plt.plot(input_x,pred_y,'r-')
Out[68]:
[<matplotlib.lines.Line2D at 0x16cce8f8d30>]
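
Since the sample was generated with known parameters (n = -3/4, a = 10), the log-log fit can be checked against them directly. The cells above are unseeded, so the sketch below fixes a seed with `np.random.default_rng(0)`. That is an assumption made here for reproducibility, and the exact numbers will differ from Out[14].

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)   # fixed seed: an assumption, the cells above are unseeded

n_true, a_true = -3 / 4, 10
xs = rng.uniform(0.1, 2, size=100)
ys = a_true * xs ** n_true + rng.normal(0, 0.5, size=len(xs))

# Straight-line fit in log-log space: log(y) = n * log(x) + log(a)
model = LinearRegression().fit(np.log(xs).reshape(-1, 1), np.log(ys))
n_hat = model.coef_[0]
a_hat = np.exp(model.intercept_)

print(n_hat, a_hat)
```

The estimates should land close to -0.75 and 10. The noise is added to y rather than to log(y), so small deviations are expected.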
