Learning sklearn: working with the iris dataset

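Loading the dataset with load_iris and inspecting its keys

Key points
sklearn's datasets live in the datasets module; the loaders for the bundled ones start with "load_".
The loaded iris dataset is a Bunch object, which can be used like a dictionary.
In the version used here its keys are ['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename'].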

from sklearn.datasets import load_iris

# 1. Load the dataset with load_iris
iris = load_iris()

# Inspect the keys it contains
iris.keys()

dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename'])

iris['filename']

'D:\\Anaconda3\\lib\\site-packages\\sklearn\\datasets\\data\\iris.csv'
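
Since the returned object behaves like a dictionary, it may help to see the other common ways of getting at the same data. The following is a minimal sketch, assuming a reasonably recent scikit-learn: it uses attribute access on the Bunch and the return_X_y option of load_iris. The variable names X and y are only illustrative.

```python
from sklearn.datasets import load_iris

iris = load_iris()

# Each key of the Bunch is also available as an attribute.
print(iris.data.shape)      # (150, 4)
print(iris.target_names)

# return_X_y=True skips the Bunch and returns (data, target) directly.
X, y = load_iris(return_X_y=True)
print(X.shape, y.shape)     # (150, 4) (150,)
```

Attribute access and key access return the same arrays, so which one to use is mostly a matter of taste.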

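Viewing the column names and the class names

"feature_names": the name of the feature stored in each column of data
"target_names": the names that correspond to the class labels

# View the column (feature) names of the data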
iris['feature_names']

['sepal length (cm)',
 'sepal width (cm)',
 'petal length (cm)',
 'petal width (cm)']

# Put the data into a DataFrame, using the feature names as column labels
import pandas as pd

pd.DataFrame(data=iris['data'],columns=iris['feature_names'])

	sepal length (cm)	sepal width (cm)	petal length (cm)	petal width (cm)
0	5.1	3.5	1.4	0.2
1	4.9	3.0	1.4	0.2
2	4.7	3.2	1.3	0.2
3	4.6	3.1	1.5	0.2
4	5.0	3.6	1.4	0.2
...	...	...	...	...
145	6.7	3.0	5.2	2.3
146	6.3	2.5	5.0	1.9
147	6.5	3.0	5.2	2.0
148	6.2	3.4	5.4	2.3
149	5.9	3.0	5.1	1.8
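
The DataFrame above holds only the four features. As a possible extension, the sketch below adds the class name of every row by indexing target_names with the integer labels. The column name species and the variable df_iris are assumptions made for this example, not part of the dataset.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()
df_iris = pd.DataFrame(data=iris['data'], columns=iris['feature_names'])

# Indexing target_names with the integer labels gives one name per row.
df_iris['species'] = iris['target_names'][iris['target']]

# Each of the three classes should appear 50 times.
print(df_iris['species'].value_counts())
```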

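Getting data and target and printing their shapes

The arrays can be retrieved directly through the dictionary keys.
data and target are both NumPy ndarray objects, so their sizes are available through shape.

# Names of the classes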
iris['target_names']

array(['setosa', 'versicolor', 'virginica'], dtype='<U10')

# Get data and target and print the type and shape of each

data = iris['data']
print(type(data),data.shape)
target = iris['target']
print(type(target),target.shape)

<class 'numpy.ndarray'> (150, 4)
<class 'numpy.ndarray'> (150,)
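
The shape tells us there are 150 labels but not how they are distributed. A short sketch using NumPy's unique function shows how many samples each class has; the variable names labels and counts are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
target = iris['target']

# return_counts=True gives each distinct label and how often it occurs.
labels, counts = np.unique(target, return_counts=True)
for label, count in zip(labels, counts):
    print(label, iris['target_names'][label], count)
```

The iris dataset is balanced: each of the three classes has 50 samples.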

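Splitting into a training set and a test set

Use train_test_split from the model_selection module to split the dataset into a training set and a test set.

from sklearn.model_selection import train_test_split

# First argument: the feature data
# Second argument: the target labels
# test_size: fraction of the samples held out as the test set
data_train, data_test, target_train, target_test = train_test_split(
    data, target, test_size=0.3)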

data_train.shape

(105, 4)
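
The split above is shuffled randomly, so the scores shown later in this article will vary from run to run. A minimal sketch of a reproducible variant, assuming you want a fixed seed and the same class ratio in both parts, uses the random_state and stratify parameters of train_test_split. The seed value 42 is an arbitrary choice.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

data, target = load_iris(return_X_y=True)

# random_state fixes the shuffle; stratify keeps the class ratio in both parts.
data_train, data_test, target_train, target_test = train_test_split(
    data, target, test_size=0.3, random_state=42, stratify=target)

print(data_train.shape, data_test.shape)   # (105, 4) (45, 4)
print(np.bincount(target_test))            # [15 15 15]
```

With 50 samples per class and test_size=0.3, stratifying puts 15 samples of each class into the test set and 35 into the training set.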

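Training with logistic regression and computing accuracy on the test set

Model used: LogisticRegression from linear_model
Steps:

Import LogisticRegression from the linear_model module

Instantiate the model with LogisticRegression()

Train it with fit()

Check its score (how well it performs) with score()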

from sklearn.linear_model import LogisticRegression

model = LogisticRegression(max_iter=1000)  # set the maximum number of solver iterations

model.fit(data_train, target_train)  # train the model

LogisticRegression(max_iter=1000)

# Score (accuracy) on the training set
model.score(data_train,target_train)

0.9619047619047619

# Score (accuracy) on the test set
model.score(data_test,target_test)

0.9555555555555556
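
For a classifier, score() returns the mean accuracy on the data it is given. The sketch below checks this by comparing it with accuracy_score from sklearn.metrics. The fixed random_state=0 is an assumption added so the example is repeatable, which means its numbers will not necessarily match the ones above.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

data, target = load_iris(return_X_y=True)
data_train, data_test, target_train, target_test = train_test_split(
    data, target, test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000).fit(data_train, target_train)

# score() predicts on the given data and compares with the true labels.
via_score = model.score(data_test, target_test)
via_metric = accuracy_score(target_test, model.predict(data_test))
print(via_score, via_metric)
```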

Predicting on the test set

Predictions are made with the predict method that the model provides.

target_predict = model.predict(data_test)

import pandas as pd
# The column "预测结果" holds the predicted labels
df = pd.DataFrame(target_predict, columns=["预测结果"])

# The column "实际结果" holds the actual labels
df['实际结果'] = target_test
df.shape  # (45, 2)
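
predict returns integer labels. As a hedged illustration, the sketch below turns them into class names and also calls predict_proba, which LogisticRegression provides, to show the estimated probability of each class. The fixed random_state=0 and the names predicted_names and proba are assumptions for this example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

iris = load_iris()
data_train, data_test, target_train, target_test = train_test_split(
    iris['data'], iris['target'], test_size=0.3, random_state=0)
model = LogisticRegression(max_iter=1000).fit(data_train, target_train)

target_predict = model.predict(data_test)

# Translate the integer labels into class names.
predicted_names = iris['target_names'][target_predict]
print(predicted_names[:5])

# One row per sample, one column per class; each row sums to 1.
proba = model.predict_proba(data_test)
print(proba.shape)              # (45, 3)
print(np.round(proba[:3], 3))
```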

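Printing and understanding the confusion matrix

The confusion matrix measures how good the predictions are.
Use metrics.confusion_matrix. In its output, row i holds the samples whose actual class is i and column j holds the samples predicted as class j, so correct predictions lie on the diagonal.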

from sklearn.metrics import confusion_matrix

# Print the confusion matrix
confusion_matrix(target_test,target_predict)

array([[13,  0,  0],
       [ 0, 14,  1],
       [ 0,  1, 16]], dtype=int64)
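
The accuracy can be read straight off this matrix: the diagonal holds the correct predictions and the sum of all entries is the number of test samples. A small worked sketch, with the matrix copied from the output above:

```python
import numpy as np

# Confusion matrix from the output above (rows: actual, columns: predicted).
cm = np.array([[13, 0, 0],
               [0, 14, 1],
               [0, 1, 16]])

correct = np.trace(cm)   # sum of the diagonal: correctly classified samples
total = cm.sum()         # all test samples
accuracy = correct / total
print(correct, total, accuracy)
```

Here 43 of 45 samples are classified correctly, which is about 0.956 and agrees with the test-set score reported earlier.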

# View the rows whose actual class is 0 (setosa); every one of them was predicted correctly
df.loc[df['实际结果']==0]

	预测结果	实际结果
0	0	0
2	0	0
11	0	0
13	0	0
18	0	0
20	0	0
22	0	0
25	0	0
31	0	0
33	0	0
39	0	0
40	0	0
44	0	0
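
To look at the misclassified rows themselves, one option is to filter for rows where the predicted and actual labels differ. This is a sketch under a few assumptions: the column names predicted and actual are English stand-ins chosen for the example, and random_state=0 is added for repeatability, so the rows it finds may differ from the run shown in this article.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

iris = load_iris()
data_train, data_test, target_train, target_test = train_test_split(
    iris['data'], iris['target'], test_size=0.3, random_state=0)
model = LogisticRegression(max_iter=1000).fit(data_train, target_train)

df = pd.DataFrame({'predicted': model.predict(data_test),
                   'actual': target_test})

# Keep only the rows where the prediction disagrees with the actual label.
wrong = df.loc[df['predicted'] != df['actual']]
print(wrong)
print(len(wrong), 'of', len(df), 'test samples misclassified')
```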

Printing and understanding the classification report

from sklearn.metrics import classification_report

# Print the classification report
print(classification_report(target_test,target_predict,
                            target_names=iris['target_names']))

              precision    recall  f1-score   support

      setosa       1.00      1.00      1.00        13
  versicolor       0.93      0.93      0.93        15
   virginica       0.94      0.94      0.94        17

    accuracy                           0.96        45
   macro avg       0.96      0.96      0.96        45
weighted avg       0.96      0.96      0.96        45
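
The per-class numbers in this report follow from the confusion matrix shown earlier. As a worked sketch: precision divides each diagonal entry by its column sum (everything predicted as that class), recall divides it by its row sum (everything actually in that class, which is the support), and the F1 score is their harmonic mean.

```python
import numpy as np

# Confusion matrix from the earlier output (rows: actual, columns: predicted).
cm = np.array([[13, 0, 0],
               [0, 14, 1],
               [0, 1, 16]])
names = ['setosa', 'versicolor', 'virginica']

true_positives = np.diag(cm)
precision = true_positives / cm.sum(axis=0)   # column sums: predicted as each class
recall = true_positives / cm.sum(axis=1)      # row sums: actual members (support)
f1 = 2 * precision * recall / (precision + recall)

for name, p, r, f in zip(names, precision, recall, f1):
    print(f'{name:>12} {p:.2f} {r:.2f} {f:.2f}')
```

For versicolor this gives 14/15, about 0.93, and for virginica 16/17, about 0.94, matching the report.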

Article source: https://www.toymoban.com/news/detail-625607.html
