How do I use Python apriori to analyze the House Votes binary dataset?

Problem description

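I want to analyze the House Votes 84 dataset using Apriori. It has 17 columns: the first column, "party", is categorical with two values, and the remaining columns are binary. How can I apply apriori to it in Python, with minsup = 0.3 and minconfidence = 0.9?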


This is my code. The output looks ugly and makes no sense.

import matplotlib.pyplot as plt
from sklearn import datasets
import pandas as pd
import numpy as np
import sys
import os  

from apyori import apriori 
from mlxtend.frequent_patterns import apriori
from efficient_apriori import apriori
from mlxtend.frequent_patterns import association_rules
from mlxtend.preprocessing import TransactionEncoder

df = pd.read_table("house-votes-84.data", sep=",", header=None, 
na_values="?")
col_names = ['party', 'infants', 'water', 'budget', 'physician', 
'salvador','religious', 'satellite', 'aid', 'missile', 'immigration', 
'synfuels','education', 'superfund', 'crime', 'duty_free_exports', 
'eaa_rsa']
df = df.fillna(0)
df.columns = col_names
df.shape
print(df.head())

df = df.replace({'y': 1, 'n': -1, '?': 0})
print(df.head()) 

records = []  
for i in range(0, 435):
    records.append([str(df.values[i, j]) for j in range(0, 16)])

association_rules = apriori(records, min_support=0.3, min_confidence=0.9)  
association_results = list(association_rules) 
print(len(association_rules)) 
print(association_rules[0])

Output:

{1: {('-1',): 433, ('0',): 154, ('1',): 434, ('democrat',): 267, ('republican',): 168}, 2: {('-1', '0'): 152, ('-1', '1'): 433, ('-1', 'democrat'): 266, ('-1', 'republican'): 167, ('0', '1'): 153, ('1', 'democrat'): 267, ('1', 'republican'): 167}, 3: {('-1', '0', '1'): 152, ('-1', '1', 'democrat'): 266, ('-1', '1', 'republican'): 167}}

Tags: python, apriori

Solution


The apriori function from efficient_apriori returns a tuple (itemsets, rules). To use efficient_apriori, you can do the following:

from efficient_apriori import apriori
itemsets, rules = apriori(records, min_support=0.3, min_confidence=0.9)
for rule in rules:
    print(rule)
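
A likely reason the output in the question looks unreasonable is that every vote in records is stored as a bare '1', '-1' or '0', so apriori cannot tell which bill a vote belongs to (and range(0, 16) also drops the 17th column). The sketch below shows one possible way to build labelled items such as "budget=y" instead. The helper names build_transactions and mine_rules, the shortened column list and the sample rows are assumptions made for illustration, not part of the original answer.

```python
def build_transactions(rows, col_names, missing="?"):
    """Turn each row into a tuple of 'column=value' items, skipping missing votes."""
    transactions = []
    for row in rows:
        items = tuple(
            f"{name}={value}"
            for name, value in zip(col_names, row)
            if value != missing
        )
        transactions.append(items)
    return transactions


def mine_rules(transactions, min_support=0.3, min_confidence=0.9):
    """Run efficient_apriori on labelled transactions (third-party package)."""
    from efficient_apriori import apriori  # imported lazily; needs efficient-apriori installed
    itemsets, rules = apriori(
        transactions, min_support=min_support, min_confidence=min_confidence
    )
    return itemsets, rules


# Illustrative rows only, with a shortened column list (not the real data file).
demo_cols = ["party", "infants", "water", "budget"]
demo_rows = [
    ["republican", "n", "y", "n"],
    ["democrat", "y", "?", "y"],
]
transactions = build_transactions(demo_rows, demo_cols)
print(transactions)
```

With the full dataset, the raw 'y'/'n'/'?' rows (before any replace or fillna step) and all 17 column names would be passed to build_transactions, and the result handed to mine_rules.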

For more details, see the linked example.
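
To check whether a reported rule really meets min_support=0.3 and min_confidence=0.9, the two measures can be recomputed by hand. The following is a minimal sketch using the standard definitions (support is the fraction of transactions containing an itemset; confidence is the support of the whole rule divided by the support of its left-hand side). The function names and the toy transactions are made up for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    wanted = set(itemset)
    hits = sum(1 for t in transactions if wanted <= set(t))
    return hits / len(transactions)


def confidence(transactions, lhs, rhs):
    """support(lhs and rhs together) / support(lhs); 0.0 if lhs never occurs."""
    lhs_support = support(transactions, lhs)
    if lhs_support == 0:
        return 0.0
    return support(transactions, set(lhs) | set(rhs)) / lhs_support


# Toy data, not taken from the real dataset.
toy = [
    ("party=democrat", "budget=y"),
    ("party=democrat", "budget=y"),
    ("party=democrat", "budget=n"),
    ("party=republican", "budget=n"),
]

rule_support = support(toy, ["budget=y", "party=democrat"])
rule_confidence = confidence(toy, ["budget=y"], ["party=democrat"])
print(rule_support, rule_confidence)
```

A rule is kept only if both values reach their thresholds, so in this toy data "budget=y -> party=democrat" would pass, while the reverse direction would fail the 0.9 confidence threshold.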

