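python - How to export a tf.estimator.DNNClassifier estimator
Problem description
Hi everyone. I am still a beginner with TensorFlow. The code below runs a text-classification DNN, and so far everything works. I want to save my model and load it again so that I can use it to predict new values, but I don't know how to do that.
To give you a general idea of what I am trying to do: I have 2 folders (train and test), and each contains 4 subfolders (one per classification category).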
import tensorflow as tf
import tensorflow_hub as hub
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd
import re
import seaborn as sns
import logging
print("Loading all files from directory ...")
# Load all files from a directory in a DataFrame.
def load_directory_data(directory):
    data = {}
    data["sentence"] = []
    data["tnemitnes"] = []
    print("getting in a loop")
    for file_path in os.listdir(directory):
        with tf.gfile.GFile(os.path.join(directory, file_path), "r") as f:
            print("directory : ", directory)
            print("file path : ", file_path)
            data["sentence"].append(f.read())
            data["tnemitnes"].append(re.match(r"(\d+)\.txt", file_path).group(1))
    return pd.DataFrame.from_dict(data)
print("merging all files in the training set ...")
# Merge all type of emails examples, add a polarity column and shuffle.
def load_dataset(directory):
    # directory is "train" or "test"; each contains one subfolder per category.
    pos_df = load_directory_data(os.path.join(directory, "br"))
    neg_df = load_directory_data(os.path.join(directory, "mi"))
    dos_df = load_directory_data(os.path.join(directory, "Brouillons"))
    nos_df = load_directory_data(os.path.join(directory, "favoris"))
    pos_df["polarity"] = 3
    neg_df["polarity"] = 2
    dos_df["polarity"] = 1
    nos_df["polarity"] = 0
    return pd.concat([pos_df, neg_df, dos_df, nos_df]).sample(frac=1).reset_index(drop=True)
print("Getting the data from files ...")
# Load the training and test sets from their folders.
def download_and_load_datasets():
    # os.path.dirname("train") and os.path.dirname("test") both return "",
    # which made both calls read from train/; pass the folder names directly.
    train_df = load_dataset("train")
    test_df = load_dataset("test")
    return train_df, test_df
print("Configuring all logging output ...")
# Reduce logging output. ERROR
#logging.set_verbosity(tf.logging.INFO)
logging.getLogger().setLevel(logging.INFO)
print("Setting up the data for the training ...")
train_df, test_df = download_and_load_datasets()
train_df.head()
print("Setting Up a Training input on the whole training set with no limit on training epochs ...")
# Training input on the whole training set with no limit on training epochs.
train_input_fn = tf.estimator.inputs.pandas_input_fn(train_df, train_df["polarity"], num_epochs=None, shuffle=True)
print("Setting Up a Prediction on the whole training set ...")
# Prediction on the whole training set.
predict_train_input_fn = tf.estimator.inputs.pandas_input_fn(train_df, train_df["polarity"], shuffle=False)
print("Setting Up a Prediction on the test set ...")
# Prediction on the test set.
predict_test_input_fn = tf.estimator.inputs.pandas_input_fn(test_df, test_df["polarity"], shuffle=False)
print("Removal of punctuation and splitting on spaces from the data ...")
#The module is responsible for preprocessing of sentences (e.g. removal of punctuation and splitting on spaces).
embedded_text_feature_column = hub.text_embedding_column(key="sentence", module_spec="https://tfhub.dev/google/nnlm-en-dim128/1")
print("Setting Up The Classifier ...")
# Estimator: a DNN classifier is used for classification.
estimator = tf.estimator.DNNClassifier(
    hidden_units=[10, 20],
    feature_columns=[embedded_text_feature_column],
    n_classes=4,
    optimizer=tf.train.AdagradOptimizer(learning_rate=0.003))
print("Starting the Training ...")
# Training for 20 steps means 2,560 training examples with the default
# batch size of 128. How many epochs that covers depends on the size of
# the training set.
estimator.train(input_fn=train_input_fn, steps=20)
print("The training has ended ...")
print("setting Up the results ...")
train_eval_result = estimator.evaluate(input_fn=predict_train_input_fn)
test_eval_result = estimator.evaluate(input_fn=predict_test_input_fn)
print("Showing the results ...")
print("Training set accuracy: {accuracy}".format(**train_eval_result))
print("Test set accuracy: {accuracy}".format(**test_eval_result))
# This is where I am having trouble!
tf.estimator.export(
    os.path.dirname("Model"),
    serving_input_fn,
    default_output_alternative_key=None,
    assets_extra=None,
    as_text=False,
    checkpoint_path=None,
    graph_rewrite_specs=(GraphRewriteSpec((tag_constants.SERVING,), ()),),
    strip_default_attrs=False
)
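Now that I have added the estimator export call, it asks me for a serving_input_fn, and honestly I find it hard to understand how to create one.
If there is a simpler way, that would be even better.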
Solution
You can easily get a serving_input_fn from tf.estimator.export.build_parsing_serving_input_receiver_fn.
In your case, do the following:
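# build_parsing_serving_input_receiver_fn expects a feature spec (a dict),
# so derive one from the feature column first.
serving_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(
    tf.feature_column.make_parse_example_spec([embedded_text_feature_column]))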
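Note that tf.estimator.export is a module rather than a function, so the export itself is a method call on the trained estimator. The sketch below assumes TensorFlow 1.x (which the question's code targets) and a serving_input_fn built as above. The helper names export_model and load_and_predict are hypothetical, and loading through tf.contrib.predictor with the "inputs" key is my assumption about how the default classification signature is exposed; inspecting the export with saved_model_cli is a good way to confirm the actual input and output names.

```python
def export_model(estimator, serving_input_fn, export_dir_base="Model"):
    """Export a trained estimator as a SavedModel and return its directory."""
    # export_savedmodel writes a timestamped subdirectory under export_dir_base
    # and returns its path (as bytes in TF 1.x).
    export_dir = estimator.export_savedmodel(export_dir_base, serving_input_fn)
    if isinstance(export_dir, bytes):
        export_dir = export_dir.decode("utf-8")
    return export_dir


def load_and_predict(export_dir, sentences):
    """Load an exported model and classify raw sentences (TF 1.x only)."""
    # Imported lazily so export_model can be used without loading TensorFlow here.
    import tensorflow as tf

    # The parsing serving input receiver expects serialized tf.Example protos
    # whose feature key matches the feature column key ("sentence").
    serialized = []
    for text in sentences:
        example = tf.train.Example(features=tf.train.Features(feature={
            "sentence": tf.train.Feature(
                bytes_list=tf.train.BytesList(value=[text.encode("utf-8")])),
        }))
        serialized.append(example.SerializeToString())

    # Assumption: the default signature takes the serialized examples under "inputs".
    predict_fn = tf.contrib.predictor.from_saved_model(export_dir)
    return predict_fn({"inputs": serialized})
```

After training, this would be used as export_dir = export_model(estimator, serving_input_fn), followed by load_and_predict(export_dir, ["some new text"]). For a classifier I would expect the returned dictionary to hold per-class scores and class labels, but the exact keys depend on the exported signature.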
If you would rather pass tensors directly, build_raw_serving_input_receiver_fn is available in the same package.
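For the raw-tensor variant, a minimal sketch might look like the following. It assumes TensorFlow 1.x and that the model's only feature is the string feature "sentence" from the question; the helper name make_raw_serving_input_fn is hypothetical.

```python
def make_raw_serving_input_fn():
    """Build a serving input fn that accepts raw strings instead of tf.Example protos."""
    # Imported lazily so the helper can be defined without TensorFlow loaded.
    import tensorflow as tf

    # One string per example; the dict key must match the feature column key.
    features = {
        "sentence": tf.placeholder(dtype=tf.string, shape=[None], name="sentence"),
    }
    return tf.estimator.export.build_raw_serving_input_receiver_fn(features)
```

The result can be passed to the estimator's export method in place of the parsing version. Clients then send plain strings under the "sentence" key rather than serialized examples, and which exported signatures remain usable with raw inputs is worth checking on the exported model.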