Why do we need to lowercase the sentences in a sequence for text generation applications? Or is it necessary at all?

Question

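I am working on a project involving text generation. In the code example on tensorflow.org, the text is not lowercased when the text file is loaded. Other sources do lowercase it, for example the Dinosaurus_Island assignment in deeplearningai's Sequence Models course. What effect does lowercasing the text have, or does it have no effect at all?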

Tags: python, deep-learning, text-files

Answer


Lowercasing reduces the number of distinct elements the model has to represent. If you keep uppercase letters, a character-level vocabulary needs at least 26 extra entries [A-Z], and a word-level vocabulary needs a separate entry for every capitalization variant of a word that appears in the data. For text classification, I don't think it's necessary to keep capitals, since they rarely change what the text means. When you are generating the next word or next letter in a sequence, though, the choice becomes important, because the vocabulary defines exactly which symbols the model can predict.

