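javascript - How to efficiently split a sentence into fixed-length chunks without breaking words?
Problem description
Input: "This process was continued for several years for the deaf child does not here in a month or even in two or three years the numberless items and expressions using the simplest daily intercourse little hearing child learns from these constant rotation and imitation the conversation he hears in his home simulates is mine and suggest topics and called forth the spontaneous expression of his own thoughts."
CHUNK_SIZE: 200 (assume each chunk may be up to 200 characters long).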
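Output:
["This process was continued for several years for the deaf child does not here in a month or even in two or three years the numberless items and expressions using the simplest daily intercourse little",
"hearing child learns from these constant rotation and imitation the conversation he hears in his home simulates is mine and suggest topics and called forth the spontaneous expression of his own",
"thoughts."]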
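I know one way to do this is to count lengths and check whether I am breaking any word, and so on, but I was told that this is very inefficient and inadvisable, so I am asking for help here.
Solution
One option is to use a regular expression that greedily matches up to 200 characters and then backtracks until the last matched character is followed by a space or by the end of the string: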
const str = "This process was continued for several years for the deaf child does not here in a month or even in two or three years the numberless items and expressions using the simplest daily intercourse little hearing child learns from these constant rotation and imitation the conversation he hears in his home simulates is mine and suggest topics and called forth the spontaneous expression of his own thoughts.";
const chunks = str.match(/.{1,200}(?= |$)/g);
console.log(chunks);
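To make the backtracking easier to follow, here is a minimal sketch on a much shorter input with a chunk size of 10. The sentence and the names `sample` and `smallChunks` are illustrative assumptions, not part of the original answer. Note that the second chunk keeps the space that the first match stopped in front of.

```javascript
// Hypothetical short input, chosen only to make the backtracking visible.
const sample = "the quick brown fox";

// Greedily take up to 10 characters, then give characters back until the
// next character is a space or the end of the string.
const smallChunks = sample.match(/.{1,10}(?= |$)/g);

console.log(smallChunks); // [ 'the quick', ' brown fox' ]
```

The first attempt grabs `"the quick "` (10 characters), but the next character is `b`, so the engine backs off by one character and stops after `quick`.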
If you also want to exclude leading/trailing spaces, add \S to the beginning and end of the match:
const str = "This process was continued for several years for the deaf child does not here in a month or even in two or three years the numberless items and expressions using the simplest daily intercourse little hearing child learns from these constant rotation and imitation the conversation he hears in his home simulates is mine and suggest topics and called forth the spontaneous expression of his own thoughts.";
const chunks = str.match(/\S.{1,198}\S(?= |$)/g);
console.log(chunks);
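The same short input can illustrate the effect of the two `\S` anchors. This is a sketch under the same assumptions as before (made-up sentence, chunk size 10, illustrative names): one `\S`, up to 8 arbitrary characters, and a closing `\S` add up to at most 10 characters.

```javascript
// Hypothetical short input, not from the original answer.
const trimmedSample = "the quick brown fox";

// \S at both ends forces every chunk to start and end on a non-space
// character; the middle allows up to 10 - 2 = 8 further characters.
const trimmedChunks = trimmedSample.match(/\S.{1,8}\S(?= |$)/g);

console.log(trimmedChunks); // [ 'the quick', 'brown fox' ]
```

Because this pattern needs at least three characters per match (`\S`, one `.`, `\S`), very short chunks cannot match.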
To use a variable for the chunk size:
const chunkSize = 200;
const str = "This process was continued for several years for the deaf child does not here in a month or even in two or three years the numberless items and expressions using the simplest daily intercourse little hearing child learns from these constant rotation and imitation the conversation he hears in his home simulates is mine and suggest topics and called forth the spontaneous expression of his own thoughts.";
const chunks = str.match(new RegExp(String.raw`\S.{1,${chunkSize - 2}}\S(?= |$)`, 'g'));
console.log(chunks);
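The `String.raw` tag matters here because an ordinary template literal would process `\S` as an escape sequence before `RegExp` ever sees it. The following sketch compares the two pattern sources; the names `limit`, `plainSource`, and `rawSource` are illustrative assumptions.

```javascript
const limit = 200; // illustrative value, mirrors chunkSize above

// In an ordinary template literal, "\S" is an escape that collapses to "S".
const plainSource = `\S.{1,${limit - 2}}\S(?= |$)`;

// String.raw keeps the backslashes, so RegExp receives \S as intended.
const rawSource = String.raw`\S.{1,${limit - 2}}\S(?= |$)`;

console.log(plainSource); // S.{1,198}S(?= |$)
console.log(rawSource);   // \S.{1,198}\S(?= |$)
```

Writing the backslashes doubled in a regular string (`"\\S"`) would be an equivalent alternative.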
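If you also need to account for the possibility of a chunk containing only one character, do not require two or more characters in the pattern: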
const chunkSize = 200;
const str = "This process was continued for several years for the deaf child does not here in a month or even in two or three years the numberless items and expressions using the simplest daily intercourse little hearing child learns from these constant rotation and imitation the conversation he hears in his home simulates is mine and suggest topics and called forth the spontaneous expression of his own thoughts.";
const chunks = str.match(new RegExp(String.raw`\S(?:.{0,${chunkSize - 2}}\S)?(?= |$)`, 'g'));
console.log(chunks);
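For reuse, the final pattern could be wrapped in a small helper. This is only a sketch: the function name `chunkText`, the argument validation, and the empty-array fallback are assumptions added here, not part of the original answer.

```javascript
// Hypothetical helper around the pattern above.
function chunkText(text, chunkSize) {
  // chunkSize - 2 is used as a quantifier bound, so it must not be negative.
  if (!Number.isInteger(chunkSize) || chunkSize < 2) {
    throw new RangeError("chunkSize must be an integer >= 2");
  }
  const pattern = new RegExp(
    String.raw`\S(?:.{0,${chunkSize - 2}}\S)?(?= |$)`,
    "g"
  );
  // match() returns null when nothing matches; normalize that to [].
  return text.match(pattern) ?? [];
}

console.log(chunkText("the quick brown fox", 10)); // [ 'the quick', 'brown fox' ]
console.log(chunkText("a", 10));                   // [ 'a' ]
console.log(chunkText("", 10));                    // []
```

One caveat worth checking against your own data: a single word longer than `chunkSize` cannot be matched whole, so only a trailing piece of it ends up in the result. For example, `chunkText("abcdefghijkl", 5)` yields `["hijkl"]`. Also, `.` does not match line breaks, so multi-line input may need different handling.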