google-apps-script - Get all nested text elements in a Google Doc using the Selection's RangeElements
Question
In a document like the one above, I can get all the paragraphs with the following code:
var paras = body.getParagraphs();
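Note that the code above returns not only the top-level paragraphs but also all child paragraphs inside ListItems, Tables, and so on.
How can I do the same within a selection? The following code returns only the top-level elements.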
const selection = DocumentApp.getActiveDocument().getSelection();
var rangeElements = selection.getRangeElements();
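For example, the table above contains 9 non-empty paragraphs; if they are in the selection, I want to process them one by one.
What I am trying to achieve is something like translating the text in the selection while preserving formatting, tables, list items, etc. as far as possible.
Solution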
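.getRangeElements() returns an array of RangeElements. A RangeElement is a wrapper object that helps us deal with partial selections. We can call .getElement() on each item in this array to get the Element object, a very generic object that can represent almost any part of a Google Doc. Elements have a .getType() method that returns an ElementType enum; and there are a lot of them!
Let's use what we know so far to see which types can occur in a Google Doc (I created a document similar to yours as an example):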
function selectionHasWhichTypes() {
  var doc = DocumentApp.getActiveDocument();
  var selection = doc.getSelection();
  var rangeElems = selection.getRangeElements();
  rangeElems.forEach(function(rangeElem){
    var elem = rangeElem.getElement();
    Logger.log(elem.getType());
  });
}
//Logger OUTPUT:
PARAGRAPH
PARAGRAPH
PARAGRAPH
PARAGRAPH
PARAGRAPH
LIST_ITEM
LIST_ITEM
LIST_ITEM
PARAGRAPH
PARAGRAPH
PARAGRAPH
TABLE
PARAGRAPH
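Aha! It looks like we only need to handle the PARAGRAPH, LIST_ITEM and TABLE ElementTypes for now, but let's also keep their children in mind (we will find that these are 3 of the 5 types that can have children). This sounds like a job for a recursive function that keeps digging into child elements until we have found and processed all of them.
So let's try it. The next part may look confusing, but essentially it takes an element, checks whether it has children, then checks whether those have children, and so on. We also want to check whether there are any new ElementTypes to handle...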
function selectionHasWhichTypes() {
  var doc = DocumentApp.getActiveDocument();
  var selection = doc.getSelection();
  var rangeElems = selection.getRangeElements();
  rangeElems.forEach(function(rangeElem){
    var elem = rangeElem.getElement();
    elemsHaveWhatChildElems(elem, elem.getType());
  });
}
function elemsHaveWhatChildElems(elem, typeChain){
  var elemType = elem.getType();
  if(elemType == "TABLE" || elemType == "LIST_ITEM" || elemType == "PARAGRAPH"){ //Let's see if the element is one of our basic 3. If so, it could have children.
    var numChildren = elem.getNumChildren(); //How many children are there?
    if(numChildren > 0){
      for(var i = 0; i < numChildren; i++){ //Let's go through them.
        var child = elem.getChild(i);
        elemsHaveWhatChildElems(child, typeChain + "." + child.getType()); //Recursion step to look for more children.
      }
    }else{
      Logger.log(typeChain); //Let's log the chain of parent to child elements.
    }
  }else{
    Logger.log("*" + typeChain); //Let's mark the new type chains we have not seen.
  }
}
//Logger OUTPUT:
*PARAGRAPH.TEXT
PARAGRAPH
*PARAGRAPH.HORIZONTAL_RULE
PARAGRAPH
*PARAGRAPH.TEXT
*LIST_ITEM.TEXT
*LIST_ITEM.TEXT
*LIST_ITEM.TEXT
PARAGRAPH
*PARAGRAPH.TEXT
PARAGRAPH
*TABLE.TABLE_ROW
*TABLE.TABLE_ROW
PARAGRAPH
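OK, so each line of the log is a chain of an element and its children. We have some new element types (HORIZONTAL_RULE, TABLE_ROW and TEXT). If the chain is just a Paragraph with no children, it is logged as "PARAGRAPH". We can ignore it, since it is an empty line. We can also ignore HORIZONTAL_RULE, since it obviously contains no text.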
If we reach a TEXT element, we can perform our operation on it (for the OP, that would be a translation), as is the case for the LIST_ITEMs and PARAGRAPHs. However, we still need to handle the TableRow objects (logged as TABLE.TABLE_ROW). A TableRow is similar to our main 3 elements and can be added to our condition if(elemType == "TABLE" || elemType == "LIST_ITEM" || elemType == "PARAGRAPH"), which changes to if(elemType == "TABLE" || elemType == "LIST_ITEM" || elemType == "PARAGRAPH" || elemType == "TABLE_ROW").
This gives us another new element in the chain, TableCell (logged as TABLE.TABLE_ROW.TABLE_CELL), which we can again add to the if statement: if(elemType == "TABLE" || elemType == "LIST_ITEM" || elemType == "PARAGRAPH" || elemType == "TABLE_ROW" || elemType == "TABLE_CELL")
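As the list of container types grows, the chained comparison gets harder to read. A minimal sketch of one alternative follows. It assumes, as the logged output above suggests, that String(elem.getType()) yields the type name. CONTAINER_TYPES and isContainer are hypothetical names introduced here for illustration; they are not part of the Apps Script API.

```javascript
// Hypothetical names for illustration; only getType() comes from the Apps Script API.
var CONTAINER_TYPES = ["TABLE", "TABLE_ROW", "TABLE_CELL", "LIST_ITEM", "PARAGRAPH"];

// Returns true when the element is one of the five container types handled so far.
function isContainer(elem) {
  // Assumption: String() turns the ElementType enum into its name, as seen in the logs.
  return CONTAINER_TYPES.indexOf(String(elem.getType())) !== -1;
}
```

With this helper the condition in the recursive function could read if(isContainer(elem)), and adding another container type would only mean adding one string to the array.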
Time to see what happens once we handle the Table ElementTypes.
function selectionHasWhichtypeChains() {
  var doc = DocumentApp.getActiveDocument();
  var selection = doc.getSelection();
  var rangeElems = selection.getRangeElements();
  rangeElems.forEach(function(rangeElem){
    var elem = rangeElem.getElement();
    elemsHaveWhatChildElems(elem, elem.getType());
  });
}
function elemsHaveWhatChildElems(elem, typeChain){
  var elemType = elem.getType();
  if(elemType == "TABLE" || elemType == "LIST_ITEM" || elemType == "PARAGRAPH" || elemType == "TABLE_ROW" || elemType == "TABLE_CELL"){ //Let's see if the element is one of our basic 5. If so, it could have children.
    var numChildren = elem.getNumChildren(); //How many children are there?
    if(numChildren > 0){
      for(var i = 0; i < numChildren; i++){ //Let's go through them.
        var child = elem.getChild(i);
        elemsHaveWhatChildElems(child, typeChain + "." + child.getType()); //Recursion step to look for more children.
      }
    }else{
      Logger.log(typeChain); //Let's log the chain of parent to child elements.
    }
  }else{
    Logger.log("*" + typeChain); //Let's mark the new type chains we have not seen.
  }
}
//Logger OUTPUT:
*PARAGRAPH.TEXT
PARAGRAPH
*PARAGRAPH.HORIZONTAL_RULE
PARAGRAPH
*PARAGRAPH.TEXT
*LIST_ITEM.TEXT
*LIST_ITEM.TEXT
*LIST_ITEM.TEXT
PARAGRAPH
*PARAGRAPH.TEXT
PARAGRAPH
*TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
*TABLE.TABLE_ROW.TABLE_CELL.TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
*TABLE.TABLE_ROW.TABLE_CELL.TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
*TABLE.TABLE_ROW.TABLE_CELL.TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
*TABLE.TABLE_ROW.TABLE_CELL.TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH
*TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
*TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.HORIZONTAL_RULE
*TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
*TABLE.TABLE_ROW.TABLE_CELL.PARAGRAPH.TEXT
PARAGRAPH
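This is great! We have dug all the way down each parent element and reached either a Text element or an empty paragraph! From here we can modify our code slightly to add the operation we want to perform while keeping the structure of the document: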
function myFunction() {
  var doc = DocumentApp.getActiveDocument();
  var selection = doc.getSelection();
  if(!selection){ //getSelection() returns null when nothing is selected.
    return;
  }
  var rangeElems = selection.getRangeElements(); //Get the main elements of the selection.
  rangeElems.forEach(function(rangeElem){ //Let's run through each to find ALL of their children.
    var elem = rangeElem.getElement(); //We have a RangeElement. Let's get the full element.
    getNestedTextElements(elem, elem.getType()); //Time to go down the rabbit hole.
  });
}
function getNestedTextElements(elem, typeChain){
  var elemType = elem.getType();
  if(elemType == "TABLE" || elemType == "LIST_ITEM" || elemType == "PARAGRAPH" || elemType == "TABLE_ROW" || elemType == "TABLE_CELL"){ //Let's see if the element is one of our basic 5. If so, it could have children.
    var numChildren = elem.getNumChildren(); //How many children are there?
    if(numChildren > 0){
      for(var i = 0; i < numChildren; i++){ //Let's go through them.
        var child = elem.getChild(i);
        getNestedTextElements(child, typeChain + "." + child.getType()); //Recursion step to look for more children.
      }
    }
  }else if(elemType == "TEXT"){
    //THIS IS WHERE WE CAN PERFORM OUR OPERATIONS ON THE TEXT ELEMENT
    var text = elem.getText();
  }else{
    Logger.log("*" + typeChain); //Let's log the new elements we don't deal with yet, for future-proofing.
  }
}
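If it is more convenient to gather the Text elements first and operate on them afterwards, the same recursion can return an array. The following is only a sketch under stated assumptions. collectTextElements and transformTextElements are hypothetical names, and the sketch treats any element that exposes getNumChildren as a container instead of listing the five types explicitly, which is an assumption about how the returned elements behave.

```javascript
// Hypothetical helper: walks an element tree and returns every TEXT element in document order.
function collectTextElements(elem, found) {
  found = found || [];
  if (String(elem.getType()) === "TEXT") {
    found.push(elem);
  } else if (typeof elem.getNumChildren === "function") {
    // Assumption: containers (table, row, cell, paragraph, list item) expose getNumChildren/getChild.
    for (var i = 0; i < elem.getNumChildren(); i++) {
      collectTextElements(elem.getChild(i), found);
    }
  }
  // Elements without children and without text (e.g. HORIZONTAL_RULE) are skipped.
  return found;
}

// Hypothetical helper: replaces the content of each non-empty Text element with transform(oldText).
function transformTextElements(textElems, transform) {
  textElems.forEach(function (textElem) {
    var oldText = textElem.getText();
    if (oldText) {
      textElem.setText(transform(oldText));
    }
  });
}
```

For the OP's use case, the transform could be something like function(t){ return LanguageApp.translate(t, "en", "es"); }, where the language codes are placeholders. Replacing the whole string may not preserve formatting that varies within a single Text element, so that part would need testing on a real document.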
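Boom! Done. I know this is a long post, but I broke the solution down into parts to help new Apps Script coders understand how a selection (and, I guess, the document body) is structured and how to modify it when the structure is very complex (many nested elements). I really hope this helps. If anyone sees something that could be improved, please let me know.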
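As a note to the OP: this does not necessarily handle partially selected elements, but that can easily be handled by slightly modifying the first function to check isPartial() on the RangeElement.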
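A minimal sketch of that isPartial() check, assuming a partial RangeElement wraps a Text element and that its offsets are character positions within that element. getSelectedText is a hypothetical helper name; isPartial(), getStartOffset() and getEndOffsetInclusive() are RangeElement methods.

```javascript
// Hypothetical helper: returns the selected part of the text for a RangeElement wrapping a Text element.
function getSelectedText(rangeElem) {
  var elem = rangeElem.getElement();
  if (String(elem.getType()) !== "TEXT") {
    return null; // Not a Text element; the caller should recurse into it instead.
  }
  var text = elem.getText();
  if (rangeElem.isPartial()) {
    // The end offset is inclusive, so add 1 for substring's exclusive end index.
    return text.substring(rangeElem.getStartOffset(), rangeElem.getEndOffsetInclusive() + 1);
  }
  return text; // The whole element is selected.
}
```

In the first function, a RangeElement for which this returns null would be passed to the recursive function as before, while a non-null result is the exact text the user selected.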