autohotkey - How do I replace text followed by a comma (,) or a period (.) using a tab-delimited file (txt)?
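Problem description
I am new to AutoHotkey. I have a script that helps me shorten words I don't need, and I run into a problem when trying to replace text that is followed by a comma or a period. Here is my script: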
#NoEnv
#SingleInstance force
SetWorkingDir, %A_ScriptDir%
SendMode, Input
; -- Ctrl + SPACE -> Select all text + replace whole words only + title case
^SPACE::
NonCapitalized := "a|an|in|is|of|the|this|with" ; List of words that shouldn't be capitalized, separated by pipes
ReplacementsFile := "replacements.txt" ; Path to replacements file (tab delimited file with 2 columns, UTF-8-BOM, CR+LF)
Send, ^a ; Selects all text
Gosub, SelectToClip ; Copies the selected text to the clipboard
FileRead, Replacements, % ReplacementsFile ; Reads the replacements file
If ErrorLevel ; Error message if file is not found
{
MsgBox, % "File not found: " ReplacementsFile
Return
}
StringUpper, Clipboard, Clipboard, T ; Whole clipboard to title case
Clipboard := RegExReplace(Clipboard, "i)(?<![!?.]) \b(" NonCapitalized ")\b", " $L1") ; Changes to lowercase all words from the list "NonCapitalized", except those preceded by new line/period/exclamation mark/question mark
pos := 0
While pos := RegExMatch(Replacements, "m`a)^([^\t]+)\t(.*)$", FoundReplace, pos + 1) ; Gets all replacements from the tab delimited file
Clipboard := RegExReplace(Clipboard, "i)\b" FoundReplace1 "\b", FoundReplace2) ; Replaces all occurrences in the clipboard
; add exceptions
Clipboard := StrReplace(Clipboard, "Vice President,", "")
Clipboard := StrReplace(Clipboard, "Director,", "")
Clipboard := StrReplace(Clipboard, "Senior Vice President,", "")
; = End of exceptions
Clipboard := RegExReplace(Clipboard, "^\s+|\s+(?=([\s,;:.]))|\s$") ; Removes extra spaces
Send, ^v ; Pastes the clipboard
Return
SelectToClip:
Clipboard := ""
Send, ^c
ClipWait, 0
If ErrorLevel
Exit
Sleep, 50
Return
Here is part of my replacements file:
Chief Operating, Financial Officer CFO & COO
Head,
President,
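My question is: how can I add text followed by a comma (,) or a period (.) to the tab-delimited file, instead of adding more lines to the AHK file? As you can see, the script does not treat the comma and the period as part of the text.
Thank you very much for your time and help!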
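Solution
Please indent your code; otherwise it is harder to read.
In regular expressions, the \b assertion requires a word character on one side and a non-word character on the other, so your code cannot handle search strings that begin or end with a non-word character such as a comma or a period. The PCRE documentation says:
...\b and \B, because they are defined in terms of \w and \W.
...
A word boundary is a position in the subject string where the current character and the previous character do not both match \w or \W (i.e. one matches \w and the other matches \W), or the start or end of the string if the first or last character matches \w, respectively.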
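To make the difference concrete, here is a minimal sketch that runs the same search with \b and with lookarounds. The sample sentence and the variable names are invented for illustration; only the two patterns come from the scripts in this thread.

```autohotkey
#NoEnv
; Sample text and search term are made up for illustration.
text := "Jane Doe, President, Acme Corp."
term := "President,"

; The trailing \b sits between the comma and a space. Both are non-word
; characters, so there is no word boundary there and nothing matches.
withBoundary := RegExReplace(text, "i)\b" term "\b", "", countBoundary)

; The lookarounds only require that no word character touches the term on
; either side. \Q...\E makes the term literal, so "," or "." in it is not
; treated as regex syntax.
withLookaround := RegExReplace(text, "i)(?<!\w)\Q" term "\E(?!\w)", "", countLookaround)

MsgBox % "With \b: " countBoundary " replacement(s)`n" withBoundary
    . "`n`nWith lookarounds: " countLookaround " replacement(s)`n" withLookaround
```

The \b version reports 0 replacements and the lookaround version reports 1. The leftover double space is what the final "Removes extra spaces" step in the script cleans up.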
The following worked in my tests:
#NoEnv
#SingleInstance force
SetWorkingDir %A_ScriptDir%
SendMode Input
; -- Ctrl + SPACE -> Select all text + replace whole words only + title case
^SPACE::
FunctionNameOfYourChoice() {
    ; Static variables are initialized only once, so the file is not re-read on each key press.
    Static NonCapitalized := "a|an|in|is|of|the|this|with" ; List of words that shouldn't be capitalized, separated by pipes
        , ReplacementsFile := "replacements.txt" ; Path to replacements file (tab delimited file with 2 columns, UTF-8-BOM, CR+LF)
        , Replacements := ReadReplacements(ReplacementsFile)
    Send ^a ; Selects all text
    SelectToClip() ; Copies the selected text to the clipboard
    ; ErrorLevel no longer reflects FileRead at this point, so test the cached file content instead.
    If (Replacements = "") { ; Error message if the file is missing or empty
        MsgBox % "File not found or empty: " ReplacementsFile
        Return
    }
    ; StringUpper is deprecated in v2.
    ; Working on a plain variable is better than working on the clipboard in terms of performance and reliability.
    cbCnt := Format("{:T}", Clipboard) ; Whole clipboard to title case
    ; Changes to lowercase all words from the list "NonCapitalized", except those preceded by new line/period/exclamation mark/question mark
    cbCnt := RegExReplace(cbCnt, "i)(?<![!?.]) \b(" NonCapitalized ")\b", " $L1")
    ; Goes through each pair of search and replacement strings
    Loop Parse, Replacements, `n, `r
    {
        FoundReplace := StrSplit(A_LoopField, "`t")
        ; Replaces all occurrences. \Q...\E takes the search text literally, and the lookarounds stand in for \b.
        cbCnt := RegExReplace(cbCnt, "i)(?<!\w)\Q" FoundReplace.1 "\E(?!\w)", FoundReplace.2)
    }
    ; Stopgap: capitalizes a lowercase letter that directly follows a hyphen
    cbCnt := RegExReplace(cbCnt, "(?<=\w-)([a-z])", "$U1")
    /*
    ; Now the following can be included in the replacements.txt file.
    cbCnt := StrReplace(cbCnt, "Vice President,")
    cbCnt := StrReplace(cbCnt, "Director,")
    cbCnt := StrReplace(cbCnt, "Senior Vice President,")
    */
    ; Removes extra spaces
    ; Note: \s also matches newlines, so this can remove line breaks as well. Are you sure you want to do this?
    Clipboard := RegExReplace(cbCnt, "^\s+|\s+(?=([\s,;:.]))|\s$")
    Send ^v ; Pastes the clipboard
}
SelectToClip() {
    Clipboard := ""
    Send ^c
    ClipWait 0.5 ; Specifying 0 wouldn't be a very good idea.
    If ErrorLevel
        Exit
    Sleep 50
}
ReadReplacements(path) {
    FileRead, Replacements, % path
    Return Replacements
}
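As a rough illustration of how entries ending in a comma can live in replacements.txt, the sketch below parses an in-memory string laid out like the file: search text, a tab, then the replacement (empty to delete). The sample entries, the sample sentence, and the names sample, text and pair are assumptions made up for this example.

```autohotkey
#NoEnv
; Hypothetical file content: search text, a tab, then the replacement.
; An empty second column deletes the search text.
sample := "Senior Vice President,`t`r`n"
        . "Vice President,`t`r`n"
        . "Chief Financial Officer`tCFO`r`n"

text := "Ann Lee, Senior Vice President, Chief Financial Officer"

Loop Parse, sample, `n, `r
{
    If (A_LoopField = "") ; Skips blank lines, such as one after the final CR+LF
        Continue
    pair := StrSplit(A_LoopField, "`t")
    ; Same pattern as in the script above: literal search text, no adjacent word characters
    text := RegExReplace(text, "i)(?<!\w)\Q" pair.1 "\E(?!\w)", pair.2)
}
MsgBox % text
```

Because the lines are applied from top to bottom, it is probably safest to list longer phrases first. If "Vice President," came before "Senior Vice President,", the shorter entry would match first and leave a stray "Senior" behind.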
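Edit:
Yes, there was a typo in the second regular expression (in its first assertion); it has been corrected. The problem with "and" is not reproduced here.
I added another RegExReplace as a not-so-elegant stopgap for the hyphen problem you described, but note that this is inherently a non-trivial problem, because the capitalization in such cases depends on semantics.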
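A small sketch of what that extra RegExReplace does, and of why it is only a stopgap. The sample string is invented and stands in for text that has already been title-cased.

```autohotkey
#NoEnv
; Made-up sample in which the letters after the hyphens are still lowercase.
sample := "Vice-president And Editor-in-chief"

; Uppercases any lowercase letter that directly follows "<word character>-".
fixed := RegExReplace(sample, "(?<=\w-)([a-z])", "$U1")

MsgBox % sample "`n" fixed ; second line: Vice-President And Editor-In-Chief
```

"Vice-President" comes out as intended, but "Editor-In-Chief" shows the limitation: the pattern cannot tell that "in" would normally stay lowercase there.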