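batch-file - How to count unique strings containing special characters in a Windows .bat file
Question
I am following this example to count unique strings: "Batch file to count occurrences". However, my strings may contain special characters, for example "orange c=US". When a string contains special characters, the counting does not work.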
Input:
[SUCCESS] xxxx,xxxx,xxxx,orange c=US,xxxx,xxxx
[SUCCESS] xxxx,xxxx,xxxx,orange c=CA,xxxx,xxxx
[SUCCESS] xxxx,xxxx,xxxx,orange c=US,xxxx,xxxx
[SUCCESS] xxxx,xxxx,xxxx,orange c=US,xxxx,xxxx
[SUCCESS] xxxx,xxxx,xxxx,orange c=CA,xxxx,xxxx
[SUCCESS] xxxx,xxxx,xxxx,apple c=US,xxxx,xxxx
[SUCCESS] xxxx,xxxx,xxxx,apple c=CA,xxxx,xxxx
Expected output:
orange c=US 3
orange c=CA 2
apple c=US 1
apple c=CA 1
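As a language-neutral cross-check of the expected counts above, here is a minimal sketch in Python (not batch). The names `LOG_LINES` and `count_items` are made up for illustration; the sketch assumes the item is always the fourth comma-separated field and, unlike `for /f`, it does not collapse consecutive commas.

```python
from collections import Counter

# Sample input copied from the question.
LOG_LINES = [
    "[SUCCESS] xxxx,xxxx,xxxx,orange c=US,xxxx,xxxx",
    "[SUCCESS] xxxx,xxxx,xxxx,orange c=CA,xxxx,xxxx",
    "[SUCCESS] xxxx,xxxx,xxxx,orange c=US,xxxx,xxxx",
    "[SUCCESS] xxxx,xxxx,xxxx,orange c=US,xxxx,xxxx",
    "[SUCCESS] xxxx,xxxx,xxxx,orange c=CA,xxxx,xxxx",
    "[SUCCESS] xxxx,xxxx,xxxx,apple c=US,xxxx,xxxx",
    "[SUCCESS] xxxx,xxxx,xxxx,apple c=CA,xxxx,xxxx",
]


def count_items(lines, marker="[SUCCESS]", field=3):
    """Count the values of one comma-separated field on lines containing marker."""
    counts = Counter()
    for line in lines:
        # Case-insensitive match, mirroring findstr /I.
        if marker.lower() not in line.lower():
            continue
        parts = line.split(",")
        if len(parts) > field:
            counts[parts[field].strip()] += 1
    return counts


if __name__ == "__main__":
    # Counter keeps first-seen order, so items print in order of appearance.
    for item, n in count_items(LOG_LINES).items():
        print(item, n)
```

Because the whole field is used as a dictionary key, the space and the equal sign need no special handling here; the difficulty in the batch version comes from using the string as part of a variable name.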
Code:
set "file=test.out.log"
(
for /f "tokens=1,2,3,4,5,6,7 delims=," %%a in ('findstr /I /N /C:"[SUCCESS]" "%file%"') do (
rem %%d holds the fourth field, e.g. "orange c=US"
set "t=%%d"
call :handleType
)
rem Enumerate find types and echo type and number of occurrences
rem The inner loop is to allow underscores inside type
for /f "tokens=1,* delims=_" %%a in ('set _type_ 2^>nul') do (
for /f "tokens=1,2 delims==" %%v in ("%%b") do (
echo %%v %%w
)
)) > output.txt
rem Clean and exit
endlocal
exit /b
pause > nul
:handleType
rem %t% ----> orange c=US
set "t=%t:'=%"
for /f "tokens=*" %%t in ("%t:"=%") do (
set /a "_type_%%~t+=1"
)
goto :EOF
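Solution
Simply replace the special characters (the equal sign and the space) with other characters while counting, then restore the original characters when producing the output:
EDIT: Code modified as requested in the later comments.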
@echo off
setlocal EnableDelayedExpansion
rem Count items
for /F "tokens=4 delims=," %%d in ('findstr /I /L "[SUCCESS]" test.txt') do (
set "item=%%d"
rem Replace special characters
for %%a in ("+=PLUS" "/=SLASH") do (
for /F "tokens=1,2 delims==" %%b in (%%a) do set "item=!item:%%b=%%c!"
)
rem Split on space and equal-sign characters
for /F "tokens=1,2,3 delims== " %%x in ("!item!") do (
set /A "count[%%x_%%y_%%z]+=1"
)
)
REM set count[
rem Show counts
for /F "tokens=2-5 delims=[_]=" %%a in ('set count[') do (
rem Replace back special characters
set "item=%%a"
for %%a in ("PLUS=+" "SLASH=/") do (
for /F "tokens=1,2 delims==" %%b in (%%a) do set "item=!item:%%b=%%c!"
)
echo !item! %%b=%%c %%d
)
Input:
[SUCCESS] xxxx,xxxx,xxxx,orange c=US,xxxx,xxxx
[SUCCESS] xxxx,xxxx,xxxx,orange c=CA,xxxx,xxxx
[SUCCESS] xxxx,xxxx,xxxx,orange c=US,xxxx,xxxx
[SUCCESS] xxxx,xxxx,xxxx,orange c=US,xxxx,xxxx
[SUCCESS] xxxx,xxxx,xxxx,orange c=CA,xxxx,xxxx
[SUCCESS] xxxx,xxxx,xxxx,apple c=US,xxxx,xxxx
[SUCCESS] xxxx,xxxx,xxxx,apple c=CA,xxxx,xxxx
[SUCCESS] xxxx,xxxx,xxxx,Grape+ c=US,xxxx,xxxx
[SUCCESS] xxxx,xxxx,xxxx,Grape+ c=CA,xxxx,xxxx
[SUCCESS] xxxx,xxxx,xxxx,Grape/L c=US,xxxx,xxxx
[SUCCESS] xxxx,xxxx,xxxx,Grape/L c=CA,xxxx,xxxx
Output:
apple c=CA 1
apple c=US 1
Grape+ c=CA 1
Grape+ c=US 1
Grape/L c=CA 1
Grape/L c=US 1
orange c=CA 2
orange c=US 3
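The replace-then-restore idea used by the batch code can also be sketched in Python, purely as an illustration of the logic rather than as a substitute for the batch file. The helpers `encode`, `decode`, and `report` are hypothetical names; the `PLUS`/`SLASH` placeholders and the underscore-joined key follow the batch code, and the case-insensitive sort is an assumption meant to approximate the order in which `set count[` lists variables.

```python
from collections import Counter

# Same substitutions as the batch code: characters that are awkward in a
# variable name are swapped for placeholder words.
REPLACEMENTS = {"+": "PLUS", "/": "SLASH"}

ITEMS = [
    "orange c=US", "orange c=CA", "orange c=US", "orange c=US", "orange c=CA",
    "apple c=US", "apple c=CA",
    "Grape+ c=US", "Grape+ c=CA", "Grape/L c=US", "Grape/L c=CA",
]


def encode(item):
    """Turn 'Grape+ c=US' into the safe key 'GrapePLUS_c_US'."""
    for char, word in REPLACEMENTS.items():
        item = item.replace(char, word)
    # Split on space and '=' and join with underscores, like count[%%x_%%y_%%z].
    name, key, value = item.replace("=", " ").split()
    return f"{name}_{key}_{value}"


def decode(encoded):
    """Reverse encode(): 'GrapePLUS_c_US' becomes 'Grape+ c=US'."""
    name, key, value = encoded.split("_")
    for char, word in REPLACEMENTS.items():
        name = name.replace(word, char)
    return f"{name} {key}={value}"


def report(items):
    """Count encoded keys, then list decoded items sorted by encoded key."""
    counts = Counter(encode(item) for item in items)
    ordered = sorted(counts.items(), key=lambda kv: kv[0].lower())
    return [f"{decode(key)} {n}" for key, n in ordered]


if __name__ == "__main__":
    print("\n".join(report(ITEMS)))
```

The same limitations apply as in the batch version: an item whose name already contains an underscore or the literal text `PLUS` or `SLASH` would not round-trip correctly, so the placeholders should be chosen so that they cannot occur in the real data.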