How to search for functions in a text file and edit them according to the order in which they are encountered, using Haskell

Problem description

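I'm new to Haskell and I need to create a simple word processor. It has to read text from the command line and replace the functions found in the text with some other text. All functions in the text are numbered in order. Each function, i.e. section, table, figure and ref, must be located and numbered independently, according to its order of appearance. Since tables and figures sit inside sections (just like in a book), the table and figure counters are reset whenever a new section is encountered. So I made a loop in which, while searching for functions, the tables and figures are handled inside the section loop. These functions all start with the '\' character (the escape character) and must be replaced by text as follows: \Section{title}{id}: -> "Section n: title", where n is the section number. \table{title}{id} : -> "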

import System.Environment (getArgs)
import System.IO (openFile, ReadWriteMode, hGetContents)
import Text.Format
import Data.Text.Internal.Search
import Data.Text
main = do
        args <- getArgs -- reading and writing files from CMD
        file <- openFile (head args) ReadWriteMode
        text <- hGetContents file

Then I counted how many times section, table and figure appear in the text, so that I could implement a loop:

s = count . "section" -- count number of occurences of a certain word
t = count . "table"
f = count .  "Figure"
r = count . "ref"    

Finally, I implemented a text-formatting function inside a loop that rewrites the functions according to how many times they occur:

i = 1
 
for i in range s format "\Section{title}{id}" [show i, show {title}] --change the text

    for j in range t format "\table{title}{id}" [show i.j, show {title}]

    for k in range f format "\figure{title}{id}" [show i.k, show {title}]

However, I know this code is wrong. I'm stuck and need help.

Tags: loops, haskell

Solution


Avoid for loops and mutable state.

module Main where
-- The function doNumbering replaces \functions in the input by numbered text
-- Parameters:
-- 1. String - the input (the unprocessed part, in my code often called xs or x:xs)
-- 2. Int - the next section number (in my code often called s)
-- 3. Int - the table number (in my code often called t)
-- 4. Int - the figure number (in my code often called f)
-- 5. String - the output (the finished part, the return value of the function)
doNumbering :: String -> Int -> Int -> Int -> String
-- If there is \Section... at the top of the input, then put Section s: ... to the output
-- and change state like (tail xs) s+1 1 1
doNumbering ('\\':'S':'e':'c':'t':'i':'o':'n':xs) s t f
  = "Section " ++ (show s)
    ++ ": " ++ getStr1 (tail xs) (s+1) 1 1 
-- If there is \Table... at the top of the input, then put Table s-1.t: ... to the output
-- and change state like (tail xs) s t+1 f
doNumbering ('\\':'T':'a':'b':'l':'e':xs) s t f
  = "Table " ++ (show (s-1)) ++ "." ++ (show t)
    ++ ": " ++ getStr1 (tail xs) s (t+1) f
-- If there is \Figure... in the input, then put Figure s-1.f: ... to the output
-- and change state like (tail xs) s t (f+1)
doNumbering ('\\':'F':'i':'g':'u':'r':'e':xs) s t f
  = "Figure " ++ (show (s-1)) ++ "." ++ (show f)
    ++ ": " ++ getStr1 (tail xs) s t (f+1)

-- No template above matches, so we move the character at the top of the input to output.
doNumbering (x:xs) s t f = [x] ++ doNumbering xs s t f
doNumbering "" _ _ _ = "" -- The input string is empty (the end)

-- The function `getStr1` copies the first parameter of a \function to the output
-- and hands over to `getStr2` at the closing brace.
getStr1 :: String -> Int -> Int -> Int -> String
getStr1 ('}':xs) s t f = getStr2 xs s t f
getStr1 (x:xs) s t f = [x] ++ getStr1 xs s t f
getStr1 "" _ _ _ = "" -- Unterminated parameter: stop instead of crashing

-- The function `getStr2` skips the not yet used second parameter in braces and calls doNumbering.
-- It probably should be improved for the `ref` function.
getStr2 :: String -> Int -> Int -> Int -> String
getStr2 ('}':xs) s t f = doNumbering xs s t f
getStr2 (_:xs) s t f = getStr2 xs s t f
getStr2 "" _ _ _ = "" -- Unterminated parameter: stop instead of crashing

main = putStrLn $ doNumbering input 1 1 1

input :: String
input =
        "\\Section{sss}{sss}\n"
     ++ "Text text text text.\n"
     ++ "\\Table{ttt}{ttt}\n"
     ++ "Text text text text.\n"
     ++ "\\Figure{fff}{fff}\n"
     ++ "Text text text text.\n"
     ++ "\\Figure{fff}{fff}\n"
     ++ "Text text text text.\n"
     ++ "\\Table{ttt}{ttt}\n"
     ++ "Text text text text.\n"
     ++ "\\Figure{fff}{fff}\n"
     ++ "Text text text text.\n"
     ++ "\\Section{sss}{sss}\n"
     ++ "Text text text text.\n"
     ++ "\\Figure{fff}{fff}\n"
     ++ "Text text text text.\n"
     ++ "\\Figure{fff}{fff}\n"
     ++ "Text text text text.\n"
     ++ "\\Table{ttt}{ttt}\n"
     ++ "Text text text text.\n"
     ++ "\\Figure{fff}{fff}\n"
     ++ "Text text text text.\n"
     ++ "\\Table{ttt}{ttt}\n"
     ++ "Text text text text.\n"
     ++ "\\Section{sss}{sss}\n"
     ++ "\\Table{ttt}{ttt}\n"
     ++ "Text text text text.\n"
     ++ "\\Figure{fff}{fff}\n"
     ++ "Text text text text.\n"
     ++ "\\Table{ttt}{ttt}\n"
     ++ "Text text text text.\n"

Output:

Section 1: sss
Text text text text.
Table 1.1: ttt
Text text text text.
Figure 1.1: fff
Text text text text.
Figure 1.2: fff
Text text text text.
Table 1.2: ttt
Text text text text.
Figure 1.3: fff
Text text text text.
Section 2: sss
Text text text text.
Figure 2.1: fff
Text text text text.
Figure 2.2: fff
Text text text text.
Table 2.1: ttt
Text text text text.
Figure 2.3: fff
Text text text text.
Table 2.2: ttt
Text text text text.
Section 3: sss
Table 3.1: ttt
Text text text text.
Figure 3.1: fff
Text text text text.
Table 3.2: ttt
Text text text text.
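
The code above hard-codes its input, while the question asks for text read from the command line. A minimal sketch of that wiring, using only base, might look like the following. The names `transform`, `Source` and `pickSource` are hypothetical and not part of the original answer; `transform` is a stand-in that would be replaced by something like `\txt -> doNumbering txt 1 1 1`.

```haskell
module Main where

import System.Environment (getArgs)
import System.IO (hPutStrLn, stderr)

-- Hypothetical stand-in for the numbering pass; swap in
-- (\txt -> doNumbering txt 1 1 1) to use the answer's function.
transform :: String -> String
transform = id

-- Where the text comes from: one argument names a file,
-- no arguments means standard input, anything else is an error.
data Source = FromFile FilePath | FromStdin | BadUsage
  deriving (Eq, Show)

pickSource :: [String] -> Source
pickSource []     = FromStdin
pickSource [path] = FromFile path
pickSource _      = BadUsage

main :: IO ()
main = do
  args <- getArgs
  case pickSource args of
    FromFile path -> readFile path >>= putStr . transform
    FromStdin     -> interact transform
    BadUsage      -> hPutStrLn stderr "usage: numbering [FILE]"
```

Using readFile avoids opening the file in a read/write mode, which is not needed when the result is only printed.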

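There is one ugly thing in my code: the arguments s t f have to be repeated after every call to a parsing function. This is a consequence of referential transparency. It can be handled with monads; see do notation.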
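
As a rough illustration of the monadic approach, here is a sketch that keeps the three counters in a small hand-written state monad, so they no longer appear in every call. Everything in it (`Counters`, `State`, `splitArgs`, `number`, `renumber`) is an assumption made for this example, not the answer's code; in practice Control.Monad.State from the mtl package would replace the hand-written `State`. Unlike the answer, it stores the current section number rather than the next one.

```haskell
module Main where

import Data.List (stripPrefix)

-- Hypothetical record holding the current section, table and figure numbers.
data Counters = Counters { secNo :: Int, tabNo :: Int, figNo :: Int }

-- A minimal state monad written by hand so the sketch needs only base.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s'')

instance Monad (State s) where
  State g >>= f = State $ \s -> let (a, s') = g s in runState (f a) s'

get :: State s s
get = State $ \s -> (s, s)

put :: s -> State s ()
put s = State $ \_ -> ((), s)

evalState :: State s a -> s -> a
evalState m s = fst (runState m s)

-- Split "{title}{id}rest" into (title, rest), discarding the id.
-- Returns Nothing when the braces are malformed.
splitArgs :: String -> Maybe (String, String)
splitArgs ('{':xs) =
  case break (== '}') xs of
    (title, '}':'{':ys) ->
      case break (== '}') ys of
        (_, '}':rest) -> Just (title, rest)
        _             -> Nothing
    _ -> Nothing
splitArgs _ = Nothing

-- Walk the input; the counters live in the monad instead of the argument list.
number :: String -> State Counters String
number "" = pure ""
number input@(x:xs)
  | Just (title, rest) <- stripPrefix "\\Section" input >>= splitArgs = do
      st <- get
      let s = secNo st + 1
      put (Counters s 0 0) -- a new section resets tables and figures
      (("Section " ++ show s ++ ": " ++ title) ++) <$> number rest
  | Just (title, rest) <- stripPrefix "\\Table" input >>= splitArgs = do
      st <- get
      let t = tabNo st + 1
      put st { tabNo = t }
      (("Table " ++ show (secNo st) ++ "." ++ show t ++ ": " ++ title) ++)
        <$> number rest
  | Just (title, rest) <- stripPrefix "\\Figure" input >>= splitArgs = do
      st <- get
      let f = figNo st + 1
      put st { figNo = f }
      (("Figure " ++ show (secNo st) ++ "." ++ show f ++ ": " ++ title) ++)
        <$> number rest
  | otherwise = (x :) <$> number xs -- no function here: copy one character

renumber :: String -> String
renumber txt = evalState (number txt) (Counters 0 0 0)

main :: IO ()
main = putStr $ renumber
  "\\Section{Intro}{s1}\n\\Table{Data}{t1}\n\\Figure{Plot}{f1}\n\\Section{Next}{s2}\n\\Figure{Map}{f2}\n"
```

Running this prints "Section 1: Intro", "Table 1.1: Data", "Figure 1.1: Plot", "Section 2: Next" and "Figure 2.1: Map", one per line. A malformed function such as an unclosed brace is copied through unchanged rather than causing a crash.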

