首页 > 解决方案 > 如何用 '.' 分割字符串 在 Haskell 中,但如果句点介于两个数字之间,例如:(2.5) 那么不要拆分?

问题描述

谁能帮我实现一个函数,将一个字符串拆分为一个字符串数组,用句点 (.) 拆分?

所以:"This is sentence one. This is sentence two."变成:["This is sentence one", "This is sentence two"]

但是,如果句点介于两个数字之间,例如:(2.5),那么不要拆分?

标签: haskell

解决方案


有很多方法可以做到这一点。这里有几个。

Text.Regex

import Text.Regex.TDFA -- I think this will work with the other regex backends too

getAllTextMatches ("One. Two. a 3.5 b. cde." =~ "([0-9]\\.[0-9]|[^.])+" :: AllTextMatches [] String)
-- ["One"," Two"," a 3.5 b"," cde"]

使用Text.ParserCombinators.ReadP(在 中base,因此不需要第三方库):

import Data.Char
import Text.ParserCombinators.ReadP

parseDigitsWithDecimalPoint = (\a b c -> [a,b,c]) <$> satisfy isDigit <*> char '.' <*> satisfy isDigit
parseNonDot = (:[]) <$> satisfy ('.' /=)
parseSentence = fmap concat . many $ parseDigitsWithDecimalPoint <++ parseNonDot
readP_to_S (sepBy parseSentence (char '.') <* eof) "One. Two. a 3.5 b. cde."
-- [(["One"," Two"," a 3.5 b"," cde",""],"")]

Text.Regex.Applicative

import Control.Applicative.Combinators
import Data.Char
import Text.Regex.Applicative

parseDigitsWithDecimalPoint = (\a b c -> [a,b,c]) <$> psym isDigit <*> sym '.' <*> psym isDigit
parseNonDot = (:[]) <$> psym ('.' /=)
parseSentence = fmap concat . many $ parseDigitsWithDecimalPoint <|> parseNonDot
match (sepBy parseSentence (sym '.')) "One. Two. a 3.5 b. cde."
-- Just ["One"," Two"," a 3.5 b"," cde",""]

Text.Megaparsec

{-# LANGUAGE FlexibleContexts, TypeFamilies #-}
import Data.Char
import Text.Megaparsec

parseDigitsWithDecimalPoint = (\a b c -> [a,b,c]) <$> satisfy isDigit <*> single '.' <*> satisfy isDigit
parseNonDot = (:[]) <$> anySingleBut '.'
parseSentence = fmap concat . many $ try parseDigitsWithDecimalPoint <|> parseNonDot
parseMaybe (sepBy parseSentence (single '.')) "One. Two. a 3.5 b. cde."
-- Just ["One"," Two"," a 3.5 b"," cde",""]

完全手工:

import Data.Char
import Data.List.NonEmpty

foo = go id where
  go f "" = f "" :| []
  go f ('.':xs) = f "" <| go id xs
  go f (x1:'.':x2:xs) | isDigit x1 && isDigit x2 = go (f . (x1:) . ('.':) . (x2:)) xs
  go f (x:xs) = go (f . (x:)) xs
foo "One. Two. a 3.5 b. cde."
-- "One" :| [" Two"," a 3.5 b"," cde",""]

推荐阅读