How can I parse the IO String in Haskell?

Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/11229854/

Date: 2020-09-09 01:31:16  Source: igfitidea

How can I parse the IO String in Haskell?

string parsing haskell io monads

Asked by Simon

I've got a problem with Haskell. I have a text file that looks like this:

5.
7.
[(1,2,3),(4,5,6),(7,8,9),(10,11,12)].

I have no idea how to get the first 2 numbers (5 and 7 above) and the list from the last line. There is a dot at the end of each line.

I tried to build a parser, but the function called 'readFile' returns a value of the monadic type IO String. I don't know how I can get information out of that type of string.

I would prefer to work on an array of chars. Maybe there is a function which can convert from 'IO String' to [Char]?

Answered by Chris Taylor

I think you have a fundamental misunderstanding about IO in Haskell. Particularly, you say this:

Maybe there is a function which can convert from 'IO String' to [Char]?

No, there isn't [1], and the fact that there is no such function is one of the most important things about Haskell.

Haskell is a very principled language. It tries to maintain a distinction between "pure" functions (which don't have any side effects, and always return the same result when given the same input) and "impure" functions (which have side effects like reading from files, printing to the screen, writing to disk etc). The rules are:

  1. You can use a pure function anywhere (in other pure functions, or in impure functions)
  2. You can only use impure functions inside other impure functions.
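
A minimal sketch of the two rules. The names shout and greet are hypothetical and not part of the original answer. The pure function can be called from the IO action, while the commented-out definition shows the direction the type checker rejects.

```haskell
import Data.Char (toUpper)

-- Pure: no IO in the type, so the same input always gives the same output.
shout :: String -> String
shout s = map toUpper s ++ "!"

-- Impure: the IO in the type marks this as an action.
-- Rule 1: the pure function 'shout' may be used inside it.
greet :: String -> IO ()
greet name = putStrLn (shout ("hello, " ++ name))

-- Rule 2: the reverse does not typecheck, because getLine is an
-- IO String rather than a String:
--
-- broken :: String
-- broken = shout getLine

main :: IO ()
main = greet "world"
```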

The way that code is marked as pure or impure is using the type system. When you see a function signature like

digitToInt :: Char -> Int

you know that this function is pure. If you give it a Char it will return an Int, and moreover it will always return the same Int if you give it the same Char. On the other hand, a function signature like

getLine :: IO String

is impure, because the return type of String is marked with IO. Obviously getLine (which reads a line of user input) will not always return the same String, because it depends on what the user types in. You can't use this function in pure code, because adding even the smallest bit of impurity will pollute the pure code. Once you go IO you can never go back.

You can think of IO as a wrapper. When you see a particular type, for example x :: IO String, you should interpret that to mean "x is an action that, when performed, does some arbitrary I/O and then returns something of type String" (note that in Haskell, String and [Char] are exactly the same thing).

So how do you ever get access to the values from an IO action? Fortunately, the type of the function main is IO () (it's an action that does some I/O and returns (), which is the same as returning nothing). So you can always use your IO functions inside main. When you execute a Haskell program, what you are doing is running the main function, which causes all the I/O in the program definition to actually be executed - for example, you can read from and write to files, ask the user for input, write to stdout, etc.
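
To make the "action that, when performed, does some I/O" reading concrete, here is a small sketch (sayHi and actions are hypothetical names). It assumes nothing beyond the Prelude and shows that defining or storing an action performs no I/O; the actions only run once main sequences them.

```haskell
-- An action is an ordinary value; defining it performs no I/O.
sayHi :: IO ()
sayHi = putStrLn "hi"

-- Putting actions in a list performs no I/O either.
actions :: [IO ()]
actions = [sayHi, sayHi, putStrLn "bye"]

main :: IO ()
main = do
  let n = length actions   -- pure: counts the actions without running them
  print n
  sequence_ actions        -- only now are the actions performed, in order
```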

You can think of structuring a Haskell program like this:

  • All code that does I/O gets the IO tag (basically, you put it in a do block)
  • Code that doesn't need to perform I/O doesn't need to be in a do block - these are the "pure" functions.
  • Your main function sequences together the I/O actions you've defined in an order that makes the program do what you want it to do (interspersed with the pure functions wherever you like).
  • When you run main, you cause all of those I/O actions to be executed.


So, given all that, how do you write your program? Well, the function

readFile :: FilePath -> IO String

reads a file as a String. So we can use that to get the contents of the file. The function

lines :: String -> [String]

splits a String on newlines, so now you have a list of Strings, each corresponding to one line of the file. The function

init :: [a] -> [a]

drops the last element from a list (this will get rid of the final . on each line). The function

read :: (Read a) => String -> a

takes a String and turns it into a value of any Haskell data type that has a Read instance, such as Int or Bool. Combining these functions sensibly will give you your program.
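
As a rough illustration of those three functions working together in pure code, the sketch below applies them to a literal string that stands in for the file contents. The names sample, rows, trimmed, firstNumber and triples are made up for this example.

```haskell
-- Stand-in for the contents of the text file (an assumption for this sketch).
sample :: String
sample = "5.\n7.\n[(1,2,3),(4,5,6)]."

-- 'lines' splits on newlines.
rows :: [String]
rows = lines sample

-- 'init' drops the last character of each line, here the trailing dot.
trimmed :: [String]
trimmed = map init rows

-- 'read' picks the parser from the type we ask for.
firstNumber :: Int
firstNumber = read (trimmed !! 0)

triples :: [(Int, Int, Int)]
triples = read (trimmed !! 2)

main :: IO ()
main = do
  print rows
  print trimmed
  print firstNumber
  print triples
```

Note that read throws a runtime error if the text does not parse at the requested type, so this only suits input you trust.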

Note that the only time you actually need to do any I/O is when you are reading the file. Therefore that is the only part of the program that needs to use the IO tag. The rest of the program can be written "purely".

It sounds like what you need is the article The IO Monad For People Who Simply Don't Care, which should explain a lot of your questions. Don't be scared by the term "monad" - you don't need to understand what a monad is to write Haskell programs (notice that this paragraph is the only one in my answer that uses the word "monad", although admittedly I have used it four times now...)


Here's the program that (I think) you want to write

run :: IO (Int, Int, [(Int,Int,Int)])
run = do
  contents <- readFile "text.txt"   -- use '<-' here so that 'contents' is a String
  let [a,b,c] = lines contents      -- split on newlines
  let firstLine  = read (init a)    -- 'init' drops the trailing period
  let secondLine = read (init b)    
  let thirdLine  = read (init c)    -- this reads a list of Int-tuples
  return (firstLine, secondLine, thirdLine)
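
The program above crashes if the file has a different number of lines or a line that does not parse. One possible variation, not from the original answer, keeps the parsing in a pure function that returns Maybe and uses readMaybe from Text.Read. The names Parsed, stripDot and parseInput are hypothetical, and tolerating trailing whitespace before the line break is an assumption about the input.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)
import Text.Read (readMaybe)

type Parsed = (Int, Int, [(Int, Int, Int)])

-- Remove trailing whitespace, then require and drop a final '.'.
stripDot :: String -> Maybe String
stripDot line =
  case reverse (dropWhileEnd isSpace line) of
    '.' : rest -> Just (reverse rest)
    _          -> Nothing

-- Pure parser: Nothing signals any kind of malformed input.
parseInput :: String -> Maybe Parsed
parseInput contents =
  case lines contents of
    [a, b, c] -> do
      x  <- stripDot a >>= readMaybe
      y  <- stripDot b >>= readMaybe
      zs <- stripDot c >>= readMaybe
      return (x, y, zs)
    _ -> Nothing

main :: IO ()
main = do
  print (parseInput "5.\n7. \n[(1,2,3),(4,5,6)].")
  print (parseInput "not the expected format")
```

With this split, the only IO left in a real program would be something like fmap parseInput (readFile "text.txt").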

To answer npfedwards' comment about applying lines to the output of readFile "text.txt", you need to realize that readFile "text.txt" gives you an IO String, and it's only when you bind it to a variable (using contents <-) that you get access to the underlying String, so that you can apply lines to it.

Remember: once you go IO, you never go back.


[1] I am deliberately ignoring unsafePerformIO because, as implied by the name, it is very unsafe! Don't ever use it unless you really know what you are doing.

Answered by pooya72

As a programming noob, I too was confused by IOs. Just remember that if you go IO you never come out. Chris wrote a great explanation of why. I just thought it might help to give some examples of how to use an IO String in a monad. I'll use getLine, which reads user input and returns an IO String.

line <- getLine 

All this does is bind the user input from getLine to a value named line. If you type this in ghci and then type :type line, it will return:

:type line
line :: String

But wait! getLine returns an IO String

:type getLine
getLine :: IO String

So what happened to the IOness from getLine? <- is what happened. <- is your IO friend. It allows you to bring out the value that is tainted by the IO within a monad and use it with your normal functions. Monadic code like this is easy to spot because it usually sits in a block that begins with do. Like so:

main = do
    putStrLn "How much do you love Haskell?"
    amount <- getLine
    putStrLn ("You love Haskell this much: " ++ amount)
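
For the curious, <- inside a do block is shorthand for the >>= operator. The sketch below writes the same function both ways. So that it needs no keyboard input, it is generalized over any Monad and takes the input action as a parameter; that generalization and the names withDo and withBind are assumptions of this example, not part of the answer.

```haskell
-- Written with do-notation: '<-' names the result of the action.
withDo :: Monad m => m String -> m String
withDo getInput = do
  amount <- getInput
  return ("You love Haskell this much: " ++ amount)

-- The same function with the do block desugared by hand into '>>='.
withBind :: Monad m => m String -> m String
withBind getInput =
  getInput >>= \amount ->
    return ("You love Haskell this much: " ++ amount)

main :: IO ()
main = do
  -- 'return "lots"' stands in for getLine here.
  a <- withDo (return "lots")
  b <- withBind (return "lots")
  putStrLn a
  print (a == b)
```

Passing getLine as the argument gives the interactive behaviour of the program above.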

If you're like me, you'll soon discover that liftIO is your next best monad friend, and that $ helps reduce the number of parentheses you need to write.
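
A small sketch of both, under the assumption that only the base library is available. describe and describe' compute the same string with and without $, and logLine (a hypothetical helper) uses liftIO from Control.Monad.IO.Class so that it can run in IO itself or in any monad stacked on top of IO.

```haskell
import Control.Monad.IO.Class (MonadIO, liftIO)

-- Nested parentheses.
describe :: Int -> String
describe n = "total: " ++ show (sum (map (* 2) [1 .. n]))

-- The same expression with '$' replacing the inner parentheses.
describe' :: Int -> String
describe' n = "total: " ++ (show $ sum $ map (* 2) [1 .. n])

-- liftIO :: MonadIO m => IO a -> m a
-- It lifts a plain IO action into any monad that can perform IO.
logLine :: MonadIO m => String -> m ()
logLine s = liftIO $ putStrLn $ "log: " ++ s

main :: IO ()
main = do
  putStrLn (describe 3)
  putStrLn $ describe' 3
  logLine "done"
```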

So how do you get the information from readFile? Well, if readFile's output is an IO String, like so:

:type readFile
readFile :: FilePath -> IO String

Then all you need is your friendly <-:

 yourdata <- readFile "samplefile.txt"

Now if you type that in ghci and check the type of yourdata, you'll notice it's a simple String.

:type yourdata
yourdata :: String

Answered by JJJ

As people have already said, if you have two functions, one being readStringFromFile :: FilePath -> IO String and the other doTheRightThingWithString :: String -> Something, then you don't really need to escape the string from IO, since you can combine these two functions in various ways (in the snippets below, read readStringFromFile as already applied to a file path, which is what makes it an IO String):

With fmap for IO (IO is a Functor):

fmap doTheRightThingWithString readStringFromFile

With (<$>) for IO (IO is an Applicative and (<$>) == fmap):

import Control.Applicative

...

doTheRightThingWithString <$> readStringFromFile

With liftM for IO (liftM == fmap):

import Control.Monad

...

liftM doTheRightThingWithString readStringFromFile

With (>>=) for IO (IO is a Monad, fmap == (<$>) == liftM == \f m -> m >>= return . f):

readStringFromFile >>= \string -> return (doTheRightThingWithString string)
readStringFromFile >>= \string -> return $ doTheRightThingWithString string
readStringFromFile >>= return . doTheRightThingWithString
return . doTheRightThingWithString =<< readStringFromFile

With do notation:

do
  ...
  string <- readStringFromFile
  -- ^ you escape String from IO but only inside this do-block
  let result = doTheRightThingWithString string
  ...
  return result

Every time, you get an IO Something.
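
The variants can be put side by side in one compilable sketch. Everything concrete here is an assumption made for illustration: readStringFromFile is a stand-in that fabricates a string instead of touching the disk (a real version could simply be readFile), Something is taken to be Int, and the via* names are made up. Each variant passes the file path explicitly so that the types line up.

```haskell
import Control.Monad (liftM)

-- Stand-in for a real file read, so the example runs without any file.
readStringFromFile :: FilePath -> IO String
readStringFromFile path = return ("contents of " ++ path)

-- 'Something' is assumed to be Int here.
doTheRightThingWithString :: String -> Int
doTheRightThingWithString = length

viaFmap, viaOperator, viaLiftM, viaBind, viaDo :: FilePath -> IO Int
viaFmap path     = fmap doTheRightThingWithString (readStringFromFile path)
viaOperator path = doTheRightThingWithString <$> readStringFromFile path
viaLiftM path    = liftM doTheRightThingWithString (readStringFromFile path)
viaBind path     = readStringFromFile path >>= return . doTheRightThingWithString
viaDo path = do
  string <- readStringFromFile path
  let result = doTheRightThingWithString string
  return result

main :: IO ()
main = do
  -- Run all five variants on the same path; they should agree.
  results <- mapM ($ "file.txt") [viaFmap, viaOperator, viaLiftM, viaBind, viaDo]
  print results
```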

Why would you want to do it like that? Well, this way you have pure and referentially transparent programs (functions) in your language. This means that every function whose type is IO-free is pure and referentially transparent, so that for the same arguments it returns the same values. For example, doTheRightThingWithString would return the same Something for the same String. However, readStringFromFile, which is not IO-free, can return a different string every time (because the file can change), so you can't let such an impure value escape from IO.

Answered by dave4420

If you have a parser of this type:

myParser :: String -> Foo

and you read the file using

readFile "thisfile.txt"

then you can read and parse the file using

fmap myParser (readFile "thisfile.txt")

The result of that will have type IO Foo.

The fmap means myParser runs "inside" the IO.

Another way to think of it is that whereas myParser :: String -> Foo, fmap myParser :: IO String -> IO Foo.
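
A brief sketch of that type-level view. Foo and the body of myParser are placeholders invented for this example (the answer leaves them unspecified), and return "a b c" stands in for readFile "thisfile.txt" so that no file is needed.

```haskell
-- Placeholder result type and parser; the real ones are up to you.
newtype Foo = Foo [String] deriving (Eq, Show)

myParser :: String -> Foo
myParser = Foo . words

-- fmap specialised to IO, matching the type given in the answer.
parseInIO :: IO String -> IO Foo
parseInIO = fmap myParser

main :: IO ()
main = do
  foo <- parseInIO (return "a b c")
  print foo
```

The same fmap myParser also works on other functors such as Maybe, which is a convenient way to try the parser without any I/O.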
