scala - Reading empty files with scala
Problem description
I wrote the following Scala function, which reads a file into a list of strings. I want to make sure that if the input file is empty, the returned list is empty too. Is there an elegant way to do this?
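import java.io.{BufferedReader, FileReader}
import scala.annotation.tailrec

def linesFromFile(file: String): List[String] = {
  // Lines are accumulated in reverse order, so reverse once at the end.
  def materialize(buffer: BufferedReader): List[String] = materializeReverse(buffer, Nil).reverse

  @tailrec
  def materializeReverse(buffer: BufferedReader, accumulator: List[String]): List[String] = {
    buffer.readLine match {
      case null => accumulator // End of file; an empty file yields Nil.
      case line => materializeReverse(buffer, line :: accumulator)
    }
  }

  val buffer = new BufferedReader(new FileReader(file))
  // Close the reader even if reading fails.
  try materialize(buffer) finally buffer.close()
}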
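Solution
Your code should work, but it uses memory inefficiently: it reads the entire file into memory, then spends more memory and processing time putting the lines back in the correct order.
Using the Source.fromFile method in the standard library is your best bet (it also supports various file encodings), as mentioned in other comments and answers.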
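As a rough illustration of the Source.fromFile approach, here is a minimal sketch. The explicit "UTF-8" encoding is an assumption made for the example and is not something the question specifies.

```scala
import scala.io.Source

// Sketch: read all lines of a file into a List using the standard library.
// The "UTF-8" encoding is an assumption; adjust it to match your files.
def linesFromFile(file: String): List[String] = {
  val source = Source.fromFile(file, "UTF-8")
  // getLines() returns an iterator; for an empty file it has no elements,
  // so toList yields Nil.
  try source.getLines().toList
  finally source.close()
}
```

Because the iterator is turned into a List before the source is closed, the returned value is safe to use after the function returns, and an empty file gives an empty list with no special handling.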
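However, if you must roll your own, I think a Stream (a lazy form of list) makes more sense than a List. You can return the lines one at a time and terminate the stream when the end of the file is reached. This can be done as follows: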
import java.io.{BufferedReader, FileReader}

def linesFromFile(file: String): Stream[String] = {

  // The value of buffer is available to the following helper function. No need to pass as
  // an argument.
  val buffer = new BufferedReader(new FileReader(file))

  // Helper: retrieve next line from file. Called only when next value requested.
  def materialize: Stream[String] = {

    // Uncomment to demonstrate non-recursive nature of this method.
    //println("Materialize called!")

    // Read the next line and wrap in an option. This avoids the hated null.
    Option(buffer.readLine) match {

      // If we've seen the end of the file, return an empty stream. We're done reading.
      case None => {
        buffer.close()
        Stream.empty
      }

      // Otherwise, prepend the line read to another call to this helper.
      case Some(line) => line #:: materialize
    }
  }

  // Start the process.
  materialize
}
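Although materialize looks recursive, it is in fact only called when another value needs to be retrieved, so you don't need to worry about stack overflow or recursion. You can verify this by uncommenting the println call.
For example (in a Scala REPL session):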
$ scala
Welcome to Scala 2.12.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_171).
Type in expressions for evaluation. Or try :help.
scala> import java.io.{BufferedReader, FileReader}
import java.io.{BufferedReader, FileReader}
scala> def linesFromFile(file: String): Stream[String] = {
     |
     |   // The value of buffer is available to the following helper function. No need to pass as
     |   // an argument.
     |   val buffer = new BufferedReader(new FileReader(file))
     |
     |   // Helper: retrieve next line from file. Called only when next value requested.
     |   def materialize: Stream[String] = {
     |
     |     // Uncomment to demonstrate non-recursive nature of this method.
     |     println("Materialize called!")
     |
     |     // Read the next line and wrap in an option. This avoids the hated null.
     |     Option(buffer.readLine) match {
     |
     |       // If we've seen the end of the file, return an empty stream. We're done reading.
     |       case None => {
     |         buffer.close()
     |         Stream.empty
     |       }
     |
     |       // Otherwise, prepend the line read to another call to this helper.
     |       case Some(line) => line #:: materialize
     |     }
     |   }
     |
     |   // Start the process.
     |   materialize
     | }
linesFromFile: (file: String)Stream[String]
scala> val stream = linesFromFile("TestFile.txt")
Materialize called!
stream: Stream[String] = Stream(Line 1, ?)
scala> stream.head
res0: String = Line 1
scala> stream.tail.head
Materialize called!
res1: String = Line 2
scala> stream.tail.head
res2: String = Line 2
scala> stream.foreach(println)
Line 1
Line 2
Materialize called!
Line 3
Materialize called!
Line 4
Materialize called!
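Note how materialize is only called when we try to read another line from the file. It is also not called again for a line we have already retrieved (for example, Line 1 and Line 2 in the output are preceded by Materialize called! only the first time they are referenced).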
As for empty files, an empty stream is returned in that case:
scala> val empty = linesFromFile("EmptyFile.txt")
Materialize called!
empty: Stream[String] = Stream()
scala> empty.isEmpty
res3: Boolean = true
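If you are on Scala 2.13 or later, Stream is deprecated in favour of LazyList. The following is a minimal sketch of the same helper under that assumption. The name linesFromFileLazy is hypothetical and does not come from the answer above.

```scala
import java.io.{BufferedReader, FileReader}

// Sketch for Scala 2.13+ (assumption): same idea as the Stream version,
// but built on LazyList. linesFromFileLazy is a hypothetical name.
def linesFromFileLazy(file: String): LazyList[String] = {
  val buffer = new BufferedReader(new FileReader(file))

  // Reads one line per call; the tail is only evaluated on demand.
  def materialize: LazyList[String] =
    Option(buffer.readLine()) match {
      // End of file: close the reader and end the sequence.
      case None =>
        buffer.close()
        LazyList.empty
      // Otherwise, prepend the line to a deferred call to this helper.
      case Some(line) => line #:: materialize
    }

  materialize
}
```

As with the Stream version, the reader is only closed once the end of the file has been reached, so a caller that stops early leaves it open. For an empty file, linesFromFileLazy(...).isEmpty should be true.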