scala - Directive to complete with RandomAccessFile read
Problem description
I have a large data file and respond to GET requests with very small portions of that file, returned as Array[Byte].
The directive is:
get {
  dataRepo.load(param).map(data =>
    complete(
      HttpResponse(
        entity = HttpEntity(myContentType, data),
        headers = List(gzipContentEncoding)
      )
    )
  ).getOrElse(complete(HttpResponse(status = StatusCodes.NoContent)))
}
where dataRepo.load is a function along the lines of:
val pointers: Option[(Long, Int)] = calculateFilePointers(param)
pointers.map { case (index, length) =>
  val dataReader = new RandomAccessFile(dataFile, "r")
  try {
    dataReader.seek(index)
    val data = Array.ofDim[Byte](length)
    dataReader.readFully(data)
    data
  } finally dataReader.close() // release the file handle after the read
}
Is there a more efficient way to pipe the RandomAccessFile read directly back in the response, rather than having to read it fully first?
Solution
Instead of reading the data into an Array[Byte], you could create an Iterator[Array[Byte]] that reads the file one chunk at a time:
val dataReader = new RandomAccessFile(dataFile, "r") // the mode is a String, not a Char
val chunkSize = 1024
// Iterate over offsets relative to index, so the counter stays an Int
Iterator
  .range(0, length, chunkSize)
  .map { offset =>
    // The last chunk may be shorter than chunkSize
    val currentBytes = Array.ofDim[Byte](math.min(chunkSize, length - offset))
    dataReader.seek(index + offset)
    dataReader.readFully(currentBytes)
    currentBytes
  }
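The snippet above never closes the RandomAccessFile. One way to handle that is to wrap the reads in a small iterator that opens the file lazily and closes it after the last chunk or on a failed read. The following is a minimal sketch using only the JDK; the names ChunkedRead and chunks are hypothetical and not part of the original answer.

```scala
import java.io.{File, RandomAccessFile}

object ChunkedRead {
  /** Lazily reads `length` bytes starting at `index`, in pieces of at most `chunkSize` bytes. */
  def chunks(file: File, index: Long, length: Int, chunkSize: Int = 1024): Iterator[Array[Byte]] =
    new Iterator[Array[Byte]] {
      // Opened only when the first chunk is requested.
      private lazy val reader = new RandomAccessFile(file, "r")
      private var offset = 0

      def hasNext: Boolean = offset < length

      def next(): Array[Byte] = {
        if (!hasNext) throw new NoSuchElementException("no more chunks")
        // The last chunk may be shorter than chunkSize.
        val buffer = Array.ofDim[Byte](math.min(chunkSize, length - offset))
        var ok = false
        try {
          reader.seek(index + offset)
          reader.readFully(buffer)
          offset += buffer.length
          ok = true
          buffer
        } finally {
          // Close after the final chunk, or if the read failed part-way.
          if (!ok || !hasNext) reader.close()
        }
      }
    }
}
```

Note that if the consumer abandons the iterator before the last chunk (for example, when a client disconnects), this sketch leaves the file open, so a production version would need an explicit cleanup hook.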
This iterator can now feed an Akka Source:
// Assumes dataRepo.load now returns the Iterator[Array[Byte]] described above
val source : Source[Array[Byte], _] =
  Source.fromIterator(() => dataRepo.load(param))
Which can then feed an HttpEntity:
val byteStrSource : Source[ByteString, _] = source.map(ByteString.apply)
val httpEntity = HttpEntity(myContentType, byteStrSource)
Now each request holds roughly one 1024-byte chunk in memory at a time instead of the full length of the read. This should make your server more efficient at handling many concurrent requests, and it allows dataRepo.load to return immediately with a lazy Source value instead of utilizing a Future.
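Putting the pieces together, the route could look roughly like the sketch below. It assumes akka-http and akka-stream are on the classpath. The object name StreamingRoute, the "data" path, the load parameter (a stand-in for a dataRepo.load that returns lazy chunks), and the octet-stream content type are all placeholders rather than parts of the original code.

```scala
import akka.http.scaladsl.model.{ContentTypes, HttpEntity, HttpResponse, StatusCodes}
import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.server.Route
import akka.stream.scaladsl.Source
import akka.util.ByteString

object StreamingRoute {
  // `load` is a hypothetical stand-in for dataRepo.load: None when the
  // parameter maps to no data, otherwise a lazy iterator of chunks.
  def route(load: String => Option[Iterator[Array[Byte]]]): Route =
    path("data" / Segment) { param =>
      get {
        load(param) match {
          case Some(chunks) =>
            // The iterator can be consumed only once, so this entity
            // must not be materialized more than once.
            val bytes: Source[ByteString, Any] =
              Source.fromIterator(() => chunks).map(chunk => ByteString(chunk))
            complete(HttpResponse(
              entity = HttpEntity(ContentTypes.`application/octet-stream`, bytes)))
          case None =>
            complete(HttpResponse(status = StatusCodes.NoContent))
        }
      }
    }
}
```

Headers such as the Content-Encoding from the question can be passed to HttpResponse in the same way as before. Because the file reads are blocking, it may also be worth running this source on a dedicated dispatcher under heavy load.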