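mongodb - Spring MongoDB: searching for a string in binary data
Problem description
I am using Spring REST to store documents (text, pdf, csv, doc, docx, etc.) in MongoDB. The documents are stored as binary data. Now I want to search the documents by their contents. For example, if a user searches for the string "office", they should see a list of the documents that contain the string "office". How can I query MongoDB for data contained inside the binary data?
Solution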
You could try to define a text index over the binary field. I don't know whether that would work, but even if it does, such an index would match words that belong to the file format rather than to the user's content, which is generally undesirable.
If I were implementing your requirements, I would run every binary document through a converter to plain text (e.g. pandoc) to obtain its user content, store that content in a field that has a text index over it, and then query on that field.
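To make the shape of that approach concrete, here is a minimal sketch in plain Java with no MongoDB or Spring dependency. Everything in it is an assumption made for illustration: the `StoredDocument` class, the `content` field name, and the `extractText` and `search` helpers are hypothetical. `extractText` only decodes UTF-8, so it is correct for plain-text formats only, and `search` is an in-memory stand-in for querying the text-indexed field.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class ExtractedTextSearch {

    // Hypothetical shape of a stored document: the original bytes plus the
    // extracted plain text that a MongoDB text index would cover.
    static final class StoredDocument {
        final String filename;
        final byte[] data;     // original binary payload, stored unchanged
        final String content;  // extracted user-visible text

        StoredDocument(String filename, byte[] data) {
            this.filename = filename;
            this.data = data;
            this.content = extractText(data);
        }
    }

    // Placeholder converter: only correct for plain-text formats (txt, csv).
    // Binary formats need a real converter; the answer names pandoc as one option.
    static String extractText(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    // In-memory stand-in for querying the extracted-text field. In MongoDB the
    // counterpart would be a text index on "content" and a query such as
    // { $text: { $search: "office" } }.
    static List<String> search(List<StoredDocument> docs, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (StoredDocument doc : docs) {
            if (doc.content.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(doc.filename);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<StoredDocument> docs = new ArrayList<>();
        docs.add(new StoredDocument("memo.txt",
                "Meet me at the office at noon.".getBytes(StandardCharsets.UTF_8)));
        docs.add(new StoredDocument("prices.csv",
                "item,price\npen,2\n".getBytes(StandardCharsets.UTF_8)));
        System.out.println(search(docs, "office")); // prints [memo.txt]
    }
}
```

The stand-in does a case-insensitive substring match, whereas a MongoDB text index matches whole, stemmed words, so results can differ at the edges. The point of the sketch is only the layout: the binary stays untouched and the search runs against the separate extracted-text field.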