Getting the date of receipts using OCR

Problem description

I am using Google's ML Kit for text recognition on receipts. Every floating-point number the OCR finds is put into an array.

I know how to get the name of the store, and I found a way to reliably get the total amount. (I know it's not the best way, but it was the easiest for me to implement with barely any programming knowledge.)

But I am struggling to get the date out of the array. The date format here in Germany is dd.mm.yyyy, and the OCR recognizes it as a single block.

E.g. 14.06.2021.

How can I fetch this out of the array? Or does my regex pattern keep it from being put into the array in the first place?

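// Collects every number in the OCR result that contains a decimal point.
fun String.findFloat(): ArrayList<Float> {
    if (this.isEmpty()) return ArrayList()
    val originalResult = ArrayList<Float>()
    val matchedResults = Regex(pattern = "[+-]?([0-9]*[.])?[0-9]+").findAll(this)
    for (txt in matchedResults) {
        if (txt.value.isFloatAndWhole()) originalResult.add(txt.value.toFloat())
    }
    return originalResult
}

// Total amount: taken from the second-to-last line of the recognized text.
fun String.getBetrag(): String {
    if (this.isEmpty()) return ""
    val lines = this.split("\n")
    // getOrNull avoids an exception when the text has only one line.
    return lines.getOrNull(lines.size - 2) ?: ""
}

// Store name: taken from the first line of the recognized text.
fun String.getGeschäft(): String {
    if (this.isEmpty()) return ""
    return this.split("\n").first()
}

// True if the string consists only of digits around a single decimal point.
private fun String.isFloatAndWhole() = this.matches("\\d*\\.\\d*".toRegex())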
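
To check what the pattern in findFloat does with a date, it can help to look at the raw matches before they are converted to Float. The snippet below is only a small diagnostic sketch; the names floatPattern and rawMatches are made up for illustration and are not part of the code above.

```kotlin
// Same pattern as in findFloat, copied here so the snippet is self-contained.
val floatPattern = Regex("[+-]?([0-9]*[.])?[0-9]+")

// Hypothetical helper: returns the matched substrings instead of Float values.
fun rawMatches(text: String): List<String> =
    floatPattern.findAll(text).map { it.value }.toList()
```

For the input "14.06.2021", rawMatches returns "14.06" and ".2021". Both pass isFloatAndWhole, so the date does end up in the array, but split into the two floats 14.06 and 0.2021 rather than as one value.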

Tags: android, regex, kotlin, ocr

Solution
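
One possible approach is to search the recognized text for the date with its own pattern instead of going through the Float array, since a date like 14.06.2021 is not a valid float. The following is a minimal sketch under a few assumptions: the date is always printed as dd.mm.yyyy with two-digit day and month, and java.time is available (on Android that means API level 26 or higher, or core library desugaring). The names germanDatePattern, germanDateFormat and findGermanDates are hypothetical and not part of ML Kit.

```kotlin
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import java.time.format.DateTimeParseException
import java.time.format.ResolverStyle

// Two digits, dot, two digits, dot, four digits, not embedded in a longer digit run.
val germanDatePattern = Regex("""(?<!\d)\d{2}\.\d{2}\.\d{4}(?!\d)""")

// STRICT rejects impossible dates such as 31.02.2021; it needs "uuuu" for the year.
val germanDateFormat: DateTimeFormatter =
    DateTimeFormatter.ofPattern("dd.MM.uuuu").withResolverStyle(ResolverStyle.STRICT)

// Hypothetical helper: returns every valid dd.mm.yyyy date found in the OCR text.
fun String.findGermanDates(): List<LocalDate> =
    germanDatePattern.findAll(this).mapNotNull { match ->
        try {
            LocalDate.parse(match.value, germanDateFormat)
        } catch (e: DateTimeParseException) {
            null // looked like a date but is not a real calendar date
        }
    }.toList()
```

Called on the full recognized text, for example "REWE\n14.06.2021 12:30\nSUMME 23.45".findGermanDates(), this returns a list with the single LocalDate 2021-06-14. If a receipt can contain several dates, taking the first element is a guess that may need adjusting for the receipts at hand.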

