sql - Convert UTF to nvarchar
Problem description
I have this odd UTF string stored:
<U+0410><U+043B><U+044C><U+043A><U+0430>
How can I convert it back to nvarchar? The string above should convert to Алька
Update. Here is more sample data:
+-------------------------------------------------------------------------------------+-----------------+
| Column1_encoded                                                                     | Column1_decoded |
+-------------------------------------------------------------------------------------+-----------------+
| <U+0410><U+043B><U+044C><U+043A><U+0430>                                            | Алька           |
| ABC <U+0410><U+043B><U+044C><U+043A><U+0430> 1                                      | ABC Алька 1     |
| <U+0410><U+043B> 2 <U+044C><U+043A><U+0430>                                         | Ал 2 ька        |
| <U+0410><U+043B><U+044C><U+043A><U+0430> 3 <U+0410><U+043B><U+044C><U+043A><U+0430> | Алька 3 Алька   |
+-------------------------------------------------------------------------------------+-----------------+
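I ended up with this odd format while sending data from Power BI to SQL Server through an R transformation: https://stackoverflow.com/a/51386029/1903793
The answer by Jeroen Mostert in the comments seems to handle it. Thank you.
Solution
To use this across multiple column values, you will need to turn it into a table-valued function and call it via cross apply, but I am sure you can manage that yourself. The explanation is in the comments: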
declare @str nvarchar(1000) = '<U+0410><U+043B><U+044C><U+043A><U+0430> This is a string with <U+0410><U+043B><U+044C><U+043A><U+0430> not encoded as we would like <U+0410><U+043B><U+044C><U+043A><U+0430>';
-- Add an additional > character before the first < character to act as the first delimiter
-- and then insert a delimiting > character before any instance of a < character that follows a space, so that each character code is parsed out properly.
select @str = replace(stuff(@str,charindex('<',@str,1),0,'>'),' <',' ><');
-- Start tally table with 10 rows.
with n(n) as (select n from (values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n(n))
-- Select the same number of rows as characters in @str as incremental row numbers.
-- Cross joins increase exponentially to a max possible 10,000 rows to cover largest @str length.
,t(t) as (select top (select len(@str) a) row_number() over (order by (select null)) from n n1,n n2,n n3,n n4)
-- Return the position of every value that follows the specified delimiter.
,s(s) as (select 1 union all select t+1 from t where substring(@str,t,1) = '>')
-- Return the start and length of every value, to use in the SUBSTRING function.
-- ISNULL/NULLIF combo handles the last value where there is no delimiter at the end of the string.
,l(s,l) as (select s,isnull(nullif(charindex('>',@str,s),0)-s,4000) from s)
,r as (select rn as ItemNumber
,Item
from(select row_number() over(order by s) as rn
,substring(@str,s,l) as item
from l
) a
where Item <> ''
)
select (select case when left(Item,3) = '<U+' -- Where required, convert the Unicode number into a character using the NCHAR function
then nchar(convert(nvarchar(500),convert(int,(convert(varbinary(max),replace(Item,'<U+','0x0000'),1)))))
else Item
end
from r
order by ItemNumber
for xml path(''), type -- TYPE plus .value() keeps characters such as & in the plain text from being XML-escaped
).value('.','nvarchar(max)') as String;
Output:
+----------------------------------------------------------------------+
| String                                                               |
+----------------------------------------------------------------------+
| Алька This is a string with Алька not encoded as we would like Алька |
+----------------------------------------------------------------------+
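The table-valued function and cross apply call mentioned in the answer are left to the reader. The block below is one possible sketch, not the answer author's code. It swaps the tally-table split for a plain WHILE loop inside a multi-statement table-valued function, which is easier to follow but will likely be slower on large tables than an inline function. The function name dbo.DecodeUPlus, its column name Decoded, and the table variable @SampleData are all hypothetical. It assumes every token has exactly four hex digits and is well formed.

```sql
-- Hypothetical helper: the name, signature and column name are assumptions.
-- Decodes every <U+XXXX> token (exactly four hex digits) found in @str.
create function dbo.DecodeUPlus (@str nvarchar(4000))
returns @result table (Decoded nvarchar(4000))
as
begin
    -- Pattern for one token: '<U+', four hex digits, '>'.
    declare @pattern nvarchar(100) = N'%<U+[0-9A-F][0-9A-F][0-9A-F][0-9A-F]>%';
    declare @pos int = patindex(@pattern, @str);

    while @pos > 0
    begin
        -- Replace the 8-character token with the character for its code:
        -- hex digits -> varbinary (style 2 = hex string without 0x) -> int -> NCHAR.
        set @str = stuff(@str, @pos, 8,
                         nchar(convert(int, convert(varbinary(4),
                               N'0000' + substring(@str, @pos + 3, 4), 2))));
        set @pos = patindex(@pattern, @str);
    end;

    insert into @result (Decoded) values (@str);
    return;
end;
go -- batch separator for SSMS/sqlcmd; CREATE FUNCTION must be alone in its batch

-- Usage sketch with sample rows from the question (the table variable is illustrative).
declare @SampleData table (Column1_encoded nvarchar(4000));
insert into @SampleData (Column1_encoded) values
    (N'<U+0410><U+043B><U+044C><U+043A><U+0430>'),
    (N'ABC <U+0410><U+043B><U+044C><U+043A><U+0430> 1'),
    (N'<U+0410><U+043B> 2 <U+044C><U+043A><U+0430>');

select s.Column1_encoded, d.Decoded
from @SampleData as s
cross apply dbo.DecodeUPlus(s.Column1_encoded) as d;
```

Because the loop searches for whole tokens, it does not depend on a space preceding each `<`, and rows without any token pass through unchanged. Under a case-sensitive collation the pattern matches only uppercase hex digits, which is what the sample data uses.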