Replacing non-printable ASCII characters in work notes does not work

Problem description

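I am trying to add text to work notes that may contain non-printable ASCII characters. These characters are not being replaced as expected before the text is stored in the database.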

<work_notes>
TEST
X  000  000  0x00  00000000  NUL  (Null char.)
  001  001  0x01  00000001  SOH  (Start of Header)
  002  002  0x02  00000010  STX  (Start of Text)
  003  003  0x03  00000011  ETX  (End of Text)
  004  004  0x04  00000100  EOT  (End of Transmission)
  005  005  0x05  00000101  ENQ  (Enquiry)
  006  006  0x06  00000110  ACK  (Acknowledgment)
  007  007  0x07  00000111  BEL  (Bell)
  008  010  0x08  00001000   BS  (Backspace)
      009  011  0x09  00001001   HT  (Horizontal Tab)

  010  012  0x0A  00001010   LF  (Line Feed)
  011  013  0x0B  00001011   VT  (Vertical Tab)
  012  014  0x0C  00001100   FF  (Form Feed)

  013  015  0x0D  00001101   CR  (Carriage Return)
  014  016  0x0E  00001110   SO  (Shift Out)
  015  017  0x0F  00001111   SI  (Shift In)
  016  020  0x10  00010000  DLE  (Data Link Escape)
  017  021  0x11  00010001  DC1  (XON)(Device Control 1)
  018  022  0x12  00010010  DC2  (Device Control 2)
  019  023  0x13  00010011  DC3  (XOFF)(Device Control 3)
  020  024  0x14  00010100  DC4  (Device Control 4)
  021  025  0x15  00010101  NAK  (Negative Acknowledgement)
  022  026  0x16  00010110  SYN  (Synchronous Idle)
  023  027  0x17  00010111  ETB  (End of Trans. Block)
  024  030  0x18  00011000  CAN  (Cancel)
  025  031  0x19  00011001   EM  (End of Medium)
  026  032  0x1A  00011010  SUB  (Substitute)
  027  033  0x1B  00011011  ESC  (Escape)
  028  034  0x1C  00011100   FS  (File Separator)
  029  035  0x1D  00011101   GS  (Group Separator)
  030  036  0x1E  00011110   RS  (Request to Send)(Record Separator)
  031  037  0x1F  00011111   US  (Unit Separator)
</work_notes>

The squares shown in the work notes are the actual characters, but they are not displayed in the text area.

The code I wrote to replace the Escape character is:

/**
 * Escape a string for XML.
 * @param {String} txt
 * @return {String}
 */
ImDataHelper.escapeXml = function (txt) {
  var str = txt;
  // Replace the escape character.
  txt = str.replace(/x1B/g,''); 
  // copied from SOAPMessage script include
  return Packages.org.apache.commons.lang.StringEscapeUtils.escapeXml(txt);
};

The output of running this transaction is as follows:

<work_notes>2019-04-09 13:31:37 - Shaji Kalidasan (Work Notes)
TEST
X  000  000  0x00  00000000  NUL  (Null char.)
  001  001  0x01  00000001  SOH  (Start of Header)
  002  002  0x02  00000010  STX  (Start of Text)
  003  003  0x03  00000011  ETX  (End of Text)
  004  004  0x04  00000100  EOT  (End of Transmission)
  005  005  0x05  00000101  ENQ  (Enquiry)
  006  006  0x06  00000110  ACK  (Acknowledgment)
  007  007  0x07  00000111  BEL  (Bell)
  008  010  0x08  00001000   BS  (Backspace)
      009  011  0x09  00001001   HT  (Horizontal Tab)

  010  012  0x0A  00001010   LF  (Line Feed)
  011  013  0x0B  00001011   VT  (Vertical Tab)
  012  014  0x0C  00001100   FF  (Form Feed)

  013  015  0x0D  00001101   CR  (Carriage Return)
  014  016  0x0E  00001110   SO  (Shift Out)
  015  017  0x0F  00001111   SI  (Shift In)
  016  020  0x10  00010000  DLE  (Data Link Escape)
  017  021  0x11  00010001  DC1  (XON)(Device Control 1)
  018  022  0x12  00010010  DC2  (Device Control 2)
  019  023  0x13  00010011  DC3  (XOFF)(Device Control 3)
  020  024  0x14  00010100  DC4  (Device Control 4)
  021  025  0x15  00010101  NAK  (Negative Acknowledgement)
  022  026  0x16  00010110  SYN  (Synchronous Idle)
  023  027  0x17  00010111  ETB  (End of Trans. Block)
  024  030  0x18  00011000  CAN  (Cancel)
  025  031  0x19  00011001   EM  (End of Medium)
  026  032  0x1A  00011010  SUB  (Substitute)
  027  033  0  00011011  ESC  (Escape)
  028  034  0x1C  00011100   FS  (File Separator)
  029  035  0x1D  00011101   GS  (Group Separator)
  030  036  0x1E  00011110   RS  (Request to Send)(Record Separator)
  031  037  0x1F  00011111   US  (Unit Separator)
</work_notes>

As you can see, it only replaced the 'x1B' in '0x1B', not the actual ASCII Escape character that is displayed as a square.

Tags: javascript, regex

Solution


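You need to escape these ASCII codes with a backslash (\) in the regular expression. Without it, /x1B/ matches the three literal characters "x1B"; with it, /\x1B/ matches the single character with code 0x1B. (I log the string length in the example because the character is not displayed in the console.)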

var str = String.fromCharCode(27) + '  027  033  0x1B  00011011  ESC  (Escape)';

console.log('length: ', str.length);

str = str.replace(/\x1B/g, '');

console.log('length: ', str.length);

See Regex101
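To make the difference between the two patterns concrete, here is a minimal sketch that runs both against the same line. The variable names (esc, line, literalResult, escapedResult) are only illustrative and do not come from the original code.

```javascript
// Build the sample line with a real ESC character (code 27) in front.
var esc = String.fromCharCode(27);
var line = esc + '  027  033  0x1B  00011011  ESC  (Escape)';

// Without the backslash, the pattern is the three literal characters "x1B".
var literalResult = line.replace(/x1B/g, '');

// With the backslash, \x1B is a hex escape for the single character U+001B.
var escapedResult = line.replace(/\x1B/g, '');

// The literal pattern leaves ESC in place and damages the text "0x1B".
console.log('ESC still present:', literalResult.indexOf(esc) !== -1);
console.log('"0x1B" damaged:', literalResult.indexOf('0x1B') === -1);

// The escaped pattern removes exactly one character, the ESC itself.
console.log('ESC removed:', escapedResult.indexOf(esc) === -1);
console.log('length difference:', line.length - escapedResult.length);
```

This reproduces the behaviour seen in the question: the literal pattern turns "0x1B" into "0" and leaves the control character untouched.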

Note: you can also use these escapes in a character range, like this: /[\x00-\x1F]/g

See Regex101
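As a rough sketch of how the range form might be used, the helper below strips the whole 0x00-0x1F block in one pass. The function name stripControlChars and its keepWhitespace parameter are hypothetical and not part of the original code. The option to keep tab, line feed and carriage return is an assumption about what the work notes need: the full range removes line breaks as well, and those three are the only characters in this range that XML 1.0 allows.

```javascript
/**
 * Remove ASCII control characters (0x00-0x1F) from a string.
 * Hypothetical helper: the name and the keepWhitespace flag are
 * illustrative assumptions, not part of the original script.
 * @param {String} txt
 * @param {Boolean} keepWhitespace keep tab (0x09), LF (0x0A) and CR (0x0D)
 * @return {String}
 */
function stripControlChars(txt, keepWhitespace) {
  var pattern = keepWhitespace
    ? /[\x00-\x08\x0B\x0C\x0E-\x1F]/g // skips 0x09, 0x0A and 0x0D
    : /[\x00-\x1F]/g;                 // the full range from the note above
  return String(txt).replace(pattern, '');
}

// Sample containing NUL, a tab, ESC and a trailing line feed.
var sample = 'A' + String.fromCharCode(0) + 'B\tC' + String.fromCharCode(27) + 'D\n';

var strict = stripControlChars(sample, false);
var lenient = stripControlChars(sample, true);

console.log(JSON.stringify(strict));
console.log(JSON.stringify(lenient));
```

In the question's escapeXml function, a call like this could stand in for the single-character replace before the text is passed to the XML escaping step, assuming line breaks in the notes should survive.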

