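java - How to get specific elements from a file with an XML-like structure in Java
Problem description
I have a .sic file whose structure resembles XML but is not quite XML. It contains a section named Channel2, and I want to read a few elements from it. The section looks like this: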
.
.
.
<SI name = "Channel2" type = "list">
<SI name = "SecsPortConfig" type = "list">
<SI name = "PortType" type = "string">'XXX'</SI>
<SI name = "Protocol" type = "string">'XXX'</SI>
<SI name = "Serial" type = "list">
<SI name = "Port" type = "int">'XXX'</SI>
<SI name = "Speed" type = "int">'XXXX'</SI>
</SI>
<SI name = "Socket" type = "list">
<SI name = "ConnectionMode" type = "string">'XXX'</SI>
<SI name = "LocalHost" type = "string">'XXX.XXX.XXX.XXX'</SI>
<SI name = "LocalPort" type = "int">'XXX'</SI>
<SI name = "RemoteHost" type = "string">'XXX.XXX.XXX'</SI>
<SI name = "RemotePort" type = "int">'XXX'</SI>
</SI>
<SI name = "HSMS" type = "list">
<SI name = "T5" type = "int">'XXX'</SI>
<SI name = "T6" type = "int">'XXX'</SI>
<SI name = "T7" type = "int">'XXX'</SI>
<SI name = "T8" type = "int">'XXX'</SI>
<SI name = "LinkTestTime" type = "int">'XXX'</SI>
</SI>
<SI name = "SECSI" type = "list">
<SI name = "T1" type = "int">'XXX'</SI>
<SI name = "T2" type = "int">'XXX'</SI>
<SI name = "T4" type = "int">'XXX'</SI>
<SI name = "RTY" type = "int">'XXX'</SI>
<SI name = "IsHost" type = "bool">'XXX'</SI>
<SI name = "IsMaster" type = "bool">'XXX'</SI>
<SI name = "InterleaveBlocks" type = "bool">'XXX'</SI>
</SI>
<SI name = "SECSII" type = "list">
<SI name = "DeviceID" type = "int">'XXX'</SI>
<SI name = "T3" type = "int">'XXX'</SI>
<SI name = "MultipleOpen" type = "bool">'XXX'</SI>
<SI name = "AutoDeviceID" type = "bool">'XXX'</SI>
</SI>
<SI name = "Log" type = "list">
<SI name = "LogCharError" type = "bool">'XXX'</SI>
<SI name = "LogCharEvent" type = "bool">'XXX'</SI>
<SI name = "LogCharReceive" type = "bool">'XXX'</SI>
<SI name = "LogCharSend" type = "bool">'XXX'</SI>
<SI name = "LogSecsIHsmsError" type = "bool">'XXX'</SI>
<SI name = "LogSecsIHsmsEvent" type = "bool">'XXX'</SI>
<SI name = "LogSecsIHsmsReceive" type = "bool">'XXX'</SI>
<SI name = "LogSecsIHsmsSend" type = "bool">'XXX'</SI>
<SI name = "LogSecsIIError" type = "bool">'XXX'</SI>
<SI name = "LogSecsIIEvent" type = "bool">'XXX'</SI>
<SI name = "LogSecsIIReceive" type = "bool">'XXX'</SI>
<SI name = "LogSecsIISend" type = "bool">'XXX'</SI>
</SI>
</SI>
<SI name = "UseSeparateSECSLogFile" type = "bool">'XXX'</SI>
<SI name = "Connected" type = "bool">'XXX'</SI>
<SI name = "MessageFilters" type = "list">
<SI name = "DeviceIDList" type = "list"/>
<SI name = "StreamFunctionList" type = "list"/>
</SI>
<SI name = "SafeMessageFilters" type = "list">
<SI name = "DeviceIDList" type = "list"/>
<SI name = "StreamFunctionList" type = "list"/>
</SI>
</SI>
.
.
.
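If it were an XML file, I could parse it and read out the elements, but how do I handle this kind of file? I want to extract the elements RemoteHost and RemotePort. So far I have tried a BufferedReader and pulled the Channel2 section out of the file by appending its lines to a string, but how do I extract the values of the specific elements I want? I could probably do it with substring and a few other string methods, but is there no simpler way? Here is my code so far: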
File file = new File("C:\\Users\\but\\Desktop\\ExternalswPassThroughSrv.sic");
int counter = 0;
BufferedReader br = new BufferedReader(new FileReader(file));
String cl;
String finalString = "";
while ((cl = br.readLine()) != null) {
    if (cl.contains("Channel2")) {
        counter = 63;
    }
    if (counter != 0) {
        //System.out.println(cl);
        finalString += cl + "\n";
        counter--;
    }
}
System.out.println(finalString);
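Solution
Since we don't know how the whole file is structured:
Even if the file is not a complete XML document, you can extract the XML fragment from the rest of the file and turn it into a well-formed XML document by adding a root element.
After that, you can parse it into a Document and use XPath to extract the information you need.
Here is some sample Java code that may work for you (for clarity, I have not included the XML):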
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import java.io.IOException;
import java.io.StringReader;

public class ConvertXml {
    public static void main(String[] args) throws ParserConfigurationException, IOException, SAXException, XPathExpressionException {
        // Your XML-like content
        String xmlString = "xml here";
        // Turn the XML fragment into well-formed XML by wrapping it in a root element
        String xmlStringWellformed = "<content>" + xmlString + "</content>";
        // Parse the well-formed XML
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xmlStringWellformed)));
        // Build the XPath expressions
        String xPathRemoteHost = "//SI[@name='Channel2']/SI[@name='SecsPortConfig']/SI[@name='Socket']/SI[@name='RemoteHost']/text()";
        String xPathRemotePort = "//SI[@name='Channel2']/SI[@name='SecsPortConfig']/SI[@name='Socket']/SI[@name='RemotePort']/text()";
        XPath xPath = XPathFactory.newInstance().newXPath();
        // Use XPath for extraction; the values keep the single quotes they have in the file
        String remoteHost = (String) xPath.compile(xPathRemoteHost).evaluate(document, XPathConstants.STRING);
        String remotePort = (String) xPath.compile(xPathRemotePort).evaluate(document, XPathConstants.STRING);
        System.out.println("RemoteHost: " + remoteHost);
        System.out.println("RemotePort: " + remotePort);
    }
}
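The sample above leaves the fragment itself as a placeholder. As a rough sketch of how the Channel2 section could be cut out of the file without a hard-coded line count, the code below tracks the nesting depth of the SI tags and stops when the section closes. The class and method names (SicSectionReader, extractSection, readValue, count) are made up for illustration. It assumes the opening tag is written exactly as in the question (`<SI name = "Channel2"`), that no value contains `<SI ` or `/>`, and that the file can be read as UTF-8. Because the extracted section has a single SI root, it can be parsed directly without an extra wrapper element.

```java
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class SicSectionReader {

    /** Counts non-overlapping occurrences of token in text. */
    private static int count(String text, String token) {
        int n = 0;
        for (int i = text.indexOf(token); i >= 0; i = text.indexOf(token, i + token.length())) {
            n++;
        }
        return n;
    }

    /** Returns the lines of the SI element named sectionName, or "" if it is not found. */
    static String extractSection(List<String> lines, String sectionName) {
        // Assumption: the tag is spelled with spaces around '=' as in the question's sample.
        String openingTag = "<SI name = \"" + sectionName + "\"";
        StringBuilder section = new StringBuilder();
        int depth = 0;
        boolean inside = false;
        for (String line : lines) {
            if (!inside && !line.trim().startsWith(openingTag)) {
                continue; // still before the wanted section
            }
            inside = true;
            section.append(line).append('\n');
            // Opening tags raise the depth; closing and self-closing tags lower it.
            depth += count(line, "<SI ") - count(line, "</SI>") - count(line, "/>");
            if (depth <= 0) {
                break; // the section's own closing tag has been reached
            }
        }
        return section.toString();
    }

    /** Evaluates an XPath expression on the fragment and strips the surrounding single quotes. */
    static String readValue(String fragment, String xPathExpression) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(fragment)));
            String raw = XPathFactory.newInstance().newXPath()
                    .evaluate(xPathExpression, document).trim();
            if (raw.length() >= 2 && raw.startsWith("'") && raw.endsWith("'")) {
                raw = raw.substring(1, raw.length() - 1);
            }
            return raw;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read " + xPathExpression, e);
        }
    }

    public static void main(String[] args) throws Exception {
        // The path to the .sic file is expected as the first argument.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        String fragment = extractSection(lines, "Channel2");
        String socket = "/SI[@name='Channel2']/SI[@name='SecsPortConfig']/SI[@name='Socket']";
        System.out.println("RemoteHost: " + readValue(fragment, socket + "/SI[@name='RemoteHost']"));
        System.out.println("RemotePort: " + readValue(fragment, socket + "/SI[@name='RemotePort']"));
    }
}
```

If the rest of the file turns out to be well-formed as well, the depth tracking is unnecessary and the whole file can be wrapped in a root element and parsed in one go, as in the sample above.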