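Requirements for the lexical analyzer:
- Scan the character stream that makes up the source program from left to right
- Recognize the lexically meaningful words (lexemes)
- Return a token record (token category, the lexeme itself)
- Filter out whitespace
- Skip comments
- Detect lexical errors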
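The requirements above can be sketched as a single left-to-right loop. This is only a minimal illustration, not the code used later in this post: the class name `ScanSketch` and the kind labels `WORD`, `NUMBER` and `SYMBOL` are made up for the example, comments are assumed to be `//` line comments, and two-character operators are not combined here.

```java
import java.util.ArrayList;
import java.util.List;

final class ScanSketch {
    /** Scans src left to right; appends "(kind, lexeme)" strings to tokens and messages to errors. */
    static void scan(String src, List<String> tokens, List<String> errors) {
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;                                   // filter out whitespace
            } else if (src.startsWith("//", i)) {
                // skip a comment up to the end of the line (assumed comment syntax)
                while (i < src.length() && src.charAt(i) != '\n') i++;
            } else if (Character.isLetter(c)) {
                int start = i;                         // letter (letter | digit)*
                while (i < src.length() && Character.isLetterOrDigit(src.charAt(i))) i++;
                tokens.add("(WORD, " + src.substring(start, i) + ")");
            } else if (Character.isDigit(c)) {
                int start = i;                         // digit digit*
                while (i < src.length() && Character.isDigit(src.charAt(i))) i++;
                tokens.add("(NUMBER, " + src.substring(start, i) + ")");
            } else if ("+-*/:<>=;()#".indexOf(c) >= 0) {
                tokens.add("(SYMBOL, " + c + ")");     // one-character symbols only
                i++;
            } else {
                // anything else is reported as a lexical error and skipped
                errors.add("illegal character '" + c + "' at offset " + i);
                i++;
            }
        }
    }

    public static void main(String[] args) {
        List<String> tokens = new ArrayList<>();
        List<String> errors = new ArrayList<>();
        scan("x1 + 42 // note\n@ y", tokens, errors);
        System.out.println(tokens);
        System.out.println(errors);
    }
}
```

For the sample input in `main`, the comment and the blanks disappear, `@` is reported as an error, and the remaining lexemes come out in source order.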
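Program structure:
Input: a character stream (which input method to use, and which data structure to store it in)
Processing:
– Traversal (which traversal method)
– Lexical rules
Output: a token stream (in which output form)
– Two-tuples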
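One possible answer to the questions in the outline above, as a rough sketch: buffer the whole input in a `String` and emit a list of two-tuples. The names `TokenIo`, `Token` and `readAll` are hypothetical and chosen only for this example; the codes 10, 11 and 18 used in `main` are the identifier, unsigned-number and `:=` codes from this post.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class TokenIo {
    /** One output record: (category code, lexeme). */
    static final class Token {
        final int code;
        final String lexeme;

        Token(int code, String lexeme) {
            this.code = code;
            this.lexeme = lexeme;
        }

        @Override
        public String toString() {
            return "(" + code + ", " + lexeme + ")";
        }
    }

    /** Input: read the whole character stream into one String. */
    static String readAll(Reader in) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String src = readAll(new StringReader("x := 1"));
        // Output: the token stream as a list of two-tuples (built by hand here).
        List<Token> out = new ArrayList<>();
        out.add(new Token(10, "x"));
        out.add(new Token(18, ":="));
        out.add(new Token(11, "1"));
        System.out.println(src.length() + " chars -> " + out);
    }
}
```

Buffering everything in a `String` keeps index-based traversal simple; for large inputs, reading character by character from the `Reader` would be the alternative.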
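Token categories:
1. Identifier (10)
2. Unsigned number (11)
3. Reserved words (one code per word)
4. Operators (one code per word)
5. Delimiters (one code per word)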
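As a rough illustration of the first three categories, a word can be classified by looking it up among the reserved words first and then testing the identifier and number patterns. The class name `WordClassifier`, the return value -1 for "no category", and the case-sensitive keyword match are assumptions of this sketch; the reserved-word codes are the ones listed in this post's token table.

```java
import java.util.HashMap;
import java.util.Map;

final class WordClassifier {
    static final int IDENTIFIER = 10;
    static final int UNSIGNED_NUMBER = 11;

    // Reserved words: one code per word.
    private static final Map<String, Integer> RESERVED = new HashMap<>();
    static {
        RESERVED.put("begin", 1);
        RESERVED.put("if", 2);
        RESERVED.put("then", 3);
        RESERVED.put("while", 4);
        RESERVED.put("do", 5);
        RESERVED.put("end", 6);
    }

    /** Returns the category code of word, or -1 if it fits no category. */
    static int classify(String word) {
        Integer code = RESERVED.get(word);
        if (code != null) {
            return code;                       // reserved word
        }
        if (word.matches("[A-Za-z][A-Za-z0-9]*")) {
            return IDENTIFIER;                 // letter (letter | digit)*
        }
        if (word.matches("[0-9]+")) {
            return UNSIGNED_NUMBER;            // digit digit*
        }
        return -1;                             // e.g. "1x" or the empty string
    }

    public static void main(String[] args) {
        for (String w : new String[] {"while", "x1", "007", "1x"}) {
            System.out.println(w + " --> " + classify(w));
        }
    }
}
```

The reserved-word lookup has to come before the identifier test, because every reserved word also matches the identifier pattern.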
| Token | Code | Token | Code |
|---|---|---|---|
| begin | 1 | : | 17 |
| if | 2 | := | 18 |
| then | 3 | < | 20 |
| while | 4 | <= | 21 |
| do | 5 | <> | 22 |
| end | 6 | > | 23 |
| l(l\|d)* | 10 | >= | 24 |
| dd* | 11 | = | 25 |
| + | 13 | ; | 26 |
| - | 14 | ( | 27 |
| * | 15 | ) | 28 |
| / | 16 | # | 0 |
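Several symbols in the table share a prefix (`:` and `:=`, `<` and `<=` / `<>`, `>` and `>=`), so a scanner has to prefer the longer match. Below is a minimal sketch of that longest-match rule; `OperatorTable`, `longestMatch` and `scanSymbols` are hypothetical names, and characters that are not symbols are simply skipped in this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class OperatorTable {
    // Operator and delimiter codes copied from the table above.
    private static final Map<String, Integer> CODES = new HashMap<>();
    static {
        String[] syms = {"+", "-", "*", "/", ":", ":=", "<", "<=", "<>",
                         ">", ">=", "=", ";", "(", ")", "#"};
        int[] codes = {13, 14, 15, 16, 17, 18, 20, 21, 22,
                       23, 24, 25, 26, 27, 28, 0};
        for (int i = 0; i < syms.length; i++) {
            CODES.put(syms[i], codes[i]);
        }
    }

    /** Returns the longest symbol starting at pos, or null if there is none. */
    static String longestMatch(String src, int pos) {
        if (pos + 2 <= src.length() && CODES.containsKey(src.substring(pos, pos + 2))) {
            return src.substring(pos, pos + 2);   // two-character symbol wins
        }
        if (pos < src.length() && CODES.containsKey(src.substring(pos, pos + 1))) {
            return src.substring(pos, pos + 1);
        }
        return null;
    }

    /** Collects "(code, symbol)" strings for every symbol in src, in source order. */
    static List<String> scanSymbols(String src) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            String op = longestMatch(src, i);
            if (op == null) {
                i++;                              // not a symbol: skipped in this sketch
                continue;
            }
            out.add("(" + CODES.get(op) + ", " + op + ")");
            i += op.length();
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(scanSymbols("a:=b<=c<>d"));
    }
}
```

Checking one character at a time instead would report `:=` as `:` followed by `=`, which is why the two-character lookup comes first.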
Code:
package cn.itcast.day13Collection;

import java.io.*;
import java.util.*;

public class Lexical_Analyzer {
    public static void main(String[] args) throws IOException {
        // Read the source file line by line until the end of the text.
        String tail;
        String data = "";
        try (BufferedReader bf = new BufferedReader(new FileReader(
                "D:\\webquanzhan\\day05-code\\day05_code\\src\\cn\\itcast\\day13Collection\\demo01.txt"))) {
            while ((tail = bf.readLine()) != null) {
                // Remove comments: cut each line at the first "//".
                // (A "//" inside a string literal would be cut as well.)
                int comment = tail.indexOf("//");
                if (comment >= 0) {
                    tail = tail.substring(0, comment);
                }
                // Join lines with a space so that the last word of one line
                // does not run into the first word of the next.
                data += tail + " ";
            }
        }
        // Split the text into chunks on whitespace and commas. A chunk may
        // still mix letters and symbols, e.g. "a+b".
        String regex = "[\\s,]+";
        String[] word = data.split(regex);
        // Copy the chunks into a list for further processing.
        ArrayList<String> wordlist = new ArrayList<>(Arrays.asList(word));
        // Code table. Note that the reserved words here differ from the
        // table above (begin, if, then, ...); the symbol codes are the same.
        HashMap<String, Integer> rewordmap = new HashMap<>();
        rewordmap.put("public", 1);
        rewordmap.put("class", 2);
        rewordmap.put("static", 3); // no trailing space, or the lookup never matches
        rewordmap.put("void", 4);
        rewordmap.put("main", 5);
        rewordmap.put("+", 13);
        rewordmap.put("-", 14);
        rewordmap.put("*", 15);
        rewordmap.put("/", 16);
        rewordmap.put(":", 17);
        rewordmap.put(":=", 18);
        rewordmap.put("<", 20);
        rewordmap.put("<=", 21);
        rewordmap.put("<>", 22);
        rewordmap.put(">", 23);
        rewordmap.put(">=", 24);
        rewordmap.put("=", 25);
        rewordmap.put(";", 26);
        rewordmap.put("(", 27);
        rewordmap.put(")", 28);
        rewordmap.put("{", 29);
        rewordmap.put("}", 30);
        Iterator<String> it = wordlist.iterator();
        while (it.hasNext()) {
            String words = it.next();
            // Scan the chunk for symbols. Try a two-character symbol
            // (":=", "<=", "<>", ">=") before falling back to one character.
            for (int i = 0; i < words.length(); i++) {
                String s = String.valueOf(words.charAt(i));
                if (!Character.isLetterOrDigit(words.charAt(i))
                        && i + 1 < words.length()
                        && rewordmap.containsKey(words.substring(i, i + 2))) {
                    s = words.substring(i, i + 2);
                    i++; // consume the second character as well
                }
                // If the map contains the key, print its code.
                if (rewordmap.containsKey(s)) {
                    System.out.println(s + "-->" + rewordmap.get(s));
                }
            }
            // Replace the symbols with spaces, then split on whitespace to
            // get the remaining words. The hyphen is escaped so that it is a
            // literal minus sign rather than a range inside the character class.
            String new_words = words.replaceAll("[+\\-*/=;:<>{}()]", " ");
            String[] split_new_words = new_words.trim().split("\\s+");
            // Classify each remaining word.
            for (int i = 0; i < split_new_words.length; i++) {
                String s = split_new_words[i];
                if (s.isEmpty()) {
                    continue; // the chunk contained only symbols
                }
                if (rewordmap.containsKey(s)) {
                    // Reserved word
                    System.out.println(s + "-->" + rewordmap.get(s));
                } else if (checkStrIsNum01(s)) {
                    // Unsigned number (helper method found via a Baidu search)
                    System.out.println(s + "-->11");
                } else {
                    // Identifier
                    System.out.println(s + "-->10");
                }
            }
        }
    }
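
    // Returns true if str is a non-empty string of decimal digits.
    public static boolean checkStrIsNum01(String str) {
        if (str.isEmpty()) {
            return false;
        }
        for (int i = 0; i < str.length(); i++) {
            if (!Character.isDigit(str.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}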
The demo.txt file:
Screenshot of the run:
(Note: the code above is based on someone else's work. Original URL: https://www.cnblogs.com/miaoxiaowen/p/11656495.html)