Python re.search in a for-loop gives false positives. How do I fix it?

Question

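I'm writing code that automatically updates my website. While working on the part that recognizes tags and labels the pages correctly in the database, I ran into a bug I don't know how to fix.
I made a for loop to iterate over the lines of each .php file, then used if statements to look for the tags. But judging by the output, my if statements somehow fire twice.

First, I checked whether my regex was giving false positives. I searched manually in a text editor using the same regex as in the code, and it only found one line.
Then I checked how re.compile and re.search work, but I don't seem to be doing anything wrong.

Here is the relevant part of the code.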

        mydb = mysql.connector.connect(
        [Personal information redacted]
        )
        mycursor = mydb.cursor()
        local = input('Select directory.')
        for paths, dirs, files in os.walk(local):
            for f in files:
                print(f)
                if(splitext(f)[1] == ".php"):
                    print("found .php")
                    opened = open(local + f, 'r')
                    lines = opened.readlines()
                    date = splitext(f)[0]
                    flagD = re.compile(r'<!--desc.')
                    flagS = re.compile(r'<!--subject.')
                    flagE = re.compile(r'-->')
                    desc = None
                    subject = None
                    for l in lines:
                        if(flagD.search(l) != None):
                            print("found desc")
                            desc = re.sub(flagD, "",l)
                            descF = re.sub(flagE,"",desc)
                        if(flagS.search(l) != None):
                            print("found subj")
                            subject = re.sub(flagS, "",l)
                            subjectF = re.sub(flagE,"",subject)
                    if(desc == None or subject == None):
                        continue
                    sql = "INSERT INTO arquivos (quando, descricao, assunto, file) VALUES (%s, %s, %s, %s)"
                    val = (date, descF, subjectF, f)
                    mycursor.execute(sql, val)
                    mydb.commit()  

This is the output:

2018-11-15.php
found .php
2018-11-16.php
found .php
2018-11-26.php
found .php
2019-01-13.php
found .php
2019-01-15.php
found .php
2019-01-16.php
found .php
2019-01-17.php
found .php
2019-01-22.php
found .php
found desc
found subj
2019-01-24.php
found .php
found desc
found desc
found subj
found subj
BdUpdate.php
found .php
BdUpdate1.php
found .php
Comentarios.php
found .php
FINAL.php
found .php
Foot.inc
Formulario.php
found .php
FormularioCompleto.php
found .php
Head.inc
index.php
found .php
index1.php
found .php
Java.php
found .php
Layout Base - Copy.php
found .php
Layout Base.php
found .php
Php_Test.ste
Phyton.php
found .php
SalvandoDB.php
found .php
sidenav.inc
Side_Menu.php
found .php
Thema.php
found .php
Translations.php
found .php
Web.php
found .php
2019-01-13.php
found .php

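As you can see, print("found desc") and print("found subj") were somehow each called twice within a single print("found .php"). That means something in my code is producing a false positive, but that should not be possible, because I tested this regex in other software. This is completely unintended, and it puts the rest of the code into my database as an entry.
Edit: Sorry for the delay. This is the .php that was being scanned when the unexpected result occurred.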

<!doctype html>
<!--desc Today I attempted to learn django, but a lot went wrong and I couldn't do it.-->
<!--subject:Java-->
<html>
<head>
<meta charset="utf-8">
<title>Training Diary</title>
<?php
// Set the default time zone to use. Available since PHP 5.1
date_default_timezone_set('Asia/Tokyo');
$pasta=date("F");
echo '<link rel = "stylesheet" type = "text/css" href = "';
echo "$pasta";
echo '/estilo.css"/>';
?>
<link href="January/estilo.css" rel="stylesheet" type="text/css">
</head>
<body>
<table width="100%" align="center" cellpadding="0" cellspacing="0" summary="Around Table">
<tr>
<td width="100%" height="100%" valign="top">
<!--HEADER -->
<table width="100%" border="0" cellspacing="0" cellpadding="0" summary="Header">
<tr>
<td id="claro"><img src="img/spc.png" width="140" height="40" alt="space_Header">
</td>
<td width="100%" rowspan="2" align="center" valign="middle" id="claro">
<div id="banner"></div>
</td>
</tr>
<tr>
<td id="escuro"><img src="img/spc.png" width="140" height="20" alt="space_Header">
</td>
</tr>
</table>
</td>

</tr>   
<table width="100%" border="0" cellspacing="0" cellpadding="0" summary="Meio">
<tr>
<td height="100%" valign="top" id="escuro">
<base target="_top">
<div align="center" id="Side">
<table border="0" width="100%" cellspacing="1" cellpadding="0">
<tr>
<!--MENU MENU MENU MENU MENU MENU MENU MENU-->
<?php
$sql = "SELECT * FROM menu";
$con=mysqli_connect("localhost","root","","bdcomentarios");
$executar=mysqli_query($con, $sql);
while( $exibir = mysqli_fetch_array($executar)){
    echo '<td align="center" bordercolor="#2A628F" id="claro">';
    echo '<a href="';
    echo $exibir['assunto'];
    echo '.php" id="Side">';
    echo $exibir['assunto'];
    echo '</a>';
    echo '</td></tr><tr>';
}
mysqli_close($con)
?>
<!--FIM MENU FIM MENU FIM MENU FIM MENU FIM-->
</td>
</tr>
</table>
</div>
<img src="img/spc.png" width="140" height="1" alt="space_Meio">
</td>
<td width="100%" height="100%">
<table width="90%" border="0" cellspacing="0" cellpadding="0">
  <tr>
    <td align="center">
<h2> Creative ways to iterate 

</h2><br><p>Today I fixed 2 things.

<br>The first one, is that the methods of the classes that implements Pieces, was calling the Pieces' method,

<br>instead of their own. I took the method declaration, removed it and replaced it with an abstract method so the entire

<br>code does not glitch due to the absence of Move().

<br>Then, I noticed that the evaluation on notBlocked() on diagonal moves was wrong.

<br>Before, I was using nested for loops to iterate through the blocks it will move through.

<br>But as you may have noticed, that means that it will evaluate a square area instead of a diagonal line.

<br>So, I made a single for loop inside nested if statements that determine which angle it is moving on,

<br>Example: if(positionX > destinationX){if(positionY > destinationY)for(int i....) (this means it is moving down-left because both values are going down.)

<br>then made it return the piece on each square, and I expressed it with subtracting or adding the current loop number to the original position.

<br>Meaning, if you are on the second loop, and you want to see if there is a piece at 2 squares below AND left, it is x minus loop No. and y minus loop No.

<br>And by making different ways of iterating, I succeeded in correctly evaluating the bishop (and the queen's) movement.

<br>Now there is only 3 more unexpected returns to fix.

<br>More coming soon. <br><br><br>

    <h2>Visitors Comments, Thanks!</h2>
    <table width="50%" border="0" align="center" cellpadding="0" cellspacing="0"><tr><td>
<form action="<?php echo htmlspecialchars($_SERVER["PHP_SELF"]); ?>" method="post" id"postcomments">
Name:(Show)<br>
<input type="CHAR" name="nome">
<br><br>E-Mail:(Hide)<br>
<input type="text" name= "email">
<br><br>Message:(Show)<br>
<textarea name="comentario"></textarea>
<br><br>
<INPUT TYPE="hidden" NAME="pagina" VALUE="<!--DATE-->">
<input type="submit" name="submit" value="Enviar">
<input type="reset" value="Limpar">
</form>
<hr>
</td></tr></table>
<?php
if(isset($_POST['submit'])){
$nome = "";
$email = "";
$comentario = "";
$pagina ="";
//keep the variables
if(isset($_POST["nome"]))
     $nome = $_POST["nome"];
if(isset($_POST["email"]))
     $email = $_POST["email"];
if(isset($_POST["comentario"]))
     $comentario = $_POST["comentario"];
if(isset($_POST["pagina"]))
     $pagina = $_POST["pagina"];

//current date
$date = date_default_timezone_set('Asia/Tokyo');
$data = date("Y/m/d");
$con=mysqli_connect("localhost","root","","bdcomentarios");

//I ADDED THIS
if(isset($_POST["nome"],$_POST["pagina"], $_POST["email"], $_POST["comentario"], $_POST["data"]));

// Check connection
if (mysqli_connect_errno())
{
    echo "Failed to connect to MySQL: " . mysqli_connect_error();
}


$sql_insert="INSERT INTO tbcomentarios (data, nome, email, comentario, pagina) 
VALUES('$data', '$nome', '$email', '$comentario', '$pagina')";

//check the insert into DB
if (mysqli_query($con,$sql_insert)) {
echo '<script type="text/JavaScript">
alert("Sua mensagem foi gravada com sucesso. Obrigado");
location.href="<!--DATE-->.php";
</script>';
}
else {
     echo "Error: " . $sql . "<br>" . mysqli_error($con);
}



$sql = "SELECT * FROM tbcomentarios WHERE pagina like '%<!--DATE-->%' ORDER BY id desc";
$executar=mysqli_query($con, $sql);
while( $exibir = mysqli_fetch_array($executar)){
    echo $exibir['data'];
    echo "<br><b>Name:</b>";
    echo $exibir['nome'];
    echo "<br>";
    echo "<b>E-mail:</b>*********";
    echo "<br><b>Comment:</b><br>";
    echo $exibir['comentario'];
    echo "<br><hr>";
}
}
?>
<?php
$sql = "SELECT * FROM tbcomentarios WHERE pagina like '%<!--DATE-->%' ORDER BY id desc";
$con=mysqli_connect("localhost","root","","bdcomentarios");
$executar=mysqli_query($con, $sql);
while( $exibir = mysqli_fetch_array($executar)){
    echo '<table width="50%" border="0" align="center" cellpadding="0" cellspacing="0"><tr><td>';
    echo $exibir['data'];
    echo "<br><b>Name:</b>";
    echo $exibir['nome'];
    echo "<br>";
    echo "<b>E-mail:</b>*********";
    echo "<br><b>Comment:</b><br>";
    echo $exibir['comentario'];
    echo "<br><hr>";
    echo '</td></tr></table>';
}
?></td>
  </tr>
</table>

</td>
</tr>
</table>
<!--Foot -->
<table width="100%" border="0" cellspacing="0" cellpadding="0" summary="Foot">
<tr>
<td id="escuro""><img src="img/spc.png" width="140" height="30" alt="space_Foot">
</td>
<td width="100%" valign="bottom" id="escuro">
<div id="Foot">
<table align="center" cellpadding="3" cellspacing="1" summary="Foot Menu">
<tr>
<!--FOOT MENU FOOT MENU FOOT MENU FOOT MENU-->
<?php
$sql = "SELECT * FROM menu";
$con=mysqli_connect("localhost","root","","bdcomentarios");
$executar=mysqli_query($con, $sql);
while( $exibir = mysqli_fetch_array($executar)){
    echo '<td align="center" valign="bottom"><a href="';
    echo $exibir['assunto'];
    echo '.php">';
    echo $exibir['assunto'];
    echo '</a>';
    echo '</td>';
}
mysqli_close($con);
?>
</tr>
</table>
</div>
</td>
</tr>
</table>
</td>
</tr>
</table>
</body>
</html>

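Also, when I say I checked with "other software", I mean that I opened Notepad++ and ran a regex search with the exact same regular expression, and it worked as expected.
I hope this information helps pin down the problem.
Thanks in advance.

PS. Most of my questions get closed or locked without anyone explaining why. I have edited my past questions to follow the guidelines, but they were quickly buried. Please stop.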

Tags: python, regex, for-loop, if-statement

Answer


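Rather than parsing each file one line at a time, you could extract both values with a single multi-line regular expression. For example, the following shows how it could be done for a single test file: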

from os.path import splitext
import re

f = 'test.php'
re_desc_subject = re.compile(r'<!--desc.([^\r\n]*?)-->.*?<!--subject.([^\r\n]*?)-->', re.M + re.S + re.I)

with open(f) as f_input:
    data = f_input.read()

desc_subject = re_desc_subject.search(data)

if desc_subject:
    desc, subject = desc_subject.groups()
    print(desc, subject)
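
To try the pattern without touching the filesystem, here is a minimal sketch that runs the same regular expression against an in-memory string. The sample text is abbreviated from the first lines of the page in the question, and the helper name `extract_desc_subject` is hypothetical, not part of the original code.

```python
import re

# Same pattern as above. re.S lets ".*?" span the line break between the
# two comments; re.I makes the match case-insensitive.
RE_DESC_SUBJECT = re.compile(
    r'<!--desc.([^\r\n]*?)-->.*?<!--subject.([^\r\n]*?)-->',
    re.S | re.I,
)


def extract_desc_subject(text):
    """Return (desc, subject) from the first matching pair, or None."""
    match = RE_DESC_SUBJECT.search(text)
    return match.groups() if match else None


# Hypothetical sample, shortened from the file shown in the question.
sample = (
    "<!doctype html>\n"
    "<!--desc Today I attempted to learn django.-->\n"
    "<!--subject:Java-->\n"
    "<html>\n"
)

print(extract_desc_subject(sample))
```

Because `search` stops at the first match, a file that happens to contain the flags more than once still yields a single (desc, subject) pair.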

So, for your code, this would work as follows:

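import os
import re
from os.path import splitext

import mysql.connector

# re.S lets ".*?" cross the line break between the two comments; re.I ignores case.
re_desc_subject = re.compile(r'<!--desc.([^\r\n]*?)-->.*?<!--subject.([^\r\n]*?)-->', re.M + re.S + re.I)

mydb = mysql.connector.connect(
[Personal information redacted]
)
mycursor = mydb.cursor()

local = input('Select directory.')

for paths, dirs, files in os.walk(local):
    for f in files:
        print(f)

        if splitext(f)[1] == ".php":
            print("found .php")

            # os.walk also descends into subdirectories, so join the file name
            # with the directory currently being walked, not the top-level one.
            with open(os.path.join(paths, f)) as f_input:
                data = f_input.read()

            desc_subject = re_desc_subject.search(data)

            if desc_subject:
                desc, subject = desc_subject.groups()
                print("Found desc and subject")

                # As in the question, the file name without its extension is the date.
                date = splitext(f)[0]
                sql = "INSERT INTO arquivos (quando, descricao, assunto, file) VALUES (%s, %s, %s, %s)"
                val = (date, desc, subject, f)
                mycursor.execute(sql, val)
                mydb.commit()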
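
If you would rather keep the line-by-line loop from the question, one way to track down why a flag fires twice is to record the line number of every match instead of only printing "found". The sketch below assumes the same two flag patterns as the question. The helper `find_flag_lines` and the sample lines are made up for illustration and do not reproduce the actual file.

```python
import re

# The two flag patterns from the question.
FLAG_DESC = re.compile(r'<!--desc.')
FLAG_SUBJECT = re.compile(r'<!--subject.')


def find_flag_lines(lines):
    """Return {flag name: [(line number, stripped line), ...]} for every match."""
    hits = {"desc": [], "subject": []}
    for number, line in enumerate(lines, start=1):
        if FLAG_DESC.search(line):
            hits["desc"].append((number, line.strip()))
        if FLAG_SUBJECT.search(line):
            hits["subject"].append((number, line.strip()))
    return hits


# Hypothetical input with a second desc flag further down the file.
sample_lines = [
    "<!doctype html>\n",
    "<!--desc first-->\n",
    "<!--subject:Java-->\n",
    "<p>body</p>\n",
    "<!--desc second-->\n",
]

for name, found in find_flag_lines(sample_lines).items():
    print(name, found)
```

Running this on the lines Python actually reads from a file shows exactly which lines trigger each flag, which can then be compared with what the text editor search reports.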
