How to automatically set the header when it is not the first line of the file?

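Problem description

The csv files I process each have a different number of lines before the header row. I need to detect the header automatically for each file.

Here is a sample file: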

            Wine Directory List



            Wine Title  Vintage Country Region  Sub region  Appellation Color   Bottle Size Price   URL FORMAT
Chateau Petrus Pomerol  2011    France  Bordeaux    Pomerol     Red 750ML   2799.99 HTTP://holbrookliquors.com/sku218758.html   1x750ML
Pappy Van Winkle's Bourbon 15 Year Family Reserve       United States   Kentucky                0ML 999.99      1x0ML
Shipping Fee                            0ML 999.99  1x0ML
Heineken Holland Beer       Netherlands                 0ML 999.99  1x0ML

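Here is my converter:

Update: my first attempt at a solution is getHeaderLine(). The only drawback: once I start parsing the file with getHeaderLine(), I can no longer get the data from the header line, because that line has already been read inside getHeaderLine(). Any help would be appreciated.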

public function convert($filePath, $feedColumnsMatch)
{

    //this array will contain the elements from the file
    $articles = [];

    $headerRecord = [];

        //if we can open the file on mode "read"
        if (($handle = fopen($filePath, 'r')) !== FALSE) {
            //represents the line we are reading
            $rowCounter = 0;
            $nb = $feedColumnsMatch->getNumberOfColumns();

            $headerLine = $this->getHeaderLine($handle, $nb, $delimiter);

            //as long as there are lines
            while (($rowData = fgetcsv($handle, 5000, $delimiter)) !== FALSE) {
//todo remove the ugly 9
                if ($nb===count($rowData)) {

                    //At x line, are written the keys so we record them in $headerRecord
 //What I had first     if (9 === $rowCounter) {
//What I now have
                        if(0 === $rowCounter) {
                        //trim the titles of columns
                        for ($i = 0; $i < $nb; $i++) {
                            $rowData[$i] = trim($rowData[$i]);
                        }

                        $headerRecord = $rowData;
                    }
                    elseif(9<$rowCounter )
                    {      //for every other lines...
                        foreach ($rowData as $key => $value) {       //in each line, for each value
                            // we set $value to the cell ($key) having the same horizontal position than $value
                            // but where vertical position = 0 (headerRecord[]
                            $articles[$rowCounter][$headerRecord[$key]] = mb_convert_encoding($value, "UTF-8");

                        }
                    }
                }
                $rowCounter++;
            }
            fclose($handle);
        }

    return $articles;
}

 public function getHeaderLine($handle, $nbColumns, $delimiter){
        $rowCounter = 0;
        while (($rowData = fgetcsv($handle, 5000, $delimiter)) !== FALSE) {
            $rowCounter++;
            if ($nbColumns===count($rowData)){
                return $rowCounter;
            }

        }
        return -1;
    } 

As you can see, I have to hard-code "9" in the if() to parse the data correctly, and I have to change that value for every file.

Tags: csv, symfony4

Solution


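When your csv/tsv files start with a varying number of invalid lines (blank lines, titles, labels, or anything else), one way to get the header is to make a first pass over the file in a helper function. The first row with the correct number of cells (you need to know how many columns your csv file has) is the header. The helper returns the header's data to the main function. Because both functions share the same file handle, the main function then continues parsing from the exact point where the helper stopped reading.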
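The mechanism this relies on can be checked in isolation. The following is only a minimal sketch, not the converter itself: the helper name findHeader(), the three-column sample data, and the use of a php://memory stream in place of a real file are all assumptions made for illustration. It shows that once the helper has returned the header, the next fgetcsv() call on the same handle yields the first data row.

```php
<?php
// In-memory stand-in for a feed file (assumed sample data):
// two junk lines, then a 3-column header, then one data row.
$handle = fopen('php://memory', 'r+');
fwrite($handle, "Wine Directory List\n\nTitle\tVintage\tPrice\nPetrus\t2011\t2799.99\n");
rewind($handle);

// Hypothetical helper: returns the first row that has exactly
// $nbColumns cells (trimmed), or null if no such row exists.
function findHeader($handle, int $nbColumns, string $delimiter): ?array
{
    while (($row = fgetcsv($handle, 0, $delimiter, '"', '\\')) !== false) {
        if (count($row) === $nbColumns) {
            return array_map('trim', $row);
        }
    }
    return null;
}

$header = findHeader($handle, 3, "\t");

// The handle's position was advanced inside findHeader(), so this
// read returns the row that follows the header.
$firstDataRow = fgetcsv($handle, 0, "\t", '"', '\\');

fclose($handle);
```

The title line and the blank line each parse to a single cell, so they are skipped; only the row with three cells is accepted as the header.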

Full code:

    public function convert($filePath, $feedColumnsMatch)
    {
        if (!file_exists($filePath)) {
            return "file does not exist";
        }
        if (!is_readable($filePath)) {
            return "file is not readable";
        }

        //this array will contain the elements from the file
        $articles = [];

        $format = $feedColumnsMatch->getFeedFormat();
        if ($format === "tsv" || $format === "csv") {
            //csv feeds carry their own delimiter, tsv feeds always use a tab
            if ($format === "csv") {
                $delimiter = $feedColumnsMatch->getDelimiter();
            } else {
                $delimiter = "\t";
            }

            //if we can open the file in "read" mode
            if (($handle = fopen($filePath, 'r')) !== FALSE) {
                $nb = $feedColumnsMatch->getNumberOfColumns();

                //getHeader() reads the file up to and including the header row,
                //so the loop below resumes on the first line after the header
                $headerRecord = $this->getHeader($handle, $nb, $delimiter);

                //if no header was found, close the file and report the problem
                if (null === $headerRecord) {
                    fclose($handle);
                    throw new \RuntimeException("No header row with $nb columns was found.");
                }

                //counts the lines read after the header
                $rowCounter = 0;

                //as long as there are lines
                while (($rowData = fgetcsv($handle, 5000, $delimiter)) !== FALSE) {

                    //if it is a line with a valid number of cells
                    if ($nb === count($rowData)) {

                        foreach ($rowData as $key => $value) {       //in each line, for each value
                            //store $value under the column title found at the same
                            //horizontal position ($key) in $headerRecord
                            $articles[$rowCounter][$headerRecord[$key]] = mb_convert_encoding($value, "UTF-8");
                        }
                    }
                    $rowCounter++;
                }
                fclose($handle);
            }
        }
        return $articles;
    }


    /**
     * Reads the file until it finds the header row of the csv file.
     * Returns the trimmed header cells, or null if no row has the expected number of columns.
     * @param resource $handle     open file handle; its position is advanced past the header row
     * @param int      $nbColumns  expected number of columns
     * @param string   $delimiter  field delimiter
     * @return array|null
     */
    public function getHeader($handle, $nbColumns, $delimiter){
        while (($rowData = fgetcsv($handle, 5000, $delimiter)) !== FALSE) {
            //the first row with the expected number of cells is the header
            if ($nbColumns===count($rowData)){
                //trim the titles of columns
                for ($i = 0; $i < $nbColumns; $i++) {
                    $rowData[$i] = trim($rowData[$i]);
                }

                return $rowData;
            }
        }
        return null;
    }

However, I don't know whether this is the right way to do it. It is simply a way that works.
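For comparison, here is a sketch of a different approach, closer to the first attempt with getHeaderLine(): instead of returning the header data, the helper remembers the byte offset of each line with ftell() and steps back with fseek() once it recognises the header, so the caller can read the header row itself. The function name seekToHeader(), the three-column sample data, and the php://memory stream are assumptions for illustration only; this is not part of the code above.

```php
<?php
// Hypothetical helper: leaves the handle positioned at the START of the
// header row and returns true, or returns false if no row matches.
function seekToHeader($handle, int $nbColumns, string $delimiter): bool
{
    while (true) {
        $lineStart = ftell($handle);    // byte offset before reading the row
        $row = fgetcsv($handle, 0, $delimiter, '"', '\\');
        if ($row === false) {
            return false;               // end of file, no header found
        }
        if (count($row) === $nbColumns) {
            fseek($handle, $lineStart); // step back so the caller re-reads it
            return true;
        }
    }
}

// Assumed sample data, shortened to three columns.
$handle = fopen('php://memory', 'r+');
fwrite($handle, "Wine Directory List\n\n\n"
    . "Wine Title\tVintage\tPrice\n"
    . "Chateau Petrus Pomerol\t2011\t2799.99\n"
    . "Shipping Fee\t\t999.99\n");
rewind($handle);

$articles = [];
if (seekToHeader($handle, 3, "\t")) {
    // The header row is still unread, so the caller can fetch it directly.
    $header = array_map('trim', fgetcsv($handle, 0, "\t", '"', '\\'));
    while (($row = fgetcsv($handle, 0, "\t", '"', '\\')) !== false) {
        // Keep only rows that have one cell per header column.
        if (count($row) === count($header)) {
            $articles[] = array_combine($header, $row);
        }
    }
}
fclose($handle);
```

Both variants depend on the same rule (the first row with the expected number of cells is the header), so a junk line that happens to have that many cells would be mistaken for the header in either one.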

