python - What method does os.listdir() use to obtain a list of files in a directory?
Problem description
I am working on a project where I have to edit a few lines of content in some 400 different files. They are all in the same folder, and each has a unique name. For the sake of this question, I will call them fileName001.conf to fileName420.conf.
I am using a Python script to obtain the contents of each file before going on to make the edits programmatically. At the moment, I am using this snippet to get the files, with some print() lines for debugging:
import os

folderPath = '/file/path/to/list/of/conf/files'
for filename in os.listdir(folderPath):
    # Debug output: the bare name and the full path of each entry
    print('filename = ' + filename)
    print('filepath = ' + folderPath + '/' + filename)
    with open(folderPath + '/' + filename, 'r') as currFile:
        #... code goes on...
The two print() lines are for debugging only. Running this, I noticed that the script was exhibiting some strange behaviour: the order in which the file names are printed seemed to change on each run. I took this a step further and added the following line before the for loop in my first code snippet:
print(os.listdir(folderPath))
Now when I run the script from the terminal, I can confirm that the output, while it contains all the file names, is in a different order each time:
RafaGuillermo@virtualMachine:~$ python renamefiles.py
['fileName052.txt', 'fileName216.txt', 'fileName084.txt', 'fileName212.txt', 'fileName380.txt', 'fileName026.txt', 'fileName119.txt', etc...]
RafaGuillermo@virtualMachine:~$ python renamefiles.py
['fileName024.txt', 'fileName004.txt', 'fileName209.txt', 'fileName049.txt', 'fileName166.txt', 'fileName198.txt', 'fileName411.txt', etc...]
RafaGuillermo@virtualMachine:~$
As far as getting past this goes: since I want to go through the files in the same order each time, I can use
filenames = sorted(os.listdir(folderPath))
which alphabetises the list, though it seems counter-intuitive that os.listdir() returns the list of filenames in a different order each time I run the script.
My question is therefore not how I can get a sorted list of files in a directory using os.listdir(), but:
What method does os.listdir() use to retrieve a list of files, and why does it seemingly populate its return value in a different way on each call?
Solution
Answer:
This is the expected behaviour of the os.listdir() method.
More information:
os.listdir(path='.')
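Return a list containing the names of the entries in the directory given by path. The list is in arbitrary order, and does not include the special entries '.' and '..' even if they are present in the directory.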
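A minimal sketch of what that documentation entry implies in practice, assuming a throwaway temporary directory and made-up file names (neither comes from the question; the helper name list_sorted is hypothetical too). The result contains every entry except '.' and '..', and any stable order has to be imposed by the caller, for example with sorted():

```python
import os
import tempfile


def list_sorted(folder_path):
    """Return the entry names in folder_path in a deterministic (sorted) order."""
    # os.listdir() makes no promise about order, so sort explicitly.
    return sorted(os.listdir(folder_path))


# Hypothetical demo directory; the names only imitate those in the question.
with tempfile.TemporaryDirectory() as demo_dir:
    for n in (3, 1, 2):
        with open(os.path.join(demo_dir, 'fileName%03d.conf' % n), 'w') as f:
            f.write('placeholder\n')

    raw_names = os.listdir(demo_dir)      # arbitrary order
    sorted_names = list_sorted(demo_dir)  # same order on every run

    print(raw_names)
    print(sorted_names)
```

Plain sorted() compares strings, which matches numeric order here only because the numbers in the names are zero-padded.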
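os.listdir() is implemented in C, in posixmodule.c in the Python source code. What it returns depends on the structure of the filesystem on which the files are stored, and the implementation differs depending on the conditional statements that determine the local operating system. os.listdir() opens the directory you are calling it on with the following C code: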
static PyObject *
_posix_listdir(path_t *path, PyObject *list) {
    /* stuff */
    dirp = opendir(name);
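This opens a stream for the directory whose name is stored in name, and returns a pointer to the directory stream, positioned at the first directory entry.
Continuing: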
    for (;;) {
        errno = 0;
        Py_BEGIN_ALLOW_THREADS
        ep = readdir(dirp);
        Py_END_ALLOW_THREADS
        if (ep == NULL) {
            if (errno == 0) {
                break;
            } else {
                Py_DECREF(list);
                list = path_error(path);
                goto exit;
            }
        }
        if (ep->d_name[0] == '.' &&
            (NAMLEN(ep) == 1 ||
             (ep->d_name[1] == '.' && NAMLEN(ep) == 2)))
            continue;
        if (return_str)
            v = PyUnicode_DecodeFSDefaultAndSize(ep->d_name, NAMLEN(ep));
        else
            v = PyBytes_FromStringAndSize(ep->d_name, NAMLEN(ep));
        if (v == NULL) {
            Py_CLEAR(list);
            break;
        }
        if (PyList_Append(list, v) != 0) {
            Py_DECREF(v);
            Py_CLEAR(list);
            break;
        }
        Py_DECREF(v);
    }
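readdir() is called with the previously assigned pointer to the directory stream passed as its argument. On Linux, readdir() returns a dirent structure representing the next entry in the directory stream pointed to by dirp.
As stated in the Linux man page for readdir():
A directory stream is opened using opendir(3). The order in which filenames are read by successive calls to readdir() depends on the filesystem implementation; it is unlikely that the names will be sorted in any fashion.
So this behaviour is expected, and is a result of the filesystem implementation.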
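For comparison, the readdir() loop can be roughly mirrored in Python with os.scandir(), which yields entries one at a time in whatever order the operating system hands them back. This is only an illustrative sketch, not how os.listdir() is implemented; the helper name names_in_os_order and the demo directory are hypothetical.

```python
import os
import tempfile


def names_in_os_order(folder_path):
    """Collect entry names one at a time, in the order the OS yields them."""
    names = []
    # os.scandir() iterates over directory entries; like os.listdir(),
    # it skips '.' and '..' and does not sort anything.
    with os.scandir(folder_path) as entries:
        for entry in entries:
            names.append(entry.name)
    return names


# Hypothetical demo directory with made-up file names.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ('b.conf', 'a.conf', 'c.conf'):
        open(os.path.join(demo_dir, name), 'w').close()

    unordered = names_in_os_order(demo_dir)
    print(unordered)          # order depends on the filesystem
    print(sorted(unordered))  # deterministic
```

If a script needs the same processing order on every run, sorting the collected names remains the caller's responsibility.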