python - Converting a Python script to a Sed script
Question
I'm moving from Pelican to Hugo, and I need to rewrite some parts of my Markdown files.
The article header, from this:
Title: Threads
Date: 2017-03-08 04:30:25
Modified: 2017-03-08 03:40:17
Category: Unix
Tags: c,
Slug: an-overwiew-on-threads
Authors: Nsukami
Summary: A long, thin strand of cotton, nylon, or other fibres used in sewing or weaving.
Lang: en
to this:
---
title: Threads
date: 2017-03-08 04:30:25
lastmod: 2017-03-08 03:40:17
categories: ['Unix']
tags: ['c',]
slug: an-overwiew-on-threads
summary: A long, thin strand of cotton, nylon, or other fibres used in sewing or weaving.
---
and the internal links, from this:
[processes]({filename}/on-processes.md)
[Threads]({filename}/images/threads-example.gif)
to this:
[processes]({{< ref on-processes >}})
[Threads](/images/threads-example.gif)
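I already have an Awk script that converts the header. I also have this Python script, which I would like to rewrite in Sed or Awk (mainly for learning purposes):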
import os
import re
import sys
import glob
import shutil
from tempfile import mkstemp


def replace(file_path):
    # Create temp file
    fh, abs_path = mkstemp()
    with open(fh, 'w') as new_file:
        with open(file_path, 'r') as old_file:
            for line in old_file:
                # search for something like
                # this: [processes]({filename}/on-processes.md)
                # or this: [Threads]({filename}/images/threads-example.gif)
                match = re.search(r'(?:!| )?((\[.*\])\((\{filename\})/(.*)\.(.*)\)(?:.|,| )?)', line, re.M|re.I)
                if match:
                    old_link = match.group(1)
                    if match.group(5).startswith("md"):
                        # if md file, replace by
                        # [processes]({{< ref on-processes >}})
                        new_link = match.group(2)+'({{< ref '+match.group(4)+' >}})'
                    else:
                        # if image, replace by
                        # [Threads](/images/threads-example.gif)
                        new_link = match.group(2)+'(/'+match.group(4)+'.'+match.group(5)+')'
                    line = line.replace(old_link, new_link)
                new_file.write(line)
    # Remove original file
    os.remove(file_path)
    # Move new file to old file
    shutil.move(abs_path, file_path)


if __name__ == '__main__':
    # for all md files in content folder
    for i, article in enumerate(glob.iglob("./content/post/*.*", recursive=True)):
        # replace all internal links
        replace(article)
I tried Sed, just to capture and print every internal link that should be converted:
~/nskm
>> sed -n /^.*\[[[:alnum:]]\]\({filename}.*\)/p on-threads.md
I've written something about [processes]({filename}/on-processes.md). Let's write some notes about threads. This is not really an introduction
to threads. It's more like a little bit of introspection, so we can have an interesting perspective of what are threads.<br><br>
Processes each have their own address space. Threads exist as subsets of a process. Threads are just multiple workers in the same [virtual address space]({filename}/on-processes.md#vas), all threads in a process share the same memory. They can also share open files and other resources. Within that VAS, each thread has its own ID, its own stack, its own program counter, its own independent flow of control, its own registers set. A thread is just a **context of execution**.<br>
![Threads]({filename}/images/threads-example.gif)
[Processes]({filename}/on-processes.md) are created with the [fork()](http://man7.org/linux/man-pages/man2/fork.2.html) system call. However, there is a separate system call, named [clone()](http://man7.org/linux/man-pages/man2/clone.2.html) which is used for creating threads. It works like fork(), but it accepts a number of flags for adjusting its behavior so the child can share some parts of the parent's execution context.
Our c script made a call to clone(), twice. And looking at _some_ of the flags that have been passed, we can see that:
![insert breakpoints]({filename}/images/insert-breakpoints.png)
![info proc mappings]({filename}/images/info-proc-map1.png)
![info proc mappings]({filename}/images/info-proc-map2.png)
![thread's stack]({filename}/images/stacks.png)
I also tried to capture all the internal links to convert, this time with Awk:
~/nskm
>> awk '/.*(filename}.*\..*)/' on-threads.md # regex not precise enough
I've written something about [processes]({filename}/on-processes.md). Let's write some notes about threads. This is not really an introduction
to threads. It's more like a little bit of introspection, so we can have an interesting perspective of what are threads.<br><br>
Processes each have their own address space. Threads exist as subsets of a process. Threads are just multiple workers in the same [virtual address space]({filename}/on-processes.md#vas), all threads in a process share the same memory. They can also share open files and other resources. Within that VAS, each thread has its own ID, its own stack, its own program counter, its own independent flow of control, its own registers set. A thread is just a **context of execution**.<br>
![Threads]({filename}/images/threads-example.gif)
[Processes]({filename}/on-processes.md) are created with the [fork()](http://man7.org/linux/man-pages/man2/fork.2.html) system call. However, there is a separate system call, named [clone()](http://man7.org/linux/man-pages/man2/clone.2.html) which is used for creating threads. It works like fork(), but it accepts a number of flags for adjusting its behavior so the child can share some parts of the parent's execution context.
Our c script made a call to clone(), twice. And looking at _some_ of the flags that have been passed, we can see that:
![insert breakpoints]({filename}/images/insert-breakpoints.png)
![info proc mappings]({filename}/images/info-proc-map1.png)
![info proc mappings]({filename}/images/info-proc-map2.png)
![thread's stack]({filename}/images/stacks.png)
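On the "not precise enough" point, a tighter pattern could anchor on the whole `[text]({filename}/target)` shape instead of on `filename}` alone, and print only the links rather than the full lines. The following is a minimal sketch, assuming a POSIX-style awk with `match()`, `RSTART` and `RLENGTH`; the helper name `list_links` is made up for illustration.

```shell
# list_links (hypothetical helper): read Markdown on stdin and print each
# internal link of the form [text]({filename}/target), one per line.
list_links() {
  awk '
  {
    rest = $0
    # \[[^]]*\]        link text between square brackets
    # [{]filename[}]   the literal {filename} placeholder
    # [^)]*            the target, up to the closing parenthesis
    while (match(rest, /\[[^]]*\]\([{]filename[}]\/[^)]*\)/)) {
      print substr(rest, RSTART, RLENGTH)
      # keep scanning the remainder of the line for further links
      rest = substr(rest, RSTART + RLENGTH)
    }
  }'
}
```

It could be run as `list_links < on-threads.md`. External links such as `[fork()](http://...)` are skipped because the pattern requires `{filename}/` right after the opening parenthesis.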
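I'm stuck on how to tell sed/awk not just to print those internal links but to replace them. If possible, I'd also like to make my sed and awk regexes more precise. Thanks for any comments and suggestions on how to continue from here.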
Solution
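One way to move from printing to replacing is sed's `s` command with a capture group. The following is a minimal sketch, not a tested migration tool. It assumes a sed that accepts `-E` (extended regexes, as GNU and BSD sed do), and it mirrors the Python script in dropping any `#fragment` from `.md` links. The function name `convert_links` is made up for illustration.

```shell
# convert_links (hypothetical helper): read Markdown on stdin and write it
# to stdout with Pelican {filename} links rewritten for Hugo.
convert_links() {
  sed -E \
    -e 's@\]\(\{filename\}/([^)#]*)\.md(#[^)]*)?\)@]({{< ref \1 >}})@g' \
    -e 's@\]\(\{filename\}/([^)]*)\)@](/\1)@g'
}
# First expression:  ]({filename}/NAME.md) or ]({filename}/NAME.md#frag)
#                    becomes ]({{< ref NAME >}})   (fragment dropped: assumption)
# Second expression: any remaining ]({filename}/PATH) becomes ](/PATH)
# "@" is used as the s-command delimiter so "/" needs no escaping.
```

Used as a filter, this could look like `convert_links < on-threads.md > on-threads.new.md`. The order of the two expressions matters: `.md` links are handled first, so the second expression only sees what is left, which in the sample file means images. Because each pattern stops at the first `)` via `[^)]*`, several links on one line are handled separately.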