How to convert a table from an RTF file into a pandas DataFrame?

Problem description

I have a Rich Text Format (RTF) file that contains a table.

I want to get it into a pandas DataFrame, or failing that into a CSV. This is what the file looks like when opened in Notepad:

{\rtf1\mac\deff2 {\fonttbl{\f20\froman Times;}}
{\stylesheet{\s224 \b\v\f20\fs20 \sbasedon0\snext0 PostScript;}
{\s243\tqc\tx4320\tqr\tx8640 \f20 \sbasedon0\snext243 footer;}
{\s244\tqc\tx4320\tqr\tx8640 \f20 \sbasedon0\snext244 header;}
{\f20 \sbasedon222\snext0 Normal;}
{\s2\ri-3022\tx1700\tqc\tx3420\tx3980\tx4240\tx6060\tx6220\tqc\tx8780\tqc\tx9920\tqc\tx11160\tx11820\tx11980 \f20\fs18 \sbasedon0\snext2 Trade tabs;}
{\s3\ri2560\sa400\tx1200\tqc\tx2680\tx3220\tx5120\tx6860\tx8100\tqc\tx8780\tqc\tx9920\tx10620 \f20\fs40 \sbasedon0\snext3 chapter head;}
{\s4\fi-9060\li9060\ri-40\sl220\tx280\tqr\tx2080\tx2520\tx4440\tx6140\tx6860\tqr\tx8500\tx8920 \f20\fs18 \sbasedon2\snext4 AT tab format;}
{\s5\sb100\sl80\brdrt\brdrhair \tqr\tx12860 \f20\fs8 \sbasedon6\snext5 thin line;}
{\s6\sb100\sl220\brdrt\brdrs \tqr\tx12860 \f20\fs8 \sbasedon2\snext6 main line;}}
\paperw16820\paperh11880\margl1820\margr2260\margt2160
\margb2160\facingp\deftab709\widowctrl\ftnbj
\pgnstart1\fracwidth \sectd \linemod0\linex0\headery2160\footery960\cols1
\endnhere\titlepg 
\pard\plain \s3\sa400\tx1200\tqdec\tx3000\tx3220
\tx5120\tx6860\tx8100\tqc\tx8780\tx9820 

\f20\fs40 Transfers of major weapons: Deals with deliveries or orders made for 2020 to 2020\par 
\pard\plain \s2\qj\sa100\sl240\tx1200\tqdec\tx3000\tx3220\tx5120\tx6940\tx7840\tqc\tx9120\tx9820 \f20\fs18 
{\fs20\b Note: \b0 The \'d4No. delivered\'d5 and the \'d4Year(s) of deliveries\'d5 columns refer to all deliveries since the beginning of the contract. The \'d4Comments\'d5 column includes publicly reported information on the value of the deal. Information on the sources and methods used in the collection of the data, and explanations of the conventions, abbreviations and acronyms, can be found at URL <http://www.sipri.org/contents/armstrad/sources-and-methods>. \par \b Source: \b0 SIPRI Arms Transfers Database\par \b Information generated:\b0  05 May 2021\par \par }
\pard\plain \s6\sl40\brdrt\brdrs \tqr\tx12860 \f20\fs8 \tab \par
\pard\plain \s2\sl200\tx280\tx1540\tx2520\tx4440\tx6080\tx6860 
\tx7780\tx8920 \f20\fs18{\b 
\tab \tab \tab \tab  \tab Year(s)\tab \tab \line 
Supplier/\tab \tab No.\tab Weapon\tab Weapon\tab Year\tab of\tab No. 
\tab \line 
\tab recipient (R)\tab ordered\tab designation\tab description
\tab of order\tab delivery\tab delivered\tab Comments\par }\pard\plain \s6\sb40\sl40\brdrt\brdrs 
\tqr\tx12860 \f20\fs8 \tab \par 
\pard\plain \s2\ri-40\sl220\tx280\tqr\tx2080\tx2520
\tx4440\tx6020\tx6860\tqr\tx8500\tx8920 \f20\fs18 
\pard \s2\fi-9060\li9060\ri-40\sl220\tx280\tqr\tx2080
\tx2520\tx4440\tx6140\tx6860\tqr\tx8500\tx8920 \pard\plain \s6\sb40\sl40\brdrt\brdrs 
\tqr\tx12860 \f20\fs8 \tab \par 
\pard\plain \s2\ri-40\sl220\tx280\tqr\tx2080\tx2520
\tx4440\tx6020\tx6860\tqr\tx8500\tx8920 \f20\fs18 
\pard \s2\fi-9060\li9060\ri-40\sl220\tx280\tqr\tx2080
\tx2520\tx4440\tx6140\tx6860\tqr\tx8500\tx8920 
{\b France}\par{\b R:} Denmark\tab (9)\tab FLASH\tab ASW sonar\tab 2019\tab \tab \tab ASQ22 ALFS version; for 9 MH-60R ASW helicopters from USA; probably from US production line\par{\b     } Egypt\tab 2\tab UMS-4110 BlueMaster\tab ASW sonar\tab 2020\tab 2020\tab 1\tab For 2 FREMM frigates from Italy\par{\b     } Qatar\tab 16\tab NH-90 TTH\tab Transport helicopter\tab 2018\tab \tab \tab Part of EUR3 b deal; delivery planned 2021-2025\par{\b     } Australia\tab 12\tab Barracuda\tab Submarine\tab (2019)\tab \tab \tab AUD90 b ($70 b) 'SEA-1000' programme; produced under licence in Australia (incl. at least 60% of value from Australian production); Shortfin Barracuda Block-1A version; Australian designation Attack; delivery planned 2033/2034-2050\par{\b     } Belgium\tab 6\tab MCM-2720\tab MCM ship\tab 2019\tab \tab \tab Part of EUR2 b 'MCMV' programme (incl 6 for Netherlands; incl production of components in Belgium); delivery planned from 2023/2024\par{\b     } Brazil\tab 43\tab EC725 Super Cougar\tab Transport helicopter\tab 2008\tab 2010-2020\tab (38)\tab Part of EUR1.9 b 'H-XBR' programme; H225M version; incl 3 CSAR version; Brazilian designations H-36, HM-4, UH-15A and UH-15B; delivery planned 2010-2022\par{\b     } \tab\tab 2\tab P-400\tab Patrol craft\tab 2009\tab \tab \tab NAPA-500 version; produced under licence in Brazil; Brazilian designation Macae\par{\b     } \tab\tab 4\tab Scorpene\tab Submarine\tab 2009\tab \tab \tab Part of EUR6.8 b 'SDP' programme; S-BR version; produced under licence in Brazil; Brazilian designation Riachuelo; delivery planned 2021-2024\par{\b     } \tab\tab 1\tab SNBR\tab Nuclear submarine\tab 2009\tab \tab \tab Part of EUR6.8 b 'SDP' programme; produced under licence in Brazil with nuclear reactor designed and produced in Brazil; delivery planned 2033/2034\par{\b     } \tab\tab (20)\tab AM-39 Exocet\tab Anti-ship missile\tab (2011)\tab 2018-2020\tab (15)\tab AM-39 Block-2Mod-2 version; for EC-725 (AH-15B) helicopters; incl 
production of components in Brazil\par{\b     } \tab\tab 5\tab EC725 Super Cougar\tab Anti-ship helicopter\tab 2012\tab 2019-2020\tab (2)\tab Part of EUR1.9 b 'H-XBR' programme; H225M version; Brazilian designation AH-15B and Operacio MB; delivery planned 2019-2022\par{\b     } China\tab . .\tab AS565S Panther\tab ASW helicopter\tab (1980)\tab 1989-2020\tab (49)\tab AS365F version; Chinese designation Z-9C Haitun; produced in China\par{\b     } \tab\tab . .\tab AS365/AS565 Panther\tab Helicopter\tab 1988\tab 1992-2020\tab (442)\tab Produced under licence in China as Z-9A or Z-9A-100 Haitun and Z-9B/G; incl Z-9WZ anti-tank version\par{\b     } \tab\tab . .\tab Arriel\tab Turboshaft\tab (2005)\tab 2012-2020\tab (560)\tab For Z-19 combat helicopter produced in China; Arriel-2 version produced under licence in China as WZ-8C\par{\b     } \tab\tab (34)\tab PC2.5\tab Diesel engine\tab (2005)\tab 2007-2019\tab (26)\tab PC-2.6 version for 6 Type-071 (Yuzhao) AALS and 1 Danyao support ship produced in China\par{\b     } \tab\tab (144)\tab PA6\tab Diesel engine\tab (2010)\tab 2013-2020\tab (140)\tab Produced under licence in China; for 70 Type-056 (Jiangdao) frigates produced in China\par{\b     } Czechia\tab 62\tab TITUS\tab APC\tab 2019\tab \tab \tab CZK6b ($264 m) deal (produced in Czech Republic); incl. 
40 command post and 20 fire control versions; delivery planned 2022-2023\par{\b     } Egypt\tab 4\tab Gowind-2500\tab Frigate\tab 2014\tab 2017\tab 1\tab EUR1 b deal incl option on 2 more; incl 3 produced under licence in Egypt; Egyptian designation El Fateh\par{\b     } India\tab 6\tab Scorpene\tab Submarine\tab 2005\tab 2017-2019\tab 2\tab INR207-237 b ($3.2-4.5 b) 'Project-75' programme; produced under licence in India as Kalvari; delivery planned 2017-2022/2023 (delayed from 2012-2017)\par{\b     } \tab\tab (49)\tab Mirage-2000-5\tab FGA aircraft\tab 2011\tab 2015-2020\tab (21)\tab INR109-175 b deal ($2.3-2.6 b; offsets $593 m); Indian Mirage-2000H rebuilt to Mirage-2000-5; incl 2 produced in France and rest in India; delivery planned 2015-2023\par{\b     } \tab\tab 36\tab Rafale\tab FGA aircraft\tab 2017\tab 2019-2020\tab (13)\tab EUR7.8 b deal (incl EUR5.2 b for aircraft EUR1.8 b for spare parts and EUR710 m for armament; 50% offsets incl 20% as production of components in India); Rafale-EH version (incl 8 Rafale-DH trainer/combat version); delivery planned 2019-2022\par{\b     } \tab\tab 8\tab SA-316B Alouette-3\tab Light helicopter\tab 2017\tab 2019-2020\tab (8)\tab INR3.2 b deal; produced under licence in India as Chetak\par{\b     } \tab\tab 10\tab SA-315B Lama\tab Light helicopter\tab (2018)\tab \tab \tab Cheetal version; produced under licence in India; delivery planned from 2021\par{\b     } Indonesia\tab (10)\tab AS-532 Cougar/AS-332\tab Transport helicopter\tab 1997\tab 2001-2017\tab (9)\tab Produced under licence in Indonesia; NAS-332 and NAS-332C1+ versions; incl some for CSAR\par{\b     } Kazakhstan\tab 20\tab Ground Master-400\tab Air search radar\tab 2014\tab 2014-2018\tab (2)\tab Incl production in Kazakhstan\par{\b     } Malaysia\tab 6\tab Gowind-2500\tab Frigate\tab 2014\tab \tab \tab 'SGVP-LCS' programme; produced under licence in Malaysia; Malaysian designation Maharaja Lela; delivery planned from 2023 (delayed from 2019)\par{\b     } Saudi 
Arabia\tab 39\tab HSI-32\tab Patrol craft\tab 2018\tab 2019-2020\tab (9)\tab Incl production of 18 in Saudi Arabia\par{\b     } \tab\tab 19\tab HSI-32\tab Patrol craft\tab 2020\tab \tab \tab Probably produced under licence in Saudi Arabia\par{\b     } South Korea\tab (210)\tab EC155\tab Helicopter\tab 2015\tab \tab \tab Produced under licence in South Korea; LCH-LAH version; delivery planned from 2023\par{\b     } Spain\tab 24\tab EC-665 Tiger\tab Combat helicopter\tab 2003\tab 2007-2020\tab (24)\tab EUR1.4 b deal; incl 6 HAP and 18 HAD versions; incl production under licence of 18 in Spain\par{\b     } \tab\tab 22\tab NH-90 TTH\tab Transport helicopter\tab 2005\tab 2014-2020\tab (17)\tab EUR1.3 b deal; most assembled/produced under licence in Spain; Spanish designation HT-29 and HD-29; delivery planned 2014-2021\par{\b     } Ukraine\tab 20\tab FPB-98\tab Patrol craft\tab 2020\tab \tab \tab For border guard; FPB-98 Mk-1 version; incl 5 produced under licence; delivery planned 2021-2024\par{\b     } United States\tab (60)\tab PC2.5\tab Diesel engine\tab (1996)\tab 2006-2017\tab 44\tab Produced in USA for 15 San Antonio AALS produced in USA\par{\b     } Algeria\tab 10\tab FPB-98\tab Patrol craft\tab 2018\tab 2019-2020\tab (10)\tab \par{\b     } Angola\tab \tab Vigilante-1400\tab OPV\tab 2016\tab \tab \tab Part of EUR495 m deal; designation uncertain (reported as 'long patrol vessel')\par{\b     } \tab\tab \tab Vigilante-400\tab Patrol craft\tab 2016\tab \tab \tab Part of EUR495 m deal; designation uncertain (reported as 'short patrol vessel')\par{\b     } Argentina\tab 3\tab Gowind\tab OPV\tab 2018\tab \tab \tab Part of EUR319 m deal; Argentinian designation Bouchard; delivery planned 2021-2022\par{\b     } Belgium\tab 382\tab Griffon VBMR\tab APC\tab 2019\tab \tab \tab Part of EUR1.1 b or EUR1.6 b deal; assembled in Belgium; delivery planned 2025-2030\par{\b     } \tab\tab 60\tab Jaguar EBRC\tab Armoured car\tab 2019\tab \tab \tab Part of EUR1.1 b or EUR1.6 b deal; 
partly assembled in Belgium; delivery planned 2025-2030\par{\b     } Bolivia\tab 4\tab Ground Master-400\tab Air search radar\tab 2016\tab 2019-2020\tab (4)\tab Part of EUR191 m ($215 m) deal; incl for civilian use\par{\b     } Botswana\tab (50)\tab MICA\tab BVRAAM\tab 2016\tab 2020\tab (50)\tab For VL-MICA SAM system\par{\b     } \tab\tab (140)\tab Mistral\tab Portable SAM\tab 2016\tab 2018-2020\tab (140)\tab \par{\b     } \tab\tab (1)\tab VL-MICA\tab SAM system\tab 2016\tab 2020\tab (1)\tab \par{\b     } Brazil\tab \tab F21 533mm\tab AS/ASW torpedo\tab (2009)\tab 2019\tab (2)\tab For Scorpene submarines\par{\b     } Burkina Faso\tab 6\tab Bastion\tab APC/APV\tab 2019\tab 2019-2020\tab (6)\tab Aid financed by EU\par{\b     } Chad\tab 7\tab Bastion\tab APC/APV\tab 2019\tab 2019-2020\tab (7)\tab Financed by EU\par{\b     } \tab\tab 9\tab ERC-90\tab Armoured car\tab 2020\tab \tab \tab Second-hand; aid; delivery planned 2021\par{\b     } Chile\tab 5\tab AS-350/AS-550 Fennec\tab Light helicopter\tab 2019\tab 2020\tab 1\tab 'Project Gaviota'; delivery planned 2020-2022\par{\b     } Cyprus\tab \tab Mistral\tab Portable SAM\tab 2019\tab \tab \tab EUR150 m deal\par{\b     } \tab\tab (20)\tab MM-40-3 Exocet\tab Anti-ship MI/SSM\tab 2019\tab 2020\tab (10)\tab For 1 Exocet coast defence system\par{\b     } Denmark\tab 15\tab CAESAR 155mm\tab Self-propelled gun\tab 2017\tab 2019\tab (2)\tab Option on 6 more; delivery planned 2019-2021\par{\b     } \tab\tab 4\tab CAESAR 155mm\tab Self-propelled gun\tab 2019\tab \tab \tab \par{\b     } Egypt\tab (50)\tab MM-40-3 Exocet\tab Anti-ship MI/SSM\tab (2014)\tab 2017\tab (10)\tab For Gowind frigates\par{\b     } \tab\tab (100)\tab MICA\tab BVRAAM\tab 2015\tab 2017\tab (25)\tab For Gowind frigates\par{\b     } \tab\tab (50)\tab Storm Shadow/SCALP\tab ASM\tab (2015)\tab \tab \tab SCALP version; for Rafale combat aircraft\par{\b     } Estonia\tab (100)\tab Mistral\tab Portable SAM\tab 2018\tab 2020\tab (30)\tab \par{\b     } Germany\tab 
(18)\tab Ocean Master\tab MP aircraft radar\tab 2013\tab 2019\tab (3)\tab For 18 NH-90 ASW helicopters produced in FRG\par{\b     } \tab\tab (62)\tab RTM-332\tab Turboshaft\tab (2020)\tab \tab \tab For 31 NH90 NFH (Sea Tiger) ASW helicopters produced in Germany\par{\b     } Greece\tab (60)\tab MM-40-3 Exocet\tab Anti-ship MI/SSM\tab (2008)\tab 2010-2020\tab (60)\tab For Super Vita (Roussen) FAC\par{\b     } \tab\tab \tab Meteor\tab BVRAAM\tab (2020)\tab \tab \tab For Rafale combat aircraft; selected but contract not yet signed end-2020\par{\b     } \tab\tab 12\tab Rafale\tab FGA aircraft\tab (2020)\tab \tab \tab Second-hand; part of EUR2.3 b deal; delivery planned 2021-2022\par{\b     } \tab\tab 6\tab Rafale\tab FGA aircraft\tab (2020)\tab \tab \tab Part of EUR2.3 b deal; delivery planned 2023\par{\b     } Hungary\tab 16\tab EC725 Super Cougar\tab Transport helicopter\tab 2018\tab \tab \tab Armed H225M Caracal version\par{\b     } India\tab 16\tab PA6\tab Diesel engine\tab (2003)\tab 2014-2020\tab 16\tab For 4 Kamorta (Project-28) frigates produced in India\par{\b     } \tab\tab 36\tab SM-39 Exocet\tab Anti-ship missile\tab 2005\tab 2017-2020\tab (9)\tab Possibly $150 m deal; SM-39 Block-2 version; for Scorpene submarines\par{\b     } \tab\tab 493\tab MICA\tab BVRAAM\tab 2012\tab 2014-2020\tab (493)\tab EUR950 m deal (offsets 30%); MICA-EM and MICA-IR versions; for Mirage-2000-5 combat aircraft\par{\b     } \tab\tab (358)\tab Ardiden-1\tab Turboshaft\tab (2016)\tab \tab \tab For LCH combat helicopter produced in India; produced under licence in India as Shakti\par{\b     } \tab\tab (200)\tab Meteor\tab BVRAAM\tab (2016)\tab 2020\tab (100)\tab Part of EUR710 m deal; for Rafale combat aircraft\par{\b     } \tab\tab (350)\tab MICA\tab BVRAAM\tab (2016)\tab 2020\tab (175)\tab Part of EUR710 m deal; MICA-RF and MICA-IR versions; for Rafale combat aircraft\par{\b     } \tab\tab (200)\tab Storm Shadow/SCALP\tab ASM\tab 2016\tab 2020\tab (100)\tab Part of EUR710 m deal; 
SCALP version; for Rafale combat aircraft\par{\b     } \tab\tab . .\tab AASM\tab ASM\tab (2020)\tab \tab \tab For Rafale combat aircraft\par{\b     } Indonesia\tab (2)\tab AS-565S Panther\tab ASW helicopter\tab 2014\tab 2019-2020\tab 2\tab AS565MBe version; assembled in Indonesia\par{\b     } \tab\tab (14)\tab MIDR\tab Diesel engine\tab (2016)\tab 2019-2020\tab (9)\tab For 14 Badak armoured fire support vehicles produced in Indonesia\par{\b     } \tab\tab (50)\tab MM-40-3 Exocet\tab Anti-ship MI/SSM\tab 2016\tab 2019-2020\tab (25)\tab For SIGMA frigates\par{\b     } \tab\tab 2\tab VL-MICA-M\tab Naval SAM system\tab 2016\tab 2019-2020\tab 2\tab For 2 SIGMA-10514 (Martadinata) frigates from Netherlands\par{\b     } \tab\tab 18\tab CAESAR 155mm\tab Self-propelled gun\tab 2017\tab 2019-2020\tab 18\tab EUR60 m deal\par{\b     } \tab\tab 8\tab EC725 Super Cougar\tab Transport helicopter\tab (2018)\tab \tab \tab H225M armed Combat SAR version\par{\b     } \tab\tab (40)\tab MICA\tab BVRAAM\tab (2018)\tab 2019-2020\tab (40)\tab VL-MICA SAM version for SIGMA-10514 (Martadinata) frigates\par{\b     } Iraq\tab 4\tab Ground Master-400\tab Air search radar\tab 2020\tab \tab \tab Delivery planned from 2021\par{\b     } Italy\tab 14\tab 2R2M 120mm\tab Mortar\tab (2019)\tab \tab \tab For 14 Freccia mortar carriers produced in Italy\par{\b     } Kuwait\tab 30\tab EC725 Super Cougar\tab Transport helicopter\tab 2016\tab 2020\tab 2\tab EUR1.1 b deal; H-225M Caracal version; incl 6 for National Guard; delivery planned 2020-2021\par{\b     } \tab\tab (300)\tab Sherpa\tab APC/APV\tab 2018\tab 2019-2020\tab (300)\tab EUR270 m deal\par{\b     } Malaysia\tab \tab MICA\tab BVRAAM\tab (2015)\tab \tab \tab For 6 Combat Gowind (SGVP-LCS) frigates\par{\b     } \tab\tab 18\tab LG1 105mm\tab Towed gun\tab 2018\tab 2020\tab (18)\tab LG-1 Mk-3 version; assembled from kits in Malaysia\par{\b     } Mali\tab 13\tab Bastion\tab APC/APV\tab 2019\tab 2020\tab 13\tab Aid financed by EU; incl 1 command post 
and 2 ambulance version\par{\b     } Mexico\tab 1\tab CAPTAS TAS\tab ASW sonar\tab (2017)\tab 2020\tab (1)\tab CAPTAS-2 version; for 1 SIGMA-105 (Reformador) frigate from Netherlands\par{\b     } Morocco\tab (36)\tab CAESAR 155mm\tab Self-propelled gun\tab 2020\tab \tab \tab EUR200 m deal\par{\b     } \tab\tab \tab MICA\tab BVRAAM\tab 2020\tab \tab \tab For VL-MICA SAM system\par{\b     } \tab\tab 36\tab Sherpa\tab APC/APV\tab 2020\tab \tab \tab Incl Scout and APC version\par{\b     } \tab\tab \tab VL-MICA\tab SAM system\tab 2020\tab \tab \tab EUR192-200 m deal\par{\b     } Netherlands\tab 6\tab MCM-2720\tab MCM ship\tab 2019\tab \tab \tab Part of EUR2 b 'MCMV' programme (incl 6 for Belgium); delivery planned from 2025\par{\b     } Nigeria\tab 2\tab FPB-110\tab Patrol craft\tab 2019\tab 2020\tab (2)\tab FPB-110 Mk-2 version\par{\b     } \tab\tab 1\tab FPB-72\tab Patrol craft\tab 2019\tab 2020\tab 1\tab \par{\b     } Norway\tab 6\tab FLASH\tab ASW sonar\tab 2002\tab 2019-2020\tab (2)\tab For 6 NH-90 ASW helicopters from Italy\par{\b     } Philippines\tab (40)\tab Mistral\tab Portable SAM\tab 2019\tab \tab \tab Mistral-3 version; for HHI-2600 (Rizal) frigates; delivery planned 2021\par{\b     } Qatar\tab 300\tab AASM\tab ASM\tab 2015\tab 2019-2020\tab (300)\tab For Rafale combat aircraft\par{\b     } \tab\tab 60\tab AM-39 Exocet\tab Anti-ship missile\tab 2015\tab 2019-2020\tab (60)\tab AM-39 Block-2 Mod-2 version; for Rafale combat aircraft\par{\b     } \tab\tab 52\tab M-88\tab Turbofan\tab 2015\tab 2019-2020\tab (52)\tab Spares for Rafale combat aircraft\par{\b     } \tab\tab (160)\tab Meteor\tab BVRAAM\tab 2015\tab 2019-2020\tab (160)\tab For Rafale combat aircraft\par{\b     } \tab\tab 300\tab MICA\tab BVRAAM\tab 2015\tab 2019-2020\tab (300)\tab Incl 150 MICA-ER and 150 MICA-EM version; for Rafale combat aircraft\par{\b     } \tab\tab 24\tab Rafale\tab FGA aircraft\tab 2015\tab 2019-2020\tab (24)\tab Part of EUR6.7 b deal; incl 18 Rafale-EQ (Q3-R) and 6 Rafale-DQ 
versions\par{\b     } \tab\tab 140\tab Storm Shadow/SCALP\tab ASM\tab 2015\tab 2019-2020\tab (140)\tab SCALP version; for Rafale combat aircraft\par{\b     } \tab\tab \tab Exocet CDS\tab Coastal defence system\tab 2016\tab \tab \tab Incl for use with Marte-ER anti-ship missiles; EUR640 m deal; delivery planned from 2022\par{\b     } \tab\tab (30)\tab MICA\tab BVRAAM\tab (2016)\tab \tab \tab For Fincantieri-700 corvettes from Italy\par{\b     } \tab\tab \tab MM-40-3 Exocet\tab Anti-ship MI/SSM\tab 2016\tab \tab \tab For coastal defence\par{\b     } \tab\tab (80)\tab MM-40-3 Exocet\tab Anti-ship MI/SSM\tab (2016)\tab \tab \tab For Fincantieri-3000 frigates and Fincantieri-700 corvettes from Italy\par{\b     } \tab\tab 2\tab VL-MICA-M\tab Naval SAM system\tab (2016)\tab \tab \tab For 2 Fincantieri-700 corvettes from Italy\par{\b     } \tab\tab 12\tab Rafale\tab FGA aircraft\tab 2017\tab \tab \tab Delivery planned 2021-2022\par{\b     } \tab\tab 16\tab AS-350/AS-550 Fennec\tab Light helicopter\tab 2018\tab 2018-2020\tab (16)\tab H125 version\par{\b     } Romania\tab 4\tab Gowind-2500\tab Frigate\tab (2019)\tab \tab \tab EUR1.2 b deal; incl production in Romania; delivery planned from 2022\par{\b     } Saudi Arabia\tab 3\tab Combattante FS-56\tab FAC\tab (2016)\tab \tab \tab EUR250 m deal; originally ordered by Saudi Arabia as aid for Lebanon but taken over by Saudi Arabia after aid cancelled; possibly to be delivered as patrol craft for coast guard\par{\b     } Senegal\tab 2\tab RPB-33\tab Patrol craft\tab 2018\tab 2019\tab 1\tab \par{\b     } \tab\tab (25)\tab Marte-2\tab Anti-ship missile\tab 2019\tab \tab \tab Marte-2/N version; for 3 OPV-58 FAC\par{\b     } \tab\tab \tab Mistral\tab Portable SAM\tab (2019)\tab \tab \tab For Simbad RC SAM on OPV-58 FAC\par{\b     } \tab\tab 3\tab OPV-58\tab FAC\tab 2019\tab \tab \tab \par{\b     } Serbia\tab 50\tab Mistral\tab Portable SAM\tab 2019\tab \tab \tab Mistral M3 version; incl for PASARS-16 SPAAG; delivery planned 
2021\par{\b     } Singapore\tab (12)\tab EC725 Super Cougar\tab Transport helicopter\tab 2016\tab \tab \tab H225M version; delivery planned from 2021\par{\b     } South Korea\tab 4\tab PC2.5\tab Diesel engine\tab (2015)\tab \tab \tab For 1 Dokdo AALS produced in South Korea\par{\b     } Spain\tab 5\tab CAPTAS\tab ASW sonar\tab (2019)\tab \tab \tab Part of EUR166 m deal; CAPTAS-4 Compact version for 5 F-110 frigates produced in Spain\par{\b     } \tab\tab 5\tab UMS-4110 BlueMaster\tab ASW sonar\tab 2019\tab \tab \tab Part of EUR166 m deal; for 5 F-110 frigates produced in Spain\par{\b     } Tanzania\tab 8\tab AS-350/AS-550 Fennec\tab Light helicopter\tab 2017\tab 2018\tab 3\tab H125 version\par{\b     } Thailand\tab 4\tab EC725 Super Cougar\tab Transport helicopter\tab 2018\tab \tab \tab H225M version; delivery planned 2021\par{\b     } \tab\tab 12\tab LG1 105mm\tab Towed gun\tab (2020)\tab \tab \tab THB834 m ($27 m) deal; LG1 Mk-3 version; selected 2020 but possibly not yet ordered by end-2020\par{\b     } Togo\tab (5)\tab SA-342 Gazelle\tab Light helicopter\tab (2017)\tab 2019-2020\tab (5)\tab Second-hand; EUR20 m deal; SA-342M version\par{\b     } Turkey\tab 15\tab Ocean Master\tab MP aircraft radar\tab 2002\tab 2013-2020\tab (10)\tab Part of $400 m deal; part of 'Meltem' programme; for 9 CN-235MPA from Spain and 6 ATR-72 MP aircraft from Italy\par{\b     } UAE\tab 2\tab Helios-2\tab Recce satellite\tab 2015\tab 2020\tab 1\tab EUR700 m deal; Falcon Eye or Pleiades version\par{\b     } \tab\tab 2\tab Gowind-2500\tab Frigate\tab 2019\tab \tab \tab EUR750 m deal; option on 2 more\par{\b     } \tab\tab (30)\tab MM-40-3 Exocet\tab Anti-ship MI/SSM\tab (2019)\tab \tab \tab For Gowind-2500 frigates\par{\b     } \tab\tab \tab RDY\tab Combat ac radar\tab 2019\tab \tab \tab For modernization of Mirage-2000-9 combat aircraft; RDY-3 version\par{\b     } Ukraine\tab 12\tab EC725 Super Cougar\tab Transport helicopter\tab 2018\tab 2018-2019\tab 3\tab Second-hand; part of EUR555 
m deal; H-225 version\par{\b     } United States\tab (45)\tab Mirage F-1C\tab FGA aircraft\tab 2017\tab 2019-2020\tab (15)\tab Second-hand (18 more delivered for spare parts); for US company for training of US armed forces\par{\b     } Uzbekistan\tab 8\tab EC725 Super Cougar\tab Transport helicopter\tab 2018\tab 2018\tab 3\tab Armed H-215M version\par\pard\plain \s6\sb100\sl220\brdrt\brdrs \tqr \tx12860 \f20\fs8 \par 
\pard\plain \s2\fi-9060\li9060\ri-40\sl220\tx280\tqr
\tx2080\tx2520\tx4440\tx6140\tx6860\tqr\tx8500\tx8920 \f20\fs18 
{\plain \f20 \par }
\pard\plain \s3\qj\fi220\sl280 \f20\fs22 {\plain \f20 \par }}

I want to get it into a pandas DataFrame like this:

                                                    Year(s)     
                     No.            Weapon      Weapon      Year        of          No.     
    recipient        ordered        designation description of order    delivery    delivered   Comments
    
    
     Belgium         (8)            RTM-332     Turboshaft  (2001)  2013-2015   (8) For 4 NH90 NFH ASW helicopters from Germany
     Denmark         (9)            FLASH           ASW sonar   2019            ASQ22 ALFS version; for 9 MH-60R ASW helicopters from USA; probably from US production line
     Egypt            2             UMS-4110       BlueMaster   ASW sonar   2020    2020    1   For 2 FREMM frigates from Italy
     Kenya           12             Bastion     APC/APV (2018)  2018    12  Financed by USA
     Qatar           16             NH-90 TTH       Transport helicopter    2018            Part of EUR3 b deal; delivery planned 2021-2025
     Australia       22             EC-665 Tiger    Combat helicopter   2001    2004-2011   (22)    AUD1.3 b ($670-981 m) 'Project Air-87' (offsets incl production of components and assembly of 18 in Australia and production of EC-120 helicopter for Asian market); Aussie Tiger version

I have started researching whether any automated tool exists for this, because the table is really ugly in its raw RTF form.

Tags: python-3.x, pandas, rtf

Solution


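Unfortunately, I could not find any automated tool for this extraction either, but by looking at the RTF control words we can piece together a reasonable working solution.

I noticed that the sequence \par{\b     } separates rows, and that \tab separates the cells within a row.

For now I have simply hard-coded the column names, but perhaps someone with more working knowledge of RTF control words can suggest how to extract those names automatically and improve my answer.

There is also some additional cleanup left to do, because the text of each cell still contains a few other control words.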
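The additional cleanup mentioned above might look roughly like the following sketch. It is not part of the original answer: strip_rtf_controls is a hypothetical helper name, and the regular expression only covers plain control words (a backslash, letters, an optional number and one optional trailing space), not every construct RTF allows. The sample string is the tail of the last table row in the file from the question.

```python
import re

# Matches one RTF control word: a backslash, letters, an optional signed
# number, and the single optional space that terminates the control word.
CONTROL_WORD = re.compile(r"\\[a-zA-Z]+(?:-?\d+)? ?")

def strip_rtf_controls(text: str) -> str:
    """Hypothetical helper: drop control words, group braces and raw line breaks."""
    # Raw line breaks carry no meaning in RTF, so remove them first.
    text = text.replace("\r", "").replace("\n", "")
    text = CONTROL_WORD.sub("", text)
    # Remove the braces that delimit RTF groups.
    text = text.replace("{", "").replace("}", "")
    return text.strip()

# Tail of the last row of the table, as it appears in the question's file.
sample = (
    "Armed H-215M version\\par\\pard\\plain \\s6\\sb100\\sl220"
    "\\brdrt\\brdrs \\tqr \\tx12860 \\f20\\fs8 \\par \n{\\plain \\f20 \\par }"
)
cleaned = strip_rtf_controls(sample)
print(cleaned)
```

Applied to one DataFrame column at a time, for example with Series.map(strip_rtf_controls), a helper like this would also turn cells that contain nothing but \tab into empty strings.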

import pandas as pd

# Read the whole RTF file in as a single string.
with open('weapons.rtf', 'r') as file:
    rawtext = file.read()

# The bold "R:" group marks where the table body starts; keep what follows it.
tabletext = rawtext.split("{\\b R:}")[-1]

# Rows are separated by "\par{\b     } " and cells within a row by "\tab ".
rows = tabletext.split("\\par{\\b     } ")
cells = [row.split("\\tab ") for row in rows]

# Column names are hard-coded for now.
df = pd.DataFrame(data=cells, columns=
    ['recipient','No. ordered','Weapon designation','Weapon description',
    'Year of order','Year of delivery','No. delivered','Comments']
)

# A row that repeats the previous recipient starts with "\tab\tab ", which
# leaves a cell equal to "\tab"; replace those whole-cell values with "".
df_cleaned = df.replace("\\tab", "")

Here are the beginning and end of the output of print(df_cleaned.to_string()):

         recipient No. ordered    Weapon designation      Weapon description Year of order Year of delivery No. delivered                                                                                                                                                                                                                                                                                                  Comments
0          Denmark         (9)                 FLASH               ASW sonar          2019                                                                                                                                                                                                                                              ASQ22 ALFS version; for 9 MH-60R ASW helicopters from USA; probably from US production line
1            Egypt           2   UMS-4110 BlueMaster               ASW sonar          2020             2020             1                                                                                                                                                                                                                                                                           For 2 FREMM frigates from Italy
2            Qatar          16             NH-90 TTH    Transport helicopter          2018                                                                                                                                                                                                                                                                                          Part of EUR3 b deal; delivery planned 2021-2025
3        Australia          12             Barracuda               Submarine        (2019)                                                                                                  AUD90 b ($70 b) 'SEA-1000' programme; produced under licence in Australia (incl. at least 60% of value from Australian production); Shortfin Barracuda Block-1A version; Australian designation Attack; delivery planned 2033/2034-2050
4          Belgium           6              MCM-2720                MCM ship          2019                                                                                                                                                                                                      Part of EUR2 b 'MCMV' programme (incl 6 for Netherlands; incl production of components in Belgium); delivery planned from 2023/2024
5           Brazil          43    EC725 Super Cougar    Transport helicopter          2008        2010-2020          (38)                                                                                                                                                  Part of EUR1.9 b 'H-XBR' programme; H225M version; incl 3 CSAR version; Brazilian designations H-36, HM-4, UH-15A and UH-15B; delivery planned 2010-2022
6                            2                 P-400            Patrol craft          2009                                                                                                                                                                                                                                                          NAPA-500 version; produced under licence in Brazil; Brazilian designation Macae
7                            4              Scorpene               Submarine          2009                                                                                                                                                                                            Part of EUR6.8 b 'SDP' programme; S-BR version; produced under licence in Brazil; Brazilian designation Riachuelo; delivery planned 2021-2024
8                            1                  SNBR       Nuclear submarine          2009                                                                                                                                                                                      Part of EUR6.8 b 'SDP' programme; produced under licence in Brazil with nuclear reactor designed and produced in Brazil; delivery planned 2033/2034
9                         (20)          AM-39 Exocet       Anti-ship missile        (2011)        2018-2020          (15)                                                                                                                                                                                                      AM-39 Block-2Mod-2 version; for EC-725 (AH-15B) helicopters; incl production of components in Brazil
10                           5    EC725 Super Cougar    Anti-ship helicopter          2012        2019-2020           (2)                                                                                                                                                                               Part of EUR1.9 b 'H-XBR' programme; H225M version; Brazilian designation AH-15B and Operacio MB; delivery planned 2019-2022

...

129        Ukraine          12    EC725 Super Cougar    Transport helicopter          2018        2018-2019             3                                                                                                                                                                                                                                                         Second-hand; part of EUR555 m deal; H-225 version
130  United States        (45)           Mirage F-1C            FGA aircraft          2017        2019-2020          (15)                                                                                                                                                                                                           Second-hand (18 more delivered for spare parts); for US company for training of US armed forces
131     Uzbekistan           8    EC725 Super Cougar    Transport helicopter          2018             2018             3  Armed H-215M version\par\pard\plain \s6\sb100\sl220\brdrt\brdrs \tqr \tx12860 \f20\fs8 \par \n\pard\plain \s2\fi-9060\li9060\ri-40\sl220\tx280\tqr\n\tx2080\tx2520\tx4440\tx6140\tx6860\tqr\tx8500\tx8920 \f20\fs18 \n{\plain \f20 \par }\n\pard\plain \s3\qj\fi220\sl280 \f20\fs22 {\plain \f20 \par }}
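
In the output above, rows 6 to 10 have an empty recipient because the source table prints each recipient only once. Assuming that a blank recipient means "same recipient as the row above" (which is how the Brazil rows read in the source file), the gaps could be filled as in the sketch below. The frame named demo is a hypothetical stand-in with a few values copied from the table, not the real df_cleaned.

```python
import pandas as pd

# Stand-in for a few rows of the cleaned frame (values copied from the table).
demo = pd.DataFrame({
    "recipient": ["Brazil", "", "", "China"],
    "Weapon designation": ["EC725 Super Cougar", "P-400", "Scorpene", "AS565S Panther"],
})

# Turn empty recipients into missing values, then carry the last seen
# recipient forward into the rows that follow it.
demo["recipient"] = demo["recipient"].mask(demo["recipient"] == "").ffill()

print(demo["recipient"].tolist())
# ['Brazil', 'Brazil', 'Brazil', 'China']
```

On the real frame, the recipient cells would need to be stripped of stray whitespace first, since the comparison with "" only matches cells that are exactly empty.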
