首页 > 解决方案 > 如何从网页中读取内容,然后使用 urllib 将其输出

问题描述

我正在尝试从网站获取随机文本。我的代码:

import urllib.request
import re

url = "https://www.randomlists.com/random-words"

print(url)
request = urllib.request.urlopen(url).read()
request.decode("utf-8")



print(request)

然而,输出:

https://www.randomlists.com/random-words
b'<!doctype html> <html lang="en-US"> <head> <meta charset="UTF-8"> <meta name="viewport" content="width=device-width, initial-scale=1"> <title>Random Word Generator &mdash; Get a list of random words</title> <meta name="description" content="A word randomizer for finding quick inspiration. Generate a random list of words from 2500+ of the most common English words. Also filter by part of speech!" /> <link rel="canonical" href="https://www.randomlists.com/random-words" /> <meta property="og:locale" content="en_US" /> <meta property="og:type" content="website" /> <meta property="og:title" content="Random Word Generator &mdash; Get a list of random words" /> <meta property="og:description" content="A word randomizer for finding quick inspiration. Generate a random list of words from 2500+ of the most common English words. Also filter by part of speech!" /> <meta property="og:url" content="https://www.randomlists.com/random-words" /> <meta property="og:site_name" content="Random Lists" /> <meta property="og:image" content="https://www.randomlists.com/img/og-image.jpg" /> <meta property="og:image:secure_url" content="https://www.randomlists.com/img/og-image.jpg" /> <meta property="og:image:height" content="1200" /> <meta property="og:image:width" content="1200" /> <meta property="og:image:alt" content="Playing cards randomly spread about" /> <meta property="og:image:type" content="image/jpg" /> <link rel="apple-touch-icon" sizes="180x180" href="/apple-touch-icon.png"> <link rel="icon" type="image/png" sizes="32x32" href="/favicon-32x32.png"> <link rel="icon" type="image/png" sizes="16x16" href="/favicon-16x16.png"> <link rel="manifest" href="/manifest.json"> <link rel="mask-icon" href="/safari-pinned-tab.svg" color="#000000"> <meta name="msapplication-TileColor" content="#ffffff"> <meta name="theme-color" content="#ffffff"> <link rel="preload" data-rand_json href="/data/words.json" as="fetch"> <style>@charset "UTF-8";.Header-content,.layout-main,.main_width{width:100%}@media only screen and (min-width:500px){.Header-content,.layout-main,.main_width{width:calc(100% - 120px - 1rem)}}@media only screen and (min-width:700px){.Header-content,.layout-main,.main_width{width:calc(100% - 160px - 1rem)}}@media only screen and (min-width:900px){.Header-content,.layout-main,.main_width{max-width:calc(100% - 300px - 1rem)}}.Header-content,.layout-main,.main_width{width:100%}@media only screen and (min-width:500px){.Header-content,.layout-main,.main_width{width:calc(100% - 120px - 1rem)}}@media only screen and (min-width:700px){.Header-content,.layout-main,.main_width{width:calc(100% - 160px - 1rem)}}@media only screen and (min-width:900px){.Header-content,.layout-main,.main_width{max-width:calc(100% - 300px - 1rem)}}html{font-size:14px}@media only screen and (min-width:728px){html{font-size:16px}}body{background:#efefea}body,input{margin:0;font-family:Roboto;line-height:1.4em}h1,h2,h3,p{margin-top:0;margin-bottom:1rem}h1,h2,h3{font-family:"Roboto Condensed";font-weight:400;line-height:1.1em}h1{font-size:1.8rem}h2{font-size:1.4rem}h3{font-size:1.2rem}strong{font-weight:500}.section_wide{max-width:1200px;margin-left:auto;margin-right:auto;box-sizing:border-box}.section_wide.section_gutter{box-sizing:content-box}.section_gutter{padding-left:1rem;padding-right:1rem;box-sizing:border-box}.button{display:-webkit-inline-box;display:inline-flex;-webkit-box-orient:vertical;-webkit-box-direction:normal;flex-direction:column;-webkit-box-align:center;align-items:center;text-align:center;-webkit-box-pack:center;justify-content:center;text-decoration:none;color:#fff;box-sizing:border-box;border-width:0;background:#0e1a35;cursor:pointer;height:44px;width:48px;padding:2px;font-size:8px;line-height:1.25em;-webkit-transition:all ease .2s;transition:all ease .2s;text-transform:uppercase;font-family:Arial}.button:before{font-family:icon;font-size:1.2rem;line-height:1em;margin-bottom:.4rem}.button:hover{background:#224287}.button:focus{outline-style:solid;outline-width:2px}.button--yo{background:#1985a3!important}.button--yay{background:#19a337!important}.button--yay:before{content:"\xef\x80\x8c"!important}.button--edit:before{content:"\xef\x81\x80"!important}.button--wide{width:64px}.button:disabled{background:#222;cursor:no-drop;opacity:.8}.js_notice{position:fixed;bottom:0;left:0;width:100%;padding:1rem;background:#a32019;color:#fff;text-align:center}.js_notice a{color:currentColor}@font-face{font-family:icon;src:url(/vendor/icomoon/fonts/icomoon.ttf?cq58c8) format("truetype"),url(/vendor/icomoon/fonts/icomoon.woff?cq58c8) format("woff"),url(/vendor/icomoon/fonts/icomoon.svg?cq58c8#icomoon) format("svg");font-weight:400;font-style:normal}.layout{display:-webkit-box;display:flex;flex-wrap:wrap;-webkit-box-pack:justify;justify-content:space-between}.layout-main{background:#fff;box-shadow:0 0 1rem rgba(0,0,0,.2)}.layout-top{margin:.5rem 0 1.5rem;display:-webkit-box;display:flex;-webkit-box-align:center;align-items:center;-webkit-box-pack:center;justify-content:center}.adsbygoogle--top{width:320px;min-height:50px;max-height:100px}@media only screen and (min-width:630px){.adsbygoogle--top{width:468px;height:60px;min-height:0;max-height:none}}@media only screen and (min-width:1065px){.adsbygoogle--top{width:728px;height:90px}}.adsbygoogle--side1,.adsbygoogle--side2,.layout-side{display:none}@media only screen and (min-width:500px){.layout-side{display:-webkit-box;display:flex;display:block;margin:-2rem 0 0;width:120px}.adsbygoogle--side1,.adsbygoogle--side2{display:block;width:120px;height:600px}}@media only screen and (min-width:700px){.layout-side{width:160px}.adsbygoogle--side1,.adsbygoogle--side2{width:160px;height:600px}}@media only screen and (min-width:900px){.layout-side{width:300px}.adsbygoogle--side1{width:300px;height:250px}.adsbygoogle--side2{width:300px;height:600px}}.Header{background:#182e5e;color:#fff;border-bottom:solid 1px rgba(0,0,0,.1)}.Header a{color:inherit}.Header-content{display:-webkit-box;display:flex;height:3rem}.Header-logo{margin-left:-1rem;margin-right:auto}.Header-logo a{display:-webkit-box;display:flex;-webkit-box-align:center;align-items:center;font-size:1.5rem;line-height:1em;height:3rem;padding:0 1rem;text-decoration:none;white-space:nowrap;position:relative;margin-left:-.2rem}.Header-logo a:before{content:\'\';background:url(data:image/svg+xml;base64,PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0idXRmLTgiPz48c3ZnIHZlcnNpb249IjEuMSIgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzIwMDAvc3ZnIiB4bWxuczp4bGluaz0iaHR0cDovL3d3dy53My5vcmcvMTk5OS94bGluayIgeD0iMHB4IiB5PSIwcHgiIHZpZXdCb3g9IjAgMCAyNCAyNCIgc3R5bGU9ImVuYWJsZS1iYWNrZ3JvdW5kOm5ldyAwIDAgMjQgMjQ7IiB4bWw6c3BhY2U9InByZXNlcnZlIj48cmVjdCB4PSIxIiB5PSIxIiBmaWxsPSIjZmZmIiB3aWR0aD0iMjIiIGhlaWdodD0iMjIiLz48Zz48cGF0aCBkPSJNMjMsMEgxQzAuNCwwLDAsMC40LDAsMXYyMmMwLDAuNiwwLjQsMSwxLDFoMjJjMC42LDAsMS0wLjQsMS0xVjFDMjQsMC40LDIzLjYsMCwyMywweiBNMjIsMjJIMlYyaDIwVjIyeiIvPjxwYXRoIGQ9Ik0xMiwxNGMxLjEsMCwyLTAuOSwyLTJzLTAuOS0yLTItMnMtMiwwLjktMiwyUzEwLjksMTQsMTIsMTR6Ii8+PHBhdGggZD0iTTE3LDE5YzEuMSwwLDItMC45LDItMnMtMC45LTItMi0ycy0yLDAuOS0yLDJTMTUuOSwxOSwxNywxOXoiLz48cGF0aCBkPSJNNyw5YzEuMSwwLDItMC45LDItMlM4LjEsNSw3LDVTNSw1LjksNSw3UzUuOSw5LDcsOXoiLz48cGF0aCBkPSJNMTcsOWMxLjEsMCwyLTAuOSwyLTJzLTAuOS0yLTItMnMtMiwwLjktMiwyUzE1LjksOSwxNyw5eiIvPjxwYXRoIGQ9Ik03LDE5YzEuMSwwLDItMC45LDItMnMtMC45LTItMi0ycy0yLDAuOS0yLDJTNS45LDE5LDcsMTl6Ii8+PC9nPjwvc3ZnPg==) no-repeat 0 0;background-size:1.25rem 1.25rem;width:1.25rem;height:1.25rem;margin-right:.2rem}.Header-menu{display:-webkit-box;display:flex;margin-right:-1rem}.Header-menu>a{height:3rem;width:3rem;margin-left:2px}.Header-menu [href*=search]:before{content:"\xef\x80\x82"}.Header-menu [href*=browse]:before{content:"\xef\x83\x89";margin-bottom:.2rem;line-height:1.4rem}.Header-suggested{margin:0 .4rem 0 0;overflow:hidden;height:3rem;font-size:.9rem}.Header-suggested ul{margin:0;padding:0;list-style:none;display:-webkit-box;display:flex;flex-wrap:wrap;display:flex;-webkit-box-pack:end;justify-content:flex-end}.Header-suggested li{display:-webkit-box;display:flex;padding:0}.Header-suggested a{display:-webkit-box;display:flex;-webkit-box-align:center;align-items:center;line-height:1em;height:3rem;padding:0 .5rem}@media only screen and (max-width:620px){.Header-suggested{display:none}}.Header-search{display:none}.Rand-header{margin:1rem 0 2rem;padding-top:1rem;display:-webkit-box;display:flex;-webkit-box-pack:justify;justify-content:space-between}.Rand-headline{margin:0 1rem 0 0;align-self:center}.Rand-tools1{align-self:flex-start;margin:0;white-space:nowrap;position:relative}.Rand-tools2{display:-webkit-box;display:flex;-webkit-box-pack:center;justify-content:center;position:relative;margin:2rem 0;text-align:center}.ShareButtons:not([hidden]){display:-webkit-box;display:flex;-webkit-box-pack:center;justify-content:center;text-align:center;position:absolute;top:calc(100%);left:50%;-webkit-transform:translateX(-50%);transform:translateX(-50%);z-index:10;padding:4px;background:rgba(255,255,255,.8)}.Rand-tools1 button:not(:first-child),.Rand-tools2 button:not(:first-child),.ShareButtons button:not(:first-child){margin-left:4px}.Rand-ad{display:-webkit-box;display:flex;margin:1rem -1rem;-webkit-box-pack:center;justify-content:center}.adsbygoogle--rand{width:300px;height:250px}@media only screen and (min-width:350px){.adsbygoogle--rand{width:336px;height:280px}}@media only screen and (min-width:1065px){.adsbygoogle--rand{width:728px;height:90px}}@media only screen and (min-width:468px){.Rand-grid{display:-webkit-box;display:flex;-webkit-box-pack:justify;justify-content:space-between;-webkit-box-align:start;align-items:flex-start}.Rand-grid>*{width:calc(50% - 1.5rem)}}.RandOptions+.RandCopy{padding-top:1rem}.RandCopy-nav .select-nav a:after{content:","}.RandCopy-nav *{display:inline;padding:0}.RandOptions{background:rgba(1,1,1,.02);padding:1rem}.RandOptions h2{margin:0 0 1rem}.RandOptions-row{margin:0 0 1rem}.RandOptions-row input:not([type=checkbox]),.RandOptions-row select,.RandOptions-row textarea{padding:.5rem;box-sizing:border-box;min-width:140px;max-width:100%}.RandOptions-row label:first-child{display:block;margin-bottom:.1rem}.RandOptions-row input:not(:first-child),.RandOptions-row select{display:block}.RandOptions-row textarea{min-height:200px;width:100%}.RandOptions-row input[type=number]{min-width:0;width:5em}.RandOptions-row input[type=checkbox]{width:1.5rem;height:1.5rem;vertical-align:bottom}.RandOptions-helper{display:block;font-size:.7rem;line-height:1.1em;margin-top:.1rem}[data-action=rerun]:before{content:"\xef\x81\xb4"}[data-action=options]:before{content:"\xef\x80\x93"}[data-action=share]:before{content:"\xef\x87\xa0"}[data-share=copy]:before{content:"\xef\x83\x85"}[data-share=facebook]:before{content:"\xef\x82\x9a"}[data-share=twitter]:before{content:"\xef\x82\x99"}[data-share=reddit]:before{content:"\xef\x8a\x81"}.Rand-stage-loading{width:100%;text-align:center;font-size:12px;text-indent:.8em;text-transform:uppercase;line-height:1em;background:transparent url(/img/loading.gif) no-repeat 50% .5rem;padding:2rem 1rem 1rem}.Rand-stage--no_images{text-align:left!important}.Rand-stage--no_images .img,.Rand-stage--no_images img{display:none!important}.Rand-stage--highlight,.RandOptions--highlight{background:rgba(1,1,1,.04)!important}.Rand-stage{display:-webkit-box;display:flex;margin:0;background:rgba(1,1,1,.02);padding:1rem .5rem 1rem 0}.Rand-stage ol,.Rand-stage ul{display:-webkit-box;display:flex;flex-wrap:wrap;width:100%;margin:0;padding:0;list-style:none}.Rand-stage li{position:relative;margin:0 0 .5rem;padding:.5rem 1rem;box-sizing:border-box;word-break:break-word;min-width:20%}@media only screen and (min-width:900px){.Rand-stage li{min-width:10%}}.Rand-stage ol{counter-reset:li;padding-left:.5rem}.Rand-stage ol>li{position:relative}.Rand-stage ol>li:before{counter-increment:li;content:counter(li);position:absolute;right:calc(100% - .8rem + .5ex);top:calc(.5rem + .5em);font-size:.8rem;line-height:1em;font-family:monospace;opacity:.5;text-align:right;white-space:nowrap}.Rand-stage img{max-width:100%;height:auto}.Rand-stage>ol>li img{display:inline-block;margin:0 0 4px}.Rand-stage>ol>li a.biglink{display:block;position:relative;padding:0 1rem 0 0;text-decoration:none}.Rand-stage>ol>li a.biglink:after{content:"\xef\x82\x8e";font-family:icon;position:absolute;bottom:0;right:0;font-size:1rem;color:#000;-webkit-transition:opacity .3s ease,color .3s ease;transition:opacity .3s ease,color .3s ease;opacity:.75}.Rand-stage>ol>li a.biglink:hover:after{color:#0e1a35;opacity:1}span.rand_large,span.rand_medium,span.rand_small{display:block}.rand_huge{font-size:2.5rem;line-height:1.1em}.rand_large{font-size:1.5rem;line-height:1.25em}.rand_medium{font-size:1rem;color:#000}.rand_small{font-size:.8rem;color:#999}.monospace{font-family:monospace}</style> <link rel="preload" href="/css/defer.css?v=1575659034" as="style" onload="this.rel=\'stylesheet\'"> <link rel="preload" href="https://fonts.googleapis.com/css?family=Roboto%2BCondensed%7CRoboto%3A400%2C500&display=swap" as="style" onload="this.rel=\'stylesheet\'"> <script> (function(){ const re = new RegExp("items=[^\\&]"); if(re.test(location.search)){ var style = document.createElement("style"); document.getElementsByTagName(\'head\')[0].appendChild(style); style.innerHTML = ".adsbygoogle{display:none !important;height:0 !important;width:0 !important;}"; } }()); </script> <script async src="//pagead2.googlesyndication.com/pagead/js/adsbygoogle.js"></script> <script> (adsbygoogle = window.adsbygoogle || []).push({ google_ad_client: "ca-pub-5711305288167877", enable_page_level_ads: true }); </script> <script src="/js/defer.js?v=1575659071" defer></script> <script async src="https://www.googletagmanager.com/gtag/js?id=UA-40634703-1"></script> <script> window.dataLayer = window.dataLayer || []; function gtag(){dataLayer.push(arguments);} gtag(\'js\', new Date()); gtag(\'config\', \'UA-40634703-1\'); </script> </head> <body> <header class="Header"> <div class="section_wide"> <div class="Header-content section_gutter"> <div class="Header-logo"> <a href="/">Random Lists</a> </div> <nav class="Header-suggested"> <ul> <li><a href="/nouns">Nouns</a></li> <li><a href="/pictionary-words">Pictionary</a></li> <li><a href="/random-verbs">Verbs</a></li> <li><a href="/random-vocabulary-words">Vocabulary Words</a></li> <li><a href="/compound-words">Compound Words</a></li> </ul> </nav> <div class="Header-menu"> <a href="/search" class="button">Search</a> <a href="/#browse" class="button">Menu</a> </div> <div class="Header-search"> <form method="get" action="/search"> <div class="Header-search_flex"> <input type="text" name="q" required placeholder="Search&hellip;"> <button class="button"><span>Search</span></button> </div> </form> </div> </div> </div> </header> <div class="layout section_wide"> <div class="layout-main"> <div class="layout-top"> <ins class="adsbygoogle adsbygoogle--top" style="display:inline-block" data-ad-client="ca-pub-5711305288167877" data-ad-slot="3781211666"></ins> <script> (adsbygoogle = window.adsbygoogle || []).push({}); </script> </div> <main class="layout-part3 section_gutter"> <article> <div class="Rand-header"> <h1 class="Rand-headline">Random word generator:</h1> <aside class="Rand-tools1"> <button data-action="rerun" class="button">Rerun</button><button data-action="options" class="button">Options</button> </aside> </div> <div class="Rand-stage"> <div class="Rand-stage-loading"> Loading&hellip; </div> </div> <aside class="Rand-tools2"> <button data-action="rerun" class="button">Rerun</button><button data-action="options" class="button">Options</button> <button data-action="share" id="Rand-ShareButtonsLabel" aria-expanded="false" aria-controls="Rand-ShareButtons" class="button">Share</button> <div class="ShareButtons" id="Rand-ShareButtons" aria-labelledby="Rand-ShareButtonsLabel" hidden> <button data-share="copy" class="button">Copy URL</button> <button data-share="facebook" class="button">Facebook</button> <button data-share="twitter" class="button">Twitter</button> <button data-share="reddit" class="button">Reddit</button> </div> </aside> <aside class="Rand-ad"> <ins class="adsbygoogle adsbygoogle--rand" style="display:inline-block" data-ad-client="ca-pub-5711305288167877" data-ad-slot="1928302466"></ins> <script> (adsbygoogle = window.adsbygoogle || []).push({}); </script> </aside> <div class="Rand-grid"> <aside class="RandOptions"> <h2>Edit Settings</h2> <form action="#" id="rand_options"> <p class="RandOptions-row"> <label for="rand_jumper">Dataset</label> <select id="rand_jumper"></select> </p> <p class="RandOptions-row"> <label for="rand_options_qty">Quantity</label> <input id="rand_options_qty" type="number" name="qty" value="12"> </p> <p class="RandOptions-row"> <input type="checkbox" name="dup" id="rand_options_dup"> <label for="rand_options_dup">Duplicates</label> </p> </form> <p><button data-action="rerun" class="button">Rerun</button></p> </aside> <div class="RandCopy"> <h2>Random Word Generator</h2> <p>Supposedly there are over one million words in the English Language. We trimmed some fat to take away really odd words and determiners. Then we grabbed the most popular words and built this word randomizer. Just keep clicking generate&mdash;chances are you won\'t find a repeat!</p> <h3>Random Word Games</h3> <p>As an exercise for English students, generate a list of ten random words and have the student write a story that incorporates those words in the order they\'re generated.</p> <p>You could also take the hard work out of playing MadLibs but for that you\'ll need to separate out the parts of speech. There\'s generators for each one, just jump over using the options below.</p> <div class="RandCopy-nav"> Also try: <nav class="select-nav"> <a href="https://www.randomlists.com/random-words">Words</a> <ul> <li><a href="https://www.randomlists.com/random-adjectives">Adjectives</a></li> <li><a href="https://www.randomlists.com/random-adverbs">Adverbs</a></li> <li><a href="https://www.randomlists.com/compound-words">Compound Words</a></li> <li><a href="https://www.randomlists.com/nouns">Nouns</a></li> <li><a href="https://www.randomlists.com/random-prepositions">Prepositions</a></li> <li><a href="https://www.randomlists.com/spanish-words">Spanish Words</a></li> <li><a href="https://www.randomlists.com/random-verbs">Verbs</a></li> <li><a href="https://www.randomlists.com/random-vocabulary-words">Vocabulary Words</a></li> </ul> </nav> or just <a href="/list-randomizer">create your own list</a>. </div> </div> </div> </article> <div id="rand-image-preload"></div> <script>const rand = "words";</script> <script type="application/ld+json">{"@context":"https:\\/\\/schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"name":"Random Lists","item":"https:\\/\\/www.randomlists.com\\/"},{"@type":"ListItem","position":1,"name":"Words","item":"https:\\/\\/www.randomlists.com\\/random-words"}]}</script> </main> <div class="layout-prefooter"> <ins class="adsbygoogle adsbygoogle--prefooter" style="display:inline-block" data-ad-client="ca-pub-5711305288167877" data-ad-slot="3781211666"></ins> <script> (adsbygoogle = window.adsbygoogle || []).push({}); </script> </div> <footer class="footer section_gutter"> <nav> <ul> <li><a href="/">Home</a></li> <li><a href="/faq">FAQ</a></li> <li><a href="/privacy">Privacy Policy</a></li> <li><a href="/contact">Contact</a></li> <li><a href="/list-randomizer">Randomize your custom list</a></li> <li class="footer-sitemap"><a href="/all">Sitemap</a></li> </ul> </nav> <p class="disclosure">This site is offered as is to those who visit it. We make no guarantees regarding its services. Just enjoy yourself.</p> </footer> </div> <div class="layout-side"> <ins class="adsbygoogle adsbygoogle--side1" style="display:inline-block" data-ad-client="ca-pub-5711305288167877" data-ad-slot="2304478467"></ins> <script> (adsbygoogle = window.adsbygoogle || []).push({}); </script> <ins class="adsbygoogle adsbygoogle--side2" style="display:inline-block" data-ad-client="ca-pub-5711305288167877" data-ad-slot="2304478467"></ins> <script> (adsbygoogle = window.adsbygoogle || []).push({}); </script> </div> </div> <script> const el_RS=document.querySelector(".Rand-stage"),el_RO=document.getElementById("rand_options"),useWebP=function(){var e=document.createElement("canvas");return!(!e.getContext||!e.getContext("2d"))&&0==e.toDataURL("image/webp").indexOf("data:image/webp")}();function randomiseNumbers(e,t,n,r){e=parseInt(e),t=parseInt(t),n=parseInt(n);var o=t-e+1,a=new Array;if(r){for(var i=0;i<n;i++)a[i]=e+Math.floor(Math.random()*o);return a}var s=1;for(i=0;i<o;i++)s=(n-a.length)/(o-i),Math.random()<=s&&a.push(i+e);return randomise(a,n,r)}function randomise(e,t,n){if(Array.isArray(e)||(e=e.split("")),e.shuffle(),n){for(var r=new Array,o=0;o<t;o++)r[o]=e[Math.floor(Math.random()*e.length)];return r}return t>0&&t<e.length?e.slice(0,t):e}Array.prototype.shuffle=function(){var e,t,n=this.length;if(0!=n)for(;--n;)e=Math.floor(Math.random()*(n+1)),t=this[n],this[n]=this[e],this[e]=t};const rand_json=function(){var e=[];const t=document.head.querySelectorAll("[data-rand_json]");for(var n=t.length-1;n>=0;n--)e.unshift(t[n].getAttribute("href"));return e}();function getRandOptions(){if(!el_RO)return;var e={};const t=el_RO.querySelectorAll("[name]");for(var n=t.length-1;n>=0;n--){const r=t[n].getAttribute("name"),o=t[n].getAttribute("id"),a=document.getElementById(o);"checkbox"===t[n].type?e[r]=a.checked:e[r]=a.value}return e}function strip(e){return(new DOMParser).parseFromString(e,"text/html").body.textContent||""}function triggerEvent(e,t){t=void 0===t?window:t;var n=document.createEvent("HTMLEvents");n.initEvent(e,!1,!0),t.dispatchEvent(n)}function getJSON(e,t){var n=new XMLHttpRequest;n.open("GET",e,!0),n.onload=function(){if(n.status>=200&&n.status<400){const e=JSON.parse(n.responseText);t(e)}},n.send()}function copyText(e){var t=document.createElement("input");t.setAttribute("style","opacity:0;position:absolute;"),t.value=e,document.body.appendChild(t),t.select(),document.execCommand("copy"),t.parentNode.removeChild(t)}if(function(){if(!el_RO)return;var e,t=(e||document.location.search).replace(/(^\\?)/,"").split("&").map(function(e){return this[(e=e.split("="))[0]]=e[1],this}.bind({}))[0];const n=el_RO.querySelectorAll("[name]");for(var r=n.length-1;r>=0;r--){var o=n[r].getAttribute("name");if(!t[o])return;if("checkbox"===n[r].type){const e=t[o]&&"false"!=t[o];n[r].checked=e}else{var a=decodeURIComponent(t[o]).replace(/\\+/g," ");n[r].value=strip(a)}}}(),el_RS){const e=function(){var e=document.getElementById("style_rand_stage");e?e.innerHTML="":((e=document.createElement("style")).setAttribute("id","style_rand_stage"),document.getElementsByTagName("head")[0].appendChild(e));var t=document.querySelector(".Rand-stage > ol");if(t){var n=t.offsetWidth,r=function(e,t,n){var r=document.querySelectorAll(e);if(0!=r.length){for(var o=0,a=0;a<r.length;a++){var i=r[a].offsetWidth;i>o&&(o=i)}var s=1/Math.floor(t/o);s=100*(s<1?s:1),n.innerHTML+=e+"{min-width:"+s+"%;}"}};r(".Rand-stage > ol > li",n,e),r(".Rand-stage > ol ol > li",n,e)}};window.addEventListener("rand_ran",e),window.addEventListener("resize",e),window.addEventListener("fonts_loaded",e)}!function(){const e=document.head.querySelectorAll("[as=style]");for(var t=e.length-1;t>=0;t--)e[t].setAttribute("rel","stylesheet")}();var runRand=function(){var a=getRandOptions();getJSON(rand_json,function(n){var e=n.data||n.RandL.items,s=!!(n&&n.RandL&&n.RandL.meta&&n.RandL.meta.img)&&n.RandL.meta.img;s.local&&s.suffix&&useWebP&&(".png"!=s.suffix&&".jpg"!=s.suffix||(s.suffix=".webp"));var r=function(a,n){return a};s&&(r=function(a,n){var e=n.prefix;return n.local?n.merge_underscores?e+=a.replace(/\\W+/g,"_").toLowerCase():e+=a.replace(/\\W/g,"_").toLowerCase():e+=a,n.suffix&&(e+=n.suffix),e});for(var i=randomiseNumbers(0,e.length-1,a.qty,a.dup),l="",t=0;t<i.length;t++){var m="";if("string"==typeof e[i[t]]||e[i[t]]instanceof String)s&&n.RandL.meta.img.local?(m+="<span class=\'img\'><img src=\'"+r(e[i[t]],s)+"\' alt=\'\'></span>",m+="<span class=\'rand_medium\'>"+e[i[t]]+"</span>"):m+="<span class=\'rand_large\'>"+e[i[t]]+"</span>";else{var d=!1;if(e[i[t]].img?(m+="<span class=\'img\'><img src=\'"+r(e[i[t]].img,s)+"\' alt=\'\'></span>",d=!0):s&&n.RandL.meta.img.local&&(m+="<span class=\'img\'><img src=\'"+r(e[i[t]].name,s)+"\' alt=\'\'></span>",d=!0),e[i[t]].yt&&(m+="<div class=\'yt\'><iframe width=\'420\' height=\'315\' src=\'//www.youtube.com/embed/"+e[i[t]].yt+"\' frameborder=\'0\' allowfullscreen></iframe></div>",d=!0),e[i[t]].name)m+="<span class=\'"+(d?"rand_medium":"rand_large")+"\'>"+e[i[t]].name+"</span>";if(e[i[t]].detail)m+="<span class=\'"+(d?"rand_small":"rand_medium")+"\'>"+e[i[t]].detail+"</span>";e[i[t]].url&&(m="<a class=\'biglink\' href=\'"+e[i[t]].url+"\' target=\'_blank\'>"+m+"</a>")}l+="<li>"+m+"</li>"}el_RS.innerHTML="<ol>"+l+"</ol>",triggerEvent("rand_ran")})};runRand(); </script> <noscript> <div class="js_notice">You need to enable JavaScript. See: <a href="https://www.enable-javascript.com/" target="_blank">How to enable JavaScript in your browser</a></div> </noscript> </body> </html>'

这是html。我也试过:

import urllib.request
import re

url = "https://www.randomlists.com/random-words"

print(url)
request = urllib.request.urlopen(url)

word = request.read()
word.decode("utf-8")


print(word)

但它输出相同。如何读取网页内容然后输出?

所需输出的示例:

machine
somber
fancy
bitter
limping
lip
clear
worry
belief
arm
jam
board

该网页一次有 12 个随机生成的单词。

标签: pythonurllib

解决方案


正如现代 Web 应用程序经常发生的那样,您使用的应用程序有一个未记录的 JSON api,它可能比设计为由浏览器呈现的 HTML 页面更适合由外部应用程序读取。

如果您在加载页面时检查了浏览器发出的请求,您可能已经注意到浏览器发出了对 url 的请求https://www.randomlists.com/data/words.json。这会将应用程序使用的整个单词列表作为 json 数组返回。

下面的代码获取这个 url 并使用 Python 随机数函数随机选择单词。

import json
import random
import urllib.request

NOUT = 15
URL = "https://www.randomlists.com/data/words.json"
request_data = urllib.request.urlopen(URL).read()
json_data = json.loads(request_data)
nwords = len(json_data["data"])
idxs = random.choices(range(nwords), k=NOUT)
random_words = [json_data["data"][i] for i in idxs]

for word in random_words:
    print(word)

每次运行此代码时,它都会输出一个NOUT不同单词的列表。


推荐阅读