UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 9: illegal multibyte sequence

2023-07-31,,

最近对爬虫有点着迷,

在用bs4模块时,遇到报错:UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 9: illegal multibyte sequence

bs4获取本地文件内容

from bs4 import BeautifulSoup
soup = BeautifulSoup(open('a.html'), 'html.parser')
print(soup.prettify()) # 打印本地文件的内容

其中,a.html的内容为:

<div>大家好</div>
<p>你好啊</p>

运行报错

上面是字符流的问题

from bs4 import BeautifulSoup
soup = BeautifulSoup(open('a.html', 'rb'), 'html.parser')
print(soup.prettify()) # 打印本地文件的内容

运行结果:

 

UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 9: illegal multibyte sequence的相关教程结束。

《UnicodeDecodeError: 'gbk' codec can't decode byte 0xae in position 9: illegal multibyte sequence.doc》

下载本文的Word格式文档,以方便收藏与打印。