Gutenberg

Project Gutenberg is an online library of free eBooks.

This notebook covers how to load links to Gutenberg e-books into a document format that we can use downstream.

from langchain_community.document_loaders import GutenbergLoader
API Reference: GutenbergLoader
loader = GutenbergLoader("https://www.gutenberg.org/cache/epub/69972/pg69972.txt")
data = loader.load()
data[0].page_content[:300]
'The Project Gutenberg eBook of The changed brides, by Emma Dorothy\r\n\n\nEliza Nevitte Southworth\r\n\n\n\r\n\n\nThis eBook is for the use of anyone anywhere in the United States and\r\n\n\nmost other parts of the world at no cost and with almost no restrictions\r\n\n\nwhatsoever. You may copy it, give it away or re-u'
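In the excerpt above, each line of the source file appears to end in `\r\n\n\n` inside `page_content`, so a blank line shows up as two such separators in a row. The following is a minimal sketch, assuming that pattern holds for the whole text, of collapsing those separators before downstream processing. `normalize_line_endings` is a hypothetical helper for illustration, not part of LangChain, and the sample string is copied from the output above so that the snippet runs without network access.

```python
def normalize_line_endings(text: str) -> str:
    # Hypothetical helper (not a LangChain API).
    # Assumption: every original line is terminated by "\r\n\n\n",
    # as in the excerpt shown above. Collapsing that separator to a
    # single "\n" restores one newline per line and leaves blank lines
    # as paragraph breaks ("\n\n").
    return text.replace("\r\n\n\n", "\n")


# Sample copied from the start of the page_content excerpt above.
sample = (
    "The Project Gutenberg eBook of The changed brides, by Emma Dorothy\r\n\n\n"
    "Eliza Nevitte Southworth\r\n\n\n"
    "\r\n\n\n"
    "This eBook is for the use of anyone anywhere in the United States and\r\n\n\n"
)

cleaned = normalize_line_endings(sample)
print(cleaned)
```

With a loaded document, the same helper could be applied as `normalize_line_endings(data[0].page_content)`. If the text contains other line-ending patterns, the replacement would need to be adjusted.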
data[0].metadata
{'source': 'https://www.gutenberg.org/cache/epub/69972/pg69972.txt'}