Web scraping, web harvesting, or web data extraction is data scraping used for extracting data from websites. For broader coverage of this topic, see Data scraping.
Web scraping software may access the World Wide Web directly using the Hypertext Transfer Protocol or through a web browser. While web scraping can be done manually by a software user, the term typically refers to automated processes implemented using a bot or web crawler. It is a form of copying in which specific data is gathered and copied from the web, typically into a central local database or spreadsheet, for later retrieval or analysis. Scraping a web page involves fetching it and then extracting data from it. Fetching is the downloading of a page (which a browser does when a user views a page).
Web crawling is therefore a main component of web scraping, since it fetches pages for later processing. Once a page has been fetched, extraction can take place. The content of the page may be parsed, searched, and reformatted, and its data copied into a spreadsheet or loaded into a database.
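As a rough illustration of the two steps that make up scraping, fetching a page and then extracting data from it, the following Python sketch uses only the standard library. The names `fetch`, `LinkExtractor`, `extract_links`, `to_csv`, the `example-scraper/0.1` user agent, and the sample markup are all assumptions made for this example; they do not come from any particular scraping tool. Real scrapers often rely on dedicated parsing libraries instead.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.request import Request, urlopen


def fetch(url: str, timeout: float = 10.0) -> str:
    """Fetch step: download a page over HTTP and return its decoded text."""
    # The User-Agent value is a placeholder chosen for this sketch.
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class LinkExtractor(HTMLParser):
    """Extraction step: collect (href, link text) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the anchor currently being read, if any
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html: str):
    """Parse an HTML string and return the links found in it."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


def to_csv(rows) -> str:
    """Copy the extracted rows into spreadsheet-friendly CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["href", "text"])
    writer.writerows(rows)
    return buffer.getvalue()


# Sample markup stands in for a fetched page so the sketch runs offline.
SAMPLE_HTML = """
<html><body>
  <a href="/products/1">Widget</a>
  <a href="/products/2">Gadget</a>
</body></html>
"""

rows = extract_links(SAMPLE_HTML)
csv_text = to_csv(rows)
```

In practice the sample string would be replaced by the result of `fetch(url)` for a page one is permitted to scrape, and a crawler would repeat the fetch step for each discovered link before handing the pages to the extraction step.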