Webscraping Geospatial Data with Python/Beautiful Soup

Maptime brings experts, beginners, and the curious together, all for the sake of spatial learning. Our goal for the Seattle branch of Maptime is to connect geospatial professionals with those who want to learn more about maps and spatial data.

https://www.meetup.com/maptimesea/events/300385852/

Beautiful Soup, so rich and green,

Waiting in a hot tureen!

Who for such dainties would not stoop?

Soup of the evening, beautiful Soup!

Soup of the evening, beautiful Soup!

Beau–ootiful Soo–oop!

Beau–ootiful Soo–oop!

Soo–oop of the e–e–evening,

Beautiful, beautiful Soup!

–Lewis Carroll

Many websites have geodata that’s embedded in HTML, with no published API to retrieve the original underlying data. Beautiful Soup is a Python library for quick, simple extraction of data from HTML pages. At the end of this tutorial, you will know how to write a Python script to use Beautiful Soup to extract geographic data from web pages that you didn’t write and don’t control.
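To give a flavor of what such a script looks like, here is a minimal sketch. It assumes the coordinates sit in data-lat and data-lon attributes on list items; that layout, the "place" class name, the extract_places helper, and the sample HTML (with approximate coordinates) are all invented for illustration. Real pages will be structured differently, and working out that structure is most of the job.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragment standing in for a real page. The tag layout,
# class name, and data-* attributes are assumptions made for this sketch,
# and the coordinates are approximate.
SAMPLE_HTML = """
<ul>
  <li class="place" data-lat="47.6205" data-lon="-122.3493">Space Needle</li>
  <li class="place" data-lat="47.6097" data-lon="-122.3422">Pike Place Market</li>
</ul>
"""


def extract_places(html):
    """Return a list of (name, lat, lon) tuples found in the HTML."""
    # "html.parser" is the parser bundled with Python, so nothing extra
    # needs to be installed beyond Beautiful Soup itself.
    soup = BeautifulSoup(html, "html.parser")
    places = []
    for item in soup.find_all("li", class_="place"):
        name = item.get_text(strip=True)
        lat = float(item["data-lat"])
        lon = float(item["data-lon"])
        places.append((name, lat, lon))
    return places


if __name__ == "__main__":
    for name, lat, lon in extract_places(SAMPLE_HTML):
        print(f"{name}: {lat}, {lon}")
```

On a live site you would first fetch the page and then pass its HTML text to the same kind of function.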

We’ll dip our toes into several areas:

If you know of a website containing geodata that you’d like to extract and use, bring the URL and we can look at how to attack it.

Agenda:

What to Bring:

How to Prepare: Please install Google Chrome; Anaconda, available at https://www.anaconda.com/download; and a text editor you like, perhaps Emacs, Vim, VS Code, or TextEdit.

Anaconda is a Python distribution that bundles a Python interpreter with a package manager. If you have already installed a different Python interpreter and package manager that you like, it’s perfectly fine to stick with them. I’ll use Anaconda in the tutorial to keep the examples simple.

I specify Google Chrome because of its support for inspecting the internals of web pages. Safari, Firefox, and Edge offer these capabilities too, but I’ll be using Chrome for simplicity.

Check: Launch Anaconda-Navigator from the Applications folder on your Mac or the Start button on your Windows machine. If you can launch Anaconda-Navigator, you’re ready to roll.
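If you are using a different Python setup, or just want to confirm from Python itself, a small check like the following may help. It is only a sketch: the is_installed helper is a name made up here, and it reports whether the module can be found, not whether it works. Beautiful Soup is imported under the module name bs4.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    # Beautiful Soup is imported under the name "bs4".
    print("Beautiful Soup available:", is_installed("bs4"))
```

If the second line prints False, install the beautifulsoup4 package with your package manager before the session.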

Where to Go:

About the instructor: