Web Scraping
Web scraping is the automated extraction of data from websites using scheduled scripts.
On a nightly basis, data are extracted from a number of open data sources, which fall into three categories:
- APIs: databases that can be queried directly by sending a web address (URL), such as:
  - KISTERS Services (KiWIS)
  - AQUARIUS Time-Series Software, Aquatic Informatics Inc.
  - FlowWorks
  - WaterTrax
- File repositories: typically an FTP server hosting a number of general-use files, such as comma-separated values (.csv) files
- HTML tables: human-readable tables posted online are converted into dataframes, the form required for insertion into our database. This is the least reliable source type and requires the most effort.
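As a sketch of the last case, an HTML table can be parsed into a dataframe with pandas; the table markup and column names below are illustrative, not taken from an actual partner page:

```python
import pandas as pd
from io import StringIO

# Hypothetical snippet of a scraped page; column names are illustrative.
html = """
<table>
  <tr><th>Date</th><th>Stage (m)</th></tr>
  <tr><td>2024-01-01</td><td>1.21</td></tr>
  <tr><td>2024-01-02</td><td>1.18</td></tr>
</table>
"""

# read_html returns one DataFrame per <table> found in the input
df = pd.read_html(StringIO(html))[0]
```

In practice the page would first be fetched over HTTP (e.g., with the requests library) before being parsed; layout changes on the source page are what make this approach fragile.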
Notes:
Streamflow discharge and stage are re-scaled to daily mean timesteps when inserted into our database.
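A minimal sketch of this re-scaling, assuming the raw record arrives at a sub-daily (here 15-minute) timestep; the series name and values are made up for illustration:

```python
import pandas as pd

# Hypothetical 15-minute discharge record; in practice these values
# would come from one of the scraped sources listed below.
idx = pd.date_range("2024-01-01", periods=8, freq="15min")
q = pd.Series([1.0, 1.2, 1.1, 1.3, 0.9, 1.0, 1.1, 1.0],
              index=idx, name="discharge_m3s")

# Re-scale to a daily mean timestep before database insertion
daily = q.resample("D").mean()
```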
Sources:
A number of our partners maintain internal databases. ORMGP continues to integrate these sources into our workflow without duplicating data. This is (hopefully) accomplished by having partners establish an Application Programming Interface (API) on their end. Currently, we have:
APIs
- Region of Peel
- York Region
- Durham Region
- TRCA
- LSRCA
- CVC
- CLOCA
- MNRF
File repositories
- MSC Datamart
- WSC HYDAT
- ECCC CaPA-HRDPA
- NOAA SNODAS
HTML tables
- MSC historical (hourly and daily)