Public transit in the Toronto area can now be paid for with Presto fare cards. Once a card is registered online, you can view its activity history. I was curious how far that data could be used to automatically reconstruct the trips I’ve taken. I was most interested in distance travelled and mode (bus, subway, streetcar, train).
The useful data available is:
- Time of tap
- Transit agency operating the vehicle or station tapped
- Location of the tap – with some limitations, sometimes inaccurate
- Amount paid
The data isn’t quite perfect for my purposes. There are a few problems:
- Taps are not required everywhere
- Most trips do not require tapping out when leaving, including when leaving the subway. In many cases, a 2-stop trip can look the same as a 12-stop trip.
- Particularly in the City of Toronto, many in-station transfers between the subway and buses do not require a tap
- The location of the tap is sometimes wrong (perhaps due to malfunctioning GPS)
- Sometimes a location is recorded specifically (“Square One GO Bus Terminal”), sometimes generically (“Zone 20”)
- Taps can take a few hours or days to appear in the online history (the bus card readers in particular seem to be uploaded nightly), though they do arrive eventually
- There is a “discount” field which doesn’t actually show discounts as publicized in official fare schedules and is, as far as I can tell, useless
(The Presto online interface also has a “Transit Usage Report” view, but it seems to include only fare payments and none of the free transfers. As I understand it, it’s used for claiming tax credits.)
The export is straightforward enough: it’s a simple CSV file. It is in reverse chronological order (latest taps first), but that’s easy to reverse.
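As a rough sketch of the loading step, something like the following Python would read the export and flip it to oldest-first. The column names (`Date`, `Agency`, `Location`, `Amount`) and the sample rows are made up for illustration; the real export's headers may well differ.

```python
import csv
import io


def load_taps(csv_file):
    """Read a CSV export and return its rows as dicts, oldest tap first."""
    rows = list(csv.DictReader(csv_file))
    rows.reverse()  # the export lists the latest taps first
    return rows


# Made-up sample with hypothetical column names, newest tap first.
sample = io.StringIO(
    "Date,Agency,Location,Amount\n"
    "2015-03-02 17:45,Agency B,Stop 2,0.00\n"
    "2015-03-02 08:10,Agency A,Stop 1,2.50\n"
)

taps = load_taps(sample)
for tap in taps:
    print(tap["Date"], tap["Agency"], tap["Location"], tap["Amount"])
```

For a real file, `load_taps(open("export.csv", newline=""))` would do the same thing, with `export.csv` standing in for whatever the download is called.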
Given the data, here’s what’s possible:
- Calculating the minimum number of trips taken, by only counting trips that had a fare charged
- Calculating the minimum number of trips involving the subway, by counting only entries at subway stations that have no in-station bus or streetcar transfers (these are mostly in downtown Toronto)
- Estimating, with fairly high probability, at least one of the modes of transport (bus, streetcar, subway, train) involved in a trip
- Estimating, with better than chance probability but not close to perfect, the number of logical “trips” taken and some of the modes involved
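The first item in that list, the lower bound on trips, can be sketched in a few lines of Python. This assumes each tap is a dict with an `Amount` field holding the fare as text, possibly with a leading `$`; both the field name and the format are guesses, not something taken from the actual export.

```python
from decimal import Decimal


def count_paid_taps(taps, amount_field="Amount"):
    """Count taps where a fare was charged: a lower bound on trips taken."""
    paid = 0
    for tap in taps:
        # Assumed format: plain decimal text, optionally prefixed with "$".
        text = tap.get(amount_field, "").strip().lstrip("$")
        if Decimal(text or "0") > 0:
            paid += 1
    return paid


# Made-up taps: two fares charged, one free transfer, one blank amount.
example_taps = [
    {"Amount": "2.50"},
    {"Amount": "0.00"},
    {"Amount": "$3.00"},
    {"Amount": ""},
]

minimum_trips = count_paid_taps(example_taps)
print(minimum_trips)
```

This only gives a minimum: free transfers and untapped legs never show up as a charge, so the true number of trips can be higher.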
Here are some less obvious samples of actual data, commented to note what I was actually doing. I switched the order to earlier taps first for easier reading.