The repository for the website komikaan.nl.
(As I am currently the sole contributor, expect some very odd, lazy hacks written around 02:00 at night.)
This project is heavily inspired by OVInfo and DRGL. These are applications that show Dutch public transport info in real time. They are perfect for users who already know the route they are taking and any alternate routes. Per stop, they simply show when transport is leaving, how delayed it is and some additional miscellaneous info. The problem with these apps tends to be that the second you cross the border, you have left their area and they have no data to show you.
We try not to do that. Most places in the world publish a General Transit Feed Specification (GTFS, originally the Google Transit Feed Specification) version of their schedules for integrations into apps like "Google Maps" and such. This project aims to grab those public feeds, shove them into a database and make them queryable.
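To give an idea of what "shove them into a database" means, here is a minimal sketch in Python. It is not the actual Harvester code. It assumes a GTFS zip that contains a `stops.txt` with the standard `stop_id`, `stop_name`, `stop_lat` and `stop_lon` columns, and it uses an in-memory SQLite database as a stand-in for PostgreSQL. The `import_stops` function, the `feed` column and the table layout are made up for illustration.

```python
import csv
import io
import sqlite3
import zipfile


def _to_float(value):
    """GTFS allows empty coordinates for some location types; map those to None."""
    return float(value) if value not in (None, "") else None


def import_stops(conn, feed_name, gtfs_zip):
    """Load stops.txt from a GTFS zip into a stops table, keyed by (feed, stop_id).

    Hypothetical helper: the table layout is an assumption, not the real schema.
    """
    conn.execute(
        """CREATE TABLE IF NOT EXISTS stops (
               feed      TEXT NOT NULL,
               stop_id   TEXT NOT NULL,
               stop_name TEXT,
               stop_lat  REAL,
               stop_lon  REAL,
               PRIMARY KEY (feed, stop_id)
           )"""
    )
    with zipfile.ZipFile(gtfs_zip) as archive:
        with archive.open("stops.txt") as raw:
            # utf-8-sig strips the byte order mark that some feeds include
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            rows = [
                (
                    feed_name,
                    row["stop_id"],
                    row.get("stop_name"),
                    _to_float(row.get("stop_lat")),
                    _to_float(row.get("stop_lon")),
                )
                for row in csv.DictReader(text)
            ]
    # Re-importing the same feed overwrites its rows instead of duplicating them
    conn.executemany("INSERT OR REPLACE INTO stops VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

Storing the feed name next to every `stop_id` is what lets stops from different suppliers live in the same table, even when two suppliers happen to use the same ID.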
The main issue with this project is the inconsistency of IDs across feeds from different suppliers. For instance, the "Paris Gare du Nord" station is represented differently in various data sets. In the Dutch data set, it appears as a distinct stop serving only high-speed trains towards the Netherlands. In contrast, the Paris region data set lists it as a stop with numerous local and international lines. To make matters worse, NMBS publishes it under a completely different name and at a geographic location kilometers away from where it should be.
Because the data needs to be deduplicated and merged, handling it manually is not an option: there are simply too many stops on cross-border trips. So, we would like to solve this problem programmatically.
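One possible way to approach this programmatically is to compare stops on distance and name. The Python sketch below is only an illustration of that idea and is not how the Gardener works. The function names, the dictionary keys and both thresholds (500 meters, 0.6 similarity) are assumptions picked for the example.

```python
import math
from difflib import SequenceMatcher

EARTH_RADIUS_M = 6_371_000


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 coordinates."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def name_similarity(a, b):
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def is_probable_duplicate(stop_a, stop_b, max_distance_m=500, min_similarity=0.6):
    """Treat two stops as the same place if they are close and similarly named.

    Both thresholds are illustrative guesses, not tuned values.
    """
    distance = haversine_m(
        stop_a["lat"], stop_a["lon"], stop_b["lat"], stop_b["lon"]
    )
    similarity = name_similarity(stop_a["name"], stop_b["name"])
    return distance <= max_distance_m and similarity >= min_similarity
```

A heuristic like this would catch the easy cases, but it would miss a stop that is published under a completely different name kilometers away from its real location, which is exactly why this is the hard part of the project.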
The following is a really simplistic overview of the current structure of the project:
- API (this project)
  - For connecting to the PostgreSQL database
- FileDetector
  - Responsible for querying external data suppliers and checking if a new GTFS file has been published
- Harvester
  - Responsible for importing GTFS datasets into a database
  - Publishes imported stops to a RabbitMQ queue, primarily for the Gardener
- Gardener*
  - Responsible for processing each stop to attempt a merge with other relevant stops
  - Responsible for publishing deduplicated stops into the database
- Irrigator
  - Responsible for retrieving and processing GTFS realtime data
  - Responsible for writing the realtime data to the database
- GTFS-PSQL-Multisourced
  - Working name for the database where the GTFS data ends up. A database designed to support multiple GTFS feeds plus realtime data.
- komikaan.GTFS
  - A .NET library with raw data contracts for GTFS, without any special modifications
(*) Projects that are not open source yet
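As an illustration of the FileDetector's job in the overview above, the Python sketch below decides whether a downloaded GTFS file is new by comparing its SHA-256 hash with the last one seen for that supplier. This is an assumption about how such a check could work, not a description of the real implementation, which might just as well rely on HTTP headers. The `is_new_file` function and the `seen_hashes` dictionary are hypothetical.

```python
import hashlib


def is_new_file(feed_name, content, seen_hashes):
    """Return True if the content differs from the last file seen for this feed.

    seen_hashes is a plain dict (feed name -> hex digest) standing in for
    whatever persistent storage a real detector would use.
    """
    digest = hashlib.sha256(content).hexdigest()
    if seen_hashes.get(feed_name) == digest:
        return False  # same bytes as last time, nothing to import
    seen_hashes[feed_name] = digest
    return True
```

Only when a check like this reports a new file would the import step need to run again for that supplier.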
The website is a poor mix of Dutch and English at the moment (I promise i18n support), but the backend code is not. Everything is written in English, with the rare exception of data models from external suppliers. There we are bound to whatever they provide us. If you end up contributing, please follow the provided standards.
The frontend can be found in EnessenE/komikaan-webapp.
Simply open the .sln file. If you use Visual Studio then it should guide you along with what else you need.