GTFS Transformer
A real-time GTFS (General Transit Feed Specification) data transformer built in Go. This service processes and transforms public transit data for South Tyrol's transportation network as part of the Open Data Hub platform.
Overview
GTFS is the standard format for public transit schedules used by Google Maps, Apple Maps, and transit apps worldwide. However, raw GTFS data often needs transformation to be useful for real-time applications.
This transformer ingests GTFS-RT (real-time) feeds from multiple transit operators, normalizes the data, and provides a unified API for downstream consumers. It currently processes live data for buses, trains, and cable cars across South Tyrol.
Key Features
Real-time Processing
Processes GTFS-RT feeds with sub-second latency, providing live vehicle positions and arrival predictions.
Data Normalization
Merges data from multiple operators with different formats into a consistent, clean schema.
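As a rough illustration of what this normalization step can look like, here is a minimal sketch. It assumes two made-up operator formats, one reporting speed in km/h with Unix timestamps and one reporting m/s with RFC 3339 timestamps. Every type, field, and function name below is hypothetical and not taken from the project's actual schema.

```go
package main

import (
	"fmt"
	"time"
)

// VehiclePosition is a hypothetical unified schema (illustrative field names).
type VehiclePosition struct {
	Operator  string
	VehicleID string // namespaced as "<operator>:<id>" to avoid collisions
	Lat, Lon  float64
	SpeedMPS  float64 // metres per second
	Timestamp time.Time
}

// busRecord mimics an operator that reports km/h and Unix seconds.
type busRecord struct {
	ID       string
	Lat, Lon float64
	SpeedKMH float64
	UnixSec  int64
}

// railRecord mimics an operator that reports m/s and RFC 3339 strings.
type railRecord struct {
	TrainNo             string
	Latitude, Longitude float64
	SpeedMPS            float64
	Time                string
}

// normalizeBus converts units and namespaces the vehicle ID.
func normalizeBus(op string, r busRecord) VehiclePosition {
	return VehiclePosition{
		Operator:  op,
		VehicleID: op + ":" + r.ID,
		Lat:       r.Lat,
		Lon:       r.Lon,
		SpeedMPS:  r.SpeedKMH / 3.6,
		Timestamp: time.Unix(r.UnixSec, 0).UTC(),
	}
}

// normalizeRail parses the textual timestamp and rejects records it cannot parse.
func normalizeRail(op string, r railRecord) (VehiclePosition, error) {
	ts, err := time.Parse(time.RFC3339, r.Time)
	if err != nil {
		return VehiclePosition{}, fmt.Errorf("rail record %s: %w", r.TrainNo, err)
	}
	return VehiclePosition{
		Operator:  op,
		VehicleID: op + ":" + r.TrainNo,
		Lat:       r.Latitude,
		Lon:       r.Longitude,
		SpeedMPS:  r.SpeedMPS,
		Timestamp: ts.UTC(),
	}, nil
}

func main() {
	bus := normalizeBus("busco", busRecord{ID: "42", Lat: 46.4983, Lon: 11.3548, SpeedKMH: 36, UnixSec: 1700000000})
	rail, err := normalizeRail("railco", railRecord{TrainNo: "R1", Latitude: 46.6, Longitude: 11.2, SpeedMPS: 25, Time: "2023-11-14T22:13:20Z"})
	if err != nil {
		fmt.Println("skipping record:", err)
		return
	}
	fmt.Printf("%+v\n%+v\n", bus, rail)
}
```

Prefixing the vehicle ID with the operator name is one simple way to keep IDs unique once feeds are merged.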
Validation & Alerts
Validates incoming data against the GTFS specification and generates alerts for service disruptions and delays.
REST & GraphQL APIs
Exposes transformed data through both REST and GraphQL endpoints for flexible consumption.
Architecture
The service is built with Go's concurrency primitives at its core. Each data source runs in its own goroutine, polling for updates and pushing to a central processing pipeline via channels.
Protocol Buffers are used for efficient serialization of GTFS-RT messages. The transformer decodes protobuf feeds, applies business logic transformations, and stores results in Redis for fast retrieval.
The service processes over 50,000 vehicle position updates daily with an average latency of 200 ms from source to API.
Tech Stack
What I Learned
Working with real-time data streams taught me about the challenges of eventual consistency, handling out-of-order messages, and building resilient systems that gracefully degrade when upstream sources fail.
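One common way to cope with out-of-order messages is a "latest timestamp wins" rule per vehicle, sketched below. This illustrates the general technique under assumed names (Update, LatestState) and is not necessarily how the transformer itself handles it. The type is not safe for concurrent use, so a real pipeline would guard it with a mutex or confine it to one goroutine.

```go
package main

import "fmt"

// Update is a hypothetical position update carrying the feed's own timestamp.
type Update struct {
	VehicleID string
	Timestamp int64 // Unix seconds as reported by the source
	Lat, Lon  float64
}

// LatestState keeps only the newest update seen for each vehicle.
type LatestState struct {
	byVehicle map[string]Update
}

func NewLatestState() *LatestState {
	return &LatestState{byVehicle: make(map[string]Update)}
}

// Apply stores u if it is strictly newer than the held update for the same
// vehicle and reports whether it was accepted. Late or duplicate messages
// are dropped instead of overwriting fresher data.
func (s *LatestState) Apply(u Update) bool {
	if cur, ok := s.byVehicle[u.VehicleID]; ok && u.Timestamp <= cur.Timestamp {
		return false
	}
	s.byVehicle[u.VehicleID] = u
	return true
}

// Get returns the newest known update for a vehicle, if any.
func (s *LatestState) Get(id string) (Update, bool) {
	u, ok := s.byVehicle[id]
	return u, ok
}

func main() {
	state := NewLatestState()
	fmt.Println(state.Apply(Update{VehicleID: "bus-1", Timestamp: 100, Lat: 46.50, Lon: 11.35}))
	fmt.Println(state.Apply(Update{VehicleID: "bus-1", Timestamp: 90, Lat: 46.49, Lon: 11.34})) // arrives late
	fmt.Println(state.Apply(Update{VehicleID: "bus-1", Timestamp: 110, Lat: 46.51, Lon: 11.36}))
}
```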
I also gained experience with the GTFS specification and the complexities of public transit scheduling, including exceptions for holidays, route variants, and real-time delay propagation.