System Architecture and Redundancy
iconik runs in multiple regions and replicates its databases between them, so every database record is available in every region.
Each region can operate stand-alone. Because all data is replicated across regions, the Google Cloud load balancer, publicly accessible at https://app.iconik.io/, can redirect each user to the closest region.
Image showing iconik replication and load-balancing across two regions, such as our regions in Belgium EU and Iowa, USA.
Having multiple regions means that we can allow for:
- Region outages and failover.
- Global load balancing, so that users are redirected to the closest region for the lowest latency.
- Upgrades of a single region.
In-Depth Information
iconik is replicated between multiple data centers and regions. We currently have two locations in Google Cloud: one in Iowa, USA (us-central1) and one in Belgium, EU (europe-west1).
Each region has a cluster of Cassandra database nodes, Redis and Elasticsearch, as well as the microservices that iconik runs. The clusters tolerate individual node failures, and they are cross-replicated across regions to tolerate region outages. The iconik microservices are load-balanced inside each region.
Two additional federated RabbitMQ clusters, deployed on the same nodes that run Cassandra, manage synchronization between Elasticsearch nodes across regions.
The global load-balancer utilizes Anycast to provide different routes depending on where traffic originates or the health of the backend services.
This means that in the rare event of the whole EU cluster or region going down, everyone will still be able to access their data seamlessly through the US cluster at https://app.iconik.io
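The routing behaviour described above can be illustrated with a small sketch. This is not how the Google Cloud load balancer is implemented; it only models the decision "send the user to the closest region that is healthy". The `Region` class, the `pick_region` function and the latency figures are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str          # region identifier, e.g. "europe-west1"
    latency_ms: float  # hypothetical latency measured from the client
    healthy: bool      # hypothetical result of a backend health check


def pick_region(regions):
    """Return the healthy region with the lowest latency, or None if all are down."""
    healthy = [r for r in regions if r.healthy]
    return min(healthy, key=lambda r: r.latency_ms, default=None)


# A client in Europe is normally served by the Belgium region.
normal = [Region("europe-west1", 20.0, True), Region("us-central1", 110.0, True)]
print(pick_region(normal).name)

# If the whole EU region is unavailable, the same client is served from Iowa.
eu_down = [Region("europe-west1", 20.0, False), Region("us-central1", 110.0, True)]
print(pick_region(eu_down).name)
```

Because every region holds a full replica of the data, the fallback region can serve the same request without any change on the client side.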
Each entity in iconik has a unique ID. These IDs are UUIDs, which allow each region to create unique records and synchronize them without clashing names or identifiers.
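As a minimal sketch of why UUIDs make this possible, the example below lets two regions create records independently and then merges them by ID. It assumes random (version 4) UUIDs, which the text does not specify, and the `create_record` helper and record fields are hypothetical.

```python
import uuid


def create_record(region, title):
    # Assumption: version 4 (random) UUIDs. No coordination with other
    # regions is needed to generate one.
    return {"id": str(uuid.uuid4()), "region": region, "title": title}


# Both regions create records with identical titles at the same time.
eu_records = [create_record("europe-west1", f"asset-{i}") for i in range(1000)]
us_records = [create_record("us-central1", f"asset-{i}") for i in range(1000)]

# Merge the two sets as replication would, keyed by record ID.
merged = {record["id"]: record for record in eu_records + us_records}

# Every record survives the merge: no ID was generated twice.
print(len(merged) == len(eu_records) + len(us_records))
```

A sequential integer ID would not work here, since both regions would hand out the same numbers unless they coordinated on every write.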
Keyframes and proxies, as well as any original files stored in the default iconik GCS buckets, are mirrored in at least three regions on the continent of their origin (EU, US or Asia). See https://cloud.google.com/storage/docs/bucket-locations for details.
Internal monitoring and logging use Google Stackdriver. Publicly accessible external monitoring (https://status.iconik.io) is deployed on AWS using CloudWatch, CloudFront, AWS Lambda and S3. Logs for the system are stored in Google Cloud Storage buckets.
Periodic backups are taken and stored in private buckets for each region.