About your key responsibilities and impact:
Our team is building a high-load Data Platform that handles over 1 TB/day of real-time data and up to 100 GB/day of new aggregated data. The work covers the following areas:
- Integration with third-party services/solutions;
- Operational support for data-centric services;
- Data validation and storage.
Essential professional experience:
• Basic understanding of the following technologies:
- REST APIs (aiohttp, Flask, FastAPI);
- Relational databases (PostgreSQL, Microsoft SQL Server, etc.);
- Job scheduling, task queues;
- Cloud providers: AWS (S3, Athena, Redshift), Google Cloud Platform (Cloud Storage, BigQuery), etc.
• Knowledge of database theory: database types and their pros and cons;
• Ability to implement data collection from Kafka, Google Analytics, Firebase, AppsFlyer, Cloudflare, and other third-party apps;
• Development and support of ETL/ELT processes, working from existing examples;
• Understanding of the concepts behind automated data quality and integrity testing, based on an existing framework;
• Development of semantic layer models using existing examples.