Help build and maintain the infrastructure that powers our decision-making.
You will join a team responsible for a massive AWS data lakehouse environment, handling high-volume telemetry data from our IoT network.
As a Data Engineer, you will move beyond simple tasks to owning full features of the data pipeline. You will write production-grade ETL code, ensure data quality, and contribute to the architectural design of our maturing data platform.
Key Responsibilities
- Pipeline Development: Design, build, and maintain scalable data pipelines using PySpark and Apache Airflow to ingest and process IoT telemetry data
- Lakehouse Architecture: Actively contribute to the management of our data lakehouse using Apache Iceberg on Amazon S3
- Data Quality: Implement automated testing and monitoring to ensure the accuracy and reliability of data flowing into the lake
- Infrastructure as Code: Utilize Terraform to manage and deploy data infrastructure, ensuring repeatable and secure environments
- Collaboration: Partner with analysts and software engineers to understand data requirements and deliver efficient solutions
- Growth: Participate in release planning and design reviews, ensuring that we are building sustainable and scalable data products
Qualifications
- Experience: Solid experience in Data Engineering, capable of taking a feature from concept to production
- Tech Stack: Strong coding skills in Python and SQL
- Big Data Tools: Experience with Apache Spark (PySpark) and workflow orchestration tools like Airflow
- Modern Data Architecture: Familiarity with table formats (e.g., Iceberg, Delta Lake) and AWS storage (S3)
- Bonus: Exposure to machine learning concepts or MLOps pipelines
About Circle Gas
Circle Gas is a revolutionary company focused on transforming the lives of people in sub-Saharan Africa and beyond.
We are scaling rapidly to support 2 million households (10 million people), providing clean, safe, and affordable cooking fuel. Data is the lifeblood of our operation—from billing 10 million customers to optimizing logistics in remote areas.