We are currently partnering with one of our Berlin-based clients, who are looking for a Data Engineer for the position below. This role requires working onsite 1-2 days per week and from home the rest of the time. Please see below for more details:
We’re looking for an experienced developer with a getting-things-done attitude to join our team. You’ll work on the development of our ad verification solution, focusing on our data processing pipelines and the underlying infrastructure. We work in cross-functional teams of backend, frontend, and data engineers, product managers, and quality assurance and DevOps experts.
- Plan, design and implement robust data pipelines using technologies such as Hadoop MapReduce and Spark
- Deliver near real-time data to our customers using our high-availability data infrastructure
- Participate in architectural planning for our current and upcoming data challenges
- Create solutions for data analytics and reporting
- Enable our team to generate valuable insights
- Support and monitor our data pipelines
- Work closely with our network/server administrators to maintain our on-premises Hadoop infrastructure
- Be responsible for cluster maintenance
Skills and requirements
- Seasoned Java and Scala developer
- Experience with the Hadoop ecosystem (MapReduce, Hive, HDFS, Pig, HBase…)
- Knowledge of cluster administration (Cloudera Manager / Ambari)
- Experience writing Spark applications (Spark Streaming, Spark SQL)
- Used to working in Linux-based environments
- Good to know: Kafka, ElasticSearch
- Experience with working in an agile development environment
- Fluent in English
Our backend and warehousing solutions range from classical RDBMSs to in-memory databases and distributed solutions such as CouchDB and ElasticSearch.
Last but not least, we have a number of other technologies in production or on their way there, among them OpenShift, SOA, NodeJS, Docker, Jenkins, and Nginx + Lua.