Education Level | : | Bachelor's Degree (S1) |
Field of Study | : | |
Exposure to a traditional RDBMS, preferably PostgreSQL or MySQL.
Ability to perform data transformations via scripting, stored procedures, or an ETL framework.
Able to do light development and support most aspects of a big data cluster: ingestion, processing, parsing, integration (Python, Spark, Scala), data movement, workflow management (Oozie, ActiveBatch, and Airflow), and querying (SQL).
Ability to read and understand at least one of the following programming languages: Java, Python, Go, or Scala.
Some ability to perform light-to-medium development in Apache DataFlow and Apache Spark.
Able to navigate Unix and Linux operating systems.
Capable of navigating and working effectively in a DevOps model, leveraging related technologies such as Jenkins, GitLab, and Git.
Some experience with SQL and data warehousing solutions.
A Bachelor's Degree in Computer Science or a related field is preferred.
1+ years of Data Integration experience.
1+ years of hands-on experience with one of the following technologies: Hadoop, Apache Spark, SQL, Redshift, or PostgreSQL.
Experience with GCP products such as BigQuery, Dataflow, Pub/Sub, Bigtable, Composer, and GCS is a plus.
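As a rough illustration of the "data transformations via scripting" skill listed above, the sketch below normalizes and aggregates a handful of records in plain Python. The record shape, field names, and values are all hypothetical, not taken from the posting.

```python
# A minimal extract-transform sketch in pure Python: normalize raw
# rows, then aggregate per key (a simple GROUP BY). All field names
# and sample data here are hypothetical.

def transform(records):
    """Normalize raw rows: strip whitespace, lowercase users, cast amounts."""
    return [
        {"user": row["user"].strip().lower(), "amount": float(row["amount"])}
        for row in records
    ]

def aggregate(rows):
    """Sum amounts per user."""
    totals = {}
    for row in rows:
        totals[row["user"]] = totals.get(row["user"], 0.0) + row["amount"]
    return totals

raw = [
    {"user": " Alice ", "amount": "10.5"},
    {"user": "alice", "amount": "4.5"},
    {"user": "Bob", "amount": "7.0"},
]
totals = aggregate(transform(raw))
print(totals)  # {'alice': 15.0, 'bob': 7.0}
```

In practice the same normalize-then-aggregate step might be expressed as a SQL stored procedure or a Spark job; the posting treats scripting, stored procedures, and ETL frameworks as interchangeable ways to demonstrate it.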
Starting Level | : | Entry Level |
Placement | : | - |
Job Type | : | Full Time |
Job Description | : | Modify existing structured and unstructured data integration solutions for rapidly evolving business needs. Perform small- to medium-complexity development activities for structured and unstructured data, including ingestion, parsing, integration, auditing, logging, aggregation, normalization, and error handling. Collaborate with a cross-functional team to resolve data quality and operational issues. Create conceptual models and data flow diagrams. Participate with end users to gather requirements and consult on data integration solutions. Receive and adhere to project delivery deadlines. Migrate code across environments and leverage a source code management system. Develop and maintain proper documentation for data pipelines and services. |