Specific responsibilities include:
Understand the Business Problem and the Relevant Data
- Maintain an intimate understanding of company and department strategy
- Translate analysis requirements into data requirements
- Identify and understand the data sources that are relevant to the business problem
- Develop conceptual models that capture the relationships within the data
- Define the data-quality objectives for the solution
- Be a subject matter expert in data sources and reporting options
Job Requirements
- Experience developing, delivering, and/or supporting data engineering, advanced analytics or business intelligence solutions
- Experience developing ETL/ELT processes using Apache NiFi and Snowflake
- Significant experience with big data processing and/or developing applications and data pipelines via Hadoop, YARN, Hive, Spark, Pig, Sqoop, MapReduce, HBase, Flume, etc.
- Data engineering and analytics on Google Cloud Platform using BigQuery, Cloud Storage, Cloud SQL, Cloud Pub/Sub, Cloud Dataflow, Cloud Composer, etc., or an equivalent cloud platform
- Familiarity with software architecture (data structures, data schemas, etc.)
- Strong working knowledge of databases, both SQL (Oracle, MSSQL, etc.) and NoSQL
- Strong mathematics background and strong analytical, problem-solving, and organizational skills
- Strong communication skills (written, verbal and presentation)
- Experience working in a global, multi-functional environment
- Minimum of 2 years' experience in any of the following: at least one high-level, object-oriented language (e.g., Java, Python, Perl, Scala); at least one web programming language or related technology (PHP, MySQL, Python, Perl, JavaScript, etc.); one or more data extraction tools (Apache NiFi, Informatica, Talend, etc.)
- Software development experience in programming languages such as Python, Java, or Scala
- Ability to travel as needed
Architect Data Management Systems
- Use an understanding of the business problem and the nature of the data to select the appropriate data management system (big data, cloud data warehouse, cloud platform (GCP/AWS/Azure), OLTP, OLAP, etc.)
- Design and implement optimal data structures in the appropriate data management system (cloud data warehouse, cloud platform (GCP/AWS/Azure), Hadoop, SQL Server/Oracle, etc.) to satisfy the data requirements
- Plan methods for archiving and deleting information (a minimal sketch follows this list)
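For illustration only, a minimal archiving sketch in Python, using SQLite as a stand-in for the production warehouse; the table and column names (orders, created_at) and the 365-day retention period are assumptions, not part of this role's actual stack:

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Assumed retention policy; real retention periods come from the business.
RETENTION_DAYS = 365
cutoff = (datetime.now(timezone.utc) - timedelta(days=RETENTION_DAYS)).isoformat()

con = sqlite3.connect("warehouse.db")
# Create an empty archive table with the same schema as the hot table.
con.execute("CREATE TABLE IF NOT EXISTS orders_archive AS SELECT * FROM orders WHERE 0")
# Move expired rows: copy them to the archive, then delete them from the hot table.
con.execute("INSERT INTO orders_archive SELECT * FROM orders WHERE created_at < ?", (cutoff,))
con.execute("DELETE FROM orders WHERE created_at < ?", (cutoff,))
con.commit()
con.close()
```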
Develop, Automate, and Orchestrate an Ecosystem of ETL Processes for Varying Volumes of Data
- Identify and select the optimal method of access for each data source (real-time/streaming, delayed, static)
- Determine transformation requirements and develop processes to bring structured and unstructured data from the source to a new physical data model
- Develop processes to efficiently load the transformed data into the data management system (a minimal ETL sketch follows this list)
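By way of illustration, a minimal extract-transform-load sketch in Python; the source file, column names, and SQLite target are hypothetical stand-ins for production tooling such as NiFi, Snowflake, or Dataflow:

```python
import csv
import sqlite3

SOURCE_CSV = "orders.csv"    # assumed columns: order_id, amount, region
TARGET_DB = "warehouse.db"   # SQLite stand-in for the warehouse

def extract(path):
    """Read raw rows from a static source (one of the access methods above)."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows):
    """Cast types and normalize values to fit the target physical data model."""
    return [
        (
            int(r["order_id"]),
            round(float(r["amount"]), 2),  # enforce two-decimal currency values
            r["region"].strip().upper(),   # normalize region codes
        )
        for r in rows
    ]

def load(rows, db_path):
    """Bulk-load the transformed rows into the data management system."""
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, amount REAL, region TEXT)"
    )
    con.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    con.commit()
    con.close()

if __name__ == "__main__":
    load(transform(extract(SOURCE_CSV)), TARGET_DB)
```

In production, a single script like this would be split into orchestrated tasks (e.g., in Cloud Composer/Airflow) so each stage can be scheduled, retried, and monitored independently.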
Prepare Data to Meet Analysis Requirements
- Work with the data scientist to implement strategies for cleaning and preparing data for analysis (e.g., outliers, missing data, etc.)
- Develop and code data extracts
- Follow standard methodologies to ensure data quality and data integrity
- Ensure that the data is fit for use in data science applications (a minimal preparation sketch follows this list)
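As an illustration, a minimal data-preparation sketch in Python with pandas; the file and column names are hypothetical, and the specific strategies (median imputation, 1st/99th-percentile clipping) are examples of choices that would be agreed with the data scientist:

```python
import pandas as pd

# Hypothetical analysis dataset; column names are assumptions for illustration.
df = pd.read_csv("sensor_readings.csv")  # assumed columns: sensor_id, ts, value

# Missing data: impute numeric gaps with the column median instead of dropping rows.
df["value"] = df["value"].fillna(df["value"].median())

# Outliers: clip to the 1st/99th percentiles so extreme readings
# don't dominate downstream models.
lo, hi = df["value"].quantile([0.01, 0.99])
df["value"] = df["value"].clip(lower=lo, upper=hi)

# Integrity check: fail fast if duplicate readings slipped through the pipeline.
assert not df.duplicated(subset=["sensor_id", "ts"]).any(), "duplicate readings"

df.to_csv("sensor_readings_clean.csv", index=False)  # fit for data science use
```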
Job Type: Full Time
Job Location: Delhi