Big Data Engineer

Bridgewater, NJ

PRIMARY RESPONSIBILITIES

 

  • Design, develop, test, implement, and maintain code, information architecture, and conceptual models to support data processing and data flows through the data lake zones:
    ◦ Landing or Source Zone – data ingestion of raw data or capture of streaming data
    ◦ Reporting Zone – transform into data model for external consumption by reporting & self-service BI
    ◦ Sandbox Zone – evaluate data quality, transform raw data, and cleanse data
  • Build large-scale data processing systems, bring expertise in data warehousing solutions, and work with the latest (NoSQL) database technologies.
  • Embrace the challenge of dealing with petabytes of data on a daily basis.
  • Understand how to apply technologies to solve big data problems and to develop innovative big data solutions.
  • Have extensive knowledge of building data processing systems with Hadoop and Hive using programming or scripting languages.
  • Have expert knowledge of different databases (NoSQL or RDBMS).
  • Implement complex big data projects with a focus on collecting, parsing, managing, analyzing, and visualizing large sets of data to turn information into insights using multiple platforms.
  • Recommend, design, implement, and maintain the various file formats (e.g. XML/XSD, SequenceFiles, Avro files, or Parquet files) for information interchange between applications, external systems, third-party applications, and/or the data lake.
  • Review and evaluate database performance, risk analyses, and financial feasibility studies.
  • Investigate and repair application defects regardless of component, including platform, business logic, data process logic, or database (SQL and data modeling).
  • Develop prototypes and proofs of concept for the selected solutions
  • Develop data and metadata policies and procedures
  • Implement and maintain operational and disaster-recovery procedures.
  • Participate in the review of code and/or systems for proper design standards, content and functionality.
  • Participate in all aspects of the Systems Development Life Cycle
  • Analyze files and map data from one system to another
  • Adhere to established source control versioning policies and procedures
  • Meet timeliness and accuracy goals.
  • Communicate status of work assignments to stakeholders and management.
  • Responsible for technical and production support documentation in accordance with department standards and industry best practices.
  • Maintain current knowledge on new developments in technology-related industries
  • Participate in corporate quality and data governance programs

 

QUALIFICATIONS & EXPERIENCE

  • 5+ years of systems/application analysis & design experience
  • 3+ years of data modeling & database administrator experience
  • 3+ years of experience in designing, building, and using a big data distribution, preferably MapR (or Hortonworks or Cloudera), for:
    ◦ data ingestion, cleansing, and transformation (e.g. Talend, Sqoop)
    ◦ data discovery & analysis using querying tools (e.g. Impala, Hive)
    ◦ data storage using distributed databases (e.g. HBase, Kudu)
    ◦ data streaming (e.g. Kafka, Apache Spark)
    ◦ data visualization (e.g. Tableau, Qlik, Lumira)
    ◦ processing monitoring (e.g. MapR manager, Hue)

 

EDUCATION

 

  • Bachelor’s Degree in Information Technology or related field preferred

 

 

Note: Qualified candidates will be contacted within 2 business days of application. If you do not meet the above criteria, we will keep your resume on file for future opportunities and may contact you for further discussion.

Date Posted 5/10/2018
Salary $170,000-$180,000 + 15% bonus





