
Senior Big Data Platform Developer

Mitchell Martin

Location: Tampa, Florida
Type: Non-Remote
Posted on: May 11, 2022
Sr. Big Data Platform Developer | Tampa, FL | Information Technology | ID: 181962
Our client, a banking company, is seeking a Sr. Big Data Platform Developer.

Location: Tampa, FL
Position Type: Contract

Job Summary:

We are looking for a Sr. Big Data Platform Developer who will work on collecting, storing, processing, and analyzing very large data sets. The primary focus will be on choosing optimal solutions for these purposes, then implementing, maintaining, and monitoring them. You will also be responsible for collaborating with various stakeholders and teams on required development.

Responsibilities:
  • Implement ETL processes using the defined framework
  • Monitor performance and advise on any necessary infrastructure changes
  • Create and modify tables and views in Hive
  • Write shell scripts to execute Hive-on-Spark jobs
  • Automate those shell scripts with the Autosys job scheduler
  • Improve job performance through Hive parameters, Spark configuration changes, and Spark optimization techniques
  • Create and modify HQL scripts to retrieve data from Hive tables or to perform data processing (see the sketch after this list)
  • Work with the team to define data retention logic per business requirements
  • Perform and oversee tasks such as writing scripts, writing T-SQL queries, and calling APIs
  • Customize and oversee integration tools, warehouses, databases, and analytical systems
  • Design data flows, create data flow diagrams, and implement design-level changes
  • Design and implement data stores that support scalable processing and storage of our high-frequency data
  • Resolve ongoing issues with operating the cluster with the help of the admin/support teams
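
For illustration only, below is a minimal sketch, in Scala against Spark SQL with Hive support, of the kind of Hive-on-Spark batch job and Spark tuning work described above. The application name, database, table, column, and configuration values are hypothetical placeholders rather than the client's actual environment; in practice a job like this would be wrapped in a shell script and scheduled through Autosys.

    import org.apache.spark.sql.SparkSession

    // Hypothetical daily load job; all names and settings are placeholders.
    object DailyCustomerLoad {
      def main(args: Array[String]): Unit = {
        val loadDate = args.headOption.getOrElse("2022-05-11")

        val spark = SparkSession.builder()
          .appName("daily-customer-load")
          .config("spark.sql.shuffle.partitions", "200") // example tuning knob
          .config("spark.sql.adaptive.enabled", "true")  // example tuning knob
          .enableHiveSupport()                           // read/write Hive tables via the metastore
          .getOrCreate()

        // HQL-style statement executed through Spark SQL; schema names are illustrative only.
        spark.sql(
          s"""INSERT OVERWRITE TABLE analytics.daily_customer_summary
             |PARTITION (load_date = '$loadDate')
             |SELECT customer_id, COUNT(*) AS txn_count, SUM(amount) AS total_amount
             |FROM raw.transactions
             |WHERE txn_date = '$loadDate'
             |GROUP BY customer_id""".stripMargin)

        spark.stop()
      }
    }
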
Qualifications and Skills:
  • Bachelor's or master's degree in computer science, data science, or a related technical field, or equivalent experience
  • 7+ years of hands-on data engineering experience with data warehouse, data lake, and enterprise big data platforms required
  • Experience working in an agile/iterative methodology required
  • Working experience with the Hadoop big data ecosystem (Hive, Impala, Spark, Scala, shell scripting) and RDBMS (MS SQL Server) required
  • Experience integrating data from multiple data sources with full, incremental, and real-time loads
  • Working experience with development and deployment tools: Jira, Bitbucket, Jenkins, RLM
  • Experience with Spark, Hadoop v2, MapReduce, HDFS required
  • Good knowledge of Big Data querying tools such as Pig, Hive, and Impala required
  • At least 2 years of relevant experience with real-time data streaming platforms such as Flume, Kafka, and Spark Streaming (see the sketch at the end of this list)
  • Experience with various ETL techniques and frameworks required
  • Excellent analytical, problem-solving, and communication skills
  • Ability to resolve ongoing issues with operating the cluster
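
Similarly, a minimal sketch of the real-time ingestion experience referenced above, using Spark Structured Streaming in Scala to read from Kafka and land data on HDFS. The broker addresses, topic name, and paths are hypothetical placeholders.

    import org.apache.spark.sql.SparkSession

    // Hypothetical streaming ingest; broker, topic, and HDFS paths are placeholders.
    object TransactionStreamIngest {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("transaction-stream-ingest")
          .getOrCreate()

        // Subscribe to a Kafka topic; Kafka delivers keys and values as binary.
        val raw = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")
          .option("subscribe", "transactions")
          .option("startingOffsets", "latest")
          .load()

        // Cast the message payload to a string for downstream parsing.
        val events = raw.selectExpr("CAST(value AS STRING) AS payload", "timestamp")

        // Append each micro-batch to a Parquet landing zone on HDFS,
        // with a checkpoint directory for recovery bookkeeping.
        val query = events.writeStream
          .format("parquet")
          .option("path", "hdfs:///data/landing/transactions")
          .option("checkpointLocation", "hdfs:///checkpoints/transactions")
          .outputMode("append")
          .start()

        query.awaitTermination()
      }
    }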