Data science is an exploding field. Most larger companies have made the digital-first transformation and actively rely on insights provided by data analysts or data scientists to make informed, data-driven business decisions. With this increased reliance on data, the demand for data-related professionals has skyrocketed. Data analysts need a reliable source of high-quality data to do their jobs, and they rely on data engineers to provide it.
In the data science hierarchy of needs described by Monica Rogati, data engineers work at levels 2 and 3, creating reliable ways to move, store, explore, and transform data so that it's in a usable state by the time it reaches data scientists. As data collection, the first level of the hierarchy, continues to expand, so does the need for data engineers and data engineering managers.
What Does a Data Engineer Do?
Before you can understand the role of a data engineering manager, you need to understand what a data engineer does. Data engineering is the process of designing, building, and maintaining systems for collecting, storing, transforming, and analyzing data. Although data analyst and data scientist careers are better known, these professionals rely on the work of data engineers to do their jobs.
Data engineers may perform the following tasks:
- Collect datasets that are valuable to their companies
- Build, test, and maintain data pipeline architecture
- Comply with data governance and data security regulations and policies
- Develop algorithms to transform data into usable forms
Data engineers use the following skills in their daily tasks:
- Programming in languages like R, Python, Java, SQL, and Scala
- Software engineering, including Agile, DevOps, architecture design, and service-oriented architecture
- Extract, transform, and load (ETL) systems
- Automation and scripting for repetitive tasks
- Open frameworks like Apache Spark, Hadoop, and Kafka
- Pandas, a Python library used for manipulating data
- Data visualizations and dashboards
- Data security
- Cloud platforms like AWS, Google Cloud, and Microsoft Azure
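To make the ETL item above concrete, here is a minimal sketch of an extract, transform, and load flow using only Python's standard library. The sample data, the `orders` table, and the column names are hypothetical and chosen purely for illustration; a production pipeline would typically read from real sources and use a managed tool or framework.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a file, API, or queue.
RAW_CSV = """order_id,customer,amount_usd
1001, alice ,19.99
1002,BOB,5.00
1003,carol,not_a_number
"""

def extract(raw_text):
    """Extract: parse CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(raw_text)))

def transform(rows):
    """Transform: normalize names, convert amounts, and skip rows that fail to parse."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount_usd"])
        except ValueError:
            continue  # a real pipeline would log or quarantine the bad row
        clean.append((int(row["order_id"]), row["customer"].strip().lower(), amount))
    return clean

def load(rows, conn):
    """Load: write the cleaned rows into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount_usd REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute(
    "SELECT order_id, customer, amount_usd FROM orders ORDER BY order_id"
).fetchall()
```

Keeping the three stages as separate functions makes each one easy to test and replace on its own, which is the same idea behind swapping individual tools in a larger data stack.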
Data Engineer Manager Responsibilities
Data engineering managers have to stay on top of emerging trends and technologies in data engineering and supervise the data engineers who work for them. They need a broad range of technical and interpersonal skills to do their jobs. Here are some common data engineering manager roles and responsibilities:
Creating a Modern Data Stack (MDS)
The MDS is a relatively new approach to data integration that is quickly growing in popularity. Compared with legacy data stacks, an MDS saves time and frees up data engineers and analysts so they can focus on accomplishing high-value data tasks rather than setting up and maintaining on-site systems. An MDS consists of a suite of largely off-the-shelf tools used at different steps along the data flow.
Used together, these tools allow organizations to become fundamentally data-driven. Because this is such a rapidly evolving concept, with little in the way of established protocol, a big part of a data engineering manager's job will be evaluating and implementing an MDS.
There's no one-stop solution for an MDS, so every organization's solution will look different. However, there are three basic features of any effective MDS:
- It can easily be used by people with a wide range of skills and abilities, particularly "non-techy" people
- It's agile, so data is readily available in a lake or warehouse and elastic enough to allow auto-scale processing based on need
- It's cloud-native, which allows it to be fast, flexible, and provided on a pay-as-you-go basis
Components of an MDS include:
- A managed ETL data pipeline
- A cloud-based storage system for raw data
- A tool for data transformation
- A data visualization platform
The MDS follows the general technological trend of moving away from local systems and toward cloud-based systems. It lowers the technical barrier to entry by promoting end-user accessibility and scalability. An MDS saves time, money, and effort compared to legacy systems.
Selecting the Right Tools
Along with creating a modern data stack, data engineering managers are responsible for selecting the right tools. There are numerous tools available for each component of an MDS. Choosing the right ones will depend on factors related to both the company's resources and the skills and knowledge of the data engineering department. Some considerations a data engineering manager will take into account when choosing tools include:
- The type of data being processed
- The language their team is most familiar with
- The budget
- The existing tools being used
Data engineering technology is constantly evolving and updating, so selecting the right tools is an ongoing part of the job. Many of the standard tools used today were unknown just a few years ago. Data engineering managers have to keep up with new technologies as they're released and decide when the benefits of changing tools outweigh the drawbacks.
When they decide it's time to make a change, they'll design and oversee the onboarding process for all new users. This may include training data engineers in the use of new technologies as well as ensuring that end users outside of the data department are able to use them.
Structuring an Effective Team
Data engineering managers have to know as much about people as they do about data. They need to understand the strengths and weaknesses of engineers and how they approach problems. With tech skills becoming outdated and changing so quickly, managers will increasingly have to look at solutions like upskilling and reskilling current team members rather than continually hiring new data engineers.
Communicating With Business Users
One of the main benefits of an MDS is that it's easy for non-engineers to use. Business users should be able to use it intuitively to access the data they need to analyze for strategic planning. However, this doesn't mean they won't need training or help along the way. Data engineering managers will have to work closely with business users to educate them on how to achieve their end goals and the benefits of their department's work.
To effectively communicate with business users, data engineering managers need to be able to explain complicated technical concepts in simple ways. While "tech talk" is second nature to engineers, many business users will be lost if a manager starts throwing around obscure data science terms.
Hiring Junior Data Engineers
Data engineering managers hire junior engineers and mentor them as they grow into more senior roles. They are also responsible for hiring senior data engineers and overseeing them as they train junior engineers. Interviewing candidates is a large part of this responsibility. Given the shortage of skilled tech workers, filling empty positions is an ongoing challenge.
When hiring for junior positions, managers should look for candidates who:
- Can learn quickly
- Expect learning new skills to be a career-long process
- Take the initiative in doing a job and seeking help when needed
- Have well-developed problem-solving abilities
Finding engineers with the requisite hard skills may be difficult, but using appropriate screening tools can help. The more significant challenge is likely to be finding skilled data engineers with the proper soft skills who are a good fit for the existing team.
Creating a cohesive team involves understanding how different personalities work together to achieve results. Therefore, hiring a candidate with skills and personality traits that enhance, or possibly counter, those on the existing team is often as important as hiring one with rock-star machine learning skills.
Facilitating Interdepartmental Communication
When software development, product, and data teams become siloed, it can negatively affect data quality and lead to data debt. Data engineering managers need to work with other teams to identify issues and be proactive about changes that can help prevent data from breaking. They also work closely with data scientists to determine their data needs for particular projects and design solutions for them.
Interdepartmental communication is crucial because upstream changes to the data model may affect the downstream data resources. This problem becomes more of an issue with scale since the model may change more often. Implementing a transparent communication method that all departments can update and access can mitigate some of the problems associated with changing data models.
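One lightweight way to surface upstream data model changes is a shared schema "contract" that downstream teams can check automatically. The sketch below is only an illustration of that idea: the `EXPECTED_SCHEMA` contents, the column names, and the `diff_schema` helper are all hypothetical.

```python
# Hypothetical contract: the columns and types a downstream dashboard depends on.
EXPECTED_SCHEMA = {"user_id": "INTEGER", "signup_date": "TEXT", "plan": "TEXT"}

def diff_schema(expected, actual):
    """Describe changes in `actual` that could break consumers relying on `expected`."""
    problems = []
    for column, col_type in expected.items():
        if column not in actual:
            problems.append(f"missing column: {column}")
        elif actual[column] != col_type:
            problems.append(f"type changed: {column} {col_type} -> {actual[column]}")
    return problems

# Hypothetical upstream change: 'plan' was renamed and 'user_id' became text.
upstream_schema = {"user_id": "TEXT", "signup_date": "TEXT", "plan_name": "TEXT"}
issues = diff_schema(EXPECTED_SCHEMA, upstream_schema)
```

A check like this could run whenever the upstream team proposes a change, so the affected departments hear about it before downstream reports break.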
Data Engineer Manager Job Description
The job description for a data engineer manager will naturally vary based on the industry and company. Some of the key responsibilities you may see in job descriptions include:
- Working with software engineers, product managers, and data scientists to build data architecture that drives insights
- Creating, defining, and executing a plan for business intelligence and data warehousing across a product vertical
- Designing a data engineering team and scaling it
- Managing data warehouse plans across a product vertical
- Delivering intuitive, high-impact data dashboards and visualizations
- Leading the delivery of scalable data and analytics solutions
- Implementing new and open-source technologies
- Designing and building reusable components, frameworks, and libraries at scale to support analytics data products
- Improving processes regarding data flow and quality to improve data accuracy, viability, and value
Just like the responsibilities, requirements for data engineering managers will vary based on the hiring organization, but here are some common requirements:
- Bachelor's degree in computer science, math, physics, or other technical fields
- Fundamental data architecture and design skills
- 3 to 5 years of technical management experience
- Expert knowledge of SQL
- Experience building scalable data solutions
- 6 to 10 years of solution development and technology experience
- Strong knowledge of a variety of data engines
Senior Data Engineer Manager Interview Process
While every company will have its own process, data engineering managers generally go through several rounds of technical and leadership interviews, starting with broad screenings and working their way up the organizational structure until the final interviews. Here's an example of what the process might look like:
Leadership Screening Interview
This is a situational and behavioral interview with the hiring manager designed to gauge leadership skills and potential. It will include questions like:
- Tell me about the toughest decision you've ever made
- What does a good data engineering team look like?
- Share a time when a project dramatically shifted directions at the last minute, including how you handled it
In addition to determining leadership style and effectiveness, questions like these are a good way to judge a candidate's ability to communicate concisely. Comprehensive answers should include information from the STAR model:
- Situation — Set the scene and give the details surrounding it
- Task — Describe tasks and responsibilities in the example
- Action — Explain how the issue was handled
- Results — Describe the specific outcome that followed from the action
Technical Screening Test
This is a coding exercise that will likely use SQL and a common programming language like Java, Python, or C. Candidates are asked to solve more complex problems than those given to data engineer candidates. LeetCode is a good resource for finding these types of questions. However, because the candidate is applying for a management role, how they approach problems is more important than the actual coding.
Technical Data Exercise
Once they pass the screener interviews, the candidate will move on to the next round, which may be on-site, depending on the company. This is a longer technical interview where they are given a problem or feature and asked to explain how they would solve the problem or implement the feature. After discussing the options they've considered, they'll be asked to code the solutions.
They may also be asked a technical data question about a business or app idea and asked what metrics would be best to collect to gather data for it and then design a schema for a database. They should then be able to write queries to retrieve the metrics.
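As an illustration of this kind of exercise, the sketch below assumes a hypothetical app where the chosen metric is daily active users. The `users` and `events` tables, their columns, and the sample rows are invented for the example; a real interview answer would depend on the business idea being discussed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per user and one row per tracked event.
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    signup_date TEXT NOT NULL
);
CREATE TABLE events (
    event_id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(user_id),
    event_type TEXT NOT NULL,
    event_date TEXT NOT NULL
);
""")

conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "2024-01-01"), (2, "2024-01-01"), (3, "2024-01-02")],
)
conn.executemany(
    "INSERT INTO events (user_id, event_type, event_date) VALUES (?, ?, ?)",
    [
        (1, "login", "2024-01-01"),
        (1, "purchase", "2024-01-01"),
        (2, "login", "2024-01-01"),
        (1, "login", "2024-01-02"),
        (2, "login", "2024-01-02"),
        (3, "login", "2024-01-02"),
    ],
)

# Metric query: number of distinct users with at least one event per day.
daily_active_users = conn.execute("""
    SELECT event_date, COUNT(DISTINCT user_id) AS active_users
    FROM events
    GROUP BY event_date
    ORDER BY event_date
""").fetchall()
```

Counting distinct users rather than rows keeps a user with several events on the same day from being counted more than once.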
Leadership and Stakeholder Interviews
Finally, the candidate will probably have one or two more interviews to assess their leadership potential and ability to work with other teams. Again, their answers should be comprehensive and follow the STAR format.
Sample Questions for a Data Engineering Manager Interview
Hiring managers are looking for data engineering managers who have the ability to prioritize work, mentor data engineers, and proactively manage risks. Some questions they may ask to determine an applicant's abilities to do this include:
- Given these tasks, how would you prioritize the work?
- How do you plan projects to keep them on schedule?
- What do you do when they start running behind schedule?
- How do you keep your team's skills updated?
- How do you divide work among your team members?
- How do you resolve the conflict when you and another team leader disagree about task prioritization?
- What soft skills did it take for you to get where you are?
Data Engineering vs. Data Management
Data engineering and data management are closely related concepts that work together. Though they're often lumped together, there are some differences between them.
Data management is the process of acquiring, storing, processing, and applying data. It's divided into three subsets:
Metadata Management
Metadata can be technical, operational, or business-oriented. It plays a key role in finding and using data.
Data Quality Management
Not all data is good data. If you're using inaccurate, out-of-date, or corrupted data, you can't trust the insights it provides. Data quality management is concerned with assessing and managing the integrity, timeliness, accuracy, and consistency of data.
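A few of these checks can be sketched in code. The example below is a simplified illustration covering completeness, uniqueness, and freshness only; the record fields, the `quality_report` helper, and the 90-day threshold are assumptions, not a standard.

```python
from datetime import date

# Hypothetical records; the field names and values are illustrative only.
records = [
    {"id": 1, "email": "a@example.com", "updated": date(2024, 5, 1)},
    {"id": 2, "email": None, "updated": date(2024, 5, 2)},
    {"id": 2, "email": "b@example.com", "updated": date(2023, 1, 15)},
]

def quality_report(rows, as_of, max_age_days=90):
    """Count rows that fail simple completeness, uniqueness, and freshness checks."""
    seen_ids = set()
    report = {"missing_email": 0, "duplicate_id": 0, "stale": 0}
    for row in rows:
        if not row["email"]:
            report["missing_email"] += 1  # completeness
        if row["id"] in seen_ids:
            report["duplicate_id"] += 1  # uniqueness
        seen_ids.add(row["id"])
        if (as_of - row["updated"]).days > max_age_days:
            report["stale"] += 1  # timeliness
    return report

report = quality_report(records, as_of=date(2024, 6, 1))
```

Passing `as_of` explicitly, instead of reading the current date inside the function, keeps the check repeatable when it is rerun on historical data.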
Data Security Management
Data security is tightly regulated in many industries. Data security management protects data access, use, release, and destruction processes. Some of the most important features in data security management include:
- Only allowing authorized users to access data
- Recording all access and modifications to sensitive data
- Deleting confidential or sensitive data that is no longer in use
- Protecting sensitive data through tokenization or encryption
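The sketch below illustrates three of these features in miniature: an authorization check, an access log, and replacing a sensitive value with a token. It is a simplified illustration, not a production design. The key, user names, and record are hypothetical, and a keyed hash (HMAC-SHA256) stands in for tokenization here, whereas many real systems use a token vault or a dedicated service instead.

```python
import hashlib
import hmac

# Hypothetical secret; a real system would load this from a secrets manager.
SECRET_KEY = b"replace-me"
AUTHORIZED_USERS = {"analyst_1"}
access_log = []

def tokenize(value):
    """Replace a sensitive value with a keyed, non-reversible token (HMAC-SHA256)."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()

def read_record(user, record):
    """Allow only authorized users, and record every access attempt."""
    allowed = user in AUTHORIZED_USERS
    access_log.append({"user": user, "record_id": record["id"], "allowed": allowed})
    if not allowed:
        raise PermissionError(f"{user} is not authorized")
    return record

# Store the token instead of the raw value.
record = {"id": 42, "ssn_token": tokenize("123-45-6789")}
```

Because the same input and key always produce the same token, tokenized columns can still be joined or counted without exposing the original value.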
While data management is primarily concerned with how data is handled to ensure its quality, security, and indexability, data engineering involves designing and building the architecture to collect, transform, store, and serve data for efficient use by data scientists. Data engineers:
- Build data pipelines
- Analyze and organize data
- Build data ingestion systems
- Perform quality checks on data
Like the tech industry at large, the roles of data science professionals are rapidly evolving. There's a lot of overlap between roles, and the lack of standardized terminology across companies makes it worse. The role that one organization calls a "data manager" may be referred to as a "data engineer" at another organization.
In general, however, a data engineering manager is responsible for overseeing a team of data engineers who build the architecture and systems that collect, store, transform, and analyze data. Data engineering managers need a broad range of experience in both data engineering and technical management to effectively perform their jobs.
Data engineering managers need a thorough grounding in data architecture and years of experience in data engineering. They usually have a background in the management side of tech that gives them experience in leadership and team dynamics. A data engineering manager position is a senior-level role that should have high hiring standards.
If you're looking to hire a data engineering manager or fill other highly sought-after tech positions, consider working with a tech talent marketplace like Revelo. Revelo has a talent base of over 110,000 pre-vetted developers who are job-ready. We take the hassle out of finding skilled tech workers so you can focus on growing your business.
Our talent marketplace focuses on developers from Latin American countries, so you get the benefits of hiring from Latin America along with the ease of working with developers in a similar time zone and culture. Reach out today to find out how we can help you find the tech talent you need.