Hiring, training and problem-solving
Problem
My company was tasked with taking a dataset from the US Government, working out the network architecture of school districts across America, and identifying which districts needed internet access. The dataset existed so that school districts could apply to have their network access costs reimbursed. However, the information the FCC needed about each district's architecture was different from the information we wanted: we wanted to know whether the districts were meeting the 100 Mbps goal set by the FCC.
When I was first hired, our analytics team was using only Excel for our data, and none of the team knew SQL. In addition, while two of our engineers were great start-up engineers and hackers, the data model they had set up simply mirrored the structure of the data we were downloading; they had not moved the data into a model that suited the work we were doing. This left us with two issues: our analysts didn't know SQL, and our datasets were too large for Excel to load.
While I've used databases a lot, and have helped architect them, I'm not a database person. I knew that we needed a better data model, but I didn't have the people to build it.
Actions taken
I attacked this issue on three fronts. First, I went to my boss and told him that I wanted to hire an outside firm to help, because it was taking time for my new hires to become proficient and the company wanted an external website showing the school districts and what their network access looked like.
My second step was to make sure the analysts could use SQL. One of my engineers suggested teaching them GitHub and SQL, and we ran a series of training sessions to do this. The analysts got up to speed really quickly, and were able to do the analysis that we needed to feed to the engineers developing the public website.
Third, I still needed to spend time hiring, and teaching my new hires about network architecture. To work with our data, you need to understand how fiber, internet access, and wide area networks work, and the difference between dark and lit fiber. This takes a bright person around six months to fully understand, which slows down progress. I streamlined the process by having some existing staff members write agile user stories, so that the engineers could gradually learn enough to do the work they needed to do.
In addition, when I interviewed one of the first engineers I hired, I told him that I couldn't hire him unless he knew SQL, and he admitted that he hadn't used it much. I hired him on the spot anyway, and he came back 10 days later having fully mastered it. He then also started helping to get the analysts up to speed and to hire new engineers.
Even after all these steps, we still didn't have a proper data model. The tables we were using had redundancies, and they looked like a mess to anyone with a database background. Analysts needed us to add and remove columns, but the same tables fed our public-facing website, so every column change risked breaking what we could do with the website. The engineering effort required behind the scenes to keep production in sync with our internal data was huge. We started using views in PostgreSQL and splitting up tables, which helped. However, it also created even more context for our analysts, designers, and engineers to remember, even for the simplest routines.
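To illustrate how a view can shield the public website from column changes, here is a minimal sketch. It is not our actual schema: the table, view, and column names are hypothetical, and SQLite (via Python's standard library) stands in for PostgreSQL so the example runs without a server. The plain `CREATE VIEW ... AS SELECT` statement used here is valid in both.

```python
import sqlite3

# In-memory SQLite database, standing in for PostgreSQL (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical working table that analysts keep changing.
CREATE TABLE district_raw (
    district_id    INTEGER PRIMARY KEY,
    name           TEXT NOT NULL,
    students       INTEGER NOT NULL,
    bandwidth_mbps REAL NOT NULL
);
INSERT INTO district_raw VALUES
    (1, 'Example District A', 1200, 100.0),
    (2, 'Example District B', 800, 20.0);

-- Hypothetical view read by the public website. It names its columns
-- explicitly, so the website only ever sees these three.
CREATE VIEW public_site_districts AS
SELECT district_id, name, bandwidth_mbps
FROM district_raw;
""")

# An analyst adds a column to the working table.
conn.execute("ALTER TABLE district_raw ADD COLUMN analyst_notes TEXT")

# The working table now has the new column, but the view's shape is unchanged.
raw_columns = [d[0] for d in conn.execute("SELECT * FROM district_raw").description]
public_columns = [d[0] for d in conn.execute("SELECT * FROM public_site_districts").description]

print(raw_columns)
print(public_columns)
```

The trade-off is the one described above: every view is one more object that analysts, designers, and engineers have to know about.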
"The tables we were using had redundancies, and it was a mess for anyone with a database background."
That's been our struggle for the last few years, but it's also been a source of strength. The tools work well. However, the analysts' SQL has to be translated into Ruby before it can go onto the public-facing website. In that translation we often make mistakes and fail to preserve the analyst's exact logic, so once the data is on the public website it looks different from the data the Government has. We still have that issue today, but we have changed our QA methodology to make it easier to work with.
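One way to catch this kind of translation drift is a parity check that runs the analyst's original SQL and the application's reimplementation against the same data and reports any differences. The sketch below is an illustration under stated assumptions, not a description of our actual QA process: the table, the function names, and the sample rows are hypothetical, Python stands in for the Ruby application code, and the threshold is taken from the 100 Mbps goal mentioned earlier.

```python
import sqlite3

GOAL_MBPS = 100.0  # assumed threshold, based on the 100 Mbps goal in the article

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical sample data; district 3 has no bandwidth recorded.
CREATE TABLE districts (district_id INTEGER PRIMARY KEY, bandwidth_mbps REAL);
INSERT INTO districts VALUES (1, 100.0), (2, 20.0), (3, NULL), (4, 250.0);
""")

# The analyst's query is the source of truth.
ANALYST_SQL = """
SELECT district_id FROM districts
WHERE bandwidth_mbps >= ?
ORDER BY district_id
"""

def analyst_result(conn):
    """Districts meeting the goal, according to the analyst's SQL."""
    return [row[0] for row in conn.execute(ANALYST_SQL, (GOAL_MBPS,))]

def fetch_all(conn):
    return conn.execute(
        "SELECT district_id, bandwidth_mbps FROM districts ORDER BY district_id"
    ).fetchall()

def app_result(conn):
    """A faithful application-side translation of the analyst's logic."""
    return [d for d, bw in fetch_all(conn) if bw is not None and bw >= GOAL_MBPS]

def buggy_app_result(conn):
    """A subtly wrong translation: '>' instead of '>='."""
    return [d for d, bw in fetch_all(conn) if bw is not None and bw > GOAL_MBPS]

def parity_diff(expected, actual):
    """District ids present in one result but not the other."""
    return sorted(set(expected) ^ set(actual))

print(parity_diff(analyst_result(conn), app_result(conn)))        # no differences
print(parity_diff(analyst_result(conn), buggy_app_result(conn)))  # district 1 differs
```

A check like this does not remove the translation step, but it surfaces a mismatch before the public website disagrees with the source data.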
"The analysts' SQL needs to be translated to be put into Ruby, to be put onto the public-facing website."
We knew we were creating problems for ourselves, but we needed to serve the business and deal with the problems that were in front of us in April 2013. Perfection is the enemy of the good.
It has been two and a half years, and we are now tearing the data apart, with a small team building a data warehouse. It's an interesting group of people: two of them have implemented data warehouses, and the other four (two analysts and two engineers) have never even seen one. We have also supplemented this team with an outside solutions architect who doesn't understand our data, but who understands data modeling and how to build the dimension and fact tables we need.
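For readers who have not seen a data warehouse either, here is a minimal sketch of what dimension and fact tables look like. The schema is purely illustrative and is not the warehouse our team is building: all table names, columns, and rows are hypothetical, and SQLite is used only so the example runs from Python's standard library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the things we report on (hypothetical).
CREATE TABLE dim_district (
    district_key INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    state        TEXT NOT NULL
);
CREATE TABLE dim_connection_type (
    connection_key INTEGER PRIMARY KEY,
    label          TEXT NOT NULL
);

-- The fact table holds one row per measured event (here, one circuit
-- in one funding year) and points at the dimensions by key.
CREATE TABLE fact_circuit (
    district_key   INTEGER NOT NULL REFERENCES dim_district(district_key),
    connection_key INTEGER NOT NULL REFERENCES dim_connection_type(connection_key),
    funding_year   INTEGER NOT NULL,
    bandwidth_mbps REAL NOT NULL
);

INSERT INTO dim_district VALUES
    (1, 'Example District A', 'CA'),
    (2, 'Example District B', 'TX');
INSERT INTO dim_connection_type VALUES
    (1, 'lit fiber'), (2, 'dark fiber'), (3, 'copper');
INSERT INTO fact_circuit VALUES
    (1, 1, 2015, 100.0),
    (1, 3, 2015, 10.0),
    (2, 2, 2015, 50.0);
""")

# A typical warehouse query: aggregate the facts, label them via a dimension.
totals = conn.execute("""
SELECT d.name, SUM(f.bandwidth_mbps) AS total_mbps
FROM fact_circuit AS f
JOIN dim_district AS d ON d.district_key = f.district_key
GROUP BY d.name
ORDER BY d.name
""").fetchall()

for name, total_mbps in totals:
    print(name, total_mbps)
```

The appeal of this layout is that descriptive attributes live in one place each, which addresses the redundancy problem, while analysts and the website can both query the same facts.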
Lessons learned
Struggling with all of this complexity has been challenging, both in getting data to the public-facing website and in supporting internal and external analytics work. We knew that we needed to solve this problem, but we couldn't just stop the business to do it. Solving this issue was all about people: if you have enough people, you can get the work done.
While this process has taken over two years, which is longer than we had hoped, we have gotten to the right place. When faced with an issue like this, it's really important to tell your executives and your team about the technical issues you are facing and the tradeoffs you are making. Be completely transparent with both: use business language with the company and technical language with the team.
Be notified about next articles from Denise Shepard