More education

While at Strata NYC this year, I was talking with the nice people at Metis Machine. The topic of continuing education in the Big Data space came up, and they recommended I try out the Data Engineering Podcast.

During my morning exercise sessions I have been working through episodes of the Data Engineering Podcast, and I have a few recent picks to recommend to anyone working in the Data Engineering space or aspiring to.

Apache Zookeeper as a Building Block for Distributed Systems

This was a good introduction to Zookeeper and helped me put it into perspective. I liked how they talked through the opportunities and challenges of Zookeeper, and I was interested to hear about some of the other products in this space. I know of several past projects that would have benefited from a network-level locking mechanism. I wonder what it would take to get them to consider this now… I probably couldn’t convince them, and they have probably worked the bugs out of what they have by now.

Looker

We have been using Tableau for a few years now. I gotta say that Tableau started out as the most wonderful thing in the world. It is still a good tool for quick visualizations, but the server side is quite a challenge. The notion of data sources residing on the server is a good one, but the implementation is lacking; it feels like a hastily added bolt-on. Looker sounds a lot more sustainable and may integrate better into a shop that is comfortable with SQL and wants to turn data into visualizations in place, rather than the Tableau model where data is collected on the server side.

Using Notebooks as the Unifying Layer

This is an interesting discussion about using notebooks as the common platform for communicating information among teams, including development and operations. It sounds like they still have some challenges to face, but I like how Netflix has pushed the envelope here and turned notebooks into something useful. See my other thoughts on notebooks here: Arriving at Jupyter Notebooks