jackattackson

joined 2 years ago
 

A chiral aperiodic monotile, by David Smith, Joseph Samuel Myers, Craig S. Kaplan, and Chaim Goodman-Strauss, 2023

https://cs.uwaterloo.ca/~csk/spectre/

 

I get questions like this a lot:

  • Where did this data come from?
  • How do I know I can trust the source?
  • What types of QA checks were applied to this data?

Data lineage is a chronic issue in data engineering. This blog post from Airbyte gives a good overview and mentions some interesting products and projects that may help with it.

Unfortunately, I have limited flexibility to purchase or install tools for this in my current role. Has anyone rolled their own solution for this?
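For what it's worth, the three questions above map fairly directly onto a small record stored next to each dataset. Here is a minimal sketch of that idea using only the Python standard library, so nothing needs to be purchased or installed. The names `LineageRecord` and `record_lineage`, the field names, the example source URL, and the two checks are all hypothetical, and this is not taken from the Airbyte post or any particular tool.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class LineageRecord:
    """Hypothetical record: where the data came from and which QA checks ran."""
    dataset: str
    source: str            # where the data came from
    content_sha256: str    # fingerprint of the exact bytes that were loaded
    loaded_at: str         # UTC timestamp of the load, ISO 8601
    qa_checks: dict = field(default_factory=dict)  # check name -> passed?


def record_lineage(dataset, source, payload, checks):
    """Run each QA check against the raw bytes and return a LineageRecord."""
    results = {name: bool(check(payload)) for name, check in checks.items()}
    return LineageRecord(
        dataset=dataset,
        source=source,
        content_sha256=hashlib.sha256(payload).hexdigest(),
        loaded_at=datetime.now(timezone.utc).isoformat(),
        qa_checks=results,
    )


# Made-up payload and checks, purely for illustration.
payload = b"id,amount\n1,10\n2,20\n"
record = record_lineage(
    dataset="orders_daily",
    source="sftp://example.invalid/exports/orders.csv",
    payload=payload,
    checks={
        "not_empty": lambda b: len(b) > 0,
        "has_header": lambda b: b.startswith(b"id,amount"),
    },
)

# Serialize the record so it can be written alongside the dataset.
print(json.dumps(asdict(record), indent=2))
```

Writing one JSON record like this per load, next to the output file or in a small table, would at least let someone look up the source, the load time, and the QA results for a given dataset. It does not track column-level or transformation-level lineage.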

 

(PDF download at the bottom of the linked page)

Often the most challenging part of data engineering is figuring out what problem to solve in the first place.

The resources Stanford put in this Design Thinking Bootleg might have something that can help you work with others and build towards a well-designed solution.

 

For Python developers, the NumPy array is a widely used data structure in data science / data engineering.

I thought this paper was a good resource for learning a bit more about the history and core concepts of NumPy.

 

I found this content super helpful, and I frequently share it with new coworkers getting started in data / dev.