Socio-technical Graphs (or Who Owns Kafka cluster C13 or Does It Even Exist?)
As we improve the observability of our technical systems, what about our socio-technical systems? Building and maintaining a graph of components and services, along with the teams and people who run them, needs to be the next wave of organizational observability.
As more tools start to include the humans alongside the technology, what happens when you have a queryable graph database that includes relationships between technology and humans? What could you do if you could query which teams commit into the mono-repo the most (not just which people)? Or if you could see which teams own services downstream of a particular service? What if you could catch a service becoming an orphan at the moment a re-org happens instead of years later?
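As a rough illustration of the questions above, here is a minimal in-memory sketch in Python. It is not how El Dorado, Backstage, or any other tool named in this abstract works. The relation names (`feeds`, `owns`, `member_of`), the class `SocioTechGraph`, and all the sample services, teams, and people are hypothetical, and "downstream" is assumed to mean "consumes from, directly or transitively".

```python
from collections import defaultdict, deque


class SocioTechGraph:
    """Tiny labeled graph: (source, relation) -> set of targets. Hypothetical sketch."""

    def __init__(self):
        self.edges = defaultdict(set)

    def add(self, source, relation, target):
        self.edges[(source, relation)].add(target)

    def targets(self, source, relation):
        # .get avoids creating empty entries in the defaultdict
        return self.edges.get((source, relation), set())

    def sources(self, relation, target):
        return {s for (s, r), ts in self.edges.items()
                if r == relation and target in ts}

    def downstream_services(self, service):
        """Breadth-first walk along 'feeds' edges (assumed meaning of downstream)."""
        seen, queue = set(), deque([service])
        while queue:
            current = queue.popleft()
            for nxt in self.targets(current, "feeds"):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def downstream_owners(self, service):
        """Teams that own any service downstream of the given one."""
        return {team
                for svc in self.downstream_services(service)
                for team in self.sources("owns", svc)}

    def orphans(self, services):
        """Services with no owning team that still has at least one member."""
        return {svc for svc in services
                if not any(self.sources("member_of", team)
                           for team in self.sources("owns", svc))}


# Hypothetical sample data: a re-org has left team-billing with no members.
graph = SocioTechGraph()
graph.add("kafka-c13", "feeds", "orders")
graph.add("orders", "feeds", "billing")
graph.add("team-stream", "owns", "kafka-c13")
graph.add("team-orders", "owns", "orders")
graph.add("team-billing", "owns", "billing")
graph.add("alice", "member_of", "team-stream")
graph.add("bob", "member_of", "team-orders")

print(graph.downstream_owners("kafka-c13"))
print(graph.orphans(["kafka-c13", "orders", "billing"]))
```

With this sample data, the first query returns the two teams that own services fed by `kafka-c13`, and the second flags `billing` as an orphan because its owning team has no members left. A real deployment would presumably keep this in a graph database and refresh it from source control, the service catalog, and the org chart.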
El Dorado, an internal tool created by Ward Cunningham at New Relic; Backstage, an open source project from Spotify; data.world, an interconnected data catalog as a service; and Jeli.io, an incident analysis tool from the resilience engineering world, are all examples of projects that are trying to give us observability not just into the technical system, but into the socio-technical system.
We are just at the beginning of these tools and we’ll be discussing them and what superpowers they will be able to unlock in the years to come.
More about Tim Tischler
Tim Tischler is a Principal Engineer at Wayfair. Previously he was a Site Reliability Champion at New Relic and worked in infrastructure at Nike and in engineering at HomeAway/VRBO. Observability is one of his driving concerns, and he believes that "socio-technical" needs to be one of the big buzzwords of the next decade. He is also enrolled in the Human Factors and System Safety program at Lund University, though on hiatus during COVID.