Brief summary
We’re increasingly seeing a trend of organizations exposing events — particularly business domain events — before knowing who the consumers are or what the specific applications are, in the hope that people elsewhere in the organization can discover these events and create value, without us directly orchestrating it.
Podcast Transcript
Neal Ford:
Welcome to the Thoughtworks podcast. This is Neal Ford.
Zhamak Dehghani:
And I'm Zhamak Dehghani.
Neal Ford:
And we're here at the TAB face-to-face meeting, and as often happens, we have interesting discussions that come up when we're face-to-face. So this morning, we're joined by a couple of the Doppler members who put together the Technology Radar.
Erik Dörnenburg:
Erik Dörnenburg, the head of technology in Germany.
Evan Bottcher:
Evan Bottcher, tech lead from Melbourne, Australia.
Neal Ford:
And we're going to talk this morning about serendipitous events. So, Zhamak.
Zhamak Dehghani:
We are seeing this emerging trend around the idea of exposing events, particularly business domain events, before knowing who the consumers are or what the specific applications are, and hoping for discovery of those events by other people in the organization, and creation of value using those events without us directly orchestrating it. And I can imagine that this can be quite a heated discussion because it sounds like things that we don't want to do, like creating big upfront architecture. But I want to open the discussion by perhaps giving some examples of where we're seeing this theme in organizations, and look at the pros and cons of it.
Zhamak Dehghani:
To give a small example of that, one that we have seen within Thoughtworks, there are the events that identify the submission of timesheets. So, we all put in our time allocated to different projects at the end of the week, and the system that's capturing that, or rather the team, has decided to expose events saying that timesheets have been submitted. And the follow-on effect of that has been that other teams have found value in that, and they have created other applications consuming those events, such as calculating the leave that you have, or the payroll, or defining where the tax allocation for your project might be. So this is a small example, but I'm opening it to our guests to see if they have seen these things in the technology landscape.
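Editor's illustration: a minimal Python sketch of the timesheet example, assuming a toy in-memory bus. The names EventBus, timesheet_submitted, track_hours_for_leave, and allocate_project_hours, and the event fields, are hypothetical and do not come from the Thoughtworks system described. The point it tries to show is that the publisher emits the event without knowing which consumers, if any, exist.

```python
from collections import defaultdict


class EventBus:
    """Toy in-memory bus; the publisher never learns who is subscribed."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, event):
        # Deliver to whoever subscribed; zero subscribers is a valid state.
        for handler in list(self._handlers.get(event_type, [])):
            handler(event)


bus = EventBus()
hours_by_person = defaultdict(float)
hours_by_project = defaultdict(float)


def track_hours_for_leave(event):
    # Hypothetical consumer added later by a leave-calculation team.
    hours_by_person[event["person"]] += sum(event["hours"].values())


def allocate_project_hours(event):
    # Hypothetical consumer added later by a finance team.
    for project, hours in event["hours"].items():
        hours_by_project[project] += hours


bus.subscribe("timesheet_submitted", track_hours_for_leave)
bus.subscribe("timesheet_submitted", allocate_project_hours)

# The timesheet system only publishes; it has no reference to the consumers.
bus.publish(
    "timesheet_submitted",
    {"person": "alex", "week": "2019-W20",
     "hours": {"project-a": 32.0, "project-b": 8.0}},
)
```

In this sketch, adding a third consumer would not require any change to the publishing code, which is the property the discussion turns on.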
Evan Bottcher:
Well, I think before we talk about where we may have seen it, the reason this is so controversial and so difficult to get a bead on, from my perspective, is that we've for a long time talked about two things. One is emergent design, which is creating software and architecture and design in response to a genuine need, without building a lot of material upfront, a lot of design or a lot of the architecture from the bottom up, hoping for reuse later. The other is this theme of use before reuse, where we want to see the usage of a particular feature, or piece of software, or data, or whatever it is, in context, delivering some piece of value.
Evan Bottcher:
And then we would actually just introduce the reusability as it becomes used in more contexts. So this is quite counter to something that we've promoted for a very long time as a way to reduce the waste that we've observed in the software industry. So it's really hard to tackle when I'm working with organizations and they're saying, "Well, what should our strategy be about exposing our data?" I think it is all different in different areas.
Erik Dörnenburg:
Yeah, that's a good point. I'm probably the skeptic in this round. I've seen lots of systems being built that are connected by various means, and of course events are one such way. And I think Evan made an important point too when he talked about data, and I think what we often want to do today is take data that is available somewhere and do something with that data in a different context.
Erik Dörnenburg:
And with the shift towards an architecture that's based on microservices, and the microservices owning their own data stores, we've seen a shift to them locking up the data in them, which is good from an encapsulation perspective. But as a response, we also have data lakes, and here it was relatively clear that there was low effort on the teams to put the data into that data lake, to stream the raw data that they had. And it was so obvious to the consumers that that was raw data, that they could have this serendipitous moment where they discover data that they can make use of, with little effort upfront and also with an understanding that you're dumping the raw data in there. And I think that worked really well. Where I'm skeptical about this approach is that, although there may be examples such as the one Zhamak has given us, in the vast majority of cases we wouldn't see those effects, because the events need to serve a certain purpose in the system.
Erik Dörnenburg:
And if you design those events and you have in mind that you may create value later, then there's a huge temptation, which we've seen so many times, for people to try to do the perfect thing. They're not waiting for serendipity, they're trying to anticipate, and they're trying to create these adaptable systems that can adapt to anything, and they will put a lot of effort into this. And this is why I'm worried about it. I'm not saying that serendipity can't happen, but I'm concerned that the cost is higher, or too high actually.
Evan Bottcher:
You referred again to data lake architecture versus traditional data warehousing. One of the big costs in data warehousing, which we tried to address with the data lake, was the enormous effort in modeling the perfect canonicalized form of the data. And I can see that slippery slope towards teams saying, "Okay, I want to expose this timesheet data. Now I've got to spend a lot of time on modeling this event data because I'm going to publish it into the event stream." Especially with the kind of long-lived, immutable event streaming technology that they use, changes to that schema have a cost and an impact on consumers. So there are a lot of forces that will drive a lot more upfront design, I think.
Neal Ford:
Erik brings up this distinction about adaptability, and Rebecca and I talked about this distinction in the Building Evolutionary Architectures book, about adaptable versus evolvable systems, because there's a key difference between those two things. An adaptable system is one where you've tried to anticipate what people are going to need and you build adaptive hooks into it, either feature toggles, or configuration, or some other way to adapt it to this future state that you imagined. But the problem with doing too much of that is you end up with the Eclipse configuration dialog, where you can change and fiddle with all sorts of little configuration parameters.
Neal Ford:
And of course that adds to testing and debugging, and you frequently end up building a lot of things that people don't use, or don't use very much, even though it seemed like a really good idea at the time. That's an adaptable system, and that's actually counter to building truly evolvable systems, where you are trying to determine what the real usage of this thing is, and keep it very grounded in the day-to-day usage of the system as a whole over time, and not get too speculative about it. And so it's tricky to get real value here. So I think the real question is, when do we think this seems like not a bad idea? Are there some indicators that might lead you to do this more in one kind of system than in another, for example?
Erik Dörnenburg:
Maybe we can take a page out of the book of API design. I mean, we've now totally corrupted the term API, I guess across the entire industry. When I say API now, what we're talking about is HTTP endpoints that are RESTful. They don't even have to be RESTful, just an HTTP endpoint that returns some data, and that creates a platform. And as always, we've advocated on the Technology Radar and elsewhere to have product people behind it.
Erik Dörnenburg:
But here, what is happening is exactly that we are designing this. We're designing it with the anticipation of consumers, and that of course takes the serendipity out of it. But maybe there's a middle ground somewhere, where we're saying at least we take some ideas of understanding what we are trying to do in essence. But as I said, I'm worried, as Evan said, about the slippery slope in this. Yet I believe we have some learnings in that space, and we are trying, I guess, a different way of achieving the same goal. Maybe dissolving all the applications, and having this platform that we can, with relatively low cost, experiment on in our enterprises.
Zhamak Dehghani:
Maybe there is a subtlety here around planned serendipity, if there is such a thing, where you're creating an environment that encourages that ecosystem behavior, so that people can easily discover that endpoint and learn about it. There is product thinking behind it in terms of evolution. There was a point that you made, Evan, that might be related to that, about the costs and the investments versus the exploratory nature, like if you want to put something out there for exploration. Does that fit into product thinking, or applying product management techniques, when you build something that's somewhat speculative?
Evan Bottcher:
I think there is. My view is that there is space for some speculation, but it has to be very constrained; you're essentially taking a bet. And we, for a long time, have stayed so brutal because of the waste that we've seen across the industry from just lots and lots of predictive planning. If we crank that back a little bit and we say, "Okay, we are thinking of these internal business capabilities as products," then some large proportion of the funding and the budget that you are spending must go to features that are going to be immediately used. That's going to be a strong recommendation. There is space for some small, or I wouldn't say it's necessarily small, but some controlled or constrained investment in taking bets on things that will create value that's not going to be literally proven, but there's a very strong caution around it.
Evan Bottcher:
I think it's worth pointing out that there are two dimensions to this in the conversations I'm having in this space now. One is which events: of the dozen interesting domain events, which of them do we expose? And the other is how much of the data, which we would normally recommend encapsulating and hiding as much as possible, do you put in? The data lake architecture really says you do straight raw kinds of feeds, the low-cost raw feeds, and then defer the cleansing and modeling of it until later. But how much information do you put in the event? That's tricky as well. It's two dimensions.
Evan Bottcher:
One example I have that drove some of this behavior was around organizations that have hack days. So it's not a business feature that you're building. You know, they have hack teams that are meant to be very exploratory, and they need increased access to the data and information that these systems have. So if you have an organization that does that quite often, you may generate a need to expose certain information, like your example of timesheet data, that isn't actually a proven business feature.
Erik Dörnenburg:
I want to pick up on Evan's point. I guess you were arguing for slack, or for some time that people have that is not directly going to known features, and I think that is a great idea, and you can view it from both angles. You can say, build something that you don't know the value of right now. Or, as I've sometimes seen, give people in product teams the chance to respond to something that isn't on the immediate backlog or part of the immediate piece of work that they're doing. And maybe that is another way of approaching this. I remember a number of years ago, we built a humane registry for developers in organizations to discover what data and what services were in the organization in the first place, and that addresses the discoverability. And without the event having been published, when you have discovered it, if you are in the organization, you have the chance to go to that team, and in the humane registry you maybe get some example data, or you can get copies of real data that is not currently streamed.
Erik Dörnenburg:
And it is very clear that this is just here to discover. You can then go to the other team, and if that team is given some time to actually help you, you can probably get that serendipity. If that team is completely swamped, and if it is only measured, as we've unfortunately seen in some organizations, by its immediate output, and if they are told, "You have to focus on that release, you can't help your colleagues," then that won't work. But maybe that is another angle too: whether it's hack days, whether it is time allocated for planned serendipity, or whether there is time to respond to requests from your colleagues. That is probably a key thing that we are landing on here.
Zhamak Dehghani:
Yeah, absolutely. I think that discoverability is a big factor in it, enabling people to find things that they can do useful things with. I have a counterexample of that, where one of our large retail clients was super excited about Kafka, in the early days of Kafka. So the initiative was started from curiosity about the technology, and then they published essentially all the user interactions with the website as events. It was a big effort to get that out there. But nothing came after that, and one reason for it was that nobody knew about it. Nobody knew about these events, nobody knew about their schemas, they couldn't discover them, they couldn't use them. So people were still going around complaining that they had no insights into how customers were interacting with that website.
Neal Ford:
Well, that's always a challenge in organizations: how do you get reuse of the things that are already there?
Zhamak Dehghani:
Absolutely. And I think discovering is one aspect, and paving the path is another: make it easy to use. As I mentioned, you have sample data, you have example code perhaps.
Neal Ford:
Yeah. Make it easy to be successful using your stuff, and then you're a lot more likely to make it widely used.
Erik Dörnenburg:
And again, here is that fine balance between trying to document it up front and being there when somebody needs it. I mean, we talked about schemas. In JSON it is getting a little bit better, but even so, most developers that I know, and I include myself in this, find it easiest to look at some sample data to get a rough understanding of what is in there, rather than a complex schema document that outlines all the possible edge cases. On the other hand, if you only see current sample data, you may not see everything. This might be a system, like a timesheet system, that sees very different usage in certain parts of the week, or, if you look at investment banks, trading patterns can vary over time, within a week, within periods and so on.
Erik Dörnenburg:
And again here, that's why I'm thinking that some preselected sample data could help, or maybe they have written a stub anyway for their own testing. To be able to get that stub and get data from it could be a good idea. It might strike a middle ground between having a very abstract, complicated schema and observing what is currently being sent.
Evan Bottcher:
So many of these conversations that we're having in this TAB context go all the way back to a blip that was on the previous Radar, which was product management for internal platforms. It's such an important thing, and one that we've kind of under-emphasized. I think generally that if we have these internal platforms and business capabilities and business applications, then part of that is understanding the customer and their need. And what we may be describing in this serendipitous use of events, or other forms of APIs and data, is a customer group that's underrepresented, which the product manager or product owner for that platform can identify. That gives me a little bit more comfort that you can apply the other product management techniques around measuring the value of what you produce and trying to shorten the feedback cycle. So even if you're building something that doesn't have an immediate need, you can still optimize to get that need validated quickly.
Neal Ford:
I think Evan makes an important point. If you're going to engage in this, you should track and see how well you're doing. Are you actually building things that people eventually end up using, or do you end up building a bunch of things that nobody ever uses?
Evan Bottcher:
And turning the thing off. I mean, if you've published 10 event streams, it's costing you time to maintain those schemas and the infrastructure that they're sitting on. If they've never been used, how long do you keep them running for?
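Editor's illustration: one way the "track it, then turn it off" idea might be made concrete, sketched in Python under stated assumptions. The stream names, the last_consumed record, and the 90-day threshold are all hypothetical; the speakers do not propose a specific rule or tool.

```python
from datetime import datetime, timedelta


def idle_streams(last_consumed, now, max_idle):
    """Return names of streams never consumed, or idle for longer than max_idle.

    last_consumed maps a stream name to the time of its most recent read,
    or to None if nobody has ever consumed it.
    """
    return sorted(
        name
        for name, last in last_consumed.items()
        if last is None or now - last > max_idle
    )


# Hypothetical usage records for three published streams.
now = datetime(2019, 6, 1)
last_consumed = {
    "timesheet_submitted": datetime(2019, 5, 30),  # actively used
    "page_viewed": None,                           # never consumed
    "cart_updated": datetime(2019, 1, 15),         # idle for months
}

# Streams worth reviewing for retirement, not an automatic kill list.
candidates = idle_streams(last_consumed, now, timedelta(days=90))
```

The output is meant as a prompt for a conversation with potential consumers; whether an idle stream should actually be switched off remains a product decision.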
Erik Dörnenburg:
Maybe, since we are writing the Radar, a good thing to look at here is also the idea of the event stream as a source of truth. So, that can really help. And I've seen that at clients where you have the ability, and Kafka does this really well, to actually replay all the events from the beginning. That means if you suddenly become a consumer, you have a very good way of testing what would have happened if you had seen the entire stream. But also, if you are deriving or projecting some state from the events, you have a chance to actually do that. It depends. With the timesheets it probably didn't matter because there was no overall state, so that's okay too. But it is another important enabling technique, or technology if you will, that can help you with that approach.
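Editor's illustration: a minimal Python sketch of replaying a stream from the beginning to project state for a consumer that arrives late. It uses a plain list as a stand-in for a retained event log rather than any Kafka client API; the replay and add_hours functions and the event fields are hypothetical.

```python
from collections import defaultdict

# Stand-in for a retained, ordered, append-only event log.
event_log = [
    {"offset": 0, "type": "timesheet_submitted", "person": "alex", "hours": 40.0},
    {"offset": 1, "type": "timesheet_submitted", "person": "sam", "hours": 32.0},
    {"offset": 2, "type": "timesheet_submitted", "person": "alex", "hours": 36.0},
]


def replay(log, apply, state, from_offset=0):
    """Fold every event at or after from_offset into state, in log order."""
    for event in log:
        if event["offset"] >= from_offset:
            apply(state, event)
    return state


def add_hours(state, event):
    # Projection: total hours per person across all submitted timesheets.
    if event["type"] == "timesheet_submitted":
        state[event["person"]] += event["hours"]


# A consumer that joins late can still derive the full state from offset 0.
totals = replay(event_log, add_hours, defaultdict(float))

# Starting from a later offset sees only part of the history.
recent = replay(event_log, add_hours, defaultdict(float), from_offset=2)
```

With Kafka itself, a consumer in a new consumer group configured with auto.offset.reset set to earliest starts from the oldest record still retained, so how far back a replay can go depends on the topic's retention settings.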
Neal Ford:
Okay, so thanks everyone for joining us this morning. This, like many of the things we end up talking about for the Radar that don't make it onto the Radar, is a really interesting idea that's current, but it's way too complex to get into two or three sentences. And so this gives us a chance to poke around at some of the nuances, and edges, and facets of some of the interesting things that are happening in the tech world. So thanks, Erik. Thanks, Evan.
Zhamak Dehghani:
Thank you.
Neal Ford:
And Zhamak and I will see you again soon.
Rebecca Parsons:
This is Rebecca Parsons, and on the next edition of the Thoughtworks podcast, Mike Mason and I will be speaking with Jonny Leroy and Zhamak Dehghani about architectural governance. That might sound like a dry topic, but I think you'll find it a fascinating conversation, and we hope you'll listen in.