Episode Transcript
[00:00:03] Speaker A: Hello, I'm Joseph Kasser, and welcome to this podcast where the AI team does a deep dive into some application of systems thinking.
[00:00:13] Speaker B: So you want to understand AI quality, not just the hype. Right. Not just the, wow, AI can do this, but, like, the actual impact it has. Yeah. Well, we've got a really fascinating deep dive for you today. Okay. We're going to be exploring the work of Tom Gilb.
[00:00:30] Speaker C: Go ahead.
[00:00:31] Speaker B: He's a veteran software engineer, and he really challenges us to think differently about AI.
[00:00:37] Speaker C: Okay.
[00:00:37] Speaker B: You know, he argues that we need to measure AI's impact on the world and on people, not just marvel at its abilities.
[00:00:45] Speaker C: I see.
[00:00:46] Speaker B: So it's not just, can it do this cool thing? It's like, okay, what are the actual consequences of it being able to do that cool thing? Yeah.
[00:00:53] Speaker C: And he starts by tackling a fundamental problem.
[00:00:55] Speaker B: Okay.
[00:00:56] Speaker C: Which is, how do we even understand these complex AI systems in the first place?
[00:01:00] Speaker B: Right. Because it feels like everyone's talking about AI, but it's this big, mysterious thing.
[00:01:05] Speaker C: Exactly.
[00:01:05] Speaker B: I think we all kind of have a general sense of it, but when it comes down to it, like, how does it actually work? Like, what's going on under the hood?
[00:01:12] Speaker C: Gilb argues that the ways we typically try to understand AI, like just focusing on its design or its intended functions, are actually flawed.
[00:01:21] Speaker B: Interesting.
[00:01:22] Speaker C: He compares it to a black box.
[00:01:23] Speaker B: Oh, okay.
[00:01:24] Speaker C: Where we can only peek through a few tiny holes.
[00:01:26] Speaker B: I can see where that analogy is going. We're getting glimpses, but not the full picture.
[00:01:30] Speaker C: Exactly.
[00:01:30] Speaker B: Right.
[00:01:31] Speaker C: And what's more, these systems are constantly changing and evolving in ways that are really hard to predict.
[00:01:37] Speaker B: Oh, so that adds another layer of complexity to it.
[00:01:40] Speaker C: Absolutely. That instability makes it really difficult to know if an AI will consistently produce the results we expect.
[00:01:47] Speaker B: So we're essentially trusting these black boxes to make important decisions even though we don't fully understand how they work.
[00:01:54] Speaker C: That's a key concern.
[00:01:55] Speaker B: Yeah.
[00:01:56] Speaker C: Gilb highlights the lack of transparency as a major challenge.
[00:01:59] Speaker B: Okay.
[00:02:00] Speaker C: And he's not alone.
[00:02:01] Speaker B: Right.
[00:02:02] Speaker C: A lot of experts, even big names in the tech world, are voicing similar worries.
[00:02:07] Speaker B: That makes sense, because if we're going to rely on AI for things like health care or transportation, we need to be able to trust it. But how do we build that trust when we're dealing with these black boxes?
[00:02:20] Speaker C: So Gilb's solution is pretty radical.
[00:02:22] Speaker B: Okay.
[00:02:23] Speaker C: He says we need to move away from vague terms like good AI or safe AI.
[00:02:28] Speaker B: Okay.
[00:02:29] Speaker C: And start defining measurable metrics for quality.
[00:02:33] Speaker B: Measurable metrics. So, like a report card for AI.
[00:02:36] Speaker C: That's the idea.
[00:02:37] Speaker B: I like that.
[00:02:37] Speaker C: It's about creating objective ways to evaluate AI performance.
[00:02:41] Speaker B: Okay.
[00:02:42] Speaker C: Much like we use grades to assess a student's understanding.
[00:02:45] Speaker B: That's a really interesting concept. Yeah, but AI is so complex. How do we even begin to grade it?
[00:02:51] Speaker C: Right.
[00:02:52] Speaker B: What kind of subjects would be on this report card?
[00:02:54] Speaker C: Well, Gilb gives some specific examples. One is transparency. Instead of just saying the AI system is transparent or not, he suggests measuring it based on the percentage of operations where the decision-making process is clear to stakeholders.
[00:03:08] Speaker B: So we're taking something that seems very subjective, like transparency, and turning it into a hard number.
[00:03:14] Speaker C: Exactly.
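A minimal sketch of how such a transparency metric might be computed, assuming a hypothetical log of decision records with a flag for whether the reasoning was clear to stakeholders (the record format and field names are illustrative, not from Gilb's presentation):

    # Transparency as the percentage of AI decisions whose reasoning
    # was clear to the affected stakeholders.
    def transparency_score(decisions: list[dict]) -> float:
        """Return the share of decisions flagged as explained, 0-100."""
        if not decisions:
            return 0.0
        explained = sum(1 for d in decisions if d["explained_to_stakeholders"])
        return 100.0 * explained / len(decisions)

    sample = [
        {"id": 1, "explained_to_stakeholders": True},
        {"id": 2, "explained_to_stakeholders": False},
        {"id": 3, "explained_to_stakeholders": True},
    ]
    print(f"Transparency: {transparency_score(sample):.0f}%")  # Transparency: 67%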
[00:03:14] Speaker B: I like that. What other qualities can be measured in this way?
[00:03:17] Speaker C: He also talks about well-being. It's about understanding how the AI impacts the well-being of all those affected by it.
[00:03:25] Speaker B: That's getting into some pretty complex territory.
[00:03:27] Speaker C: It is.
[00:03:28] Speaker B: Measuring well-being sounds incredibly difficult.
[00:03:30] Speaker C: It definitely has its challenges.
[00:03:32] Speaker B: Yeah.
[00:03:32] Speaker C: But Gilb argues that it's crucial to consider.
And just like with transparency, you would break down well-being into measurable dimensions that are relevant to the stakeholders.
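One hedged way to picture that breakdown in code is as a weighted set of well-being dimensions. The dimension names and weights below are purely illustrative assumptions, not Gilb's actual decomposition:

    # "Well-being" broken into measurable dimensions with weights.
    # Dimension names and weights are illustrative assumptions.
    WELLBEING_WEIGHTS = {
        "stress_reduction":   0.4,  # e.g. stakeholder survey score, 0.0-1.0
        "time_saved":         0.3,  # e.g. normalized hours saved per week
        "perceived_autonomy": 0.3,  # e.g. stakeholder survey score, 0.0-1.0
    }

    def wellbeing_index(scores: dict[str, float]) -> float:
        """Weighted average of per-dimension scores, each in 0.0-1.0."""
        return sum(WELLBEING_WEIGHTS[dim] * scores[dim] for dim in WELLBEING_WEIGHTS)

    print(round(wellbeing_index({"stress_reduction": 0.8,
                                 "time_saved": 0.5,
                                 "perceived_autonomy": 0.6}), 2))  # 0.65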
[00:03:45] Speaker B: So stakeholders are central to this whole idea of evaluating AI. But who exactly are these stakeholders?
[00:03:52] Speaker C: That's a great question.
[00:03:53] Speaker B: Yeah.
[00:03:53] Speaker C: It's not just the developers who built the AI or the users who directly interact with it.
[00:03:57] Speaker B: Okay.
[00:03:58] Speaker C: Gilb emphasizes that stakeholders include anyone and everyone who might be impacted by the AI's decisions.
[00:04:05] Speaker B: Oh, wow.
[00:04:06] Speaker C: Even indirectly.
[00:04:07] Speaker B: Okay.
[00:04:08] Speaker C: So if we're talking about, say, an AI system used for medical diagnoses.
[00:04:11] Speaker B: Okay.
[00:04:12] Speaker C: The stakeholders would include not just doctors and patients, but potentially nurses, hospitals, insurance companies, and even family members of the patients.
[00:04:21] Speaker B: So it's really a much broader scope than we might initially think.
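As a rough sketch, that broader stakeholder map could be captured as a simple data structure. The groupings and concern labels here are illustrative, not Gilb's formal notation:

    # Stakeholder map for the medical-diagnosis example, tagging each
    # group as directly or indirectly impacted by the AI's decisions.
    STAKEHOLDERS = {
        "doctors":             {"impact": "direct",   "concern": "diagnostic accuracy"},
        "patients":            {"impact": "direct",   "concern": "health outcomes"},
        "nurses":              {"impact": "indirect", "concern": "workflow changes"},
        "hospitals":           {"impact": "indirect", "concern": "liability and cost"},
        "insurance_companies": {"impact": "indirect", "concern": "claim decisions"},
        "family_members":      {"impact": "indirect", "concern": "care decisions"},
    }

    # The indirectly impacted groups are the ones an evaluation most easily misses.
    indirect = [name for name, s in STAKEHOLDERS.items() if s["impact"] == "indirect"]
    print(indirect)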
[00:04:25] Speaker C: He calls this systematic process of understanding stakeholders and their needs stakeholder engineering.
[00:04:31] Speaker B: Stakeholder engineering.
[00:04:32] Speaker C: It's even the title of one of his books.
[00:04:34] Speaker B: Okay, so he's written a whole book about this.
[00:04:35] Speaker C: He has.
[00:04:36] Speaker B: That sounds like a massive undertaking.
[00:04:38] Speaker C: It is a big topic.
[00:04:39] Speaker B: Yeah. How do you even begin to map out all those potential impacts and relationships?
[00:04:45] Speaker C: Well, it's not easy, but Gilb provides a framework for approaching it.
[00:04:49] Speaker B: Okay.
[00:04:50] Speaker C: And he argues that understanding these diverse perspectives is absolutely essential.
[00:04:54] Speaker B: Yeah.
[00:04:55] Speaker C: If we want to develop AI systems that are truly beneficial and avoid unintended consequences.
[00:05:00] Speaker B: This is all really insightful and I'm eager to hear more about how he suggests that we tackle this challenge.
But before we dive into that, let's take a quick pause and recap what we've learned so far about AI quality.
[00:05:12] Speaker C: Sounds good.
[00:05:13] Speaker B: What stands out to you as the most crucial takeaway?
[00:05:16] Speaker C: Well, I think the shift from vague ideas about AI quality to these measurable metrics is huge. It really forces us to define what good AI actually means in practical terms.
[00:05:27] Speaker B: It's like we're moving from philosophy to a science of AI quality.
[00:05:32] Speaker C: I like that.
[00:05:33] Speaker B: And the report card analogy, while not perfect, is a really compelling way to visualize that shift, because it really gives you something concrete to grasp onto.
[00:05:45] Speaker C: It does, yeah. And it highlights the importance of moving beyond just looking at the AI itself.
[00:05:52] Speaker B: Okay.
[00:05:52] Speaker C: We have to consider all the people who are impacted by it.
[00:05:55] Speaker B: Right.
[00:05:55] Speaker C: That's where stakeholder engineering comes in.
[00:05:57] Speaker B: Yeah.
[00:05:58] Speaker C: It's about understanding the human element, not just the technical side of things.
[00:06:03] Speaker B: Now, Gilb also spends some time discussing specific types of AI, those large language models, or LLMs, that everyone's talking about. Things like ChatGPT, for example.
[00:06:13] Speaker C: Exactly. And he raises some really important points.
[00:06:16] Speaker B: Okay.
[00:06:16] Speaker C: While LLMs are certainly impressive in their ability to generate text and code, Gilb argues that they're fundamentally limited because they rely so heavily on massive data sets and lack true reasoning capability.
[00:06:30] Speaker B: So they're basically really good at mimicking patterns, but not necessarily at thinking for themselves.
[00:06:36] Speaker C: Exactly. And this leads Gilb to critique the idea that LLMs represent true intelligence. He even quotes Fred Gibson, the CEO of Graph Matrix, who calls the hype around LLMs delusional.
[00:06:47] Speaker B: Wow, that's a pretty strong statement.
[00:06:49] Speaker C: It is.
[00:06:50] Speaker B: So what does Gibson see as the real future of AI, then?
[00:06:54] Speaker C: Well, he believes the future lies in what's called Artificial General Intelligence, or AGI.
[00:07:00] Speaker B: Okay.
[00:07:00] Speaker C: AGI. This is about developing AI that can think more like humans, able to reason and solve problems in a more fundamental way.
[00:07:09] Speaker B: So it's not just about pattern recognition. It's about something that more closely resembles human-like thought processes.
[00:07:15] Speaker C: Exactly.
[00:07:16] Speaker B: That's fascinating. Can you give an example of how that might look different from an LLM in practice?
[00:07:20] Speaker C: Sure. Imagine you're trying to teach an AI to play a new board game.
[00:07:24] Speaker B: Okay.
[00:07:25] Speaker C: An LLM might be able to analyze thousands of game transcripts and learn to mimic successful moves.
[00:07:31] Speaker B: Okay, so it's like learning by rote.
[00:07:33] Speaker C: In a way. Yeah. But an AGI ideally would be able to understand the underlying rules and strategies of the game, allowing it to adapt to new situations.
[00:07:43] Speaker B: Oh, wow.
[00:07:44] Speaker C: And even come up with innovative tactics that haven't been seen before.
[00:07:47] Speaker B: So it's about moving from imitation to genuine understanding.
[00:07:50] Speaker C: Exactly.
[00:07:51] Speaker B: That makes sense.
Now, with all this talk about different types of AI, different qualities to measure, and the need to consider all these stakeholders, is there a way to bring this all together?
[00:08:02] Speaker C: There is. Gilb introduces a framework that helps us make sense of it all, something he calls the Penta Model.
[00:08:09] Speaker B: The Penta Model.
[00:08:09] Speaker C: It's a simple but really powerful way to visualize and analyze any AI system.
[00:08:16] Speaker B: Tell me more about this Penta model.
[00:08:18] Speaker C: Okay. So imagine a five-pointed star.
[00:08:20] Speaker B: Okay.
[00:08:21] Speaker C: Each point represents a key aspect of the AI system. You have design, functions, values, resources, and constraints.
[00:08:29] Speaker B: Got it. So five different points.
[00:08:31] Speaker C: Right. By considering each of these elements and how they relate to each other, you get a much more holistic understanding of the AI.
[00:08:39] Speaker B: I see how that could be really useful.
[00:08:41] Speaker C: Yeah.
[00:08:42] Speaker B: So for design, you'd be looking at the technical architecture of the AI. Right. And functions would be what the AI is actually capable of doing.
[00:08:50] Speaker C: Precisely.
[00:08:51] Speaker B: And then values are about the intended purpose of the AI, what it's supposed to achieve.
[00:08:56] Speaker C: Exactly.
[00:08:57] Speaker B: And then we have resources, which include things like the data the AI is trained on.
[00:09:02] Speaker C: Right.
[00:09:02] Speaker B: The computing power it uses and even the human expertise involved in its development.
[00:09:08] Speaker C: All of that.
[00:09:09] Speaker B: Wow. And finally, constraints represent the limitations or restrictions placed on the AI, whether those are technical limitations, ethical guidelines or regulatory frameworks.
[00:09:19] Speaker C: All of those things.
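For readers who think in code, the five points of the Penta Model as described here map naturally onto a single record. The example values for a medical-diagnosis AI are illustrative assumptions, not from Gilb's presentation:

    # The five points of the Penta Model as one record, filled in for
    # the earlier medical-diagnosis example. Values are illustrative.
    from dataclasses import dataclass

    @dataclass
    class PentaModel:
        design: list[str]       # technical architecture
        functions: list[str]    # what the AI is actually capable of doing
        values: list[str]       # what it is supposed to achieve
        resources: list[str]    # data, compute, human expertise
        constraints: list[str]  # technical, ethical, regulatory limits

    diagnosis_ai = PentaModel(
        design=["neural-network classifier", "hospital-records pipeline"],
        functions=["flag likely diagnoses", "rank differential diagnoses"],
        values=["earlier detection", "reduced misdiagnosis"],
        resources=["anonymized patient data", "GPU cluster", "clinician review time"],
        constraints=["patient-privacy law", "explainability requirements"],
    )
    print(diagnosis_ai.constraints)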
[00:09:19] Speaker B: This Penta model sounds like a really helpful tool for dissecting an AI system and thinking through all its different facets. It's not just about the technology itself, but also the goals, the resources and the limitations that shape its development and impact.
[00:09:34] Speaker C: Absolutely. And by thinking through all of these elements, we can start to get a much clearer picture of the AI's potential benefits and risks.
[00:09:43] Speaker B: Okay.
[00:09:44] Speaker C: We can ask more informed questions about its quality and make more responsible decisions about its development and deployment.
[00:09:50] Speaker B: This has been incredibly insightful.
[00:09:52] Speaker C: Good.
[00:09:52] Speaker B: It feels like we've moved beyond the typical AI hype.
[00:09:55] Speaker C: Yeah.
[00:09:56] Speaker B: And into a much deeper understanding of what's really at stake.
[00:09:59] Speaker C: It's about asking the tough questions. Right. Not just being impressed by what AI can do.
[00:10:04] Speaker B: Yeah.
[00:10:05] Speaker C: But critically examining its impact and ensuring it's aligned with our values.
[00:10:10] Speaker B: Absolutely. And I love how Gilb gives us these practical tools like the Penta Model to help us do that. It's not just about abstract concepts. It's about having a framework for actually analyzing and evaluating AI systems in a meaningful way.
[00:10:25] Speaker C: Exactly. And he doesn't just leave us with a bunch of questions either.
[00:10:28] Speaker B: Oh, right.
[00:10:29] Speaker C: He pushes us to think about solutions.
[00:10:31] Speaker B: Okay.
[00:10:31] Speaker C: He asks that thought-provoking question at the end of his presentation.
[00:10:35] Speaker B: Oh, right. What was it again?
[00:10:36] Speaker C: If you were designing an AI system, what quality would you prioritize above all else?
[00:10:43] Speaker B: Okay.
[00:10:43] Speaker C: And how would you measure its success?
[00:10:45] Speaker B: That's such a powerful question.
[00:10:47] Speaker C: It is.
[00:10:48] Speaker B: It really forces you to get to the heart of what matters most.
[00:10:51] Speaker C: It does.
[00:10:52] Speaker B: For some people, it might be safety. For others, it might be fairness or transparency. And then the challenge of figuring out how to actually measure those things.
[00:11:00] Speaker C: It's not easy.
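One hedged sketch of what an answer could look like in practice: pick a quality, define a scale and a way of measuring it, and set numeric targets. The chosen quality, scale, and numbers below are illustrative assumptions, not from Gilb's presentation:

    # One possible answer to the closing question, written as a
    # measurable quality specification. All values are illustrative.
    FAIRNESS_SPEC = {
        "quality": "fairness",
        "scale":   "max gap in approval rate between demographic groups (%)",
        "meter":   "quarterly audit on a held-out labeled sample",
        "target":  2.0,  # acceptable gap, in percentage points
        "limit":   5.0,  # worst tolerable gap before intervention
    }

    def meets_target(measured_gap: float, spec: dict) -> bool:
        """Success means the measured gap is at or below the target."""
        return measured_gap <= spec["target"]

    print(meets_target(1.4, FAIRNESS_SPEC))  # True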
[00:11:01] Speaker B: Yeah.
[00:11:01] Speaker C: But it's a reminder that building beneficial AI is an ongoing process.
[00:11:06] Speaker B: It is.
[00:11:07] Speaker C: We need to constantly be evaluating, refining, and adapting our approach as we learn more.
[00:11:12] Speaker B: Well, listeners, I don't know about you, but I'm feeling a lot more equipped to navigate this complex world of AI.
[00:11:18] Speaker C: Me too.
[00:11:19] Speaker B: We've gone beyond the buzzwords and into some really deep, insightful territory today.
[00:11:24] Speaker C: I agree.
[00:11:24] Speaker B: If you're interested in diving even deeper into these ideas.
[00:11:28] Speaker C: Yeah.
[00:11:28] Speaker B: We've got links to Tom Gilb's work and all the resources he mentions in his presentation over in the show notes.
[00:11:33] Speaker C: Definitely check those out.
[00:11:35] Speaker B: And as you explore the world of AI, remember to keep asking those critical questions.
[00:11:39] Speaker C: Yes. What qualities matter most to you? How can we measure success?
[00:11:43] Speaker B: Right.
[00:11:44] Speaker C: And how can we ensure that AI truly benefits humanity?
[00:11:47] Speaker B: Those are the big questions.
[00:11:49] Speaker C: Until next time, keep learning, keep exploring, and keep thinking critically.
[00:11:53] Speaker B: Happy diving. All right, bye.
[00:11:56] Speaker A: I hope you enjoyed today's deep dive. If you'd like to discuss any of the questions or anything that you heard in the podcast or would like the team to do a deep dive into a different topic, please join the LinkedIn group and let me know. I look forward to providing you with many more deep dives into the applications of systems thinking.
Take care.