Jon Scheele (00:01)
Welcome to the API connections podcast. In this episode, I continue my conversation with data scientist Koo Ping Shung. In part one, we discussed the complexities of AI governance, bias and the importance of transparency and review mechanisms to address these biases. In part two, we continue our discussion with how organizations face the challenges of implementing data practices, and the impact this has on AI governance. We talk about the importance of documentation and knowledge management, particularly when onboarding new talent, and we discuss the role of AI itself in enhancing documentation efforts with the aim of producing trustable AI.
Jon Scheele (00:49)
So within an organization, what sort of governance reviews do you typically see for the data science teams that you work with?
Koo Ping Shung (01:05)
Actually, if you ask me, I don't really see much. The reason is that companies right now are still struggling with using data in the first place: how to do good reporting, how to set up the right performance metrics, and also visualization, setting up the right dashboards, and so on.
Koo Ping Shung (01:25)
To be fair, most organizations haven't reached that maturity level yet when it comes to using AI, from what I've observed. So I can't say for sure what they should do when it comes to governance itself. But governance, if you look at it, is very similar to audit.
It's very similar to the audit function in any company. At the end of the day, the key is to be able to prevent known abuses first. And how do we do that? Documentation. You have to have a lot of use cases in order to set up a strong framework for AI governance.
And I haven't seen that, even in the more mature companies, in terms of having a library of use cases. In fact, even at the national level, I feel there is a strong need for a library of use cases, both successful and failed, that companies, data scientists and AI engineers can reference, so that they can design AI tools that don't get abused easily, based on all the past learnings and lessons captured in that library. So I think documentation will be important for companies.
There's also a knowledge management component to it, because you retain the knowledge within the organization if you do good documentation and build up a library of use cases.
Jon Scheele (03:19)
Yeah.
So knowledge management is one of the use cases for AI, because documentation is hard, and sharing documentation across an organization is also hard. Many organizations have developed a knowledge management capability, but it requires upkeep.
Jon Scheele (03:49)
They want to make sure that all their contact center agents, for example, know about all the different situations a customer may be in and the different ways each can be addressed. So they've built a knowledge base. But if they have to have people constantly cataloguing these things and then sharing them with everyone so that they know what to expect, that's a challenge. This is why companies have been interested in creating chatbots for internal employees that help them tap the knowledge base of the company. In software development, we see possibilities for using AI to generate test cases, because test cases are also a hard, time-consuming thing to do. I wouldn't say it's rocket science, but it's one of those things that people often don't spend enough time on, because we're too busy trying to write the code and don't spend enough time testing for all the different possible conditions, or thinking about all the error conditions that may occur. But AI can help to generate some of these. Some of them may not be as good as others, but it could help. So...
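As a concrete illustration of that point, here is a minimal sketch of the kind of boundary and error-condition tests an AI assistant might propose for a simple function. The function `parse_discount` and its test cases are hypothetical, invented purely for this example, not something discussed in the episode.

```python
# A sketch of the edge-case tests an AI assistant might propose.
# The function under test (parse_discount) is hypothetical, for illustration only.
import pytest


def parse_discount(value: str) -> float:
    """Parse a percentage string like '15%' into a fraction (0.15)."""
    cleaned = value.strip().rstrip("%")
    if not cleaned:
        raise ValueError("empty discount string")
    pct = float(cleaned)
    if not 0 <= pct <= 100:
        raise ValueError("discount must be between 0 and 100")
    return pct / 100


# Happy path
def test_typical_value():
    assert parse_discount("15%") == pytest.approx(0.15)


# Boundary conditions that are easy to skip when rushing to write code
def test_zero_and_full_discount():
    assert parse_discount("0%") == 0.0
    assert parse_discount("100%") == 1.0


# Error conditions: empty input and out-of-range values
def test_rejects_bad_input():
    with pytest.raises(ValueError):
        parse_discount("")
    with pytest.raises(ValueError):
        parse_discount("150%")
```

The generated tests still need a human review, as noted above, but they cover the boundary and error cases that often get skipped.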
Should we be thinking about tooling for how we're going to test models at the same time as we're figuring out how to create them?
Koo Ping Shung (05:27)
I mean, there are standard ways of testing models, I'm not saying there aren't. Even for LLMs, over the past year we've already got some idea of how to test one LLM against another, sometimes even on human exams, maybe the PSLE here or a bar exam somewhere else, and so on.
But coming back to your point: can we use LLMs? Yes, an LLM can be a tool for us to do knowledge management. But we shouldn't forget that LLMs are trained on documents. If you want the knowledge to be retained within the LLM itself, then the documentation needs to be there regardless. And if you want the knowledge management to be better, all the more you need documentation to train it. So to your point, I would say LLMs are very good for retrieval, to be used for knowledge management and knowledge retrieval. But at the end of the day, companies can't run away from the fact that they still have to do their documentation, and how good the retrieval is will depend on how well the documentation is done. I do feel the same way, though: documentation is important, but no one really wants to do it in the first place, mostly because it's very mundane, and also because you can't see the impact of good documentation until you need it, by which time it can already be too late. Yeah.
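To make that dependency concrete, here is a minimal retrieval sketch: even the simplest retriever behind an LLM-based knowledge chatbot can only rank and return what was actually written down. The documents, the query, and the choice of TF-IDF are illustrative assumptions, not anything prescribed in the conversation.

```python
# Minimal retrieval sketch: the chatbot can only surface what was documented.
# Document texts and the query are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# The internal "knowledge base": whatever the team actually wrote down.
docs = [
    "Customer refunds above $500 need manager approval and a ticket in the CRM.",
    "The nightly ETL job loads sales data into the warehouse at 2am SGT.",
    "Model retraining runs monthly; features are documented in the data dictionary.",
]

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(docs)


def retrieve(query: str, top_k: int = 1) -> list[str]:
    """Return the most relevant documented passages for a question."""
    scores = cosine_similarity(vectorizer.transform([query]), doc_vectors)[0]
    ranked = scores.argsort()[::-1][:top_k]
    return [docs[i] for i in ranked]


# If the refund policy had never been written down, no retriever
# (and no LLM sitting on top of it) could surface it.
print(retrieve("manager approval for customer refunds"))
```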
Jon Scheele (07:11)
Yeah, you don't know if anyone's going to use it, or even whether you yourself are going to use it. So yeah, it can be hard to...
Koo Ping Shung (07:19)
Yeah, I don't deny that documentation has been a big itch for most organizations, one that's very difficult to scratch away. But I have this idea: why not use documentation to train and onboard talent? What I mean is that, at the end of the day, when a new hire is onboarded into a company, he or she will need to learn the ins and outs of the company. And this is where keeping documentation helps.
The person can also take the stance: I don't know whether this is important or not, but that's OK, I'll try to document as much as possible. With that in mind, and now with LLMs, the intern or the new hire can actually use LLMs to do documentation faster, or at least get some guidance on what kind of documentation is needed for a certain project. For instance, you can always go to an LLM and say: I need to do documentation on data collection, can you list the areas I should cover? The LLM will give you some idea of what you need to record, and then you can start documenting from there. I see LLMs helping with what areas to document, and also later on with the retrieval part, for fast retrieval. That means you use an LLM to turn the documentation into a chatbot, and then you can retrieve the necessary information much faster.
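As a sketch of what that prompt might look like in practice, assuming the openai Python package and an API key in the environment; the model name and the wording of the prompt are assumptions for illustration, not recommendations from the conversation.

```python
# Sketch: asking an LLM what to cover when documenting data collection.
# Assumes OPENAI_API_KEY is set; the model name is an assumption.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = (
    "I need to document the data collection step of a machine learning project. "
    "List the areas my documentation should cover, as a short checklist."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumption: any capable chat model would do
    messages=[{"role": "user", "content": prompt}],
)

# Print the suggested outline; the new hire fills in the company-specific details.
print(response.choices[0].message.content)
```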
Jon Scheele (09:07)
Yeah. So I guess there are aspects that are general best practice for every company, and then there are things that are specific to a company and its own knowledge. But to your earlier point, what it means is that if an organization is going to get the most value out of AI, they're going to have to make sure they've got the data foundations right. Data is never going to be perfect, but we should be constantly trying to improve the data that we have and improve the data quality.
Koo Ping Shung (09:44)
Yes.
I think one of the biggest mindset traps that most businesses have is that they always think it's one step to perfection. But once you work with data, you know there's no such thing as one step to perfection. It actually takes many steps, and as you go through the smaller steps, you learn along the way.
But perfection will always be, how shall I say it, just an idea, a vision that you can have, because at the end of the day things will change along the way, and perfection becomes a goalpost that keeps moving because of all these changes. That's not an excuse not to chase that vision of perfection, though. So in terms of collecting data, building the best model and so on, I think a lot of businesses need to realize that once you get into the business of using AI in your processes, it's always a continuous learning process. And the way to tap into that learning process, and not let too much value leak out of it, is to do good documentation along the way as well.
Jon Scheele (11:05)
Yes, it's about process: you need projects to run, but you also need ongoing processes of improvement and continuous improvement. I think it's a bit like renovating your house; once you start, you never really stop. The next time you walk into your house you say, well, that part I did pretty well, but now this part stands out more and I need to fix it. So you need to keep working on it. So I think there are a number of things here: some of it is good housekeeping, but it's also that we all have to keep updating our own knowledge, whether we are technical or not, especially because AI can be a tool for us, and we should also be sharing our own expertise on how AI is being used across the rest of the organization.
Jon Scheele (12:18)
So thanks very much for sharing that perspective. Do you have any message for someone who's looking either to make use of AI models themselves, or to expand their usage in their organization in a trustable and governable way?
Koo Ping Shung (12:48)
OK, so for a good part of this discussion we have been talking about AI governance. But actually, I'm of the opinion, and you have already used the word a few times yourself, which is trustable, that what we want to build is AI that humanity can trust. When you talk about AI ethics, or about AI governance,
all of these are not hitting the crux of the issue. And the crux of the issue is: is there an AI that I can trust to look out for every stakeholder's benefit, and to balance every stakeholder's benefits, risks and, of course, disadvantages, and so on? So I'm of the opinion we should focus on building trust in AI, and not just governance alone, and not just AI ethics alone. In fact, I feel this idea of building ethics into machines is, to use a mild word, not so great. I don't really believe in ethical machines.
Why is that? Because how many of us human beings dare to proclaim that we are ethical in the first place? If there are situations where we are not ethical, if we are not fully 100% ethical human beings in the first place, what puts us in a position to build an ethical machine? I feel it's a bit egotistical to say we can build an ethical machine when we ourselves can't even confirm that we are ethical 100% of the time. So that's AI ethics. And the other thing is this: we humans haven't even come up with a solution that 100% of people can agree on when it comes to the trolley problem. I'm sure most of your audience will know what the trolley problem is. And there's a very good chance that we will come to a discussion about values where we simply don't agree with each other.
That happens a lot when I'm conducting my classes in AI ethics, where I explain that AI ethics is a very difficult concept. I don't think anyone can really be an expert in it, unless you are an expert in philosophy, especially ethics and morals, and also an expert in AI. I haven't found someone who is an expert in both, so I don't think anyone can be an AI ethics expert. But I'm more than happy for someone to disprove me. So again, I'm not a strong proponent of AI ethics; I don't think we can build ethical machines. In fact, I recently finished a book called Escape from Model Land, which talks about human judgment versus machine judgment. If you look at human judgment, we humans know how to juggle many different values. Going back to the trolley problem, say it's three cats versus one human on the other track.
We know how to juggle that and make a decision, saying, no, an animal at the end of the day may not be as precious as a single human life, and so on. So human judgment can juggle many different values. But machine judgment is a bit different. What does machine judgment depend on? It depends on the designer. The designer or model trainer has to fix a certain value to chase. For instance, if we want to chase fairness, the person building the model has to focus on building fairness into the model. It's very difficult to say, I'll build fairness, I'll build transparency, I'll build everything into one model. There's no such model; every model has its give and take between certain values. So this is where there's a difference between human judgment and machine judgment.
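One way to see that give and take concretely: the person training the model has to fix a weight between predictive performance and whichever fairness measure they choose, and that choice decides which model "wins". Everything below, the numbers, the metric, the two candidate models, is a toy illustration, not anything from the episode.

```python
# Toy sketch of the trade-off the designer has to fix up front:
# how much predictive loss to give up for a fairness objective.
# Numbers and the fairness metric (a demographic parity gap) are illustrative only.

def combined_objective(pred_loss: float, parity_gap: float, fairness_weight: float) -> float:
    """Lower is better: prediction loss plus a weighted fairness penalty."""
    return pred_loss + fairness_weight * parity_gap


# Two hypothetical candidate models from a training run.
models = {
    "model_a": {"pred_loss": 0.18, "parity_gap": 0.12},  # more accurate, less fair
    "model_b": {"pred_loss": 0.23, "parity_gap": 0.03},  # less accurate, fairer
}

# The designer's choice of weight decides which model is selected.
for fairness_weight in (0.0, 1.0, 5.0):
    best = min(models, key=lambda m: combined_objective(
        models[m]["pred_loss"], models[m]["parity_gap"], fairness_weight))
    print(f"fairness_weight={fairness_weight}: pick {best}")
```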
In that case, what makes us think we can build an ethical machine, one where 100% of humanity agrees that this is the solution to this particular problem? So ethics is one issue. Governance is the other, as I mentioned. When it comes to governance, there are so many different areas to consider, and as you also mentioned, there are other areas besides. There are so many that it's very difficult to find one person who knows everything; in fact, it's even difficult to find someone who knows at least two of these areas. And you need that, because each area will have its own agenda, its own points it wants to look at. So how do we come to a final conclusion after the meeting? You can't have the governance committee meet
Jon Scheele (18:18)
Hmm.
Koo Ping Shung (18:30)
and then, after that, nothing happens. It can't work that way. So communication is going to be very important here. And this is where, like I said, it's very difficult to juggle within an AI governance committee. There are so many different opinions and areas; at the end of the day, how do we come to the final decision? Those of us who have worked in the corporate world for a long time can foresee that maybe making that one single decision can take six months to a year.
And there's no way AI innovation can wait six months to a year for you. It's not like, OK, we'll wait for your decision before we take the next step of innovation. No, things are going to happen anyway. So I feel AI governance, at the end of the day, yes, it's part of the toolkit to regulate AI, but it shouldn't be the ultimate toolkit if you want more people to use AI. And for more people to use AI, what I need
Koo Ping Shung (19:26)
is to trust that AI, which is why I say trustable AI. I think we should be aiming to build trustable AI, AI you can trust. Very straightforward. Look at it from a human-to-human perspective: Jon, if you didn't trust that I would share something valuable with your audience, you wouldn't have invited me, right? So there's a certain level of trust. And trust is not easy to gain, but it's very easy to lose. So I feel, at the end of the day, the crux of the whole thing is to build trustable AI, AI that a lot of people can trust. Then AI can be used more for humanity.
Jon Scheele (20:16)
Mm-hmm.
Koo Ping Shung (20:16)
I do get a lot of questions; people ask me, Koo, why do you call it trustable AI and not trustworthy AI? Because the common term now is actually trustworthy AI. But I feel that to call it trustworthy, and maybe my English is limited here, I would only call a human trustworthy, and an object trustable.
We all know that humans have a big issue when it comes to inanimate objects, and that is called anthropomorphizing. I don't know whether I'm pronouncing it correctly, but you get the picture: we tend to treat inanimate objects and animals as fellow beings, as human. There are advantages and disadvantages to that, I'm not saying it's bad, but to call it trustworthy AI, it's sort of seeing that AI as a human being, I feel.
Jon Scheele (21:42)
Thank you.
Yeah, I guess my interpretation would be that trustable means you trust it to do a certain thing, but trustworthy means you trust it to act in your interests every time, to look after you. And trustable doesn't necessarily mean that. It means you should be able to predict what the result or outcome will be within a reasonable...
Koo Ping Shung (21:54)
Yeah, I think what you're trying to refer to over here is that trustable is at the task level. Trustworthy is at the entity level. Trustworthy means you say that this particular entity is trustworthy. But maybe the particular entity, when it comes to certain tasks, may not be that trustable. Could be. Could be, right? Yeah.
Jon Scheele (21:58)
Mm.
Koo Ping Shung (22:18)
That's a good point to look at, definitely something to think about more. But for now, at least, like I said, I would rather call it trustable AI, seeing AI as an inanimate, non-human object that is trustable, rather than call it trustworthy AI. But this is just an argument on the semantic side of things. And I'm not denying that I'm also trying to carve out a niche by calling it trustable AI rather than trustworthy AI.
Jon Scheele (22:50)
Okay. Well, that's a lot to think about. And I guess the upshot is that we can't expect two or three simple rules that solve every situation. Actually, the concept of hard and fast rules is a separate discussion, because there are rules-based and principles-based approaches to law and regulation. So thanks very much, Koo, for giving us lots of things to think about. Whether we're technical or not, whether data science is our specialty or not, we all need to keep seeking how to make the best use of AI, but also contribute our own professional knowledge to improving how others use it. So thanks very much.
Koo Ping Shung (23:54)
Thank you for having me here. Thanks Jon.