What is AI? / Data for bluffers #6

31 January 2022

It’s time we settle this once and for all…maybe. This week Tom and Ed get to the bottom of the question “what is AI?”

Tom

Welcome back to another episode of Data for Bluffers. This week we wanted to tackle a topic that is on the tip of a lot of people’s tongues: what is AI? We thought we’d have a conversation and try to bottom out a definition, so people are able to walk away knowing the right questions to ask about AI and really understanding what it is. So, without further ado, let’s go over to the conversation between me and Ed, and I hope you enjoy it. Hey Ed. Good to see you again. How are you doing?

Ed

I’m good, thank you. How are you, Tom?

Tom

Very well. Very well. I must say I have been excited about this episode since we started planning what we were gonna talk about, because I must hear this topic talked about in a thousand different ways: it’s apocalyptic, or it’s the best thing ever. But really it’s shrouded in misunderstanding and buzzwords. Fundamentally, the question is: what is AI? We hear about this word every day. It’s in the news. I just thought today, let’s try and crack it between us. So, as our resident data scientist, Ed, what is AI?

Ed

I really hope we could get to that, you know, settle this once and for all. I’m afraid I don’t expect to get there today. When you’re thinking about AI and artificial intelligence, it’s always worth thinking about the history of the word, where it’s come from, and how that leads to what it means today for everyone who’s using it. From that point of view, the best definition of AI that I can come up with is that it’s the cutting edge of what computers can do from an intelligence perspective. So what sort of tasks can computers do? Usually what we really mean is tasks that have a human perspective to them, that we might imagine a human doing. So some sort of human intelligence, and that’s where the intelligence part of it comes in.

Ed

So, for example, tasks that involve learning or some sort of problem solving, both of which are themselves slightly vague concepts. But it’s quite well known that what was called AI 20 years ago is not called AI today. There are tasks that were considered AI 20 years ago but are now run of the mill. They’re done all the time, and that kind of stops them being AI. In that way, AI really is the forefront of what computers are able to do. It’s worth mentioning now that there’s a slightly distinct concept that we should probably talk about, which is artificial general intelligence. That’s the idea of a real computer brain, you know, a brain that can think by itself and do tasks that are assigned to it without any human interaction. And it’s worth saying, before anyone gets too worried, that we’re a long, long way from that as it is.

Tom

Okay. So we’re almost splitting that into two, then. The phrase I hear a lot is task-based AI: something that’s been built to do a very specific job. It might be to open my door when I come home, but predict when I’m gonna be home and make sure it’s me, or whatever it might be. That’s a specific task. Versus, you know, at this point someone always wheels out a Terminator quote or something, and that’s the doomsday scenario everyone goes for. The world we live in today is still very much focused around tasks. Is that fair?

Ed

Yeah, that’s definitely fair. And then to add further complication to the issue, when a lot of people talk about AI, especially in the marketing space, that’s often tasks that are actually quite far away from the cutting edge of what computers can do, but are being applied for the first time to new problems. I think that’s why people think of it as AI and brand it as AI: it’s new, it’s exciting, and it’s giving people the capability to do things they weren’t able to do before. That makes it feel cutting edge, even if the technology and the maths and the statistics and the machine learning behind it aren’t at the cutting edge.

Tom

So the definition has shifted. AI’s been around since the fifties, right? So the definition has shifted over the last 70 years to what’s at the forefront. You then dropped in a load of words, you know, machine learning and stats. You hear these words used interchangeably, right? Sometimes someone will drop an AI bomb. Sometimes they’ll then try and maybe sound more technical and use machine learning. And it all goes around. How do we bring those things together? Can you bring them together? Do they hinge off each other in any way that we can sensibly string together?

Ed

Most certainly. These terms are all interrelated. They all form part of what is broadly described as AI in today’s world. It works in a sort of nested fashion, almost like Russian dolls. If you think about a problem you’re trying to solve, then the simplest automatic implementation to solve that problem is to use some rules of thumb. For example, you might have a rule of thumb that says: if someone has opened an email from us in the last three months, then they’re primed for a phone call, so we’ll give them a phone call this month. That’s a known rule, or a rule of thumb, or in technical speak what we call a heuristic. Heuristics are things that are learned from human understanding and pattern recognition done by humans, and they’re the simplest way to solve a problem. Then, one level more complicated than that, or more mathematical and more technical, I’d say, we can talk about statistics.

Ed

That’s really asking: okay, say someone has opened our email in the last three months, and we wanna phone them. But how much more likely does that make them to buy from us? Therefore, how much of our time should we spend phoning those people versus just making random calls to new customers? So statistics is about quantifying, in relatively well-structured ways, what our heuristics are telling us and how we can maximize our benefit from them. Next in our level of complexity, I think, we can talk about machine learning, which is something you mentioned and we hear a lot. Machine learning is really where, instead of knowing the trend you’re looking for, you might say: okay, I know that in the rate at which people are opening emails, there’s some information in terms of how useful a sales call to them is gonna be.

Ed

So what we’re gonna do is use a computer that can learn from that data to develop a new calling strategy. It is a lot more open. You’re letting the computer find the patterns in the data, rather than looking for the pattern that you’ve already identified with your heuristic. It’s worth saying now, and we’ll come back to this at the end as well, that that approach is not absent of the heuristic. You’re still assuming that there’s some information in the email opening data which is gonna drive sales. So it’s not independent of heuristics; it’s just another level of abstraction.
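The first two rungs of the email example can be sketched in a few lines of Python. This is only an illustration: the `Contact` record, the function names and the toy call history are all assumptions made up for the sketch, not anything described in the episode. The heuristic says who to call; the statistic quantifies how much the heuristic is worth.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    opened_email_recently: bool  # opened an email in the last three months
    bought_after_call: bool      # outcome of a past sales call


def heuristic_should_call(contact: Contact) -> bool:
    """Rule of thumb: recent email openers are primed for a call."""
    return contact.opened_email_recently


def conversion_rate(contacts: list[Contact]) -> float:
    """Share of contacts who bought after being called."""
    if not contacts:
        return 0.0
    return sum(c.bought_after_call for c in contacts) / len(contacts)


def lift_from_opening(history: list[Contact]) -> float:
    """Statistics: how many times more likely openers are to buy than non-openers."""
    openers = [c for c in history if c.opened_email_recently]
    others = [c for c in history if not c.opened_email_recently]
    base = conversion_rate(others)
    if base == 0.0:
        raise ValueError("no purchases among non-openers; lift is undefined")
    return conversion_rate(openers) / base


# Made-up history: 4 openers (2 bought) and 10 non-openers (1 bought).
history = (
    [Contact(True, True)] * 2
    + [Contact(True, False)] * 2
    + [Contact(False, True)] * 1
    + [Contact(False, False)] * 9
)
lift = lift_from_opening(history)
print(f"Openers are {lift:.1f}x as likely to buy as non-openers")
```

A machine learning approach would go one step further and let the computer search the email data for patterns itself, but it would still rest on the same assumption: that email opening carries information about sales.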

Tom

That was gonna be my question. It feels like with a lot of these problems we’ve got a lot of implied knowledge as humans, right? We’ve got a lot of domain expertise. For a lot of the problems, we know how they come together, and you’d think that would be hugely valuable in building these systems out. So we can input our influence into these systems and let them figure out the bits that we don’t know. Is that really what we’re trying to do?

Ed

Most definitely. You can’t really have a statistical model or do statistics without some heuristics about what data to look at, and you can’t do machine learning without some statistics that go into it. Then on top of that there’s more complex machine learning, which some people call deep machine learning, where you’re basically throwing in larger amounts of data and less information about it. But even there, you are drawing on some heuristics about the relationships that might be available. Artificial general intelligence, which we talked about briefly, is where the heuristics go away, and it’s a machine that can do what it wants. You can ask the machine a question and it gives you an answer, and you haven’t had to, as a human, feed any learning information into it. And that’s just something that doesn’t exist. So really, I guess what I’m saying is that heuristics sit at the center of all of it. You can’t do statistics without heuristics, you can’t do machine learning without statistics, and you can’t do AI without some form of learning system where humans are teaching machines how to do the things they want them to achieve in the end.

Tom

I guess that makes me think, then: if I’m buying some AI off the shelf, you know, I’m going to go to the supermarket and buy a box of AI, as a buyer you really need to understand the rules of thumb that have gone into building that system, right? In terms of the people who’ve developed it, the people who’ve designed it, the people who’ve chosen the data to use in it: what rules of thumb have they used to build it, and do those rules of thumb align to your worldview or your domain expertise? Is that fair?

Ed

That’s definitely fair. I think the key for understanding any AI system that you are buying prepackaged is to ask the question: what is the effect you’re trying to capitalize on? So what is the heuristic that backs up the idea that there is an effect here? In our email sales example, the effect they’re trying to capitalize on is that there’s information in how often someone opens an email that tells you they’re more likely to buy: if you give them a call, you are more likely to have a successful sales call with them. Then the second, more technical angle is to understand how that is implemented and applied in the particular situation. Is that a sensible way of doing things? So an example might be, say, you are looking at a segmentation system.

Ed

So a system that basically is relying on the heuristic that there’s something about your customers which connects them together. That means that other businesses that are like them, or similar to your customers in some way, are more likely to buy from you, and that gives you a targeting system. That’s the segmentation approach, where you’re segmenting the population of businesses into categories. You’re saying: okay, we do really well with businesses in this category, right? That might be coffee shops, or it might be small businesses based on business parks on the edge of cities. You then say: okay, I’m gonna target my marketing at those places, because that’s where my marketing does well. Now, that’s the segmentation, that’s your heuristic for how the system is gonna work, but it’s also important to understand the implementation. As a kind of silly example: the central idea is that we’re gonna select businesses similar to our customers, but then your segmentation works based on, I don’t know, the number of letters in the company’s name. So you go out and find companies which have similar length names to your customers. Then even though the initial heuristic was right, the way that’s being applied probably isn’t.

Tom

It’s kind of shocking in a way. It really reminds me of when we were last talking about statistics: it’s the sum of the quality of the input. We were talking very specifically around data then. But it makes me think here, there’s a lot of great tools, right, but there’s a lot of tools being built under this AI umbrella, and actually they’re all dependent on the quality of these rules of thumb. And I guess it’s sometimes gonna come back to: are people using rules of thumb, or are they using an assumption? Because in your example there’s no rule of thumb that companies with the same length name have the same buying behavior. But me, as a developer building that, could say: well, I’ll assume that works, right?

Tom

And I’ll use that as a heuristic. So when it comes to these systems, it’s really interesting to try and work out what has been built using a well-established rule of thumb, or a rule of thumb that’s got some evidence behind it, and what’s actually been built just with some assumption. And then as we’ve evolved the systems and the tools on top of it, those assumptions have kind of just been forgotten, you know, and they’re almost just hidden. I dunno, it’s quite a fascinating and horrifying discovery.

Ed

Well, it’s definitely the case that all of these systems are built on some form of human input. I do quite like a phrase that I’ve heard quite a few times, which is augmented intelligence instead of artificial intelligence. So thinking of AI systems as systems that help humans be more intelligent, as opposed to replacing human intelligence. Instead of not using humans, what you’re actually doing is allowing the humans who are using the system to think quicker, or think bigger, or think smarter. So either test out lots of things much more quickly, or think about bigger data sets, more columns or more different variables at once. And then also optimize amongst all the possible things they could be thinking about, to try and find the most efficient route or what would be the best thing to implement next, right? The sort of next-best-step approach.

Tom

Changing gear a little bit, then. Another phrase that I hear bandied around a lot is supervised machine learning. It’s another one that gets thrown around. Where does that fit in this conversation that we’ve had so far? How would I position that?

Ed

Broadly speaking, machine learning falls into two categories, supervised or unsupervised. Supervised machine learning is where you are trying to predict something, or trying to classify something, which either is or isn’t the case. So is this a picture of a cat or not a cat? It’s supervised in the sense that you have lots of examples where you know what the answer is for that system to learn from. The way that system learns, effectively, is that it guesses: this picture is a cat. Then you as a human, but normally through a big database, tell the computer: you’ve got that one right, or you’ve got that one wrong. It then uses that to update the way that it’s guessing whether it’s a cat or not, to the point where it’s not guessing, it has a system. Guessing is kind of underplaying what it’s actually doing. So that’s supervised machine learning. Unsupervised machine learning is more sort of pattern spotting. Segmentation, which we’ve already spoken about, is a good example. A segmentation is not supervised in the sense that we don’t know the perfect segmentation of the businesses in the UK, but we can use lots of algorithms to find groups that they fit into, which have similar behaviors. And usually the classification you’re looking for is that the groups are sufficiently distinct from each other.
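The guess, get told right or wrong, then update loop that Ed describes can be sketched with a perceptron, one of the simplest supervised learners. This is a swapped-in textbook technique chosen for illustration, not a method named in the episode, and the three yes/no "cat" features and the labelled examples are invented. Real image classifiers work on pixels and are far more involved.

```python
def predict(weights, bias, features):
    """Guess 1 (cat) if the weighted score is positive, otherwise 0 (not a cat)."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train(examples, epochs=20, learning_rate=1.0):
    """Perceptron rule: only change the weights when a guess turns out wrong."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            guess = predict(weights, bias, features)
            error = label - guess  # 0 if right, +1 or -1 if wrong
            if error != 0:
                weights = [
                    w + learning_rate * error * x
                    for w, x in zip(weights, features)
                ]
                bias += learning_rate * error
    return weights, bias


# Hypothetical features: (has_whiskers, has_pointed_ears, barks), with known answers.
examples = [
    ((1, 1, 0), 1),  # cat
    ((1, 0, 0), 1),  # cat
    ((1, 1, 1), 0),  # not a cat
    ((0, 0, 1), 0),  # not a cat
    ((0, 0, 0), 0),  # not a cat
]

weights, bias = train(examples)
for features, label in examples:
    print(features, "guess:", predict(weights, bias, features), "answer:", label)
```

The labelled examples play the role of the "big database" that tells the computer whether each guess was right. The choice of features is still a human decision, which is where the heuristics come back in.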

Tom

Yeah. Okay. On your cat example, then, of the supervised systems, what’s the risk of me introducing some bias? I love ginger tabbies, as an example. That’s just a random example, I like all cats. But if I showed our system a million pictures of ginger tabbies and it learnt them really well, and then tomorrow I showed it a picture of a Siamese cat, it might get that wrong, right? Because they don’t share as many characteristics. Is there a risk of me as an individual introducing bias into that system? Or does the system take care of that?

Ed

So this is where the skill and the science of machine learning comes into it. Humans can interact with the learning process to make sure it learns features that aren’t, in that case, color based. You can make sure that it’s looking at the shapes of the objects and not the colors, and effectively try and make up for deficiencies in your supervised process. So make up for the deficiencies in your training data, is what we would say.

Tom

And then flipping it to the unsupervised. I guess my head’s turning to the big change that’s happened in the world in the last 24 months in terms of our behavior as people, right? We are doing this remotely. We spend less time in the office. We travel less. All of these sorts of things. How does that work in these unsupervised ones? Let’s say they’ve been learning for 15 years about how humans move, and we’re gonna do some segmentation or whatever based on the last 15 years of data. Over the last two years, the world’s become a very different place. I appreciate there’s no such system, but hypothetically, would a system like that learn? Or, when we have seismic changes, is it then the responsibility of the people who own and maintain these things to go back in and change some of the rules, change some of the heuristics, I guess?

Ed

At the very core level, the system can’t adapt to a world it’s not seen, right? The supervised system has to learn again. Now, there’s a few things that save you from that. There’s the sort of idealistic approach, which is that your system is identifying the very important things that are going on all the time anyway. This is where statistics gets very difficult, but machine learning can improve on these things: the situation where you are outside the boundaries of what you’ve seen before. To almost flip the problem on its head, that’s actually where your rules of thumb become really good. The best example, as a slight abstraction, is the laws of physics. The laws of physics are well developed. They allow you to project outside the space you’ve seen before.

Ed

Newton’s law of acceleration worked for things going much faster than they ever traveled in Newton’s day, until ultimately that did hit a point where it wasn’t the case anymore. But laws like that allow you to push the boundaries a lot more than a statistical model, which is sort of only valid in the dataset it’s learned from. Then there’s the second thing that can save you. So the first one is having your system learn core, or key, values. The second thing that can save you is good system design, which recognizes when it’s going wrong and then falls back to some heuristics. It has a fallback plan to say: actually, I’m not sure what’s going on here, so I’m gonna fall back to some simpler rules and sort of start the learning process again.
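One way to picture that fallback design is a thin wrapper that keeps score of how the learned model has been doing recently and switches to a simple rule when the model starts going wrong. The sketch below is an assumption-laden illustration: the `FallbackPredictor` class, the window size, the error threshold and the toy model and rule are all hypothetical, and retraining the model is left out.

```python
from collections import deque


class FallbackPredictor:
    """Use the learned model while it behaves; fall back to a heuristic when it does not."""

    def __init__(self, model, heuristic, window=5, max_error_rate=0.4):
        self.model = model          # learned model: any callable x -> prediction
        self.heuristic = heuristic  # simple rule of thumb: callable x -> prediction
        self.recent_errors = deque(maxlen=window)  # True where the model was wrong
        self.max_error_rate = max_error_rate

    def using_fallback(self):
        # Wait for a full window of outcomes before judging the model.
        if len(self.recent_errors) < self.recent_errors.maxlen:
            return False
        error_rate = sum(self.recent_errors) / len(self.recent_errors)
        return error_rate > self.max_error_rate

    def predict(self, x):
        return self.heuristic(x) if self.using_fallback() else self.model(x)

    def record_outcome(self, x, actual):
        # Always score the learned model, so it is possible to see when it recovers.
        self.recent_errors.append(self.model(x) != actual)


# Toy setup: the model learned "x > 10 means yes", but the world has since changed.
predictor = FallbackPredictor(model=lambda x: x > 10, heuristic=lambda x: x > 50)
before = predictor.predict(30)  # model is still trusted at this point

for x in (20, 30, 40, 20, 30):
    predictor.record_outcome(x, actual=False)  # the model's "yes" is now wrong

after = predictor.predict(30)  # heuristic has taken over
print(before, after, predictor.using_fallback())
```

In a live system, the switch to the fallback would usually also raise an alert and trigger retraining on recent data, which is the "start the learning process again" part.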

Tom

Okay. Yeah. And then if those heuristics don’t work, then it blows up?

Ed

Well, if you’re running your system live, so it’s making active decisions live, then yeah, you can end up in a very worrying situation. You do have these problems where systems cascade in that way. So one system produces a result which is not expected, and that propagates through a load of other systems that are depending on it.

Tom

So I took us down a doomsday route there. I was just interested to explore. I think what I’m starting to understand from the conversation is that it’s really the same as what we talked about the other week, right? If you’re putting in good quality data or good quality heuristics, rules of thumb, these things actually exist to make our lives easier. I like your phrase, to augment our intelligence. How do they enable us to do what we do, but do it better? As opposed to anything scary or any of the apocalyptic notions that we often see when we talk about AI. It’s really an opportunity to take what we know, take our rules of thumb, apply some statistical thinking to them, and then cast that forward even further to make everything we do much easier.

Ed

Yeah, definitely. And I think it does go back to what we were saying at the beginning, about the fact that AI is often used to mean the cutting edge of computing intelligence, and that cutting edge is a little bit domain specific. So things that are actually quite well established in some fields might not be that well established in others. I think it’s important to realize that you can make progress without going the whole hog and waiting for the artificial brain, right?

Tom

Yeah. And it’s not an either-or, because you almost hear statistics sometimes talked about in a bit of a disparaging light, you know: oh, why is it not AI? Sometimes heuristics are enough to solve the problem, right? Sometimes you might need some statistical approach and statistical thinking. Sometimes you might need to build a machine learning system to do it. It’s not a case of either-or. It’s about using the right tool for the problem in front of us.

Ed

Yeah, definitely. I mean, last year we had the pandemic, and all of the modeling for the pandemic was really done using rules of thumb. Well-researched rules of thumb, but that’s what they really were. There were machine learning approaches, and now there are some potentially more successful machine learning approaches, but the most successful approaches, particularly in the early days when we knew very little, were statistics-based approaches built on some very simple rules of thumb. And yeah, they used large computer simulation, but large computer simulation is not really AI. It’s kind of the opposite of AI.

Tom

Yeah. I think it’s been really interesting to understand that ladder, if you like. We start with heuristics, rules of thumb. Ultimately it’s: what do we know? What are those rules? You build on that with statistics, you can build that up further into machine learning, and then we can keep going further up that ladder towards, eventually, the general intelligence that I guess people are aspiring to, but we’re obviously not there yet. From your point of view, if I’m gonna go and buy some tools tomorrow that are, you know, AI for everything they do, what’s the piece of advice you’d give me, what to ask those teams, to really kick the tires on it?

Ed

So, from a marketing perspective, understand the effect they’re trying to capitalize on, be that that similar companies to your customers are more likely to be your next customer, or that people’s past behavior and their interactions with you dictate their future behavior. These are kind of obvious things, but there are a lot of AI-based businesses that sort of try and hide what they’re doing. And really I’d say: find out what their heuristics, their rules of thumb, are, because the system is never gonna work if those things aren’t true. Because of that, they’re really central to any solution. You have to believe in those things being true, otherwise the system will not work, unless you approach it from a scientific point of view, which is that you’re gonna try lots of things and keep the one that works.

Tom

Okay. So the black box excuse doesn’t cut it.

Ed

No, nothing is a black box, right?

Tom

I feel I’ve had a bit of a workout mentally, but I think I now have a better understanding of what AI is. Speak to you soon. Thank you for listening. I hope you found the conversation with Ed on what is AI interesting. As ever, if you enjoyed it, please leave us a rating and share it with your friends and colleagues who might find it interesting.
