Machine Learning in Fraud Management
Machine learning and the broader category of artificial intelligence are rightly attracting attention and discussion. These are powerful technologies. But, like many new technology conversations, there’s the suggestion that they can address all use cases.
Maybe focusing on a single use case is the better approach right now. Join Glenbrook’s George Peabody and Nuno Sebastiao, Chairman and CEO of fraud management firm Feedzai, in this refreshing discussion about the role of machine learning in fraud management, some of its limitations, and how services like these fit into an enterprise’s fraud and risk management operations.
Read the transcript below the line
George: Welcome to a Payments on Fire podcast. I’m George Peabody, a partner with Glenbrook Partners, and today it’s my pleasure to have Nuno Sebastiao, who is the Chairman and CEO of Feedzai. So welcome, Nuno. Glad to have you here.
Nuno: Thank you and thank you for having me. I’m really happy to be here.
George: So, for those of you who didn’t read the description of this podcast, and for those of you who don’t know about Feedzai, it’s a fraud and risk management platform, and I’m sure you’re going to correct me, Nuno, but it’s really largely based on machine learning. Machine learning and even artificial intelligence – these are topics that are very much in the ether these days, and I thought it would be important for us to bring that topic here to Payments on Fire because I know they’re used really heavily in financial services. Correct?
Nuno: That’s correct. So Feedzai is a technology company and we built this technology that is essentially a combination of real-time data processing, so it’s applied in real time to payments, and applying machine learning techniques first, but also we use human good sense and really understand what payments look like. What does risk look like? And from the combination of those two, understand basically on the spectrum of risk where transactions sit. Are they extremely good from an extremely good customer so that you can offer products, as an example, or are they just extremely bad and you need to block the transaction? There’s a whole spectrum of risk in the middle from the very bad to the very good. I think what Feedzai brings is the application, yes, of the technological aspect of a large-scale machine learning platform, but also combining that with payment know-how to really understand a merchant’s or a financial institution’s appetite for risk. What’s the risk that they want to be exposed to and how do they control that? That’s what we really bring to the marketplace. That’s why I believe we’ve been fairly successful.
George: I understand that a lot of the advantages of a system like yours, and certainly a lot of the focus of your industry, recently has been on separating out and making sure that we’ve determined who the good guys are as fast as possible. Never mind finding the bad actors, but really trying to serve the good customers as efficiently as possible so that you can reduce the rate of false positives. Before we start talking about how you work and what your track record is with respect to fraud, I’d love to talk with you a little bit and hear from you a little bit and frame this with: How should we be thinking about machine learning? I don’t think we need to go and talk about artificial intelligence and the self-driving car phenomenon, but is there some useful way or some guidance you can give us on how to think about machine learning generally? And then, let’s talk about how we apply it to fraud and risk.
Nuno: Yeah, so the promise of machine learning is basically the ability to understand reality, whatever the use case you’re talking about, be it self-driving cars, be it talking to Amazon Echo, or in our case, Fintech. What machine learning brings to the table is what some people call mass personalization: the ability, in an automated, smart way, to understand what’s going on around you. In our case, it’s understanding what’s going on in the networks, the payment networks. Today there are so many different combinations of payments, transactions, and channels; I can buy things in China and the US at exactly the same time using a phone line or a mobile app or a website. What became really difficult was to combine all of that information and understand what I’m seeing. The promise of machine learning is the ability to digest all of that information and still make sense of it once you determine – and this is where the human part comes into play – what you’re looking for. What are the goals that you have? This compares to what’s been done historically, more or less, where analysts have done this, but in a very simplistic way: they’ve codified it using some sort of rules-based system. When you had single-channel types of payments, this was kind of okay. Let me use an analogy. If you think about self-driving cars, what was there before was essentially the same as having to write rules for every single little road, for every single little junction, every single traffic light, as opposed to telling the system the general behavior: whenever you see a junction or whenever you see a red light, you need to behave like this. So this is the promise: going from each specific instance to general ones and trying to understand behavior. This is, in essence, what machine learning gives you.
Now that being said, it’s a discipline, and although it’s been around for some decades, it’s had huge leaps in the last 5 or 6 years, but still it only works in very narrow fields. That’s why, I’ll use the self-driving car example again, today you can only do it safely for some aspects, like driving on a highway, but not to drive on general roads or all the roads out there. Whereas any average driver can drive on all the roads. That’s the promise of machine learning, but also the care that we all need to have; it is not a one size fits all solution for all the problems out there.
George: It’s so nice to hear a technology guy actually say that. So a couple of things must be supporting this trend, besides the evolution of the algorithms, but you’ve got to have 2 things: you’ve got to have the computing platforms and the resources on which to run the algorithms at scale, and then of course, what was different 7-8 years ago was we didn’t have the large collection of data sources to inform the machine learning, to shove into the algorithms to make sense of all this stuff. Do you use an outsourced platform to do computing, or do you maintain your own data centers and compute gear to make all this happen?
Nuno: Yes, I’ll tell you about some of the SLAs we have. We have SLAs where we have to make a decision with thousands of transactions at the same time in a hundred milliseconds, and in those hundred milliseconds, we have to run all of our machine learning models. You’re absolutely right, there are a couple of things that have happened in the last 5-6 years that made possible what we do today. One was actually the shift of payments from a very poor type of channel, like POS swipe, to a lot richer types of channels, like mobile and online. Even for the POS, today there’s a lot more information to collect at the point of transaction than there was before; so that’s one. The second part is, OK, you can collect all that information, but if you cannot act on it in real time, it’s pretty much useless. The processing platforms, and we’re talking about things like storage mechanisms, or another type of technology that’s called column stores, allow you to look things up in a database very fast and efficiently. Why is this important? If I see a transaction from a given card, I am able to go and look up and load in a very short time frame, in 2 or 3 milliseconds, all of the information I have on that particular merchant, that particular card, that particular consumer. This was just not possible a few years ago, so it combined with that. Also, there’s the evolution you had in machine learning; machine learning has historically been very slow, so you couldn’t apply it in this kind of real-time system. When you ask about how we work as a company, we combined all of this together, so we use some of these elements developed by other organizations, but what we built, and this is where our secret sauce comes in, is our own implementation of some of the machine learning techniques we need. In particular, for this type of technique that’s called random forest, we have our own implementation for two reasons. One of them is performance, and I’ll talk about the second reason in just a minute.
That is the way we combine the ability to collect data, process data, measure – quantify as much as you can of all you’re seeing, and then apply the machine learning techniques. That is our IP. Another reason, probably even more important, why we have our own implementation, is historically, machine learning has been what’s called black box. That is, you have a machine learning model, it gives you a result, a score, an opinion, and you don’t know how it got to that opinion. It just gives a score; trust me. What we’ve done, because we felt that it doesn’t work…
George: That’s not very helpful if a score lands in a manual queue and an analyst has to look at it.
Nuno: Exactly. So not only to make the analyst’s life easier, but also for auditing purposes and even qualification purposes. We basically say this is our opinion, it’s called a score, this is what we think of this transaction, and here’s why. So we have this white box implementation, this white box decisioning, and it’s this thing that’s called a semantic layer that basically can produce what we call human-friendly outputs that explain and justify the underlying machine logic, because you still need people to look at this. In fact, probably even more people. The large organizations we work with, they’ve got thousands of people, because when I was talking about risk management as opposed to fraud, risk management is all about being confident in the sales that you’re doing; it is not about stopping the bad ones. So, as you know, it’s all about reducing friction. A lot of our clients come to us, not because they only want to stop fraud, but because they want to be able to sell with more confidence. Basically they want to understand; they say, “I know this comes from a given country that I’ve never sold to. What do you guys think?” And we say, “Well, we think this transaction is probably good, and here’s why”. Then, when they look at it, they say “OK, I trust this assessment, I’m going to let it go.” Whereas before, they were blocking these types of transactions. This is where machine learning, by being able to look at more granular information, is actually able to contribute to the top line as opposed to just keeping the bad guys at bay.
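To make the "score plus here's why" idea concrete, here is a minimal sketch of a white box score. It is not Feedzai's semantic layer or its random forest implementation; it swaps in a simple additive rule score, and the signal names and weights are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Explained:
    score: float        # 0.0 (looks safe) to 1.0 (looks risky)
    reasons: list[str]  # human-readable justification, strongest first


# Hypothetical signals and weights, chosen only for illustration.
WEIGHTS = {
    "new_country": 0.30,
    "amount_far_above_usual": 0.40,
    "device_never_seen": 0.20,
}


def score_with_reasons(signals: dict[str, bool]) -> Explained:
    """Sum the weights of the signals that fired and report which ones did."""
    fired = [(WEIGHTS[name], name) for name, on in signals.items()
             if on and name in WEIGHTS]
    fired.sort(reverse=True)  # largest contribution first
    score = min(1.0, sum(weight for weight, _ in fired))
    reasons = [f"{name} (+{weight:.2f})" for weight, name in fired]
    return Explained(score=score, reasons=reasons or ["no risk signals fired"])


result = score_with_reasons({
    "new_country": True,
    "amount_far_above_usual": False,
    "device_never_seen": True,
})
print(result.score, result.reasons)
```

An analyst who picks this transaction out of a manual queue sees not only the number but the two signals that produced it, which is the property being described in the conversation.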
George: I recall that you actually develop profiles for individual cards.
George: I think I recall that you develop profiles in fact for the account holders.
George: When a transaction comes in, you’re pulling up those profiles that have also been pre-built clearly, and then comparing them against what? How this profile is matching up against the current transaction context?
Nuno: Yes. When I was talking about measuring, that’s the profile I’m talking about because a machine learning algorithm is only as good as the data that you feed into it. So what we’ve also brought to the table is the ability to really compute with such high levels of granularity, we call it hyper granularity, a segment of one. If I see a payment from a card, from an individual POS at an individual merchant, whatever time of the day from that card, I am going to compute everything I know about that card, that merchant, that region, that POS, that type of merchant, how does it compare with similar merchants at the same time. Imagine if it’s a coffee shop, it’s Friday afternoon, how does that compare with what I’m seeing in similar types of merchants? If it’s a coffee shop, do I have the same type of behavior? If so, ok. If not, why not? It’s this level of granularity and being able to compute that in that very short timeframe that then enables us to feed the models and have the type of accuracy that we have.
George: So you’re taking a set of precomputed profiles for all those touch points, if you will, and then feeding that into the algorithm that’s running currently against the known threats.
Nuno: More than that, more than that. Whenever I see a transaction, and let’s assume you have a transaction with one of the POSs or one of the merchants we’re working with, that transaction immediately triggers updates to all the measurements, all the profiles that are either directly or indirectly affected by that transaction. Why is that important? For instance, when we’re seeing flash attacks, there are leading indicators. By measuring and continuously updating all of those profiles, we’re able to understand whatever the leading indicators are showing. What am I seeing? What is bubbling up that is not normal, that I shouldn’t be seeing? Every time you pay and it goes through our systems, we re-compute everything that is directly or indirectly touched by that transaction.
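The update-on-every-transaction idea can be sketched in a few lines. This is only an illustration under simple assumptions: each profile keeps a count and a running total, and the entity types (`card`, `merchant`, `region`) and field names are hypothetical rather than taken from any real system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Profile:
    count: int = 0
    total: float = 0.0

    def update(self, amount: float) -> None:
        self.count += 1
        self.total += amount

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


# One profile per (entity type, entity id), e.g. ("card", "c-1").
profiles: dict[tuple[str, str], Profile] = defaultdict(Profile)


def ingest(txn: dict) -> list[tuple[str, str]]:
    """Update every profile the transaction touches; return the keys updated."""
    keys = [
        ("card", txn["card"]),
        ("merchant", txn["merchant"]),
        ("region", txn["region"]),
    ]
    for key in keys:
        profiles[key].update(txn["amount"])
    return keys


ingest({"card": "c-1", "merchant": "coffee-42", "region": "CA", "amount": 4.0})
ingest({"card": "c-2", "merchant": "coffee-42", "region": "CA", "amount": 6.0})
print(profiles[("merchant", "coffee-42")].mean)  # average over both payments
```

Because the merchant and region profiles are refreshed by payments from any card, a sudden change in them can surface before any single card looks unusual, which is the leading-indicator point made above.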
George: Is that how you then go out and detect new threats?
Nuno: Yes. That in combination with a couple of things. Also, the machine learning techniques we use and the models we have. Historically, again, in the first versions of the network and machine learning, people tended to have one very big, floppy model that was trained with one year’s worth of data and very static. That is okay to look at things like seasonality or some signals; it’s very bad to look at information or patterns that are evolving very fast, such as flash attacks or attacks on properties that evolve over a matter of minutes or even hours – very, very fast. What we have is the ability to have models that have different perspectives, so one model might be more concerned with the data from the merchant, another model might be more concerned with data from the channel, another model more concerned with data from the consumer. For each of these, we also have different time granularities. What do I mean by this? We have models that are based on a year’s worth of data, and they are good to pick up things like seasonality trends. So if it’s summertime, how does it compare to summertime last year? What’s the growth? Then we have other models that look at, for instance, a month of data, and they’re good for us to understand how Monday compares against two Mondays ago. Or, how does the first Monday of the month compare against the first Monday of last month?
George: Sure, sure.
Nuno: Then we have models that look at a week’s worth of data, then a day’s worth of data, and even as little as 1 hour’s or 5 minutes’ worth of data. Each of them picks up different signals, and it’s the combination of all of them. So whenever data comes to our system, it’s broadcast to all of these models at the same time, and they all come up with an opinion and with a white box explanation of why they believe a certain transaction is fraudulent, or whether it’s risky or not. At the end, we have decisioning models that, based on the individual opinions of all of these models, come up with a general recommendation; we call this an ensemble approach. This is really important because it allows us to deal with things such as data gaps. Imagine I receive information from a channel and I didn’t get any device information. The model that’s based on device information is going to say “I don’t have enough information”, but that in itself can be a signal. Because if you combine that with the time of day or with the other models, well, actually, when you don’t have the information, it’s actually a bad thing. You also might not have information because the terminal didn’t send it or there was a communication problem, or whatever, but it allows us to be tolerant to those types of data gaps. Again, on why this is important, it’s important because what we see in our clients is the channels keep on changing, they keep on coming up with new ways of selling, selling on social channels as opposed to just selling on mobile. It’s really important to build systems for which a new channel is just another data source. It’s just another source of data I’m going to feed into the risk engine, and I don’t have to go through this lengthy retraining of models, building of new models, that was historically very costly in time. Let me give you some numbers based on this. We can deploy in a top 50 merchant in the US, as long as that data is available, in a matter of weeks.
During that process, we tune our models, our plain vanilla models to the specific information of that merchant. That was the process that before would take months or even half a year to get implemented.
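The ensemble approach described above, including the point that a missing signal is itself informative, can be sketched as follows. This is a toy under stated assumptions, not Feedzai's decisioning model: the three member models, their thresholds, the averaging rule, and the `MISSING_PENALTY` value are all hypothetical.

```python
from typing import Callable, Optional

# Each model returns a risk score in [0, 1], or None when it lacks the data it needs.
Model = Callable[[dict], Optional[float]]


def yearly_model(txn: dict) -> Optional[float]:
    # Hypothetical seasonality check: amount versus the same season last year.
    return 0.2 if txn["amount"] <= 2 * txn["seasonal_avg"] else 0.7


def five_minute_model(txn: dict) -> Optional[float]:
    # Hypothetical burst check: many payments on the card within five minutes.
    return 0.9 if txn["txns_last_5_min"] >= 5 else 0.1


def device_model(txn: dict) -> Optional[float]:
    # Abstain when no device information arrived with the transaction.
    if txn.get("device_id") is None:
        return None
    return 0.1 if txn["device_id"] in txn["known_devices"] else 0.6


MODELS: dict[str, Model] = {
    "yearly": yearly_model,
    "five_minute": five_minute_model,
    "device": device_model,
}

MISSING_PENALTY = 0.1  # assumption: a data gap counts as a weak risk signal


def decide(txn: dict) -> tuple[float, list[str]]:
    """Broadcast the transaction to every model and combine their opinions."""
    opinions = {name: model(txn) for name, model in MODELS.items()}
    scored = [s for s in opinions.values() if s is not None]
    gaps = [name for name, s in opinions.items() if s is None]
    base = sum(scored) / len(scored) if scored else 0.5
    return min(1.0, base + MISSING_PENALTY * len(gaps)), gaps


risk, gaps = decide({"amount": 10.0, "seasonal_avg": 8.0, "txns_last_5_min": 1})
print(round(risk, 2), gaps)
```

The decision still comes back when the device model abstains, and the gap nudges the combined score upward instead of blocking the pipeline. Adding a new channel in this shape means registering one more entry in `MODELS` rather than retraining a single large model.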
George: So you will take a historical data feed from this new client, pull that in, take a look at it, and start to go as a base, and then start to do more once you start running in real time.
Nuno: Yeah. Run it through our system, our models start to pick up on this historical information on the indicators of that particular merchant, and then you just plug in and turn the live system on, and boom, immediately you have results for that merchant, trained and tailored for that merchant, as opposed to just a plain vanilla type of medium approach.
George: Are you able to get data that, for example, identifies an in app payment on a smart phone versus a buy button in an app?
Nuno: Yes. For instance, we get all sorts of data, I mean it can even be, for instance, behavioral data as you navigate an app, or as you navigate a website, or as you navigate a social medium.
George: Are you measuring that data? Is that part of what Feedzai does, or do you work with a partner?
Nuno: Yes. All of those signals come into us. We’re not, for some of them we work with partners, we’re not a collection medium. We work with some partners because, for instance, there’s organizations that work better with Android, where others work better with iPhone, where others work better with a desktop. So what we have is the data collection – we’re not the data collector. What we are is, once that data is collected, we are the data processor. The brains behind the data.
George: And I would imagine that actually establishing those data connections takes some time before you’re ready to get turned on.
Nuno: It depends. In some cases they’re already there. If there’s a need to turn on a new one, yeah, that has an impact. You need to have an SDK or an API to a website or to an app and then you need to go through the full release of the app again with the SDK inside. What we found out is that most of the clients, one way or the other, more or less sophisticated, they’re already collecting that information. The challenge has been…
George: What do I do with this stuff?
Nuno: Yeah. More than that, they were processing it separately, they were not combining. So not only were they not processing, but they were not combining.
George: So in the best case, they were looking at it on a siloed basis.
Nuno: Exactly. Let me give you an example. Combining information that you get from the network operator, so the telecom, with information that you get from the device, because it’s very easy for you to trick a device so that it thinks it’s in New York, but the cell tower is telling you that that device is actually in California. Combining those two, it’s a huge improvement. Well, you can trick the device very easily; it’s harder to trick the cell tower because you don’t have access to it. A lot of our customers are not combining that information. This is just an example. So that’s where we really come in, we are combining all of that different information.
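The device-versus-cell-tower comparison can be illustrated with a great-circle distance check. This is a minimal sketch: the 50 km threshold and the function names are assumptions made for the example, and the coordinates are approximate city centers.

```python
import math


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points on Earth, in kilometers."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def location_mismatch(device_latlon: tuple[float, float],
                      tower_latlon: tuple[float, float],
                      max_km: float = 50.0) -> bool:
    """True when the device-reported position is far from the cell tower's."""
    return haversine_km(*device_latlon, *tower_latlon) > max_km


new_york = (40.71, -74.01)      # what the device claims
los_angeles = (34.05, -118.24)  # where the cell tower says it is
print(location_mismatch(new_york, los_angeles))
```

Neither source is decisive alone; the signal comes from the disagreement between a value that is easy to spoof and one that is hard to spoof.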
George: So, Nuno, a couple of questions. Obviously, part of your pitch is to improve sales, not turn away good sales. Can you give us some metrics regarding maybe the sales lift you’ve seen, but also the fraud reduction?
Nuno: Yes. I can give you an example. In one of the largest Fintech organizations, one of the largest financial institutions in the US, they had a problem, and the problem is as follows. Say you are a Fintech organization that wants to go branchless, so you’re a bank and you basically want to start. For millennials, or even for you and me, when was the last time we walked into a branch? It’s been months since I walked into a branch. How do you, as a bank, still work with people and do it in a confident way? So the challenge they had was how could they keep on doing banking safely, even more of it, and not reject people? The online process to open an account or to do a transaction was very slow, and they were rejecting a lot of people because they didn’t have enough information. What they challenged us with was “Well, we’re rejecting about half of the people that come onto our website and apply for our financial products. How much can you guys provide?” So we have the numbers, this is in production. We can safely onboard around 90% of the people. At the same risk level, go from 50% to 90%, and then internally manage the risk. Will all the clients be offered all the same products? No. Will they immediately allow all the people that successfully went through to do the same type of operations? No. But they were able to bring people onboard, so through the door, and then manage risk. Staying at the same false positive rate, at the same level of risk, go from 50% to 90%. That is huge! That is amazing.
George: One works, the other is a social media nightmare.
Nuno: Yeah, exactly.
George: How has the role of the analyst or the staffing of the analysts changed in that instance? Maybe more generally, what are you seeing pre- and post-deployment?
Nuno: They actually have more work. This is why I was saying that analysts are still needed. Think about it, it’s the same rate of risk, but there are a lot more people going through the door. You have a lot more clients, and once they’re inside, basically, the work that gets done is, you say “Okay, fine. These people, we validated them; they fit the criteria. What kind of products can we offer them?” Actually, then what happens is their call centers reach out to those people, and then they work with the clients: “We’ve seen this information from you; however, it’s not yet enough. We need more on this.” Because they are in the approved-but-under-review group, there’s actually more work, but it’s fundamentally different because it goes from an outright “no, I don’t have enough information about you” to “I’ve seen this type of behavior, I know that this person is trustworthy, I need a little bit more. Let the person come through the door, and then we’ll manage it internally”. If you’re a customer, think about it from…
George: A qualitative difference, I’m sure.
Nuno: Exactly. You’re in, you have a product, you have an offer, and then someone reaches out to you and says “Mr. So and So, we’ve seen that you’ve applied. We’re very happy that you’ve been approved. What about this? What about that?” From a user perspective, it’s actually similar to what you’d have in a branch. The same applies to online merchants. If you are a merchant, you concern yourself with different things. Let me give you an example. A recent client we had, their problem was, for them to keep their brand, what they needed was to do these promotions, these limited editions. They started to receive complaints that the limited editions of the products that they were putting out were not reaching some regions, some states, or even some countries in Europe, as an example. They didn’t understand, because they were actually sending the items to those regions. What happened is, they had people that would buy them all in bulk, and then would resell them on Craigslist or eBay, and those items, they were collector’s items, they were very expensive on the secondary market. This was really bad for their brand, not from a financial standpoint because they’re still selling the items, but from a reputation standpoint it was really bad and they wanted to deal with that. So we did that for them. As I said, if you’re a merchant and you have an online brand, or if you’re a financial institution, it’s all about managing risk, but essentially doing more for your customers. Be there when customers want to have a credit card, open an account, pay something, or in this case, receive the limited edition item. This is very, very important. This is a different take from just “we stop bad transactions”. Yes, we do that, and very well, but it’s a lot more than that. It’s all about managing risk.
George: Nuno, thank you very much, it’s been really interesting. One last question. I’m very curious, particularly with the EMV shift happening here in the US. What are you seeing in the card not present world? Are there particular new threats that you’re observing or different rates? How are rates changing?
Nuno: Yes. So the biggest thing we’ve seen is the way hacks are perpetrated. Before EMV, it was very easy to clone a card; if you had a box, you could do that. Now, that’s harder. What’s the next weakest link? Where the tokens have been secured; that is what we’re seeing a lot. Websites or merchants or even financial institutions are hacked and the payment information, the tokens, are stolen in bulk. So as opposed to cloning one card at a time, you hack into some merchant or some financial institution. There’s a very public example, very recently, with Swift. I don’t know if you’ve seen it. They’ve hacked one of the weakest countries they had, and all of a sudden the entire network is compromised. This is what’s going on right now. This was kind of expected; we’ve seen in Europe that once you plug a hole, then what’s the next weakest one? If you combine that with the shift to CNP and the shift to online merchants and online payments, everyone today has huge vaults of payment information, and from what we’ve seen in the industry, there is still a ways to go for things like tokenization, things like securing those databases. It’s still way too easy to access those databases and just take thousands, hundreds of thousands of pieces of payment information, be it tokens, or be it, as we’ve seen in some clients, plain text card information.
George: It’s still amazing. I talked recently with someone who told me the story of a VP in marketing still thinking it’s fine to have 5 million clear text card numbers in the marketing database.
Nuno: Let me put it like that, we have very strong policies in place. We never want to receive the card information, because that protects them, and it protects us. So we say, tokenize it, we have vaults for it, we have caching mechanisms for it. Some clients still send clear text card information. This is a cultural issue, this is a security issue. As you know, hacks are perpetrated a lot of times using social engineering. This is still the case. There’s still a lot of information out there that is not secured, and as you were saying, people think it’s OK.
Nuno: In the next couple of years, this will be the single biggest source of hacks and of data breaches and payment information being stolen. It’s not that the end payment nodes aren’t safe. They are. Not all of them, there’s still a lot of people who need to update a lot of POS software, but the biggest, largest threat right now is the systems where the vaults and databases are not secured.
George: And those vaults keep getting more concentrated moving up into the deeper end of the network with payment tokens.
George: Anybody from a processor or card network listening to this, you know what you have to do.
Nuno: Yep. Secure those databases.
George: Great. Well, Nuno, thank you so much. I really appreciate the conversation. I learned a lot and look forward to talking to you again soon.
Nuno: Thank you so much. I’m the one who appreciates it. Thank you so much.