Bank Fraud Detection¶
Banks and Insurance companies lose billions of dollars every year to fraud. Traditional methods of fraud detection play an important role in minimizing these losses. However, increasingly sophisticated fraudsters have developed a variety of ways to elude discovery, both by working together and by leveraging various other means of constructing false identities.
Reference Card : https://neo4j.com/docs/cypher-refcard/current/
Original Source: https://github.com/neo4j-contrib/gists/blob/master/other/BankFraudDetection.adoc

Using Graphs¶
Graph databases offer new methods of uncovering fraud rings and other sophisticated scams with a high degree of accuracy, and are capable of stopping advanced fraud scenarios in real time.
The following nodes will be created:
AccountHolder
Address
PhoneNumber
SSN
(social security number)CreditCard
BankAccount
UnsecuredLoan
With relationships:
HAS_ADDRESS
betweenAccountHolder
andAddress
HAS_PHONENUMBER
betweenAccountHolder
andPhoneNumber
HAS_SSN
betweenAccountHolder
andSSN
HAS_CREDITCARD
between betweenAccountHolder
andCreditCard
HAS_BANKACCOUNT
betweenAccountHolder
andBankAccount
HAS_UNSECUREDLOAN
betweenAccountHolder
andUnsecuredLoan
Cypher Code
// Create account holders
CREATE (accountHolder1:AccountHolder {
FirstName: "John",
LastName: "Doe",
UniqueId: "JohnDoe" })
CREATE (accountHolder2:AccountHolder {
FirstName: "Jane",
LastName: "Appleseed",
UniqueId: "JaneAppleseed" })
CREATE (accountHolder3:AccountHolder {
FirstName: "Matt",
LastName: "Smith",
UniqueId: "MattSmith" })
// Create Address
CREATE (address1:Address {
Street: "123 NW 1st Street",
City: "San Francisco",
State: "California",
ZipCode: "94101" })
// Connect 3 account holders to 1 address
CREATE (accountHolder1)-[:HAS_ADDRESS]->(address1),
(accountHolder2)-[:HAS_ADDRESS]->(address1),
(accountHolder3)-[:HAS_ADDRESS]->(address1)
// Create Phone Number
CREATE (phoneNumber1:PhoneNumber { PhoneNumber: "555-555-5555" })
// Connect 2 account holders to 1 phone number
CREATE (accountHolder1)-[:HAS_PHONENUMBER]->(phoneNumber1),
(accountHolder2)-[:HAS_PHONENUMBER]->(phoneNumber1)
// Create SSN
CREATE (ssn1:SSN { SSN: "241-23-1234" })
// Connect 2 account holders to 1 SSN
CREATE (accountHolder2)-[:HAS_SSN]->(ssn1),
(accountHolder3)-[:HAS_SSN]->(ssn1)
// Create SSN and connect 1 account holder
CREATE (ssn2:SSN { SSN: "241-23-4567" })<-[:HAS_SSN]-(accountHolder1)
// Create Credit Card and connect 1 account holder
CREATE (creditCard1:CreditCard {
AccountNumber: "1234567890123456",
Limit: 5000, Balance: 1442.23,
ExpirationDate: "01-20",
SecurityCode: "123" })<-[:HAS_CREDITCARD]-(accountHolder1)
// Create Bank Account and connect 1 account holder
CREATE (bankAccount1:BankAccount {
AccountNumber: "2345678901234567",
Balance: 7054.43 })<-[:HAS_BANKACCOUNT]-(accountHolder1)
// Create Credit Card and connect 1 account holder
CREATE (creditCard2:CreditCard {
AccountNumber: "1234567890123456",
Limit: 4000, Balance: 2345.56,
ExpirationDate: "02-20",
SecurityCode: "456" })<-[:HAS_CREDITCARD]-(accountHolder2)
// Create Bank Account and connect 1 account holder
CREATE (bankAccount2:BankAccount {
AccountNumber: "3456789012345678",
Balance: 4231.12 })<-[:HAS_BANKACCOUNT]-(accountHolder2)
// Create Unsecured Loan and connect 1 account holder
CREATE (unsecuredLoan2:UnsecuredLoan {
AccountNumber: "4567890123456789-0",
Balance: 9045.53,
APR: .0541,
LoanAmount: 12000.00 })<-[:HAS_UNSECUREDLOAN]-(accountHolder2)
// Create Bank Account and connect 1 account holder
CREATE (bankAccount3:BankAccount {
AccountNumber: "4567890123456789",
Balance: 12345.45 })<-[:HAS_BANKACCOUNT]-(accountHolder3)
// Create Unsecured Loan and connect 1 account holder
CREATE (unsecuredLoan3:UnsecuredLoan {
AccountNumber: "5678901234567890-0",
Balance: 16341.95, APR: .0341,
LoanAmount: 22000.00 })<-[:HAS_UNSECUREDLOAN]-(accountHolder3)
// Create Phone Number and connect 1 account holder
CREATE (phoneNumber2:PhoneNumber {
PhoneNumber: "555-555-1234" })<-[:HAS_PHONENUMBER]-(accountHolder3)
RETURN *
The results can be visualised as :

The database schema can be visualised:
CALL db.schema.visualization()

Getting every AccountHolder
’s information:
MATCH (accountHolder:AccountHolder)-[]->(contactInformation)
RETURN accountHolder.FirstName,contactInformation
View Result
accountHolder.FirstName |
contactInformation |
---|---|
John |
{Limit:5000,ExpirationDate:01-20,Balance:1442.23,SecurityCode:123,AccountNumber:1234567890123456} |
John |
{SSN:241-23-4567} |
John |
{Balance:7054.43,AccountNumber:2345678901234567} |
John |
{PhoneNumber:555-555-5555} |
John |
{Street:123 NW 1st Street,City:San Francisco,State:California,ZipCode:94101} |
Jane |
{Limit:4000,ExpirationDate:02-20,Balance:2345.56,SecurityCode:456,AccountNumber:1234567890123456} |
Jane |
{LoanAmount:12000.0,APR:0.0541,Balance:9045.53,AccountNumber:4567890123456789-0} |
Jane |
{Balance:4231.12,AccountNumber:3456789012345678} |
Jane |
{PhoneNumber:555-555-5555} |
Jane |
{SSN:241-23-1234} |
Jane |
{Street:123 NW 1st Street,City:San Francisco,State:California,ZipCode:94101} |
Matt |
{LoanAmount:22000.0,APR:0.0341,Balance:16341.95,AccountNumber:5678901234567890-0} |
Matt |
{Balance:12345.45,AccountNumber:4567890123456789} |
Matt |
{PhoneNumber:555-555-1234} |
Matt |
{SSN:241-23-1234} |
Matt |
{Street:123 NW 1st Street,City:San Francisco,State:California,ZipCode:94101} |
Or better :
MATCH (accountHolder:AccountHolder)-[]->(contactInformation)
RETURN accountHolder.FirstName,count(contactInformation) as RingSize
Or even much better:
MATCH (accountHolder:AccountHolder)-[]->(contactInformation)
RETURN accountHolder.FirstName as FirstName ,
accountHolder.LastName as LastName,
count(contactInformation) as RingSize
Examine the Network¶
Fraud networks are typically known to share similar types of information (phone numbers, credit card details, etc…). This is called the RingSize of information. The wider the ring the more it is shared.
MATCH (accountHolder:AccountHolder)-[]->(contactInformation)
WITH contactInformation, count(accountHolder) AS RingSize
WHERE RingSize >1
RETURN contactInformation,RingSize ORDER BY RingSize DESC
contactInformation |
RingSize |
---|---|
{Street:123 NW 1st Street,City:San Francisco,State:California,ZipCode:94101} |
3 |
{PhoneNumber:555-555-5555} |
2 |
{SSN:241-23-1234} |
2 |
The table above illustrates how SSN and addresses are used more than once
in the network of the bank. A logical next step is to identify the different AccountHolder
nodes that have a relationship with these nodes.
MATCH (accountHolder:AccountHolder)-[]->(contactInformation)
WITH contactInformation, count(accountHolder) AS RingSize
MATCH (contactInformation)<-[]-(accountHolder)
RETURN labels(contactInformation) as info,id(contactInformation) as id ,
accountHolder.LastName as LastName,RingSize
Making use of collect() allows for a more understandable aggregation :
MATCH (accountHolder:AccountHolder)-[]->(contactInformation)
WITH contactInformation, count(accountHolder) AS RingSize
MATCH (contactInformation)<-[]-(accountHolder)
WITH collect(accountHolder.UniqueId) as AccoundHolderID,contactInformation,RingSize
WHERE RingSize > 1
RETURN AccoundHolderID , labels(contactInformation) , RingSize
ORDER BY RingSize DESC
Which results in :
AccountHolderID |
labels(contactInformation) |
RingSize |
---|---|---|
[MattSmith,JaneAppleseed,JohnDoe] |
[Address] |
3 |
[JohnDoe,JaneAppleseed] |
[PhoneNumber] |
2 |
[MattSmith,JaneAppleseed] |
[SSN] |
2 |
All three persons Matt, Jane and John seem to live on the same address. Jane and John share the same phone number. Jane also shares with Matt the same social security number.
Find Fraud Risk¶
Financial risk is involved if those persons sharing similar information have access to loan
(HAS_UNSECUREDLOAN
) or have a credit card to use at their convenience (HAS_CREDITCARD
).
Let us inspect the properties of the these two labels (make sure APOC is installed on the database, using the plugins menu)
MATCH (card:CreditCard)
RETURN apoc.meta.cypher.types(card) LIMIT 1
Discloses the fields of the CreditCard
:
Limit:INTEGER
ExpirationDate:STRING
Balance:FLOAT
SecurityCode:STRING
AccountNumber:STRING
and for the UnsecuredLoan
:
MATCH (loan:UnsecuredLoan)
RETURN apoc.meta.cypher.types(loan) LIMIT 1
LoanAmount:FLOAT
APR:FLOAT
Balance:FLOAT
AccountNumber:STRING
The critical number for the CreditCard
is the Limit property. For the UnsecuredLoan
it is the outstanding LoanAmount property that materializes the risk to the bank. As consequence
we are interested in the following pattern
(accountHolder)-[r:HAS_CREDITCARD|HAS_UNSECUREDLOAN]->(unsecuredAccount)
Where unsecuredAccount is a CreditCard
or an UnsecuredLoan
Cypher Code
MATCH (accountHolder:AccountHolder)-[]->(contactInformation)
WITH contactInformation, count(accountHolder) AS RingSize
MATCH (contactInformation)<-[]-(accountHolder),
(accountHolder)-[r:HAS_CREDITCARD|HAS_UNSECUREDLOAN]->(unsecuredAccount)
WHERE RingSize >1
RETURN accountHolder.UniqueId AS AccountHolderID,
labels(contactInformation) AS ContactType,
unsecuredAccount.LoanAmount as LoanAmount,
unsecuredAccount.Limit as Limit,
RingSize
ORDER BY AccountHolderID ASC
The table below is somehow messy because of the null
in some of the rows.
AccountHolderID |
ContactType |
LoanAmount |
Limit |
RingSize |
---|---|---|---|---|
JaneAppleseed |
[PhoneNumber] |
12000.0 |
null |
2 |
JaneAppleseed |
[PhoneNumber] |
null |
4000 |
2 |
JaneAppleseed |
[Address] |
12000.0 |
null |
3 |
JaneAppleseed |
[Address] |
null |
4000 |
3 |
JaneAppleseed |
[SSN] |
12000.0 |
null |
2 |
JaneAppleseed |
[SSN] |
null |
4000 |
2 |
JohnDoe |
[PhoneNumber] |
null |
5000 |
2 |
JohnDoe |
[Address] |
null |
5000 |
3 |
MattSmith |
[Address] |
22000.0 |
null |
3 |
MattSmith |
[SSN] |
22000.0 |
null |
2 |
Let’s deal with this !
Cypher Code
MATCH (accountHolder:AccountHolder)-[]->(contactInformation)
WITH contactInformation,
count(accountHolder) AS RingSize
MATCH (contactInformation)<-[]-(accountHolder),
(accountHolder)-[r:HAS_CREDITCARD|HAS_UNSECUREDLOAN]->(unsecuredAccount)
WITH collect(DISTINCT accountHolder.UniqueId) AS AccountHolders,
contactInformation, RingSize,
SUM(CASE type(r)
WHEN 'HAS_CREDITCARD' THEN unsecuredAccount.Limit
WHEN 'HAS_UNSECUREDLOAN' THEN unsecuredAccount.Balance
ELSE 0
END) AS FinancialRisk
WHERE RingSize > 1
RETURN AccountHolders AS AccountHolderID,
labels(contactInformation) AS ContactType,
RingSize,
round(FinancialRisk) AS FinancialRisk
ORDER BY FinancialRisk DESC
The most exposed persons are in the table below:
AccountHolderID |
ContactType |
RingSize |
FinancialRisk |
---|---|---|---|
[MattSmith,JaneAppleseed,JohnDoe] |
[Address] |
3 |
34387.0 |
[MattSmith,JaneAppleseed] |
[SSN] |
2 |
29387.0 |
[JaneAppleseed,JohnDoe] |
[PhoneNumber] |
2 |
18046.0 |
The address owned (jointly) by Matt, Jane and John creates a risk of 34K EUR for the bank. This financial risk is either in the form of a Credit Card or a Bank Loan.
Last change: Oct 30, 2023