Serverless KYC using AWS

Motivation

Let me start with a little story. There was a start-up with a great idea, which they turned into an app. To comply with government regulations, they needed to verify users against government-issued documents. From the beginning, they wanted to automate this verification using an event-driven architecture, so they ran Apache Kafka on a dedicated EC2 instance for various purposes (keep this point in mind; I will come back to it later).

One fine day, they launched the application to its target users. A couple of weeks passed, and the rate of user onboarding was not satisfactory. Since they were running Kafka on a dedicated server, they had to maintain it themselves, and whenever Kafka went down, all hell broke loose. Also, when you launch an application or product, you will probably see a surge in user onboarding during the first few months, and even that can vary with the time of day. As a result, the company had to pay for the server's idle time and maintain it as well. When user onboarding is not satisfactory, that can be harsh.

Introduction

The solution to this kind of challenge is quite obvious: go serverless, a fully managed, pay-as-you-use resource model where scalability is taken care of by the cloud provider. Here I am using AWS Lambda as the compute resource, S3 as the event source, Rekognition for face comparison and text extraction, and DynamoDB to store the result. We can also leverage SNS for notifications.

The Solution

For this, we will need two buckets: one for user photos and another for images of user documents. Each user has a unique identifier, and the application uses this identifier to name the images. For example, if the user identifier is 123asdf, the user photo is stored as 123asdf-photo.jpg and the document photo as 123asdf-doc.jpg. The application first uploads the user photo and then the document image. This second upload triggers a Lambda function. The Lambda function locates both images using the user identifier and calls Rekognition for face matching and text extraction. When the Lambda function receives the results, it stores them in DynamoDB. The Lambda function can then send a notification to the user using SNS.
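The naming convention can be sketched as a small helper that maps the key of an uploaded document image back to the user identifier and the key of the matching photo. This is only an illustration: `photo_key_for_document` is a hypothetical name, and it assumes the `-doc.jpg` suffix for documents and the `-photo.jpg` suffix that the Lambda handler later in this post uses for photos.

```javascript
// Hypothetical helper: derive the user id and photo key from a document key.
// Assumes keys look like "<id>-doc.jpg" and "<id>-photo.jpg".
function photo_key_for_document(doc_key) {
    // Greedy match, so identifiers that contain '-' themselves stay intact
    const match = /^(.+)-doc\.jpg$/.exec(doc_key);
    if (!match) {
        return null; // not a document image, so there is nothing to compare
    }
    const user_id = match[1];
    return { user_id, photo_key: `${user_id}-photo.jpg` };
}

console.log(photo_key_for_document('123asdf-doc.jpg'));
// → { user_id: '123asdf', photo_key: '123asdf-photo.jpg' }
```

Returning `null` for keys that do not follow the pattern gives the caller a simple way to skip uploads that are not document images.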

Here I am using AWS CDK to build the necessary stack.

Initial set up

- Install and configure AWS CLI - follow this link
  - Installation for Linux x86_64:
    ```
    curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
    unzip awscliv2.zip
    sudo ./aws/install
    ```
  - Configure - you need to have an AWS account and sufficient permissions. Open a terminal and run aws configure. Enter your access key and secret access key when prompted.
- Install Node.js - follow this link and pick the right one for your OS.
- Install AWS CDK
  ```
  npm install -g aws-cdk
  ```

Create CDK project

```
mkdir <project name>
cd <project name>
cdk init app --language javascript
```

The stack

```javascript
const cdk = require('@aws-cdk/core');
const s3 = require('@aws-cdk/aws-s3');
const lambda = require('@aws-cdk/aws-lambda');
const dynamodb = require('@aws-cdk/aws-dynamodb');
const iam = require('@aws-cdk/aws-iam');
const path = require('path');
const S3EventSource = require('@aws-cdk/aws-lambda-event-sources').S3EventSource;

class ServerlessKycStack extends cdk.Stack {
  /**
   *
   * @param {cdk.Construct} scope
   * @param {string} id
   * @param {cdk.StackProps=} props
   */
  constructor(scope, id, props) {
    super(scope, id, props);

    // The code that defines your stack goes here
    // dynamodb to store the result
    const table = new dynamodb.Table(this, 'sm-user-result', {
      partitionKey: { name: 'id', type: dynamodb.AttributeType.STRING },
      removalPolicy: cdk.RemovalPolicy.DESTROY
    });

    // s3 bucket to store photos
    const s3_photo = new s3.Bucket(this, 'sm-kyc-photo', {
      removalPolicy: cdk.RemovalPolicy.DESTROY,
      autoDeleteObjects: true
    });
    // s3 bucket to store document images
    const s3_document = new s3.Bucket(this, 'sm-kyc-doc', {
      removalPolicy: cdk.RemovalPolicy.DESTROY,
      autoDeleteObjects: true
    });

    // lambda to process the event
    const fn = new lambda.Function(this, 'sm-kyc-function', {
      runtime: lambda.Runtime.NODEJS_12_X,
      handler: 'index.handler',
      code: lambda.Code.fromAsset(path.join(__dirname, '../src')),
      timeout: cdk.Duration.seconds(5)
    });
    // lambda trigger: every object created in the document bucket
    fn.addEventSource(new S3EventSource(s3_document, {
      events: [s3.EventType.OBJECT_CREATED]
    }));

    // permission for lambda
    s3_document.grantRead(fn);
    s3_photo.grantRead(fn);
    table.grantReadWriteData(fn);
    fn.addEnvironment('TABLE_NAME', table.tableName);
    fn.addEnvironment('PHOTO_BUCKET', s3_photo.bucketName);
    fn.addToRolePolicy(new iam.PolicyStatement({
      effect: iam.Effect.ALLOW,
      actions: ['rekognition:CompareFaces', 'rekognition:DetectText'],
      resources: ['*']
    }));
  }
}

module.exports = { ServerlessKycStack };
```

Lambda

Create a folder called src and index.js inside it. Put everything below inside index.js

```javascript
const aws = require('aws-sdk');
// Lambda sets AWS_REGION for the function; fall back to us-east-1 elsewhere
aws.config.update({ region: process.env.AWS_REGION || 'us-east-1' });

// Compare the face on the document (source) with the user's photo (target)
async function face_compare(doc_bucket, photo_bucket, source_image, target_image) {
    const client = new aws.Rekognition();
    let result = {};
    const params = {
        SourceImage: {
            S3Object: {
                Bucket: doc_bucket,
                Name: source_image
            }
        },
        TargetImage: {
            S3Object: {
                Bucket: photo_bucket,
                Name: target_image
            }
        },
        SimilarityThreshold: 80
    };
    try {
        const rekognition_result = await client.compareFaces(params).promise();
        // FaceMatches is empty when no face reaches the similarity threshold
        const best_match = rekognition_result.FaceMatches[0];
        console.log(`The face compare similarity score : ${best_match ? best_match.Similarity : 'no match'}`);
        result = {
            success: true,
            matched: Boolean(best_match)
        };
    } catch (error) {
        console.error(error);
        result = {
            success: false,
            error: true
        };
    }
    return result;
}

// Extract the text printed on the document image
async function detect_text(doc_bucket, doc_image) {
    const client = new aws.Rekognition();
    let result = {};
    const params = {
        Image: {
            S3Object: {
                Bucket: doc_bucket,
                Name: doc_image
            }
        }
    };
    try {
        const detect_text_result = await client.detectText(params).promise();
        result = {
            success: true,
            data: detect_text_result
        };
    } catch (error) {
        console.error(error);
        result = {
            success: false,
            error: true
        };
    }
    return result;
}

// Store both results under the user's id
async function save_result_dynamodb(id, face_match, texts) {
    const dynamodb = new aws.DynamoDB.DocumentClient();
    const params = {
        TableName: process.env.TABLE_NAME,
        Item: {
            "id": id,
            "face_match": face_match,
            "text_detection": texts
        }
    };
    try {
        const result = await dynamodb.put(params).promise();
        console.log(result);
    } catch (error) {
        console.error(error);
    }
}

module.exports.handler = async (event, context) => {
    // event details
    const bucket_name = event.Records[0].s3.bucket.name;
    console.log('BUCKET NAME : ', bucket_name);
    // S3 URL-encodes object keys in event notifications and uses '+' for spaces
    const key = decodeURIComponent(event.Records[0].s3.object.key.replace(/\+/g, ' '));
    console.log('FILE NAME : ', key); // source photo - doc

    // unique user id: everything before the last '-', so ids containing '-' stay intact
    const dash = key.lastIndexOf('-');
    const user_uuid = dash === -1 ? key : key.slice(0, dash);
    const target_image = `${user_uuid}-photo.jpg`; // target photo - face image
    const photo_bucket = process.env.PHOTO_BUCKET;

    // face compare
    const face_compare_result = await face_compare(bucket_name, photo_bucket, key, target_image);
    console.log('Face compare result : ', face_compare_result);

    // text detection
    const text_detect_result = await detect_text(bucket_name, key);
    console.log('Text detection completed');
    console.log('Text detection result : ', text_detect_result);

    // store result in dynamo
    await save_result_dynamodb(user_uuid, face_compare_result, text_detect_result);
    console.log('Result stored in Dynamodb');
};
```
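The walkthrough mentions notifying the user through SNS, a step the handler code does not include. A minimal sketch of that step could look like the following. It is an assumption about how the step might be written, not the original implementation: `build_message`, `notify_user`, the message texts, and the `TOPIC_ARN` environment variable are hypothetical names, and the stack would additionally need an SNS topic plus publish permission for the function. The SNS client is passed in as a parameter so the function can be exercised without AWS access; `publish(params).promise()` is the aws-sdk v2 call style used elsewhere in this post.

```javascript
// Hypothetical helper: turn the face comparison result into a short message.
// Treats the check as passed only when the comparison succeeded and was not
// explicitly marked as unmatched.
function build_message(user_id, face_compare_result) {
    const verified = Boolean(face_compare_result)
        && face_compare_result.success === true
        && face_compare_result.matched !== false;
    return verified
        ? `KYC check passed for user ${user_id}`
        : `KYC check needs review for user ${user_id}`;
}

// Publish the message to a topic. `sns` is any object exposing the aws-sdk v2
// SNS interface (for example new aws.SNS()); `topic_arn` is assumed to come
// from an environment variable such as TOPIC_ARN.
async function notify_user(sns, topic_arn, user_id, face_compare_result) {
    const params = {
        TopicArn: topic_arn,
        Subject: 'KYC result',
        Message: build_message(user_id, face_compare_result)
    };
    return sns.publish(params).promise();
}
```

Inside the handler, this could be called after the result is saved, for example `await notify_user(new aws.SNS(), process.env.TOPIC_ARN, user_uuid, face_compare_result);`.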

My package.json file dependencies

```json
"devDependencies": {
  "@aws-cdk/assert": "1.119.0",
  "aws-cdk": "1.119.0",
  "jest": "^26.4.2"
},
"dependencies": {
  "@aws-cdk/aws-dynamodb": "^1.121.0",
  "@aws-cdk/aws-iam": "^1.121.0",
  "@aws-cdk/aws-lambda": "^1.121.0",
  "@aws-cdk/aws-lambda-event-sources": "^1.121.0",
  "@aws-cdk/aws-s3": "^1.121.0",
  "@aws-cdk/core": "1.119.0",
  "aws-sdk": "^2.983.0"
}
```