What’s the fuss about FaaS?


The recent blog post about Serverless Architectures by Mike Roberts caught my attention. Furthermore, ekito recently developed nowave.io, a video on demand service whose technical architecture is serverless. Consequently I decided to have my own take on Serverless Architectures based on FaaS (Functions as a Service).

In this blog post I will demonstrate how to develop a simple FaaS that is triggered by a REST API. I will show how to define an API using a swagger.io description in a contract-first approach. Then, the API will be deployed to Amazon’s API Gateway. In the gateway, I will configure a mapping that extracts data from API requests and transforms it into event objects. Those event objects are then passed to Amazon Lambda.

In the second part I will provide some feedback on load testing the service I developed. You will get some insights into observed round-trip latencies and API Gateway rate limiting behaviour.

But first, what’s all that fuss about FaaS?

FaaS

A Function as a Service (FaaS) is some stateless business logic that is triggered by a stimulus. The logic of the function is applied to its inputs; the result is returned. A FaaS can have side effects. For example, it can trigger another function or communicate with third party services.

A FaaS executes when the trigger it is bound to fires. This makes functions versatile building blocks: they can act as stored procedures on database tables, react to file system changes, act as workers on messaging queues or act as controllers behind a REST API.

FaaS are packaged and deployed to a cloud provider. FaaS usage is generally billed by the number of invocations and by the amount of computing resources consumed during the execution of a function call.

What makes FaaS so attractive is their simple programming, packaging and deployment model. They are natively elastic, so performance and operational costs scale linearly with load, which makes them stand out. Furthermore, the FaaS deployment model does not require you to manage any execution platform, middleware, container or virtual machine. Operational maintenance is completely outsourced to cloud providers (NoOps).

Building the HelloWorld function

Amazon Lambda accepts functions implemented in JavaScript (Node.js), Python and Java 8. Consequently, it is also possible to write a function in Scala or any other JVM language that runs on Java 8.

Wanting to write a little bit of Scala, I decided to follow this blog post to implement my function.

Project Initialization

First, let’s initialize a Scala project with the following build.sbt file at the project root:

name := "AWSLambdaTest"
 
version := "1.0"
 
scalaVersion := "2.11.8"
 
libraryDependencies += "com.amazonaws" % "aws-lambda-java-core" % "1.1.0"
libraryDependencies += "com.fasterxml.jackson.module" % "jackson-module-scala_2.11" % "2.7.5"
 
javacOptions ++= Seq("-source", "1.8", "-target", "1.8", "-Xlint")
 
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs@_*) => MergeStrategy.discard
  case x => MergeStrategy.first
}

Note that we explicitly instruct sbt to build Java 8 bytecode. The build script also contains the merge strategy for the sbt-assembly plugin.

The assembly plugin is configured in LambdaTest/project/plugins.sbt as follows:

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")

In order to build and package the project, use the command sbt assembly. The resulting jar resides at LambdaTest/target/scala-2.11/AWSLambdaTest-assembly-1.0.jar.

What struck me was the size of the jar file (12.4MB), given that we are writing a mere hello world function. It turns out that all Scala library classes are included in the jar. In practical terms, this size can affect the upload time to AWS (timeout after 61s) and the number of functions one can deploy per user account and region (the maximum storage space for all lambdas in one region is 1.5GB). Should this really become an issue, consider spreading your deployments across different regions, or using Java 8 directly or an interpreted language with a smaller footprint (Node.js or Python).
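To get a feeling for what the storage limit means in practice, here is a rough back-of-the-envelope sketch. It assumes that 1.5GB is counted as 1500MB and that only the deployment packages themselves consume the quota; both are my assumptions, and `StorageBudget` is a name made up for this illustration.

```scala
object StorageBudget {
  // Assumption: the 1.5GB regional limit is treated as 1500MB.
  val regionLimitMb: Double = 1500.0

  /** Number of deployment packages of the given size that fit into the limit. */
  def maxPackages(packageMb: Double, limitMb: Double = regionLimitMb): Int =
    math.floor(limitMb / packageMb).toInt
}
```

Under these assumptions, `StorageBudget.maxPackages(12.4)` yields 120 packages of the size of our hello world jar.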

Programming model

A function is implemented as a handler method with a name of your choice. The function signature can take several forms, always defining an input and an output. Amazon Lambda accepts the following input and output types:

  • Simple Java types (AWS Lambda supports the String, Integer, Boolean, Map, and List types)
  • POJO (Plain Old Java Object) type
  • Stream types (InputStream / OutputStream)

Amazon Lambda ignores the return value of asynchronously invoked functions. In that case, consider using the void return type instead.

When using POJOs, Amazon Lambda serializes and deserializes based on standard bean naming conventions. You must use mutable POJOs with public getters and setters. Annotations are ignored.

When developing AWS Lambdas in Scala, type conversions between Scalaisms and Javaisms must be implemented for non-primitive types. Alternatively, stream types can be used for inputs and outputs. In that case, it is the responsibility of the function handler to marshal and unmarshal inputs and outputs. At first glance this approach seems a little cumbersome. However, it is in my opinion the most stable strategy when opting for Scala. I am not inventing anything here: it comes straight from the tutorial I was following.
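For illustration, a mutable bean of that kind could look as follows in Scala. This is only a sketch: `NameInfoBean` and `PojoHandler` are hypothetical names, and I did not deploy this variant. The `@BeanProperty` annotation from the Scala standard library makes the compiler generate the Java-style getters and setters.

```scala
import scala.beans.BeanProperty

// Hypothetical mutable bean with a public no-argument constructor.
// @BeanProperty generates getFirstName/setFirstName and getLastName/setLastName.
class NameInfoBean {
  @BeanProperty var firstName: String = null
  @BeanProperty var lastName: String = null
}

class PojoHandler {
  // Handler variant that takes a POJO and returns a simple type.
  def hello(name: NameInfoBean): String =
    s"Greetings ${name.getFirstName} ${name.getLastName}."
}
```

The handler can be called like any other method, for example after populating the bean through its generated setters.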

package io.ekito.lambdaTest
 
case class NameInfo(firstName: String, lastName: String)
 
class Main {
 
  import java.io.{InputStream, OutputStream}
 
 
  val scalaMapper = {
    import com.fasterxml.jackson.databind.ObjectMapper
    import com.fasterxml.jackson.module.scala.DefaultScalaModule
    new ObjectMapper().registerModule(new DefaultScalaModule)
  }
 
  def hello(input: InputStream, output: OutputStream): Unit = {
    val name = scalaMapper.readValue(input, classOf[NameInfo])
    val result = s"Greetings ${name.firstName} ${name.lastName}."
    output.write(result.getBytes("UTF-8"))
  }
}

The complete source code is available on Github.
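Before uploading anything, a stream-based handler can be exercised locally by feeding it in-memory streams. The following is a minimal sketch using only the standard library; `LocalInvoke` and the stand-in handler `shout` are names I made up for this example and are not part of the project.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream, OutputStream}
import java.nio.charset.StandardCharsets.UTF_8
import java.util.Locale

object LocalInvoke {
  /** Feeds `event` to a stream handler and returns whatever the handler wrote. */
  def invoke(handler: (InputStream, OutputStream) => Unit, event: String): String = {
    val in = new ByteArrayInputStream(event.getBytes(UTF_8))
    val out = new ByteArrayOutputStream()
    handler(in, out)
    new String(out.toByteArray, UTF_8)
  }

  // Hypothetical stand-in handler: it upper-cases the raw payload.
  def shout(input: InputStream, output: OutputStream): Unit = {
    val payload = scala.io.Source.fromInputStream(input, "UTF-8").mkString
    output.write(payload.toUpperCase(Locale.ROOT).getBytes(UTF_8))
  }
}
```

With Jackson on the classpath, the same helper should also accept the real handler, along the lines of `LocalInvoke.invoke(new Main().hello, """{"firstName":"Bert","lastName":"Poller"}""")`.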

Deployment

With the code in place, call:

$> sbt assembly

Now let’s deploy the artifact. You can do this via the CLI or the console; below I’ve illustrated what it looks like in the AWS console. Skip the steps ‘Select BluePrint’ and ‘Configure Triggers’ to arrive at the following screen:

screen1

I am using the default suggestions for memory, duration, and role (basic execution). Note the handler: io.ekito.lambdaTest.Main::hello. Click ‘Create Lambda function’ and you should be ready to test.

Testing

At this point the function should be fully operational. First, configure a JSON object as the test event:

screen_test_event

Now you can test it by sending a JSON event object as illustrated here:

screen_test_result

As you can see, valuable information about execution duration, billed duration and used memory is displayed. Based on those values, we can reduce the amount of configured memory from 512MB to 128MB. It is also interesting that an initial invocation can take up to 2 seconds: some warm-up is taking place. Execution times of subsequent invocations are pretty short: ~0.6ms. However, since the smallest billing period is 100ms, Amazon bills us for resources that we are not actually using 😐
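The effect of the 100ms billing granularity can be sketched in a few lines. This is a simplified model of the rounding only, not of Amazon’s actual pricing; `Billing` is a hypothetical helper.

```scala
object Billing {
  // Billing granularity as observed in the console output: 100ms.
  val incrementMs: Long = 100L

  /** Rounds an execution time up to the next billing increment. */
  def billedMs(durationMs: Double): Long =
    math.ceil(durationMs / incrementMs).toLong * incrementMs
}
```

For the ~0.6ms invocation above, `Billing.billedMs(0.6)` gives 100, so less than 1% of the billed time is actually spent in the function.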

REST API definition

Using Swagger’s API description language, let’s write a little Hello World REST API with one GET operation. I used Swaggerhub.com, an online editor with inline syntax and grammar validation. SwaggerHub also has an API documentation preview.

swagger: '2.0'
info:
  version: '1'
  title: Lambda Test
  description: A hello world API testing AWS API Gateway + Lambda

paths:
  /hello/{firstName}:
    get:
      consumes:
      - application/json
      produces:
      - text/html
      responses:
        200:
          description: "200 response"
          headers:
            Content-Type:
              type: string
      parameters:
        - name: firstName
          in: path
          description: first name
          required: true
          type: string

API Gateway

In Amazon API Gateway, create a new API by pasting the previously created API definition.

screen_api-definition

Now, link the operation /hello/{firstName}/GET with our lambda function.

screen api_linking

With the API operation linked to the lambda function, a final step is necessary: we must transform the request’s parameters into a JSON event object. The call flow is visualized below:

screen call flow

Click on ‘Integration Request’ in order to define the mapping. Several mappings can be defined, depending on the request’s Content-Type header. We are defining one template for the content type application/json.

screen mapping

The mapping can include all request parameters, payloads or header attributes. In our example, we are injecting the path variable firstName. Full documentation for request and response mappings is available here.
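The real mapping is written in API Gateway’s template language (as far as I know, path parameters are read there with `$input.params('firstName')`). To make clear what the transformation does, here is a Scala model of it. It is a sketch only: `RequestMapping` is a hypothetical name, the escaping covers just quotes and backslashes, and the template in the screenshot may populate further fields such as lastName.

```scala
object RequestMapping {
  // Minimal JSON string escaping for this sketch (quotes and backslashes only).
  private def escape(s: String): String = {
    val sb = new StringBuilder
    s.foreach {
      case '"'  => sb.append("\\\"")
      case '\\' => sb.append("\\\\")
      case c    => sb.append(c)
    }
    sb.toString
  }

  /** Builds the JSON event from the request's path parameters. */
  def toEvent(pathParams: Map[String, String]): String = {
    val firstName = pathParams.getOrElse("firstName", "")
    s"""{"firstName": "${escape(firstName)}"}"""
  }
}
```

A request to /hello/bert would thus be modelled as `RequestMapping.toEvent(Map("firstName" -> "bert"))`, and the resulting JSON string is what the lambda function receives as its input stream.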

With the mapping in place, we can now test the integration:

screen test api gateway

The final step consists of deploying the API. Deployments are created in stages, allowing different versions of the same API to be deployed. Since the service scales seamlessly, defining rate and burst limits is crucial. In doing so, we define the capacity of our service and protect our bank account 😉 You can find the public URL of your service at the top of the page.

screen deployment

Load testing

Now it is time to load test our public FaaS service. But wait: Amazon’s API Gateway uses SSL termination with TLSv1.2 encryption and Server Name Indication (SNI). SNI is a TLS protocol extension that, unfortunately, few load testing tools support. At the time of writing, Apache Benchmarking Tool and goBench did not work for me. I had more luck with wrk.

Test Setup

I configured the API endpoint with a rate limit of 100 and a burst of 200 requests / second.
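To reason about what a rate of 100 and a burst of 200 mean, I find a token bucket model helpful. AWS describes its throttling in terms of a token bucket, but the class below is my own simplified model and not the actual implementation; `TokenBucket` and its time handling (nanoseconds starting at 0, never going backwards) are assumptions of this sketch.

```scala
/** Simplified token bucket: `rate` tokens are refilled per second, up to `burst`. */
final class TokenBucket(rate: Double, burst: Double) {
  private var tokens: Double = burst   // the bucket starts full
  private var lastNanos: Long = 0L     // assumption: the clock starts at 0

  /** Returns true if a request arriving at `nowNanos` is admitted. */
  def tryAcquire(nowNanos: Long): Boolean = {
    val elapsedSec = (nowNanos - lastNanos) / 1e9
    tokens = math.min(burst, tokens + elapsedSec * rate)
    lastNanos = nowNanos
    if (tokens >= 1.0) { tokens -= 1.0; true } else false
  }
}
```

In this model, a sudden spike can consume the full burst of 200 requests at once, whereas under sustained load only about 100 requests per second are admitted.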

I executed different test runs through my non-saturated home ADSL connection. As a control, I also spawned a Digital Ocean droplet in the London data center. I configured the API Gateway in the eu-west-1 region in Ireland. Test results from both sites (Toulouse area via local ISP, London area Digital Ocean droplet) are comparable.

Test 1: 8 concurrent connections

80.5 req/s. We are getting close to the rate limit. Latency at the 90th percentile appears to be high.

➜ virtualdisks docker run --rm williamyeh/wrk -c 8 -t 1 -d 60s --latency https://ph6oo2np9a.execute-api.eu-west-1.amazonaws.com/prd/hello/bert
Running 1m test @ https://ph6oo2np9a.execute-api.eu-west-1.amazonaws.com/prd/hello/bert
  1 threads and 8 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   124.92ms  144.94ms    1.41s    93.52%
    Req/Sec    82.07     23.87    141.00     70.74%
  Latency Distribution
     50%   77.97ms
     75%  106.22ms
     90%  160.59ms
     99%  878.88ms
  4836 requests in 1.00m, 1.78MB read
Requests/sec:     80.52
Transfer/sec:     30.27KB

Test 2: 14 concurrent connections

As expected, the rate limit kicks in: 2993 of 9146 requests received a non-2xx or 3xx response. Latency at the 90th percentile is 144 ms. The calculation shows clearly that the right number of requests is being rejected:

9146 - 2993 = 6153 req/minute = 102.5 req/s
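The same check can be written down as a tiny helper, in case you want to repeat it for your own test runs. `RateCheck` is a hypothetical name, and the sketch assumes the test ran for exactly 60 seconds.

```scala
object RateCheck {
  /** Accepted (non-rejected) requests per second over a test window. */
  def acceptedPerSecond(total: Int, rejected: Int, seconds: Double): Double =
    (total - rejected) / seconds
}
```

`RateCheck.acceptedPerSecond(9146, 2993, 60.0)` evaluates to roughly 102.5, which is just above the configured rate limit of 100 requests per second.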

➜ virtualdisks docker run --rm williamyeh/wrk -c 14 -t 1 -d 60s --latency https://ph6oo2np9a.execute-api.eu-west-1.amazonaws.com/prd/hello/bert
Running 1m test @ https://ph6oo2np9a.execute-api.eu-west-1.amazonaws.com/prd/hello/bert
  1 threads and 14 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   118.14ms  146.95ms    1.48s    93.70%
    Req/Sec   153.46     34.74    240.00     72.36%
  Latency Distribution
     50%   75.28ms
     75%   87.52ms
     90%  144.13ms
     99%  892.30ms
  9146 requests in 1.00m, 3.38MB read
  Non-2xx or 3xx responses: 2993
Requests/sec:    152.29
Transfer/sec:     57.59KB

Analysis

Different load tests showed that API Gateway’s rate limiting works as expected. Interestingly, overall latency appears to be relatively high (generally > 130ms in the 90th percentile).

I conducted tests through two distinct ISPs with similar results. Given that our lambda operates in the sub-millisecond range, latency must come from the API Gateway’s SSL termination, routing and mapping infrastructure.

The final word

In conclusion, the API Gateway is a precious building block in a Serverless Architecture, allowing the declarative construction of API façades in front of BaaS (Backend as a Service) and FaaS components. It also integrates well with third party authentication providers, such as Auth0. However, its latency surprised me a little bit.

Building a FaaS with Amazon Lambda is quite easy. It is an amazing building block for all kinds of architectures, including Serverless ones.

This blog post only showed manual deployment through the AWS console. Deployment can be automated through Amazon’s command line interface and deployment APIs. Therefore it is easy to integrate into a Continuous Delivery pipeline.


Author: Bert Poller

IT architect and software craftsman. I enjoy programming and designing systems of all kinds.
German by origin, a polyglot not only in programming languages. Technophile, with a critical eye on the world.
#JAVA #SCALA #DOCKER #REACTIVE #FP #DEVOPS #MESSAGING #DIGITAL_FABRICATION #SUSTAINABLE #COFFEE

4 Comments

  1. “What struck me was the size of the jar file (12.4MB)” — indeed we don’t care much about that for long-running virtual machines. Have you tried to trim down the jar file using an obfuscator like Proguard? This should remove almost all of Scala’s standard library!

    • Bert Poller

      Hi Sylvain,

      thanks for your suggestion! I configured proguard in sbt like so:

      plugin.sbt

      addSbtPlugin("com.typesafe.sbt" % "sbt-proguard" % "0.2.2")

      build.sbt

      name := "AWSLambdaTest"

      version := "1.0"

      scalaVersion := "2.11.8"

      libraryDependencies += "com.amazonaws" % "aws-lambda-java-core" % "1.1.0"
      libraryDependencies += "com.fasterxml.jackson.module" % "jackson-module-scala_2.11" % "2.7.5"

      javacOptions ++= Seq("-source", "1.8", "-target", "1.8", "-Xlint")

      proguardSettings

      ProguardKeys.proguardVersion in Proguard := "5.2.1"

      ProguardKeys.options in Proguard ++= Seq("-dontnote", "-dontwarn", "-ignorewarnings")

      ProguardKeys.options in Proguard += ProguardOptions.keepMain("io.ekito.lambdaTest.Main")

      ProguardKeys.merge in Proguard := true

      ProguardKeys.mergeStrategies in Proguard += ProguardMerge.discard("META-INF/.*".r)

      I can execute with sbt proguard:proguard and I get a zip of 343Kb. However, when deploying to AWS, I can’t get the test to work. AWS can’t find the handler method “hello”. Do you know how to configure proguard in sbt to keep a public method? Thank you!

  2. The “keepMain” option is not suitable here since you don’t have a “public static void main()” method. So ProGuard basically removes everything (I don’t know why it even keeps 343kB of code).

    Also Jackson uses introspection for serialization, so we want to make sure the class and member names in the io.ekito.lambdaTest package are not obfuscated.

    The ProGuard options then become :

    ProguardKeys.options in Proguard ++= Seq("-dontnote", "-dontwarn scala.**",
    "-keep public class io.ekito.lambdaTest.* { *; }"
    )

    … and you should remove the “ProguardOptions.keepMain” option.

    The resulting jar file is reduced to 1.1MB. We can gain a bit more by restricting the ScalaModule to what we need, i.e. case class support. For this, replace the “registerModule” in the code with this:

    registerModule(new JacksonModule with ScalaAnnotationIntrospectorModule {})

    The resulting jar is down to 902 kB, as we removed most of Scala’s collection library which was pulled by the default Scala module.

    Note that most of these 902 kB are Jackson classes.
