Reading and writing binary data with Java can sometimes be a hassle. Read this article and learn how to leverage Chronicle Bytes, thereby making these tasks both faster and easier.

I recently contributed to the open-source project “Chronicle Decentred”, a high-performance decentralized ledger based on blockchain technology. For our binary access, we relied on a library called “Chronicle Bytes” which caught my attention. In this article, I will share some of the lessons I learned while using the Bytes library.

What is Bytes?
Bytes is a library that provides functionality similar to Java’s built-in ByteBuffer, but with some notable extensions. Both provide a basic abstraction of a buffer storing bytes, with additional features over working with raw byte arrays. Both are also a view of the underlying bytes and can be backed by a raw byte array, native (off-heap) memory, or even a file.

Here is a short example of how to use Bytes:

// Allocate off-heap memory that can be expanded on demand.
Bytes bytes = Bytes.allocateElasticDirect();

// Write data
bytes.writeBoolean(true)
     .writeByte((byte) 1)
     .writeInt(2)
     .writeLong(3L)
     .writeDouble(3.14)
     .writeUtf8("Foo")
     .writeUnsignedByte(255);

System.out.println("Wrote " + bytes.writePosition() + " bytes");
System.out.println(bytes.toHexString());

Running the code above will produce the following output:

Wrote 27 bytes
00000000 59 01 02 00 00 00 03 00 00 00 00 00 00 00 1f 85 Y······· ········
00000010 eb 51 b8 1e 09 40 03 46 6f 6f ff ·Q···@·F oo·

We can also read back data as shown hereunder:

// Read data
boolean flag = bytes.readBoolean();
byte b = bytes.readByte();
int i = bytes.readInt();
long l = bytes.readLong();
double d = bytes.readDouble();
String s = bytes.readUtf8();
int ub = bytes.readUnsignedByte();

System.out.println("d = " + d);

bytes.release();

This will produce the following output:

d = 3.14

HexDumpBytes
Bytes also provides a HexDumpBytes which makes it easier to document your protocol.

// Create a Bytes instance that also records a commented hex dump.
Bytes bytes = new HexDumpBytes();

// Write data
bytes.comment("flag").writeBoolean(true)
     .comment("u8").writeByte((byte) 1)
     .comment("s32").writeInt(2)
     .comment("s64").writeLong(3L)
     .comment("f64").writeDouble(3.14)
     .comment("text").writeUtf8("Foo")
     .comment("u8").writeUnsignedByte(255);

System.out.println(bytes.toHexString());

This will produce the following output:

59                       # flag
01                       # u8
02 00 00 00              # s32
03 00 00 00 00 00 00 00  # s64
1f 85 eb 51 b8 1e 09 40  # f64
03 46 6f 6f              # text
ff                       # u8

Summary
As can be seen, it is easy to write and read various data formats, and Bytes maintains separate write and read positions, making it even easier to use (no need to “flip” a buffer). The examples above illustrate “streaming operations” where consecutive writes/reads are made. There are also “absolute operations” that provide random access within the Bytes’ memory region.

Another useful feature of Bytes is that it can be “elastic” in the sense that its backing memory is expanded dynamically and automatically if we write more data than we initially allocated. This is similar to an ArrayList with an initial size that is expanded as we add additional elements.
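To illustrate the principle, here is a minimal sketch (the initial capacity of four bytes is just an illustrative figure):

// Start small; the initial capacity is only a starting point.
Bytes bytes = Bytes.allocateElasticDirect(4);

// Writing more than four bytes makes the buffer grow automatically.
bytes.writeLong(42L);         // eight bytes already
bytes.writeUtf8("expanded");  // and then some

System.out.println("Wrote " + bytes.writePosition() + " bytes");

bytes.release();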

Comparison
Here is a short table of some of the properties that distinguish Bytes from ByteBuffer:

Property                              ByteBuffer               Bytes
Max size [bytes]                      2^31                     2^63
Separate read and write position      No                       Yes
Elastic buffers                       No                       Yes
Atomic operations (CAS)               No                       Yes
Deterministic resource release        Internal API (Cleaner)   Yes
Ability to bypass initial zero-out    No                       Yes
Read/write strings                    No                       Yes
Endianness                            Big and little           Native only
Stop-bit compression                  No                       Yes
Serialize objects                     No                       Yes
Support RPC serialization             No                       Yes

How Do I Install It?
When we want to use Bytes in our project, we just add the following Maven dependency to our pom.xml file and we have access to the library.

<dependency>
    <groupId>net.openhft</groupId>
    <artifactId>chronicle-bytes</artifactId>
    <version>2.17.27</version>
</dependency>

If you are using another build tool, for example, Gradle, you can see how to depend on Bytes by clicking this link.

Obtaining Bytes Objects
A Bytes object can be obtained in many ways, including wrapping an existing ByteBuffer. Here are some examples:
// Allocate Bytes using off-heap direct memory
// whereby the capacity is fixed (not elastic)
Bytes bytes = Bytes.allocateDirect(8);

// Allocate a ByteBuffer somehow, e.g. by calling
// ByteBuffer's static methods or by mapping a file
ByteBuffer bb = ByteBuffer.allocate(16);
//
// Create Bytes using the provided ByteBuffer
// as backing memory with a fixed capacity.
Bytes bytes = Bytes.wrapForWrite(bb);

// Create a byte array
byte[] ba = new byte[16];
//
// Create Bytes using the provided byte array
// as backing memory with fixed capacity.
Bytes bytes = Bytes.wrapForWrite(ba);

// Allocate Bytes which wraps an on-heap ByteBuffer
Bytes bytes = Bytes.elasticHeapByteBuffer(8);
// Acquire the current underlying ByteBuffer
ByteBuffer bb = bytes.underlyingObject();

// Allocate Bytes which wraps an off-heap direct ByteBuffer
Bytes bytes = Bytes.elasticByteBuffer(8);
// Acquire the current underlying ByteBuffer
ByteBuffer bb = bytes.underlyingObject();

// Allocate Bytes using off-heap direct memory
Bytes bytes = Bytes.allocateElasticDirect(8);
// Acquire the address of the first byte in underlying memory
// (expert use only)
long address = bytes.addressForRead(0);

// Allocate Bytes using off-heap direct memory
// but only allocate underlying memory on demand.
Bytes bytes = Bytes.allocateElasticDirect();
Releasing Bytes
With ByteBuffer, we normally do not have any control over when the underlying memory is actually released back to the operating system or heap. This can be problematic when we allocate large amounts of memory and where the actual ByteBuffer objects as such are not garbage collected.

This is how the problem may manifest itself: Even though the ByteBuffer objects themselves are small, they may hold vast resources in underlying memory. It is only when the ByteBuffers are garbage collected that the underlying memory is returned. So we may end up in a situation where we have a small number of objects on the heap (say we have 10 ByteBuffers holding 1 GB each). The JVM finds no reason to run the garbage collector with only a few objects on heap. So we have plenty of heap memory but may run out of process memory anyhow.

Bytes provides a deterministic means of releasing the underlying resources promptly as illustrated in this example below:

Bytes bytes = Bytes.allocateElasticDirect(8);
try {
    doStuff(bytes);
} finally {
    bytes.release();
}

This will ensure that underlying memory resources are released immediately after use.

If you forget to call release(), Bytes will still free the underlying resources when a garbage collection occurs just like ByteBuffer, but you could run out of memory waiting for that to happen.

Writing Data
Writing data can be done in two principal ways, using either:

  • Streaming operations 
  • Absolute operations

Streaming Operations
Streaming operations occur as a sequence of operations, each laying out its contents successively in the underlying memory. This is much like a regular sequential file that grows from zero length upwards as contents are written to it.

// Write in sequential order
bytes.writeBoolean(true)
     .writeByte((byte) 1)
     .writeInt(2);

Absolute Operations
Absolute operations can access any portion of the underlying memory in a random-access fashion, much like a random access file where content can be written at any location at any time.

// Write in any order
bytes.writeInt(2, 2)
.writeBoolean(0, true)
.writeByte(1, (byte) 1);

Invoking absolute write operations does not affect the write position used for streaming operations.
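Here is a minimal sketch of that property, reusing writePosition() from the first example (the printed values assume a freshly allocated buffer):

Bytes bytes = Bytes.allocateElasticDirect();

bytes.writeInt(0, 42);                     // absolute write at offset 0
System.out.println(bytes.writePosition()); // prints 0; untouched by the absolute write

bytes.writeInt(42);                        // streaming write
System.out.println(bytes.writePosition()); // prints 4

bytes.release();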

Reading Data
Reading data can also be done using streaming or absolute operations.

Streaming Operations
Analogous to writing, this is what streaming reads look like:

boolean flag = bytes.readBoolean();
byte b = bytes.readByte();
int i = bytes.readInt();
Absolute Operations
As with absolute writing, we can read from arbitrary positions:

int i = bytes.readInt(2);
boolean flag = bytes.readBoolean(0);
byte b = bytes.readByte(1);

Invoking absolute read operations does not affect the read position used for streaming operations.

Miscellaneous
Bytes supports writing of Strings, which ByteBuffer does not:
bytes.writeUtf8("The Rain in Spain stays mainly in the plain");

There are also methods for atomic operations:
bytes.compareAndSwapInt(16, 0, 1);

This will atomically set the int value at position 16 to 1 if and only if it is 0. This allows thread-safe constructs to be built using Bytes. ByteBuffer provides no such tools.
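For example, a lock-free counter can be sketched on top of this primitive. The sketch below assumes that compareAndSwapInt() returns a boolean indicating success and that readVolatileInt() is available for volatile reads; check the API of the version you are using:

// Atomically increment the int stored at the given offset.
static void increment(Bytes bytes, long offset) {
    for (;;) {
        int current = bytes.readVolatileInt(offset);   // assumed volatile read
        if (bytes.compareAndSwapInt(offset, current, current + 1)) {
            return; // success: no concurrent update interfered
        }
        // another thread updated the value concurrently; retry
    }
}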

Benchmarking
How fast is Bytes? Well, as always, your mileage may vary depending on numerous factors. Let us compare ByteBuffer and Bytes, where we allocate a memory region and perform some common operations on it, measuring performance with JMH (initialization code not shown for brevity):

@Benchmark
public void serializeByteBuffer() {
    byteBuffer.position(0);
    byteBuffer.putInt(POINT.x()).putInt(POINT.y());
}

@Benchmark
public void serializeBytes() {
    bytes.writePosition(0);
    bytes.writeInt(POINT.x()).writeInt(POINT.y());
}

@Benchmark
public boolean equalsByteBuffer() {
    return byteBuffer1.equals(byteBuffer2);
}

@Benchmark
public boolean equalsBytes() {
    return bytes1.equals(bytes2);
}

This produced the following output:

Benchmark                          Mode  Cnt         Score          Error  Units
Benchmarking.equalsByteBuffer     thrpt    3   3838611.249 ± 11052050.262  ops/s
Benchmarking.equalsBytes          thrpt    3  13815958.787 ±   579940.844  ops/s
Benchmarking.serializeByteBuffer  thrpt    3  29278828.739 ± 11117877.437  ops/s
Benchmarking.serializeBytes       thrpt    3  42309429.465 ±  9784674.787  ops/s

Here is a diagram of the different benchmarks showing relative performance (higher is better):



The performance of Bytes was better than that of ByteBuffer for the benchmarks run.

Generally speaking, it makes sense to reuse direct off-heap buffers since they are relatively expensive to allocate. Reuse can be achieved in many ways, including ThreadLocal variables and pooling, as sketched below. This is true for both Bytes and ByteBuffer.
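Here is a minimal sketch of the ThreadLocal approach. It assumes a clear() method that resets the read and write positions so the buffer can be reused (analogous to ByteBuffer::clear); the writePoint() method is a made-up example:

// One reusable elastic buffer per thread, allocated only once per thread.
private static final ThreadLocal<Bytes> BUFFER =
    ThreadLocal.withInitial(Bytes::allocateElasticDirect);

static void writePoint(int x, int y) {
    Bytes bytes = BUFFER.get();
    bytes.clear(); // assumed to reset the positions for reuse
    bytes.writeInt(x)
         .writeInt(y);
    // ... hand the buffer over for I/O before the next call reuses it
}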

The benchmarks were run on a MacBook Pro (Mid 2015, 2.2 GHz Intel Core i7, 16 GB) under Java 8 using all available threads. It should be noted that you should run your own benchmarks if you want a relevant comparison pertaining to a specific problem.

APIs and Streaming RPC Calls
It is easy to set up an entire framework with remote procedure calls (RPC) and APIs using Bytes, which supports writing to and replaying of events. Here is a short example where MyPerson is a POJO that implements the interface BytesMarshallable. We do not have to implement any of the methods in BytesMarshallable since it comes with default implementations.

public final class MyPerson implements BytesMarshallable {

    private String name;
    private byte type;
    private double balance;

    public MyPerson() {}

    // Getters and setters not shown for brevity

}

interface MyApi {
    @MethodId(0x81L)
    void myPerson(MyPerson byteable);
}

static void serialize() {
    MyPerson myPerson = new MyPerson();
    myPerson.setName("John");
    myPerson.setType((byte) 7);
    myPerson.setBalance(123.5);

    HexDumpBytes bytes = new HexDumpBytes();
    MyApi myApi = bytes.bytesMethodWriter(MyApi.class);

    myApi.myPerson(myPerson);

    System.out.println(bytes.toHexString());
}

Invoking serialize() will produce the following output:

81 01                     # myPerson
04 4a 6f 68 6e            # name
07                        # type
00 00 00 00 00 e0 5e 40   # balance

As can be seen, this makes it very easy to see how messages are composed.

File-backed Bytes
It is straightforward to create file-mapped bytes that grow as more data is appended, as shown hereunder:

try {
    MappedBytes mb = MappedBytes.mappedBytes(new File("mapped_file"), 1024);
    mb.appendUtf8("John")
      .append(4.3f);
} catch (FileNotFoundException fnfe) {
    fnfe.printStackTrace();
}

This will create a memory mapped file named “mapped_file”.

$ hexdump mapped_file 
0000000 4a 6f 68 6e 34 2e 33 00 00 00 00 00 00 00 00 00
0000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
0001400

Licensing and Dependencies
Bytes is open-source and licensed under the business-friendly Apache 2 license, which makes it easy to include in your own projects, whether they are commercial or not.

Bytes has three runtime dependencies: chronicle-core, slf4j-api and com.intellij:annotations, which, in turn, are licensed under Apache 2, MIT and Apache 2, respectively.

Resources
Chronicle Bytes: https://github.com/OpenHFT/Chronicle-Bytes

The Bytes library provides many interesting features and good performance.
Java: How to Slash Down Building Times Using the Cloud
Building larger Java projects on a laptop with Maven can be frustrating and slow. Learn how you can slash build times by building in the cloud instead.

Setup
As a founder of the open-source Speedment Stream ORM, I usually build the project several times per day on my now somewhat old laptop (MacBook Pro, Mid 2015). The Speedment project consists of over 60 modules and the build process is managed by Maven. The project lives here on GitHub.

I wanted to find out if I could save time by building the project in the cloud instead. In this short article, I will share my results. I have compared my laptop with Oracle Cloud, running the same build process.

I am using the following setup:


                 Laptop                   Oracle Cloud
Java JDK         OracleJDK 1.8.0_191      OracleJDK 1.8.0_201
Maven Version    3.6.0                    3.5.4
CPU Cores        4                        4
CPU Type         2.2 GHz Intel Core i7    2.0 GHz Intel Xeon Platinum 8167M
RAM              16G                      30G

I should mention that we also have continuous integration servers that run in the cloud using Jenkins.

Laptop
Pers-MBP:speedment pemi$ time mvn clean install

...
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 07:46 min
[INFO] Finished at: 2019-04-09T15:34:25+02:00
[INFO] ------------------------------------------------------------------------

real 7m48.065s
user 12m33.850s
sys 0m50.476s
Oracle Cloud
[opc@instance-20190409-xxxx speedment]$ time mvn clean install

...
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 03:41 min
[INFO] Finished at: 2019-04-09T13:30:20Z
[INFO] ------------------------------------------------------------------------

real 3m42.602s
user 10m22.353s
sys 0m32.967s
Parallel Builds
Running parallel builds reduces build time:

Pers-MBP:speedment pemi$ time mvn -T 4 clean install

real 4m47.629s
user 14m24.607s
sys 0m56.834s


[opc@instance-20190409-xxxx speedment]$ time mvn -T 4 clean install

real 3m21.731s
user 11m15.436s
sys 0m34.000s
Summary
The following graph shows a comparison of sequential Speedment Maven builds on my laptop vs. Oracle Cloud (lower is better):



The next graph shows a comparison for parallel builds (lower is better):



The conclusion is that sequential build time was reduced by over 50% when I used the cloud solution and the parallel build time was reduced by 30%.

If I re-build completely two times a day, this means I will save 2 hours per month. More importantly, I will get feedback faster so I could stay “in the development flow”.

As a final word, it should be noted that there are other complementary ways of reducing build times, including selecting appropriate Maven and JVM parameters, building only changed modules, and running the build under GraalVM.

Resources
Speedment Open Source: https://github.com/speedment/speedment
Oracle Cloud: https://cloud.oracle.com/home
Java: How to Become More Productive with Hazelcast in Less Than 5 Minutes
What if you want to use a Hazelcast In-Memory Data Grid (IMDG) to speed up your database applications, but you have hundreds of tables to handle? Manually coding all Java POJOs and serialization support would entail weeks of work, and once done, maintaining that domain model by hand would soon turn into a nightmare. Read this article and learn how to save time and do it in 5 minutes.

Now there is a graceful way to manage these sorts of requirements. The Hazelcast Auto DB Integration Tool allows connection to an existing database which can generate all these boilerplate classes automatically. We get true POJOs, serialization support, configuration, MapStore/MapLoad, ingest and more without having to write a single line of manual code. As a bonus, we get Java Stream support for Hazelcast distributed maps.

Using the Tool
Let us try an example. As in many of my articles, I will be using the Sakila open-source example database. It can be downloaded as a file or as a Docker instance. Sakila contains 16 tables and a total of 90 columns in those tables. It also includes seven views with additional columns.

To start, we use the Hazelcast Auto DB Integration Initializer and a trial license key.


Fill in the values as shown above and press “Download” and your project is saved to your computer. Then, follow the instructions on the next page explaining how to unzip, start the tool and get the trial license.

Next, we connect to the database:



The tool now analyses the schema metadata and then visualizes the database schema in another window:



Just press the “Generate” button and the complete Hazelcast domain model will be generated automatically within 2 or 3 seconds.



Now, we are almost ready to write our Hazelcast IMDG application. We need to create a Hazelcast IMDG to store the actual data in first.

Architecture
This is how the architecture looks: the application talks to the Hazelcast IMDG which, in turn, gets its data from the underlying database:





The code generated by the tool need only be present in the Application and not in the Hazelcast IMDG.

Creating a Hazelcast IMDG
Creating a Hazelcast IMDG is easy. Add the following dependency to your pom.xml file:

<dependency>
    <groupId>com.hazelcast</groupId>
    <artifactId>hazelcast</artifactId>
    <version>3.11</version>
</dependency>

Then, copy the following class to your project:

public class Server {

    public static void main(String... args) throws InterruptedException {
        final HazelcastInstance instance = Hazelcast.newHazelcastInstance();
        while (true) {
            Thread.sleep(1000);
        }
    }

}

Run this main method three times to create three Hazelcast nodes in a cluster. More recent versions of IDEA require “Allow parallel run” to be enabled in the Run/Debug Configurations. If you only run it once, that is OK too. The example below will still work even though we would just have one node in our cluster.

Running the main method three times will produce something like this:

Members {size:3, ver:3} [
    Member [172.16.9.72]:5701 - d80bfa53-61d3-4581-afd5-8df36aec5bc0
    Member [172.16.9.72]:5702 - ee312d87-abe6-4ba8-9525-c4c83d6d99b7
    Member [172.16.9.72]:5703 - 71105c36-1de8-48d8-80eb-7941cc6948b4 this
]

Nice! Our three-node cluster is up and running!

Data Ingest
Before we can run any business logic, we need to ingest data from our database into the newly created Hazelcast IMDG. Luckily, the tool does this for us too. Locate the generated class named SakilaIngest and run it with the database password as the first command line parameter, or modify the code so it knows the password. This is what the generated class looks like:

public final class SakilaIngest {

    public static void main(final String... argv) {
        if (argv.length == 0) {
            System.out.println("Usage: " + SakilaIngest.class.getSimpleName() + " database_password");
        } else {
            try (Speedment app = new SakilaApplicationBuilder()
                    .withPassword(argv[0]) // Get the password from the first command line parameter
                    .withBundle(HazelcastBundle.class)
                    .build()) {

                IngestUtil.ingest(app).join();
            }
        }
    }
}
When run, the following output is shown (shortened for brevity):

...
Completed 599 row(s) ingest of data for Hazelcast Map sakila.sakila.customer_list
Completed 2 row(s) ingest of data for Hazelcast Map sakila.sakila.sales_by_store
Completed 16,049 row(s) ingest of data for Hazelcast Map sakila.sakila.payment
Completed 16,044 row(s) ingest of data for Hazelcast Map sakila.sakila.rental
Completed 200 row(s) ingest of data for Hazelcast Map sakila.sakila.actor_info

We now have all data from the database in the Hazelcast IMDG. Nice!

Hello World
Now that our grid is live and we have ingested data, we have access to populated Hazelcast maps. Here is a program that prints all films of length greater than one hour to the console using the Map interface:
public static void main(final String... argv) {
    try (Speedment app = new SakilaApplicationBuilder()
            .withPassword("your-db-password-goes-here")
            .withBundle(HazelcastBundle.class)
            .build()) {

        HazelcastInstance hazelcast = app.getOrThrow(HazelcastInstanceComponent.class).get();

        IMap<Integer, Film> filmMap = hazelcast.getMap("sakila.sakila.film");
        filmMap.forEach((k, v) -> {
            if (v.getLength().orElse(0) > 60) {
                System.out.println(v);
            }
        });

    }
}

The film length is an optional variable (i.e., nullable in the database) so it gets automatically mapped to an OptionalLong. It is possible to set this behavior to “legacy POJO” that returns null if that is desirable in the project at hand.

There is also an additional feature with the tool: We get Java Stream support! So, we could write the same functionality like this:

public static void main(final String... argv) {
    try (Speedment app = new SakilaApplicationBuilder()
            .withPassword("your-db-password-goes-here")
            .withBundle(HazelcastBundle.class)
            .build()) {

        FilmManager films = app.getOrThrow(FilmManager.class);

        films.stream()
             .filter(Film.LENGTH.greaterThan(60))
             .forEach(System.out::println);

    }
}

Under the Hood
The tool generates POJOs that implement Hazelcast’s “Portable” serialization support. This means that data in the grid is accessible from applications written in many languages like Java, Go, C#, JavaScript, etc.

The tool generates the following Hazelcast classes:

POJO: One for each table/view that implements the Portable interface.

Serialization Factory: One for each schema. This is needed to efficiently create Portable POJOs when de-serializing data from the IMDG in the client.

MapStore/MapLoad: One for each table/view. These classes can be used by the IMDG to load data directly from a database.

Class Definition: One for each table/view. These classes are used for configuration.

Index utility method: One per project. This can be used to improve the indexing of the IMDG based on the database indexing.

Config support: One per project. Creates automatic configuration of serialization factories, class definitions, and some performance settings.

Ingest support: One per project. A template for ingesting data from the database into the Hazelcast IMDG.

The tool also contains other features such as support for Hazelcast Cloud and Java Stream support.

A particularly appealing property is that the domain model (e.g., POJOs and serializers) does not need to be on the classpath of the servers. They only need to be on the classpath on the client side. This dramatically simplifies the setup and management of the grid. For example, if you need more nodes, add a new generic grid node and it will join the cluster and start participating directly.

Hazelcast Cloud
Connections to Hazelcast Cloud instances can easily be configured using the application builder, as shown in this example:

Speedment hazelcastApp = new SakilaApplicationBuilder()
    .withPassword("<db-password>")
    .withBundle(HazelcastBundle.class)
    .withComponent(HazelcastCloudConfig.class,
        () -> HazelcastCloudConfig.create(
            "<name of cluster>",
            "<cluster password>",
            "<discovery token>"
        )
    )
    .build();
Savings
I estimate that the tool saved me several hours (if not days) of boilerplate coding just for the smaller example Sakila database. In an enterprise-grade project with hundreds of tables, the tool would save a massive amount of time, both in terms of development and maintenance.

Now that you have learned how to create code for your first exemplary project and have set up all the necessary tools, I am convinced that you could generate code for any Hazelcast database project in under 5 minutes.

Resources
Sakila: https://dev.mysql.com/doc/index-other.html or https://hub.docker.com/r/restsql/mysql-sakila
Initializer: https://www.speedment.com/hazelcast-initializer/
Manual: https://speedment.github.io/speedment-doc/hazelcast.html
Java Stream: Part 2, Is a Count Always a Count?
In my previous article on the subject, we learned that JDK 8’s stream()::count takes longer to execute the more elements there are in the Stream. For more recent JDKs, such as Java 11, that is no longer the case for simple stream pipelines. Learn how things have improved within the JDK itself.

Java 8
In my previous article, we could conclude that the operation list.stream().count() is O(N) under Java 8, i.e. the execution time depends on the number of elements in the original list. Read the article here.

Java 9 and Upwards
As rightfully pointed out by Nikolai Parlog (@nipafx) and Brian Goetz (@BrianGoetz) on Twitter, the implementation of Stream::count was improved beginning with Java 9. Here is a comparison of the underlying Stream::count code between Java 8 and later Java versions:

Java 8 (from the ReferencePipeline class)
return mapToLong(e -> 1L).sum();
Java 9 and later (from the ReduceOps class)
if (StreamOpFlag.SIZED.isKnown(flags)) {
    return spliterator.getExactSizeIfKnown();
}
...

It appears Stream::count in Java 9 and later is O(1) for Spliterators of known size rather than being O(N). Let’s verify that hypothesis.

Benchmarks
The big-O property can be observed by running the following JMH benchmarks under Java 8 and Java 11:

@State(Scope.Benchmark)
public class CountBenchmark {

    private List<Integer> list;

    @Param({"1", "1000", "1000000"})
    private int size;

    @Setup
    public void setup() {
        list = IntStream.range(0, size)
                .boxed()
                .collect(toList());
    }

    @Benchmark
    public long listSize() {
        return list.size();
    }

    @Benchmark
    public long listStreamCount() {
        return list.stream().count();
    }

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(CountBenchmark.class.getSimpleName())
                .mode(Mode.Throughput)
                .threads(Threads.MAX)
                .forks(1)
                .warmupIterations(5)
                .measurementIterations(5)
                .build();

        new Runner(opt).run();
    }

}

This will produce the following outputs on my laptop (MacBook Pro mid 2015, 2.2 GHz Intel Core i7):

JDK 8 (from my previous article)
Benchmark                        (size)   Mode  Cnt          Score           Error  Units
CountBenchmark.listSize               1  thrpt    5  966658591.905 ± 175787129.100  ops/s
CountBenchmark.listSize            1000  thrpt    5  862173760.015 ± 293958267.033  ops/s
CountBenchmark.listSize         1000000  thrpt    5  879607621.737 ± 107212069.065  ops/s
CountBenchmark.listStreamCount        1  thrpt    5   39570790.720 ±   3590270.059  ops/s
CountBenchmark.listStreamCount     1000  thrpt    5   30383397.354 ±  10194137.917  ops/s
CountBenchmark.listStreamCount  1000000  thrpt    5        398.959 ±       170.737  ops/s

JDK 11
Benchmark                        (size)   Mode  Cnt          Score           Error  Units
CountBenchmark.listSize               1  thrpt    5  898916944.365 ± 235047181.830  ops/s
CountBenchmark.listSize            1000  thrpt    5  865080967.750 ± 203793349.257  ops/s
CountBenchmark.listSize         1000000  thrpt    5  935820818.641 ±  95756219.869  ops/s
CountBenchmark.listStreamCount        1  thrpt    5   95660206.302 ±  27337762.894  ops/s
CountBenchmark.listStreamCount     1000  thrpt    5   78899026.467 ±  26299885.209  ops/s
CountBenchmark.listStreamCount  1000000  thrpt    5   83223688.534 ±  16119403.504  ops/s

As can be seen, in Java 11, the list.stream().count() operation is now O(1) and not O(N).

Brian Goetz pointed out that some developers, who were using Stream::peek method calls under Java 8, discovered that these methods were no longer invoked if the Stream::count terminal operation was run under Java 9 and onwards. This generated some negative feedback to the JDK developers. Personally, I think it was the right decision by the JDK developers and that this instead presented a great opportunity for Stream::peek users to get their code right.

More Complex Stream Pipelines
In this section, we will take a look at more complex stream pipelines.

JDK 11
Tagir Valeev concluded that pipelines like stream().skip(1).count() are not O(1) for List::stream. This can be observed by running the following benchmark:
@Benchmark
public long listStreamSkipCount() {
    return list.stream().skip(1).count();
}

CountBenchmark.listStreamCount              1  thrpt    5  105546649.075 ± 10529832.319  ops/s
CountBenchmark.listStreamCount           1000  thrpt    5   81370237.291 ± 15566491.838  ops/s
CountBenchmark.listStreamCount        1000000  thrpt    5   75929699.395 ± 14784433.428  ops/s
CountBenchmark.listStreamSkipCount          1  thrpt    5   35809816.451 ± 12055461.025  ops/s
CountBenchmark.listStreamSkipCount       1000  thrpt    5    3098848.946 ±   339437.339  ops/s
CountBenchmark.listStreamSkipCount    1000000  thrpt    5       3646.513 ±      254.442  ops/s

Thus, list.stream().skip(1).count() is still O(N).

Speedment
Some stream implementations are actually aware of their sources and can take appropriate shortcuts and merge stream operations into the stream source itself. This can improve performance massively, especially for large streams with more complex stream pipelines like stream().skip(1).count().

The Speedment ORM tool allows databases to be viewed as Stream objects, and these streams can optimize away many stream operations like the Stream::count, Stream::skip and Stream::limit operations, as demonstrated in the benchmark below. I have used the open-source Sakila exemplary database as data input. The Sakila database is all about rental films, artists etc.

@Benchmark
public long rentalsSkipCount() {
    return rentals.stream().skip(1).count();
}

@Benchmark
public long filmsSkipCount() {
    return films.stream().skip(1).count();
}
When run, the following output will be produced:

SpeedmentCountBenchmark.filmsSkipCount    N/A  thrpt    5  68052838.621 ±  739171.008  ops/s
SpeedmentCountBenchmark.rentalsSkipCount  N/A  thrpt    5  68224985.736 ± 2683811.510  ops/s

The “rental” table contains over 10,000 rows whereas the “film” table only contains 1,000 rows. Nevertheless, their stream().skip(1).count() operations complete in almost the same time. Even if a table contained a trillion rows, it would still count the elements in the same elapsed time. Thus, the stream().skip(1).count() implementation has a complexity that is O(1) and not O(N).

Note: The benchmarks above were run with “DataStore” in-JVM-memory acceleration. If run with no acceleration, directly against a database, the response time would depend on the underlying database’s ability to execute a nested “SELECT count(*) …” statement.

Summary
Stream::count was significantly improved in Java 9.

There are stream implementations, such as Speedment, that are able to compute Stream::count in O(1) time even for more complex stream pipelines like stream().skip(...).count() or even stream().filter(...).skip(...).count().

Resources
Speedment Stream ORM Initializer: https://www.speedment.com/initializer/

Sakila: https://dev.mysql.com/doc/index-other.html or https://hub.docker.com/r/restsql/mysql-sakila
Java Stream: Is a Count Always a Count?
It might appear obvious that counting the elements in a Stream takes longer the more elements there are in the Stream. But actually, Stream::count can sometimes be done in a single operation, no matter how many elements you have. Read this article and learn how.

Count Complexity
The Stream::count terminal operation counts the number of elements in a Stream. The complexity of the operation is often O(N), meaning that the number of sub-operations is proportional to the number of elements in the Stream.

In contrast, the List::size method has a complexity of O(1) which means that regardless of the number of elements in the List, the size() method will return in constant time. This can be observed by running the following JMH benchmarks:
@State(Scope.Benchmark)
public class CountBenchmark {

    private List<Integer> list;

    @Param({"1", "1000", "1000000"})
    private int size;

    @Setup
    public void setup() {
        list = IntStream.range(0, size)
                .boxed()
                .collect(toList());
    }

    @Benchmark
    public long listSize() {
        return list.size();
    }

    @Benchmark
    public long listStreamCount() {
        return list.stream().count();
    }

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(CountBenchmark.class.getSimpleName())
                .mode(Mode.Throughput)
                .threads(Threads.MAX)
                .forks(1)
                .warmupIterations(5)
                .measurementIterations(5)
                .build();

        new Runner(opt).run();
    }

}

This produced the following output on my laptop (MacBook Pro mid 2015, 2.2 GHz Intel Core i7):

Benchmark                        (size)   Mode  Cnt          Score           Error  Units
CountBenchmark.listSize               1  thrpt    5  966658591.905 ± 175787129.100  ops/s
CountBenchmark.listSize            1000  thrpt    5  862173760.015 ± 293958267.033  ops/s
CountBenchmark.listSize         1000000  thrpt    5  879607621.737 ± 107212069.065  ops/s
CountBenchmark.listStreamCount        1  thrpt    5   39570790.720 ±   3590270.059  ops/s
CountBenchmark.listStreamCount     1000  thrpt    5   30383397.354 ±  10194137.917  ops/s
CountBenchmark.listStreamCount  1000000  thrpt    5        398.959 ±       170.737  ops/s

As can be seen, the throughput of List::size is largely independent of the number of elements in the List, whereas the throughput of Stream::count drops off rapidly as the number of elements grows. But is this really the case for all Stream implementations?

Source Aware Streams
Some stream implementations are actually aware of their sources and can take appropriate shortcuts and merge stream operations into the stream source itself. This can improve performance massively, especially for large streams. The Speedment ORM tool allows databases to be viewed as Stream objects, and these streams can optimize away many stream operations like the Stream::count operation, as demonstrated in the benchmark below. I have used the open-source Sakila exemplary database as data input. The Sakila database is all about rental films, artists etc.

@State(Scope.Benchmark)
public class SpeedmentCountBenchmark {

    private Speedment app;
    private RentalManager rentals;
    private FilmManager films;

    @Setup
    public void setup() {
        app = new SakilaApplicationBuilder()
                .withBundle(DataStoreBundle.class)
                .withLogging(ApplicationBuilder.LogType.STREAM)
                .withPassword(ExampleUtil.DEFAULT_PASSWORD)
                .build();

        app.get(DataStoreComponent.class).ifPresent(DataStoreComponent::load);

        rentals = app.getOrThrow(RentalManager.class);
        films = app.getOrThrow(FilmManager.class);
    }

    @TearDown
    public void tearDown() {
        app.close();
    }

    @Benchmark
    public long rentalsCount() {
        return rentals.stream().count();
    }

    @Benchmark
    public long filmsCount() {
        return films.stream().count();
    }

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(SpeedmentCountBenchmark.class.getSimpleName())
                .mode(Mode.Throughput)
                .threads(Threads.MAX)
                .forks(1)
                .warmupIterations(5)
                .measurementIterations(5)
                .build();

        new Runner(opt).run();
    }

}
When run, the following output will be produced:

Benchmark                              Mode  Cnt         Score          Error  Units
SpeedmentCountBenchmark.filmsCount    thrpt    5  71037544.648 ± 75915974.254  ops/s
SpeedmentCountBenchmark.rentalsCount  thrpt    5  69750012.675 ± 37961414.355  ops/s

The “rental” table contains over 10,000 rows whereas the “film” table only contains 1,000 rows. Nevertheless, their Stream::count operations complete in almost the same time. Even if a table contained a trillion rows, it would still count the elements in the same elapsed time. Thus, the Stream::count implementation has a complexity that is O(1) and not O(N).

Note: The benchmark above was run with Speedment's “DataStore” in-JVM-memory acceleration. If run with no acceleration, directly against a database, the response time would depend on the underlying database’s ability to execute a “SELECT count(*) FROM film” query.

Summary
It is possible to create Stream implementations that count their elements in a single operation rather than counting each and every element in the stream. This can improve performance significantly, especially for streams with many elements.

Resources
Speedment Stream ORM Initializer: https://www.speedment.com/initializer/
Sakila: https://dev.mysql.com/doc/index-other.html or https://hub.docker.com/r/restsql/mysql-sakila
Java 12: Mapping with Switch Expressions
In this article, we will be looking at the new Java 12 feature “Switch Expressions” and how it can be used in conjunction with the Stream::map operation and some other Stream operations. Learn how you can make your code better with Streams and Switch Expressions.

Switch Expressions
Java 12 comes with “preview” support for “Switch Expressions”. Switch Expressions allow a switch to return a value directly, as shown hereunder:

public String newSwitch(int day) {
    return switch (day) {
        case 2, 3, 4, 5, 6 -> "weekday";
        case 7, 1 -> "weekend";
        default -> "invalid";
    } + " category";
}
Invoking this method with 1 will return “weekend category”.

This is great and makes our code shorter and more concise. We do not have to bother with fall-through concerns, blocks, mutable temporary variables, or missed cases/default that might be the case for the good ole’ switch. Just look at this corresponding old switch example and you will see what I mean:
public String oldSwitch(int day) {
    final String attr;
    switch (day) {
        case 2: case 3: case 4: case 5: case 6: {
            attr = "weekday";
            break;
        }
        case 7: case 1: {
            attr = "weekend";
            break;
        }
        default: {
            attr = "invalid";
        }
    }
    return attr + " category";
}
Switch Expressions Is a Preview Feature
In order to get Switch Expressions to work under Java 12, we must pass “--enable-preview” as a command line argument both when we compile and when we run our application. This proved to be a bit tricky, but hopefully, it will get easier with the release of new IDE versions and/or/if Java incorporates this feature as a fully supported feature. IntelliJ users need to use version 2019.1 or later.
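For reference, compiling and running from the command line could look like this (the Main class name is just an example):

javac --enable-preview --release 12 Main.java
java --enable-preview Main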

Switch Expressions in Stream::map
Switch Expressions are very easy to use in Stream::map operators, especially when compared with the old switch syntax. I have used the Speedment Stream ORM and the Sakila exemplary database in the examples below. The Sakila database is all about films, actors and so forth.

Here is a stream that decodes a film language id (a short) to a full language name (a String) using map() in combination with a Switch Expression:

public static void main(String... argv) {

    try (Speedment app = new SakilaApplicationBuilder()
            .withPassword("enter-your-db-password-here")
            .build()) {

        FilmManager films = app.getOrThrow(FilmManager.class);

        List<String> languages = films.stream()
                .map(f -> "the " + switch (f.getLanguageId()) {
                    case 1 -> "English";
                    case 2 -> "French";
                    case 3 -> "German";
                    default -> "Unknown";
                } + " language")
                .collect(toList());

        System.out.println(languages);
    }
}
This will create a stream of all the 1,000 films in the database and then it will map each film to a corresponding language name and collect all those names into a List. Running this example will produce the following output (shortened for brevity):

[the English language, the English language, … ]

If we would have used the old switch syntax, we would have gotten something like this:
...
List<String> languages = films.stream()
        .map(f -> {
            final String language;
            switch (f.getLanguageId()) {
                case 1: {
                    language = "English";
                    break;
                }
                case 2: {
                    language = "French";
                    break;
                }
                case 3: {
                    language = "German";
                    break;
                }
                default: {
                    language = "Unknown";
                }
            }
            return "the " + language + " language";
        })
        .collect(toList());
...
Or, perhaps something like this:
...
List<String> languages = films.stream()
        .map(f -> {
            switch (f.getLanguageId()) {
                case 1: return "the English language";
                case 2: return "the French language";
                case 3: return "the German language";
                default: return "the Unknown language";
            }
        })
        .collect(toList());
...
The latter example is shorter but duplicates logic.

Switch Expressions in Stream::mapToInt
In this example, we will compute summary statistics about scores we assign based on a film’s rating. The more restricted the rating, the higher the score, according to our own invented scale:
IntSummaryStatistics statistics = films.stream()
    .mapToInt(f -> switch (f.getRating().orElse("Unrated")) {
        case "G", "PG" -> 0;
        case "PG-13" -> 1;
        case "R" -> 2;
        case "NC-17" -> 5;
        case "Unrated" -> 10;
        default -> 0;
    })
    .summaryStatistics();

System.out.println(statistics);
This will produce the following output:
IntSummaryStatistics{count=1000, sum=1663, min=0, average=1.663000, max=5}

In this case, the difference between the Switch Expressions and the old switch is not that big. Using the old switch we could have written:

IntSummaryStatistics statistics = films.stream()
    .mapToInt(f -> {
        switch (f.getRating().orElse("Unrated")) {
            case "G": case "PG": return 0;
            case "PG-13": return 1;
            case "R": return 2;
            case "NC-17": return 5;
            case "Unrated": return 10;
            default: return 0;
        }
    })
    .summaryStatistics();
Switch Expressions in Stream::collect
This last example shows the use of a Switch Expression in a groupingBy Collector. In this case, we would like to count how many films can be seen by a person of a certain minimum age. Here, we are using a Map with the minimum age as keys and counted films as values.
Map<Integer, Long> ageMap = films.stream()
    .collect(
        groupingBy(f -> switch (f.getRating().orElse("Unrated")) {
                case "G", "PG" -> 0;
                case "PG-13" -> 13;
                case "R" -> 17;
                case "NC-17" -> 18;
                case "Unrated" -> 21;
                default -> 0;
            },
            TreeMap::new,
            Collectors.counting()
        )
    );

System.out.println(ageMap);
This will produce the following output:
{0=372, 13=223, 17=195, 18=210}
By providing the (optional) groupingBy Map supplier TreeMap::new, we get our ages in sorted order. Why PG-13 can be seen from 13 years of age but NC-17 cannot be seen from 17 but instead from 18 years of age is mysterious but outside the scope of this article.

Summary
I am looking forward to the Switch Expressions feature being officially incorporated in Java. Switch Expressions can sometimes replace lambdas and method references for many stream operation types.

Resources
Sakila: https://dev.mysql.com/doc/index-other.html or https://hub.docker.com/r/restsql/mysql-sakila
JDK 12 Download: https://jdk.java.net/12/
Speedment Stream ORM Initializer: https://www.speedment.com/initializer/
Who’s Been Naughty, Who’s Been Nice? Santa Gives You Java 11 Advice!

Ever wondered how Santa can deliver holiday gifts to all kids around the world? There are 2 billion kids, each with an individual wishlist, and he does it in 24 hours. This means 43 microseconds per kid on average, and he needs to check whether every child has been naughty or nice.

You do not need to wonder anymore. I will reveal the secret. He is using Java 11 and a modern stream ORM with superfast execution.

Even though Santa’s backing database is old and slow, he can analyze data in microseconds by using standard Java streams and in-JVM-memory technology. Santa’s database contains two tables: Child, which holds every child in the world, and HolidayGift, which specifies all the items available for production in Santa’s workshop. A child can only have one wish, such are the hash rules.

Viewing the Database as Streams
Speedment is a modern stream-based ORM which is able to view relational database tables as standard Java streams. As we all know, only nice children get gifts, so it is important to distinguish between those who’s been naughty and those who’s been nice. This is easily accomplished with the following code:
var niceChildren = children.stream()
        .filter(Child.NICE.isTrue())
        .sorted(Child.COUNTRY.comparator())
        .collect(Collectors.toList());

This stream will yield a long list containing only the kids that have been nice. To enable Santa to optimize his delivery route, the list is sorted by country of residence.

Joining Child and HolidayGift
This list seems incomplete though. How does Santa keep track of which gift goes to whom? Now the HolidayGift table will come in handy. Since some children provided Santa with their wish lists, we can now join the two tables together to make a complete list containing all the nice children and their corresponding gifts. It is important to include the children without any wish (they will get a random gift); therefore, we make a left join.
var join = joinComponent
        .from(ChildManager.IDENTIFIER)
        .where(Child.NICE.isTrue())
        .leftJoinOn(HolidayGift.GIFT_ID).equal(Child.GIFT_ID)
        .build(Tuples::of);

Speedment uses a builder pattern to create a Join<T> object which can then be reused over and over again to create streams with elements of type T. In this case, it is used to join Child and HolidayGift. The join only includes children that are nice and matches rows which contain the same value in the gift_id fields.

This is how Santa delivers all packages:

join.stream()
    .parallel()
    .forEach(SleighUtil::deliver);

As can be seen, Santa can easily deliver all the packages with parallel sleighs, carried by reindeer.

This will render the stream to an efficient SQL query but unfortunately, it is not quick enough to make it in time.

Using In-JVM-Memory Acceleration
Now to the fun part. Santa is activating the in-JVM-memory acceleration component in Speedment, called DataStore. This is done in the following way:
var santasWorkshop = new ApplicationBuilder()
        .withPassword("north-pole")
        // Activate DataStore
        .withBundle(DataStoreBundle.class)
        .build();

// Load a snapshot of the database into off-heap memory
santasWorkshop.get(DataStoreComponent.class)
        .ifPresent(DataStoreComponent::load);

This startup configuration is the only needed adjustment to the application. All stream constructs above remain the same. When the application is started, a snapshot of the database is pulled into the JVM and is stored off-heap. Because the data is stored off-heap, it will not influence garbage collection, and the amount of data is only limited by available RAM. Nothing prevents Santa from loading terabytes of data since he is using a cloud service and can easily expand his RAM. Now the application will run orders of magnitude faster and Santa will be able to deliver all packages in time.

Run Your Own Projects with In-JVM-Memory Acceleration
If you want to try for yourself how fast a database application can be, there is an Initializer that can be found here. Just tick your desired database type (Oracle, MySQL, MariaDB, PostgreSQL, Microsoft SQL Server, DB2 or AS400) and you will get a POM and an application template automatically generated for you.

If you need more help setting up your project, check out the Speedment GitHub page or explore the user guide.

Authors
Thank you, Julia Gustafsson and Carina Dreifeldt, for co-writing this article.
Java: Aggregate Data Off-Heap
Explore how to create off-heap aggregations with a minimum of garbage collection impact and maximum memory utilization.

Creating large aggregations using Java Map, List and Object normally creates a lot of heap memory overhead. This also means that the garbage collector will have to clean up these objects once the aggregation goes out of scope.

Read this short article and discover how we can use Speedment Stream ORM to create off-heap aggregations that can utilize memory more efficiently and with little or no GC impact.
Person
Let’s say we have a large number of Person objects that take the following shape:

public class Person {
    private final int age;
    private final short height;
    private final short weight;
    private final String gender;
    private final double salary;

    // Getters and setters hidden for brevity
}

For the sake of argument, we also have access to a method called persons() that will create a new Stream with all these Person objects.
Salary per Age
We want to compute the average salary for each age bucket. To represent the results of the aggregation, we will be using a data class called AgeSalary which associates a certain age with an average salary.

public class AgeSalary {
    private int age;
    private double avgSalary;

    // Getters and setters hidden for brevity
}

Age grouping for salaries normally entails fewer than 100 buckets being used, so this example is just to show the principle. The more buckets, the more sense it makes to aggregate off-heap.

Solution
Using Speedment Stream ORM, we can derive an off-heap aggregation solution with these three steps:
Create an Aggregator
var aggregator = Aggregator.builderOfType(Person.class, AgeSalary::new)
        .on(Person::age).key(AgeSalary::setAge)
        .on(Person::salary).average(AgeSalary::setAvgSalary)
        .build();

The aggregator can be reused over and over again.
Compute an Aggregation
var aggregation = persons().collect(aggregator.createCollector());

Using the aggregator, we create a standard Java stream Collector that has its internal state completely off-heap.
Use the Aggregation Result
aggregation.streamAndClose()
        .forEach(System.out::println);

Since the Aggregation holds data that is stored off-heap, it may benefit from explicit closing rather than just being cleaned up potentially much later. Closing the Aggregation can be done by calling the close() method, possibly by taking advantage of the AutoCloseable trait, or as in the example above by using streamAndClose() which returns a stream that will close the Aggregation after stream termination.
Everything in a One-Liner
The code above can be condensed to what is effectively a one-liner:

persons().collect(
        Aggregator.builderOfType(Person.class, AgeSalary::new)
                .on(Person::age).key(AgeSalary::setAge)
                .on(Person::salary).average(AgeSalary::setAvgSalary)
                .build()
                .createCollector())
        .streamAndClose()
        .forEach(System.out::println);

There is also support for parallel aggregations. Just add the stream operation Stream::parallel and aggregation is done using the ForkJoin pool.
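A parallel variant might look like this sketch, reusing the aggregator from the first step (the order of the printed buckets is then not guaranteed):

persons().parallel()
        .collect(aggregator.createCollector())
        .streamAndClose()
        .forEach(System.out::println);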
Resources
Download Speedment here

Read more about off-heap aggregations here
Java 11: JOIN Tables, Get Java Streams
Ever wondered how you could turn joined database tables into a Java Stream? Read this short article and find out how it is done using the Speedment Stream ORM. We will start with a Java 8 example and then look at the improvements with Java 11.

Java 8 and JOINs
Speedment allows dynamically joined database tables to be consumed as standard Java Streams. We begin by looking at a solution for Java 8 using the Sakila exemplary database:

Speedment app = ...;

JoinComponent joinComponent = app.getOrThrow(JoinComponent.class);

Join<Tuple2OfNullables<Language, Film>> join = joinComponent
        .from(LanguageManager.IDENTIFIER)
        .innerJoinOn(Film.LANGUAGE_ID).equal(Language.LANGUAGE_ID)
        .build();

join.stream()
        .forEach(System.out::println);

This will produce the following output (reformatted and shortened for readability):

Tuple2OfNullablesImpl {
    LanguageImpl { languageId = 1, name = English, ... },
    FilmImpl { filmId = 1, title = ACADEMY DINOSAUR, ... }
}
Tuple2OfNullablesImpl {
    LanguageImpl { languageId = 1, name = English, ... },
    FilmImpl { filmId = 2, title = ACE GOLDFINGER, ... }
}
Tuple2OfNullablesImpl {
    LanguageImpl { languageId = 1, name = English, ... },
    FilmImpl { filmId = 3, title = ADAPTATION HOLES, ... }
}
...

Java 11 and JOINs
In the new Java version 11 there is local variable type inference (a.k.a. var declarations) which makes it even easier to write joins with Speedment. We do not have to explicitly state the type of the join variable:

Speedment app = ...;

JoinComponent joinComponent = app.getOrThrow(JoinComponent.class);

var join = joinComponent
        .from(LanguageManager.IDENTIFIER)
        .innerJoinOn(Film.LANGUAGE_ID).equal(Language.LANGUAGE_ID)
        .build();

join.stream()
        .forEach(System.out::println);
Code Breakdown
The from() method takes the first table we want to use (Language). The innerJoinOn() method takes a specific column of the second table we want to join. Then, the equal() method takes a column from the first table that we want to use as our join condition. So, in this example, we will get matched Language and Film entities where the column Film.LANGUAGE_ID equals Language.LANGUAGE_ID.

Finally, build() will construct our Join object that can, in turn, be used to create Java Streams. The Join object can be re-used over and over again.

JOIN Types and Conditions
We can use innerJoinOn(), leftJoinOn(), rightJoinOn() and crossJoin(), and tables can be joined using the conditions equal(), notEqual(), lessThan(), lessOrEqual(), greaterThan() and greaterOrEqual().
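As a sketch, a left join can be expressed with the same builder pattern as above (with a left join, rows from the first table are kept even when there is no matching row in the second table):

var join = joinComponent
        .from(LanguageManager.IDENTIFIER)
        .leftJoinOn(Film.LANGUAGE_ID).equal(Language.LANGUAGE_ID)
        .build();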

What's Next?
Download open-source Java 11 here.
Download Speedment here.
Read all about the JOIN functionality in the Speedment User's Guide.
Shortest Code and Lowest Latency
I was a speaker at Oracle CodeOne 2018 in San Francisco and, in conjunction with that, I promoted a code challenge where the objective was to write code that solved a problem with the lowest possible latency multiplied by the number of code lines used. The contestant with the lowest score would be the winner (i.e. having low latency and at the same time using as few lines as possible is good).

The input data was to be taken from the table “film” in the open-source Sakila database.

More specifically, the objective was to develop a Java application that computes the sum, min, max, and average rental duration for five films out of the existing 1,000 films using a general solution. The five films should be the films around the median film length, starting with the 498th and ending with the 502nd film (inclusive) in ascending film length.

Contestants were free to use any Java library and any solution available such as Hibernate, JDBC or other ORM tools.

A Solution Based on SQL/JDBC
One way of solving the problem would be to use SQL/JDBC directly, as shown hereunder. Here is an example of SQL code that solves the computational part of the problem:

SELECT
    sum(rental_duration),
    min(rental_duration),
    max(rental_duration),
    avg(rental_duration)
FROM
    (SELECT rental_duration FROM sakila.film
     ORDER BY length
     LIMIT 5 OFFSET 498) AS A

If I run it on my laptop, I get a latency of circa 790 microseconds on the server side (standard MySQL 5.7.16). To use this solution we also need to add Java code to issue the SQL statement and to read the values back from JDBC into our Java variables. This means that the code will be even larger and will take a longer time to execute as shown hereunder:

try (Connection con = DriverManager
        .getConnection("jdbc:mysql://somehost/sakila?"
                + "user=sakila-user&password=sakila-password")) {

    try (Statement statement = con.createStatement()) {

        ResultSet resultSet = statement
                .executeQuery(
                        "SELECT " +
                        "  sum(rental_duration)," +
                        "  min(rental_duration)," +
                        "  max(rental_duration)," +
                        "  avg(rental_duration) " +
                        "FROM " +
                        "  (select rental_duration from sakila.film " +
                        "  order by length " +
                        "  limit 5 offset 498) as A");

        if (resultSet.next()) {
            int sum = resultSet.getInt(1);
            int min = resultSet.getInt(2);
            int max = resultSet.getInt(3);
            double avg = resultSet.getDouble(4);
            // Handle the result
        } else {
            // Handle error
        }
    }
}

To give this alternative a fair chance, I reused the connection between calls in the benchmark rather than re-creating it each time (recreation is shown above but was not used in benchmarks).

Result: ~1,000 us and ~25 code lines

The Winning Contribution
The winner of the challenge was Sergejus Sosunovas from Switzerland, and he used Speedment in-JVM-memory acceleration. Here is his solution:

IntSummaryStatistics result = app.getOrThrow(FilmManager.class).stream()
        .sorted(Film.LENGTH)
        .skip(498)
        .limit(5)
        .mapToInt(GeneratedFilm::getRentalDuration)
        .summaryStatistics();

This was much faster than SQL/JDBC and completed in as little as 6 microseconds.

Result: 6 us and 6 code lines

A Solution with Only Five Lines
One of the contestants, Corrado Lombard from Italy, is worth an honorary mention since he was able to solve the problem in only five lines. Unfortunately, there was a slight error in his original contribution, but when fixed, the solution looked like this:

IntSummaryStatistics result = app.getOrThrow(FilmManager.class).stream()
        .sorted(Film.LENGTH)
        .skip(498)
        .limit(5)
        .collect(summarizingInt(Film::getRentalDuration));

This fixed solution had about the same performance as the winning solution.

Optimized Speedment Solution
As a matter of fact, there is a way of improving latency even more than the winning contribution. Applying an IntFunction that is able to do in-place deserialization (in situ) of the int value directly from RAM improves performance even more. In-place deserialization means that we do not have to deserialize the entire entity, but just extract the parts of it that are needed. This saves time, especially for entities with many columns. Here is how an optimized solution could look:

IntSummaryStatistics result = app.getOrThrow(FilmManager.class).stream()
        .sorted(Film.LENGTH)
        .skip(498)
        .limit(5)
        .mapToInt(Film.RENTAL_DURATION.asInt()) // Use in-place deserialization
        .summaryStatistics();

This was even faster and completed in just 3 microseconds.

Result: 3 us and 6 code lines

GraalVM and Optimized Speedment Solution
GraalVM contains an improved C2 compiler that is known to improve stream performance for many workloads. In particular, it seems like the benefits from inlining and escape analysis are much better under Graal than under the normal OpenJDK.

I was curious to see how the optimized solution above could benefit from GraalVM. With no code change, I ran it under GraalVM (1.0.0-rc9) and now latency was down to just 1 microsecond! This means that we could perform 1,000,000 such queries per second per thread on a laptop, and presumably many more on a server-grade computer.

Result: 1 us and 6 code lines

Overview
When we compare SQL/JDBC latency against the best Speedment solution, the speedup factor with Speedment was about 1,000 times. This is the same difference as between walking to work and taking the world’s fastest manned jet plane (the SR-71 “Blackbird”). A huge improvement if you want to streamline your code.

To be able to plot the solutions in a diagram, I have removed the comparatively slow SQL/JDBC solution, and only showed how the different Speedment solutions and runtimes measure up:


Try It Out
The competition is over. However, feel free to challenge the solutions above or try them out for yourself. Full details of the competition and rules can be found here. If you can beat the fastest solution, let me know in the comments below.

Download Sakila database.

Download Speedment

Benchmark Notes
The benchmark results presented above were obtained when running on my MacBook Pro Mid 2015, 2.2 GHz Intel Core i7, 16 GB 1600 MHz DDR3, Java 8, JMH (1.21) and Speedment (3.1.8).

Conclusion
It is possible to reduce latencies by orders of magnitude and reduce code size at the same time using in-JVM-memory technology and Java streams. GraalVM can improve your stream performance significantly under many conditions.
