There is a right way and a wrong way to use the String.split(String) method in Java, and inexperienced developers often use the wrong way.
String firstPart = commaSeparatedList.split(",")[0];
String secondPart = commaSeparatedList.split(",")[1];
String thirdPart = commaSeparatedList.split(",")[2];
String fourthPart = commaSeparatedList.split(",")[3];
String fifthPart = commaSeparatedList.split(",")[4];
String sixthPart = commaSeparatedList.split(",")[5];
By calling String.split(String) repeatedly with the exact same input and pattern, the exact same result must be calculated and stored in a new array every time. This wastes CPU time and heap memory, just to produce multiple identical arrays.
String[] listParts = commaSeparatedList.split(",");
String firstPart = listParts[0];
String secondPart = listParts[1];
String thirdPart = listParts[2];
String fourthPart = listParts[3];
String fifthPart = listParts[4];
String sixthPart = listParts[5];
By calling String.split(String) just once, storing the resulting array in a local variable, and then referring to that variable in every place that needs the data, no CPU time or memory is wasted. The code also ends up cleaner and easier to read.
A JMH benchmark can compare the above two approaches to measure the effect each has on throughput (the number of times the code can run per second) and on memory allocation (how many bytes are written to heap memory each time the code runs).
The "wrong" approach scores an average throughput of 961,293.744 operations per second, with an average allocation rate of 2,449.07742 bytes per operation.
The "correct" approach scores an average throughput of 5,845,193.17 operations per second, with an average allocation rate of 408.166210 bytes per operation.
So the "correct" approach runs 6.08 times faster, and allocates only 16.7% (about a sixth) as many bytes per operation. (The 99.9% confidence intervals do not overlap for either throughput or allocation rate, so we reject the null hypothesis and conclude that the difference between the two approaches is statistically significant.)
This outcome makes sense: String.split(String) is a non-trivial method (it must scan the input and allocate new objects), so calling it six times instead of once makes the splitting code take roughly six times as long and allocate roughly six times as much memory. Indeed, the measured allocation rate of the "wrong" approach (about 2,449 bytes per operation) is almost exactly six times that of the "correct" approach (about 408 bytes per operation).
This difference may be impossible to notice if the splitting code just runs once now and again (such as in response to user input or interaction). But if the splitting code runs frequently (as part of an intensively run process, or within a loop with a large number of iterations) then the inefficiency of using the "wrong" approach will start to hurt the overall performance of an application. And given how easy it is to use the "correct" approach, there's no good reason to ever use the "wrong" approach.
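As a rough illustration of how the cost scales with repetition (a standalone sketch with an assumed input and iteration count, not part of the benchmark itself):

```java
// Sketch: the same comma-separated input processed in a hot loop.
// With the "wrong" approach, each iteration would perform six splits;
// splitting once per iteration keeps the cost proportional to the loop count.
public class HotLoopSketch {
    public static void main(String[] args) {
        String source = "FIRST,SECOND,THIRD,FOURTH,FIFTH,SIXTH";
        long totalParts = 0;
        for (int i = 0; i < 100_000; i++) {
            String[] parts = source.split(","); // one split per iteration
            totalParts += parts.length;
        }
        System.out.println(totalParts); // prints 600000
    }
}
```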
Ideally a JMH benchmark method should measure only the action of interest, and nothing else. In this case the action of interest is just the splitting of an input string using String.split(String), so the creation of the input ("source") string takes place in a class annotated with @org.openjdk.jmh.annotations.State. Activity in the State setup methods is not counted when JMH measures time or memory usage, so it does not interfere with the benchmarking results.
@org.openjdk.jmh.annotations.State(Scope.Thread)
public static class State {

    public String source;

    @Setup(Level.Iteration)
    public void setupIteration() {
        source = "FIRST,SECOND,THIRD,FOURTH,FIFTH,SIXTH";
    }
}
Do not be tempted to define a @Setup(Level.Invocation) method without first reading the warning about Level.Invocation in the JMH source code. In this String.split benchmark it is fine to create the String once per iteration, because the benchmark code never modifies its value, so @Setup(Level.Iteration) annotates the setup method which creates the input string in the State object.
Also avoid using a static final String to hold the input: the Java compiler treats such a field as a compile-time constant and may inline its literal value directly into the code that refers to it, which can enable compile-time or JIT optimisations that stop the benchmark from measuring the intended scenario. Using a State class with a non-final String field better represents real-world scenarios, where values arrive at runtime and cannot be known in advance.
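For example (a minimal sketch of the anti-pattern; the class and field names here are hypothetical, not part of the benchmark code):

```java
// Anti-pattern sketch: a static final String initialised from a literal is a
// compile-time constant, so javac inlines its value at every use site and the
// JIT can optimise around an exactly-known input.
public class ConstantInputPitfall {
    static final String SOURCE = "FIRST,SECOND,THIRD,FOURTH,FIFTH,SIXTH";

    public static void main(String[] args) {
        // Functionally identical to splitting a runtime value, but a benchmark
        // built on SOURCE may not measure the intended scenario.
        String[] parts = SOURCE.split(",");
        System.out.println(parts.length); // prints 6
    }
}
```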
Create a method with the JMH @Benchmark annotation to test the approach which wastefully calls String.split(String) multiple times.
@Benchmark
public void repeatedSplit(State state, Blackhole blackhole) {
    blackhole.consume(state.source.split(",")[0]);
    blackhole.consume(state.source.split(",")[1]);
    blackhole.consume(state.source.split(",")[2]);
    blackhole.consume(state.source.split(",")[3]);
    blackhole.consume(state.source.split(",")[4]);
    blackhole.consume(state.source.split(",")[5]);
}
Note that this reads the input ("source") string from the State object, so the benchmark does not count the time it takes to create the input string. The data extracted from each split is fed to the JMH Blackhole object, to make sure the Java compiler does not optimise the code away altogether.
@Benchmark
public void oneSplit(State state, Blackhole blackhole) {
    String[] parts = state.source.split(",");
    blackhole.consume(parts[0]);
    blackhole.consume(parts[1]);
    blackhole.consume(parts[2]);
    blackhole.consume(parts[3]);
    blackhole.consume(parts[4]);
    blackhole.consume(parts[5]);
}
As before, the input string is read from the State object, and each extracted part is fed to the Blackhole. But a local variable holds the returned array, so there is just one call to String.split(String).
Other than the approach used to get the parts from the input string, both methods are exactly the same in their outward behaviour: they do something with each part extracted from a comma-separated list. The only difference we are measuring is in the process of splitting the parts out of the comma-separated list.
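To confirm that the two approaches really do extract identical data, here is a quick standalone check (separate from the benchmark, using the same assumed input string):

```java
import java.util.Arrays;

public class SplitEquivalenceCheck {
    public static void main(String[] args) {
        String source = "FIRST,SECOND,THIRD,FOURTH,FIFTH,SIXTH";

        // "Wrong" approach: a fresh split for every part.
        String[] repeated = new String[6];
        for (int i = 0; i < 6; i++) {
            repeated[i] = source.split(",")[i];
        }

        // "Correct" approach: split once and reuse the array.
        String[] once = source.split(",");

        System.out.println(Arrays.equals(repeated, once)); // prints true
    }
}
```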
There are other ways to configure and run JMH benchmarks, but creating a main method allows all of the configuration to be visible within the source code.
public static void main(String[] args) throws RunnerException {
    Options opt = new OptionsBuilder()
            .include(RepeatedSplit.class.getSimpleName())
            .warmupIterations(3)
            .warmupTime(TimeValue.seconds(2))
            .measurementIterations(5)
            .measurementTime(TimeValue.seconds(3))
            .forks(6)
            .mode(Mode.Throughput)
            .timeUnit(TimeUnit.SECONDS)
            .shouldDoGC(true)
            .addProfiler(GCProfiler.class)
            .result("jmh_RepeatedSplit.json")
            .resultFormat(ResultFormatType.JSON)
            .build();
    new Runner(opt).run();
}
An Options object is defined, and then a Runner object is created and run using those Options. The options here add the current benchmark class (named RepeatedSplit), and specify that each "fork" should have 3 warmup iterations (of at least 2 seconds each) followed by 5 measurement iterations (of at least 3 seconds each), with 6 "forks" in total. This gives 5 × 6 = 30 measurement iterations overall.
Mode.Throughput is used (to show ops/sec), and the GCProfiler is added along with a call to shouldDoGC(true), so that the Java garbage collector is run and studied in order to estimate the amount of data allocated to heap memory.
JSON file output is requested by calling result with the desired output file name and calling resultFormat(ResultFormatType.JSON).
Below is the data reported by the JMH benchmark when run on my machine. To make it easier to read, the numbers have been truncated (but keep at least five significant digits).
Statistic | oneSplit | repeatedSplit
---|---|---
Throughput mean (ops/sec) | 5,845,193 | 961,293
Throughput CI 99.9% lower | 5,693,438 | 939,704
Throughput CI 99.9% upper | 5,996,947 | 982,883
gc.alloc.rate.norm mean (bytes/op) | 408.17 | 2,449.1
gc.alloc.rate.norm CI 99.9% lower | 408.16 | 2,449.0
gc.alloc.rate.norm CI 99.9% upper | 408.17 | 2,449.1
This data was gathered on a Windows 10 machine with an Intel® Core™ i5-1035G1 CPU and eight gibibytes of RAM, using JMH benchmarks compiled and run on OpenJDK 15.0.1 from within an IDE. (Be aware that for better precision and less noise, it is recommended to run from a command line with no other applications running.)
Note that different hardware and different JDK and JVM versions will give different results, so if the findings are important then you should reproduce this benchmark on your own specific environment. However, with a difference as strong as is seen in this case, it seems likely that the "correct" approach will outperform the "wrong" approach on almost all systems.
Instead of calling a non-trivial method repeatedly just to produce the exact same result, call the method once and hold the result in a local variable. That is likely to perform better, and it will usually make the code easier to read and work with.