Binary Search

Wow that is ancient technology you might think – can more be written about that? Probably yes πŸ™‚

So Binary Search has been around for as long as we’ve had sorted lists and the need to locate a specific element in the list. If you have studied Computer Science you most probably has come across binary search algorithms.

If you need to locate an element in a list and just start searching from one end you in average will do n/2 comparisons if there are n elements in the list. That is O(n) performance.

That is probably fine if the list always contains a limited number of elements or you don’t have performance as a requirement.

If the elements you are dealing with can be compared you can sort the list and use binary search to locate the element. The algorithm basically starts in the middle of the sorted list and if the element you are searching for is “smaller” than the middle element the algorithm repeats for the middle element in the first half of the list, and if it is “larger” it repeats in the middle for the last half of the list. And so on, until the element is located.

So with n elements in the list the algorithm does on average log2(n) comparisons. That is O(log n) performance. If there are 1 million elements in the sorted list the algorithm will do at most 19 comparisons before the element is found or it is determined that the element does not exist.

So why are we discussing Binary Search again? Doesn’t JAVA have binary search like built-in?

Absolutely for example Collections has a few variants and so does Arrays. But my use-case is a bit special so neither fits and I felt like rolling my own πŸ™‚

So in my case I have a number of ranges given and I need to relatively fast locate a given number in one of the ranges. For example 4455000000 belongs to the range 4450000000..4459999999. In the particular challenge these strings of digits are actually not treated as numbers and they might actually have different lengths, so 44550000 also would belong to the range 4450000000..4459999999…

So I have a list of ranges, they can be sorted, and I have a string of digits and need to know which range it belongs to. That of course could be solved by normalizing my single string of digits and turn it into a short range like 4455000000..4455000000 and the standard Collections.binarySearch() would probably do nicely.

But that is no fun. Also the range contains other stuff than just the low and high values of the range and it would feel unnatural to turn my string of digits into a range. In my “day-time job” implementation (which is not the same as the one presented here) the normalization takes place in the comparison so we never have to bother about which length of the string of digits is the right. It could be 16, it could be 18 or even 19 – we don’t care and do not have to make a decision about that πŸ˜‰

So I decided to roll my own completely generic BinarySearcher<T> that can do binary searches on sorted lists locating anything that can be compared to the elements in the list. It uses a BiComparator<T, K> because it needs to be able to compare elements of potentially different types. As the BinarySearcherSimpleIntegerTest illustrates it can also be used with simple types.

In my setup the BinarySearcher owns the list of elements to be searched because it makes best sense in my setup. But it can easily be turned into a static method and be used in other setups.

Enjoy πŸ˜€

Posted in Java | Tagged , | Leave a comment

Mutable

So you might wonder – “Mutable” – what is that all about?

In general we prefer things to be Immutable. With immutability comes improved performance because we do not have to copy things around and the JIT can do some optimizations. And there are less surprises, less WTFs because referenced things doesn’t suddenly change state.

One of my favorites is Optional. It either has a present value or is empty. Even after a few thousand lines of code a reference to an Optional is either empty or has the value it was initialized with. Love it! No surprises, no WTFs…

It also has a nice stream-like API with map() and filter() and whatnot.

Then in java 16 came records. And I love records. With a few lines of code you define immutable data carriers with sane equals(), hashCode() and toString() implementations and natural named accessors not prefixed with get as in the JavaBeans legacy. Love it! No surprises, no WTFs…

So whenever I’m going to introduce a class with the primary responsibility of moving some information around I try to use records. Sometimes annotated with lombok @Builder because it makes construction of these instances somewhat easier and easily split over multiple lines…

And as already mentioned one of the selling points of records is their immutability!

Then recently I was refactoring a rather complex piece of code that dealt with parsing some interesting external files into some rather complex hierarchies and at the bottom sometimes enhancing higher-level items with some information from internal systems. In other words, adding information at a later stage, information not available at construction time.

I believe we’ve all been there…

And I wanted to refactor the code to use records simply because there is much less code to maintain (less TCO) and there are less … surprises…

This phrase seems to come up often… “Less surprises”… Believe me I am getting old and with age comes some resistance towards surprises. I don’t need them and basically want less of them. And compared to how I coded like 25 years ago that mentality has changed my coding style A LOT! So, less surprises, less WTFs are important to me πŸ™‚
My younger co-workers might find me grumpy at times, but with less surprises comes increased maintainability and with that less TCO. So… Lets get on with it!

To solve my hierarchy files parsing issue I needed a TopLevel type and a Child type, they have things in common so they implement a HierarchyElement interface with reference to parent() which for TopLevel is it self and for Child is another child or a TopLevel. Both has a children() list and some other stuff. And TopLevel has a field that can only be set when parsing a deep child and some children might get values from another system later, before we’re done parsing… Got it so far? πŸ˜‰

One way of doing this would be to – when we have the values for the “late fields” – replace the Child or the TopLevel with a copy of the existing with a field or two added…
With a complex hierarchy that is just not trivial. Doable but not trivial and the code would be… Complex and full of surprises. Some junior developer might come along and think WTF – not needed – DELETE… Hmmm…

So I decided to introduce Mutable<T>. I could have used AtomicReference<T> but I need no concurrency and has some other needs I might as well cater for.

One thing that really annoyed me in the original implementation was 9 lines of code that basically checked if the top- or the child already had a field set and if not, set the field to some new value. All those ifs-and-buts needs to be unit-tested and then you have some very complex unit-tests to cater for something extremely simple. Must be tested or it does not work but considering TCO…

if (someForeignThing != null) {
  if (top.getSomeField() != null) {
    top.setSomeField(someForeignThing.getValueForField());
  }
  if (child.getSomeField() != null) {
    child.setSomeField(someForeignThing.getAnotherValue());
  }
  if (child.getAnotherField() != null) {
    child.setAnotherField(someForeignThing.getCompletelyDifferentField());
  }
}

So I came up with Mutable<T> to make it possible to set some rather important fields later. Both the TopLevel and Child types already has mutable collections (list of children) that are set to new ArrayList<>() so having one or two other mutable fields does not seem too arcane.

What should it look like… Well, it must have get() and set() like in AtomicReference<T>. Except I like methods that change a single state of an object to return the previous state so we have T set(T newValue).

And since it can supply it’s current value and it can consume a T, why not have it implement Supplier<T> and Consumer<T>. Supplier uses the method accept() to set a new value so we delegate that to the set() method.

And to boil the 9 lines of code down to three we introduce a setIfNull() method that only set the mutable value if it is already null and we have:

if (someForeignThing != null) {
  top.someField().setIfNull(someForeignThing.getValueForField());
  child.someField().setIfNull(someForeignThing.getAnotherValue());
  child.anotherField()
    .setIfNull(someForeignThing.getCompletelyDifferentField());
}

Less is more! Love it!! And the null checking only needs to be unit-tested with the Mutable<T> tests. And I do believe even a novice developer can see what is going on πŸ˜‰

So the Mutable<T> has a not-null concept. It looks very much like the Optional empty or not. But slightly different since the Optional is immutable so either it has a value present or it is empty. Even the API states this. My mutable has a null or not-null concept, empty is something slightly different and the not-null state can change to another not-null state and even back. So decided to go with a isNull() method simply because it describes the situation better. Also there is no NoSuchElementException involved when get() returns null πŸ˜‰

We are not trying to solve the infamous NullPointerException thing here πŸ˜€

So the core of it is like the following:

public final class Mutable<T> implements Supplier<T>, Consumer<T> {
    private T value;
    public boolean isNull() {
        return value == null;
    }
    public boolean setIfNull(T value) {
        if (isNull()) {
            set(value);
            return true;
        }
        return false;
    }
    public T set(T value) {
        final var old = this.value;
        this.value = value;
        return old;
    }
    @Override
    public T get() {
        return value;
    }
    @Override
    public void accept(T value) {
        set(value);
    }
    @Override
    public String toString() {
        return String.valueOf(value);
    }
}

So get(), set() and isNull() are the very core part of the implementation – the other methods just delegates. Again this makes testing simpler and the JIT can definitely inline those.

The setIfNull() returns true if the Mutable was changed – that information might come in handy sometimes.

There is also a toString() but that is mostly for debugging purposes and for ensuring a relatively sane output in logs etc.

Things to consider

First, this thing is not thread-safe and is not meant to be. It is for temporary eventually lazy settable items in same-thread situations. If you need thread-safety and concurrency use the AtomicReference<T> and similar. Period πŸ˜‰

There are no equals() and hashCode() and why not?

Basically these methods are for things to be put in collections specifically in Sets and Maps. Since Mutables are … mutable … they should never be put into Sets or used as keys in Maps. Never ever. Remember surprises and WTFs.

One could consider implementing equals() – it is not difficult to come up with a reasonable implementation, eg:

@Override
public boolean equals(Object obj) {
    if (this == obj) {
        return true;
    }
    return obj instanceof Mutable<?> other
            && Objects.equals(value, other.value);
}

That would work perfectly fine for all trivial cases. But please do read the JavaDocs for Object.equals():

  • It is reflexive: for any non-null reference value x, x.equals(x) should return true.
  • It is symmetric: for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
  • It is transitive: for any non-null reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
  • It is consistent: for any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
  • For any non-null reference value x, x.equals(null) should return false.

The fourth bullet – it is consistent – we cannot ensure that with a Mutable<T> since after a call to set(), setIfNull() or accept() the method might return something different. The documentation actually allows that, but I hate surprises and things that are equal one moment should still be equal even a few lines of code later…

And another thing is if we implement equals() we really should also implement hashCode() (and really consider compareTo() too but that is another even longer story) with a similar contract and we’re basically in deep trouble. DO NOT. NO NO DO NOT. We hate nasty surprises πŸ˜€ We like less WTFs.

By not implementing these methods they revert to the Object implementations which are based on object identity and are very deterministic. No surprises. We like πŸ˜‰

Look for nasty surprises here: SurprisesTest.java

Records in Sets and as keys in Maps

So since records are immutable they are perfectly safe to use in Sets and as keys in Maps?

Well not quite. If your implementation contains at least one mutable field (a Mutable<T>, an AtomicSomething or some mutable collection) then you are out for nasty surprises. Unless you specifically override equals() and hashCode() to be stable towards the mutable elements.

And then while you are at it, consider implementing Comparable<T>

So be careful out there, use the Mutable<T> with care, don’t ever overdo it and please do not ever be tempted to implement equals() and hashCode() or use instances in Set or as keys in Maps.

Posted in Java | Tagged , | Leave a comment

Generating garbage…

At JCrete the last couple of years I’ve had the pleasure to be socializing with, among others, HFT people like Peter Lawrey and Martin Thompson.

Those HFT dudes really makes me thinking when I’m implementing stuff. Thinking about how much garbage I create.

In High Frequency Trading systems you cannot afford garbage collection and therefore they are doing lots of tricks to get around generating garbage.

With the stuff I’m doing a gc pause is generally not critical, so generating garbage is no big deal. Writing easy to read- and maintain code is more important.

But then again, it does not harm to think about garbage.
Continue reading

Posted in Fun, Java | Tagged , | 1 Comment

SafeEquals

I’m currently designing a user-authentication OAuth2 based service.

I’m trying very hard never ever to reveal anything about users or passwords. Credential-lookup by userid is always done twice. If the user is not found a known dummy-user is looked up instead. If the user is found, a “by guarantee not existing” user is looked up. Just to try to make the timings of existing and not-existing users the same.

Internally a lot of hashing takes place with random and known stuff. Valid and invalid users and passwords all take exactly the same “route”.

At the end, byte arrays are compared, and if they contain the same binary information, the user is authenticated.

In order to ensure this check timing-wise is independent of the outcome, a special equals(byte[], byte[]) method is implemented. Also, a rather special counterpart, the neverEquals(byte[], byte[]) is implemented. This does the exact same comparisons as the equals method, just the outcome is always false, to be used when the user is invalid. Continue reading

Posted in Java, Open Source | Tagged , | Leave a comment

YN

Yeah, I know, this isn’t rocket science. But it is rather nice πŸ™‚

So, I’m working on this JEE project, backed by an Oracle database. Some of my work involves refactoring CMP 2.1 entities into JPA entities.

Here I’m faced with the issue, that sometimes true/false columns are modeled using CHAR(1) as ‘Y’ vs ‘N’. And at other times they are modeled using INTEGER as 1 vs 0. Sometimes both versions are using in the same table.

The natural datatypes in JPA are String and int. So I’m having code doing β€œY”.equals(colum) and column == 1 etc. For the String I’ve generally been using an enum called YN having 2 constants, Y and N. So I can do column == YN.Y. Still a bit annoying though…

Just today I realized I could fix both issues with a simple version of my YN enum below. This works both for the CHAR ‘Y’ / ‘N’ case and the INTEGER 1 / 0 case.

And it’s damn simple containing 2 simple convenience methods that does make life simpler:

/**
 * Being used in JPA entities as @Enumerated(EnumType.STRING) or @Enumerated(EnumType.ORDINAL) (N=0, Y=1)
 */
public enum YN {
    N, Y;
 
    public final boolean bool() {
        return this == Y;
    }
    public static final YN bool(boolean b) {
        return b ? Y : N;
    }
}

Now I can do column.bool() and I can do YN.bool(something) the other way. And the mapping in the JPA entity takes care of the rest.

Nice and simple, definitely not rocket science.

Posted in Databases, Java | Tagged , , , , , , | Leave a comment

Unit testing really simple implementations

As an independent subcontractor (freelance consultant) I get to work in various very different organisations, with very different approaches towards testing in general and unit-testing in particular.

Some places unit-testing is mandatory and there might even be a code-coverage tool where you must strive to meet a specific threshold.

At other places unit-tests are more like integration tests, where the unit-testing is considered implicit in the integration-unit-tests.

And then there are places where unit-tests are of your own choice, as long as you do not spend too much (measurable) time on them.

In the latter cases you as an experienced developer might feel that unit-testing very simple stuff is superfluous.

Not so! In my experience, it is when you do these simple no-brainer implementations that you make mistakes, simply because you do not have to think. Also, unit-testing simple utility stuff demonstrates corner-cases that might suddenly be used or mis-used in the code where changing even the simplest implementation might break important business logic

In this post I will go through a few very simple utility methods that I’ve worked on recently and talk about some of the finer details – and why unit-testing even the simplest is really important.

Continue reading

Posted in Java | Tagged , | Leave a comment

Bad sample code…

Okay, I just HATE bad code.

I’m not religious about how you express your business logic in the code. I don’t mind β€œdifferent” indentation or long methods although I try to keep my own code as short and simple as possible. I prefer lots of small classes over a few huge, and I prefer a bunch of short methods (β€œa screenfull”) over a few huge ones. Complex IF-clauses formed as simple methods etc.

Whenever I’m maintaining code developed by others, I accept the previous developers different habits. I might rewrite (also known as refactoring) parts, especially if there are unit-tests. But only if I think it would benefit the quality and maintainability of the code.

But, when reading books or articles I get really REALLY annoyed when I encounter bad code and bad habits in samples etc. Whenever I come across these things, where the quality is as if the developer is not really a JAVA developer, absentminded or just plain drunk, I loose interest in the rest of the content. As if, if the code is really bad (and untrustworthy) how can I then trust the rest of the content?

Continue reading

Posted in Fun, Java | 1 Comment

Handing over JAVA components to L3 support…

Java Specialist13:13

Hi

Udby, Jesper13:13

Hi

Java Specialist13:13

we are unable to build with existing set up

can you give us ear file to me

Udby, Jesper13:14

no, unfortunately not, I’m working from home today and do not have access to my dev env

Java Specialist13:14

i will put in weblogic server

Udby, Jesper13:14

I suggest you configure you’r maven setup to use the proper repo so you can build it. It has to be done anyways…

Java Specialist13:15

can you share your maven to me

_

folder

Udby, Jesper13:15

no… its on m dev env too…

the settings.xml linked from the document i sent you contains information about the repo – you should be able to use that?

which weblogic server btw?

Java Specialist13:18

weblogic server 11gR1

Udby, Jesper13:18

you have a local setup?

Java Specialist13:18

yes

Udby, Jesper13:18

nice:)

you got a local db setup too?

Java Specialist13:19

datasouce

Udby, Jesper13:20

yes, datasource. does it point to a local db or one of our dev environments?

Java Specialist13:20

yes

Udby, Jesper13:21

do you have a local db?

Java Specialist13:21

yes

Udby, Jesper13:21

nice:)

how’s the schema setup?

Java Specialist13:21

there

Udby, Jesper13:21

there?

Java Specialist13:21

yes

Udby, Jesper13:22

ok, i rephrase: how are the users/schemas, tables, sequences etc setup?

DDL

Java Specialist13:22

i will give schema name

usern and password

yes

Udby, Jesper13:23

i don’t know if you are aware of it, but “we” are using what is called “database migration” scripts to do the DDL

Java Specialist13:24

1 min

Udby, Jesper13:24

1 min what?

Java Specialist13:25

wait

database migrationΒ is no there with me

Udby, Jesper13:27

well thats not important if you are able to setup the schema/users, tables etc yourself

I was just wondering where you got the details from

Java Specialist13:28

webservice setup document

Udby, Jesper13:29

ok

Java Specialist13:31

Do you have this type of document with you_

Udby, Jesper13:32

nope, not here, might have it in the office…

Java Specialist13:32

ok

is there anu body who will share information_

Udby, Jesper13:33

what information?

Java Specialist13:34

regadring configuartion of java compenent with oSB

Udby, Jesper13:34

what component and what kind of configuration?

Java Specialist13:36

two ear file ie two java componet and after that i will confuge jMS queue with SFDCΒ 

and OSB

Udby, Jesper13:36

first, there are more than two components, you need to tell me which components you need information about

then I can tell you which datasouces and jms queues are necessary

but you can see that in the code, if you like?

take the xxx-feed-parser (from memory, you can see the proper name in the poms/mail/source etc)

Java Specialist13:38

ok

Udby, Jesper13:38

it needs a jms queue, can’t remember the jndi name, but it’s in the annotation in the MDB

and it needs a datasource, can’t remember the jndi name, but its in the persistence.xml file

Java Specialist13:39

i have JMSQUEue is there with me

Udby, Jesper13:39

what?

Java Specialist13:40

while i am doing yYYY project

Udby, Jesper13:40

ok, good, but its probably not the same queue:)

Java Specialist13:41

ok

Udby, Jesper13:41

you can see the jndi name of the jms queue in the annotation in the mdb – message driven bean

Java Specialist13:44

1min

wait

i am opening weblogic server

Java Specialist13:53

yes

Udby, Jesper13:54

yes?

Java Specialist13:55

i am able to jndi nameof the JMS queue

for YYYY

Udby, Jesper13:55

ok

I wasn’t referring to the jndi name of the jms queue in yyyy

you do know what an annotation is, right?

Java Specialist14:00

no

Udby, Jesper14:00

ok, you are in trouble

do you know what a message driven bean is?

Java Specialist14:01

yes

Udby, Jesper14:03

good

in the mdb (aka message driven bean) there is an annotation that tells it which jms queue it should connect to

i can’t remember the proper name of it, but somewhere in the xxx-feed-ejb (i think it is) there is the java source of a message driven bean – I even think it is called something like Xxx…Mdb or similar. In that source you can find the annotation containing the name of the jms queue

also, in that project, there is a persistence.xml – you do know what that is?

Java Specialist14:06

jesper please share me the ear file to deploy javacomponents in weblogic server

Udby, Jesper14:06

sorry, i cant

you should configure your maven to use the proper repo, then you can build it yourself

btw: there is more than one ear

one component = one ear

 

Posted in Fun, Java | Leave a comment

Salesforce wsc (sfdc-wsc): new release 23 coming

The current situation is a mess. There are now 3 known versions of the sfdc-wsc tool:

I will try to merge relevant changes from the codebase at mvnrepository but please notice it has already forked. The code style at mvnrepository is slightly different to my style and I take the liberty to refactor stuff in order to keep classes clean and without too much responsibility (e.g. ConnectorConfig should not have the responsibility of creating connections, adding stream wrappers and what not).

Released – wsc-23.jar – with the following:
Continue reading

Posted in Java, Salesforce | Tagged , , , | Leave a comment

Salesforce wsc hacking: yet another Open Source encounter?

As I wrote in an earlier post I’ve volunteered to become a committer to the wsc tool and are in the progress of making minor tweaks and enhancements for a coming release 23.

My plan was that this release should be no more than a “maintenance” release covering most of the relevant issues. And I do have a number of ideas for major changes for a possible release 24, e.g.:

  • Fixing issue 51 – or at least provide a usable implementation hopefully inspired by input from Martin Haynes
  • Reworking/reimplementing the wsdlc tool and templating mechanism
  • Reworking infrastructure parts, possibly using JAXB/JAX-WS
  • Mavenize the build, divide build tools and infrastructure, 2 perhaps 3 artifacts
  • Adding unit tests

I stopped further developments on the “23 branch” when I realized that there apparently already was a “fork” on github.
Continue reading

Posted in Java, Open Source, Salesforce | Tagged , | Leave a comment