## Objects, equals() and hash-values

No. 126

### Choosing a “good”hashCode() method

Q:

We consider:

public class TimePeriod {
private final int hours, minutes, seconds;
/**
* A time period within a day e.g. 4 hours, 23 minutes and 11 seconds.
*
* @param hours Hours ranging from 0 to 23
* @param minutes Minutes ranging from 0 to 59
* @param seconds Seconds ranging from 0 to 59
*/
public TimePeriod(final int hours, final int minutes, final int seconds) {
this.hours = hours;
this.minutes = minutes;
this.seconds = seconds;
}

@Override
public boolean equals(final Object o) {
if (o instanceof TimePeriod) {
final TimePeriod other = (TimePeriod) o;
return hours == other.hours &&
minutes == other.minutes &&
seconds == other.seconds;
} else {
return false;
}
}
}

Which of the two hashCode() implementations is better with respect to quickly telling whether two TimePeriod instances are having the same value or not?

Method 1 Method 2
public class TimePeriod {
...
@Override
public int hashCode() {
return seconds + 60 * (minutes + 60 * hours);
}
}
public class TimePeriod {
...
@Override
public int hashCode() {
return seconds + minutes + hours;
}
}

### Tip

Excerpt from Object.hashCode():

However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.

A:

According to Figure 423, “Hashing in Java and equals() part of a hashCode() methods contract is:

true == a.equals(b)a.hashCode() == b.hashCode().

An ideal/perfect hashCode() method in addition will return different values whenever two instances a and b differ in value with respect to the underlying equals() method:

false == a.equals(b)a.hashCode() != b.hashCode().

Combining these two statements a perfect hashCode() method will have the following property with respect to its corresponding equals() method:

a.equals(b) == (a.hashCode() == b.hashCode())

Method 1 serves this purpose: Its returns a given TimePeriod instance into its total amount of seconds equivalent. Example: (1 hour, 2 minutes, 5 seconds) amounts to (1 * 60 + 2) * 60 + 5 = 3725 seconds. Due to its specification an instance of TimePeriod cannot exceed 24 hours being equivalent to 24 * 60 * 60 = 86400 seconds. Thus a 4-byte int is safe with respect to overflow issues.

Conclusion: Whenever two instances of TimePeriod differ their method 1 hashCode() values will differ as well.

This does not hold with respect to method 2. Consider two instances (1 hour, 2 minutes, 3 seconds) vs. (3 hours, 2 minutes, 1 second) returning an identical hashCode() of 1 + 2 + 3 = 3 + 2 + 1 = 6. So method 2 requiring just two additions offers (slightly) better runtime performance at the expense of a higher hash value collision rate.

No. 127

### String and good hashCode() implementations.

Q:

In the previous exercise we found an ideal hashCode() implementation:

public class TimePeriod {
...
@Override
public int hashCode() {
return seconds + 60 * (minutes + 60 * hours);
}
}

Is this possible for instances of String as well?

### Tip

Consider the possible number of different strings.

A:

It is not possible to construct a perfect hashCode() method acting on arbitrary strings. A Java String consists of individual char elements each requiring two bytes. Considering strings of fixed length we have the following number of different strings:

Number of chars Number of possible strings
1 $2 16$
2 $2 2 × 16$
3 $2 3 × 16$

A four byte int only offers $2 32$ different values. Thus even mapping just one- and two-Unicode character strings exceeds the number of different int values thus requiring different string instances being mapped to identical hash values. Consider for example:

Code Execution result
System.out.println("hashcode of AA: " + "Aa".hashCode());
System.out.println("hashcode of BB: " + "BB".hashCode());
hashcode of AA: 2112
hashcode of BB: 2112