Collections: Basics

A collection is simply an object that groups multiple elements into a single unit. Collections are used to store, retrieve, manipulate, and communicate aggregate data.

[Figure: Java collection class hierarchy, part 1]

[Figure: Java collection class hierarchy, part 2]

Let us look at a few of the important Object class methods and terms that matter when dealing with collections.

toString()

public class TestToString {

    public static void main(String[] args) {
        TestToString aTestToString = new TestToString();
        // toString() is not overridden, so Object.toString() is used
        System.out.println(aTestToString);
    }

}

This will produce an output something like this:
TestToString@621e605

It gives you the class name followed by the @ symbol, followed by the unsigned hexadecimal representation of the object’s hashcode.
Override toString() to return a meaningful description of the object.

public class TestToString {
    public int value;

    @Override
    public String toString() {
        return "This is TestToString with value " + this.value;
    }

    public static void main(String[] args) {
        TestToString aTestToString = new TestToString();
        aTestToString.value = 1;
        System.out.println(aTestToString);
    }

}

Now the output: This is TestToString with value 1

equals()

Comparing two object references using the == operator evaluates to true only when both references refer to the same object (because == simply looks at the bits in the variable, and they’re either identical or they’re not).
When you really need to know if two references are identical, use ==. But when you need to know if the objects themselves (not the references) are equal, use the equals() method.
The equals() method in class Object uses only the == operator for comparisons, so unless you override equals(), two objects are considered equal only if the two references refer to the same object.

public class Employee {

    private int id;

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    // Two employees are equal when their ids match
    @Override
    public boolean equals(Object obj) {
        return (obj instanceof Employee) && (((Employee) obj).getId() == this.id);
    }

}

Our test class:

public class EqualsTest {
    public static void main(String[] args) {
        Employee aEmployee = new Employee();
        aEmployee.setId(123);
        Employee bEmployee = new Employee();
        bEmployee.setId(123);
        System.out.println(aEmployee.equals(bEmployee));
    }

}

This prints true, because the two employees have the same id.

equals() and hashCode() are bound together by a joint contract: if two objects are considered equal using the equals() method, then they must have identical hashcode values. So to be truly safe, your rule of thumb should be: if you override equals(), override hashCode() as well.
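To see why the contract matters, here is a minimal sketch, assuming a hypothetical variant of Employee (nested in an illustrative EmployeeHashDemo class) that overrides both methods. With both in place, a HashSet can find an employee that is equal to one it already holds. If hashCode() were left out, the lookup would usually fail.

```java
import java.util.HashSet;
import java.util.Set;

public class EmployeeHashDemo {

    // Hypothetical variant of Employee that overrides both equals() and hashCode()
    static class Employee {
        private final int id;

        Employee(int id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object obj) {
            // instanceof is false for null, so no separate null check is needed
            return (obj instanceof Employee) && ((Employee) obj).id == this.id;
        }

        @Override
        public int hashCode() {
            // Built from the same field that equals() compares
            return Integer.hashCode(id);
        }
    }

    // True if a HashSet holding Employee(123) finds a second, equal Employee(123)
    static boolean equalEmployeeIsFound() {
        Set<Employee> staff = new HashSet<Employee>();
        staff.add(new Employee(123));
        return staff.contains(new Employee(123));
    }

    public static void main(String[] args) {
        System.out.println(equalEmployeeIsFound()); // prints true
    }
}
```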

hashCode()

Hashcodes are typically used to increase the performance of large collections of data. The hashcode value of an object is used by some collection classes. Although you can think of it as a kind of object ID number, it isn’t necessarily unique. Collections such as HashMap and HashSet use the hashcode value of an object to determine how the object should be stored in the collection, and the hashcode is used again to help locate the object in the collection.
In real-life hashing, it’s not uncommon to have more than one entry in a bucket. Hashing retrieval is a two-step process.
1. Find the right bucket (using hashCode())
2. Search the bucket for the right element (using equals() ).

A hashCode() that returns the same value for all instances, whether they’re equal or not, is still legal. For example:

public int hashCode() { return 123; }

This hashCode() method is horribly inefficient, because it makes all objects land in the same bucket, but even so, the object can still be found as the collection cranks through the one and only bucket—using equals().
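The point above can be checked with a small sketch. The Key class and the ConstantHashDemo wrapper are hypothetical names used only for illustration. Every Key reports the same hashcode, yet the HashSet still rejects duplicates and finds elements, because equals() does the final comparison inside the single bucket.

```java
import java.util.HashSet;
import java.util.Set;

public class ConstantHashDemo {

    // Hypothetical key class with a legal but inefficient hashCode()
    static class Key {
        private final String name;

        Key(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object obj) {
            return (obj instanceof Key) && ((Key) obj).name.equals(this.name);
        }

        @Override
        public int hashCode() {
            return 123; // every Key lands in the same bucket
        }
    }

    static Set<Key> buildSet() {
        Set<Key> keys = new HashSet<Key>();
        keys.add(new Key("a"));
        keys.add(new Key("b"));
        keys.add(new Key("a")); // equal to the first element, so not added again
        return keys;
    }

    public static void main(String[] args) {
        Set<Key> keys = buildSet();
        System.out.println(keys.size());                 // 2
        System.out.println(keys.contains(new Key("b"))); // true
        System.out.println(keys.contains(new Key("c"))); // false
    }
}
```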

Condition                      | Required                       | Not Required (But Allowed)
-------------------------------+--------------------------------+---------------------------
x.equals(y) == true            | x.hashCode() == y.hashCode()   |
x.hashCode() == y.hashCode()   |                                | x.equals(y) == true
x.equals(y) == false           |                                | No hashCode() requirements
x.hashCode() != y.hashCode()   | x.equals(y) == false           |

Another important point is that transient variables should not be used to compute the hashcode. Suppose an object’s transient variable holds a value that contributes to the hash, and the object is saved using serialization. When the object is restored by deserialization, the transient variable is back to its default value, so the hashcode changes.
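A rough sketch of the problem, assuming a hypothetical Session class whose hashCode() (wrongly) includes a transient field. The class names and the in-memory round trip through byte streams are illustrative choices, not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientHashDemo {

    // Hypothetical class: equals() uses id only, hashCode() also uses a transient field
    static class Session implements Serializable {
        private static final long serialVersionUID = 1L;
        private final int id;
        private transient int token; // skipped by serialization

        Session(int id, int token) {
            this.id = id;
            this.token = token;
        }

        @Override
        public boolean equals(Object obj) {
            return (obj instanceof Session) && ((Session) obj).id == this.id;
        }

        @Override
        public int hashCode() {
            return 31 * id + token; // the mistake: depends on a transient field
        }
    }

    // Serializes the object to a byte array and reads it back
    static Session roundTrip(Session original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Session) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Session before = new Session(1, 42);
        Session after = roundTrip(before); // token is reset to 0
        System.out.println(before.equals(after));                  // true
        System.out.println(before.hashCode() == after.hashCode()); // false
    }
}
```

The two objects are equal but have different hashcodes, which breaks the contract described earlier.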

Remember that the equals(), hashCode(), and toString() methods are all public.

Ordered

When a collection is ordered, it means you can iterate through the collection in a specific (not-random) order. A Hashtable is not ordered: you won’t find any predictable order when you iterate through it. An ArrayList, however, keeps the order established by the elements’ index position (just like an array).
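As a small illustration (OrderedDemo and its sample values are made up for this sketch), the ArrayList below always iterates in index order, while the iteration order of the Hashtable keys is unspecified and should not be relied on.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;
import java.util.Map;

public class OrderedDemo {

    // An ArrayList iterates in index order, which here is insertion order
    static List<String> listOrder() {
        List<String> list = new ArrayList<String>();
        list.add("cherry");
        list.add("apple");
        list.add("banana");
        return list;
    }

    // A Hashtable makes no promise about iteration order
    static Map<String, Integer> table() {
        Map<String, Integer> table = new Hashtable<String, Integer>();
        table.put("cherry", 1);
        table.put("apple", 2);
        table.put("banana", 3);
        return table;
    }

    public static void main(String[] args) {
        System.out.println(listOrder());      // [cherry, apple, banana]
        System.out.println(table().keySet()); // same keys, order not guaranteed
    }
}
```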

Sorted

A sorted collection means that the order in the collection is determined according to some rule or rules, known as the sort order. A sort order has nothing to do with when an object was added to the collection, or when was the last time it was accessed, or what “position” it was added at. Sorting is done based on properties of the objects themselves. You put objects into the collection, and the collection will figure out what order to put them in, based on the sort order.
Most commonly, the sort order used is something called the natural order. For a collection of String objects, the natural order is alphabetical. For Integer objects, the natural order is by numeric value: 1 before 2, and so on. There is no natural order for TestClass unless or until the TestClass developer provides one through an interface (Comparable) that defines how instances of a class can be compared to one another (does instance a come before b, or does instance b come before a?). Aside from natural order as specified by the Comparable interface, it’s also possible to define other, different sort orders using another interface: Comparator.
Total ordering means every value can be compared with every other value. For example, a collection that mixes Integer and String objects has no natural total order (but you could invent one with a Comparator). In Java, the natural order of a class is the ordering defined by its compareTo() method. This might not match what people believe is the natural order. For example, Strings are sorted by the Unicode value of their characters ("ASCIIbetically"), so Z comes before a.
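A quick sketch of the String case, with sample words chosen only for illustration. Sorting by natural order puts the capitalized word first, while the standard String.CASE_INSENSITIVE_ORDER comparator gives the order most people would expect.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class NaturalOrderDemo {

    // Natural order of String: compares character values, so 'Z' (90) < 'a' (97)
    static List<String> sortedWords() {
        List<String> words = new ArrayList<String>(Arrays.asList("apple", "Zebra", "banana"));
        Collections.sort(words);
        return words;
    }

    // A different sort order, supplied by a Comparator
    static List<String> sortedIgnoringCase() {
        List<String> words = new ArrayList<String>(Arrays.asList("apple", "Zebra", "banana"));
        Collections.sort(words, String.CASE_INSENSITIVE_ORDER);
        return words;
    }

    public static void main(String[] args) {
        System.out.println(sortedWords());        // [Zebra, apple, banana]
        System.out.println(sortedIgnoringCase()); // [apple, banana, Zebra]
    }
}
```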

Comparable<E> Interface

The natural ordering of objects is specified by implementing the generic Comparable interface. (java.lang)

int compareTo(E o)

It returns a negative integer, zero, or a positive integer if the current object is less than, equal to, or greater than the specified object, based on the natural ordering. It throws a ClassCastException if the reference value passed in the argument cannot be compared to the current object.

Many of the standard classes in the Java API, such as the primitive wrapper classes, String, Date, and File, implement the Comparable interface. Objects implementing this interface can be used as

• elements in a sorted set

• keys in a sorted map

• elements in lists that are sorted manually using the Collections.sort() method
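The three uses listed above can be sketched with String and Integer, which already implement Comparable. The class name ComparableUsesDemo and the sample values are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

public class ComparableUsesDemo {

    // Elements in a sorted set
    static SortedSet<String> sortedSet() {
        SortedSet<String> set = new TreeSet<String>();
        set.add("cherry");
        set.add("apple");
        set.add("banana");
        return set;
    }

    // Keys in a sorted map
    static SortedMap<Integer, String> sortedMap() {
        SortedMap<Integer, String> map = new TreeMap<Integer, String>();
        map.put(3, "three");
        map.put(1, "one");
        map.put(2, "two");
        return map;
    }

    // Elements in a list sorted with Collections.sort()
    static List<Integer> sortedList() {
        List<Integer> list = new ArrayList<Integer>(Arrays.asList(30, 10, 20));
        Collections.sort(list);
        return list;
    }

    public static void main(String[] args) {
        System.out.println(sortedSet());  // [apple, banana, cherry]
        System.out.println(sortedMap());  // {1=one, 2=two, 3=three}
        System.out.println(sortedList()); // [10, 20, 30]
    }
}
```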

The natural ordering for String objects (and Character objects) is lexicographical ordering, i.e., their comparison is based on the Unicode value of each character in the strings. The natural ordering for objects of a numerical wrapper class is in ascending order of the values of the corresponding numerical primitive type.
Implementing the compareTo() method is not much different from implementing the equals() method. In fact, given that the functionality of the equals() method is a subset of the functionality of the compareTo() method, the equals() implementation can call the compareTo() method. This guarantees that the two methods are always consistent with one another.
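One way to keep the two methods consistent is sketched below, assuming a hypothetical Version class ordered by major and then minor number. equals() delegates to compareTo(), and hashCode() uses the same fields so the equals()/hashCode() contract still holds.

```java
public class VersionDemo {

    // Hypothetical value class, ordered by major number, then minor number
    static class Version implements Comparable<Version> {
        private final int major;
        private final int minor;

        Version(int major, int minor) {
            this.major = major;
            this.minor = minor;
        }

        @Override
        public int compareTo(Version other) {
            int byMajor = Integer.compare(major, other.major);
            return (byMajor != 0) ? byMajor : Integer.compare(minor, other.minor);
        }

        @Override
        public boolean equals(Object obj) {
            // Delegating keeps equals() consistent with compareTo()
            return (obj instanceof Version) && compareTo((Version) obj) == 0;
        }

        @Override
        public int hashCode() {
            // Uses exactly the fields that compareTo() looks at
            return 31 * major + minor;
        }
    }

    public static void main(String[] args) {
        Version a = new Version(1, 2);
        System.out.println(a.equals(new Version(1, 2)));        // true
        System.out.println(a.compareTo(new Version(1, 3)) < 0); // true
        System.out.println(a.equals("1.2"));                    // false
    }
}
```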

Comparator<E> Interface

Precise control of ordering can be achieved by creating a customized comparator that imposes a specific total ordering on the elements. (java.util)

int compare(E o1, E o2)

The compare() method returns a negative integer, zero, or a positive integer if the first object is less than, equal to, or greater than the second object, according to the total ordering, i.e., its contract is equivalent to that of the compareTo() method of the Comparable interface. Since this method tests for equality, it is strongly recommended that its implementation does not contradict the semantics of the equals() method.

Let us see an example of both:

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class Person implements Comparable<Person>, Comparator<Person> {

    private String name;
    private int age;

    // No-argument constructor, used below to create a Person that acts only as a Comparator
    public Person() {
    }

    public Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    public String getName() {
        return name;
    }

    public int getAge() {
        return age;
    }

    // Natural order: by name
    @Override
    public int compareTo(Person o) {
        return this.name.compareTo(o.name);
    }

    // Alternative order: by age. Integer.compare avoids the overflow that o1.age - o2.age can cause
    @Override
    public int compare(Person o1, Person o2) {
        return Integer.compare(o1.age, o2.age);
    }

    public static void main(String[] args) {
        List<Person> list = new ArrayList<Person>();
        list.add(new Person("jake", 20));
        list.add(new Person("sam", 30));
        list.add(new Person("dave", 26));
        list.add(new Person("roger", 28));

        // Sorts the list by natural order (name)
        Collections.sort(list);

        // Prints the list sorted by name
        for (Person a : list) {
            System.out.print(a.getName() + "  : " + a.getAge() + ", ");
        }

        // Sorts the list by age, using a Person instance as the Comparator
        Collections.sort(list, new Person());
        System.out.println(" ");
        // Prints the list sorted by age
        for (Person a : list) {
            System.out.print(a.getName() + "  : " + a.getAge() + ", ");
        }
    }

}

Output:
dave  : 26, jake  : 20, roger  : 28, sam  : 30,
jake  : 20, dave  : 26, roger  : 28, sam  : 30,
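In the example above, Person acts as its own Comparator, so an empty Person has to be created just to sort by age. A common alternative, sketched here with hypothetical AgeSortDemo and AgeComparator names, keeps the comparator in a separate class. This is only one possible arrangement, not the approach the example itself takes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class AgeSortDemo {

    // Simplified stand-in for the Person class used in the example
    static class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Comparator kept separate from the class it compares
    static class AgeComparator implements Comparator<Person> {
        @Override
        public int compare(Person o1, Person o2) {
            return Integer.compare(o1.age, o2.age);
        }
    }

    // Returns the names of the sample people, sorted by age
    static List<String> namesByAge() {
        List<Person> people = new ArrayList<Person>();
        people.add(new Person("jake", 20));
        people.add(new Person("sam", 30));
        people.add(new Person("dave", 26));
        people.add(new Person("roger", 28));
        Collections.sort(people, new AgeComparator());

        List<String> names = new ArrayList<String>();
        for (Person p : people) {
            names.add(p.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(namesByAge()); // [jake, dave, roger, sam]
    }
}
```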

Why doesn’t Collection extend Cloneable and Serializable?

Collection is an interface that specifies a group of objects known as elements. The details of how the group of elements is maintained are left up to the concrete implementations of Collection. For example, some kinds of Collection, such as List, allow duplicate elements whereas others, such as Set, don’t. A lot of the Collection implementations have a public clone method. However, it doesn’t really make sense to include it in all implementations of Collection. This is because Collection is an abstract representation. What matters is the implementation.
The semantics and the implications of either cloning or serializing come into play when dealing with the actual implementation; that is, the concrete implementation should decide how it should be cloned or serialized, or even if it can be cloned or serialized. In some cases, depending on what the actual backing implementation is, cloning and serialization may not make much sense. So mandating cloning and serialization in all implementations is actually less flexible and more restrictive. The specific implementation should make the decision as to whether it can be cloned or serialized.
If the client doesn’t know the actual type of a Collection, it’s much more flexible and less error prone to have the client decide what type of Collection is desired, create an empty Collection of this type, and use the addAll method to copy the elements of the original collection into the new one.
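The copy-by-addAll approach can be sketched as follows. CopyDemo and copyToList are illustrative names, and ArrayList is just one possible choice for the target type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;

public class CopyDemo {

    // The client picks the target type (ArrayList here) and copies with addAll()
    static List<String> copyToList(Collection<String> source) {
        List<String> copy = new ArrayList<String>();
        copy.addAll(source);
        return copy;
    }

    public static void main(String[] args) {
        // The caller only knows this as a Collection, not its concrete type
        Collection<String> original = new HashSet<String>(Arrays.asList("a", "b"));
        List<String> copy = copyToList(original);
        copy.add("c"); // changes the copy only

        System.out.println(original.size()); // 2
        System.out.println(copy.size());     // 3
    }
}
```

Note that this copies the references, not the elements themselves, so the two collections share the same element objects.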

Grab all the code from my GitHub repository.
