The median of a given sample.

No. 157

Calculating the median

Q:

We want to extend exercise Adding support to retrieve statistical data. by adding a method int getMedian(). For this purpose our current implementation lacks ordering of input values. Consider the following sample of values:

2, 7, 0, -3, 4

Obtaining the median requires ordering these values:

-3, 0, 2, 4, 7

Thus the given sample's median is 2. Solve this exercise in the following order:

1. For testing and other purposes it is convenient to provide an additional method returning the array of values being added so far:

/**
* @return The array of values entered so far
*/
public int[] getValues() {
...
return ...;
}

Caution

Do not just return your internal array int[] values! Due to the amortized doubling implementation this will in most cases contain unused positions on top of added values.

You may either construct a suitable copy containing the current elements yourself or get enlightened by reading the API documentation of copyOfRange(...).

2. Provide some tests to assure your sorting implementation works well. You'll implement the actual sorting in the next step. Right now testing for correct sorting will fail (unless a given set of values had already been added in ascending order). A test might look like:

final int[]
unsortedValues = {0, -1, 5, 2, 7, 6},
sortedValues = {-1, 0, 2, 5, 6, 7};

IntegerStore store = new IntegerStore();

for (final int i: unsortedValues) {
}

// Now check your store for correctly sorted order of elements
...

Do not forget to consider value sets which include duplicates and write tests accordingly!

Hint: The Junit Framework provides a (convenience) method assertArrayEquals(...).

3. Modify your addValue(...) method's implementation. Though there are more elaborate sorting methods available in Java we'll do it the hard way ourselves in this exercise. Consider the following example:

store.addValue(1);
store.addValue(3);

Prior to inserting a new value our addValue(...) method shall find a suitable position inside the array of already added values to insert the new value. When adding the last value 3 in the above example the internal array already contains the values (1, 2, 7, 9). Traversing this array shows that the new value of 3 should be inserted between 2 and 7.

Thus a general strategy inserting a new value candidate might be:

1. Find the first index pointing to an existing value being larger or equal to the given candidate. In the above example this index value is 2 pointing to value 7.

If there is no such existing value just add the new value at the array's top end like you did before when not yet bothering about sorting.

2. Shift the right part of the array starting at index 2 in our example one position to the right thus creating a free (denoted by F) insert position:

Index values     | 0| 1| 2| 3| 4| 5| ...
-----------------+--+--+--+--+-----+ ...
values oldArray  | 1| 2| 7| 9|  |  |
-----------------+--+--+--+--+-----+ ...
values newArray  | 1| 2| F| 7| 9|  | ...

You may now insert your latest value 3 at the free index position 2 ending up with a well sorted array (1, 2, 3, 7, 9).

This example just illustrates a (very simple!) sorting algorithm.

Hint: On implementation be very careful with respect to off by one errors you are likely to encounter. The tests you have written beforehand will guide you.

4. Provide a constructor public IntegerStore(final int[] values) in a meaningful way with respect to median calculations.

5. Add a dummy implementation double getMedian(){return 0;} to your class IntegerStore from exercise Adding support to retrieve statistical data. .

6. Provide some tests both for even and uneven sample sizes. All of these will probably fail till you complete your implementation.

7. Finally complete the desired double getMedian() method's implementation and actually test it. There must be at least one element in order to be able returning a meaningful result:

/**
* <dl>
*  <dt><b>Precondition:</b></dt>
*  <dd>There must be at least one element.</dd>
* </dl>
*
* @return The sample's median.
*/
public double getMedian() {
...
return ... ;
}
8. Implement a main method as in Reading console input asking the user for an arbitrary number of values. Then compute both their average and median like:

How big is your sample? 5

Enter value #1 of 5: 1
Enter value #2 of 5: -2
Enter value #3 of 5: 1
Enter value #4 of 5: 5
Enter value #5 of 5: 2

Your sample's average is: 1.4
Your sample's median is: 1.0

A:

1. The copyOfRange(...) method in getValues() returns that portion of our int[] values array actually been filled with data.

2. Provide some tests assuring your sorting implementation works well prior to actually implementing the actual sorting in the next step. Right now testing for correct sorting will fail (unless a given set of values had already been added in ascending order). A test might look like:

final int[]
unsortedValues = {0, -1, 5, 2, 7, 6},
sortedValues = {-1, 0, 2, 5, 6, 7};

IntegerStore store = new IntegerStore();

for (final int i: unsortedValues) {
}

// Now check your store for correctly sorted order of elements
...

Do not forget to consider value sets which include duplicates and write tests accordingly!

Hint: The Junit Framework provides a (convenient) method assertArrayEquals(...).

3. Modify your addValue(...) method's implementation. Though there are more elaborate sorting methods available in Java we'll do it the hard way ourselves in this exercise. Consider the following example:

store.addValue(1);
store.addValue(3);

Prior to inserting a new value our addValue(...) method shall find a suitable position inside the array of already added values to insert the new value. When adding the last value 3 in the above example the internal array already contains the values (1, 2, 7, 9). Traversing this array shows that the new value of 3 should be inserted between 2 and 7.

Thus a general strategy inserting a new value candidate might be:

1. Find the first index pointing to an existing value being larger or equal to the given candidate. In the above example this index value is 2 pointing to value 7.

If there is no such existing value just add the new value at the array's top as you did without bothering about sorting.

2. Shift the right part of the array starting at index 2 in our example one position to the right thus creating a free (denoted by F) insert position:

Index values     | 0| 1| 2| 3| 4| 5| ...
-----------------+--+--+--+--+-----+ ...
values oldArray  | 1| 2| 7| 9|  |  |
-----------------+--+--+--+--+-----+ ...
values newArray  | 1| 2| F| 7| 9|  | ...

You may now insert your latest value 3 at the free index position 2 ending up with a well sorted array (1, 2, 3, 7, 9).

This example just illustrates a (very simple!) sorting algorithm.

Hint: On implementation be very careful with respect to off by one errors you are likely to encounter. The tests you have written beforehand will guide you.

4. The constructor public IntegerStore(final int[] values) internally uses our addValue(...) method thus adding each array value one by one. Consider an alternative naive implementation:

public IntegerStore(final int[] values) {
this.values = values;
}

This will fail in most cases since the array parameter typically contains unsorted values.

Having this constructor in place also simplifies writing tests:

...
@Test
public void testMedian() {
IntegerStore store = new IntegerStore(new int[] {2, 7, 0, -3, 4});
assertArrayEquals(new int[] {-3, 0, 2, 4, 7}, store.getValues());
assertTrue(Math.abs(2. - store.getMedian()) < 1.E-10);
...
5. -

6. @Test
public void testMedian() {
IntegerStore store = new IntegerStore(new int[] {2, 7, 0, -3, 4});
assertArrayEquals(new int[] {-3, 0, 2, 4, 7}, store.getValues());
assertTrue(Math.abs(2. - store.getMedian()) < 1.E-10);

assertArrayEquals(new int[] {-3, 0, 2, 4, 7, 7}, store.getValues());
assertTrue(Math.abs(3. - store.getMedian()) < 1.E-10);

}
7. double getMedian()
8. main(...)