Bucket Sort Algorithm (Java, Python, C/C++ Code Examples)
What is Bucket Sort?
Bucket sort, often called bin sort, is a comparison sort method that accepts an unsorted array as an input and produces a sorted array as a result. This method works by distributing the elements into several buckets and sorting each of those buckets individually by any sorting algorithm such as insertion sort. Then all the buckets are merged together to form a sorted array.
Bucket sort is commonly used when the elements are-
- Floating-point values
- Uniformly distributed over a range
Bucket sort’s time complexity depends on the number of buckets used and the uniformity of the input distribution. While different sorting algorithms such as shell sort, merge sort, heapsort, and quicksort can achieve the best-case time complexity of O(n*logn), the bucket sorting algorithm can achieve the same in linear time complexity or O(n).
Bucket sort follows the scatter-gather approach. Applying this approach, elements are scattered into the corresponding buckets, sorted in the buckets, and gathered to form a sorted array as the final step. This scatter-gather approach is discussed in the following section
Scatter-Gather-Approach
Large-scale, complex problems can occasionally be challenging to solve. The scatter-gather approach attempts to solve such problems by dividing the entire data set into clusters. Then each cluster is addressed separately and brought everything back together to get the final answer.
This is how the bucket sorting algorithm implements the scatter-gather method:
How Bucket Sort Works
The basic working principle of bucket sort is as follows:
- A set of empty buckets is created. Based on different policies, the number of buckets can differ.
- From the input array, put each element into its corresponding bucket.
- Sort those buckets individually.
- Concatenate the sorted buckets to create a single output array.
The in-detailed work steps are provided in the following sections.
Pseudo Code
Start Create N empty buckets For each array element: Calculate bucket index Put that element into the corresponding bucket For each bucket: Sort elements within each bucket Merge all the elements from each bucket Output the sorted array End
Method 1: Bucket Sort Algorithm for Floating-Point Numbers
The bucket sorting algorithm for floating point numbers within the range from 0.0 to 1.0:
Step 1) Create ten(10) empty buckets such that the first bucket will contain numbers within the range [0.0, 0.1). Then the second bucket will contain within [0.1, 0.2), and so on.
Step 2) For each array element:
-
a. Calculate bucket index using the formula:
bucket index= no_of_buckets *array_element
-
b. Insert the element into the bucket[bucket_index]
Step 3) Sort each bucket individually using insertion sort.
Step 4) Concatenate all buckets into a single array.
Let’s see a bucket sort example. For this example, we will sort the following array using the bucket sort algorithm-
Step 1) First, we will create 10 empty buckets. The first bucket will contain the numbers between [0.0, 0.1). Then the second bucket will contain the numbers between [0.1, 0.2) and so on.
Step 2) For each array element, we will calculate the bucket index and place the element into that bucket.
The bucket index can be calculated using the formula:
bucket_index= no_of_buckets*array_element
Bucket Index Calculation:
a) 0.78
bucket_index = no_of_buckets*array_element
=10*0.78
= 7.8
Hence, the element 0.78 will be stored on the bucket[floor(7.8)] or bucket[7].
b) 0.17
bucket_index = no_of_buckets * array element
=10*0.17
= 1.7
The array element 0.17 will be stored on bucket[floor(1.7)] or bucket[1].
c) 0.39
bucket_index = no_of_buckets * array element
= 10*0.39
= 3.9
0.39 will be stored on bucket[floor(3.9)] or bucket[3].
After iterating over all array elements, the buckets will be the following:
Step 3) Then, each bucket will be sorted using insertion sort. After the sorting operation, the output would be:
Step 4) In the final step, the buckets will be concatenated into a single array. That array will be the sorted outcome of the input.
Each bucket will be concatenated to the output array. For example-the concatenation of the second bucket elements:
The concatenation of the last bucket elements will be the following:
After concatenation, the resulting array will be the desired sorted array.
Bucket Sort Program in C/C++
Input:
//Bucket Sort Program in C/C++ //For not having integer parts #include <bits/stdc++.h> #define BUCKET_SIZE 10 using namespace std; void bucketSort(float input[], int array_size) { vector <float>bucket[BUCKET_SIZE]; for (int i = 0; i < array_size; i++) { int index = BUCKET_SIZE*input[i]; bucket[index].push_back(input[i]); } for (int i = 0; i < BUCKET_SIZE; i++) sort(bucket[i].begin(), bucket[i].end()); int out_index = 0; for (int i = 0; i < BUCKET_SIZE; i++) for (int j = 0; j < bucket[i].size(); j++) input[out_index++] = bucket[i][j]; } int main() { float input[]={0.78,0.17,0.39,0.26,0.72,0.94,0.21,0.12,0.23,0.69}; int array_size = sizeof(input)/sizeof(input[0]); bucketSort(input, array_size); cout <<"Sorted Output: \n"; for (int i = 0; i< array_size; i++) cout<<input[i]<<" "; return 0; }
Output:
Sorted Output: 0.12 0.17 0.21 0.23 0.26 0.39 0.69 0.72 0.78 0.94
Bucket Sort Program in Python
Input:
# Bucket Sort Program in Python # For not having integer parts def bucketSort(input): output = [] bucket_size = 10 for bucket in range(bucket_size): output.append([]) for element in input: index = int(bucket_size * element) output[index].append(element) for bucket in range(bucket_size): output[bucket] = sorted(output[bucket]) out_index = 0 for bucket in range(bucket_size): for element in range(len(output[bucket])): input[out_index] = output[bucket][element] out_index += 1 return input input = [0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.69] print("Sorted Output:") print(bucketSort(input))
Output:
Sorted Output: [0.12, 0.17, 0.21, 0.23, 0.26, 0.39, 0.69, 0.72, 0.78, 0.94]
Bucket Sort in Java
Input:
import java.util.ArrayList; import java.util.Collections; import java.util.List; public class BucketSort { private static final int BUCKET_SIZE = 10; public static void bucketSort(float[] input, int arraySize) { List < Float > [] bucket = new ArrayList[BUCKET_SIZE]; for (int i = 0; i < arraySize; i++) { int index = (int)(BUCKET_SIZE * input[i]); if (bucket[index] == null) { bucket[index] = new ArrayList < > (); } bucket[index].add(input[i]); } for (int i = 0; i < BUCKET_SIZE; i++) { if (bucket[i] != null) { Collections.sort(bucket[i]); } } int outIndex = 0; for (int i = 0; i < BUCKET_SIZE; i++) { if (bucket[i] != null) { for (float value: bucket[i]) { input[outIndex++] = value; } } } } public static void main(String[] args) { float[] input = {0.78f,0.17f,0.39f,0.26f,0.72f,0.94f,0.21f,0.12f,0.23f,0.69f}; int arraySize = input.length; bucketSort(input, arraySize); System.out.println("Sorted Output:"); for (int i = 0; i < arraySize; i++) { System.out.print(input[i]+" "); } } }
Output:
Sorted Output: 0.12 0.17 0.21 0.23 0.26 0.39 0.69 0.72 0.78 0.94
Method 2: Bucket Sort Algorithm for Integer Elements
The bucket sort algorithm for the input that contains numbers beyond the range [0.0, 1.0] is a little bit different than the previous algorithm. The steps required for this case are as follows-
Step 1) Find the maximum and minimum elements.
Step 2) Select the number of buckets, n, and initialize those buckets as empty.
Step 3) Calculate the range or span of each bucket using the formula:
span = (maximum - minimum)/n
Step 4) For each array element:
-
1.Calculate bucket index:
bucket_index = (element - minimum)/span
-
2.Insert the element into bucket[bucket_index]
Step 5) Sort each bucket using insertion sort.
Step 6) Concatenate all the buckets into a single array.
Let’s see an example of this bucket sort algorithm. For this example, we will sort the following array using the bucket sort algorithm-
Step 1) In the first step, the maximum and minimum elements of the given array need to be found. For this example, the maximum is 24 and the minimum is 1.
Step 2) Now, we have to select a number of empty buckets, n. In this example, we will take 5 buckets. Then we will initialize them as empty.
Step 3) The span of each bucket needs to be calculated by the following formula:
span = (maximum-minimum)/n = (24-1)/5 = 4
;
Hence, the first bucket will contain the numbers within the range [0, 5). The second bucket will contain the numbers within [5, 10) and so on.
Step 4) For each array element, we will calculate the bucket index and place the element into that bucket. The bucket index can be calculated using the formula:
bucket_index = (element - minimum)/span
Bucket Index Calculation:
-
a) 11bucket_index = (element – minimum)/span
=(11-1)/4
=2
Thus, element 11 will be stored in bucket[2].
-
b) 9
bucket_index = (element – minimum)/span
=(9-1)/4
=2
Note: As 9 is a boundary element for the bucket[1], it needs to be appended in bucket[1] instead of appending in the same bucket of the previous element.
After performing the operations for each element, the buckets will be as follows:
Step 5) Now, each bucket will be sorted using insertion sort. The buckets after sorting-
Step 6) In the final step, the buckets will be concatenated into a single array. That array will be the sorted outcome of the input.
Bucket Sort Program in C/C++
Input:
#include<bits/stdc++.h> using namespace std; void bucketSort(vector < double > & input, int No_Of_Buckets) { double max_value = * max_element(input.begin(), input.end()); double min_value = * min_element(input.begin(), input.end()); double span = (max_value - min_value) / No_Of_Buckets; vector<vector <double>> output; for (int i = 0; i < No_Of_Buckets; i++) output.push_back(vector <double> ()); for (int i = 0; i < input.size(); i++) { double difference = (input[i] - min_value) / span - int((input[i] - min_value) / span); if (difference == 0 && input[i] != min_value) output[int((input[i] - min_value) / span) - 1] .push_back(input[i]); else output[int((input[i] - min_value) / span)].push_back( input[i]); } for (int i = 0; i < output.size(); i++) { if (!output[i].empty()) sort(output[i].begin(), output[i].end()); } int index = 0; for (vector <double> & bucket: output) { if (!bucket.empty()) { for (double i: bucket) { input[index] = i; index++; } } } } int main() { vector <double> input ={11,9,21,8,17,19,13,1,24,12 }; int No_Of_Buckets = 5; bucketSort(input, No_Of_Buckets); cout<< "Sorted Output:"; for (int i; i < input.size(); i++) cout <<input[i]<<" "; return 0; }
Output:
Sorted Output:1 8 9 11 12 13 17 19 21 24
Bucket Sort Program in Python
Input:
def bucketSort(input, No_Of_Buckets): max_element = max(input) min_element = min(input) span = (max_element - min_element) / No_Of_Buckets output = [] for bucket in range(No_Of_Buckets): output.append([]) for element in range(len(input)): diff = (input[element] - min_element) / span - int( (input[element] - min_element) / span ) if diff == 0 and input[element] != min_element: output[int((input[element] - min_element) / span) - 1].append( input[element] ) else: output[int((input[element] - min_element) / span)].append(input[element]) for bucket in range(len(output)): if len(output[bucket]) != 0: output[bucket].sort() index = 0 for bucket in output: if bucket: for element in bucket: input[index] = element index = index + 1 input = [11, 9, 21, 8, 17, 19, 13, 1, 24, 12] No_Of_Buckets = 5 bucketSort(input, No_Of_Buckets) print("Sorted Output:\n", input)
Output:
Sorted Output: [1, 8, 9, 11, 12, 13, 17, 19, 21, 24]
Bucket Sort in Java
Input:
import java.util.ArrayList; import java.util.Collections; import java.util.List; public class BucketSort { public static void bucketSort(List < Double > input, int No_Of_Buckets) { double max_value = Collections.max(input); double min_value = Collections.min(input); double span =(max_value - min_value) / No_Of_Buckets; List < List < Double > > output = new ArrayList < > (); for (int i = 0; i < No_Of_Buckets; i++) { output.add(new ArrayList < > ()); } for (Double value: input) { double difference = (value - min_value) / span - ((value - min_value) / span); if (difference == 0 && value != min_value) { output.get((int)((value - min_value) / span) - 1).add(value); } else { output.get((int)((value - min_value) / span)).add(value); } } for (List <Double> bucket: output) { if (!bucket.isEmpty()) { Collections.sort(bucket); } } int index = 0; for (List <Double> bucket: output) { if (!bucket.isEmpty()) { for (Double value: bucket) { input.set(index,value); index++; } } } } public static void main(String[] args) { List <Double> input = new ArrayList <> (); input.add(11.0); input.add(9.0); input.add(21.0); input.add(8.0); input.add(17.0); input.add(19.0); input.add(13.0); input.add(1.0); input.add(24.0); input.add(12.0); int No_Of_Buckets = 5; bucketSort(input, No_Of_Buckets); System.out.println("Sorted Output:"); for (Double value: input) { System.out.print(value + " "); } } }
Output:
Sorted Output: 1.0 8.0 9.0 11.0 12.0 13.0 17.0 19.0 21.0 24.0
Pros & Cons
Pros | Cons |
---|---|
Perform faster computation | Consumes more space compared to other algorithms |
It can be used as an external sorting method | Performs poorly when the data is not uniformly distributed |
Buckets can be processed independently |
Bucket Sort Complexity Analysis
Bucket Sort’s Time Complexity
- Best Case Complexity:If all the array elements are uniformly distributed and sorted beforehand, it would require O(n) time to scatter the elements into the corresponding buckets. Then sorting each bucket using insertion sort would cost O(k). Thus the overall complexity would be O(n+k).
- Average Case Complexity:For average cases, we assume the inputs are uniformly distributed. Thus the bucket sorting algorithm achieves linear time complexity of O(n+k). Here O(n) time is required for scattering the elements and O(k) time is required for sorting those elements using insertion sort.
- Worst Case Complexity:In the worst case, the elements will not be uniformly distributed and concentrated over one or two specific buckets. In that case, bucket sort will work as a bubble sort algorithm. Hence, in the worst case, the time complexity of a bucket sort would be O(n^2).
Space Complexity of Bucket Sort
Bucket sort’s space complexity is O(n*k). Here n is the number of elements and k is the number of buckets required.