How to sort data in Excel. Description of sorting algorithms and comparison of their performance

Almost every reasonably active user of Microsoft Word knows that the program can create tables. They are not implemented as thoroughly as in Excel, but for everyday needs the capabilities of a text editor are more than enough. We have already written quite a lot about working with tables in Word, and in this article we will look at another topic.

How do you sort a table alphabetically? Most likely, this is not the most popular question among Microsoft Word users, but not everyone knows the answer to it. In this article, we'll show you how to sort the contents of a table alphabetically, as well as how to sort a specific column.

1. Select the table with all its contents: to do this, place the cursor in its upper left corner, wait until the sign for moving the table appears (a small cross located in a square) and click on it.

2. Go to the "Layout" tab (under "Table Tools") and click the "Sort" button in the "Data" group.

Note: Before you start sorting the data in the table, we recommend cutting or copying the information contained in the header (first row) to another location. This will not only make sorting easier, but will also allow you to keep the table header in its place. If the position of the first row of the table is not important for you, and it should also be sorted alphabetically, select it too. You can also simply select the table without a header.

3. Select the required data sorting options in the window that opens.

If you want the data sorted by the first column, set "Column 1" in the "Sort by", "Then by", and "Then by" sections.

If each column of the table should be sorted in alphabetical order, regardless of the other columns, do this:

"Sort by": "Column 1";
"Then by": "Column 2";
"Then by": "Column 3".

Note: In our example, we only sort the first column alphabetically.

For text data, as in our example, the "Type" and "Using" parameters in each row should be left unchanged ("Text" and "Paragraphs", respectively). Numerical data, of course, cannot be sorted alphabetically.

The last column in the "Sort" window sets the sort direction:

"Ascending": alphabetical order (from "A" to "Z");
"Descending": reverse alphabetical order (from "Z" to "A").

4. After setting the required values, click "OK" to close the window and see the changes.

5. The data in the table will be sorted alphabetically.

Don't forget to put the header back in its place. Click in the first cell of the table and press "CTRL+V" or click the "Paste" button in the "Clipboard" group ("Home" tab).

Sort a single table column alphabetically

Sometimes you need to sort data in alphabetical order from only one column of a table. Moreover, this must be done in such a way that the information from all other columns remains in its place. If it concerns only the first column, you can use the method described above, doing it exactly the same way as we did in our example. If this is not the first column, do the following:

1. Select the table column that you want to sort alphabetically.

2. On the "Layout" tab, in the "Data" group, click the "Sort" button.

3. In the window that opens, in the "Sort by" section, select the initial sort option:

the data of a specific cell (in our example, the letter "B");
or the ordinal number of the selected column;
repeat the same for the "Then by" sections.

Note: Which sort key to choose (the "Sort by" and "Then by" parameters) depends on the data in the column cells. In our example, where the cells of the second column contain only letters to be sorted alphabetically, it is enough to specify "Column 2" in all sections. In that case, there is no need to perform the steps described below.

4. At the bottom of the window, set the "My list has" switch to the required position:

"Header row";
"No header row".

Note: The first option takes the header into account, so it stays in place; the second sorts the column as if it had no header.

5. Click the "Options" button below.

6. In the "Sort Options" section, check the box next to "Sort column only".

7. After closing the "Sort Options" window ("OK" button), make sure that every sort level is set to the direction you need: "Ascending" (alphabetical order) or "Descending" (reverse alphabetical order).

8. Close the window by clicking "OK".

The column you select will be sorted alphabetically.

That's all, now you know how to sort a Word table alphabetically.

Sorting data is an integral part of data analysis. You may need to alphabetize names in a list, sort an inventory list in descending order, or organize rows by color or icon. Sorting helps you quickly visualize data, understand and organize it better, find the information you need, and ultimately make better decisions.

You can sort data by text (A to Z or Z to A), numbers (smallest to largest or largest to smallest), and dates and times (oldest to newest or newest to oldest) in one or more columns. You can also sort by custom lists that you create (for example, a list consisting of Large, Medium, and Small items), or by format, including cell color and font color, and by icons.

Sorting text values

Sorting numbers

Sort date and time values

Sort by multiple columns or rows

You may want to sort by two or more columns or rows to group the data with the same values in one column or row, and then sort those groups by another column or row. For example, if you have Department and Employee columns, you can sort first by Department (to group all employees by department) and then by Employee (to alphabetize the names within each department). You can sort by up to 64 columns at once.

Note: For best results, the range being sorted should include column headers.
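
The same two-level ordering can be sketched outside Excel. The C++ fragment below is only an illustration: the `Employee` record and the `sortByDepartmentThenName` helper are hypothetical names, not anything Excel exposes. It orders rows by department and then by name within each department.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical row type: one record per worksheet row.
struct Employee {
    std::string department;
    std::string name;
};

// Sort by department first, then by name within each department,
// which mirrors a two-level "Sort by / Then by" sort.
void sortByDepartmentThenName(std::vector<Employee>& rows) {
    std::sort(rows.begin(), rows.end(),
              [](const Employee& a, const Employee& b) {
                  // std::tie compares the keys left to right.
                  return std::tie(a.department, a.name) <
                         std::tie(b.department, b.name);
              });
}
```

Adding a third level would only mean adding a third field to both `std::tie` calls.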

Sort by cell color, font color, or icon

If you formatted a cell range or table column manually or using conditional formatting using cell color or font color, you can also sort by color. You can also sort by a set of icons created using conditional formatting.

Sorting by custom lists

You can use custom lists to sort in a user-specified order. For example, a column might contain values that you want to sort by, such as High, Medium, and Low. How can you set the sorting to show "High" first, then "Medium", and finally "Low"? If you sort them alphabetically (A to Z), the "High" values will appear at the top, but they will be followed by the "Low" values rather than the "Medium" ones. And when sorting from Z to A, the "Medium" values will be at the very top. In reality, "Medium" values should always appear in the middle, regardless of the sort order. You can solve this problem by creating a custom list.
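
The idea behind a custom list is to compare positions in the list instead of comparing the text itself. A minimal C++ sketch of that idea follows; `sortByCustomList` is a made-up helper for illustration, and placing unlisted values last is an assumption of this sketch, not a statement about Excel's behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Sort values by their position in a user-defined list instead of alphabetically.
// Assumption of this sketch: values missing from the list go after all listed values.
void sortByCustomList(std::vector<std::string>& values,
                      const std::vector<std::string>& customOrder) {
    std::unordered_map<std::string, std::size_t> rank;
    for (std::size_t i = 0; i < customOrder.size(); ++i)
        rank.emplace(customOrder[i], i);

    auto rankOf = [&](const std::string& v) {
        auto it = rank.find(v);
        return it == rank.end() ? customOrder.size() : it->second;
    };

    // stable_sort keeps the original relative order of equal-ranked values.
    std::stable_sort(values.begin(), values.end(),
                     [&](const std::string& a, const std::string& b) {
                         return rankOf(a) < rankOf(b);
                     });
}
```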

Case sensitive sorting

Sorting from left to right

Typically sorting is done from top to bottom, but values can be sorted from left to right.

Note: Tables do not support left-to-right sorting. First convert the table to a range: select any cell in it and choose Table Tools > Convert to Range.

Note: When sorting rows that are part of a worksheet outline, Excel sorts the highest-level groups (level 1) so that the detail rows or columns stay together, even if they are hidden.

Sort by part of the value in a column

To sort by part of the value in a column, such as part of a code (789-WDG-34), a last name (Regina Pokrovskaya), or a first name (Pokrovskaya Regina), you first need to split the column into two or more parts so that the value you want to sort by is in its own column. To split the values in a cell, you can use text functions or the Convert Text to Columns Wizard. For more information and examples, see the articles Splitting text into different cells and Splitting text into different columns using functions.
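
Outside a spreadsheet, the same "split first, then sort" idea can be expressed as a key-extraction function. The sketch below is illustrative only; `middlePart` and `sortByMiddlePart` are hypothetical helpers, and the dash-separated layout of the code is assumed from the 789-WDG-34 example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Return the text between the first and second '-' of a code such as
// "789-WDG-34" (here "WDG"). With only one dash, everything after it is
// returned; with no dash, the whole string is returned.
std::string middlePart(const std::string& code) {
    std::size_t first = code.find('-');
    if (first == std::string::npos) return code;
    std::size_t second = code.find('-', first + 1);
    if (second == std::string::npos) return code.substr(first + 1);
    return code.substr(first + 1, second - first - 1);
}

// Sort whole codes by their middle part only.
void sortByMiddlePart(std::vector<std::string>& codes) {
    std::stable_sort(codes.begin(), codes.end(),
                     [](const std::string& a, const std::string& b) {
                         return middlePart(a) < middlePart(b);
                     });
}
```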

Sort a smaller range within a larger one

Warning: You can sort values in a range that is part of another range, but this is not recommended because it will break the connection between the sorted range and the original data. If you sort the data as shown below, the selected employees will be associated with other departments.

Fortunately, Excel issues a warning if it detects such an attempt:

If you did not intend to sort the data this way, select the "Expand the selection" option; otherwise, select "Continue with the current selection".

If the result is not what you want, click Undo.

Note: Values in a table cannot be sorted this way.

Learn more about common sorting issues

If the results of sorting your data are not what you expected, do the following:

Check whether the values returned by formulas have changed. If the data you're sorting contains one or more formulas, the values they return may change when the worksheet is recalculated. In this case, reapply the sort to get up-to-date results.

Show hidden rows and columns before sorting. Sorting by column does not move hidden rows, and sorting by row does not move hidden columns. Before sorting your data, it's a good idea to show hidden rows and columns.

Check the current locale setting. The sort order depends on the selected language. Make sure that the correct locale is set in Regional Settings or Region and Language in Control Panel. For information about how to change the locale setting, see Microsoft Windows Help.

Enter column headings on only one line. If you need multi-line headings, turn on text wrapping in the cell.

Turn the header row on or off. It is generally recommended to have a header row when sorting by columns because it makes the data easier to understand. By default, the value in the header is not included in the sort. But in some cases, you may want to turn the header on or off so that its value is or is not included in the sort. Do one of the following:

To exclude the first row of data (the column header) from sorting, on the Home tab, in the Editing group, click Sort & Filter, select Custom Sort, and check the My data has headers box.

To include the first row of data in the sort (because it is not a column header), on the Home tab, in the Editing group, click Sort & Filter, select Custom Sort, and uncheck My data has headers.

Hello, dear readers. How do you sort a list in Word? Of course, you can do it manually, dragging the items one after another. Convenient? I don't think so. Let me show you a better way.

I will demonstrate it in Word 2013, but this method also works in Word 2010 and 2007.

To demonstrate ascending sorting in Word, I'll use a small list of names.

Sorting in Word

Before you begin, select the list with the left mouse button. Then, on the "Home" tab, in the "Paragraph" section, click the special button for this: "Sort".

In the "Sort Text" window you can select the data type (text, number, or date) and the direction (ascending or descending). I chose ascending and the text type.

By the way, if you click the "Options" button, you can configure additional sorting options in Word.

Now, to complete the task, click "OK". After that, we get a list in which the names are arranged from A to Z.

If you need to sort in a Word table, the principle is the same. Select the column and follow the same steps. And if the column contains numbers, set the type to Number.
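
Why the type matters can be seen in a small sketch. The C++ fragment below is only an illustration of the difference between comparing values as text and as numbers (the helper names are made up, and it says nothing about how Word implements sorting internally): as text, "10" and "100" come before "9" because the comparison goes character by character.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// "Text" ordering: compare character by character.
std::vector<std::string> sortAsText(std::vector<std::string> items) {
    std::sort(items.begin(), items.end());
    return items;
}

// "Number" ordering: compare the parsed values.
// Assumes every item is a valid integer that fits in long long.
std::vector<std::string> sortAsNumbers(std::vector<std::string> items) {
    std::sort(items.begin(), items.end(),
              [](const std::string& a, const std::string& b) {
                  return std::stoll(a) < std::stoll(b);
              });
    return items;
}
```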

That's all there is to it. Even if you need to sort alphabetically in Word 2010, there is nothing difficult about it, because the interfaces are similar.

Insertion sort

Let's create an array that will hold the answer when the algorithm finishes. We insert elements from the original array one by one so that the elements in the answer array are always sorted. Asymptotics: O(n^2) in the average and worst case, O(n) in the best case. It is more convenient to implement the algorithm differently (creating a new array and actually inserting into it is relatively awkward): we simply keep some prefix of the original array sorted, and instead of inserting we swap the current element with the previous one while they are in the wrong order.

Implementation:

    #include <utility>  // std::swap

    // Sorts the half-open range [l, r) in place.
    void insertionsort(int* l, int* r) {
        for (int* i = l + 1; i < r; i++) {
            int* j = i;
            while (j > l && *(j - 1) > *j) {
                std::swap(*(j - 1), *j);
                j--;
            }
        }
    }

Shell sort

We take the same idea as comb sort and apply it to insertion sort. Let's fix some distance (gap). The elements of the array are then divided into classes: elements whose indices differ by a multiple of the gap fall into one class. We sort each class with insertion sort and repeat with a smaller gap. Unlike comb sort, the optimal set of gaps is unknown. There are quite a few sequences with different complexity bounds:

Shell's sequence: the first gap is half the length of the array, and each next gap is half the previous one. Worst case: O(n^2).
Hibbard's sequence: 2^k - 1. Worst case: O(n^1.5).
Sedgewick's sequence (the formula is non-trivial). Worst case: O(n^(4/3)).
Pratt's sequence (all products of powers of two and three): O(n log^2 n).

Note that all these sequences only need to be computed up to the size of the array and applied from the largest gap to the smallest (otherwise it is just insertion sort). I also did some additional research and tested different sequences of the form s_i = a * s_(i-1) + k * s_(i-1) (this was partly inspired by Ciura's empirical sequence, one of the best gap sequences for arrays with a small number of elements). The best sequences turned out to be those with coefficients a = 3, k = 1/3; a = 4, k = 1/4; and a = 4, k = -1/5.

Implementations:

    #include <algorithm>  // std::max
    #include <utility>    // std::swap

    // One insertion-sort pass over elements that are `step` apart.
    void gapinsertion(int* l, int* r, int step) {
        for (int* i = l + step; i < r; i++) {
            int* j = i;
            while (j - l >= step && *(j - step) > *j) {
                std::swap(*(j - step), *j);
                j -= step;
            }
        }
    }

    // Shell's sequence: n/2, n/4, ..., 1.
    void shellsort(int* l, int* r) {
        int sz = r - l;
        for (int step = sz / 2; step >= 1; step /= 2) gapinsertion(l, r, step);
    }

    // Hibbard's sequence: 2^k - 1.
    void shellsorthib(int* l, int* r) {
        int sz = r - l;
        if (sz <= 1) return;
        int step = 1;
        while (step < sz) step <<= 1;
        step >>= 1;
        step--;
        if (step < 1) step = 1;  // sz == 2 would otherwise leave no gap at all
        for (; step >= 1; step /= 2) gapinsertion(l, r, step);
    }

    int steps[512];  // gap buffer; the size was lost in the source, 512 is an assumed bound

    // Sedgewick's sequence.
    void shellsortsedgwick(int* l, int* r) {
        int sz = r - l;
        steps[0] = 1;
        int q = 1;
        while (steps[q - 1] * 3 < sz) {
            if (q % 2 == 0)
                steps[q] = 9 * (1 << q) - 9 * (1 << (q / 2)) + 1;
            else
                steps[q] = 8 * (1 << q) - 6 * (1 << ((q + 1) / 2)) + 1;
            q++;
        }
        for (q--; q >= 0; q--) gapinsertion(l, r, steps[q]);
    }

    // Pratt's sequence: all numbers 2^i * 3^j up to sz / 2 (the gap 1 is always included).
    // Uses insertionsort defined above to order the gaps.
    void shellsortpratt(int* l, int* r) {
        int sz = r - l, q = 0;
        int limit = std::max(sz / 2, 1);
        for (long long p2 = 1; p2 <= limit; p2 *= 2)
            for (long long cur = p2; cur <= limit; cur *= 3)
                steps[q++] = (int)cur;
        insertionsort(steps, steps + q);
        for (q--; q >= 0; q--) gapinsertion(l, r, steps[q]);
    }

    // Sequences s_i = a * s_(i-1) + s_(i-1) / d, that is, k = 1 / d.
    void customshell(int* l, int* r, int a, int d) {
        int sz = r - l, q = 1;
        steps[0] = 1;
        while (steps[q - 1] < sz) {
            int s = steps[q - 1];
            steps[q++] = s * a + s / d;
        }
        for (q--; q >= 0; q--) gapinsertion(l, r, steps[q]);
    }

    void myshell1(int* l, int* r) { customshell(l, r, 4, 4); }   // a = 4, k = 1/4
    void myshell2(int* l, int* r) { customshell(l, r, 3, 3); }   // a = 3, k = 1/3
    void myshell3(int* l, int* r) { customshell(l, r, 4, -5); }  // a = 4, k = -1/5

Tree sort

We insert the elements into a binary search tree. After all the elements are inserted, it is enough to traverse the tree depth-first (in order) to get a sorted array. If you use a balanced tree, such as a red-black tree, the asymptotics is O(n log n) in the worst, average, and best case. The implementation uses the multiset container.

Implementation:

    #include <set>

    void treesort(int* l, int* r) {
        std::multiset<int> m;
        for (int* i = l; i < r; i++) m.insert(*i);
        // Iterating a multiset visits the elements in sorted order.
        for (int q : m) *l = q, l++;
    }

Gnome sort

The algorithm is similar to insertion sort. We maintain a pointer to the current element; if it is not smaller than the previous one, or it is the first element, we move the pointer one position to the right; otherwise we swap the current and previous elements and move to the left.

Implementation:

    #include <utility>  // std::swap

    void gnomesort(int* l, int* r) {
        int* i = l;
        while (i < r) {
            if (i == l || *(i - 1) <= *i) i++;
            else std::swap(*(i - 1), *i), i--;
        }
    }

Selection sort

At each iteration, we find the minimum in the part of the array after the current element and swap it with the current element if necessary. Thus, after the i-th iteration, the first i elements are in their final places. Asymptotics: O(n^2) in the best, average, and worst case. Note that this sort can be implemented in two ways: by storing the minimum and its index, or by simply swapping the current element with the one being examined whenever they are in the wrong order. The first method turned out to be a little faster, which is why it is the one implemented.

Implementation:

    #include <utility>  // std::swap

    void selectionsort(int* l, int* r) {
        for (int* i = l; i < r; i++) {
            int minz = *i, *ind = i;  // minimum of the unsorted part and its position
            for (int* j = i + 1; j < r; j++) {
                if (*j < minz) minz = *j, ind = j;
            }
            std::swap(*i, *ind);
        }
    }

Heap sort / Heapsort

A development of the idea behind selection sort. We use the "heap" data structure (also called a "pyramid"), hence the name of the algorithm. It lets you read the minimum in O(1), and add an element or extract the minimum in O(log n). Thus, the asymptotics is O(n log n) in the worst, average, and best case. I implemented the heap myself, although C++ has a priority_queue container, since this container is quite slow.

Implementation:

    #include <utility>  // std::swap
    #include <vector>

    // Binary min-heap stored in a vector.
    template <class T>
    class heap {
    public:
        int size() { return n; }
        T top() { return h[0]; }
        bool empty() { return n == 0; }
        void push(T a) {
            h.push_back(a);
            SiftUp(n);
            n++;
        }
        void pop() {
            n--;
            std::swap(h[n], h[0]);
            h.pop_back();
            SiftDown(0);
        }
        void clear() { h.clear(); n = 0; }
        T operator[](int a) { return h[a]; }
    private:
        std::vector<T> h;
        int n = 0;
        void SiftUp(int a) {
            while (a) {
                int p = (a - 1) / 2;  // parent index
                if (h[p] > h[a]) std::swap(h[p], h[a]);
                else break;
                a = p;
            }
        }
        void SiftDown(int a) {
            while (2 * a + 1 < n) {
                int l = 2 * a + 1, r = 2 * a + 2;
                if (r == n) {  // only the left child exists
                    if (h[l] < h[a]) std::swap(h[l], h[a]);
                    break;
                }
                else if (h[l] <= h[r]) {
                    if (h[l] < h[a]) { std::swap(h[l], h[a]); a = l; }
                    else break;
                }
                else if (h[r] < h[a]) { std::swap(h[r], h[a]); a = r; }
                else break;
            }
        }
    };

    void heapsort(int* l, int* r) {
        heap<int> h;
        for (int* i = l; i < r; i++) h.push(*i);
        for (int* i = l; i < r; i++) {
            *i = h.top();
            h.pop();
        }
    }
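
For comparison, the same approach can be sketched with the standard container that the text mentions. This is only an illustration and not the implementation that was benchmarked; the name `pqheapsort` is made up for this example.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Heapsort on the half-open range [l, r) using the standard priority queue.
void pqheapsort(int* l, int* r) {
    // std::greater turns the default max-heap into a min-heap.
    std::priority_queue<int, std::vector<int>, std::greater<int>> h(l, r);
    for (int* i = l; i < r; i++) {
        *i = h.top();  // smallest remaining element
        h.pop();
    }
}
```

The range constructor builds the heap in one step, so the structure of the sort is the same: fill the heap, then repeatedly extract the minimum.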

Quick sort / Quicksort

Choose some pivot element. Then move all elements smaller than the pivot to the left and all larger ones to the right, and recurse on each part. The result is a sorted array, since every element smaller than the pivot ends up before every element larger than it. Asymptotics: O(n log n) in the average and best case, O(n^2) in the worst case. The worst case occurs when the pivot is chosen poorly. My implementation of this algorithm is completely standard: we move simultaneously from the left and from the right, find a pair of elements such that the left one is larger than the pivot and the right one is smaller, and swap them. Besides pure quicksort, the comparison also included a version that switches to insertion sort when the number of elements is small. The threshold was chosen by testing, and insertion sort is the best fit for this task (although this does not mean that it is the fastest of the quadratic sorts).

Implementation:

    #include <utility>  // std::swap

    void quicksort(int* l, int* r) {
        if (r - l <= 1) return;
        int z = *(l + (r - l) / 2);  // pivot: the middle element
        int *ll = l, *rr = r - 1;
        while (ll <= rr) {
            while (*ll < z) ll++;
            while (*rr > z) rr--;
            if (ll <= rr) {
                std::swap(*ll, *rr);
                ll++;
                rr--;
            }
        }
        if (l < rr) quicksort(l, rr + 1);
        if (ll < r) quicksort(ll, r);
    }

    // Same partitioning, but small ranges go to insertionsort (defined above).
    void quickinssort(int* l, int* r) {
        if (r - l <= 32) {
            insertionsort(l, r);
            return;
        }
        int z = *(l + (r - l) / 2);
        int *ll = l, *rr = r - 1;
        while (ll <= rr) {
            while (*ll < z) ll++;
            while (*rr > z) rr--;
            if (ll <= rr) {
                std::swap(*ll, *rr);
                ll++;
                rr--;
            }
        }
        if (l < rr) quickinssort(l, rr + 1);
        if (ll < r) quickinssort(ll, r);
    }

Merge sort

A sort based on the divide-and-conquer paradigm. We divide the array in half, recursively sort the parts, and then perform the merge procedure: we maintain two pointers, one to the current element of the first part and one to the current element of the second part. Of these two elements, we take the smaller one, append it to the answer, and advance the corresponding pointer. Merging takes O(n), and there are log n levels in total, so the asymptotics is O(n log n). It is efficient to create a temporary array in advance and pass it to the function as an argument. This sort is recursive, like quicksort, so it is also possible to switch to a quadratic sort when the number of elements is small.

Implementation:

    // Merges the sorted ranges [l, m) and [m, r) using temp as scratch space.
    void merge(int* l, int* m, int* r, int* temp) {
        int *cl = l, *cr = m, cur = 0;
        while (cl < m && cr < r) {
            if (*cl < *cr) temp[cur++] = *cl, cl++;
            else temp[cur++] = *cr, cr++;
        }
        while (cl < m) temp[cur++] = *cl, cl++;
        while (cr < r) temp[cur++] = *cr, cr++;
        cur = 0;
        for (int* i = l; i < r; i++) *i = temp[cur++];
    }

    void _mergesort(int* l, int* r, int* temp) {
        if (r - l <= 1) return;
        int* m = l + (r - l) / 2;
        _mergesort(l, m, temp);
        _mergesort(m, r, temp);
        merge(l, m, r, temp);
    }

    void mergesort(int* l, int* r) {
        int* temp = new int[r - l];
        _mergesort(l, r, temp);
        delete[] temp;
    }

    // Same recursion, but small ranges go to insertionsort (defined above).
    void _mergeinssort(int* l, int* r, int* temp) {
        if (r - l <= 32) {
            insertionsort(l, r);
            return;
        }
        int* m = l + (r - l) / 2;
        _mergeinssort(l, m, temp);
        _mergeinssort(m, r, temp);
        merge(l, m, r, temp);
    }

    void mergeinssort(int* l, int* r) {
        int* temp = new int[r - l];
        _mergeinssort(l, r, temp);
        delete[] temp;
    }

Counting sort

Let's create an array of size r - l + 1, where l is the minimum and r is the maximum element of the array. Then we go through the array and count the number of occurrences of each element. Now we can go through the array of counts and write out each number as many times as it occurs. Asymptotics: O(n + r - l). The algorithm can be modified to make it stable: we determine the position where each value should start (these are just prefix sums over the array of counts) and go through the original array from left to right, putting each element in its place and increasing that position by 1. This sort was not tested, because most tests contained numbers too large to allocate an array of the required size. However, it still turned out to be useful.
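
The stable variant described above can be sketched as follows. This is a minimal illustration rather than code from the tests; the name `countingsort` and the use of std::vector for the counters are choices made for this example, and it assumes that the value range is small enough for the counter array to fit in memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Stable counting sort of the half-open range [l, r).
void countingsort(int* l, int* r) {
    if (r - l <= 1) return;
    const int minz = *std::min_element(l, r);
    const int maxz = *std::max_element(l, r);
    const std::size_t range =
        static_cast<std::size_t>(static_cast<long long>(maxz) - minz) + 1;
    // Index of a value in the counter array.
    auto slot = [minz](int v) {
        return static_cast<std::size_t>(static_cast<long long>(v) - minz);
    };

    // pos[slot(v) + 1] first holds the number of occurrences of v.
    std::vector<std::size_t> pos(range + 1, 0);
    for (int* i = l; i < r; i++) pos[slot(*i) + 1]++;
    // Prefix sums: pos[slot(v)] becomes the first output index for value v.
    for (std::size_t v = 1; v <= range; v++) pos[v] += pos[v - 1];

    // A left-to-right pass keeps equal elements in their original order.
    std::vector<int> out(static_cast<std::size_t>(r - l));
    for (int* i = l; i < r; i++) out[pos[slot(*i)]++] = *i;
    std::copy(out.begin(), out.end(), l);
}
```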

Bucket sort

(also known as bin sort or pocket sort). Let l be the minimum and r the maximum element of the array. We divide the elements into blocks: the first contains elements from l to l + k, the second from l + k to l + 2k, and so on, where k = (r - l) / number of blocks. In particular, if the number of blocks is two, the algorithm turns into a kind of quicksort. The asymptotic behavior of this algorithm is unclear; the running time depends on both the input data and the number of blocks. It is claimed that on favorable data the running time is linear. Implementing this algorithm turned out to be one of the most difficult tasks. You can do it like this: simply create new arrays, sort them recursively, and concatenate them. However, this approach is quite slow and did not suit me. An efficient implementation uses several ideas:

1) We do not create new arrays. Instead, we use the counting sort technique: we count the number of elements in each block, then the prefix sums, and thus the position of each element in the array.

2) We do not recurse into empty blocks. We put the indices of the non-empty blocks into a separate array and recurse only into them.

3) We check whether the array is already sorted. This does not worsen the running time, since a pass is needed anyway to find the minimum and maximum, but it speeds the algorithm up on partially sorted data, since elements are placed into the new blocks in the same order as in the original array.

4) Since the algorithm turned out to be quite cumbersome, it is extremely inefficient on a small number of elements, to such an extent that switching to insertion sort speeds it up by about 10 times.

All that remains is to decide how many blocks to use. On randomized tests I obtained the following estimate: 1500 blocks for 10^7 elements and 3000 for 10^8. I was not able to find a formula: every attempt made the running time several times worse.

Implementation:

    #include <algorithm>  // std::min, std::max

    // Uses insertionsort defined above; temp must hold at least r - l elements.
    void _newbucketsort(int* l, int* r, int* temp) {
        if (r - l <= 64) {
            insertionsort(l, r);
            return;
        }
        int minz = *l, maxz = *l;
        bool is_sorted = true;
        for (int* i = l + 1; i < r; i++) {
            minz = std::min(minz, *i);
            maxz = std::max(maxz, *i);
            if (*i < *(i - 1)) is_sorted = false;
        }
        if (is_sorted) return;
        int diff = maxz - minz + 1;
        int numbuckets;
        if (r - l <= 1e7) numbuckets = 1500;
        else numbuckets = 3000;
        int range = (diff + numbuckets - 1) / numbuckets;  // width of one block
        int* cnt = new int[numbuckets + 1];
        for (int i = 0; i <= numbuckets; i++) cnt[i] = 0;
        int cur = 0;
        for (int* i = l; i < r; i++) {
            temp[cur++] = *i;
            int ind = (*i - minz) / range;
            cnt[ind + 1]++;
        }
        // Collect the indices of non-empty blocks.
        int sz = 0;
        for (int i = 1; i <= numbuckets; i++)
            if (cnt[i]) sz++;
        int* run = new int[sz];
        cur = 0;
        for (int i = 1; i <= numbuckets; i++)
            if (cnt[i]) run[cur++] = i - 1;
        // Prefix sums: cnt[b] is now the start of block b.
        for (int i = 1; i <= numbuckets; i++) cnt[i] += cnt[i - 1];
        cur = 0;
        for (int* i = l; i < r; i++) {
            int ind = (temp[cur] - minz) / range;
            *(l + cnt[ind]) = temp[cur];
            cur++;
            cnt[ind]++;
        }
        // After the scatter pass, cnt[b] is the end of block b.
        for (int i = 0; i < sz; i++) {
            int b = run[i];
            if (b != 0) _newbucketsort(l + cnt[b - 1], l + cnt[b], temp);
            else _newbucketsort(l, l + cnt[b], temp);
        }
        delete[] run;
        delete[] cnt;
    }

    void newbucketsort(int* l, int* r) {
        int* temp = new int[r - l];
        _newbucketsort(l, r, temp);
        delete[] temp;
    }

Radix sort

(also known as digital sort). There are two versions of this sort which, in my opinion, have little in common except for the idea of using the representation of a number in some numeral system (for example, binary).

LSD (least significant digit):

Let's represent each number in binary. At step i of the algorithm, we sort the numbers so that they are ordered by their lowest k * i bits, where k is some constant. It follows from this definition that at each step it is enough to stably sort the elements by the next k bits. Counting sort is ideal for this (it requires 2^k memory and time, which is not much if the constant is chosen well). Asymptotics: O(n), if we assume that the numbers have a fixed size (otherwise we could not assume that two numbers are compared in unit time). The implementation is quite simple.

Implementation:

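    // k-th group of N bits of n, counting from the least significant; M = 2^N.
    int digit(int n, int k, int N, int M) {
        return (n >> (N * k) & (M - 1));
    }

    // Orders non-negative values correctly; negative numbers are not handled.
    void _radixsort(int* l, int* r, int N) {
        int k = (32 + N - 1) / N;  // number of passes
        int M = 1 << N;
        int sz = r - l;
        int* b = new int[sz];
        int* c = new int[M];
        for (int i = 0; i < k; i++) {
            for (int j = 0; j < M; j++) c[j] = 0;
            for (int* j = l; j < r; j++) c[digit(*j, i, N, M)]++;
            for (int j = 1; j < M; j++) c[j] += c[j - 1];
            // Going right to left keeps the pass stable.
            for (int* j = r - 1; j >= l; j--) b[--c[digit(*j, i, N, M)]] = *j;
            int cur = 0;
            for (int* j = l; j < r; j++) *j = b[cur++];
        }
        delete[] b;
        delete[] c;
    }

    void radixsort(int* l, int* r) {
        _radixsort(l, r, 8);
    }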

MSD (most significant digit):

In effect, a kind of bucket sort. One block contains the numbers whose current k bits are equal. The asymptotics is the same as for the LSD version. The implementation is very similar to bucket sort, but simpler. It uses the digit function defined in the implementation of the LSD version.

Implementation:

    // Uses digit and insertionsort defined above; d is the index of the current digit.
    void _radixsortmsd(int* l, int* r, int N, int d, int* temp) {
        if (d == -1) return;
        if (r - l <= 32) {
            insertionsort(l, r);
            return;
        }
        int M = 1 << N;
        int* cnt = new int[M + 1];
        for (int i = 0; i <= M; i++) cnt[i] = 0;
        int cur = 0;
        for (int* i = l; i < r; i++) {
            temp[cur++] = *i;
            cnt[digit(*i, d, N, M) + 1]++;
        }
        // Collect the indices of non-empty blocks.
        int sz = 0;
        for (int i = 1; i <= M; i++)
            if (cnt[i]) sz++;
        int* run = new int[sz];
        cur = 0;
        for (int i = 1; i <= M; i++)
            if (cnt[i]) run[cur++] = i - 1;
        for (int i = 1; i <= M; i++) cnt[i] += cnt[i - 1];
        cur = 0;
        for (int* i = l; i < r; i++) {
            int ind = digit(temp[cur], d, N, M);
            *(l + cnt[ind]) = temp[cur];
            cur++;
            cnt[ind]++;
        }
        for (int i = 0; i < sz; i++) {
            int b = run[i];
            if (b != 0) _radixsortmsd(l + cnt[b - 1], l + cnt[b], N, d - 1, temp);
            else _radixsortmsd(l, l + cnt[b], N, d - 1, temp);
        }
        delete[] run;
        delete[] cnt;
    }

    void radixsortmsd(int* l, int* r) {
        int* temp = new int[r - l];
        _radixsortmsd(l, r, 8, 3, temp);
        delete[] temp;
    }

Bitonic sort:

The idea of this algorithm is that the original array is converted into a bitonic sequence: a sequence that first increases and then decreases. Such a sequence can be sorted efficiently as follows: we split the array into two halves and create two arrays; into the first we put the minimum of each pair of corresponding elements of the two halves, and into the second the maximum. It is claimed that this yields two bitonic sequences, each of which can be sorted recursively in the same way, after which the two arrays can be concatenated (since any element of the first is less than or equal to any element of the second). To convert the original array into a bitonic sequence, we do the following: if the array consists of two elements, we can simply stop; otherwise we divide the array in half, recursively call the algorithm on the halves, then sort the first half in ascending order and the second in descending order, and concatenate them. The result is obviously a bitonic sequence. Asymptotics: O(n log^2 n), since constructing the bitonic sequence uses a sort that takes O(n log n), and there are log n levels in total. Also note that the size of the array must be a power of two, so it may have to be padded with dummy elements (which does not affect the asymptotics).

Implementation:

    #include <algorithm>  // std::max_element
    #include <utility>    // std::swap

    // Sorts a bitonic sequence: ascending if inv is false, descending otherwise.
    void bitseqsort(int* l, int* r, bool inv) {
        if (r - l <= 1) return;
        int* m = l + (r - l) / 2;
        for (int *i = l, *j = m; i < m && j < r; i++, j++) {
            if (inv ^ (*i > *j)) std::swap(*i, *j);
        }
        bitseqsort(l, m, inv);
        bitseqsort(m, r, inv);
    }

    void makebitonic(int* l, int* r) {
        if (r - l <= 1) return;
        int* m = l + (r - l) / 2;
        makebitonic(l, m);
        bitseqsort(l, m, 0);
        makebitonic(m, r);
        bitseqsort(m, r, 1);
    }

    void bitonicsort(int* l, int* r) {
        if (r - l <= 1) return;
        int n = 1;
        // Padding with the maximum itself avoids overflow of max + 1.
        int inf = *std::max_element(l, r);
        while (n < r - l) n *= 2;
        int* a = new int[n];
        int cur = 0;
        for (int* i = l; i < r; i++) a[cur++] = *i;
        while (cur < n) a[cur++] = inf;
        makebitonic(a, a + n);
        bitseqsort(a, a + n, 0);
        cur = 0;
        for (int* i = l; i < r; i++) *i = a[cur++];
        delete[] a;
    }

Timsort

A hybrid sort that combines insertion sort and merge sort. We split the array into several small subarrays (runs), extending each run while its elements are already in order. We sort the runs with insertion sort, taking advantage of the fact that it works efficiently on nearly sorted arrays. Next, we merge the runs as in merge sort, choosing runs of approximately equal size (otherwise the running time approaches quadratic). For this purpose, it is convenient to store the runs on a stack, maintaining the invariant that the farther a run is from the top, the larger it is, and to merge the runs at the top only when the size of the third run from the top is greater than or equal to the sum of their sizes. Asymptotics: O(n) in the best case and O(n log n) in the average and worst case. The implementation is non-trivial and I am not fully confident in it, but the running time was quite good and consistent with my idea of how this sort should behave.

Timsort is described in more detail here:

Implementation:

#include <algorithm>
#include <stack>
#include <utility>

// Helpers from the insertion sort and merge sort sections
// (signatures assumed from the calls below).
void insertionsort(int* l, int* r);
void merge(int* l, int* m, int* r, int* temp);

void _timsort(int* l, int* r, int* temp) {
    int sz = r - l;
    if (sz <= 64) {
        insertionsort(l, r);
        return;
    }
    // minrun: the top six bits of sz, plus one if any lower bit was set.
    int minrun = sz, f = 0;
    while (minrun >= 64) {
        f |= minrun & 1;
        minrun >>= 1;
    }
    minrun += f;
    int* cur = l;
    std::stack<std::pair<int, int*>> s;  // (run length, run start)
    while (cur < r) {
        // Find the longest non-decreasing and non-increasing prefixes.
        int* c1 = cur;
        while (c1 < r - 1 && *c1 <= *(c1 + 1)) c1++;
        int* c2 = cur;
        while (c2 < r - 1 && *c2 >= *(c2 + 1)) c2++;
        if (c1 >= c2) {
            c1 = std::max(c1, cur + minrun - 1);
            c1 = std::min(c1, r - 1);
            insertionsort(cur, c1 + 1);
            s.push({ c1 - cur + 1, cur });
            cur = c1 + 1;
        } else {
            c2 = std::max(c2, cur + minrun - 1);
            c2 = std::min(c2, r - 1);
            std::reverse(cur, c2 + 1);
            insertionsort(cur, c2 + 1);
            s.push({ c2 - cur + 1, cur });
            cur = c2 + 1;
        }
        // Restore the stack invariant by merging runs near the top.
        while (s.size() >= 3) {
            std::pair<int, int*> x = s.top(); s.pop();
            std::pair<int, int*> y = s.top(); s.pop();
            std::pair<int, int*> z = s.top(); s.pop();
            if (z.first >= x.first + y.first && y.first >= x.first) {
                s.push(z);
                s.push(y);
                s.push(x);
                break;
            } else if (z.first >= x.first + y.first) {
                merge(y.second, x.second, x.second + x.first, temp);
                s.push(z);
                s.push({ x.first + y.first, y.second });
            } else {
                merge(z.second, y.second, y.second + y.first, temp);
                s.push({ z.first + y.first, z.second });
                s.push(x);
            }
        }
    }
    // Merge whatever is left on the stack into a single run.
    while (s.size() != 1) {
        std::pair<int, int*> x = s.top(); s.pop();
        std::pair<int, int*> y = s.top(); s.pop();
        if (x.second < y.second) std::swap(x, y);
        merge(y.second, x.second, x.second + x.first, temp);
        s.push({ y.first + x.first, y.second });
    }
}

void timsort(int* l, int* r) {
    int* temp = new int[r - l];  // scratch buffer for merging
    _timsort(l, r, temp);
    delete[] temp;
}
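The minrun computation used in the implementation above can be isolated into a small function to see what values it produces. This is only a sketch; the name computeMinrun is hypothetical and not part of the original code.

```cpp
#include <cassert>

// Same loop as in _timsort: shift n right until it drops below 64,
// remembering in f whether any shifted-out bit was set, then add f.
// For n >= 64 the result lies between 32 and 64.
int computeMinrun(int n) {
    assert(n >= 0);
    int f = 0;
    while (n >= 64) {
        f |= n & 1;
        n >>= 1;
    }
    return n + f;
}
```

For example, an array of 2048 elements gets a minrun of 32 (so it splits into exactly 64 runs, a power of two, which keeps the merges balanced), while 1000000 elements give a minrun of 62.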

Testing

Hardware and system

Processor: Intel Core i7-3770 CPU 3.40 GHz
RAM: 8 GB
Testing was carried out on an almost clean Windows 10 x64 system, installed a few days before the tests were run. The IDE used is Microsoft Visual Studio 2015.

Tests

All tests are divided into four groups. The first group is an array of random numbers modulo different values (10, 1000, 10^5, 10^7 and 10^9). The second group is an array divided into several sorted subarrays: an array of random numbers modulo 10^9 was taken, and then consecutive subarrays were sorted, each of size equal to the minimum of the length of the remaining suffix and a random number modulo some constant. The sequence of constants is 10, 100, 1000 and so on, up to the size of the array. The third group is an initially sorted array of random numbers with a certain number of "swaps", that is, exchanges of two random elements. The sequence of swap counts is the same as in the previous group. Finally, the last group consists of several tests with a completely sorted array (in forward and reverse order), several tests with an initial array of natural numbers from 1 to n in which several numbers are replaced by random ones, and tests with a large number of repetitions of one element (10%, 25%, 50%, 75% and 90%). Thus, the tests show how the sorts behave on random and partially sorted arrays, which seems to be the most significant. The fourth group is largely directed against linear time sorts, which love sequences of random numbers. At the end of the article there is a link to a file that describes all the tests in detail.
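The first three test groups can be sketched as generators. This is an illustration under stated assumptions rather than the author's test code: the function names and the std::mt19937 generator are my choices, and block lengths are drawn from 1 to maxBlock (not 0 to maxBlock - 1) so that the loop always advances.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Group 1: n random numbers modulo `mod`.
std::vector<int> randomArray(std::size_t n, int mod, std::mt19937& rng) {
    assert(mod > 0);
    std::uniform_int_distribution<int> dist(0, mod - 1);
    std::vector<int> a(n);
    for (int& x : a) x = dist(rng);
    return a;
}

// Group 2: random numbers modulo 10^9, then consecutive blocks are sorted.
// Each block length is min(remaining suffix, random value in [1, maxBlock]).
std::vector<int> partiallySorted(std::size_t n, std::size_t maxBlock, std::mt19937& rng) {
    assert(maxBlock > 0);
    std::vector<int> a = randomArray(n, 1000000000, rng);
    std::uniform_int_distribution<std::size_t> len(1, maxBlock);
    for (std::size_t pos = 0; pos < n;) {
        const std::size_t block = std::min(n - pos, len(rng));
        const auto first = a.begin() + static_cast<std::ptrdiff_t>(pos);
        std::sort(first, first + static_cast<std::ptrdiff_t>(block));
        pos += block;
    }
    return a;
}

// Group 3: a sorted random array with `swaps` exchanges of two random positions.
std::vector<int> sortedWithSwaps(std::size_t n, std::size_t swaps, std::mt19937& rng) {
    std::vector<int> a = randomArray(n, 1000000000, rng);
    std::sort(a.begin(), a.end());
    if (n == 0) return a;
    std::uniform_int_distribution<std::size_t> idx(0, n - 1);
    for (std::size_t s = 0; s < swaps; ++s) {
        const std::size_t i = idx(rng);
        const std::size_t j = idx(rng);
        std::swap(a[i], a[j]);
    }
    return a;
}
```

Seeding the generator with a fixed value (for example std::mt19937 rng(42)) makes every sort see the same inputs, which matters when comparing running times.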

Input size

It would be pretty stupid to compare, for example, a linear time sort and a quadratic sort by running them on tests of the same size. Therefore, each of the test groups is divided into four more groups, with sizes of 10^5, 10^6, 10^7 and 10^8 elements. The sorts were divided into three groups: the first contains the quadratic sorts (bubble, insertion, selection, shaker and gnome sorts), the second contains something between O(n log n) and quadratic (bitonic sort, several variants of Shell sort and tree sort), and the third contains all the rest. Some may be surprised that tree sort is not in the third group even though its asymptotic behavior is O(n log n); unfortunately, its constant is very large. The sorts of the first group were run on tests with 10^5 elements, the second group on tests with 10^6 and 10^7, and the third on tests with 10^7 and 10^8. It is precisely these data sizes that make the growth in running time visible: with smaller sizes the measurement error is too large, and with larger sizes the algorithm takes too long (or runs out of RAM). For the first group I settled for a single size so as not to break the tenfold step (10^4 elements are too few for quadratic sorts); after all, they are of little interest on their own.

How the testing was carried out

On each test, 20 runs were carried out, and the final running time is the average of the resulting values. Almost all the results were obtained in a single run of the program; however, because of several errors in the code and system glitches (testing took almost a week of pure running time, after all), some sorts and tests had to be re-run later.
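The averaging over 20 runs can be sketched with std::chrono. This is not the author's harness (which, as described below, calls every function separately and avoids templates); averageMillis is a hypothetical name, and the sort is assumed to take a pair of int pointers like the implementations in this article.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

// Runs sortFn(first, last) on a fresh copy of `input` `runs` times and returns
// the mean wall-clock time in milliseconds. Copying the input is not timed.
template <class SortFn>
double averageMillis(SortFn sortFn, const std::vector<int>& input, int runs = 20) {
    assert(runs > 0);
    using clock = std::chrono::steady_clock;
    double total = 0.0;
    for (int i = 0; i < runs; ++i) {
        std::vector<int> copy = input;  // every run sorts the same unsorted data
        const auto start = clock::now();
        sortFn(copy.data(), copy.data() + copy.size());
        const auto stop = clock::now();
        total += std::chrono::duration<double, std::milli>(stop - start).count();
    }
    return total / runs;
}
```

A call might look like averageMillis([](int* l, int* r) { std::sort(l, r); }, data), where data is one of the test arrays. std::chrono::steady_clock is used because it is monotonic, so the measured interval cannot be distorted by system clock adjustments.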

Subtleties of implementation

It may surprise some readers that the testing code itself does not use function pointers, which would have shortened it considerably. It turned out that they noticeably slow the algorithms down (by about 5-10%). Therefore, each function is called separately (this would not, of course, affect the relative speed, but... I still want to improve the absolute speed). For the same reason, vectors were replaced with plain arrays, and templates and comparator functions were not used. All of this matters more for industrial use of an algorithm than for testing it.

Results

All results are available in several forms: three kinds of chart (a histogram showing how the speed changes from one setting to the next within one type of test, a line graph showing the same thing, sometimes more clearly, and a histogram showing which sort runs best on a given type of test) and the tables they are based on. The third group was split into three more parts, otherwise little would have been legible. Not all the charts are successful (I seriously doubt the usefulness of the third type), but I hope everyone can find the one that suits them best.

Since there are a lot of pictures, they are hidden by spoilers. A few comments about the notation. The sorts are named as above; if it is a Shell sort, then the author of the sequence is indicated in parentheses; the names of sorts that switch to insertion sort are appended with Ins (for compactness). In the diagrams, the second group of tests shows the possible length of sorted subarrays, the third group shows the number of swaps, and the fourth shows the number of replacements. The overall result was calculated as the average of the four groups.

First group of sorts

Array of random numbers

Tables

The results are entirely unremarkable; even the partial ordering that comes with a small modulus is barely noticeable.

Partially sorted array

Tables

Much more interesting now. The exchange sorts reacted most strongly; shaker sort even overtook gnome sort. Insertion sort only sped up towards the very end. Selection sort, of course, runs exactly as before.

Swaps

Tables

Here insertion sort finally came into its own, although the speedup for shaker sort is about the same. This is where the weakness of bubble sort shows: a single swap that moves a small element to the end is enough to make it slow. Selection sort ended up almost last.

Changes in permutation

Tables

This group hardly differs from the previous one, so the results are similar. However, bubble sort takes the lead, because a random element inserted into the array will most likely be larger than all the others, that is, it will move to the end in one iteration. Selection sort became the clear outsider.

Repetitions

Tables

Here all the sorts (except, of course, selection sort) performed almost identically, speeding up as the number of repetitions increased.

Final results

Because of its complete indifference to the contents of the array, selection sort, which was the fastest on random data, still lost to insertion sort overall. Gnome sort turned out to be noticeably worse than insertion sort, which makes its practical use questionable. Shaker and bubble sort were the slowest.

Second group of sorts

Array of random numbers

Tables, 1e6 elements

Shell sort with the Pratt sequence behaves very strangely; the rest is more or less clear. Tree sort likes partially sorted arrays but dislikes repetitions, which is probably why its worst running time falls in the middle.

Tables, 1e7 elements

Everything is the same as before, except that Shell sort with the Pratt sequence gained ground in the second group thanks to the partial ordering. The influence of asymptotics also becomes noticeable: tree sort comes second, unlike in the group with fewer elements.

Partially sorted array

Tables, 1e6 elements

Here all the sorts behave in an understandable way, except Shell sort with the Hibbard sequence, which for some reason does not begin to speed up right away.

Tables, 1e7 elements

Everything here is, in general, the same as before. Even its asymptotics did not help tree sort escape last place.

Swaps

Tables, 1e6 elements

Tables, 1e7 elements

It is noticeable here that the Shell sorts depend more strongly on partial ordering, since they behave almost linearly, while the other two only drop off significantly in the last groups.

Changes in permutation

Tables, 1e6 elements

Tables, 1e7 elements

Everything here is similar to the previous group.

Repetitions

Tables, 1e6 elements

Again, all the sorts were remarkably even, including bitonic sort, which one would expect to be almost independent of the array.

Tables, 1e7 elements

Nothing interesting.

Final results

Shell sort with the Hibbard sequence took a convincing first place, not losing in any intermediate group. Perhaps it should have been sent to the first group of sorts, but... it is too weak for that, and besides, there would then be almost no one left in this group. Bitonic sort quite confidently took second place. Third place with a million elements went to another Shell sort, and with ten million elements to tree sort (the asymptotics took effect). It is worth noting that with a tenfold increase in input size, all algorithms except tree sort slowed down by almost 20 times, while tree sort slowed down by only 13.
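The slowdowns quoted above (about 20 times versus about 13 times) can be set against what idealized cost models predict for a tenfold larger input. The sketch below is only an illustration: the function names are hypothetical, and real timings also depend on constants and cache behavior that these formulas ignore.

```cpp
#include <cassert>
#include <cmath>

// Predicted slowdown when the input grows from n1 to n2 for a cost of n * log n.
double ratioNLogN(double n1, double n2) {
    assert(n1 > 1 && n2 > 1);
    return (n2 * std::log2(n2)) / (n1 * std::log2(n1));
}

// The same for a cost of n * log^2 n.
double ratioNLogSquaredN(double n1, double n2) {
    assert(n1 > 1 && n2 > 1);
    const double l1 = std::log2(n1);
    const double l2 = std::log2(n2);
    return (n2 * l2 * l2) / (n1 * l1 * l1);
}

// The same for a polynomial cost n^p.
double ratioPower(double n1, double n2, double p) {
    assert(n1 > 0 && n2 > 0);
    return std::pow(n2 / n1, p);
}
```

Going from 10^6 to 10^7 elements, these models give roughly 11.7 for n log n, 13.6 for n log^2 n and 31.6 for n^1.5, so the measured factors of 13 and 20 fall within the range one would expect from algorithms between O(n log n) and quadratic.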

Third group of sorts

Array of random numbers

Tables, 1e7 elements

Tables, 1e8 elements

Almost all the sorts in this group show nearly the same dynamics. Why are almost all of them faster when the array is partially sorted? Exchange sorts run faster because fewer exchanges are needed; Shell sort performs insertion sort, which speeds up greatly on such arrays; in heapsort, sifting finishes immediately when elements are inserted; in merge sort, at best half as many comparisons are performed. Bucket sort works better the smaller the difference between the minimum and maximum elements. The only fundamentally different one is radix sort, which is indifferent to all this. The LSD version works better the larger the modulus. The dynamics of the MSD version are not clear to me; the fact that it ran faster than LSD surprised me.
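The effect of partial ordering on insertion sort (and hence on the final passes of Shell sort) can be made concrete by counting element shifts. This is a small sketch with a hypothetical function name, not one of the benchmarked implementations; the number of shifts it reports equals the number of inversions in the input.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Insertion sort that returns how many element shifts it performed.
long long insertionSortCountingShifts(std::vector<int>& a) {
    long long shifts = 0;
    for (std::size_t i = 1; i < a.size(); ++i) {
        const int key = a[i];
        std::size_t j = i;
        // Move larger elements one position to the right.
        while (j > 0 && a[j - 1] > key) {
            a[j] = a[j - 1];
            --j;
            ++shifts;
        }
        a[j] = key;
    }
    assert(std::is_sorted(a.begin(), a.end()));
    return shifts;
}
```

An already sorted array needs no shifts at all, a sorted array with one adjacent pair exchanged needs exactly one, and a reversed array of n elements needs n(n-1)/2, which is where the quadratic worst case comes from.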

Partially sorted array

Tables, 1e7 elements

Tables, 1e8 elements

Everything here is also quite clear. Timsort stands out: partial ordering affects it more strongly than the others, which let it come almost level with the optimized version of quicksort. Bucket sort, although its running time improves on partially sorted data, could not outperform radix sort.

Many people dislike a random arrangement of items. So let's look at how to organize files in a folder in Windows 7, specifically by sorting and grouping.

If there are only a few items, you can still find your way around, but sometimes a folder holds a very large number of files with different extensions, or subfolders. Such a long list is hard to take in. It is far more convenient when everything is laid out on its own shelf, and for this purpose the OS developers provided special ordering and filtering settings.

Basic filtering of files and folders: sorting and grouping

You can organize lists using the following options:

- Sorting: with this setting you can quickly order files by size, type (documents, program shortcuts, images, etc.) and much more. To use it, right-click any empty space in Explorer, point to “Sort by” in the context menu and select the option you want.

You can also sort by other properties. To do this, select “More...” in the context menu. The “Choose Details” window opens. Use the checkboxes to mark the properties you want to add, and press the “Move Up” and “Move Down” buttons to set their position in the context menu.

Don’t forget the “Ascending” and “Descending” items. With the first, items are sorted from 0 to 9 and from A to Z; with the second, from Z to A and from 9 to 0.

By combining sort options you can organize files in a Windows 7 folder very effectively. For example, you can arrange files by type and, at the same time, in ascending order.

- Grouping: with this setting you can create groups of files and folders by size, name or type, which lets you separate the items you need from the rest.

To use this feature, right-click an empty space in Windows 7 Explorer, select “Group by” from the menu that appears and choose any grouping property.

Note: The methods above apply only to the current folder. Any newly added properties appear in both the “Sort by” and “Group by” menus.

To remove grouping, click “(None)”; all groups will then disappear.

The sorting and grouping options can be used simultaneously. For example, you can group by size or type and sort that group by date, name, or other properties.

If you left-click on the group name, then the elements.

Advanced filtering for organizing files and folders in Windows 7: sorting and grouping

The following filtering options require the Details view. These advanced options can be thought of as an extension of grouping: they let you filter by very specific criteria.

In Details view there are several columns, such as name, date and size. If you hover over a column header, a small arrow appears on its right side. Click it and you will see several options that let you narrow the list to specific groups (for example, files named A to K).

Select an option and only the files and folders matching that criterion will remain. A small check mark also appears on the right side of the column header, indicating that the filter is active.

You can filter on several columns with different criteria, and you can additionally use the search box to narrow the results further. To turn a filter off, simply uncheck the option.

As you can see, Explorer offers quite powerful tools for organizing files in a folder in Windows 7. After a little experimentation, you will get used to all the available options and quickly find the items you are looking for.