I built Turbo Word Search because I couldn't find a search tool that I liked. There were some very good ones, but all of them took the traditional three-step approach of:
Enter your query
Select the type of search to make
Press the 'search' button
I cut it to one step: Type. On every keystroke it does eight searches instantly. It’s a web app—nothing to install—and it works on your phone, tablet, and PC.
It's called 'Turbo' as a nod to happy times at Borland where we designed for performance. This was a fun optimization problem and a good learning experience for my first web application. On my iPhone 14, eight searches over the full 280,887-word Collins list take a little over three milliseconds. The design choices that make it fast are interesting and something I plan to share in a future journal update.
A Shorter Sort — Found After 34 Years
The human mind can be a funny thing. It can hold a problem in reserve, sometimes for years, patiently waiting for its chance to reassert it when the time is right. This was one of those problems.
Back in 1991, I needed to sort a small array of integers within the Turbo C startup code. I could have linked in qsort() from the standard library, but it would have been overkill and, worse, would have increased the size of every program built with Turbo C by about a kilobyte. That was unacceptable so I set about making the smallest sort function I could.
I built a working 25-byte sort. I felt certain a smaller one was possible and though I made many attempts, and managed to find more sorts at 25 bytes, I could not do any better. It was as if 25 was some kind of fundamental barrier to how small it could be. It bothered me but I was out of time and had to put the problem aside and move on.
;------------------------------------------------------------------------
; cdecl void sort25( int count, int *array )
;
; Sorts an array of ints. C-callable. Time complexity is quadratic (n^2).
;
; This is the original sort I shared in 1991.
;
; Size: 25 bytes
;
; In: array to sort
; count of integers
;
; Out: array is sorted
;------------------------------------------------------------------------
public _sort25
sort25_proc proc near
@@top: mov dx,[bx] ;swap two adjacent integers
xchg dx,[bx+2]
xchg dx,[bx]
cmp dx,[bx] ;are they in the right order?
jl @@top ;no, swap them back
inc bx ;bump the pointer
inc bx
loop @@top
_sort25: pop dx ;get return address (entry point)
pop cx ;get count
pop bx ;get array
push bx ;save array
dec cx ;(1) decrement count
push cx ;save count
push dx ;save return address
jg @@top ;(1) at least two to sort?
ret
endp
Fast-forward 34 years to 2025. I had just retired and one night, feeling sleepless, and, for reasons I can't explain, this ancient problem returned to mind. I reached for my Kindle Scribe on the nightstand and began sketching out sorts in 8086 assembler. After an hour or so of making no progress, I fell asleep.
The same thing happened the following night. And the next. I found more sorts, just as I did in 1991, and hit the same frustrating barrier. Why my mind fixated on this problem is a mystery.
After the third night, as I stood in my morning shower, it suddenly occurred to me why I was hitting the 25-byte wall and how to get past it. The structure of all the 25-byte sorts had been the common "selection sort" pair of nested loops. But we don't need two! We can do it with just one loop. The idea is to loop through the array comparing adjacent items and if there are no swaps, exit the function, we're done. If a swap is required, do it, and structure the code flow so the swap simply falls through to the entry point of the function -- effectively restarting it from scratch. This results in one less branch, saving two bytes.
;------------------------------------------------------------------------
; cdecl void sort23( int count, int *array )
;
; Sorts an array of ints. C-callable.
;
; Caution: This sort has cubic (n^3) performance. It's fine for
; a few hundred integers. Beware of sorting thousands.
;
; Size: 23 bytes
;
; In: array to sort
; count of integers
;
; Out: array is sorted
;------------------------------------------------------------------------
public _sort23
sort23_proc proc near
@@top: mov ax,[bx] ;get the first integer of a pair
inc bx ;bump pointer
inc bx
cmp ax,[bx] ;compare to the second integer
jle @@bottom ;no swap? continue the loop
xchg ax,[bx] ;swap the integer pair
xchg ax,[bx-2]
; If we get here, there has been a swap, and we do not
; continue looping over the array. Instead, the code
; path falls through to the entry point which starts
; everything over from the beginning.
_sort23: pop ax ;get return address
pop cx ;cx=count
pop bx ;bx=array
push bx ;save array
push cx ;save count
push ax ;save return address
@@bottom: dec cx ;decrement count
jg @@top ;at least two to sort?
ret
endp
There is a price to pay for its smaller size. It has cubic runtime performance so it should only be used on very small arrays. And I wouldn't trade quadratic performance for cubic performance to save two bytes. Not in 2025. But if it was 1991... and if the array was certain to be small? Hmmmm... I would have to think very hard about it.
I wondered: How small could it be if it didn't have to obey any calling convention?
;------------------------------------------------------------------------
; Sorts an array of ints. NOT C-callable.
;
; Same algorithm as _sort23 but optimized for size by removing the
; constraint of following a calling convention. This reduces the code to
; its purest and simplest form. As with _sort23, the runtime complexity
; is cubic, and sorting more than a few hundred items is not recommended.
;
; Size: 18 bytes
;
; In: bx = array to sort
; dx = count of integers
; direction-flag = clear
;
; Out: array at bx is sorted
;
; Preserves: bx, dx, bp, di
;
; Destroys: ax, cx, si
;------------------------------------------------------------------------
public _sort18
sort18_proc proc near
@@top: lodsw ;fetch integer
cmp ax,[si] ;compare to next integer
jle @@bottom ;no swap? continue the loop
xchg ax,[si] ;swap the integer pair
xchg ax,[si-2]
; If we get here, there has been a swap, and we do not
; continue looping over the array. Instead, the code
; path falls through to the entry point which starts
; everything over from the beginning.
_sort18: mov si,bx ;si=array (entry point)
mov cx,dx ;cx=count
@@bottom: dec cx ;decrement count
jg @@top ;at least two to sort?
ret
endp
18 bytes. Nice. Those 8- and 16-bit days were special for size optimization.
The source code, Borland's assembler and linker, and prebuilt binaries, are available in the link below
in testsort.zip. The source includes nine sorting algorithms, a test harness to generate sample data,
validate correctness, and report results, in 998 total bytes of 8086 assembly.
I recently semi-retired. Over the years I’ve been a software engineer, scientist, startup co-founder, and leader of mid-sized engineering and science groups. It was a deeply fulfilling career and I would happily do it all over again.
There were two aspects of my working life I was particularly passionate about:
As an individual contributor—performance optimization. I loved the puzzle of mapping a problem to the capabilities of the machine. I wrote a lot of assembler code in the 8- and 16-bit days, when getting a computer to do even simple things was a challenge.[1] That interest has never left me and I look forward to continuing to pursue it.
As an organizational leader responsible for delivering new technologies—building and guiding teams to function and succeed under conditions of high invention risk and ambiguity. I often worked with people who were much smarter than me and while I couldn't solve their technical problems myself, what I could do was provide them with a structured approach where, if a solution existed, we'd find it.
Work with the best people on the worthiest problems. Take big risks and make big mistakes. Occasionally build something great.
That is the motto that guided my entire career. I worked with outstanding people on projects with real potential to make a better world. There were some successes and more failures than I would have liked, but my mistakes were my own and I learned so much along the way. I hope to share some of what I learned in this journal.
To everyone I had the honor of working with over the years: Thank you for the memories. You were wonderful!