patternpythonMinor

Finding the longest substring in alphabetical order

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Viewed 0 times

theordersubstringlongestfindingalphabetical

Problem

I wrote code in Python that takes a string s as input and determines the longest substring of s that’s in alphabetical order. In the case where there are two strings of the same length, code outputs the first one.

I'm a beginner with 1.5 weeks of experience and wrote both versions myself. Which one is better, and why?

Version 1

def alphacheck(x):
    '''This function checks if a given string is in Alphabetical order'''

    # Creating a variable to allow me to efficiently index a string of any length
    i = len(x) - 1
    # Using a recursive formula and checking for the base case when the string is of less than one character
    if len(x)  0:
        #this is the code that slices up the string according the way outlined in earlier comments
        x = s[counter-1:counter+u ]
        list1.append(x)
        itersLeft -= 1
        counter += 1
#I change the value of u because that's what decides how big my slices are if u is 0 the slices are length 0 if u is 1 slices are lenght 1 and so on
    u += 1
    itersLeft = i
    counter = 0
del list1[0]

#This removes duplicates in the list by scanning the first list for elements and adding them to the second (empty)list if theyre not already in the secodn     list
list2 = []
numMatch = 0
for e1 in list1:
    if e1 in list2:
        #This numMatch variable is used just as a place holder since it's the else condition i need not the if
        numMatch += 1
    else:
        list2.append(e1)
list3 = []
for e in list2:
    if alphacheck(e) == True:
        list3.append(e)
list4 = list3[:]
for e in list3:
    if len(e) < len(list3[len(list3)-1]):
        list4.remove(e)
print(list1)
print(list2)
print(list3)
print(list4)
print('Longest substring in alphabetical order is: ' + list4[0] )

Version 2

```
def alphacheck(x):
'''This function checks if a given string is in Alphabetical order'''

# Creating a variable to allow me to efficiently index a string of any length
i = len(x) - 1
# Using a recur

Solution

“Better code” is asking for an opinion, as there is no objective measure of “better”. My initial, flippant, answer is that the one with 44 lines is better than the one with 67, since it gets the same job done in fewer lines, making it easier to read. (This is what I said to this question on Quora. Since this site is explicitly a code-review site, the philosophy of "better code" may be out of place. I'm sure the regular denizens of this stack exchange site will let me know if this is out of place).

This is not completely trivial (shorter = better). What is important is that computer languages are tools not just for instructing a computer, but also with communicating with other programmers. Computer programs are not static, and you will, in your career as a programmer, find yourself reading code written by others a lot more often than writing new code from scratch. As such, one of the most important needs of “good” code is readability.

Kent Beck highlighted Four rules of Software Design which might be useful to keep in mind. Martin Fowler's description of those four rules are:

Passes the tests (i.e., it works as expected)

Reveals intention

No duplication

Fewest elements

The last three rules are really rules for readability: You can read it and see what it does, it doesn’t do things more than once, and it’s simple and clear.

You have 1.5 weeks of experience. It isn’t expected that your code is readable or “good”, but it is something you should strive for for your entire career. Professional programmers struggle on this their entire careers, and it is not uncommon for a professional to have to fix a bug in code that hasn’t been touched in 6 months to a year, complain about how crappy it is, and then find that they wrote it. Improving is a continuous process.

With that in mind, let’s look at the code. I will try at first to compare the two versions, then give you ideas on how to improve it.

First, you have defined a recursive function to determine if a string is in alphabetical order, alphacheck(x)

This is identical in the long version, so it doesn’t make either version better or worse.

Next, you have a section which (presumably) differs between versions, because of length.

Now I would definitely say I would rather edit the shorter version that the longer version. Both versions have a loop which (I assume, without close analysis) finds all substrings of s.

The short version checks to see if the substring is alphabetical, and if it is, if it’s longer than any seen before. Since after processing all the substrings, the longest, alphabetical substring has already been found and stored, all that’s needed is to print it.

The longer version generates a list of substrings (list1), removes duplicates to get list2, collects just the ones which are alphabetic to get list3, then finds the longest substrings list4, then returns the first one.

The short version expresses its intent better (it took me a few read-throughs to understand the long version). Both versions have opaque names (what is u? i? Why are they named that?), but the longer version has more, especially list1, list2,list3,list4.

One place where intent-revealing could be helped in the longer version is the expression len(list3[len(list3) -1]). It is a bit complex, and it’s uncertain if it ever changes. I also think it’s a potential location of a bug (it looks like you are removing strings from list3 that are shorter than the last string, but how do we know that the last string is the longest?

There is little obvious duplication in either. I’m sure there is some, probably in the form of repeated calculations of string lengths of the same string, but nothing jumps out.

When evaluating “fewest elements”, this is mainly in the form of looking for things which can be improved or simplified. Obviously, the longer version has more elements than the shorter version, but they are also doing different things. It is hard (using the language and style that you are using) to write the longer version using fewer “elements” than the shorter version, so it is unfair to say the shorter version has 5 variables while the longer version has 12 and say simply from that that the shorter version is better.

But we can look at the longer version, and point out that you have three loops using the variables e, e1, e, and the structure is such that clearly you can use e for all three loops. Similarly, you have numMatches which even in your comments you acknowledge is superfluous, just there because you want the else branch of an if, and need something in the other branch. That’s an element that can be removed (if you knew how to invert the if (check out element not in array)).

I don’t see that sort of redundant elements in the shorter version. The itersLeft variable looks dubious, or at least poorly named, but I can’t obviously see how it can be removed.

So based on those measures, I’d say the strict answer to y

Code Snippets

def alphacheck(x):
  if len(x) <= 1: # Empty or single character strings are alphabetic
    return True; 
  if x[-1] < x[-2]: # if last two characters are out of order, not alphabetic
    return False;
  return alphacheck(x[:-1]) # recursively check truncated string

Context

StackExchange Code Review Q#141715, answer score: 4

Revisions (0)

No revisions yet.