Sorting strings with certain restrictions

Submitted by: @import:stackexchange-codereview

Problem

I want to sort a string which contains words and integers with the restriction that if the nth element in the list is an integer it must remain an integer and if it is a word, it must remain a word.

Currently, what I am doing is splitting the input string on spaces, putting the words and integers into their own lists, sorting them appropriately by mapping to a different list, and then injecting the sorted words/integers back into the original input list.

I am looking for comments on how to improve the code, or for a different approach to the problem if there is a better one. Does anyone have any suggestions?

http://ideone.com/X2pPfg

```
justWords = {}
words = []

justNums = {}
nums = []

# Check if value in the input is an integer or a word
def is_number(s):
    try:
        float(s)
        return True
    except ValueError:
        return False

# Replace the values in the dictionary with the correctly sorted words/integers
def go_through(theDict, theList):
    counter = 0
    for k,v in theDict.iteritems():
        theDict[k] = theList[counter]
        counter = counter + 1
    return theDict

# Replace the values in the original input
# list with correctly sorted values of the word/int dicts
def inject(theDict, theList):
    for k,v in theDict.iteritems():
        theList[k] = v
    return theList

if __name__ == "__main__":

    splitInput = (raw_input("")).split()

    # Sort the words and numbers into their own lists as tuples
    for i,j in enumerate(splitInput):
        if is_number(j):
            justNums[i] = j
            nums.append(j)
        elif not is_number(j):
            justWords[i] = j
            words.append(j)

    print("%s\n%s\n" % (justWords, justNums))

    words = sorted(words)
    nums = sorted(nums)

    print("%s\n%s\n" % (words, nums))

    # Replace the values in the dictionaries with the values in the sorted list
    justWords = go_through(justWords, words)
    justNums = go_through(just
```

Solution

Numeric vs. lexicographic sort

Given this input: 8 4 6 1 12 55,

the output will be 1 12 4 55 6 8.

The question did not explicitly state so, but in the absence of specific instructions, I would have expected a numeric sort to be more appropriate for the numbers.

The root cause of the bug (if you do consider it to be a bug, that is) is that you detect whether each word looks like it could be a float, but you don't actually convert it to a float. Therefore, the numbers are sorted as if they were strings.

I'm going to assume that numeric sorting is desired.
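To make the difference concrete, here is a minimal sketch (written for Python 3, with variable names of my own choosing) that sorts the same tokens both ways. Passing `key=float` compares by numeric value while leaving the tokens themselves as strings.

```python
# The sample input from above, split into string tokens.
tokens = "8 4 6 1 12 55".split()

# Plain sorted() compares the tokens as strings, character by character.
lexicographic = sorted(tokens)

# key=float compares by numeric value; the tokens stay strings.
numeric = sorted(tokens, key=float)

print(" ".join(lexicographic))  # 1 12 4 55 6 8
print(" ".join(numeric))        # 1 4 6 8 12 55
```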

Style

Whenever you start with an empty list or dict, then populate it using a loop, consider using a list comprehension or dict comprehension instead. Defining a variable "all at once" using a comprehension is more elegant, and possibly faster as well.
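As an illustration of that advice, the population loop from the question could be sketched with two dict comprehensions. This is a Python 3 sketch under my own assumptions: the snake_case names (`just_nums`, `just_words`) and the sample input are mine, not part of the original code.

```python
def is_number(s):
    """Return True if the token parses as a float."""
    try:
        float(s)
        return True
    except ValueError:
        return False

tokens = "car 8 apple 4".split()

# Map each position to its token, one dict per kind, defined "all at once".
just_nums = {i: t for i, t in enumerate(tokens) if is_number(t)}
just_words = {i: t for i, t in enumerate(tokens) if not is_number(t)}

# The sorted value lists follow directly from the dicts.
nums = sorted(just_nums.values(), key=float)
words = sorted(just_words.values())

print(just_nums)   # {1: '8', 3: '4'}
print(just_words)  # {0: 'car', 2: 'apple'}
```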

It's worth packaging the solution to the problem in a function, rather than leaving it as free-floating code. I suggest calling the function segregated_sort().

Suggested solution

Here's a much shorter way to write it. Useful techniques include:

  • Use list comprehensions and dict comprehensions, as mentioned.

  • Use filter() to help build the equivalent of your nums and words. (Here, I've generalized support beyond the two types int and str. Note, however, that float and int would not intermingle with this approach.)

  • Use iter() to form a queue of sorted items for each type. Each queue is drained by calling next() on it.

```
def segregated_sort(input_list):
    """Returns a sorted copy of the input list that maintains the same
    type of data at each index of the output as the type of data at that
    index in the input."""
    types = [type(datum) for datum in input_list]
    sorted_data_by_type = {
        t: iter(sorted(filter(lambda datum: type(datum) == t, input_list)))
        for t in set(types)
    }
    return [next(sorted_data_by_type[t]) for t in types]

def to_int_where_possible(words):
    """Returns a copy of the input list in which any word that can be
    an int is replaced with its int value."""
    def convert(word):
        try:
            return int(word)
        except ValueError:
            return word
    return [convert(word) for word in words]

if __name__ == '__main__':
    split_input = raw_input().split()
    result = segregated_sort(to_int_where_possible(split_input))
    print(' '.join(str(r) for r in result))
```
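For a quick check of the behaviour, here is a self-contained Python 3 sketch of the same idea run on a fixed input instead of `raw_input()`. The sample tokens are my own, and I use a generator expression in place of `filter()`, so treat it as an illustration rather than the exact code above.

```python
def segregated_sort(items):
    """Sort items so each output index keeps the type found at that input index."""
    types = [type(item) for item in items]
    # One iterator ("queue") of sorted values per type.
    queues = {
        t: iter(sorted(item for item in items if type(item) == t))
        for t in set(types)
    }
    # Walk the original type pattern, draining the matching queue each time.
    return [next(queues[t]) for t in types]

def to_int_where_possible(words):
    """Replace every word that parses as an int with its int value."""
    def convert(word):
        try:
            return int(word)
        except ValueError:
            return word
    return [convert(word) for word in words]

tokens = "car 8 apple 4 boat 12".split()
result = segregated_sort(to_int_where_possible(tokens))
print(" ".join(str(r) for r in result))  # apple 4 boat 8 car 12

# int("1.5") raises ValueError, so "1.5" is sorted among the words.
mixed = segregated_sort(to_int_where_possible("10 pear 1.5 2".split()))
print(" ".join(str(r) for r in mixed))   # 2 1.5 pear 10
```

The second call shows the limitation noted earlier: a token such as `1.5` is not an int, so it occupies a word slot and is compared as a string.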


Addressing float/int lossiness

To allow floats and ints to mingle, and to output the strings exactly as they were input, you can convert all numeric values into a Number class.

I've also incorporated @Veedrac's suggestion to abandon filter() in favour of a list comprehension.

```
class Number:
    def __init__(self, string):
        self.value = float(string)
        self.string = string

    def __str__(self):
        return self.string

    def __cmp__(self, other):
        return cmp(self.value, other.value) or cmp(self.string, other.string)

def segregated_sort(input_list):
    """Returns a sorted copy of the input list that maintains the same
    type of data at each index of the output as the type of data at that
    index in the input."""
    types = [type(datum) for datum in input_list]
    sorted_data_by_type = {
        t: iter(sorted(datum for datum in input_list if type(datum) == t))
        for t in set(types)
    }
    return [next(sorted_data_by_type[t]) for t in types]

def to_number_where_possible(words):
    """Returns a copy of the input list in which any word that looks
    numeric is converted to a Number."""
    def convert(word):
        try:
            return Number(word)
        except ValueError:
            return word
    return [convert(word) for word in words]

if __name__ == '__main__':
    split_input = raw_input().split()
    result = segregated_sort(to_number_where_possible(split_input))
    print(' '.join(str(r) for r in result))
```
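The code above targets Python 2. Python 3 removed both `cmp()` and the `__cmp__` hook, so the Number class needs rich comparison methods there. One possible adaptation, offered as a sketch rather than a definitive port, uses `functools.total_ordering`; the `_key` helper and the sample input are my own additions.

```python
from functools import total_ordering

@total_ordering
class Number:
    """A numeric token that compares by value but prints as originally typed."""
    def __init__(self, string):
        self.value = float(string)  # raises ValueError for non-numeric tokens
        self.string = string

    def __str__(self):
        return self.string

    def _key(self):
        # Hypothetical helper: value first, original text as the tie-breaker,
        # mirroring the two-step cmp() in the Python 2 version.
        return (self.value, self.string)

    def __eq__(self, other):
        if not isinstance(other, Number):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Number):
            return NotImplemented
        return self._key() < other._key()

def segregated_sort(items):
    """Sort items so each output index keeps the type found at that input index."""
    types = [type(item) for item in items]
    queues = {
        t: iter(sorted(item for item in items if type(item) == t))
        for t in set(types)
    }
    return [next(queues[t]) for t in types]

def to_number_where_possible(words):
    """Wrap every word that looks numeric in a Number."""
    def convert(word):
        try:
            return Number(word)
        except ValueError:
            return word
    return [convert(word) for word in words]

tokens = "12 pear 1.5 apple 4".split()
output = " ".join(str(r) for r in segregated_sort(to_number_where_possible(tokens)))
print(output)  # 1.5 apple 4 pear 12
```

Because every numeric token becomes a Number, `1.5` now sorts among the numbers, and it is printed exactly as it was typed.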

Context

StackExchange Code Review Q#72100, answer score: 4
