pattern · python · Critical · Canonical

Split Strings into words with multiple word boundary delimiters

Submitted by: @import:stackoverflow-api

Problem

I think what I want to do is a fairly common task but I've found no reference on the web. I have text with punctuation, and I want a list of the words.

"Hey, you - what are you doing here!?"


should be

['hey', 'you', 'what', 'are', 'you', 'doing', 'here']


But Python's str.split() only accepts a single separator, so splitting on whitespace leaves the punctuation attached to the words. Any ideas?
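
To make the problem concrete, here is a minimal sketch of what a plain whitespace split returns for the sample sentence (the variable names `text` and `tokens` are illustrative only):

```python
text = "Hey, you - what are you doing here!?"

# With no argument, str.split() splits on runs of whitespace only,
# so commas, dashes and other punctuation stay in the result.
tokens = text.split()
print(tokens)
# ['Hey,', 'you', '-', 'what', 'are', 'you', 'doing', 'here!?']
```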

Solution

A case where regular expressions are justified:

import re
DATA = "Hey, you - what are you doing here!?"
print(re.findall(r"[\w']+", DATA))
# Prints ['Hey', 'you', 'what', 'are', 'you', 'doing', 'here']
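
The question asks for lowercase words, while the pattern above preserves case. The sketch below is one possible way to close that gap, together with an alternative that splits on an explicit set of delimiter characters. The helper names `words_lower` and `split_on_delimiters` and the default delimiter set are assumptions made for illustration. They are not part of the original answer.

```python
import re

DATA = "Hey, you - what are you doing here!?"


def words_lower(text):
    """Return the words of text in lowercase, keeping apostrophes inside words."""
    return re.findall(r"[\w']+", text.lower())


def split_on_delimiters(text, delimiters=" ,-!?"):
    """Split text on any run of the given delimiter characters."""
    # re.escape keeps characters such as '-' literal inside the character class.
    pattern = "[" + re.escape(delimiters) + "]+"
    # re.split can yield empty strings at the edges, so filter them out.
    return [piece for piece in re.split(pattern, text) if piece]


print(words_lower(DATA))
# ['hey', 'you', 'what', 'are', 'you', 'doing', 'here']
print(split_on_delimiters(DATA))
# ['Hey', 'you', 'what', 'are', 'you', 'doing', 'here']
```

The findall approach describes what counts as a word, while the re.split approach describes what counts as a separator. Which one fits better probably depends on whether the set of delimiters is known in advance.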

Context

Stack Overflow Q#1059559, score: 552
