patterncMinor
No more filthy words
Problem
Challenge
Given a list of words mixed with extra symbols, write a program that strips the extra numbers and symbols from the words.
Specifications
- The first argument is a path to a file.
- Each line includes a test case.
- Each test case is a list of words.
- Letters are both lowercase and uppercase, and mixed with extra symbols.
- Print the words separated by spaces in lowercase letters.
Constraints
- The length of a test case, including the extra symbols, ranges from 10 to 100 characters.
- The number of test cases is 40.
Input Sample
(--9Hello----World...--)
Can 0$9 ---you~
13What213are;11you-123+138doing7
Output Sample
hello world
can you
what are you doing
Source
Solution:
/* <stdio.h> is not included, although printf(), fopen() and fgets() need it. */
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

void to_lowercase(char * input) {
    for(int i = 0; input[i]; i++){
        input[i] = tolower(input[i]);
    }
}

char * sanitize(char * input) {
    char *sanitized = malloc(sizeof (char) * 1024);
    int iterator = 0;
    int character_value;
    bool wordMatched = false;
    for (int i = 0; i < strlen(input); i++) {
        character_value = input[i];
        /* 97..122 are 'a'..'z' and 65..90 are 'A'..'Z' in ASCII */
        if (character_value >= 97 && character_value <= 122 || character_value >= 65 && character_value <= 90) {
            sanitized[iterator++] = input[i];
            wordMatched = true;
        } else if (wordMatched) {
            wordMatched = false;
            sanitized[iterator++] = ' ';
        }
    }
    sanitized[iterator] = '\0';
    to_lowercase(sanitized);
    return sanitized;
}

int main(int argc, const char * argv[]) {
    FILE *file = fopen(argv[1], "r");
    char line[1024];
    while (fgets(line, 1024, file)) {
        printf("%s\n", sanitize(line));
    }
    return 0;
}
Solution
Look here:
    for (int i = 0; i < strlen(input); i++) {
Note that strlen(input) is O(n), proportional to the length of the input. Because the loop condition is re-evaluated on every iteration, that makes your algorithm O(n²), which is slower than it should be. If you need to call strlen(), make sure to call it just once. However, this problem is easily solvable without using strlen() at all.
Others have pointed out your memory leak. What should be done about it? The output is never longer than the input, right? Therefore, I would say that your best option is to overwrite the input with the sanitized output. There is no need to allocate any additional memory, and no need to worry about the buffer size. (If the caller wants to keep the original string, then the caller can duplicate it first.)
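The two points above can be sketched together. This is only an illustration under stated assumptions, not the reviewed code: lowercase_in_place() and lowercased_copy() are hypothetical helpers introduced here. The first walks the string until the terminating '\0' instead of calling strlen() in the loop condition, and it overwrites its argument so that nothing is allocated. The second shows what a caller might do if it needs to keep the original: copy first, then modify the copy.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: lowercases s in place and returns its length.
 * Single pass over the string; no strlen() in the loop condition. */
size_t lowercase_in_place(char *s) {
    char *p = s;
    for (; *p != '\0'; p++) {
        /* Cast to unsigned char: passing a negative char value to
         * tolower() is undefined behaviour. */
        *p = (char)tolower((unsigned char)*p);
    }
    return (size_t)(p - s);
}

/* Hypothetical caller-side helper: keeps the original intact by
 * duplicating it before modifying it. Returns NULL if malloc() fails.
 * The returned buffer belongs to the caller, who must free() it. */
char *lowercased_copy(const char *original) {
    size_t size = strlen(original) + 1;   /* strlen() is called exactly once */
    char *copy = malloc(size);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, original, size);         /* copies the '\0' as well */
    lowercase_in_place(copy);
    return copy;
}
```

With this split, a loop that reads lines into a fixed buffer can call the in-place function directly and never needs free(); only callers that ask for a copy take on the responsibility of releasing it.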
There is no need to write a to_lowercase() function that calls tolower() on each character in the string. Just call tolower() as part of the loop.
The character_value comparisons could be simplified to isalpha(character_value). Your naming style is inconsistent between character_value and wordMatched. I would change int character_value to char c.
With this code…
    } else if (wordMatched) {
        wordMatched = false;
        sanitized[iterator++] = ' ';
    }
… you output a space when transitioning from a letter to a non-letter. However, that would output a space corresponding to the 7 at the end of 13What213are;11you-123+138doing7. In my opinion, it would be better if it didn't output a space at the end.
Suggested solution
I've put some miscellaneous remarks in comments.
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>      // You missed this for printf(3)
#include <stdlib.h>

/**
 * Replaces consecutive non-alphabetic characters in the input
 * string with a single space. Non-alphabetic characters at the
 * beginning and end are trimmed off as well. The remaining ASCII
 * letters are replaced with their lowercase counterparts.
 *
 * The input string will be overwritten.
 *
 * Returns the length of the sanitized output.
 */
size_t sanitize(char *s) {
    bool needSpace = false;
    char *out = s;
    for (char *in = s; *in != '\0'; in++) {
        assert(out <= in);
        if (isalpha(*in)) {
            if (needSpace) *out++ = ' ';
            needSpace = false;
            *out++ = tolower(*in);
        } else if (out > s) {
            needSpace = true;
        }
    }
    *out = '\0';
    return out - s;
}

int main(int argc, const char *argv[]) {
    char line[1024];
    FILE *file = stdin;     // Read from stdin if no filename given
    if (argc > 1 && !(file = fopen(argv[1], "r"))) {
        perror(argv[1]);    // Some error handling that you didn't have
        return EXIT_FAILURE;
    }
    while (fgets(line, sizeof(line), file)) {
        sanitize(line);
        puts(line);         // Don't need printf() when puts() will do
    }
}
Context
StackExchange Code Review Q#131730, answer score: 6