Removing ASCII "frame" around text

Submitted by: @import:stackexchange-codereview

Problem

I sometimes save nice fortune outputs to my Evernote. The data looks like this:

/ Abou Ben Adhem (may his tribe increase!) \
| Awoke one night from a deep dream of peace, |
| And saw, within the moonlight in his room, |
| Making it rich, and like a lily in bloom, |
| An angel writing in a book of gold. |
| Exceeding peace had made Ben Adhem bold, |
| And to the presence in the room he said, |
| "What writest thou?" The vision raised its head, |
| And with a look made of all sweet accord, |
| Answered, "The names of those who love the Lord." |
| "And is mine one?" said Abou. "Nay not so," |
| Replied the angel. Abou spoke more low, |
| But cheerly still; and said, "I pray thee then, |
| Write me as one that loves his fellow-men." |
| The angel wrote, and vanished. The next night |
| It came again with a great wakening light, |
| And showed the names whom love of God had blessed, |
| And lo! Ben Adhem's name led all the rest. |
\ -- James Henry Leigh Hunt, "Abou Ben Adhem"


I am using the following regex:

$ cat abcd | sed -r 's/^[/|\\]\s?([-A-Za-z ()!,?".;'"'"']*)[/|\\]?/\1/'


Abou Ben Adhem (may his tribe increase!)
Awoke one night from a deep dream of peace,
...


How do I improve it?

Solution

Your regex looks complicated. What's more, it doesn't work:

$ cat fortune | sed -r 's/^[/|\\]\s?([-A-Za-z ()!,?".;'$a']*)[/|\\]?/\1/'
Abou Ben Adhem (may his tribe increase!)
Awoke one night from a deep dream of peace,
And saw, within the moonlight in his room,
Making it rich, and like a lily in bloom,
An angel writing in a book of gold.
Exceeding peace had made Ben Adhem bold,
And to the presence in the room he said,
"What writest thou?"  The vision raised its head,
And with a look made of all sweet accord,
Answered, "The names of those who love the Lord."
"And is mine one?" said Abou. "Nay not so,"
Replied the angel.  Abou spoke more low,
But cheerly still; and said, "I pray thee then,
Write me as one that loves his fellow-men."
The angel wrote, and vanished.  The next night
It came again with a great wakening light,
And showed the names whom love of God had blessed,
And lo!  Ben Adhem's name led all the rest.                 |
                -- James Henry Leigh Hunt, "Abou Ben Adhem"


See the stray | in the second line from the bottom? Honestly, I didn't analyze why your regex fails because, as I said, it looks horribly complicated. First of all, cat is not necessary here: most sed implementations (GNU, busybox, BSD) can operate on a file directly, and some of them even accept the -i option, which makes sed modify the file in place. That being said, here's how I would do it:

$ sed -r 's,^(/|\||\\) ,,g' fortune | sed -r 's,(\\|\|)$,,g'


I tested it with GNU sed 4.2.1 and with the busybox version of sed on OpenWRT. The only thing I don't like about it is the use of -r. Some basic sed implementations may not support extended regular expressions, but they should be very rare.
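
If -r is a concern, the same idea can probably be written with basic regular expressions only, since bracket expressions need no extended syntax. The following is a sketch, not the command tested above: the sample file and the variable names are made up for illustration, and unlike the original it also strips the padding before the closing frame character and a closing /. It reads the file directly instead of going through cat, and the last command shows in-place editing with a backup suffix attached to -i.

```shell
#!/bin/sh
# Hypothetical sample file standing in for the saved fortune.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
/ And lo!  Ben Adhem's name led all the \
| rest.                                 |
\ -- Leigh Hunt                         /
EOF

# Basic regular expressions only, so no -r is needed.
# First expression: remove the leading frame character and the space after it.
# Second expression: remove trailing padding and the closing frame character.
unframed=$(sed -e 's,^[/|\\] ,,' -e 's,[[:space:]]*[/|\\]$,,' "$tmp")
printf '%s\n' "$unframed"

# Same edit applied in place; the attached suffix keeps a backup in "$tmp.bak".
sed -i.bak -e 's,^[/|\\] ,,' -e 's,[[:space:]]*[/|\\]$,,' "$tmp"
in_place=$(cat "$tmp")

# Clean up the temporary file and its backup.
rm -f "$tmp" "$tmp.bak"
```

Passing both expressions to one sed with -e avoids the second process in the pipeline. Note that a text line which legitimately ends in /, | or \ would lose that character, so this only suits input that is known to be framed.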

EDIT:
A version without chaining, using perl:

$ perl -pe 's,^(/|\||\\) (.+?)(\\|\|)?$,\2,' fortune

Context

StackExchange Code Review Q#98208, answer score: 5