Conference Topic:

Regular expressions add power to the pattern searching functionality of UNIX, though regular expressions can be confusing since many of the same characters/symbols are used in different contexts. What is your opinion? Can you give some interesting/intriguing examples?


Theresa L. Ford on 02-23-2004

When I first saw regular expressions, I thought, "Well, here's something designed with the help of a good-sized marijuana joint." With respect to Stephen Kleene, the inventor of regular expressions, my non-mathematical opinion hasn't changed much, despite the realization that regular expressions are founded on mathematics and finite state machines, and that they evolved as they became more widespread in various Unix utilities thanks to Ken Thompson and others.

What we have today is a monstrously complex, extremely powerful tool for locating patterns. Individually, each notation is simple and makes sense. Combined, the parts of a regular expression look like a monkey was typing on the keyboard. However, regular expressions represent the needs of the community over a long implementation period and each stray bit of syntax was added because it was needed. Anything less robust will surely be inadequate. Anything trying to make all the same features available through something more legible will surely be too cumbersome to use.
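To make the "simple alone, monkey-typing together" point concrete, here is a small Python sketch. It is only an illustration: the names DIGIT, DATE, is_date, and strip_non_digits are made up for this post, and the date shape is simply the one used in the post headers here. It also shows one of the characters the conference topic complains about, since ^ means "start of line" outside brackets and "not" inside them.

```python
import re

# Each piece on its own is easy to read.
DIGIT = r"[0-9]"
TWO_DIGITS = DIGIT + r"{2}"
FOUR_DIGITS = DIGIT + r"{4}"

# Glued together, the same pieces become much denser:
# ^[0-9]{2}-[0-9]{2}-[0-9]{4}$
DATE = "^" + TWO_DIGITS + "-" + TWO_DIGITS + "-" + FOUR_DIGITS + "$"


def is_date(text):
    """True if text looks like NN-NN-NNNN (shape only, no calendar check)."""
    return re.search(DATE, text) is not None


def strip_non_digits(text):
    """Inside brackets, ^ negates the set instead of anchoring the match."""
    return re.sub(r"[^0-9]", "", text)
```

The pattern only checks the shape of the string, so something like 99-99-9999 would still pass.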

l337, another language that makes my eyeballs hurt, isn't nearly as impressive as regular expressions even though it is also capable of replacing complex language with suitable words. Therefore, I am forced to combine the two into a nice vi script to achieve absolute incomprehensibility and maximum migraine, particularly as I am a novice with respect to regular expressions, l337, and vi scripts. See below.

Stray websites of information used:

VI Script:
"Normal prose to L337. T.Ford 2/23/04
"Put in ~/.exrc
"Invoke using CTRL+L
"Some notes: VI does not like a pipe, so l's are used instead.
" VI macros exit if a match isn't found. This adds a line and removes it to prevent this.
" All ^M's and ^[letter]'s are actually control characters inserted using CTRL+V, CTRL+[letter].
" Reserved control characters may accidentally be used. Tested on Gentoo.
" L337 8y ||008 - |_|53 @ j00 0\/\/|| |2|5|<.
" Inspired by someone's signature used over at :
" 50|\/|3|\|16|-|7, j00 \/\/1ll r34l1z3 7|-|47 j00 \/\/3r3 700 l473
map ^L ^C^A^B^H^O^P^X^E
map ^C ^[1GOashfoatyouous ckabegiostcdfhkmwnruvwtflolstfu^[
map ^A :%s/[aA][sS][hH]/#/g^M:%s/f\([aeiouAEIOU]\)/ph\1/g^M:%s/at/@/g^M
map ^B :%s/[yY][oO][uU]/j00/g^M:%s/[oO][uU]/00/g^M:%s/[sS]\>/z/g^M:%s/[cC][kK]/xX/g^M:%s/[kK]/l</g^M
map ^H :%s/[aA]/4/g^M:%s/[bB]/8/g^M:%s/[eE]/3/g^M:%s/[gG]/9/g^M:%s/[iI]/l/g^M:%s/[oO]/0/g^M
map ^O :%s/[sS]/5/g^M:%s/[tT]/7/g^M:%s/[cC]/\(/g^M:%s/[dD]/l\}/g^M:%s/[fF]/l=/g^M:%s/[hH]/l-l/g^M
map ^P :%s/[mM]/\/\\\/\\/g^M:%s/[wW]/\\\/\\\//g^M:%s/[nN]/ll/g^M:%s/[rR]/l2/g^M:%s/[uU]/l_l/g^M
map ^X :%s/[vV]/\\\//g^M:%s/\\\/\\\/7l=/wtf/g^M:%s/l0l/lol/g^M:%s/57pl-ll_l/stfu/g^M:%s/pl-l/ph/g^M
map ^E 1Gdd
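
For anyone who would rather not type control characters into ~/.exrc, the same chain of substitutions can be sketched in Python. This is a rough approximation, not a faithful port: the names lit, MULTI, SINGLE, CLEANUP, RULES, and to_leet are made up for this sketch, Python's \b stands in for vi's \> (end of word), and the seed line added by ^C and removed by ^E is left out on the assumption that it is only needed because a vi macro stops at a failed match, which re.sub does not. The order is the interesting part: multi-letter patterns run before the single letters they contain, and the cleanup rules run last.

```python
import re


def lit(text):
    """Return a callback that inserts text verbatim (no backslash parsing)."""
    return lambda _match: text


# Multi-letter rewrites from the ^A and ^B maps. They run first so that they
# still see the original letters.
MULTI = [
    (r"[aA][sS][hH]", lit("#")),
    (r"f([aeiouAEIOU])", lambda m: "ph" + m.group(1)),
    (r"at", lit("@")),
    (r"[yY][oO][uU]", lit("j00")),
    (r"[oO][uU]", lit("00")),
    (r"[sS]\b", lit("z")),  # approximates vi's \> (end of word)
    (r"[cC][kK]", lit("xX")),
    (r"[kK]", lit("l<")),
]

# Single-letter rewrites from the ^H, ^O, ^P and ^X maps, in script order.
SINGLE = [
    ("a", "4"), ("b", "8"), ("e", "3"), ("g", "9"), ("i", "l"), ("o", "0"),
    ("s", "5"), ("t", "7"), ("c", "("), ("d", "l}"), ("f", "l="), ("h", "l-l"),
    ("m", "/\\/\\"), ("w", "\\/\\/"), ("n", "ll"), ("r", "l2"), ("u", "l_l"),
    ("v", "\\/"),
]

# Cleanup from the ^X map: put a few favourite words back together.
CLEANUP = [
    ("\\/\\/7l=", "wtf"), ("l0l", "lol"), ("57pl-ll_l", "stfu"), ("pl-l", "ph"),
]

RULES = (
    MULTI
    + [("[" + c + c.upper() + "]", lit(out)) for c, out in SINGLE]
    + [(re.escape(pat), lit(out)) for pat, out in CLEANUP]
)


def to_leet(text):
    """Apply every rule in order; a rule that matches nothing is a no-op."""
    for pattern, repl in RULES:
        text = re.sub(pattern, repl, text)
    return text
```

Calling to_leet("regular expressions") gives l239l_ll4l2 3xpl2355lollz, the same spelling that shows up in the translation. Moving the [oO][uU] rule ahead of [yY][oO][uU] would change the result for "you", which is why the order matters.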

Translation of the first paragraph above using the script:
\/\/l-l3ll l phll257 54\/\/ l239l_ll4l2 3xpl2355lollz, l 7l-l009l-l7, "\/\/3ll, l-l3l23'z 50/\/\37l-llll9 l}35l9ll3l} \/\/l7l-l 7l-l3 l-l3lp 0l= 4 900l} 5lz3l} /\/\4l2ljl_l4ll4 j0lll7." \/\/l7l-l l235p3(7 70 573ph3ll l<l33ll3, 7l-l3 lll\/3ll70l2 0l= l239l_ll4l2 3xpl2355lollz, /\/\y lloll-/\/\@l-l3/\/\@l(4l 0plllloll l-l45ll'7 (l-l4ll93l} /\/\l_l(l-l l}35pl73 7l-l3 l234llz@loll 7l-l@ l239l_ll4l2 3xpl2355lollz 4l23 ph00lll}3l} 0ll /\/\@l-l, phllll73 57@3 /\/\4(l-llll3z, 4lll} 3\/0l\/3l} 4z 7l-l3y 83(4/\/\3 /\/\0l23 pl20llphl( lll \/4l2l00z l_llllx l_l7lll7l3z, 7l-l4lll<z 70 l<3ll 7l-l0/\/\p50ll 4lll} 07l-l3l2z.